bStats syntax for Skript
Dere-Wah opened this issue · 14 comments
Suggestion
I assume you all know what bStats is and how it works. It's basically the standard for plugin metrics data, and as a hobbyist Skript & plugins developer myself, seeing live data about people using my work is really gratifying.
Now, this idea has been stuck in my head for a long time. But I am definitely not the only one who has thought of this. There is an old issue thread (from about 6 years ago), #1605, that already suggested this feature. That thread got dismissed because of the assumption that bStats is only made for Bukkit/Spigot plugins, and that any attempt at a Skript implementation would not be liked by the bStats team.
Well, it turns out no one actually tried to contact the bStats developer to inquire about this. I actually asked them, and they were super helpful and down to make this integration a reality. They are 100% fine with making a new category for Skripts and sending feedback on Skript's impact on their system.
Now, Addon or non-Addon? Well, while I was thinking about this, I couldn't help but try and make an attempt at an addon implementation: https://github.com/Dere-Wah/Skript-bStats
You can read and view all the available syntax in the readme. Making this addon was a huge help in understanding how the metrics class works and how to implement it in Skript syntax.
What I learned by making the addon
For example, with the Java Metrics class you first register your addon and then supply custom chart data via runnable functions. Whenever bStats decides to send a data update, it makes the plugin run these functions and sends the results to its server. Also, there is only one running instance of the plugin, so you are never at risk of creating multiple instances of the Metrics class for the same ID unless you do so deliberately.
In Skript this is different: an effect that simply creates a new Metrics instance on each execution, even just on load, would spam the bStats servers with data and run into their rate limit. The idea is to have a basic singleton class that stores all of the metrics that Skript starts and closes them afterwards (maybe when a single script is disabled, when an effect is executed, or simply when a server reload is performed). Here is my Java approach, using a wrapper (SkriptMetric) around the bStats metric:
-> register bstats metric with id x (If not already present, creates a SkriptMetric that is INACTIVE, and adds it to the singleton.)
-> add custom graphs to metric with id x (If the SkriptMetric is present, and is NOT active yet, adds as many graphs as desired.)
-> start bstats metric (if the SkriptMetric is present, actually takes all the custom graphs etc. and creates a bStats Metric. The metric is now active and will send data periodically to the custom graphs).
This mutex-lock system is necessary to prevent sending duplicate data to custom graphs. Let's say we only allow 1 metric per ID, but we don't implement any duplicate graphs limit (since we want users to add as many custom graphs as they want).
Users might have a system such as
-> register bstats metric (if not present, creates a bStats Metric that is live and sends data)
-> add custom graphs to metric
But then, on each /sk reload, a new duplicate graph is going to be added to the original instance of the Metrics class, ending up in duplicate data.
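The register / add graphs / start lifecycle described above could be sketched roughly as follows. This is only an illustration and not the addon's actual code: the names `MetricRegistry`, `addChart` and `start` are made up for this example, charts are represented by plain string ids, and the point where a real bStats Metrics object would be created is only marked with a comment.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical wrapper: collects chart ids while inactive, then is started exactly once. */
final class SkriptMetric {
    private final int id;
    private final List<String> chartIds = new ArrayList<>();
    private boolean active = false;

    SkriptMetric(int id) {
        this.id = id;
    }

    /** Graphs can only be added before start(); once active, further additions are refused. */
    boolean addChart(String chartId) {
        if (active) {
            return false;
        }
        chartIds.add(chartId);
        return true;
    }

    /** Returns true only on the first call. A real bStats Metrics object would be created here. */
    boolean start() {
        if (active) {
            return false;
        }
        active = true;
        return true;
    }

    int id() {
        return id;
    }

    List<String> chartIds() {
        return List.copyOf(chartIds);
    }
}

/** Hypothetical singleton holding at most one SkriptMetric per bStats id. */
final class MetricRegistry {
    private static final MetricRegistry INSTANCE = new MetricRegistry();
    private final Map<Integer, SkriptMetric> metrics = new HashMap<>();

    private MetricRegistry() {
    }

    static MetricRegistry get() {
        return INSTANCE;
    }

    /** Creates an inactive metric if the id is unknown; otherwise returns the existing one unchanged. */
    synchronized SkriptMetric register(int id) {
        return metrics.computeIfAbsent(id, SkriptMetric::new);
    }

    /** Drops a metric, for example when its script is disabled. */
    synchronized void remove(int id) {
        metrics.remove(id);
    }
}
```

With this shape, a script that runs register, add and start again after a reload gets the already active instance back, so the repeated add and start calls do nothing instead of duplicating graphs.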
EDIT: sorry for the huge yapping in this section, this might not be super clear. If anyone has any doubt about this implementation, finds anything unclear or wants to suggest a different approach I'll gladly help!
The not-so-good part of making an addon is, well, that owners have to download extra plugins just for metrics. While I believe people might do this to help developers, this is definitely not gonna help the spread of this feature, but is just gonna bind devs to relying on a separate addon that not everyone might want to use (because it only does this task) when distributing the skript. Having it inside of Skript would make all of this completely seamless and the metrics would be collected the same way as normal plugins.
Why?
This feature would be a huge boost in the motivation of developers that release their scripts for free online, as it will actually verify and show how many people are using their skripts.
Other
Now, is this actually a good idea?
My take on this is that, yes, as a developer seeing stats is extremely motivating. And no, the "Skript Developers are not real developers and are just unexperienced devs that have no ethics or morals" (quote) is just not a valid reason. Any dev should be able to track data about their work (anonymously, ofc), especially if there is a platform such as bStats that is willing to allow it.
Also, allowing this sort of syntax on a Skript level does not create any extra security issue in any way (speaking of end-user usage). Any plugin can leak non-anonymous information such as IP addresses if it's malicious, just as any script can. The only difference is that scripts are actually open source and easy to read, so as an owner it might be even easier to spot suspicious lines such as
`send value %ip of player% to bstats metrics`
In conclusion, I'm making this issue to see how the team might welcome this feature. If this is well received I hope to be able to begin working on a PR and have this added to Skript :)
Agreement
- I have read the guidelines above and affirm I am following them with this suggestion.
It seems rather impractical to me, perhaps it would be more beneficial as it is as an addon
Mh, the point is having a separate addon and requiring server owners to download it specifically is even more impractical.
Also I'm sure we can figure out syntax to make the building process of a SkriptMetric class easy and intuitive.
My main worries are about the malleability of scripts. Plugins have the advantage of being rather gatekept in knowledge; you can't super easily go in and edit values unless you know what you're doing and have the tools. This means that the reported values will usually always be accurate. With scripts, you don't have this guarantee at all. I worry this will lead to a single stat id having dozens of slightly different versions out there, all reporting different or false information back. It's also super easy to poison stats that way.
I think if this were to be added, we'd probably be best off not adding custom metrics and keeping it to a fixed structure.
While yes, Skripts are definitely easier to edit and more malleable, no real harm could be done to individual metrics. bStats has filter options in place where you can add your own filters to your graphs if you notice any weird data. This alone would clean up any edited strings (such as script version strings). But again, owners usually open Skripts to edit options for messages, features or item types. Editing version strings or stuff like that doesn't even happen that often, and is as much of a problem as it is in the Java plugin world.
bStats has filter options in place where you can add your own filters to your graphs if you notice any weird data
This is one of the issues Skript is trying to avoid: prevention is better than cure. Even if bStats are happy with collaborating with Skript to implement such feature, these sort of data can be heavily abused or manipulated wrongly where bStats might retract the implementation in future after realising how wrong this will potentially go. Users should not have to worry about security issues, even indirect ones as in this case, and this is definitely one of the concerns I can foresee from my point of view as a SkriptLang member.
This is one of the issues Skript is trying to avoid: prevention is better than cure.
bStats was created exactly with that in mind. Data manipulation, if you are a malicious user, is easy as hell, whether you're on Skript or on Java. They have security options in place, but in order to collect metrics that's how it has to be done, and they have experience handling the requests load.
Even if bStats are happy with collaborating with Skript to implement such feature, these sort of data can be heavily abused or manipulated wrongly where bStats might retract the implementation in future after realising how wrong this will potentially go.
Manipulating Metrics would only fake more users to a skript or just show no data at all, as if metrics were disabled. And if people are sending custom strings with offensive stuff or leaking data via custom charts, well, again that can be filtered out, but it can be done as easily via a malicious POST request.
They already gave the idea their blessing, so I don't see why we're still debating about whether or not they will like this.
The only case where we need to be cautious, is with the implementation: Scripts get reloaded much more often than Plugins and if we are not careful with how creating metrics classes works, we'll risk creating a new instance after each sk reload and end up duplicating data. This would be unintended behaviour and would spam requests without the user even knowing, and that's where we'd be at fault. (This is already handled in my Addon, for example.)
data can be heavily abused or manipulated
This is true; users could easily
- Delete the lines of code, preventing the information from being logged
- Edit it so that the information is logged to a different service / statistics
- Manipulate it to report higher statistics
These should be listed as cons rather than treated as a reason to write the idea off.
The sole intention of this addition is to allow developers to view the statistics of the script(s) that they publicly release, for a sense of gratification. It is not to gloat or compete with others, which is all that options 2 and 3 would serve. If users were to do either, I have no clue what they would gain from it besides a false sense of accomplishment.
Even though this feature would not be useful to all users of Skript, it would be a great addition to the ones that would not only use it, but use it correctly.
bStats might retract the implementation in future after realising how wrong this will potentially go.
Only time will tell I presume
I agree with the arguments that Skript doesn't seem suitable for such an implementation, given how easily values in scripts can be edited.
Another argument I'd like to add to this conversation is utility. As far as plugins are concerned, I can see useful data for a plugin developer that would enable him to better understand how his plugin is used by its users. As far as Skript is concerned, I don't see what useful information could be sent as metrics, apart from a potential "script version".
Which brings me to another question: will it really be used? Do developers have a strong interest in such a metrics system? In fact, this is the first time I've seen anyone ask for such a feature, which leads me to wonder whether it's worth it, given the various counter-arguments that have been expressed above.
If such a feature were to be added to Skript, it would obviously be necessary, in my opinion, to be able to disable the sending of script metrics specifically (either by default, or through the Skript configuration).
What's the problem with stats?
We're not against the idea in principle, but we have some major concerns about its practicality.
Currently, our biggest concern is that the data will not provide any accurate or useful value.
We worry that the data will be, at best, inconsistent with reality and, at worst, completely inaccurate.
This is because scripts exist in a chop-and-change ecosystem where users are able to (read: encouraged to) download things and play with them or rewrite them entirely, at which point they're no longer a thing that should be counted in the original statistic.
As a result, numbers are likely to be significantly over-reported, and so the data will not provide any valuable knowledge to the developer.
Statistics are not there to be an ego boost. If the reporting is inaccurate (beyond a respectable error bar) then this whole idea is defunct.
Why are plugins different from scripts?
Plugin statistics are possible because a plugin is a concrete and identifiable thing. It has a namespace, a signature, a manifest. We have version numbers, Git hashes and all sorts of data zipped inside the plugin. It's really easy to tell what is and isn't your plugin. Of course, you can decompile and edit little bits, but you'd never do that on a large scale because it's impractical to do so rather than rebuilding your own plugin from its source.
Compiled plugins avoid the issues above by being annoying to edit, more than anything else.
A script is a text file: it's small, it's easy to edit, you can chop and change bits as you like, and users are encouraged to do so. Very few users download a script and don't go in and edit something. Once a user has downloaded and edited your script, it is no longer yours. It might be completely unrecognisable from yours: the user might delete half of it, or paste bits and pieces into their own scripts.
How do we recognise the identity of a text file? How do we know when it has changed incomparably from its original version?
Can this be resolved?
Sort of. The easiest way to correct for modifications would be to keep an eye on the hash of the file (which we already have) or something based on its constituent structures.
The hard part is determining when something has been unreasonably modified. Obviously, changing a few options at the top of the file shouldn't invalidate the metrics, and I don't see a problem with tweaking a few strings here and there (people like to add themed colours, or add their server name in messages), but once you start to mess with the actual functional content of the script (e.g. removing some commands? Adding your own function? Breaking the script up across multiple files?) then it should no longer report to the same metric.
Personally, I think a tree & branch hash is probably the best way to estimate this, but it'll probably take some experimenting to come up with a reasonable idea.
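As a rough illustration of what a tree and branch hash could look like, the sketch below hashes each top-level structure of a script separately (the branches) and then hashes the sorted branch hashes into one root value. Everything here is an assumption made for the example: the class and method names are hypothetical, top-level structures are detected purely by indentation, the `options:` section and comment lines are ignored, and edits to strings inside a structure still change its hash, so this does not yet tolerate the "tweaking a few strings" case.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical fingerprinting helper; not part of Skript. */
final class ScriptFingerprint {

    /** SHA-256 of the given text as lowercase hex. */
    static String sha256(String text) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(text.getBytes(StandardCharsets.UTF_8))) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    /** One hash per top-level structure, skipping blank lines, comment lines and the options section. */
    static Set<String> branchHashes(String script) {
        List<String> structures = new ArrayList<>();
        StringBuilder current = null;
        for (String raw : script.split("\n")) {
            String line = raw.stripTrailing();
            if (line.isBlank() || line.stripLeading().startsWith("#")) {
                continue;
            }
            boolean topLevel = !Character.isWhitespace(line.charAt(0));
            if (topLevel || current == null) {
                if (current != null) {
                    structures.add(current.toString());
                }
                current = new StringBuilder();
            }
            current.append(line).append('\n');
        }
        if (current != null) {
            structures.add(current.toString());
        }
        Set<String> hashes = new TreeSet<>();
        for (String structure : structures) {
            if (!structure.startsWith("options:")) {
                hashes.add(sha256(structure));
            }
        }
        return hashes;
    }

    /** Root hash over the sorted branch hashes, so reordering structures does not change it. */
    static String rootHash(String script) {
        return sha256(String.join("\n", branchHashes(script)));
    }

    /** Fraction of the original's structures that survive unchanged in the candidate. */
    static double overlap(String original, String candidate) {
        Set<String> originalHashes = branchHashes(original);
        if (originalHashes.isEmpty()) {
            return 0.0;
        }
        Set<String> candidateHashes = branchHashes(candidate);
        long shared = originalHashes.stream().filter(candidateHashes::contains).count();
        return (double) shared / originalHashes.size();
    }
}
```

The overlap value is one possible way to decide when a script has been "unreasonably modified": a threshold on it would be a tuning choice that needs the experimenting mentioned above.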
I think the easiest way of accomplishing this would be to have some kind of publishing command that takes in a script and generates its version-identity hash, which would need to be submitted to the metrics tracking site as an acceptable version.
The site would have to exclude requests that submitted an unexpected hash (i.e. had incomparably changed from a real version).
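On the receiving side, the publish-then-filter idea could be as small as an allowlist of accepted hashes per metric id. The sketch below is hypothetical (no such endpoint exists on bStats as far as this thread establishes) and treats the version-identity hash as an opaque string.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical server-side allowlist of published version hashes, keyed by metric id. */
final class VersionAllowlist {
    private final Map<Integer, Set<String>> accepted = new HashMap<>();

    /** Called by the (assumed) publishing command when the author releases a version. */
    void publish(int metricId, String versionHash) {
        accepted.computeIfAbsent(metricId, k -> new HashSet<>()).add(versionHash);
    }

    /** A report counts only if its hash matches a published version of that metric. */
    boolean accepts(int metricId, String reportedHash) {
        return accepted.getOrDefault(metricId, Set.of()).contains(reportedHash);
    }
}
```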
Security is a non-issue
There is practically no chance of syntax being included to collect custom graphs or charts.
We would provide the basic statistics through the plugin itself: Minecraft version, Skript version, server & player count, maybe some non-invasive insights into the script itself.
The script would just need to include a structure for the metric ID:
metrics id 17044
There would be no avenue for users to submit raw text or data beyond that.
More than this, I don't see what would be valuable to collect (without taking up half the script file with some kind of chart section).
Thanks for the reply, it was really helpful and honestly, yeah, Skript's nature is a "chop off parts and mix in together functions" for a lot of players. And yeah, once a script gets mixed in with other parts, metrics become basically useless, as you're not tracking your work anymore but what is left of it.
I found your approach with hashes really interesting, and it might actually solve the problem mentioned above. I'll try to get the author of bStats here so that they can comment on this, and we can see if they are able to implement such a feature when registering a Skript on the website.
And yeah, while usage of custom charts would be really valuable to most, actually adding it in a natural-language syntax such as Skript might be hard (for example, to build a Drilldown Pie Chart you need to build a nested map <string, <string, int>> for the category, value and weight).
If in the end bStats is not able to add the features we might require to track Skripts, we could make our own platform for this type of tracking and design it as we like.
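To make the shape of that drilldown data concrete, here is a small plain-Java sketch of the nested map of category, value and weight. It does not call bStats at all; the class name and the example version strings are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical builder for drilldown data: category -> (value -> weight). */
final class DrilldownData {
    private final Map<String, Map<String, Integer>> data = new LinkedHashMap<>();

    /** Adds weight to one value inside a category, creating both levels on demand. */
    void add(String category, String value, int weight) {
        data.computeIfAbsent(category, k -> new LinkedHashMap<>())
            .merge(value, weight, Integer::sum);
    }

    Map<String, Map<String, Integer>> asMap() {
        return data;
    }
}
```

Even this minimal form needs two levels of keys per entry, which hints at why expressing it in a natural-language syntax gets verbose quickly.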
Hey there, bStats author here
I think there was a small misunderstanding here, maybe caused by me not being familiar with the Skript ecosystem. From my understanding, one goal of Skript is to allow server owners to write individual scripts for their own server. This is what I had in mind when @Dere-Wah approached me with the idea: Allow server owners to write scripts that can collect stats for their own server, mainly by using custom charts.
For example, if a server owner is interested in how many diamonds are mined on their server, they could create a custom line chart that tracks the "mining rate" of diamonds over time.
Edit: This ofc. only makes sense when server owners want to share the stats with their users. Otherwise an offline solution that just dumps the data into a CSV file or similar is most likely the better choice.
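For the offline alternative, a minimal sketch of a CSV dump might look like this. The class name, column names and the idea of recording one row per sampling interval are assumptions for the example, not something bStats or Skript provides.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical in-memory log of diamonds mined per sampling interval, dumped as CSV. */
final class DiamondRateLog {
    private final List<String> rows = new ArrayList<>();

    /** Records how many diamonds were mined in the interval ending at the given time. */
    void record(long epochSeconds, int diamondsMined) {
        rows.add(epochSeconds + "," + diamondsMined);
    }

    /** Header line followed by one line per recorded interval. */
    String toCsv() {
        StringBuilder csv = new StringBuilder("timestamp,diamonds_mined\n");
        for (String row : rows) {
            csv.append(row).append('\n');
        }
        return csv.toString();
    }

    /** Writes the CSV to disk, replacing any existing file. */
    void writeTo(Path file) {
        try {
            Files.writeString(file, toCsv());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```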
I'll try to get the author of bStats here so that they can comment on this, and we can see if they are able to implement such a feature when registering a Skript on the website.
It may be possible (or necessary) to send Skript data via our own proxy in which we can do the filtering and such, but then we'd need to set our proxy up with bstats. We have a few ideas but I think it will take time to brainstorm.
And yeah, while usage of custom charts would be really valuable to most, actually adding it in a natural-language syntax such as Skript might be hard (for example, to build a Drilldown Pie Chart you need to build a nested map <string, <string, int>> for the category, value and weight).
Yeah, my worry is that you will have to spend more of your file on setting up the chart than you actually have functional script.
If in the end bStats is not able to add the features we might require to track Skripts, we could make our own platform for this type of tracking and design it as we like.
This is also a possibility, and it's something that we explored about 3 years ago (alongside some kind of Skript package manager). It's a lot of work to build and maintain though, and I couldn't find any suppliers who would be willing to sponsor the hosting/networking/proxy costs long-term.
We've got lots of options, we just need to find something that can work with our ecosystem.
This is what I had in mind when @Dere-Wah approached me with the idea: Allow server owners to write scripts that can collect stats for their own server, mainly by using custom charts.
After thinking about it some more, I'm not so sure how well suited bStats is for this use case. bStats is based on the assumption that there are many servers sending data to bStats, and that small "mistakes" (e.g. data not being sent due to server restarts) cancel out. This is not the case for single-server statistics. A tailor-made solution for server owners could also do many things differently than bStats, which has to make some compromises due to its scale (no live data, etc.). So at best, bStats can cover this use case, but it will probably only be a suboptimal solution.
For example, if a server owner is interested in how many diamonds are mined on their server, they could create a custom line chart that tracks the "mining rate" of diamonds over time.
@Bastian You got half of the idea there. Our fellow user here suggested that they would like to create a (potentially) useful resource using Skript, which is an English-like scripting language, and monitor the usage of their scripts through bStats, which is not feasible due to the concerns mentioned above.