Searching tag changes by a user? AKA ask before making major changes

Posted under General

TheXev has taken to tagging a number of non-loli images as loli, which is a problem. See for example post #136048 which is not loli.

Unfortunately I don't know of a good way to look at all his/her changes and revert all the problematic ones. I know tracking all of a user's changes like that might be hell on the servers, so is there some other good way to go about it?

Edit: And now that I'm seeing just how many he changed... Please refrain from making large-scale tag changes without consulting the mods, albert, or the forums. Particularly as regards the loli tag, because improperly applying it to something like 100 posts means most users suddenly can't see those images anymore.

Edit2: Since there's roughly 100 changes per site tag history page, I fixed almost 200 posts just in the nanoha tag alone. If anyone else finds other tags that were affected on a similar scale, either fix them if you're confident or let a mod know and we'll look at it.

Updated by 葉月

I think the group of changes around page ~13 in the tag history (that's 13 as of typing this message, it'll be significantly further back depending on when someone is reading this) was it, and they seem to be 90% Nanoha, so I think it's by and large taken care of for now.

Point stands for the future though.

I think it's an indicator of a bigger issue: mods don't have appropriate tools to police, fix, and (in particular) query things in a way that would let us react to events such as that.

The things we miss for sure:
- see all wiki edits by a user
- see all comments by a user
- see (and revert) all changes by a user
- see rating, source, parent changes on a post
- identify anonymous commenters (though that's partially taken care of by LaC's patch, even if I disagree somewhat on the ideal solution here)

Let's discuss what we need here, and convert that into a good set of tickets later.

NOTE: this discussion is intended for mods about mod tools. Unless you are a mod or believe you have a *really* good idea about what is needed, please don't post here. I will prune comments that don't follow that to keep the discussion focused.

葉月 said:
(葉月's entire post was quoted here, but you can just read it above)

I'm not sure if I can post here. I'm an admin on another site, *so I hope I'm not in the way*; if I am, please forgive me.

I think a good addition would be displaying a given user's IP to moderators/admins. It's true that there are a lot of dynamic IPs out there, but there are still a lot of static IPs which can be easily banned. (This would also let us search for users with multiple accounts.)

Danbooru itself doesn't need to be able to ban IPs, since that can easily be done server-wide in apache/lighttpd/route, but I still think it would be nice.

Updated by Shuugo

- see all wiki edits by a user

Ok.

- see all comments by a user

Are bad comments really worth querying? I think there are too many individual bad commenters for this to be a useful statistic.

- see (and revert) all changes by a user

I think the post_tag_histories table has enough data for this to be possible.
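
As a rough illustration of how "see (and revert) all changes by a user" could work on top of that data, here is a minimal sketch. It assumes each history row carries a post id, the editing user, and the full tag string after the edit; the tuple layout and the function name `revert_targets` are hypothetical and not taken from the actual schema.

```python
def revert_targets(versions, user):
    """Suggest tag strings to restore for posts last edited by `user`.

    `versions` is a list of (post_id, author, tags) tuples, oldest first.
    This row layout is an assumption made for illustration only.
    Returns {post_id: tags of the most recent version by someone else}.
    """
    by_post = {}
    for post_id, author, tags in versions:
        by_post.setdefault(post_id, []).append((author, tags))

    targets = {}
    for post_id, history in by_post.items():
        if history[-1][0] != user:
            continue  # someone else has edited since; leave it for manual review
        earlier = [tags for author, tags in history if author != user]
        if earlier:
            targets[post_id] = earlier[-1]  # last state not written by `user`
    return targets


versions = [
    (1, "albert", "vocaloid"),
    (1, "TheXev", "vocaloid loli"),
    (2, "TheXev", "nanoha"),           # only version: nothing to revert to
    (3, "TheXev", "nanoha loli"),
    (3, "jxh2154", "nanoha"),          # already fixed by someone else
]
print(revert_targets(versions, "TheXev"))
```

Posts that someone else has already touched are skipped on purpose, so a bulk revert would not clobber later fixes.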

- see rating, source, parent changes on a post

I could add a rating field to post_tag_histories and just rename the table to post_versions. I'm not convinced source and parent are worth recording, though.

Querying alone isn't enough. We need reporting. For example, aggregating tag changes by user by week, to spot people who make mass changes.
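
A report like that could be sketched roughly as follows. The input rows (user name plus the date of the change) and the threshold are made up for illustration; real rows would come from the tag history table, whose exact columns are assumed here.

```python
from collections import Counter
from datetime import date


def changes_per_user_week(rows):
    """Count tag changes per (user, ISO year, ISO week)."""
    counts = Counter()
    for user, day in rows:
        iso = day.isocalendar()
        counts[(user, iso[0], iso[1])] += 1
    return counts


def heavy_editors(rows, threshold):
    """Return the (user, year, week) keys with at least `threshold` changes."""
    counts = changes_per_user_week(rows)
    return sorted(key for key, n in counts.items() if n >= threshold)


# Hypothetical sample data, not real site history.
rows = [
    ("TheXev", date(2008, 1, 7)),
    ("TheXev", date(2008, 1, 8)),
    ("TheXev", date(2008, 1, 9)),
    ("albert", date(2008, 1, 8)),
    ("TheXev", date(2008, 1, 15)),
]
print(heavy_editors(rows, threshold=3))
```

On a real site the threshold would be far higher than 3; the point is only that a weekly roll-up makes mass changes stand out.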

albert said: Are bad comments really worth querying? I think there are too many individual bad commenters for this to be a useful statistic.

There are times I wish I'd had it. It's good to know when considering an invite. That said, I don't consider it critical and feel I can do the mod thing fine without it.

- see (and revert) all changes by a user
I think the post_tag_histories table has enough data for this to be possible.

Are you saying "The data is there so just use post/tag_history" or "The data is there so I think I'll implement this?" Either way, as my top post indicates, I'd be for implementing this.

I could add a rating field to post_tag_histories and just rename the table to post_versions. I'm not convinced source and parent are worth recording, though.

I'm somewhat indifferent to all three, but having them wouldn't be "bad". I agree that source and parent are much less important though. They all have uses in specific situations, but you'd need to weigh what sort of server load this would add versus the number of cases where it would actually be useful.

How about recording histories in the form of deltas? E.g.:

albert|vocaloid hatsune_miku rating:q|
葉月|spring_onion|
TheXev|loli rating:x|
jxh2154|-loli zettai_ryouiki|
LaC|rating:q|

This would be more compact and possibly make it simpler to look for specific changes made by a user. The downside would be having to reconstruct the full intermediate states when displaying the data (by playing back the history).

Generating the delta would also be useful for the purpose of making the "recent tags" list work better.
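
To make the playback cost concrete, here is a minimal sketch of reconstructing the intermediate states from the example above. It assumes the `user|delta|` line format shown, with a plain tag meaning "add", a leading `-` meaning "remove", and `rating:` replacing the rating; the function names are hypothetical.

```python
def apply_delta(tags, rating, delta):
    """Apply one space-separated delta to a (tags, rating) state."""
    tags = set(tags)
    for token in delta.split():
        if token.startswith("rating:"):
            rating = token[len("rating:"):]
        elif token.startswith("-"):
            tags.discard(token[1:])
        else:
            tags.add(token)
    return tags, rating


def replay(history):
    """Play back delta lines, returning (user, sorted tags, rating) per step."""
    tags, rating = set(), None
    states = []
    for line in history:
        user, delta = line.split("|")[:2]
        tags, rating = apply_delta(tags, rating, delta)
        states.append((user, sorted(tags), rating))
    return states


history = [
    "albert|vocaloid hatsune_miku rating:q|",
    "葉月|spring_onion|",
    "TheXev|loli rating:x|",
    "jxh2154|-loli zettai_ryouiki|",
    "LaC|rating:q|",
]
for state in replay(history):
    print(state)
```

Each displayed version needs every earlier delta applied first, which is the reconstruction cost mentioned above.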

albert said:
- see all comments by a user

Are bad comments really worth querying? I think there are too many individual bad commenters for this to be a useful statistic.

Yes, it's worth it. For inviting, and also for the situations where I go "oh God, that's a dumb comment.. Wait, haven't I seen him being stupid before?". Being able to see aggregated comments would be very useful for connecting dots, as stupidity tends to be amassed in certain special individuals.

- see rating, source, parent changes on a post

I could add a rating field to post_tag_histories and just rename the table to post_versions. I'm not convinced source and parent are worth recording, though.

Again, connect the dots thing. There have been past cases of heavy and annoying vandalism on those, and being able to see if there are any particular offenders could help with fighting that.

Querying alone isn't enough. We need reporting. For example, aggregating tag changes by user by week, to spot people who make mass changes.

Well, yes, of course. That was sort of implied, but I guess discussing the form of that information helps as well. Site-wide overview would of course be useful for spotting any anomalies.

LaC said:
This would be more compact and possibly make it simpler to look for specific changes made by a user. The downside would be having to reconstruct the full intermediate states when displaying the data (by playing back the history).

I don't see anything that would prevent making it a toggle. You could even calculate deltas client-side, it's not a particularly complex computation. And yes, deltas would be handy.
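
For what it's worth, the computation is roughly the following. This is a sketch in Python rather than client-side script, assuming each stored version gives the full tag set and rating; `make_delta` is a hypothetical name.

```python
def make_delta(old_tags, old_rating, new_tags, new_rating):
    """Describe the change between two full states as a delta string."""
    old_tags, new_tags = set(old_tags), set(new_tags)
    parts = sorted(new_tags - old_tags)                       # added tags
    parts += ["-" + t for t in sorted(old_tags - new_tags)]   # removed tags
    if new_rating != old_rating:
        parts.append("rating:" + new_rating)
    return " ".join(parts)


print(make_delta({"vocaloid", "loli"}, "x", {"vocaloid", "zettai_ryouiki"}, "x"))
```

Since it is just two set differences and a comparison, toggling between full states and deltas at display time should be cheap.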
