Guildbook

Trying to help remove duplicates in GuildbookAvatarPickerMixin

Yarillo4 opened this issue · 5 comments

commented

Hello
I was browsing the source code and saw this TODO

https://github.com/stpain/guildbook/blob/master/Guildbook.lua#L1043

    -- there are lots of duplicates in these textures
    -- TODO: remove duplicates at some point

I found a tool called BLPNG on GitHub, made by @izzyus, that converts BLP files into PNGs.

I compiled it and used it to convert every avatar into a PNG.

I then used a website (https://www.similar.pictures) to find out which images were similar to each other. It found 67 clusters.

[Screenshot 2021-11-19 at 16-41-07: Duplicate image finder on disk]

Then I converted those paths into fileIDs.

proof.md

Here are all the duplicates, minus one for each cluster, so that you can keep just one

{1066025,1066036,1066039,1066045,1066051,1066072,1066077,1066088,1066096,1066102,1066117,1066155,1066157,1066265,1066297,1066338,1066365,106642,1066423,1067180,1067197,1067232,1067241,1067256,1067264,1067265,1067276,1067279,1067284,1067292,1067295,1067299,1067301,1067302,1067305,106311,1067312,1067322,1067323,1067326,1067328,1067334,1067336,1067340,1067341,1067342,1067374,1067375,1067377,1067387,1067393,1067397,1067408,167410,1067411,1067412,1067416,1067417,1067419,1067423,1067444,1067454,1067455,1067459,1080907,1108820,1112914,1112927,1138400,1138403,1138413,1138418,1138419,1138420,1138421,1138422,1138424,1138425,1341729,1341751,1341752,1341766,1341794}
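For what it's worth, the "minus one for each cluster" step can be sketched roughly like this. It is only an illustration: `collectDuplicates` and `exampleClusters` are hypothetical names, the IDs are made up, and it assumes each cluster is a plain array of fileIDs whose first entry is the one to keep.

```lua
-- Hypothetical helper: each cluster is an array of fileIDs for images that
-- look the same. Keep the first ID of each cluster, report the rest.
local function collectDuplicates(clusters)
    local duplicates = {}
    for _, cluster in ipairs(clusters) do
        -- Start at index 2 so the first image of the cluster survives.
        for i = 2, #cluster do
            duplicates[#duplicates + 1] = cluster[i]
        end
    end
    return duplicates
end

-- Made-up IDs purely for illustration, not real avatar fileIDs.
local exampleClusters = {
    { 101, 102, 103 },
    { 201, 202 },
}
local duplicates = collectDuplicates(exampleClusters)
```

With the two example clusters, `duplicates` ends up holding 102, 103 and 202, while 101 and 201 are kept.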

I could do a PR, but you'll probably want to include the data in some data file somewhere, and I don't know how you'd like to go about it. So there you go.

commented

Hi, wow, that's amazing! So are these fileIDs unique?

commented

So are those fileIDs the unique fileIDs or the duplicates?

commented

Hey, they are the duplicates.

commented

Right, OK, so I need to filter those IDs out. Many thanks!

commented

Indeed :) If you filter those out, it will remove just the duplicates. One picture is left in for every cluster, so the pictures are unique again after filtering.
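A minimal sketch of that filtering, in case it helps. The names `removeDuplicates`, `avatarFileIDs` and `duplicateFileIDs` are placeholders, not anything from Guildbook itself, and the IDs in the example call are made up. It assumes the avatars are held as a plain array of fileIDs.

```lua
-- Hypothetical helper: returns a new array containing every entry of
-- avatarFileIDs that does not appear in duplicateFileIDs.
local function removeDuplicates(avatarFileIDs, duplicateFileIDs)
    -- Build a lookup set so each membership check is a single table access.
    local isDuplicate = {}
    for _, id in ipairs(duplicateFileIDs) do
        isDuplicate[id] = true
    end

    local unique = {}
    for _, id in ipairs(avatarFileIDs) do
        if not isDuplicate[id] then
            unique[#unique + 1] = id
        end
    end
    return unique
end

-- Made-up IDs purely for illustration.
local filtered = removeDuplicates({ 1, 2, 3, 4 }, { 2, 4 })
```

Here `filtered` holds 1 and 3, in their original order. Building the set first keeps the filter to one pass over each list instead of rescanning the duplicates for every avatar.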