Bad hash function?
hayesj1 opened this issue ยท 8 comments
ImmersiveRailroading/src/main/java/cam72cam/immersiverailroading/proxy/ChunkPos:hashcode()
seems borked off the top of my head. Based on the math there, two chunks in the same dimension and with chunk1.x == chunk2.z
and chunk1.z == chunk2.x
should have the same hash value. Isn't that a significant problem? See Line 35 of ChunkPos.java
The hashcode() function should mirror the behavior of the equals() function. In this case, equals() has been overridden to compare the x, z, and dimension of chunks rather than their memory address, so it is proper that hashcode() reflects that. It is important for things like hashtables that objects which are considered equal have the same hashcode, even if they are not the same object.
right, I'm not disputing the fields used in the computation of hash code, but the computation itself; basically because of the laws of addition(specifically the Commutative Law), the current computation should result in collisions(within the very reasonable constraints I described in the original post) for Hash-based data structures. Which, if I read the code correctly, is exactly what is used in the ChunkManager class(a hashmap).
Oh I see, I misread your comment.
Now that I look closer, you're exactly right. And furthermore they don't even have to be cross-equal like that. As long as chunk1.x+chunk1.z == cunk2.x + chunk2.z (eg, along a world diagonal), the hashcodes will collide.
One of those values should probably be squared or something along those lines.
and that's not even considering the dimension part. It's probably not a bad idea to mod by 7 or some other prime number between steps too. Something like:
int hash = 0; hash = (hash + dim) % 7; hash = (hash + chunk.x) % 7; hash = (hash + chunk.y) % 7; return hash
and maybe throw in some other bit of randomness or square something to deal with the case you brought up.
IIRC Java's hash function does not need a collision free hash. It's just used to do quick map bucket partitioning. Once it has a collection of objects that match, it uses .equals for the final compare.
The Java Doc for HashMap mentions implementing Comparable to help reduce the performance cost of having collisions. I'd be happy implement that interface on ChunkPos if you think it's a good idea Cam.
After some more digging, it probably won't help that much unless a player is either A) purposely trying to break lag the game, or B) accidentally builds a railway in the fashion that would create collisions. For both cases, the track configuration would most likely be caused by two parallel, diagonal tracks that are also parallel to a line intersecting (0, 0) at a 45 degree angle. Otherwise there shouldn't be a significant number of colliding ChunkPos instances.
In rare event that case does crop up however, there could be big gains; apparently in Java 8 Hashmap switches over to balanced binary trees for it's buckets after a certain size threshold is reached. I'd think having ChunkPos not implement Comparable could push the worst case away from the theoretical O(log n).