Immersive Railroading

Immersive Railroading

4M Downloads

Bad hash function?

hayesj1 opened this issue ยท 8 comments

commented

ImmersiveRailroading/src/main/java/cam72cam/immersiverailroading/proxy/ChunkPos:hashcode() seems borked off the top of my head. Based on the math there, two chunks in the same dimension and with chunk1.x == chunk2.z and chunk1.z == chunk2.x should have the same hash value. Isn't that a significant problem? See Line 35 of ChunkPos.java

commented

The hashcode() function should mirror the behavior of the equals() function. In this case, equals() has been overridden to compare the x, z, and dimension of chunks rather than their memory address, so it is proper that hashcode() reflects that. It is important for things like hashtables that objects which are considered equal have the same hashcode, even if they are not the same object.

commented

right, I'm not disputing the fields used in the computation of hash code, but the computation itself; basically because of the laws of addition(specifically the Commutative Law), the current computation should result in collisions(within the very reasonable constraints I described in the original post) for Hash-based data structures. Which, if I read the code correctly, is exactly what is used in the ChunkManager class(a hashmap).

commented

Oh I see, I misread your comment.

Now that I look closer, you're exactly right. And furthermore they don't even have to be cross-equal like that. As long as chunk1.x+chunk1.z == cunk2.x + chunk2.z (eg, along a world diagonal), the hashcodes will collide.

One of those values should probably be squared or something along those lines.

commented

and that's not even considering the dimension part. It's probably not a bad idea to mod by 7 or some other prime number between steps too. Something like:
int hash = 0; hash = (hash + dim) % 7; hash = (hash + chunk.x) % 7; hash = (hash + chunk.y) % 7; return hash
and maybe throw in some other bit of randomness or square something to deal with the case you brought up.

commented

IIRC Java's hash function does not need a collision free hash. It's just used to do quick map bucket partitioning. Once it has a collection of objects that match, it uses .equals for the final compare.

commented

The Java Doc for HashMap mentions implementing Comparable to help reduce the performance cost of having collisions. I'd be happy implement that interface on ChunkPos if you think it's a good idea Cam.

commented

After some more digging, it probably won't help that much unless a player is either A) purposely trying to break lag the game, or B) accidentally builds a railway in the fashion that would create collisions. For both cases, the track configuration would most likely be caused by two parallel, diagonal tracks that are also parallel to a line intersecting (0, 0) at a 45 degree angle. Otherwise there shouldn't be a significant number of colliding ChunkPos instances.

In rare event that case does crop up however, there could be big gains; apparently in Java 8 Hashmap switches over to balanced binary trees for it's buckets after a certain size threshold is reached. I'd think having ChunkPos not implement Comparable could push the worst case away from the theoretical O(log n).

commented

Ok, let's leave it for now. It's simple and considering the size of the map (few hundred entries at worst) it should work just fine.