First you can document the exact algorithm. Look at DO_HASHMEM in pike_memory.h, called from low_do_hash in stralloc.c.
Then, for the empirical approach, modify your Pike to log all new strings, run some big applications, and see how the real performance in terms of collisions compares to a theoretical optimum.
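A rough sketch of how the logged strings could be evaluated, in Python for brevity. Note that toy_hash is only a djb2-style stand-in and is NOT DO_HASHMEM; you would have to port the real algorithm to get meaningful numbers. The function names and the table size are my own assumptions, and I use an ideal uniform hash as the baseline for the "theoretical optimum".

```python
from collections import Counter


def toy_hash(s: bytes) -> int:
    """Stand-in hash (djb2-style). NOT Pike's DO_HASHMEM; replace with a port."""
    h = 5381
    for byte in s:
        h = (h * 33 + byte) & 0xFFFFFFFF
    return h


def count_collisions(strings, table_size, hash_fn=toy_hash):
    """Count distinct strings that land in an already occupied bucket."""
    unique = set(strings)  # duplicates of the same string are not collisions
    buckets = Counter(hash_fn(s) % table_size for s in unique)
    return len(unique) - len(buckets)


def expected_collisions(n, table_size):
    """Expected collisions for n distinct keys under an ideal uniform hash."""
    occupied = table_size * (1.0 - (1.0 - 1.0 / table_size) ** n)
    return n - occupied


if __name__ == "__main__":
    # Hypothetical sample; in practice read the strings from your log file.
    sample = [b"foo", b"bar", b"baz", b"foo", b"quux"]
    size = 8  # assumed table size, for illustration only
    print("observed:", count_collisions(sample, size))
    print("uniform baseline:", expected_collisions(len(set(sample)), size))
```

Comparing the two numbers over several table sizes should show whether the hash degrades on real strings.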
The analytical approach is more difficult, since you first have to create a probability model for how strings look and then run that model through the algorithm.
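To give an idea of what "running the model through the algorithm" could mean, here is a toy version in Python. The assumptions are all mine: characters are independent with fixed probabilities, all strings have the same length, and the hash is a simple multiply-and-add step reduced modulo the table size (again, not DO_HASHMEM). Under those assumptions the hash state is a Markov chain over the buckets, so the bucket distribution can be computed exactly.

```python
def bucket_distribution(char_probs, length, table_size, multiplier=33, seed=5381):
    """Exact bucket distribution after `length` i.i.d. characters.

    Models h = (h * multiplier + c) % table_size at every step. This toy
    hash is an assumption for illustration, not Pike's algorithm.
    """
    dist = [0.0] * table_size
    dist[seed % table_size] = 1.0  # all probability mass starts at the seed
    for _ in range(length):
        nxt = [0.0] * table_size
        for state, p in enumerate(dist):
            if p == 0.0:
                continue
            for byte, q in char_probs.items():
                nxt[(state * multiplier + byte) % table_size] += p * q
        dist = nxt
    return dist


def pair_collision_probability(dist):
    """Probability that two independent strings land in the same bucket."""
    return sum(p * p for p in dist)


if __name__ == "__main__":
    # Hypothetical skewed model: 'e' is much more common than 'q' or 'z'.
    model = {ord("e"): 0.7, ord("q"): 0.2, ord("z"): 0.1}
    dist = bucket_distribution(model, length=6, table_size=16)
    print("pair collision probability:", pair_collision_probability(dist))
    print("uniform optimum:", 1.0 / 16)
```

The closer the pair collision probability is to 1/table_size, the better the hash spreads strings from that model. A realistic string model would need dependent characters and varying lengths, which is where it gets difficult.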
Sounds like fun. At the very least I would like to read the report...