You shouldn't conflate the current gc with the question of whether or not to free things immediately. The current gc is incredibly inefficient compared to any reasonably modern gc algorithm.
To begin with, it's not mark-and-sweep but check-mark-and-sweep, i.e. there's an entire extra pass over the whole heap just to determine the external refs.
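To make the extra pass concrete, here is a minimal sketch of a check-mark-and-sweep cycle on a toy heap. It is not Pike's actual gc code; the `Obj` class and every function name are made up for illustration, and the only assumption is that each object carries a total refcount covering both heap-internal and external references.

```python
# Hypothetical toy heap: every name here is illustrative, not Pike's gc code.
class Obj:
    def __init__(self, name):
        self.name = name
        self.refs = []        # outgoing references to other heap objects
        self.refcount = 0     # total references, internal and external
        self.marked = False

def add_ref(src, dst):
    """Record a heap-internal reference from src to dst."""
    src.refs.append(dst)
    dst.refcount += 1

def add_external_ref(obj):
    """Record a reference from outside the heap (stack, globals, ...)."""
    obj.refcount += 1

def check_mark_sweep(heap):
    # Check pass: walk the whole heap to count heap-internal references.
    internal = {id(o): 0 for o in heap}
    for o in heap:
        for target in o.refs:
            internal[id(target)] += 1
    # An object with more refs than the heap accounts for has external refs.
    roots = [o for o in heap if o.refcount > internal[id(o)]]

    # Mark pass: everything reachable from the roots is live.
    for o in heap:
        o.marked = False
    stack = list(roots)
    while stack:
        o = stack.pop()
        if not o.marked:
            o.marked = True
            stack.extend(o.refs)

    # Sweep pass: another full walk to drop whatever was not marked.
    return [o for o in heap if o.marked]
```

Note that the check pass and the sweep pass each touch every object in the heap regardless of how little has changed, which is the cost being complained about.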
Aside from that, a decent mark-and-sweep gc doesn't go over the whole heap the way the naive implementation in Pike does. Rather, a tri-color scheme is used to skip data that hasn't been touched since the previous gc run (see http://en.wikipedia.org/wiki/Garbage_collection_(computer_science)#Basic_alg...).
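For reference, a minimal sketch of the tri-color marking step itself, assuming the heap is given as a plain dict from object name to the names it references. The function name and the graph representation are hypothetical; a real collector would keep the colors in object headers and combine this with a write barrier so that untouched (black) data need not be rescanned.

```python
WHITE, GREY, BLACK = "white", "grey", "black"

def tricolor_mark(graph, roots):
    """Color every node: BLACK is reachable, WHITE is garbage.

    graph maps each object name to the list of names it references.
    """
    color = {node: WHITE for node in graph}
    grey = []                      # worklist of discovered, unscanned nodes
    for r in roots:
        if color[r] == WHITE:
            color[r] = GREY
            grey.append(r)
    while grey:
        node = grey.pop()
        for child in graph[node]:
            if color[child] == WHITE:
                color[child] = GREY
                grey.append(child)
        color[node] = BLACK        # all children seen, never scanned again
    return color
```

The useful invariant is that a black object never points directly at a white one, so once the grey worklist is empty everything still white can be freed.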
So in any case, we should toss the current gc. It's just dog slow. It's a museum piece.
Using refcounts to free things immediately does not actually improve performance. The research paper shows that delaying the gc a bit can improve performance by as much as 40% in high-concurrency server apps. All the refcount twiddling needed to get immediate frees is a sizable performance hog on short-lived data; running mark-and-sweep on it (and _only_ on that new data) is a lot more efficient. The gc would run much more often, of course, but it would do _way_ less work in each run. Remember that the gc work would be proportional only to the amount of change in the heap, not to its size.
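One way to get gc work proportional to the change rather than the heap size is a generational-style nursery with a write barrier. The sketch below is an assumption about how that could look, not something the paper or Pike prescribes; the `Heap` class and its method names are hypothetical, and removing references is not modeled.

```python
class Heap:
    def __init__(self):
        self.old = set()         # objects that survived an earlier run
        self.young = set()       # objects allocated since the last run
        self.refs = {}           # object name -> list of referenced names
        self.remembered = set()  # young objects referenced from old objects

    def alloc(self, name):
        self.young.add(name)
        self.refs[name] = []

    def write(self, src, dst):
        """Store a reference; the write barrier records old -> young pointers."""
        self.refs[src].append(dst)
        if src in self.old and dst in self.young:
            self.remembered.add(dst)

    def minor_collect(self, roots):
        """Trace only young objects; return how many objects were visited."""
        stack = [r for r in roots if r in self.young] + list(self.remembered)
        live = set()
        while stack:
            name = stack.pop()
            if name in live:
                continue
            live.add(name)
            # Old objects are assumed live and are never traversed.
            stack.extend(c for c in self.refs[name] if c in self.young)
        for dead in self.young - live:
            del self.refs[dead]   # free short-lived garbage in one batch
        self.old |= live          # survivors are promoted
        self.young = set()
        self.remembered = set()
        return len(live)
```

With a thousand old objects and three fresh allocations of which one is stored into an old object, a run visits exactly one object; the old part of the heap is never walked.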
Aside from that, there are also arguments that freeing many things at the same time lessens fragmentation and improves locality.