Is that actually making the code faster, or is it only to simplify the logic of the problem you are discussing?
The latter. Just to avoid complicating the example with an irrelevant special case.
What about code that expects this operation to fail?
mixed error = catch { a[x] = i; };
if (error) write("ooops, we reached the end\n");
Yes, that'd be a compatibility problem too, strictly speaking. I don't think it's very significant, so if #pike 7.8 solves it then that'd be enough, imho. Do you expect it to be a real problem?
Anyway, Per pointed out in a personal response another much more prevalent case, namely string concatenation:
s += "foo";
Considering this, it's almost a requirement to figure out a way to keep the single-ref-destructive-update optimizations for thread-local things referenced from the stack.
I'm toying with an idea to introduce a single bit for synchronous refcounting: Consider a flag MULTI_REF which is set when a thing gets its second ref (regardless of whether the ref is from the stack or elsewhere). Single-ref-destructive-update optimizations would then only be done if the flag is clear. It'd only be possible to clear it from the gc, and it wouldn't be applicable if the thing is shared.
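To make the flag idea concrete, here is a minimal sketch as a standalone C model. It is not actual Pike interpreter code: struct thing, add_ref, free_ref, can_update_destructively and gc_visit are all made-up names for illustration, and the exact refs counter is only an assumption standing in for whatever the gc would determine about the real number of references. The point it shows is that the flag is sticky: dropping back to one ref does not re-enable the optimization until the gc has cleared the flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a refcounted thing (not the real Pike struct). */
struct thing {
    int  refs;       /* stand-in for the true number of references */
    bool multi_ref;  /* MULTI_REF: set on the second ref, cleared only by the gc */
    bool shared;     /* thing is shared, so the flag scheme is not applied */
};

/* A fresh thing has exactly one ref and is thread local. */
void thing_init(struct thing *t)
{
    t->refs = 1;
    t->multi_ref = false;
    t->shared = false;
}

/* Adding a ref sets MULTI_REF as soon as there are two refs,
 * regardless of where the new ref comes from. */
void add_ref(struct thing *t)
{
    t->refs++;
    if (t->refs >= 2)
        t->multi_ref = true;
}

/* Dropping a ref never clears MULTI_REF; only the gc may do that. */
void free_ref(struct thing *t)
{
    assert(t->refs > 0);
    t->refs--;
}

/* Destructive update (e.g. appending in place for s += "foo") is only
 * allowed for unshared things whose MULTI_REF flag is clear. */
bool can_update_destructively(const struct thing *t)
{
    return !t->shared && !t->multi_ref;
}

/* The gc knows the true ref count and may clear the flag again. */
void gc_visit(struct thing *t)
{
    if (!t->shared && t->refs <= 1)
        t->multi_ref = false;
}
```

With this model, a thing that briefly had two refs stays on the slow (copying) path until the next gc pass. That is conservative but safe, which seems to be the trade-off the single bit is meant to buy.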