Huh, I think the guarantee we want to provide is:
“An API function called is guaranteed not to crash after the callback returns.”
This guarantee holds no matter what the callback function is set to, which corresponds to what you called “in any ‘stage’”. Now, of course, this guarantee doesn’t say anything if the callback itself causes a crash (and the default callback does) because then the callback never returns, so there’s no “after the callback returns”.
Now that I’ve written this, I’m not sure what this actually means. In simple cases, the only thing the API function does after the callback returns is to return 0. This shouldn’t crash. But there are more complex cases:
- Some functions try to read more arguments from memory, e.g., secp256k1_ec_pubkey_cmp tries to read the memory at pubkey1 even after an invalid pubkey0 triggered the callback. This may crash if pubkey1 points to an invalid memory region.
- Some functions write to their output arguments after the callback has returned (see #1736 for another rabbit hole…).
 
Perhaps what we should want to say here is something like this: “An API function called is guaranteed not to crash after the callback returns (unless the crash is unrelated to the violation that triggered the callback)”. But this is imprecise and ugly.
Perhaps we should drop this sentence entirely. It creates more confusion than it resolves. What about this?
“Should the callback return, the return value and output arguments of the API function call are undefined. Moreover, the same API call may trigger the callback again in this case.”
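To make the “may trigger the callback again” part concrete, here is a toy model in C. This is only a sketch: every toy_* name is hypothetical and none of this is libsecp256k1 code. It just mimics the pattern of a comparison function that loads both arguments in turn, with a callback that returns instead of aborting.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy types; none of these names exist in libsecp256k1. */
typedef void (*toy_callback_fn)(const char *msg, void *data);

typedef struct {
    toy_callback_fn illegal_fn;
    void *illegal_data;
} toy_context;

typedef struct {
    int valid;
    unsigned char bytes[4];
} toy_key;

/* A callback that returns instead of aborting; it only counts its calls. */
static void toy_counting_callback(const char *msg, void *data) {
    (void)msg;
    ++*(int *)data;
}

/* Load a key. On a violation, invoke the callback and, if it returns,
 * substitute all-zero bytes and report failure. */
static int toy_key_load(const toy_context *ctx, unsigned char out[4],
                        const toy_key *key) {
    assert(ctx != NULL && ctx->illegal_fn != NULL);
    if (!key->valid) {
        ctx->illegal_fn("invalid key", ctx->illegal_data);
        memset(out, 0, 4);
        return 0;
    }
    memcpy(out, key->bytes, 4);
    return 1;
}

/* Compare two keys. key1 is dereferenced even after key0 has triggered
 * the callback, and it may trigger the callback a second time. */
static int toy_key_cmp(const toy_context *ctx, const toy_key *key0,
                       const toy_key *key1) {
    unsigned char a[4], b[4];
    toy_key_load(ctx, a, key0);
    toy_key_load(ctx, b, key1);
    return memcmp(a, b, 4);
}

/* Returns how often the callback fired during one call with two invalid keys. */
static int toy_demo_double_trigger(void) {
    int count = 0;
    toy_context ctx = { toy_counting_callback, &count };
    toy_key bad0 = { 0, {1, 2, 3, 4} };
    toy_key bad1 = { 0, {5, 6, 7, 8} };
    (void)toy_key_cmp(&ctx, &bad0, &bad1); /* return value is meaningless here */
    return count;
}
```

In this model, a single call to toy_key_cmp with two invalid keys invokes the callback twice, and the value it returns says nothing about the inputs. If key1 were a dangling pointer instead of a merely invalid key, the second load could crash, which is the case the old wording failed to cover.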
Note that all of this is about API guarantees, so the audience is API users. As a result, “debug stage” means “debug stage of a program using libsecp256k1”. (And as a side remark, note that the callbacks can be set at runtime, so strictly speaking, this is not (necessarily) about debug vs. release builds.) So if you develop a program (an application or another library) that uses the libsecp256k1 API, you may want to set a callback that does not crash for debugging purposes. I’m not entirely sure how useful this functionality is to API users, but this is what the docs here describe.
In practice, where this is useful is exactly in our internal tests, as you correctly point out. But what I’m trying to say is that these are our internal tests, where we may do anything because we control the implementation of the API functions. So we could also have an internal way of setting a counting callback that is not exposed through the public API.
If you ask me, the main purpose of setting callback functions is that you can control how the library crashes in production code. Maybe in your application, you don’t want abort() but a more graceful termination. Or you can’t use the default callback because it writes an error message to stderr but you don’t have stderr because you’re developing for a hardware wallet. (In the latter case, setting callbacks at runtime won’t help you because the program won’t compile in the first place. We’ve added compile-time overrides for this case.)
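As a rough sketch of the “graceful termination” case, a callback could avoid stdio entirely, keep a copy of the message, and hand control to an application-defined shutdown routine. The app_* names and the struct layout below are hypothetical. The only thing taken from the library is the callback shape, void (*)(const char *, void *).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical application state, passed as the callback's data pointer. */
typedef struct {
    char last_message[64];         /* truncated copy of the library's message */
    int shutdown_requested;        /* set by the sample shutdown routine below */
    void (*shutdown)(void *self);  /* application-defined termination routine */
} app_fault_state;

/* Sample shutdown routine that only sets a flag. In production this would
 * be something that does not return, e.g. a device reset. */
static void app_request_shutdown(void *self) {
    ((app_fault_state *)self)->shutdown_requested = 1;
}

/* Callback with the shape void (*)(const char *, void *); uses no stdio. */
static void app_illegal_callback(const char *message, void *data) {
    app_fault_state *st = (app_fault_state *)data;
    size_t n = strlen(message);
    if (n >= sizeof(st->last_message)) {
        n = sizeof(st->last_message) - 1;
    }
    memcpy(st->last_message, message, n);
    st->last_message[n] = '\0';
    st->shutdown(st); /* expected not to return in production */
}
```

Assuming an existing context ctx and an app_fault_state named state, registration at runtime would look like secp256k1_context_set_illegal_callback(ctx, app_illegal_callback, &state). If the shutdown routine does return, as the flag-setting sample does, the caveats above about undefined return values and output arguments apply.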
And on a last note, I don’t think all of this is optimally designed. I believe we’d do some things differently if we had a chance to start from scratch.