Handling of duplicate keys in JSON #20949

issue laanwj opened this issue on January 16, 2021
  1. laanwj commented at 11:47 AM on January 16, 2021: member

    Our handling of duplicate keys in JSON dictionaries differs from that of other languages such as Python and JavaScript (and most others I know of).

    Those languages parse the JSON into hash tables, where the last assignment to a key takes effect, e.g.:

    >>> json.loads('{"a": "first", "a": "second"}')
    {'a': 'second'}
    

    Our univalue, though, stores every (key,value) pair in an array, and when a key is queried it returns the first associated value.

    #include <iostream>
    #include <univalue.h>
    
    int main() {
        UniValue val;
        val.read("{\"a\": \"first\", \"a\": \"second\"}");
        std::cout << val["a"].write() << std::endl;
        return 0;
    }
    
    $ g++ unitest.cpp -o unitest -Iunivalue/include -L src/univalue/.libs -lunivalue
    $ ./unitest 
    "first"
    

    This is not generally a problem, but it can create subtle interoperability issues. For example, a proxy that validates allowed JSON-RPC commands and only lets through certain command/argument combinations (without taking this into account) could be misled in this way.

    Either reversing this order or adding a "strict mode" that rejects duplicate keys might make sense. I don't know.

    (issue originally reported by Florian Mathieu)

  2. laanwj added the label RPC/REST/ZMQ on Jan 16, 2021
  3. prusnak referenced this in commit a715d489dc on Jan 16, 2021
  4. prusnak referenced this in commit c79576dd75 on Jan 16, 2021
  5. prusnak referenced this in commit 14c735bde4 on Jan 16, 2021
  6. prusnak referenced this in commit 8e7fc4f871 on Jan 16, 2021
  7. prusnak commented at 6:36 PM on January 16, 2021: contributor
  8. prusnak referenced this in commit d492dd4eee on Jan 16, 2021
  9. luke-jr commented at 5:13 PM on January 17, 2021: member

    I think it would be best to error if duplicate keys are found where not supported.

  10. laanwj commented at 2:50 PM on January 19, 2021: member

    Right, detecting and failing on duplicated keys seems to be the preferred way forward here.

    Changing the behaviour in univalue is technically the easiest option, but I doubt there will ever be agreement on what the right behavior is, and it would break software that makes the (currently correct) assumption.

  11. prusnak commented at 2:58 PM on January 19, 2021: contributor

    Wouldn't it make sense to drop UniValue completely if duplicate keys are unwanted? Even simple std::map with std::variant could suffice in that case, no?

  12. laanwj commented at 3:02 PM on January 19, 2021: member

    No, there are plenty of places where the JSON type makes a difference. Dropping our JSON parsing library over this seems bizarre.

  13. ubbabeck commented at 4:32 PM on January 27, 2026: none

    Hi,

    I was investigating whether this issue still requires work and wanted to share my findings. I am fairly new to the bitcoin core codebase so excuse my ignorance.

    Current State

    Commit 6bd1d20b (PR #26628) added duplicate detection at the RPC layer in https://github.com/bitcoin/bitcoin/blob/27aeeff630147cca35488752c01aaf5750b69fbc/src/rpc/server.cpp#L380.

    However, UniValue itself still:

    • Accepts duplicate keys during parsing
    • Stores both values internally but only returns the first
    • Makes duplicate values inaccessible (silent data loss)

    Where Should Detection Happen?

    Option 1: Parser Level (UniValue)

    • Pros: Complete protection everywhere, prevents data loss, consistent behavior
    • Cons: Breaks the test at rpc_tests.cpp:96-97 (which expects an RPC-specific error), generic error messages, performance cost

    Option 2: RPC Layer (Current)

    • Pros: Already implemented, specific error messages, non-breaking
    • Cons: Doesn't protect nested objects like {"outer": {"key": 1, "key": 2}}, inconsistent behavior across contexts
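    The nested-object gap can be illustrated with Python's json module. The find_duplicate_keys hook below is an illustrative helper (not Bitcoin Core code); it uses the real object_pairs_hook parameter of json.loads, which is invoked for every object at every nesting level:

```python
import json

def find_duplicate_keys(pairs):
    # json.loads calls this hook for every object it decodes, at every
    # nesting level, so duplicates inside nested objects are caught too.
    seen = set()
    for key, _ in pairs:
        if key in seen:
            raise ValueError(f"duplicate key: {key}")
        seen.add(key)
    return dict(pairs)

nested = '{"outer": {"key": 1, "key": 2}}'

# Default parsing silently keeps the last value (silent data loss),
# and a check on top-level keys alone cannot fire: "outer" is unique.
print(json.loads(nested))  # {'outer': {'key': 2}}

# Parser-level detection catches the nested duplicate:
try:
    json.loads(nested, object_pairs_hook=find_duplicate_keys)
except ValueError as exc:
    print(exc)  # duplicate key: key
```

    The same trade-off noted under Option 1 applies here: the error is generic and carries no RPC-specific context.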

    Questions for Discussion

    1. Should UniValue (general JSON library) enforce unique keys, or is this a consumer policy?
    2. Does the nested object vulnerability need parser-level protection?
    3. Should we add a strict parsing mode option to UniValue, as suggested in the issue description by @laanwj?

    Proposed Solutions

    I am not aware of an existing plan for the UniValue implementation, but here are two possible solutions I see.

    Short term: Document current behavior and limitations

    Long term: Consider adding optional strict mode

    val.read(json, UniValue::STRICT_MODE);  // Rejects duplicates
    

    This allows RPC to use strict mode while maintaining backward compatibility for other uses.

    Happy to work on whichever approach maintainers prefer. I have a working parser-level implementation but it breaks the existing test expectations, suggesting RPC-layer detection was the intended design.

  14. maflcko commented at 5:11 PM on January 27, 2026: member

    Is there a use-case for silently continuing anywhere? I haven't looked deeper, but I think univalue should just reject duplicates at the lowest level while parsing or constructing any dict.

  15. ubbabeck commented at 7:00 PM on January 27, 2026: none

    Whether Bitcoin Core has any use-cases for silently continuing, I do not know; my assumption is no, but I will look further into this.

    When it comes to JSON, even RFC 8259 isn't clear on what is correct and only states: "The names within an object SHOULD be unique." Though from how I understand and work with JSON it's mentally mapped as a MUST, so I agree with @maflcko's statement that univalue should reject duplicates.

    One use-case I found for duplicate keys is for merging of objects, but I have never encountered this in the wild during my career.
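    That merge use-case can be emulated in Python with json.loads's object_pairs_hook; merge_duplicates is a hypothetical helper written for this sketch, not an established convention:

```python
import json

def merge_duplicates(pairs):
    # Treat a repeated key as a merge request: later objects are merged
    # into earlier ones, while later scalars simply overwrite.
    out = {}
    for key, value in pairs:
        if key in out and isinstance(out[key], dict) and isinstance(value, dict):
            out[key].update(value)
        else:
            out[key] = value
    return out

doc = '{"cfg": {"a": 1}, "cfg": {"b": 2}}'
print(json.loads(doc, object_pairs_hook=merge_duplicates))
# {'cfg': {'a': 1, 'b': 2}}
```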

    Considering how to implement this: do you think it makes more sense to throw a runtime error, as is done in univalue_get.cpp? https://github.com/bitcoin/bitcoin/blob/c0e6556e4f51b454c3f6e9069044761c34e99f81/src/univalue/lib/univalue_get.cpp#L56

    This would allow developers to identify and handle exceptions effectively, instead of returning a bool as is done today.

  16. maflcko commented at 7:58 PM on January 27, 2026: member

    I would have recommended starting by adding sanity checks to pushKV (and friends), because they are only used internally and, I think, assumed/expected not to add duplicate keys, so they could throw a runtime error on a violation.

    However, this does not seem so trivial, because duplicate checking could bring the calls back to O(n^2), which apparently makes a difference in some cases: 710a7136f93133bf256d37dc8c8faf5a6b9ba89d
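    One way to avoid that quadratic cost would be to keep a hash-based index of keys alongside the ordered storage. The DictSketch class below is a stand-in sketch of the idea in Python, not the real UniValue API: the list preserves insertion order (as UniValue's internal vector does), while the auxiliary set makes each duplicate check O(1) on average, at the cost of extra memory for the key index.

```python
class DictSketch:
    """Stand-in for UniValue's object storage (illustrative only)."""

    def __init__(self):
        self.entries = []   # (key, value) pairs in insertion order
        self._seen = set()  # auxiliary index for O(1) duplicate checks

    def push_kv(self, key, value):
        # Without the set, each insert would scan self.entries, making
        # n inserts O(n^2); the set keeps the whole run near O(n).
        if key in self._seen:
            raise RuntimeError(f"duplicate key: {key}")
        self._seen.add(key)
        self.entries.append((key, value))

d = DictSketch()
d.push_kv("a", "first")
try:
    d.push_kv("a", "second")   # rejected instead of silently stored
except RuntimeError as exc:
    print(exc)  # duplicate key: a
```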


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-04-13 15:14 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me