The current txindex uses the full 32-byte txid as keys, which takes up about 66 GB of disk space today on mainnet. Using a 5-byte key prefix instead drops the disk usage to 32 GB - cutting the size to less than half.
Using the full 32-bytes is unnecessary since a 5-byte salted siphash will produce collisions in about 1 in 1.1 trillion. Some collisions will occur, but the penalty is just an extra disk read, deserialization and hash.
The tx position can be appended to the key instead of used as a value, and a LevelDB iterator can seek to the prefix and then scan for the correct tx. This is an almost identical approach to txospenderindex.
If a tx is not found with this method, we fallback to looking up the legacy entry. With this method a user with an existing db can opt to erase the indexes/txindex folder and reindex, or keep the current index and new entries will be appended with the smaller footprint.
The time to index was faster on my machine with this method, 1h32m vs current 1h50m.
Lookups are roughly the same, around 0.2ms per lookup with getrawtransaction.
When testing on mainnet, I got ~860k 2-way collisions, and 1 3-way collision that worst case could cause an extra 2 false positives when reading.