* [bitcoindev] On (in)ability to embed data into Schnorr
@ 2025-10-01 14:24 waxwing/ AdamISZ
2025-10-01 22:10 ` Greg Maxwell
` (4 more replies)
0 siblings, 5 replies; 19+ messages in thread
From: waxwing/ AdamISZ @ 2025-10-01 14:24 UTC (permalink / raw)
To: Bitcoin Development Mailing List
Hi all,
https://github.com/AdamISZ/schnorr-unembeddability/
Here I'm analyzing whether the following statement is true: "if you can
embed data into a (P, R, s) tuple (Schnorr pubkey and signature, BIP340
style), without grinding or using a sidechannel to "inform" the reader, you
must be leaking your private key".
See the abstract for a slightly more fleshed out context.
I'm curious about the case of P, R, s published in utxos to prevent usage
of utxos as data. I think this answers in the half-affirmative: you can
only embed data by leaking the privkey so that it (can) immediately fall
out of the utxo set.
(To emphasize, this is different to the earlier observations (including by
me!) that just say it is *possible* to leak data by leaking the private
key; here I'm trying to prove that there is *no other way*).
However I still am probably in the large majority that thinks it's
appalling to imagine a sig attached to every pubkey onchain.
Either way, I found it very interesting! Perhaps others will find the
analysis valuable.
Feedback (especially of the "that's wrong/that's not meaningful" variety)
appreciated.
Regards,
AdamISZ/waxwing
--
You received this message because you are subscribed to the Google Groups "Bitcoin Development Mailing List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bitcoindev+unsubscribe@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/bitcoindev/0f6c92cc-e922-4d9f-9fdf-69384dcc4086n%40googlegroups.com.
* Re: [bitcoindev] On (in)ability to embed data into Schnorr
2025-10-01 14:24 [bitcoindev] On (in)ability to embed data into Schnorr waxwing/ AdamISZ
@ 2025-10-01 22:10 ` Greg Maxwell
2025-10-01 23:11 ` Andrew Poelstra
2025-10-03 13:24 ` Peter Todd
` (3 subsequent siblings)
4 siblings, 1 reply; 19+ messages in thread
From: Greg Maxwell @ 2025-10-01 22:10 UTC (permalink / raw)
To: waxwing/ AdamISZ; +Cc: Bitcoin Development Mailing List

Intuitively it sounds likely, just in that the available values are an
image on the curve and a value summed with a hash dependent on everything
else. I think it would be hard to prove.

But is it even really worth the analysis when grinding gets you a 12%
embedding rate in that signature at not that significant a cost (because you
can independently grind the nonce and the signature itself, or the nonce and
the pubkey)? And when, beyond the cost of the additional signature (making
the output 3x its cost), requiring signing when forming the address
completely kills public derivation, multisig with cold keys, etc.? And then
whatever spam concerns people have would likely be exacerbated by the
spammers using more resources due to the embedding rate?

Also, re a leaked private key making the output fall out of the utxo set:
well, not so if it's part of an explicit multisig. E.g. 2 of 2 with leaked
key and a secure one.
* Re: [bitcoindev] On (in)ability to embed data into Schnorr
2025-10-01 22:10 ` Greg Maxwell
@ 2025-10-01 23:11 ` Andrew Poelstra
2025-10-02 0:25 ` waxwing/ AdamISZ
0 siblings, 1 reply; 19+ messages in thread
From: Andrew Poelstra @ 2025-10-01 23:11 UTC (permalink / raw)
To: Bitcoin Development Mailing List

On Wed, Oct 01, 2025 at 10:10:16PM +0000, Greg Maxwell wrote:
> Intuitively it sounds likely, -- just in that the available values are a
> image on the curve and a value summed with a hash dependent on everything
> else. I think it would be hard to prove.
>
> But is it even really worth the analysis when grinding gets you a 12%
> embedding rate in that signature at not that significant cost? (because you
> can independently grind the nonce and signature itself, or nonce and
> pubkey) -- and when beyond the cost of the additional signature (making the
> output 3x its cost) requiring signing when forming the address completely
> kills public derivation, multisig with cold keys. etc? ... and then any of
> whatever spam concerns people have would likely be exacerbated by the
> spammers using more resources due to the embedding rate?
>

Some time ago, I talked to Ethan Heilman about this in the context of PQ
signatures, and he made the interesting point that you can think of
12% embedding rate as representing an 8x discount for real signatures vs
embedded data. And that maybe that's okay, incentive-wise.

Needing to grind out portions of 32-byte blocks probably also reduces
the risk from people trying to embed virus signatures or other malicious
data.

As for waxwing's original question -- I also intuitively believe that
the only way to embed data in a Schnorr signature is by grinding or
revealing your key ... and I'm not convinced you can do it even by
revealing your key. (R is an EC point that you can't force to be any
particular value except by making a NUMS point, which you then can't use
to sign; and s = k + ex where e is a hash of kG (among other things)
so I don't think you can force that value at all.)

--
Andrew Poelstra
Director, Blockstream Research
Email: apoelstra at wpsoftware.net
Web: https://www.wpsoftware.net/andrew

The sun is always shining in space
    -Justin Lewis-Webster
* Re: [bitcoindev] On (in)ability to embed data into Schnorr
2025-10-01 23:11 ` Andrew Poelstra
@ 2025-10-02 0:25 ` waxwing/ AdamISZ
2025-10-02 15:56 ` waxwing/ AdamISZ
0 siblings, 1 reply; 19+ messages in thread
From: waxwing/ AdamISZ @ 2025-10-02 0:25 UTC (permalink / raw)
To: Bitcoin Development Mailing List

Hi Greg, Andrew, list,

Answers to Greg then Andrew:

> E.g. 2 of 2 with leaked key and a secure one.

That's a very good point! I was narrowly focused on the signature scheme,
but Bitcoin is more than a signature scheme!

> But is it even really worth the analysis when grinding gets you a 12%
> embedding rate in that signature at not that significant cost? (because you
> can independently grind the nonce and signature itself, or nonce and
> pubkey) -- and when beyond the cost of the additional signature (making the
> output 3x its cost) requiring signing when forming the address completely
> kills public derivation, multisig with cold keys. etc? ... and then any of
> whatever spam concerns people have would likely be exacerbated by the
> spammers using more resources due to the embedding rate?

I certainly don't think it's worth *doing* (hence my use of the term
"appalling idea" :) ), as per the things you mention there. I wrote the
document as a mostly academic investigation. It would be nice to be surer
what the limits are, although I suspect we're all reasonably confident of
what is/isn't possible.

> 12% embedding rate

Where do you get that number from? 33% for embedding 256 bits in (P, R, s)
(but as per this discussion, according to me, at the cost of key leakage).
If we include the other bytes in a (taproot anyway) utxo that's not much
less, I guess 30% ish. I could try to guess but it'd be easier if you told
me :)

to Andrew:

> As for waxwing's original question -- I also intuitively believe that
> the only way to embed data in a Schnorr signature is by grinding or
> revealing your key ... and I'm not convinced you can do it even by
> revealing your key. (R is an EC point that you can't force to be any
> particular value except by making a NUMS point, which you then can't use
> to sign; and s = k + ex where e is a hash of kG (among other things)
> so I don't think you can force that value at all.)

Ah, I see what you're saying, it's a subtly different target. ECDSA allows
that s be controlled, Schnorr doesn't, but I set up the game as "adversary
must be able to publish a function f such that f(any published R, s, (e)) =
data", i.e. not just f = identity function. That was why I wrote in the
introduction (copied here for convenience:)

"Data can effectively be embedded in signatures by using a
publically-inferrable nonce, as was noted
\href{https://groups.google.com/g/bitcoindev/c/d6ZO7gXGYbQ/m/Y8BfxMVxAAAJ}{here}
and was later fleshed out in detail
\href{https://blog.bitmex.com/the-unstoppable-jpg-in-private-keys/}{here}
(\textbf{note}: both these sources discuss nonce-reuse but it's worse than
that: any \emph{publically inferrable} nonce can achieve the same thing,
such as, the block hash of the parent block; this will have the same
embedding rate and cannot be disallowed)."

It may be a different target "politically" :) but I was only thinking
technically, in terms of how people might end up using outputs. From a
technical point of view it makes no difference if f is the identity or
something more complex (as long as it's efficiently computable).

Cheers,
AdamISZ/waxwing
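The "publicly inferable nonce" game described in the message above can be sketched at the scalar level. This is only an illustration under stated assumptions: the helper names (`h`, `embed`, `extract`), the choice of deriving the nonce by hashing a public seed, and passing the challenge `e` in as an opaque public scalar are all hypothetical and not taken from the paper or from BIP340. No curve points are computed; the sketch only shows that once k is recomputable by the reader, s = k + e*x hands over the private key x, which is where the data sits.

```python
import hashlib

# Order of the secp256k1 group; the scalars k, e, x and s all live mod N.
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141

def h(*parts: bytes) -> int:
    """SHA-256 of the concatenated parts, reduced to a scalar mod N."""
    return int.from_bytes(hashlib.sha256(b"".join(parts)).digest(), "big") % N

def embed(data: bytes, public_seed: bytes, e: int) -> int:
    """Use the data as the private key x and sign with a nonce anyone can derive."""
    x = int.from_bytes(data, "big")
    assert 0 < x < N, "data must encode a valid scalar"
    k = h(public_seed)            # assumption: nonce = hash of some public value
    return (k + e * x) % N        # s = k + e*x

def extract(s: int, public_seed: bytes, e: int) -> bytes:
    """The reader's function f: recompute k, then solve s = k + e*x for x."""
    k = h(public_seed)
    x = (s - k) * pow(e, -1, N) % N   # needs Python 3.8+ for the modular inverse
    return x.to_bytes(32, "big")

# Stand-in for the public challenge e = H(R, P, m); any nonzero scalar works here.
e = h(b"R", b"P", b"m")
s = embed(b"A" * 32, b"parent block hash", e)
print(extract(s, b"parent block hash", e))
```

The reader's f here is not the identity, which is the distinction drawn above: the data is not visible in (R, s) directly but falls out of an efficiently computable function of published values.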
* Re: [bitcoindev] On (in)ability to embed data into Schnorr
2025-10-02 0:25 ` waxwing/ AdamISZ
@ 2025-10-02 15:56 ` waxwing/ AdamISZ
2025-10-02 19:49 ` Greg Maxwell
0 siblings, 1 reply; 19+ messages in thread
From: waxwing/ AdamISZ @ 2025-10-02 15:56 UTC (permalink / raw)
To: Bitcoin Development Mailing List

> > 12% embedding rate
> Where do you get that number from? 33% for embedding 256 bits in (P, R, s)
> (but as per this discussion, according to me, at the cost of key leakage).
> If we include the other bytes in a (taproot anyway) utxo that's not much
> less, I guess 30% ish. I could try to guess but it'd be easier if you told
> me :)

Thinking about it again: to publish data, you have to publish a
transaction! I guess the most economical, paying taproot to taproot, is
about 192 bytes with script path plus the posited extra 64 for the (R,s) in
the output, so yeah that'd be 32 out of 256, 12.5%. Isn't the figure a bit
different for key path though, because no control block? Well it hardly
matters, it's some small fraction in that range.

An interesting mechanical detail in this near-absurd scenario is that if
you wanted to repeatedly publish off the same (presumably a few multiples
of dust level) output, you couldn't also do the leak single key thing,
since you'd lose control to re-spend. So that'd place us in the "explicit
multisig" scenario that Greg mentioned, which I think would only make sense
with legacy script? Kind of a different scenario, also it would be really
weird to update legacy script to take into account a new "you must sign the
pubkeys" rule. Though I guess in this fictional scenario, it might happen
like that. If you did do it with legacy, you'd be publishing bare 2 of 2
multisig. If you did it with taproot, due to how that works, the script is
not published until the output is spent, so I think that's outside what I
was considering ("data in utxo set"). (I guess you could also use something
like a hash lock which might be more efficient). So anyway if you wanted to
do this repeatedly and minimize cost, for whatever strange reason, you'd be
adding another 50-100 bytes each time bringing that % down to like 10% or
less.

But that all became way too hypothetical to even analyze properly :)

Anyway just to reemphasize I certainly wasn't advocating this
sig-attaching system, but it seems important to know what the result of it
would be: we would still not have changed the obvious reality that
embedding data in witness gives more space for data, and is more
economical, and we would only reduce by a big factor how much can be
embedded in outputs (anything from 8% to 15% embedding rate seems possible
depending on the hypothetical details), while having to screw up much of
Bitcoin's functionality in the process.

Cheers,
AdamISZ/waxwing
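The back-of-envelope figures in the message above can be reproduced with a few lines of arithmetic. This is just a sketch using the rough byte counts quoted there (192 bytes for the spend, 64 for the posited (R, s) in the output, 50 to 100 extra bytes for repeated publication); the constant and function names are illustrative only.

```python
def embedding_rate(data_bytes: int, total_bytes: int) -> float:
    """Fraction of the published bytes that carry chosen data."""
    return data_bytes / total_bytes

TX_BYTES = 192        # rough taproot-to-taproot script-path spend, per the message
SIG_IN_OUTPUT = 64    # the posited extra (R, s) attached to the output
DATA = 32             # one leaked 32-byte private key

base = embedding_rate(DATA, TX_BYTES + SIG_IN_OUTPUT)
print(f"single leak: {base:.1%}")  # 12.5%

# Repeated publication: the message estimates another 50-100 bytes each time.
for extra in (50, 100):
    rate = embedding_rate(DATA, TX_BYTES + SIG_IN_OUTPUT + extra)
    print(f"with {extra} extra bytes: {rate:.1%}")
```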
* Re: [bitcoindev] On (in)ability to embed data into Schnorr
2025-10-02 15:56 ` waxwing/ AdamISZ
@ 2025-10-02 19:49 ` Greg Maxwell
2025-10-06 13:04 ` waxwing/ AdamISZ
0 siblings, 1 reply; 19+ messages in thread
From: Greg Maxwell @ 2025-10-02 19:49 UTC (permalink / raw)
To: waxwing/ AdamISZ; +Cc: Bitcoin Development Mailing List

I just meant in the purely grinding non-key leaking case you could get 4
bytes into the nonce pretty easily and 4 bytes into either the pubkey or
signature out of a 64 byte signature. Obviously the delivered embedding
rate in a whole txn will be lower, but maybe not that much thanks to
multisig outputs.
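As a rough check on the grinding figure above, the following sketch works out the rate and an order-of-magnitude cost. It assumes each ground field behaves like a uniformly random 32-byte value, so fixing a b-byte prefix takes about 2^(8b) attempts in expectation; the variable names are illustrative, and the "joint" number is only there to show why grinding the two fields independently matters.

```python
GROUND_BYTES_PER_FIELD = 4   # bytes forced into each ground value
FIELDS_GROUND = 2            # the nonce R, plus either the pubkey or s
SIGNATURE_BYTES = 64         # size of the (R, s) signature

rate = FIELDS_GROUND * GROUND_BYTES_PER_FIELD / SIGNATURE_BYTES
discount = 1 / rate          # the "8x discount" framing from earlier in the thread

# Expected attempts, assuming uniformly random outputs (an assumption here).
tries_per_field = 2 ** (8 * GROUND_BYTES_PER_FIELD)
independent_cost = FIELDS_GROUND * tries_per_field   # grind one field, then the other
joint_cost = tries_per_field ** FIELDS_GROUND        # if both had to match at once

print(rate, discount, independent_cost, joint_cost)
```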
* Re: [bitcoindev] On (in)ability to embed data into Schnorr
2025-10-02 19:49 ` Greg Maxwell
@ 2025-10-06 13:04 ` waxwing/ AdamISZ
0 siblings, 0 replies; 19+ messages in thread
From: waxwing/ AdamISZ @ 2025-10-06 13:04 UTC (permalink / raw)
To: Bitcoin Development Mailing List

Yes, sorry, reading fail on my part (somehow missed that you were
explicitly referring to grinding in the comment).

Still don't think the 12% figure is a good one though? In (P,R,s) it's 8
out of 96 (and as discussed, worse if the whole tx is (realistically)
included), 1/4 the rate you get from direct key leakage. (Plus the perhaps
trivial point that it does actually require work, which might conceivably
matter at scale?). I'm not sure why one would not include P in the measure?

Even an explicit multisig that does not sacrifice control of the output
would be of the order of double the embedding rate, without having to do
work. (P,R,s x 2 = 192 and embed 32 for a 1/6 rate; vs. grinding all 4 P,R
values for a 1/12 rate).
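The fractions compared in the message above can be laid side by side. A small sketch, assuming 32 bytes each for P, R and s and 4 ground bytes per ground value as in the earlier message; the variable names are illustrative.

```python
from fractions import Fraction

TUPLE_BYTES = 96   # P, R and s at 32 bytes each

grind_single = Fraction(4 + 4, TUPLE_BYTES)         # 4 bytes in each of two values
leak_single = Fraction(32, TUPLE_BYTES)             # whole private key as data
multisig_leak = Fraction(32, 2 * TUPLE_BYTES)       # 2-of-2, one key leaked
multisig_grind = Fraction(4 * 4, 2 * TUPLE_BYTES)   # grind all four P, R values

for name, rate in [("grind, single key", grind_single),
                   ("leak, single key", leak_single),
                   ("leak, 2-of-2", multisig_leak),
                   ("grind, 2-of-2", multisig_grind)]:
    print(f"{name}: {rate}")
```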
* Re: [bitcoindev] On (in)ability to embed data into Schnorr
2025-10-01 14:24 [bitcoindev] On (in)ability to embed data into Schnorr waxwing/ AdamISZ
2025-10-01 22:10 ` Greg Maxwell
@ 2025-10-03 13:24 ` Peter Todd
2025-10-04 2:39 ` waxwing/ AdamISZ
2025-10-07 8:22 ` Anthony Towns
` (2 subsequent siblings)
4 siblings, 1 reply; 19+ messages in thread
From: Peter Todd @ 2025-10-03 13:24 UTC (permalink / raw)
To: waxwing/ AdamISZ; +Cc: Bitcoin Development Mailing List

On Wed, Oct 01, 2025 at 07:24:50AM -0700, waxwing/ AdamISZ wrote:
> Hi all,
>
> https://github.com/AdamISZ/schnorr-unembeddability/
>
> Here I'm analyzing whether the following statement is true: "if you can
> embed data into a (P, R, s) tuple (Schnorr pubkey and signature, BIP340
> style), without grinding or using a sidechannel to "inform" the reader, you
> must be leaking your private key".
>
> See the abstract for a slightly more fleshed out context.
>
> I'm curious about the case of P, R, s published in utxos to prevent usage
> of utxos as data. I think this answers in the half-affirmative: you can
> only embed data by leaking the privkey so that it (can) immediately fall
> out of the utxo set.
>
> (To emphasize, this is different to the earlier observations (including by
> me!) that just say it is *possible* to leak data by leaking the private
> key; here I'm trying to prove that there is *no other way*).

You can probably use timelock encryption to ensure that the leak of the
private key only happens in the future, after the funds are recovered by
the owner in a subsequent transaction.

--
https://petertodd.org 'peter'[:-1]@petertodd.org
* Re: [bitcoindev] On (in)ability to embed data into Schnorr
2025-10-03 13:24 ` Peter Todd
@ 2025-10-04 2:39 ` waxwing/ AdamISZ
0 siblings, 0 replies; 19+ messages in thread
From: waxwing/ AdamISZ @ 2025-10-04 2:39 UTC (permalink / raw)
To: Bitcoin Development Mailing List

Hi Peter,

> You can probably use timelock encryption to ensure that the leak of the
> private key only happens in the future, after the funds are recovered by
> the owner in a subsequent transaction.

Another very interesting point, there, to get around the issue of key
leakage ... albeit I don't see a usecase, maybe I'm just not imaginative
enough, very possible. If someone wants to keep something in the utxo set
"forever", it doesn't help. If they want the property of "immediately
accessible in the utxo set" (like "deposit into some fancy system with a
blob of data"; I emphasize "deposit" because that would explain why not
"just put it in the witness", your current outputs don't support that;
correct me if my reasoning is wrong here), then I guess they don't get
that, either: the data is accessible "intermediate term" instead.

Cheers,
AdamISZ/waxwing
* Re: [bitcoindev] On (in)ability to embed data into Schnorr
2025-10-01 14:24 [bitcoindev] On (in)ability to embed data into Schnorr waxwing/ AdamISZ
2025-10-01 22:10 ` Greg Maxwell
2025-10-03 13:24 ` Peter Todd
@ 2025-10-07 8:22 ` Anthony Towns
2025-10-07 12:05 ` waxwing/ AdamISZ
2025-10-31 9:10 ` Tim Ruffing
2025-10-31 13:19 ` Garlo Nicon
4 siblings, 1 reply; 19+ messages in thread
From: Anthony Towns @ 2025-10-07 8:22 UTC (permalink / raw)
To: waxwing/ AdamISZ; +Cc: Bitcoin Development Mailing List

On Wed, Oct 01, 2025 at 07:24:50AM -0700, waxwing/ AdamISZ wrote:
> I'm curious about the case of P, R, s published in utxos to prevent usage
> of utxos as data. I think this answers in the half-affirmative: you can
> only embed data by leaking the privkey so that it (can) immediately fall
> out of the utxo set.

I think you can attack the setup here. If you allow scriptPubKeys in the
utxo set whose spending conditions are HTLC/atomic-swap-like:

  (pubkey A and preimage reveal of X) OR (pubkey B and block height > H)

then you either set H to be arbitrarily far in the future and reveal B's
privkey, or choose a NUMS X with no known preimage, and reveal A's
privkey.

If you don't allow those things (eg, by requiring such constructions also
have a (pubkey musig(A,B)) path) then I think you rule out NUMS-IPK
constructions, and end up making things like vaults ("hotkey with delay,
coldkey anytime") difficult to send to ("I have to sign with my cold key
to request funds?"), or, depending on what the utxo R,s is signing,
encourage key reuse.

> (To emphasize, this is different to the earlier observations (including by
> me!) that just say it is *possible* to leak data by leaking the private
> key; here I'm trying to prove that there is *no other way*).

That seems right to me.
I think if the signature scheme supported pubkey recovery (ie, s*G = R + H(R,m)*P, and our "m" didn't commit to P as well), you could get around this by just having P be the data, with no one, including the "signer" able to recover the private key. > However I still am probably in the large majority that thinks it's > appalling to imagine a sig attached to every pubkey onchain. I think the only thing achieved by embedding data in the utxo set (vs an OP_RETURN output or witness data) is to bloat the utxo set; and if that's the goal, it can equally easily be done with spendable outputs that the attacker simply chooses not to ever spend. So that doesn't seem like a terribly interesting solution to anything. As far as embedding data in signatures goes, I think the following scheme would allow you to publish data in a cryptographically-secure way, with minimal lost funds: 0) Setup secret keys p and q, and a 32-byte secret k. H(a,b,..) is sha256 of a,b,.. concatenated. 1) Split your data into N 31 byte blocks, a1, a2, .., aN. 2) Calculate r0 as H(k*G). Calculate r1, .., rN as: r(i+1) = H(p, r(i)) + a(i) 3) Sign N+1 transactions in a chain spending pubkey p*G, using rN, r(N-1), .., r1, r0 as nonces. All but the final tx should pay to a p*G output to continue the chain; the final output should pay to q*G instead. 4) Once all transactions are sufficiently confirmed, spend the final output with k as the secret nonce (and hence R=k*G as the public nonce). Recover the data using the following process: 1) From the final transaction, recover R=k*G, and calculate r0 as H(R). Recover p from the previous transaction, p = (s0-r0)/H(r0*G, P,mi). 2) Recover ri from each signature; ri = si - H(Ri, P, mi)*p. Recover the data ai as ai = ri - H(p,r(i-1)). Dealing with the points being 32-bytes might require carrying over a sign-bit; but that should be possible in the spare ~7 bits since each block was only 31 bytes not 32 bytes. Left as an exercise for the reader, etc. 
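
For anyone who wants to poke at the embed/recover steps above, here is a
minimal sketch in Python. It is not BIP340: it assumes a simplified
Schnorr over secp256k1 with full (x, y) points, so the sign-bit issue
mentioned above does not arise, and it uses the indexing
r(i) = H(p, r(i-1)) + a(i), which is what the recovery step implies. The
helper names (add, mul, H, embed, recover), the integer "messages"
standing in for sighashes, and the hash-derived secrets are all
hypothetical choices made for illustration only.

```python
import hashlib

# secp256k1 field prime, group order and generator.
FIELD = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def add(A, B):
    """Affine point addition; None is the point at infinity."""
    if A is None:
        return B
    if B is None:
        return A
    (x1, y1), (x2, y2) = A, B
    if x1 == x2 and (y1 + y2) % FIELD == 0:
        return None
    if A == B:
        lam = 3 * x1 * x1 * pow(2 * y1 % FIELD, -1, FIELD) % FIELD
    else:
        lam = (y2 - y1) * pow((x2 - x1) % FIELD, -1, FIELD) % FIELD
    x3 = (lam * lam - x1 - x2) % FIELD
    return (x3, (lam * (x1 - x3) - y1) % FIELD)

def mul(k, A):
    """Double-and-add scalar multiplication (not constant time)."""
    R = None
    while k:
        if k & 1:
            R = add(R, A)
        A = add(A, A)
        k >>= 1
    return R

def H(*items):
    """sha256 of the concatenated items, reduced mod N.
    Ints are 32 bytes big-endian; points are encoded as x || y."""
    data = b""
    for it in items:
        for v in (it if isinstance(it, tuple) else (it,)):
            data += v.to_bytes(32, "big")
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N

def embed(p, k, blocks, msgs):
    """Sign len(blocks)+1 messages under key p using the chained nonces."""
    P, K = mul(p, G), mul(k, G)
    r = [H(K)]                                  # r0 = H(k*G)
    for a in blocks:
        r.append((H(p, r[-1]) + a) % N)         # r(i) = H(p, r(i-1)) + a(i)
    sigs = []
    for ri, m in zip(r, msgs):
        Ri = mul(ri, G)
        sigs.append((Ri, (ri + H(Ri, P, m) * p) % N))
    return P, sigs, K

def recover(P, sigs, msgs, K):
    """Given the revealed K = k*G, recover p and then every data block."""
    r0 = H(K)
    R0, s0 = sigs[0]
    p = (s0 - r0) * pow(H(R0, P, msgs[0]), -1, N) % N
    blocks, prev = [], r0
    for (Ri, si), m in zip(sigs[1:], msgs[1:]):
        ri = (si - H(Ri, P, m) * p) % N
        blocks.append((ri - H(p, prev)) % N)
        prev = ri
    return p, blocks

data = b"data hidden in Schnorr nonces, thirty-one bytes per signature"
blocks = [int.from_bytes(data[i:i + 31], "big") for i in range(0, len(data), 31)]
msgs = [H(i) for i in range(len(blocks) + 1)]   # stand-ins for sighashes
p, k = H(1001), H(1002)                         # stand-ins for random secrets
P, sigs, K = embed(p, k, blocks, msgs)
p_rec, out = recover(P, sigs, msgs, K)
```

In this sketch the signatures are ordinary valid Schnorr signatures for P
until K is published; once it is, both p and the blocks fall out. The
sketch signs in the order r0..rN for simplicity, whereas the scheme above
spends in the order rN..r0 so that the r0 signature is the last in the
chain.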
I believe that the privkey p is secure prior to k*G being revealed, since
all the nonces are distinct hashes seeded by that privkey; and q remains
secure because k is never revealed.

If you wanted to not reuse the pubkey p*G repeatedly, you could tweak it to
be p0 = p, p(i+1) = p + H(k*G, p(i)), or similar. That would allow you
to use an n-of-n multisig to get multiple blocks in a single transaction
without seeming weird, eg.

I believe the only way to distinguish this from a normal transaction
pattern where a wallet has a change output, is via the final transaction
that reveals k*G, and detecting the relationship between k*G and the
spending conditions of the transaction that created the coin being
spent. That's already somewhat expensive to check for every spend,
but could be made more so by publishing k*G on some other medium (ie
the data is in the blockchain, but you obtain the txid and key to find
the data from elsewhere), or by revealing (k+x)*G where x is a random
20-bit (?) number, and a significant but tractable amount of grinding is
needed to recover the desired k*G and decode the data -- the idea being
that that is tractable for someone who knows there is data at that txid,
but not tractable when performed on every signature in the blockchain
in order to filter data publication.

I think if you did 20 such transactions per block, each spending a single
20-of-20 tapscript multisig, you'd get 12400 bytes of data per block
(without violating standardness constraints), at a cost of ~11800vb,
so much less efficient than inscriptions, but slightly more efficient
than OP_RETURN, and significantly less detectable than either. I think
Knots default policy currently allows up to 50-of-50 multisig in
tapscript, which would give you 31kB of data in ~26.6kvB of tx weight
in a block.

If you're regularly making payments from a particular wallet, I think that
procedure would allow you to encode data in your change outputs at the
rate of 32B/tx for no additional cost. Though the data would only be
recoverable once complete, and it's probably worth noting that I haven't
provided any security proofs...

Cheers,
aj
* Re: [bitcoindev] On (in)ability to embed data into Schnorr
2025-10-07 8:22 ` Anthony Towns
@ 2025-10-07 12:05 ` waxwing/ AdamISZ
2025-10-08 5:12 ` Anthony Towns
0 siblings, 1 reply; 19+ messages in thread
From: waxwing/ AdamISZ @ 2025-10-07 12:05 UTC (permalink / raw)
To: Bitcoin Development Mailing List

Hi aj,

Interesting points! Answers inline.

On Tuesday, October 7, 2025 at 6:38:40 AM UTC-3 Anthony Towns wrote:

> On Wed, Oct 01, 2025 at 07:24:50AM -0700, waxwing/ AdamISZ wrote:
> > I'm curious about the case of P, R, s published in utxos to prevent usage
> > of utxos as data. I think this answers in the half-affirmative: you can
> > only embed data by leaking the privkey so that it (can) immediately fall
> > out of the utxo set.
>
> I think you can attack the setup here. If you allow scriptPubKeys in the
> utxo set whose spending conditions are HTLC/atomic-swap-like:
>
>   (pubkey A and preimage reveal of X)
>   OR (pubkey B and block height > H)
>
> then you either set H to be arbitrarily far in the future and reveal B's
> privkey, or choose a NUMS X with no known preimage, and reveal A's privkey.

Yes. In the paper (and my OP email) I'm trying to narrow it down completely
to a P, R, s structure. I guess if we try to be realistic about this
"publish a signature in the output always" horrible scenario, it would have
to just ditch the NUMS variant of taproot, and I agree, that is a very Bad
Thing (TM). (uh sorry you discuss this in the next paragraph but, w/e).

Alternative examples like multisig or hash lock in script to get the data
leakage without losing control of the output (necessarily) have been
mentioned but I like your 2-branch setup as a good flexible example.

> If you don't allow those things (eg, by requiring such constructions also
> have a (pubkey musig(A,B)) path) then I think you rule out NUMS-IPK
> constructions, and end up making things like vaults ("hotkey with delay,
> coldkey anytime") difficult to send to ("I have to sign with my cold key
> to request funds?"), or, depending on what the utxo R,s is signing,
> encourage key reuse.
>
> > (To emphasize, this is different to the earlier observations (including by
> > me!) that just say it is *possible* to leak data by leaking the private
> > key; here I'm trying to prove that there is *no other way*).
>
> That seems right to me.
>
> I think if the signature scheme supported pubkey recovery (ie, s*G = R +
> H(R,m)*P, and our "m" didn't commit to P as well), you could get around
> this by just having P be the data, with no one, including the "signer"
> able to recover the private key.

Yes, basically. I discuss this in the paper w.r.t. ECDSA. Your description
of the relevance of pubkey recovery is good, but there are some nuances.
You can't quite (with ECDSA) get P to be the data and have a valid sig, but
you can get 's' to be the data simply by backsolving for the private key x.
Lack of "pubkey prefixing" in the very funky 'commitment to the nonce' in
ECDSA causes that. And the second nuance, you did actually mention: you get
"not leaking the key" for free, here. But it's still only a 32/96 bytes
embedding rate though, the way I count it.

> > However I still am probably in the large majority that thinks it's
> > appalling to imagine a sig attached to every pubkey onchain.
>
> I think the only thing achieved by embedding data in the utxo set (vs
> an OP_RETURN output or witness data) is to bloat the utxo set; and if
> that's the goal, it can equally easily be done with spendable outputs
> that the attacker simply chooses not to ever spend. So that doesn't seem
> like a terribly interesting solution to anything.

I think the logic of that is not quite right.
Suppose I want to embed pictures into the unpruneable utxo set specifically
(and not only 'in transactions'). The starting point here was me trying to
write out how you can't embed data in known-privkey (Schnorr) P, R, s
tuples. And not only pictures; as Andrew pointed out above, there's always
the concern of some kind of virus-y "naughty" data.

> As far as embedding data in signatures goes, I think the following scheme
> would allow you to publish data in a cryptographically-secure way, with
> minimal lost funds:
>
>  0) Set up secret keys p and q, and a 32-byte secret k. H(a,b,..) is
>     sha256 of a,b,.. concatenated.
>
>  1) Split your data into N 31 byte blocks, a1, a2, .., aN.
>
>  2) Calculate r0 as H(k*G). Calculate r1, .., rN as:
>       r(i+1) = H(p, r(i)) + a(i+1)
>
>  3) Sign N+1 transactions in a chain spending pubkey p*G, using
>     rN, r(N-1), .., r1, r0 as nonces. All but the final tx should pay
>     to a p*G output to continue the chain; the final output should pay
>     to q*G instead.
>
>  4) Once all transactions are sufficiently confirmed, spend the final
>     output with k as the secret nonce (and hence R=k*G as the public
>     nonce).
>
> Recover the data using the following process:
>
>  1) From the final transaction, recover R=k*G, and calculate r0 as H(R).
>     Recover p from the previous transaction, p = (s0-r0)/H(r0*G, P, m0).
>
>  2) Recover ri from each signature; ri = si - H(Ri, P, mi)*p.
>     Recover the data ai as ai = ri - H(p,r(i-1)).
>
> Dealing with the points being 32-bytes might require carrying over a
> sign-bit; but that should be possible in the spare ~7 bits since each
> block was only 31 bytes not 32 bytes. Left as an exercise for the reader,
> etc.
>
> I believe that the privkey p is secure prior to k*G being revealed, since
> all the nonces are distinct hashes seeded by that privkey; and q remains
> secure because k is never revealed.
>
> If you wanted to not reuse the pubkey p*G repeatedly, you could tweak it to
> be p0 = p, p(i+1) = p + H(k*G, p(i)), or similar. That would allow you
> to use an n-of-n multisig to get multiple blocks in a single transaction
> without seeming weird, eg.
>
> I believe the only way to distinguish this from a normal transaction
> pattern where a wallet has a change output, is via the final transaction
> that reveals k*G, and detecting the relationship between k*G and the
> spending conditions of the transaction that created the coin being
> spent. That's already somewhat expensive to check for every spend,
> but could be made more so by publishing k*G on some other medium (ie
> the data is in the blockchain, but you obtain the txid and key to find
> the data from elsewhere), or by revealing (k+x)*G where x is a random
> 20-bit (?) number, and a significant but tractable amount of grinding is
> needed to recover the desired k*G and decode the data -- the idea being
> that that is tractable for someone who knows there is data at that txid,
> but not tractable when performed on every signature in the blockchain
> in order to filter data publication.
>
> I think if you did 20 such transactions per block, each spending a single
> 20-of-20 tapscript multisig, you'd get 12400 bytes of data per block
> (without violating standardness constraints), at a cost of ~11800vb,
> so much less efficient than inscriptions, but slightly more efficient
> than OP_RETURN, and significantly less detectable than either. I think
> Knots default policy currently allows up to 50-of-50 multisig in
> tapscript, which would give you 31kB of data in ~26.6kvB of tx weight
> in a block.
>
> If you're regularly making payments from a particular wallet, I think that
> procedure would allow you to encode data in your change outputs at the
> rate of 32B/tx for no additional cost. Though the data would only be
> recoverable once complete, and it's probably worth noting that I haven't
> provided any security proofs...

Very nice example. I am glad you took the trouble to write it out, because
I agree that examples like that are worth working through because as you
say they lean closer to being properly indistinguishable from ordinary
transaction patterns.

My analysis was narrower: output-side embedding (in a theoretical future of
P,R,s outputs). But that's a little confusing because (P, R, s) is still
there whether some of it is put in witness or not. So everyone seems to
agree that privkey reveal is necessary for that, but everyone is also
pointing out that with Bitcoin's actual consensus scripting system, that
doesn't quite mean what it seems! And the embedding rate is not very good.

In this framing, not much has changed in your "chained" example: once the
privkey p is revealed, you get the k value per chain link, so it's still
roughly a 1/3 ratio, or more realistically, as you mention (and I did
upthread), it's per *transaction* which is a much lower rate.

Your points about limits, standardness constraints are well taken; those
are the kinds of things that do actually matter today, but I was not
thinking about.
* Re: [bitcoindev] On (in)ability to embed data into Schnorr
2025-10-07 12:05 ` waxwing/ AdamISZ
@ 2025-10-08 5:12 ` Anthony Towns
2025-10-08 12:55 ` waxwing/ AdamISZ
0 siblings, 1 reply; 19+ messages in thread
From: Anthony Towns @ 2025-10-08 5:12 UTC (permalink / raw)
To: waxwing/ AdamISZ; +Cc: Bitcoin Development Mailing List

On Tue, Oct 07, 2025 at 05:05:24AM -0700, waxwing/ AdamISZ wrote:
> Yes, basically. I discuss this in the paper w.r.t. ECDSA. Your description
> of the relevance of pubkey recovery is good, but there are some nuances.
> You can't quite (with ECDSA) get P to be the data and have a valid sig, but
> you can get 's' to be the data simply by backsolving for the private key x.
> Lack of "pubkey prefixing" in the very funky 'commitment to the nonce' in
> ECDSA causes that. And the second nuance, you did actually mention: you get
> "not leaking the key" for free, here. But it's still only a 32/96 bytes
> embedding rate though, the way I count it.

You've got 4x 32-byte values to play with: s, r, p and m. The verification
equation determines one of these, reducing it to 3x. m isn't able to
be freely chosen, reducing it to 2x. And being able to reverse the
equation in order to calculate anything requires the receiver to know
one of the secrets, which reduces it to 1x. (Grinding can bump that back
up to a factor of 1.something) So that's the 32. On the other side, you
need to transmit everything but m which is otherwise determined by the
setup, so that's the 96.

> I think the logic of that is not quite right. Suppose I want to embed
> pictures into the unpruneable utxo set specifically (and not only 'in
> transactions').

Sure, but then I'll also suppose your goal is to harm Bitcoin by bloating
the utxo set. If that weren't one of your fundamental goals, you'd use
other, cheaper and easier, ways of encoding the data.

> Very nice example. I am glad you took the trouble to write it out, because
> I agree that examples like that are worth working through because as you
> say they lean closer to being properly indistinguishable from ordinary
> transaction patterns.

I think the (P,R,s) outputs could be an interesting design for a
non-programmable system that was intended purely for payments -- a
FEDwire/SWIFT replacement without the possibility of vaults, lightning,
etc. Presumably more mimblewimble friendly etc too. Presumably the "R,s"
values could also be a signature of P by the operator's well known pubkey,
giving you a KYC/CBDC-like system too. You could get programmability
back in this scenario by allowing P to sign a script, which you then
satisfy, rather than signing a payment directly (ie, the graftroot
approach).

Anyway, once you make the system programmable in interesting ways, I think
you get data embeddability pretty much immediately, and then it's just
a matter of trading off the optimal encoding rate versus how easily
identifiable your transactions can be. Forcing data to be hidden at a
cost of making it less efficient just leaves less resources available to
other users of the system, though, which doesn't seem like a win in any
way to me.

> Your points about limits, standardness constraints are well taken; those
> are the kinds of things that do actually matter today, but I was not
> thinking about.

Note that I mentioned the standardness constraints not because they're
limits today, but rather because they reflect the form existing txs take,
so mimicking that form would allow txs embedding data via this scheme to
be difficult to distinguish from other txs, and hence equally difficult
to censor/filter.

Cheers,
aj
* Re: [bitcoindev] On (in)ability to embed data into Schnorr
2025-10-08 5:12 ` Anthony Towns
@ 2025-10-08 12:55 ` waxwing/ AdamISZ
0 siblings, 0 replies; 19+ messages in thread
From: waxwing/ AdamISZ @ 2025-10-08 12:55 UTC (permalink / raw)
To: Bitcoin Development Mailing List

Answers inline.

On Wednesday, October 8, 2025 at 5:45:06 AM UTC-3 Anthony Towns wrote:

> On Tue, Oct 07, 2025 at 05:05:24AM -0700, waxwing/ AdamISZ wrote:
> > Yes, basically. I discuss this in the paper w.r.t. ECDSA. Your description
> > of the relevance of pubkey recovery is good, but there are some nuances.
> > You can't quite (with ECDSA) get P to be the data and have a valid sig, but
> > you can get 's' to be the data simply by backsolving for the private key x.
> > Lack of "pubkey prefixing" in the very funky 'commitment to the nonce' in
> > ECDSA causes that. And the second nuance, you did actually mention: you get
> > "not leaking the key" for free, here. But it's still only a 32/96 bytes
> > embedding rate though, the way I count it.
>
> You've got 4x 32-byte values to play with: s, r, p and m. The verification
> equation determines one of these, reducing it to 3x. m isn't able to
> be freely chosen, reducing it to 2x. And being able to reverse the
> equation in order to calculate anything requires the receiver to know
> one of the secrets, which reduces it to 1x. (Grinding can bump that back
> up to a factor of 1.something) So that's the 32. On the other side, you
> need to transmit everything but m which is otherwise determined by the
> setup, so that's the 96.

Yeah I think so, roughly. It's not 100% watertight deductions but it seems
correct from where I'm sitting. (I would only nit that 'm' isn't in
consideration as it's implicit, not published, in current signature usage;
in a proposed signature-in-output, m would obviously be constrained to
something with no wiggle room (and including P if we used ECDSA, but we
wouldn't).)

> > I think the logic of that is not quite right. Suppose I want to embed
> > pictures into the unpruneable utxo set specifically (and not only 'in
> > transactions').
>
> Sure, but then I'll also suppose your goal is to harm Bitcoin by bloating
> the utxo set. If that weren't one of your fundamental goals, you'd use
> other, cheaper and easier, ways of encoding the data.

But the goal can be simply this: my data is more marketable if I can
plausibly claim that it's embedded into bitcoin nodes for eternity (whether
true or not, it's marketable). AFAIK this is indeed a thing, in the real
world.

> > Very nice example. I am glad you took the trouble to write it out, because
> > I agree that examples like that are worth working through because as you
> > say they lean closer to being properly indistinguishable from ordinary
> > transaction patterns.
>
> I think the (P,R,s) outputs could be an interesting design for a
> non-programmable system that was intended purely for payments -- a
> FEDwire/SWIFT replacement without the possibility of vaults, lightning,
> etc. Presumably more mimblewimble friendly etc too. Presumably the "R,s"
> values could also be a signature of P by the operator's well known pubkey,
> giving you a KYC/CBDC-like system too. You could get programmability
> back in this scenario by allowing P to sign a script, which you then
> satisfy, rather than signing a payment directly (ie, the graftroot
> approach).

I like this line of thought, and indeed I'd forgotten about graftroot and
the whole delegation angle. (and just to repeat the point made earlier:
we'd only need to sign over a message including P for ecdsa, but we
wouldn't use that.)

I guess if you're discussing a hypothetical permissioned system though it's
a whole different world, so I'm going to sidestep that one. But it does
sound interesting to do delegation and then ZkPOK outputs even in a Bitcoin
world. Albeit it's a long way from where we are today.

Of course we're firmly pie in the sky again here, but I think it helps
inform thinking about Bitcoin as it is concretely today.

> Anyway, once you make the system programmable in interesting ways, I think
> you get data embeddability pretty much immediately,

My main motivation in discussing this was indeed the extent to which you
get embeddability even without any programmability; as we've established,
it's not zero, and it's not restricted to grinding (exponential work). But
in *pure* unprogrammable, ZkPOK outputs of form P, R,s and nothing else
allowed, it *is*, I'm claiming, restricted to key leakage and doesn't
surpass 33%.

> and then it's just
> a matter of trading off the optimal encoding rate versus how easily
> identifiable your transactions can be. Forcing data to be hidden at a
> cost of making it less efficient just leaves less resources available to
> other users of the system, though, which doesn't seem like a win in any
> way to me.
>
> > Your points about limits, standardness constraints are well taken; those
> > are the kinds of things that do actually matter today, but I was not
> > thinking about.
>
> Note that I mentioned the standardness constraints not because they're
> limits today, but rather because they reflect the form existing txs take,
> so mimicking that form would allow txs embedding data via this scheme to
> be difficult to distinguish from other txs, and hence equally difficult
> to censor/filter.

I see. Good point.
* Re: [bitcoindev] On (in)ability to embed data into Schnorr
2025-10-01 14:24 [bitcoindev] On (in)ability to embed data into Schnorr waxwing/ AdamISZ
` (2 preceding siblings ...)
2025-10-07 8:22 ` Anthony Towns
@ 2025-10-31 9:10 ` Tim Ruffing
2025-10-31 13:09 ` waxwing/ AdamISZ
2025-10-31 13:19 ` Garlo Nicon
4 siblings, 1 reply; 19+ messages in thread
From: Tim Ruffing @ 2025-10-31 9:10 UTC (permalink / raw)
To: waxwing/ AdamISZ, Bitcoin Development Mailing List

Hey Adam,

I think something is wrong here.

Assume a group of order n=p*2^t where p is a large enough prime such
that the DL problem is hard. For example, Curve25519 has t=3 but the DL
problem is still hard. Or, assuming n+1 is also prime, work in the
multiplicative group of integers modulo n+1 (which has group order n
then). I'm not aware of any obstacles to constructing such groups for
sufficiently large values of t.

The crucial point is that, in these groups, the Pohlig-Hellman
algorithm can be used to compute the t least significant bits of the
discrete logarithm k of a group element R efficiently. So to embed t
bits in a Schnorr signature (R, s), simply pick k such that its t least
significant bits are exactly these bits.

Of course, this does not work in BIP340 because it uses the secp256k1
group for which t=0, i.e., the group has prime order. But it appears
that the reasoning in your write up is not specific to prime-order
groups. Thus I conclude that something must be wrong or insufficient in
your argument.

Let me clarify that I do not claim that data can be embedded in a
BIP340 signature. I only claim that your arguments for why data can't
be embedded do not appear to be sound. I believe any proof that data
cannot be embedded in a Schnorr signature (or in a group element R) in
a prime-order group must somehow exploit the fact that all bits of k
are hard to compute from R; see Section 10 in Håstad-Näslund 2003 [1]
for a proof that this is the case for prime-order groups.
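
To make the Pohlig-Hellman point concrete, here is a toy sketch in
Python. It assumes the second kind of group mentioned above (integers
modulo a prime q = n+1 with n = p*2^t), at sizes far too small for the DL
problem to be hard; the function names, the search start value and the
choice t=8 are hypothetical and only serve the illustration. The reader
recovers the t low bits of k from R alone, without knowing any secret.

```python
def is_prime(m):
    """Trial division; fine for the toy sizes used here."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True

def find_group(t, start=1001):
    """Find an odd prime p such that q = p * 2^t + 1 is prime, plus a
    generator g of the multiplicative group mod q (order n = p * 2^t)."""
    p = start
    while not (is_prime(p) and is_prime((p << t) + 1)):
        p += 2
    q = (p << t) + 1
    n = q - 1
    g = 2
    # The only prime factors of n are 2 and p, so g generates the whole
    # group iff neither g^(n/2) nor g^(n/p) equals 1.
    while pow(g, n // 2, q) == 1 or pow(g, n // p, q) == 1:
        g += 1
    return p, q, g

def low_bits_of_dlog(R, g, p, q, t):
    """Recover k mod 2^t from R = g^k mod q, one bit at a time."""
    h = pow(g, p, q)     # generator of the subgroup of order 2^t
    Rp = pow(R, p, q)    # equals h^(k mod 2^t)
    x = 0
    for i in range(t):
        # Strip the bits found so far, then test bit i by pushing it up
        # to the top of the 2^t-order subgroup.
        y = Rp * pow(h, -x, q) % q
        if pow(y, 1 << (t - 1 - i), q) != 1:
            x |= 1 << i
    return x

t = 8
p, q, g = find_group(t)
n = q - 1
data = 0xA7                          # the t bits to embed
k = ((0x1234 << t) | data) % n       # nonce whose low t bits are the data
R = pow(g, k, q)                     # public nonce
recovered = low_bits_of_dlog(R, g, p, q, t)
```

Since 2^t divides n, reducing k modulo n leaves its low t bits untouched,
which is why the embedding survives the reduction.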

Best,
Tim

[1] https://www.csc.kth.se/~johanh/hnrsaacm.pdf
* Re: [bitcoindev] On (in)ability to embed data into Schnorr 2025-10-31 9:10 ` Tim Ruffing @ 2025-10-31 13:09 ` waxwing/ AdamISZ 0 siblings, 0 replies; 19+ messages in thread From: waxwing/ AdamISZ @ 2025-10-31 13:09 UTC (permalink / raw) To: Bitcoin Development Mailing List [-- Attachment #1.1: Type: text/plain, Size: 5873 bytes --] Hi Tim, First, thanks for the considered reply! That is a very interesting point for sure. I guess I have 2 or 3 responses: First, my "theorem 1" was deliberately specific about BIP340. I am aware of the impact of Pohlig-Hellman on non prime order groups. However despite me being able to "defend the thesis" in that literal sense, I still think your overall critique is valid. I think the "framework" (at least in the updated version of the paper; the first couple of drafts were a bit incoherent) makes sense, but it's too vague in the most important part of the reasoning, namely the invertibility of the functions described. But w.r.t. the values P and R, throughout, I was assuming pseudorandomness (uncontrollable output-ness) [1] of the mappings x -> P = xG and k -> R=kG. That assumption was both explicit and implicit in several steps (or perhaps leaps) I took (see e.g. how I refer to the function f(P, R, s) and in at least one place basically "ignore" the P, R dependency because they are uncontrollable); in my head , that was justifiable based on it being a prime order group, but at the very least, I should have been explicit. > I believe any proof that data cannot be embedded in a Schnorr signature (or in a group element R) in a prime-order group must somehow exploit the fact that all bits of k are hard to compute from R; see Section 10 in Håstad-Näslund 2003 [1] for a proof that this is the case for prime-order groups. Nice reference, thanks! I definitely wouldn't have found that. As per above, I just assumed this without justifying it; so my end conclusion that there is a reduction to hash preimage resistance is I guess incomplete. 
[1] so .. k -> kG is kind of a pseudorandom function, or generator, right? If this is a DDH assumption, then perhaps that's what we should really reduce to (well, plus hash preimage resistance)? Cheers, Adam On Friday, October 31, 2025 at 7:51:48 AM UTC-3 Tim Ruffing wrote: > Hey Adam, > > I think something is wrong here. > > Assume a group of order n=p*2^t where p is a large enough prime such > that the DL problem is hard. For example, Curve25519 has t=3 but the DL > problem still hard. Or, assuming n+1 is also prime, work in the > multiplicative group of integers modulo n+1 (which has group order n > then). I'm not aware of any obstacles to constructing such groups for > sufficiently large values of t. > > The crucial point is that, in these groups, the Pohlig-Hellman > algorithm can be used to compute the t least significant bits of the > discrete logarithm k of a group element R efficiently. So to embed t > bits in a Schnorr signature (R, s), simply pick k such that its t least > significant bits t are exactly these bits. > > Of course, this does not work in BIP340 because it uses the secp256k1 > group for which t=0, i.e., the group has prime order. But it appears > that the reasoning in your write up is not specific to prime-order > groups. Thus I conclude that something must be wrong or insufficient in > your argument. > > Let me clarify that I do not claim that data can be embedded in a > BIP340 signature. I only claim that your arguments for why data can't > be embedded do not appear to be sound. I believe any proof that data > cannot be embedded in a Schnorr signature (or in a group element R) in > a prime-order group must somehow exploit the fact that all bits of k > are hard to compute from R; see Section 10 in Håstad-Näslund 2003 [1] > for a proof that this is the case for prime-order groups. 
> > Best, > Tim > > [1] https://www.csc.kth.se/~johanh/hnrsaacm.pdf > > > > On Wed, 2025-10-01 at 07:24 -0700, waxwing/ AdamISZ wrote: > > Hi all, > > > > https://github.com/AdamISZ/schnorr-unembeddability/ > > > > Here I'm analyzing whether the following statement is true: "if you > > can embed data into a (P, R, s) tuple (Schnorr pubkey and signature, > > BIP340 style), without grinding or using a sidechannel to "inform" > > the reader, you must be leaking your private key". > > > > See the abstract for a slightly more fleshed out context. > > > > I'm curious about the case of P, R, s published in utxos to prevent > > usage of utxos as data. I think this answers in the half-affirmative: > > you can only embed data by leaking the privkey so that it (can) > > immediately fall out of the utxo set. > > > > (To emphasize, this is different to the earlier observations > > (including by me!) that just say it is *possible* to leak data by > > leaking the private key; here I'm trying to prove that there is *no > > other way*). > > > > However I still am probably in the large majority that thinks it's > > appalling to imagine a sig attached to every pubkey onchain. > > > > Either way, I found it very interesting! Perhaps others will find the > > analysis valuable. > > > > Feedback (especially of the "that's wrong/that's not meaningful" > > variety) appreciated. > > > > Regards, > > AdamISZ/waxwing > > > > -- > > You received this message because you are subscribed to the Google > > Groups "Bitcoin Development Mailing List" group. > > To unsubscribe from this group and stop receiving emails from it, > > send an email to bitcoindev+...@googlegroups.com. > > To view this discussion visit > > > https://groups.google.com/d/msgid/bitcoindev/0f6c92cc-e922-4d9f-9fdf-69384dcc4086n%40googlegroups.com > > . > -- You received this message because you are subscribed to the Google Groups "Bitcoin Development Mailing List" group. 
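Tim's point about groups of order n = p*2^t can be tried out at toy scale. The following is only a sketch under stated assumptions: it works in the multiplicative group modulo the small prime 113 (so n = 112 = 7 * 2^4, i.e. p = 7 and t = 4) with generator 3. These parameters and the names `embed` and `extract` are chosen for the example and appear nowhere in the thread. It restricts Pohlig-Hellman to the subgroup of order 2^t and reads the t low bits of k back out of R = g^k.

```python
import secrets

# Toy parameters, assumptions for illustration only: Q is prime and the
# group order is Q - 1 = P_ODD * 2**T (here 112 = 7 * 16). G = 3 generates
# the whole group. A realistic instance would use a large prime P_ODD.
Q, P_ODD, T, G = 113, 7, 4, 3

def embed(bits: int) -> int:
    """Return R = G^k mod Q for a nonce k whose T low bits equal `bits`."""
    assert 0 <= bits < (1 << T)
    high = secrets.randbelow(P_ODD)      # random upper part of the nonce
    k = (high << T) | bits
    return pow(G, k, Q)

def extract(R: int) -> int:
    """Recover k mod 2**T from R alone, one bit at a time."""
    gamma = pow(G, P_ODD, Q)             # element of order exactly 2**T
    h = pow(R, P_ODD, Q)                 # equals gamma^(k mod 2**T)
    x = 0
    for i in range(T):
        # Strip the bits found so far; gamma^(-x) is gamma^(2**T - x).
        rem = h * pow(gamma, (1 << T) - x, Q) % Q
        # rem^(2**(T-1-i)) is 1 if bit i of k is clear, and -1 otherwise.
        if pow(rem, 1 << (T - 1 - i), Q) != 1:
            x |= 1 << i
    return x

message = 0b1011
print(extract(embed(message)) == message)
```

In secp256k1 the same procedure yields nothing, since t = 0 there and the subgroup being projected onto is trivial.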
* Re: [bitcoindev] On (in)ability to embed data into Schnorr
2025-10-01 14:24 [bitcoindev] On (in)ability to embed data into Schnorr waxwing/ AdamISZ
` (3 preceding siblings ...)
2025-10-31 9:10 ` Tim Ruffing
@ 2025-10-31 13:19 ` Garlo Nicon
2025-11-01 14:49 ` waxwing/ AdamISZ
4 siblings, 1 reply; 19+ messages in thread
From: Garlo Nicon @ 2025-10-31 13:19 UTC (permalink / raw)
To: waxwing/ AdamISZ; +Cc: Bitcoin Development Mailing List

> if you can embed data into a (P, R, s) tuple (Schnorr pubkey and signature, BIP340 style), without grinding or using a sidechannel to "inform" the reader, you must be leaking your private key

You can embed data into a valid signature. For example:

R=k*G
P=d*G
k=first_chunk_of_data
d=second_chunk_of_data

The keys are then "weak", because people can use a "known plaintext attack" to get them. However, if you want to push random data that is unknown to the reader, then it is known only to the holder of the data.

This means that the efficiency of this encoding is somewhere around 66%; by grinding SHA-256 hashes, it could probably reach around 70% in practice. Only the s-value needs any grinding; for the k-value and d-value, you need only the data, and nothing else.

So, I guess it is a spectrum: something like 70% efficiency means that you need a "known plaintext attack" to get the data. And then you can use fewer and fewer bits per public key, to make it arbitrarily weaker. Then, instead of relying on a timelock, you can rely on computational difficulty for the reader, for example: "how many bits do I need to leak, to make it breakable by a lattice attack".
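The 66% and "around 70%" figures in the message above are raw byte counts, and the arithmetic behind them can be sketched as follows. This is only an illustration under that message's own assumptions: d and k (32 bytes each) are treated as pure payload, s is treated as uniformly random, and the signed message can be varied to re-roll s. The function names are hypothetical. It says nothing about whether a reader can actually recover d and k, which is the point disputed in the replies.

```python
def embedding_rate(ground_bytes: int = 0) -> float:
    """Fraction of a 96-byte (P, R, s) tuple that carries chosen data,
    counting d and k as payload plus `ground_bytes` forced bytes of s."""
    return (32 + 32 + ground_bytes) / 96

def expected_attempts(ground_bytes: int) -> int:
    """Expected number of signing attempts needed to force that many
    bytes of s, treating s as uniformly random."""
    return 2 ** (8 * ground_bytes)

for b in range(4):
    print(f"{b} ground bytes: rate {embedding_rate(b):.1%}, "
          f"about 2^{8 * b} attempts")
```

Each extra ground byte adds roughly one percentage point while multiplying the work by 256, which is why the figure stays close to 70%.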
* Re: [bitcoindev] On (in)ability to embed data into Schnorr
2025-10-31 13:19 ` Garlo Nicon
@ 2025-11-01 14:49 ` waxwing/ AdamISZ
2025-11-02 9:11 ` Garlo Nicon
0 siblings, 1 reply; 19+ messages in thread
From: waxwing/ AdamISZ @ 2025-11-01 14:49 UTC (permalink / raw)
To: Bitcoin Development Mailing List

Hi Garlo Nicon,

Before I answer your point I want to mention (to readers): probably some things remained tacit in this thread but are worth emphasizing:

1. It's always trivial to get a 100% embedding rate if it's OK to assume the embedder is choosing to share data off-blockchain with others (just xor the real signature with their chosen data and call that the key). This is of course a bit silly (though not entirely silly); if the purpose is to *communicate* then they can use the communication channel for the data, instead of the xor value, and forget about the blockchain. On the other hand if their purpose is to publish data, and rely on the immutability and persistence of the blockchain, then there is the problem that the xor key can be lost; it's that offchain data that represents the actual semantics of what they published, and so they're in rather the same position as they would have been without the blockchain existing at all. (insert finesses/caveats but, basically).

2. All of the above theoretical analysis doesn't work for ECDSA *as an algorithm outside of Bitcoin*. You get 32 bytes of embedding without leaking the private key, there (the s-value can literally be made to say "hello world" 3 times or whatever). This is the non-pubkey-committing nature of standard ECDSA. I *think* you can make it behave the same as Schnorr in terms of pubkey-unembeddability-without-key-leakage by putting the pubkey in the message, but it's even harder to analyze than Schnorr (which is already hard).

3. In contrast to 2., the pubkey is in fact embedded in the message (indirectly), at least usually, in Bitcoin (except sighash_noinput type stuff which isn't live), so you can't put hello world in the signatures for now, at least AFAIK. Still even then you're stuck at a 33% rate if we include all of P, R, s, which seems reasonable (in fact, that's a generous measure). Again, I am ignoring grinding which always adds a bit more.

Anyway, you say:

> So, I guess it is a spectrum: something like 70% efficiency means, that you need "known plaintext attack" to get the data. And then, you can use less and less bits per public key, to make it arbitrarily weaker. Then, instead of relying on a timelock, you can rely on computation difficulty for the reader, for example: "how many bits I need to leak, to make it breakable by lattice attack".

I think it's an interesting idea to use lattice attacks but I can't find a way to agree with 66 or 70%. Here's why:

We assume a "few" signatures are all on the same private key. If there are N such signatures, then once LLL or a similar lattice method is successful, you retrieve the 1 private key (32 bytes) and the N * 27 bytes of nonce (or so; imagining 5 bytes are biased; it *can* go lower, requiring more signatures; doesn't change the situation).

So you successfully embedded 27N+32 bytes (all the nonces and the private key) into 64N + 32N [1] for a ratio that is a bit less than 33%. Compare with just using a repeated nonce in 2 equations, where you get 64 bytes (nonce, privkey) from 2*P + 2*(R,s), a total of 192 bytes, i.e. 33% exactly. Basically, at least in a bitcoin context, there is no gain in doing a partial exposure of the nonce; you may as well just reveal all of it, either by repetition or, as noted in the pdf, by using something public like a block hash. Notice that if my note [1] did not apply, then all the above isn't correct; the ratios work differently.

Can you let me know how you're getting 66%+? I'm guessing you're just saying "the k and the d values" but as per above I don't see it. Maybe write out concretely what the data-reader would be doing?

[1] It's easy to slip up here - I know I did - when considering publication *on bitcoin* compared with just publishing signatures. In the latter case, I can publish 100 signatures with the tacit assumption that they all refer to the same key (or, you can verify, to check). In bitcoin the pubkey is never tacit, it's always published in the scriptPubKey or scriptSig or whatever, so you can't gain efficiency from repeated uses of the same key (i.e. you can't write 64N + 32, it must be 64N + 32N for (P, R, s) tuples).

Cheers,
Adam
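The repeated-nonce comparison in the message above can be made concrete with a short sketch. It is a hedged illustration only: the `challenge` function is a stand-in (a plain SHA-256 of the message) for the BIP340 challenge e = H(R || P || m), and the payload bytes and function names are invented for the example. Only the scalar arithmetic s = k + e*d mod n is modelled, so no curve operations are needed.

```python
import hashlib

# Order of the secp256k1 group; Schnorr scalars live modulo this value.
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141

def challenge(msg: bytes) -> int:
    """Stand-in for e = H(R || P || m); an assumption, not BIP340's hash."""
    return int.from_bytes(hashlib.sha256(msg).digest(), "big") % N

def respond(d: int, k: int, e: int) -> int:
    """Schnorr response s = k + e*d mod N."""
    return (k + e * d) % N

def recover(s1: int, e1: int, s2: int, e2: int):
    """Solve the two linear equations that share one nonce for (d, k)."""
    d = (s1 - s2) * pow((e1 - e2) % N, -1, N) % N
    k = (s1 - e1 * d) % N
    return d, k

# Embedder: both the private key and the nonce are 32-byte payload chunks.
d = int.from_bytes(b"D" * 32, "big")
k = int.from_bytes(b"K" * 32, "big")
e1, e2 = challenge(b"tx one"), challenge(b"tx two")
s1, s2 = respond(d, k, e1), respond(d, k, e2)

# Reader: sees two signatures with the same R and recovers both chunks.
got_d, got_k = recover(s1, e1, s2, e2)
payload = got_d.to_bytes(32, "big") + got_k.to_bytes(32, "big")

onchain = 2 * 32 + 2 * 64        # two pubkeys plus two (R, s) pairs
rate = len(payload) / onchain
print(len(payload), onchain, f"{rate:.1%}")
```

The reader gets 64 bytes back out of 192 published bytes, and the private key is revealed in the process.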
* Re: [bitcoindev] On (in)ability to embed data into Schnorr
2025-11-01 14:49 ` waxwing/ AdamISZ
@ 2025-11-02 9:11 ` Garlo Nicon
2025-11-02 13:30 ` waxwing/ AdamISZ
0 siblings, 1 reply; 19+ messages in thread
From: Garlo Nicon @ 2025-11-02 9:11 UTC (permalink / raw)
To: waxwing/ AdamISZ; +Cc: Bitcoin Development Mailing List

> Can you let me know how you're getting 66%+?

You have three chunks, which are needed: (P,R,s). You can control "P" and "R" directly and fully, by feeding them with your data. That means you can get 66%, because it is just 2/3, if you assume that all values have the same size. Then, to get 70% or more, grinding the s-value is needed, which is doable if you want to, for example, grind two or three bytes of the s-value and stop there. But let's assume that you want to make it as fast as possible, so you don't grind anything, and stop at 66%.

> Maybe write out concretely what the data-reader would be doing?

I already told you, when I said "known plaintext attack". If you want to put random data into private keys or signatures, then things are hard to break. However, if it is something useful for the reader, then usually that kind of data is non-random. For example: some users store transactions inside OP_RETURNs, and they use an ASCII hex representation. If they used binary encoding, they would save 50% of the space. But people simply don't care.

A similar case is possible here: if you want to store random data, then it is hard to use this method. However, if you want to store ASCII text, where many words can be found in a dictionary, or where the format of the data is known upfront or can be easily guessed, then the security of the keys is comparable to that of brainwallets.

This means that you can just put your data into the private key of the user, and into a "signature nonce" (which is nothing else but yet another private key, placed on secp256k1). And then, if you know that your data is, for example, an "ASCII string", it means that each and every key that you produce simply leaks at least 32 bits per 256-bit key, if not more. And if the attacker can get coins from brainwallets, then decoding such data is not much harder than that. If your data contains simple words, then even dictionary attacks can be used.

So, let's say that you want to encode 64 bytes in a signature:

d="This is a test of storing data i"=0x5468697320697320612074657374206f662073746f72696e6720646174612069
k="n private keys inside signatures"=0x6e2070726976617465206b65797320696e73696465207369676e617475726573
P=d*G=02A2EF730B26A905A7D91940E3A512C5771D8BC8BCCA153D714E328043856CBB2B
R=k*G=02E19FCA1025CFD67409309E2B1711D723BFB67EC520917D9A0AD9432414DA0D0A

And then, the s-value comes from SHA-256 hashing, so it is harder to control. But grinding a few bytes can give something around 70%.

However, even if we stop at 66%, useful data is still regular. There are many patterns. If something is an ASCII string, then 1 in 8 bits is cleared, and it is known which ones must be zero. If it is in English, then the entropy is even lower. This means that the private key is not directly "leaked" by being passed to the reader, but there is an assumption that it will be easy enough to get.

Also, if the key is not leaked, that can be used as an advantage: first, NFTs can be minted and transferred, and then you can pass the data directly, and say: "See? You can confirm that they are encoded into private keys properly". And as long as the data in question is difficult enough to fully guess, the key is not revealed, even if it is quite weak.

So my answer to your question is: it is a spectrum. You can make a weak signature, have 33% encoding efficiency, and leak every private key immediately. But you can also make something in the spectrum between 33% and 66% that is "weak", yet won't be broken "on the spot, immediately after being broadcast" (so you cannot really say that the keys are "leaked", because you need to know "something" about the plaintext inside the private keys, or about its format). And it is good for spammers, because then funds can be safely confirmed, and later it can be revealed that "hey, I encoded that data here, by wasting 3 MB of block space, to encode 2 MB of ASCII strings, here is your NFT, that you can buy here".
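The encoding described in the message above (ASCII text used directly as the private key d and the nonce k) can be reproduced with a short, self-contained sketch. It uses textbook affine arithmetic on secp256k1 with the standard curve parameters; the helper names are hypothetical, and the code is neither constant-time nor suitable for real keys. The hex encodings of d and k can be checked by eye against the message; whether the printed P and R match the values quoted there is left for the reader to confirm by running it.

```python
# secp256k1 parameters (standard values); helper names are illustrative.
P_FIELD = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def add(a, b):
    """Affine point addition; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P_FIELD == 0:
        return None
    if a == b:
        lam = 3 * x1 * x1 * pow(2 * y1, -1, P_FIELD) % P_FIELD
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P_FIELD, -1, P_FIELD) % P_FIELD
    x3 = (lam * lam - x1 - x2) % P_FIELD
    return (x3, (lam * (x1 - x3) - y1) % P_FIELD)

def mul(k, pt):
    """Double-and-add scalar multiplication."""
    acc = None
    while k:
        if k & 1:
            acc = add(acc, pt)
        pt = add(pt, pt)
        k >>= 1
    return acc

def compressed(pt) -> bytes:
    """33-byte compressed encoding, as used in the quoted example."""
    x, y = pt
    return (b"\x03" if y & 1 else b"\x02") + x.to_bytes(32, "big")

def ascii_known_bits(chunk: bytes) -> int:
    """Bits known in advance for 7-bit ASCII: one zero top bit per byte."""
    assert chunk.isascii()
    return len(chunk)

d_bytes = b"This is a test of storing data i"
k_bytes = b"n private keys inside signatures"
d = int.from_bytes(d_bytes, "big")
k = int.from_bytes(k_bytes, "big")
P, R = mul(d, G), mul(k, G)

print("d =", hex(d))
print("k =", hex(k))
print("P =", compressed(P).hex().upper())
print("R =", compressed(R).hex().upper())
print("known bits per key:", ascii_known_bits(d_bytes))
```

A reader who knows the format is ASCII starts with 32 of the 256 bits of each scalar fixed; any further reduction depends on how guessable the text is.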
* Re: [bitcoindev] On (in)ability to embed data into Schnorr
2025-11-02 9:11 ` Garlo Nicon
@ 2025-11-02 13:30 ` waxwing/ AdamISZ
0 siblings, 0 replies; 19+ messages in thread
From: waxwing/ AdamISZ @ 2025-11-02 13:30 UTC (permalink / raw)
To: Bitcoin Development Mailing List

> I already told you, when I said "known plaintext attack". If you want to put random data into private keys or signatures, then things are hard to break. However, if it is something useful for the reader, then usually, that kind of data are non-random. For example: some users store transactions inside OP_RETURNs, and they use ASCII hex representation. If they would use binary encoding, then they would save 50% space. But people simply don't care.

> And the similar case is possible here: if you want to store random data, then it is hard to use this method. However, if you want to store ASCII text, where many words can be found in a dictionary, or where the format of the data is known upfront, or can be easily guessed, then the security of the keys, is comparable to the brainwallets.

> Which means, that you can just put your data into the private key of the user, and a "signature nonce" (which is nothing else, but yet another private key, placed on secp256k1). And then, if you know, that your data, is for example "ASCII string", then it means, that each and every key, that you produce, simply leaks at least 32 bits per 256-bit key, if not more.

Ah, right; I had originally written a response to this idea but then discarded it on the basis that it's kinda "obvious" that we shouldn't think about that, and focused on the more in-the-weeds concept of a lattice attack instead. But it isn't obvious.

So let's think of the spectrum here. First, the most trivial nonce to break: one consisting of a single bit (OK technically you can't encode k=0, heh, but, whatever, put it in the second bit of the string). Obviously that is extractable, getting 32 bytes plus one bit. That one extra bit above the 33% is achievable because of "grinding", except here grinding is the most trivial version possible: trying 2 alternatives. This still fits my original claim, which is "33% plus whatever you can get from grinding, and you leak the secret key in the process".

Other end of the spectrum: not 1 bit or 5 bytes but say 20 bytes represent an actual message, and let's say the rest of the 256 bit k-string is zero. Now clearly one can't grind that, if it's random. Which brings us to your point about weakness: let's say the 20 bytes of message come from a space of possible messages, known to all potential readers, whose size is actually 40 bits. Because they can grind 40 bits, they can retrieve the message, but that message is only 40 bits of information. E.g. the most crude idea: a table of 2^40 messages, from which you are picking one. Notice it doesn't matter if the length of each message is 40 bits or 160 bits or 256 bits; you are only conveying 40 bits of *information* if you do this.

From this point of view it's pretty clear that we haven't changed the general conclusion: you only get 33% (say 32 bytes), *plus* whatever you can get from grinding, and since that's exponential work, it's never going to be very big, say 5 bytes or possibly 6? And you leak the key of course.

I do agree with you that there could be scenarios where this "mode" of publication/embedding might be the preferable one, because we're gliding over that line between "pure publication" and "publication with sidechannels". As I argued here and elsewhere, if there is a proper, viable sidechannel, then most of this analysis doesn't apply, but a sort of mixup where "if you know information X you can grind out more information Y from the onchain data" is possible.

But no, as per the above, you are definitely not conveying 66% (that is to say, 64 bytes out of 96) in the P, R, s tuple using this method. That'd only be true in the same sense as if the space of possible messages were "hello world\n\n" and "goodbye world", and you then claimed you were sending 13 bytes because a reader can find the message.

Cheers,
AdamISZ/waxwing
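The closing example (two possible 13-byte messages convey one bit) can be sketched as follows. This is an illustration under stated assumptions: modular exponentiation modulo the prime 2^255 - 19 is used as a toy stand-in for the map k -> kG, and the function names are invented for the example. The reader knows the candidate list, tries each candidate as the nonce, and the information conveyed is the base-2 logarithm of the list size, regardless of how long each message is.

```python
import math

# Toy one-way map standing in for k -> kG (an assumption for illustration).
Q = 2**255 - 19
G = 2

def commit(k: int) -> int:
    """Publicly visible value derived from the secret nonce k."""
    return pow(G, k, Q)

def read(R: int, candidates):
    """Reader's grinding loop: try every message in the shared table."""
    for msg in candidates:
        if commit(int.from_bytes(msg, "big")) == R:
            return msg
    return None

def bits_conveyed(candidates) -> float:
    """Information carried by picking one entry from a known table."""
    return math.log2(len(candidates))

candidates = [b"hello world\n\n", b"goodbye world"]
R = commit(int.from_bytes(candidates[1], "big"))

recovered = read(R, candidates)
print(recovered, len(recovered), "bytes,", bits_conveyed(candidates), "bit")
```

With a table of 2^40 entries the same loop would need up to 2^40 trials and would convey 40 bits, whatever the byte length of each entry.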
end of thread, other threads:[~2025-11-02 13:33 UTC | newest]

Thread overview: 19+ messages
2025-10-01 14:24 [bitcoindev] On (in)ability to embed data into Schnorr waxwing/ AdamISZ
2025-10-01 22:10 ` Greg Maxwell
2025-10-01 23:11 ` Andrew Poelstra
2025-10-02 0:25 ` waxwing/ AdamISZ
2025-10-02 15:56 ` waxwing/ AdamISZ
2025-10-02 19:49 ` Greg Maxwell
2025-10-06 13:04 ` waxwing/ AdamISZ
2025-10-03 13:24 ` Peter Todd
2025-10-04 2:39 ` waxwing/ AdamISZ
2025-10-07 8:22 ` Anthony Towns
2025-10-07 12:05 ` waxwing/ AdamISZ
2025-10-08 5:12 ` Anthony Towns
2025-10-08 12:55 ` waxwing/ AdamISZ
2025-10-31 9:10 ` Tim Ruffing
2025-10-31 13:09 ` waxwing/ AdamISZ
2025-10-31 13:19 ` Garlo Nicon
2025-11-01 14:49 ` waxwing/ AdamISZ
2025-11-02 9:11 ` Garlo Nicon
2025-11-02 13:30 ` waxwing/ AdamISZ