[bitcoindev] On (in)ability to embed data into Schnorr

Bitcoin Development Mailinglist

* [bitcoindev] On (in)ability to embed data into Schnorr
@ 2025-10-01 14:24 waxwing/ AdamISZ
  2025-10-01 22:10 ` Greg Maxwell
                   ` (4 more replies)
  0 siblings, 5 replies; 19+ messages in thread
From: waxwing/ AdamISZ @ 2025-10-01 14:24 UTC (permalink / raw)
  To: Bitcoin Development Mailing List

Hi all,

https://github.com/AdamISZ/schnorr-unembeddability/

Here I'm analyzing whether the following statement is true: "if you can 
embed data into a (P, R, s) tuple (Schnorr pubkey and signature, BIP340 
style), without grinding or using a sidechannel to "inform" the reader, you 
must be leaking your private key".

See the abstract for a slightly more fleshed out context.

I'm curious about the case of P, R, s published in utxos to prevent usage 
of utxos as data. I think this answers in the half-affirmative: you can 
only embed data by leaking the privkey so that it (can) immediately fall 
out of the utxo set.

(To emphasize, this is different to the earlier observations (including by 
me!) that just say it is *possible* to leak data by leaking the private 
key; here I'm trying to prove that there is *no other way*).

However I still am probably in the large majority that thinks it's 
appalling to imagine a sig attached to every pubkey onchain.

Either way, I found it very interesting! Perhaps others will find the 
analysis valuable.

Feedback (especially of the "that's wrong/that's not meaningful" variety) 
appreciated.

Regards,
AdamISZ/waxwing

-- 
You received this message because you are subscribed to the Google Groups "Bitcoin Development Mailing List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bitcoindev+unsubscribe@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/bitcoindev/0f6c92cc-e922-4d9f-9fdf-69384dcc4086n%40googlegroups.com.


* Re: [bitcoindev] On (in)ability to embed data into Schnorr
  2025-10-01 14:24 [bitcoindev] On (in)ability to embed data into Schnorr waxwing/ AdamISZ
@ 2025-10-01 22:10 ` Greg Maxwell
  2025-10-01 23:11   ` Andrew Poelstra
  2025-10-03 13:24 ` Peter Todd
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 19+ messages in thread
From: Greg Maxwell @ 2025-10-01 22:10 UTC (permalink / raw)
  To: waxwing/ AdamISZ; +Cc: Bitcoin Development Mailing List

Intuitively it sounds likely -- just in that the available values are an
image on the curve and a value summed with a hash dependent on everything
else.  I think it would be hard to prove.

But is it even really worth the analysis when grinding gets you a 12%
embedding rate in that signature at not that significant a cost? (because
you can independently grind the nonce and signature itself, or nonce and
pubkey) -- and when, beyond the cost of the additional signature (making the
output 3x its cost), requiring signing when forming the address completely
kills public derivation, multisig with cold keys, etc.?  ... and then any of
whatever spam concerns people have would likely be exacerbated by the
spammers using more resources due to the embedding rate?

Also, re a leaked private key meaning the output falls out of the utxo set:
well, not so if it's part of an explicit multisig, e.g. 2 of 2 with a leaked
key and a secure one.






* Re: [bitcoindev] On (in)ability to embed data into Schnorr
  2025-10-01 22:10 ` Greg Maxwell
@ 2025-10-01 23:11   ` Andrew Poelstra
  2025-10-02  0:25     ` waxwing/ AdamISZ
  0 siblings, 1 reply; 19+ messages in thread
From: Andrew Poelstra @ 2025-10-01 23:11 UTC (permalink / raw)
  To: Bitcoin Development Mailing List

On Wed, Oct 01, 2025 at 10:10:16PM +0000, Greg Maxwell wrote:
> Intuitively it sounds likely, -- just in that the available values are a
> image on the curve and a value summed with a hash dependent on everything
> else.  I think it would be hard to prove.
> 
> But is it even really worth the analysis when grinding gets you a 12%
> embedding rate in that signature at not that significant cost? (because you
> can independently grind the nonce and signature itself, or nonce and
> pubkey) -- and when beyond the cost of the additional signature (making the
> output 3x its cost) requiring signing when forming the address completely
> kills public derivation, multisig with cold keys. etc?  ... and then any of
> whatever spam concerns people have would likely be exacerbated by the
> spammers using more resources due to the embedding rate?
>

Some time ago, I talked to Ethan Heilman about this in the context of PQ
signatures, and he made the interesting point that you can think of
12% embedding rate as representing an 8x discount for real signatures vs
embedded data. And that maybe that's okay, incentive-wise.

Needing to grind out portions of 32-byte blocks probably also reduces
the risk from people trying to embed virus signatures or other malicious
data.

As for waxwing's original question -- I also intuitively believe that
the only way to embed data in a Schnorr signature is by grinding or
revealing your key ... and I'm not convinced you can do it even by
revealing your key. (R is an EC point that you can't force to be any
particular value except by making a NUMS point, which you then can't use
to sign; and s = k + ex where e is a hash of kG (among other things)
so I don't think you can force that value at all.)

-- 
Andrew Poelstra
Director, Blockstream Research
Email: apoelstra at wpsoftware.net
Web:   https://www.wpsoftware.net/andrew

The sun is always shining in space
    -Justin Lewis-Webster


* Re: [bitcoindev] On (in)ability to embed data into Schnorr
  2025-10-01 23:11   ` Andrew Poelstra
@ 2025-10-02  0:25     ` waxwing/ AdamISZ
  2025-10-02 15:56       ` waxwing/ AdamISZ
  0 siblings, 1 reply; 19+ messages in thread
From: waxwing/ AdamISZ @ 2025-10-02  0:25 UTC (permalink / raw)
  To: Bitcoin Development Mailing List


Hi Greg, Andrew, list,

Answers to Greg then Andrew:

> E.g. 2 of 2 with leaked key and a secure one.

That's a very good point! I was narrowly focused on the signature scheme, 
but Bitcoin is more than a signature scheme!

> But is it even really worth the analysis when grinding gets you a 12%
> embedding rate in that signature at not that significant cost? (because you
> can independently grind the nonce and signature itself, or nonce and
> pubkey) -- and when beyond the cost of the additional signature (making the
> output 3x its cost) requiring signing when forming the address completely
> kills public derivation, multisig with cold keys. etc?  ... and then any of
> whatever spam concerns people have would likely be exacerbated by the
> spammers using more resources due to the embedding rate?

I certainly don't think it's worth *doing* (hence my use of the term 
"appalling idea" :) ), as per the things you mention there.

I wrote the document as a mostly academic investigation. It would be nice 
to be surer what the limits are, although I suspect we're all reasonably 
confident of what is/isn't possible.

>  12% embedding rate
Where do you get that number from? 33% for embedding 256 bits in (P, R, s) 
(but as per this discussion, according to me, at the cost of key leakage). 
If we include the other bytes in a (taproot anyway) utxo that's not much 
less, I guess 30% ish. I could try to guess but it'd be easier if you told 
me :)

to Andrew:

> As for waxwing's original question -- I also intuitively believe that
> the only way to embed data in a Schnorr signature is by grinding or
> revealing your key ... and I'm not convinced you can do it even by
> revealing your key. (R is an EC point that you can't force to be any
> particular value except by making a NUMS point, which you then can't use
> to sign; and s = k + ex where e is a hash of kG (among other things)
> so I don't think you can force that value at all.)

Ah, I see what you're saying, it's a subtly different target. ECDSA allows 
that s be controlled, Schnorr doesn't, but I set up the game as "adversary 
must be able to publish a function f such that f(any published R, s, (e)) = 
data", i.e. not just f = identity function. That was why I wrote in the 
introduction (copied here for convenience:)

"Data can effectively be embedded in signatures by using a
publically-inferrable nonce, as was noted
\href{https://groups.google.com/g/bitcoindev/c/d6ZO7gXGYbQ/m/Y8BfxMVxAAAJ}{here}
and was later fleshed out in detail
\href{https://blog.bitmex.com/the-unstoppable-jpg-in-private-keys/}{here}
(\textbf{note}: both these sources discuss nonce-reuse but it's worse than
that: any \emph{publically inferrable} nonce can achieve the same thing,
such as, the block hash of the parent block; this will have the same
embedding rate and cannot be disallowed)."

It may be a different target "politically" :) but I was only thinking 
technically, in terms of how people might end up using outputs. From a 
technical point of view it makes no difference if f is the identity or 
something more complex (as long as it's efficiently computable).
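
For illustration only (this sketch is not part of the original message): a
minimal Python model of the "publicly inferrable nonce" channel described
above. It assumes a simplified Schnorr scheme over secp256k1 with
e = SHA256(R.x || P.x || m) and s = k + e*x, and it deliberately skips
BIP340's tagged hashes and even-y conventions. The helper names
(public_nonce, embed, extract) and the placeholder block-hash string are
hypothetical. The writer uses the 32 data bytes as the private key x; the
reader, who can recompute k, recovers x = (s - k) / e mod n, so here f is
that extraction function rather than the identity.

```python
import hashlib

# secp256k1 domain parameters (standard constants).
P_FIELD = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def ec_add(a, b):
    """Affine point addition; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    if a[0] == b[0] and (a[1] + b[1]) % P_FIELD == 0:
        return None
    if a == b:
        lam = 3 * a[0] * a[0] * pow(2 * a[1], -1, P_FIELD) % P_FIELD
    else:
        lam = (b[1] - a[1]) * pow((b[0] - a[0]) % P_FIELD, -1, P_FIELD) % P_FIELD
    x = (lam * lam - a[0] - b[0]) % P_FIELD
    return (x, (lam * (a[0] - x) - a[1]) % P_FIELD)

def ec_mul(k, pt):
    """Double-and-add scalar multiplication (toy code, not constant time)."""
    acc = None
    while k:
        if k & 1:
            acc = ec_add(acc, pt)
        pt = ec_add(pt, pt)
        k >>= 1
    return acc

def challenge(R, P, msg):
    """e = SHA256(R.x || P.x || m) mod N (simplified, not BIP340's tagged hash)."""
    data = R[0].to_bytes(32, "big") + P[0].to_bytes(32, "big") + msg
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N

def public_nonce():
    """Hypothetical nonce that any reader can recompute from public data."""
    digest = hashlib.sha256(b"parent-block-hash-placeholder").digest()
    return int.from_bytes(digest, "big") % N

def embed(data32, msg):
    """Writer: treat the data as the private key x and sign with the public k."""
    x = int.from_bytes(data32, "big")
    assert 0 < x < N
    k = public_nonce()
    P, R = ec_mul(x, G), ec_mul(k, G)
    s = (k + challenge(R, P, msg) * x) % N
    return P, R, s

def verify(P, R, s, msg):
    """Check s*G == R + e*P."""
    return ec_mul(s, G) == ec_add(R, ec_mul(challenge(R, P, msg), P))

def extract(P, R, s, msg):
    """Reader: recover x = (s - k) / e mod N from public values only."""
    e = challenge(R, P, msg)
    x = (s - public_nonce()) * pow(e, -1, N) % N
    return x.to_bytes(32, "big")

payload = b"thirty-two bytes of payload here"
P, R, s = embed(payload, b"msg")
print(verify(P, R, s, b"msg"), extract(P, R, s, b"msg") == payload)
```

The signature verifies like any other, yet anyone who knows how k was
derived learns the private key, which is exactly the leak the claim in
the paper says cannot be avoided.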

Cheers,
AdamISZ/waxwing


* Re: [bitcoindev] On (in)ability to embed data into Schnorr
  2025-10-02  0:25     ` waxwing/ AdamISZ
@ 2025-10-02 15:56       ` waxwing/ AdamISZ
  2025-10-02 19:49         ` Greg Maxwell
  0 siblings, 1 reply; 19+ messages in thread
From: waxwing/ AdamISZ @ 2025-10-02 15:56 UTC (permalink / raw)
  To: Bitcoin Development Mailing List

> > 12% embedding rate
> Where do you get that number from? 33% for embedding 256 bits in (P, R,
> s) (but as per this discussion, according to me, at the cost of key
> leakage). If we include the other bytes in a (taproot anyway) utxo that's
> not much less, I guess 30% ish. I could try to guess but it'd be easier if
> you told me :)

Thinking about it again: to publish data, you have to publish a 
transaction! I guess the most economical, paying taproot to taproot, is 
about 192 bytes with script path plus the posited extra 64 for the (R,s) in 
the output, so yeah that'd be 32 out of 256, 12.5%. Isn't the figure a bit 
different for key path though, because no control block? Well it hardly 
matters, it's some small fraction in that range.

An interesting mechanical detail in this near-absurd scenario is that if 
you wanted to repeatedly publish off the same (presumably a few multiples 
of dust level) output, you couldn't also do the leak single key thing, 
since you'd lose control to re-spend. So that'd place us in the "explicit 
multisig" scenario that Greg mentioned, which I think would only make sense 
with legacy script? Kind of a different scenario, also it would be really 
weird to update legacy script to take into account a new "you must sign the 
pubkeys" rule. Though I guess in this fictional scenario, it might happen 
like that. If you did do it with legacy, you'd be publishing bare 2 of 2 
multisig. If you did it with taproot due to how that works, the script is 
not published until the output is spent, so I think that's outside what I 
was considering ("data in utxo set"). (I guess you could also use something 
like a hash lock which might be more efficient). So anyway if you wanted to 
do this repeatedly and minimize cost, for whatever strange reason, you'd be 
adding another 50-100 bytes each time bringing that % down to like 10% or 
less.
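
For illustration only (not part of the original message): the rough
arithmetic above as a short Python sketch. The byte counts are the
ballpark figures quoted in the text (about 192 bytes for the transaction,
64 for the posited (R, s) in the output, 50 to 100 extra bytes for the
repeated-publication case), not measured transaction sizes, and the
function name is hypothetical.

```python
def embedding_rate(data_bytes, total_bytes):
    """Fraction of published bytes that carry chosen data."""
    return data_bytes / total_bytes

TX_BYTES = 192       # rough taproot-to-taproot size quoted in the text
SIG_IN_OUTPUT = 64   # the posited extra (R, s) attached to the output
DATA = 32            # one leaked 32-byte private key

base = embedding_rate(DATA, TX_BYTES + SIG_IN_OUTPUT)
plus_50 = embedding_rate(DATA, TX_BYTES + SIG_IN_OUTPUT + 50)
plus_100 = embedding_rate(DATA, TX_BYTES + SIG_IN_OUTPUT + 100)

# Prints: 12.5% 10.5% 9.0%
print(f"{base:.1%} {plus_50:.1%} {plus_100:.1%}")
```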

But that all became way too hypothetical to even analyze properly :)

Anyway just to reemphasize I certainly wasn't advocating this sig-attaching 
system, but it seems important to know what the result of it would be: we 
would still not have changed the obvious reality that embedding data in 
witness gives more space for data, and is more economical, and we would 
only reduce by a big factor how much can be embedded in outputs (anything 
from 8% to 15% embedding rate seems possible depending on the hypothetical 
details), while having to screw up much of Bitcoin's functionality in the 
process.

Cheers,
AdamISZ/waxwing


* Re: [bitcoindev] On (in)ability to embed data into Schnorr
  2025-10-02 15:56       ` waxwing/ AdamISZ
@ 2025-10-02 19:49         ` Greg Maxwell
  2025-10-06 13:04           ` waxwing/ AdamISZ
  0 siblings, 1 reply; 19+ messages in thread
From: Greg Maxwell @ 2025-10-02 19:49 UTC (permalink / raw)
  To: waxwing/ AdamISZ; +Cc: Bitcoin Development Mailing List

I just meant in the purely grinding non-key leaking case you could get 4
bytes into the nonce pretty easily and 4 bytes into either the pubkey or
signature out of a 64 byte signature.  Obviously the delivered embedding
rate in a whole txn will be lower, but maybe not that much thanks to
multisig outputs.
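
For illustration only (not part of the original message): a toy Python
sketch of what grinding a nonce means. It steps k until the x-coordinate
of R = k*G starts with a chosen prefix. It grinds a single byte so that
it finishes quickly; each additional byte multiplies the expected work by
about 256, so a 4-byte prefix needs on the order of 2^32 attempts. The
function name grind_nonce and the fixed starting value are assumptions
made for reproducibility; a real signer would start from a secret random
k. A pubkey prefix could be ground with a separate loop of the same
shape.

```python
# secp256k1 domain parameters (standard constants).
P_FIELD = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def ec_add(a, b):
    """Affine point addition; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    if a[0] == b[0] and (a[1] + b[1]) % P_FIELD == 0:
        return None
    if a == b:
        lam = 3 * a[0] * a[0] * pow(2 * a[1], -1, P_FIELD) % P_FIELD
    else:
        lam = (b[1] - a[1]) * pow((b[0] - a[0]) % P_FIELD, -1, P_FIELD) % P_FIELD
    x = (lam * lam - a[0] - b[0]) % P_FIELD
    return (x, (lam * (a[0] - x) - a[1]) % P_FIELD)

def ec_mul(k, pt):
    """Double-and-add scalar multiplication (toy code, not constant time)."""
    acc = None
    while k:
        if k & 1:
            acc = ec_add(acc, pt)
        pt = ec_add(pt, pt)
        k >>= 1
    return acc

def grind_nonce(prefix, start_k=1):
    """Increment k until the 32-byte x-coordinate of k*G starts with prefix."""
    k, R, tries = start_k, ec_mul(start_k, G), 1
    while not R[0].to_bytes(32, "big").startswith(prefix):
        # One point addition per attempt: (k + 1)*G = k*G + G.
        k, R, tries = k + 1, ec_add(R, G), tries + 1
    return k, R, tries

k, R, tries = grind_nonce(b"\x42")
print(tries, R[0].to_bytes(32, "big").hex())
```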




* Re: [bitcoindev] On (in)ability to embed data into Schnorr
  2025-10-02 19:49         ` Greg Maxwell
@ 2025-10-06 13:04           ` waxwing/ AdamISZ
  0 siblings, 0 replies; 19+ messages in thread
From: waxwing/ AdamISZ @ 2025-10-06 13:04 UTC (permalink / raw)
  To: Bitcoin Development Mailing List


Yes, sorry, reading fail on my part (somehow missed that you were 
explicitly referring to grinding in the comment).

Still don't think the 12% figure is a good one though? In (P,R,s) it's 8
out of 96 (and, as discussed, worse if the whole tx is (realistically)
included), 1/4 the rate you get from direct key leakage. (Plus the perhaps
trivial point that it does actually require work, which might conceivably
matter at scale?) I'm not sure why one would not include P in the measure?

Even an explicit multisig that does not sacrifice control of the output 
would be of the order of double the embedding rate, without having to do 
work. (P,R,s x 2 = 192 and embed 32 for a 1/6 rate; vs. grinding all 4 P,R 
values for a 1/12 rate).
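
For illustration only (not part of the original message): the fractions
compared in this subthread, written out with Python's Fraction type. The
byte counts come from the messages themselves (32 bytes each for P, R and
s, and 4 ground bytes per ground value); the variable names are
hypothetical.

```python
from fractions import Fraction

TUPLE = 32 + 32 + 32   # bytes in one (P, R, s)

key_leak = Fraction(32, TUPLE)            # leak one 32-byte key
grind_two = Fraction(4 + 4, TUPLE)        # grind 4 bytes into two values
grind_sig_only = Fraction(4 + 4, 64)      # same grind, counting only (R, s)
multisig_leak = Fraction(32, 2 * TUPLE)   # 2 of 2, one key leaked
multisig_grind = Fraction(4 * 4, 2 * TUPLE)  # grind all four P, R values

for name, rate in [("key leak", key_leak), ("grind", grind_two),
                   ("grind, sig only", grind_sig_only),
                   ("multisig leak", multisig_leak),
                   ("multisig grind", multisig_grind)]:
    print(f"{name}: {rate} = {float(rate):.1%}")
```

Counting only the 64-byte signature gives 1/8, which is where the 12%
figure comes from; including P gives 1/12, a quarter of the key-leak rate
of 1/3.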





* Re: [bitcoindev] On (in)ability to embed data into Schnorr
  2025-10-01 14:24 [bitcoindev] On (in)ability to embed data into Schnorr waxwing/ AdamISZ
  2025-10-01 22:10 ` Greg Maxwell
@ 2025-10-03 13:24 ` Peter Todd
  2025-10-04  2:39   ` waxwing/ AdamISZ
  2025-10-07  8:22 ` Anthony Towns
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 19+ messages in thread
From: Peter Todd @ 2025-10-03 13:24 UTC (permalink / raw)
  To: waxwing/ AdamISZ; +Cc: Bitcoin Development Mailing List

On Wed, Oct 01, 2025 at 07:24:50AM -0700, waxwing/ AdamISZ wrote:
> Hi all,
> 
> https://github.com/AdamISZ/schnorr-unembeddability/
> 
> Here I'm analyzing whether the following statement is true: "if you can 
> embed data into a (P, R, s) tuple (Schnorr pubkey and signature, BIP340 
> style), without grinding or using a sidechannel to "inform" the reader, you 
> must be leaking your private key".
> 
> See the abstract for a slightly more fleshed out context.
> 
> I'm curious about the case of P, R, s published in utxos to prevent usage 
> of utxos as data. I think this answers in the half-affirmative: you can 
> only embed data by leaking the privkey so that it (can) immediately fall 
> out of the utxo set.
> 
> (To emphasize, this is different to the earlier observations (including by 
> me!) that just say it is *possible* to leak data by leaking the private 
> key; here I'm trying to prove that there is *no other way*).

You can probably use timelock encryption to ensure that the leak of the private
key only happens in the future, after the funds are recovered by the owner in a
subsequent transaction.

-- 
https://petertodd.org 'peter'[:-1]@petertodd.org


* Re: [bitcoindev] On (in)ability to embed data into Schnorr
  2025-10-03 13:24 ` Peter Todd
@ 2025-10-04  2:39   ` waxwing/ AdamISZ
  0 siblings, 0 replies; 19+ messages in thread
From: waxwing/ AdamISZ @ 2025-10-04  2:39 UTC (permalink / raw)
  To: Bitcoin Development Mailing List

Hi Peter,

> You can probably use timelock encryption to ensure that the leak of the
> private key only happens in the future, after the funds are recovered by
> the owner in a subsequent transaction.

Another very interesting point there, to get around the issue of key
leakage ... albeit I don't see a use case; maybe I'm just not imaginative
enough, very possible.

If someone wants to keep something in the utxo set "forever", it doesn't 
help. If they want the property of "immediately accessible in the utxo set" 
(like "deposit into some fancy system with a blob of data"; I emphasize 
"deposit" because that would explain why not "just put it in the witness", 
your current outputs don't support that; correct me if my reasoning is 
wrong here), then I guess they don't get that, either: the data is 
accessible "intermediate term" instead.

Cheers,
AdamISZ/waxwing


* Re: [bitcoindev] On (in)ability to embed data into Schnorr
  2025-10-01 14:24 [bitcoindev] On (in)ability to embed data into Schnorr waxwing/ AdamISZ
  2025-10-01 22:10 ` Greg Maxwell
  2025-10-03 13:24 ` Peter Todd
@ 2025-10-07  8:22 ` Anthony Towns
  2025-10-07 12:05   ` waxwing/ AdamISZ
  2025-10-31  9:10 ` Tim Ruffing
  2025-10-31 13:19 ` Garlo Nicon
  4 siblings, 1 reply; 19+ messages in thread
From: Anthony Towns @ 2025-10-07  8:22 UTC (permalink / raw)
  To: waxwing/ AdamISZ; +Cc: Bitcoin Development Mailing List

On Wed, Oct 01, 2025 at 07:24:50AM -0700, waxwing/ AdamISZ wrote:
> I'm curious about the case of P, R, s published in utxos to prevent usage
> of utxos as data. I think this answers in the half-affirmative: you can
> only embed data by leaking the privkey so that it (can) immediately fall
> out of the utxo set.

I think you can attack the setup here.

If you allow scriptPubKeys in the utxo set whose spending conditions
are HTLC/atomic-swap-like:

   (pubkey A and preimage reveal of X)
   OR (pubkey B and block height > H)

then you either set H to be arbitrarily far in the future and reveal
B's privkey, or choose a NUMS X with no known preimage, and reveal
A's privkey.

If you don't allow those things (eg, by requiring such constructions
also have a (pubkey musig(A,B)) path) then I think you rule out NUMS-IPK
constructions, and end up making things like vaults ("hotkey with delay,
coldkey anytime") difficult to send to ("I have to sign with my cold
key to request funds?"), or, depending on what the utxo R,s is signing,
encourage key reuse.

> (To emphasize, this is different to the earlier observations (including by
> me!) that just say it is *possible* to leak data by leaking the private
> key; here I'm trying to prove that there is *no other way*).

That seems right to me.

I think if the signature scheme supported pubkey recovery (ie, s*G = R +
H(R,m)*P, and our "m" didn't commit to P as well), you could get around
this by just having P be the data, with no one, including the "signer"
able to recover the private key.
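
To spell out the rearrangement behind the pubkey-recovery remark (a sketch only; the symbols e for the challenge hash and n for the group order are introduced here for illustration): if the challenge does not commit to P, the verification equation can be solved for P once R, s and m are fixed.

```latex
sG = R + eP, \qquad e = H(R, m)
\quad\Longrightarrow\quad
P = e^{-1}\,(sG - R)
```

Here e^{-1} is the inverse of e modulo n. Because P is computed rather than generated as x*G, nobody learns its discrete logarithm. BIP340's challenge is H(R, P, m), so P is an input to e and this rearrangement is not available there.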

> However I still am probably in the large majority that thinks it's
> appalling to imagine a sig attached to every pubkey onchain.

I think the only thing achieved by embedding data in the utxo set (vs
an OP_RETURN output or witness data) is to bloat the utxo set; and if
that's the goal, it can equally easily be done with spendable outputs
that the attacker simply chooses not to ever spend. So that doesn't seem
like a terribly interesting solution to anything.

As far as embedding data in signatures goes, I think the following
scheme would allow you to publish data in a cryptographically-secure way,
with minimal lost funds:

 0) Setup secret keys p and q, and a 32-byte secret k. H(a,b,..) is sha256
    of a,b,.. concatenated.

 1) Split your data into N 31 byte blocks, a1, a2, .., aN.

 2) Calculate r0 as H(k*G). Calculate r1, .., rN as:

      r(i+1) = H(p, r(i)) + a(i+1),  for i = 0, .., N-1

 3) Sign N+1 transactions in a chain spending pubkey p*G, using rN, r(N-1),
    .., r1, r0 as nonces. All but the final tx should pay to a p*G output to
    continue the chain; the final output should pay to q*G instead.

 4) Once all transactions are sufficiently confirmed, spend the final
    output with k as the secret nonce (and hence R=k*G as the public
    nonce).

Recover the data using the following process:

 1) From the final transaction, recover R=k*G, and calculate r0 as H(R).
    Recover p from the previous transaction, p = (s0-r0)/H(r0*G, P, m0).

 2) Recover ri from each signature; ri = si - H(Ri, P, mi)*p. Recover
    the data ai as ai = ri - H(p,r(i-1)).

Dealing with the points being 32-bytes might require carrying over a
sign-bit; but that should be possible in the spare ~7 bits since each
block was only 31 bytes not 32 bytes. Left as an exercise for the
reader, etc.
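
The encode and recover steps above can be sketched in Python as follows. This is a toy model under stated assumptions, not a BIP340 implementation: it hashes full (x, y) points instead of x-only keys (so the sign-bit issue does not arise), uses plain sha256 rather than tagged hashes, and replaces real transactions with placeholder messages. The helper names (add, mul, ser, sign, encode, decode) are hypothetical.

```python
import hashlib

# secp256k1 parameters
FP = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def add(A, B):
    """Affine point addition; None is the point at infinity."""
    if A is None: return B
    if B is None: return A
    if A[0] == B[0] and (A[1] + B[1]) % FP == 0: return None
    if A == B:
        lam = 3 * A[0] * A[0] * pow(2 * A[1], -1, FP) % FP
    else:
        lam = (B[1] - A[1]) * pow((B[0] - A[0]) % FP, -1, FP) % FP
    x = (lam * lam - A[0] - B[0]) % FP
    return (x, (lam * (A[0] - x) - A[1]) % FP)

def mul(k, A):
    """Double-and-add scalar multiplication (not constant time)."""
    out = None
    while k:
        if k & 1: out = add(out, A)
        A, k = add(A, A), k >> 1
    return out

def ser(x): return x.to_bytes(32, "big")
def ser_pt(A): return ser(A[0]) + ser(A[1])

def H(*parts):
    """sha256 of the concatenated parts, reduced to a scalar mod N."""
    return int.from_bytes(hashlib.sha256(b"".join(parts)).digest(), "big") % N

def sign(x, r, msg):
    """Toy Schnorr signature with a caller-chosen secret nonce r."""
    R = mul(r, G)
    e = H(ser_pt(R), ser_pt(mul(x, G)), msg)
    return R, (r + e * x) % N

def verify(Pk, msg, sig):
    R, s = sig
    return mul(s, G) == add(R, mul(H(ser_pt(R), ser_pt(Pk), msg), Pk))

def encode(p, q, k, blocks, msgs):
    """Sign len(blocks)+1 messages under p with nonces rN..r0, then reveal k*G under q."""
    r = [H(ser_pt(mul(k, G)))]                       # r0 = H(k*G)
    for a in blocks:                                 # r_i = H(p, r_{i-1}) + a_i
        r.append((H(ser(p), ser(r[-1])) + int.from_bytes(a, "big")) % N)
    chain = [sign(p, ri, m) for ri, m in zip(reversed(r), msgs)]
    return chain, sign(q, k, b"final spend")

def decode(Pk, chain, msgs, reveal):
    """Recover the 31-byte blocks from public data only."""
    r_prev = H(ser_pt(reveal[0]))                    # r0 from the revealed R = k*G
    R0, s0 = chain[-1]
    p = (s0 - r_prev) * pow(H(ser_pt(R0), ser_pt(Pk), msgs[-1]), -1, N) % N
    blocks = []
    for (R, s), m in zip(reversed(chain[:-1]), reversed(msgs[:-1])):
        r_i = (s - H(ser_pt(R), ser_pt(Pk), m) * p) % N
        blocks.append(((r_i - H(ser(p), ser(r_prev))) % N).to_bytes(31, "big"))
        r_prev = r_i
    return blocks
```

Calling decode(mul(p, G), chain, msgs, reveal) on the output of encode should return the original list of 31-byte blocks, using only the public key, the signatures and the messages.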

I believe that the privkey p is secure prior to k*G being revealed,
since all the nonces are distinct hashes seeded by that privkey; and q
remains secure because k is never revealed.

If you wanted to not reuse the pubkey p*G repeatedly, you could tweak it
to be p0 = p, p(i+1) = p + H(k*G, p(i)), or similar. That would allow you
to use an n-of-n multisig to get multiple blocks in a single transaction
without seeming weird, eg.

I believe the only way to distinguish this from a normal transaction
pattern where a wallet has a change output, is via the final transaction
that reveals k*G, and detecting the relationship between k*G and the
spending conditions of the transaction that created the coin being spent.
That's already somewhat expensive to check for every spend, but could
be made more so by publishing k*G on some other medium (ie the data is
in the blockchain, but you obtain the txid and key to find the data
from elsewhere), or by revealing (k+x)*G where x is a random 20-bit
(?) number, and a significant but tractable amount of grinding is needed
to recover the desired k*G and decode the data -- the idea being that
that is tractable for someone who knows there is data at that txid,
but not tractable when performed on every signature in the blockchain
in order to filter data publication.

I think if you did 20 such transactions per block, each spending a single
20-of-20 tapscript multisig, you'd get 12400 bytes of data per block
(without violating standardness constraints), at a cost of ~11800vb, so
much less efficient than inscriptions, but slightly more efficient than
OP_RETURN, and significantly less detectable than either. I think Knots
default policy currently allows up to 50-of-50 multisig in tapscript,
which would give you 31kB of data in ~26.6kvB of tx weight in a block.
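
The payload figures above can be reproduced with a few lines of Python. This is only a sanity check, assuming 31 payload bytes per signature and 20 transactions per block in both cases (the message states 20 only for the first case); the vbyte costs are copied from the text, not derived, and payload_bytes is a hypothetical helper.

```python
BLOCK_BYTES = 31  # payload carried by each signature nonce in the scheme above

def payload_bytes(txs, sigs_per_tx):
    """Total embedded bytes for a given number of n-of-n multisig spends."""
    return txs * sigs_per_tx * BLOCK_BYTES

# (signatures per tx, approximate cost in vbytes as quoted in the text)
for sigs, cost_vb in ((20, 11_800), (50, 26_600)):
    data = payload_bytes(20, sigs)
    print(f"{sigs}-of-{sigs}: {data} bytes, {data / cost_vb:.2f} payload bytes per vbyte")
```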

If you're regularly making payments from a particular wallet, I think
that procedure would allow you to encode data in your change outputs at
the rate of 32B/tx for no additional cost. Though the data would only be
recoverable once complete, and it's probably worth noting that I haven't
provided any security proofs...

Cheers,
aj

* Re: [bitcoindev] On (in)ability to embed data into Schnorr
  2025-10-07  8:22 ` Anthony Towns
@ 2025-10-07 12:05   ` waxwing/ AdamISZ
  2025-10-08  5:12     ` Anthony Towns
  0 siblings, 1 reply; 19+ messages in thread
From: waxwing/ AdamISZ @ 2025-10-07 12:05 UTC (permalink / raw)
  To: Bitcoin Development Mailing List

Hi aj,

Interesting points! Answers inline.

On Tuesday, October 7, 2025 at 6:38:40 AM UTC-3 Anthony Towns wrote:

On Wed, Oct 01, 2025 at 07:24:50AM -0700, waxwing/ AdamISZ wrote: 
> I'm curious about the case of P, R, s published in utxos to prevent usage 
> of utxos as data. I think this answers in the half-affirmative: you can 
> only embed data by leaking the privkey so that it (can) immediately fall 
> out of the utxo set. 

I think you can attack the setup here. 

If you allow scriptPubKeys in the utxo set whose spending conditions 
are HTLC/atomic-swap-like: 

(pubkey A and preimage reveal of X) 
OR (pubkey B and block height > H) 

then you either set H to be arbitrarily far in the future and reveal 
B's privkey, or choose an NUMS X with no known preimage, and reveal 
A's privkey.

Yes. In the paper (and my OP email) I'm trying to narrow it down completely 
to a P, R, s structure. I guess if we try to be realistic about this 
"publish a signature in the output always" horrible scenario, it would have 
to just ditch the NUMS variant of taproot, and I agree, that is a very Bad 
Thing (TM). (uh sorry you discuss this in the next paragraph but, w/e).

Alternative examples like multisig or hash lock in script to get the data 
leakage without losing control of the output (necessarily) have been 
mentioned but I like your 2-branch setup as a good flexible example.

If you don't allow those things (eg, by requiring such constructions 
also have a (pubkey musig(A,B)) path) then I think you rule out NUMS-IPK 
constructions, and end up making things like vaults ("hotkey with delay, 
coldkey anytime") difficult to send to ("I have to sign with my cold 
key to request funds?"), or, depending on what the utxo R,s is signing, 
encourage key reuse. 

> (To emphasize, this is different to the earlier observations (including by
> me!) that just say it is *possible* to leak data by leaking the private
> key; here I'm trying to prove that there is *no other way*).

That seems right to me. 

I think if the signature scheme supported pubkey recovery (ie, s*G = R + 
H(R,m)*P, and our "m" didn't commit to P as well), you could get around 
this by just having P be the data, with no one, including the "signer" 
able to recover the private key. 

Yes, basically. I discuss this in the paper w.r.t. ECDSA. Your description 
of the relevance of pubkey recovery is good, but there are some nuances. 
You can't quite (with ECDSA) get P to be the data and have a valid sig, but 
you can get 's' to be the data simply by backsolving for the private key x. 
Lack of "pubkey prefixing" in the very funky 'commitment to the nonce' in 
ECDSA causes that. And the second nuance, you did actually mention: you get 
"not leaking the key" for free, here. But it's still only a 32/96 bytes 
embedding rate though, the way I count it.
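
A minimal sketch of the ECDSA backsolving described above, assuming the message hash z is fixed in advance and does not depend on the public key. The helper names (add, mul, backsolve_key, ecdsa_verify) are hypothetical, and the arithmetic is plain textbook ECDSA over secp256k1, not any particular library's API.

```python
import hashlib

# secp256k1 parameters
FP = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def add(A, B):
    """Affine point addition; None is the point at infinity."""
    if A is None: return B
    if B is None: return A
    if A[0] == B[0] and (A[1] + B[1]) % FP == 0: return None
    if A == B:
        lam = 3 * A[0] * A[0] * pow(2 * A[1], -1, FP) % FP
    else:
        lam = (B[1] - A[1]) * pow((B[0] - A[0]) % FP, -1, FP) % FP
    x = (lam * lam - A[0] - B[0]) % FP
    return (x, (lam * (A[0] - x) - A[1]) % FP)

def mul(k, A):
    """Double-and-add scalar multiplication (not constant time)."""
    out = None
    while k:
        if k & 1: out = add(out, A)
        A, k = add(A, A), k >> 1
    return out

def backsolve_key(k, z, s_data):
    """Pick the nonce k and the desired s, then solve s = (z + r*d)/k for the key d."""
    r = mul(k, G)[0] % N
    d = (s_data * k - z) * pow(r, -1, N) % N
    return d, r

def ecdsa_verify(Pk, z, r, s):
    """Textbook ECDSA verification of (r, s) on message hash z."""
    w = pow(s, -1, N)
    X = add(mul(z * w % N, G), mul(r * w % N, Pk))
    return X is not None and X[0] % N == r

z = int.from_bytes(hashlib.sha256(b"fixed message").digest(), "big") % N
s_data = int.from_bytes(b"embedded payload", "big")   # the data to appear as s
d, r = backsolve_key(0x1234567, z, s_data)
print(ecdsa_verify(mul(d, G), z, r, s_data))
```

The signer still knows d here, so the payload sits in s without the key being leaked, which is the nuance noted above.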

> However I still am probably in the large majority that thinks it's 
> appalling to imagine a sig attached to every pubkey onchain. 

I think the only thing achieved by embedding data in the utxo set (vs 
an OP_RETURN output or witness data) is to bloat the utxo set; and if 
that's the goal, it can equally easily be done with spendable outputs 
that the attacker simply chooses not to ever spend. So that doesn't seem 
like a terribly interesting solution to anything.

I think the logic of that is not quite right. Suppose I want to embed 
pictures into the unpruneable utxo set specifically (and not only 'in 
transactions'). The starting point here was me trying to write out how you 
can't embed data in known-privkey (Schnorr) P, R, s tuples.

And not only pictures; as Andrew pointed out above, there's always the 
concern of some kind of virus-y "naughty" data.

As far as embedding data in signatures goes, I think the following 
scheme would allow you to publish data in a cryptographically-secure way, 
with minimal lost funds: 

0) Setup secret keys p and q, and a 32-byte secret k. H(a,b,..) is sha256 
of a,b,.. concatenated. 

1) Split your data into N 31 byte blocks, a1, a2, .., aN. 

2) Calculate r0 as H(k*G). Calculate r1, .., rN as: 

r(i+1) = H(p, r(i)) + a(i+1),  for i = 0, .., N-1

3) Sign N+1 transactions in a chain spending pubkey p*G, using rN, r(N-1), 
.., r1, r0 as nonces. All but the final tx should pay to a p*G output to 
continue the chain; the final output should pay to q*G instead. 

4) Once all transactions are sufficiently confirmed, spend the final 
output with k as the secret nonce (and hence R=k*G as the public 
nonce). 

Recover the data using the following process: 

1) From the final transaction, recover R=k*G, and calculate r0 as H(R). 
Recover p from the previous transaction, p = (s0-r0)/H(r0*G, P, m0).

2) Recover ri from each signature; ri = si - H(Ri, P, mi)*p. Recover 
the data ai as ai = ri - H(p,r(i-1)). 

Dealing with the points being 32-bytes might require carrying over a 
sign-bit; but that should be possible in the spare ~7 bits since each 
block was only 31 bytes not 32 bytes. Left as an exercise for the 
reader, etc. 

I believe that the privkey p is secure prior to k*G being revealed, 
since all the nonces are distinct hashes seeded by that privkey; and q 
remains secure because k is never revealed. 

If you wanted to not reuse the pubkey p*G repeatedly, you could tweak it 
to be p0 = p, p(i+1) = p + H(k*G, p(i)), or similar. That would allow you 
to use an n-of-n multisig to get multiple blocks in a single transaction 
without seeming weird, eg. 

I believe the only way to distinguish this from a normal transaction 
pattern where a wallet has a change output, is via the final transaction 
that reveals k*G, and detecting the relationship between k*G and the 
spending conditions of the transaction that created the coin being spent. 
That's already somewhat expensive to check for every spend, but could 
be made more so by publishing k*G on some other medium (ie the data is 
in the blockchain, but you obtain the txid and key to find the data 
from elsewhere), or by revealing (k+x)*G where x is a random 20-bit 
(?) number, and a significant but tractable amount of grinding is needed 
to recover the desired k*G and decode the data -- the idea being that 
that is tractable for someone who knows there is data at that txid, 
but not tractable when performed on every signature in the blockchain 
in order to filter data publication. 

I think if you did 20 such transactions per block, each spending a single 
20-of-20 tapscript multisig, you'd get 12400 bytes of data per block 
(without violating standardness constraints), at a cost of ~11800vb, so 
much less efficient than inscriptions, but slightly more efficient than 
OP_RETURN, and significantly less detectable than either. I think Knots 
default policy currently allows up to 50-of-50 multisig in tapscript, 
which would give you 31kB of data in ~26.6kvB of tx weight in a block. 

If you're regularly making payments from a particular wallet, I think 
that procedure would allow you to encode data in your change outputs at 
the rate of 32B/tx for no additional cost. Though the data would only be 
recoverable once complete, and it's probably worth noting that I haven't 
provided any security proofs...

Very nice example. I am glad you took the trouble to write it out, because 
I agree that examples like that are worth working through because as you 
say they lean closer to being properly indistinguishable from ordinary 
transaction patterns.

My analysis was narrower: output-side embedding (in a theoretical future of 
P,R,s outputs). But that's a little confusing because (P, R, s) is still 
there whether some of it is put in witness or not. So everyone seems to 
agree that privkey reveal is necessary for that, but everyone is also 
pointing out that with Bitcoin's actual consensus scripting system, that 
doesn't quite mean what it seems! And the embedding rate is not very good. 
In this framing, not much has changed in your "chained" example: once the 
privkey p is revealed, you get the k value per chain link, so it's still 
roughly a 1/3 ratio, or more realistically, as you mention (and I did 
upthread), it's per *transaction* which is a much lower rate.

Your points about limits and standardness constraints are well taken; those
are the kinds of things that do actually matter today, but which I was not
thinking about.

* Re: [bitcoindev] On (in)ability to embed data into Schnorr
  2025-10-07 12:05   ` waxwing/ AdamISZ
@ 2025-10-08  5:12     ` Anthony Towns
  2025-10-08 12:55       ` waxwing/ AdamISZ
  0 siblings, 1 reply; 19+ messages in thread
From: Anthony Towns @ 2025-10-08  5:12 UTC (permalink / raw)
  To: waxwing/ AdamISZ; +Cc: Bitcoin Development Mailing List

On Tue, Oct 07, 2025 at 05:05:24AM -0700, waxwing/ AdamISZ wrote:
> Yes, basically. I discuss this in the paper w.r.t. ECDSA. Your description
> of the relevance of pubkey recovery is good, but there are some nuances.
> You can't quite (with ECDSA) get P to be the data and have a valid sig, but
> you can get 's' to be the data simply by backsolving for the private key x.
> Lack of "pubkey prefixing" in the very funky 'commitment to the nonce' in
> ECDSA causes that. And the second nuance, you did actually mention: you get
> "not leaking the key" for free, here. But it's still only a 32/96 bytes
> embedding rate though, the way I count it.

You've got 4x 32-byte values to play with: s, r, p and m. The verification
equation determines one of these, reducing it to 3x. m isn't able to be
freely chosen, reducing it to 2x. And being able to reverse the equation
in order to calculate anything requires the receiver to know one of the
secrets, which reduces it to 1x. (Grinding can bump that back up to a
factor of 1.something) So that's the 32. On the other side, you need to
transmit everything but m which is otherwise determined by the setup,
so that's the 96.

> I think the logic of that is not quite right. Suppose I want to embed
> pictures into the unpruneable utxo set specifically (and not only 'in
> transactions').

Sure, but then I'll also suppose your goal is to harm Bitcoin by bloating
the utxo set. If that weren't one of your fundamental goals, you'd use
other, cheaper and easier, ways of encoding the data.

> Very nice example. I am glad you took the trouble to write it out, because
> I agree that examples like that are worth working through because as you
> say they lean closer to being properly indistinguishable from ordinary
> transaction patterns.

I think the (P,R,s) outputs could be an interesting design for a
non-programmable system that was intended purely for payments -- a
FEDwire/SWIFT replacement without the possibility of vaults, lightning,
etc. Presumably more mimblewimble friendly etc too. Presumably the "R,s"
values could also be a signature of P by the operator's well known pubkey,
giving you a KYC/CBDC-like system too.

You could get programmability back in this scenario by allowing P to sign
a script, which you then satisfy, rather than signing a payment directly
(ie, the graftroot approach).

Anyway, once you make the system programmable in interesting ways, I
think you get data embeddability pretty much immediately, and then it's
just a matter of trading off the optimal encoding rate versus how easily
identifiable your transactions can be. Forcing data to be hidden at a
cost of making it less efficient just leaves less resources available
to other users of the system, though, which doesn't seem like a win in
any way to me.

> Your points about limits, standardness constraints are well taken; those
> are the kinds of things that do actually matter today, but I was not
> thinking about.

Note that I mentioned the standardness constraints not because they're
limits today, but rather because they reflect the form existing txs take,
so mimicking that form would allow txs embedding data via this scheme to
be difficult to distinguish from other txs, and hence equally difficult
to censor/filter.

Cheers,
aj

* Re: [bitcoindev] On (in)ability to embed data into Schnorr
  2025-10-08  5:12     ` Anthony Towns
@ 2025-10-08 12:55       ` waxwing/ AdamISZ
  0 siblings, 0 replies; 19+ messages in thread
From: waxwing/ AdamISZ @ 2025-10-08 12:55 UTC (permalink / raw)
  To: Bitcoin Development Mailing List

Answers inline.

On Wednesday, October 8, 2025 at 5:45:06 AM UTC-3 Anthony Towns wrote:

On Tue, Oct 07, 2025 at 05:05:24AM -0700, waxwing/ AdamISZ wrote: 
> Yes, basically. I discuss this in the paper w.r.t. ECDSA. Your description
> of the relevance of pubkey recovery is good, but there are some nuances.
> You can't quite (with ECDSA) get P to be the data and have a valid sig, but
> you can get 's' to be the data simply by backsolving for the private key x.
> Lack of "pubkey prefixing" in the very funky 'commitment to the nonce' in
> ECDSA causes that. And the second nuance, you did actually mention: you get
> "not leaking the key" for free, here. But it's still only a 32/96 bytes
> embedding rate though, the way I count it.

You've got 4x 32-byte values to play with: s, r, p and m. The verification 
equation determines one of these, reducing it to 3x. m isn't able to be 
freely chosen, reducing it to 2x. And being able to reverse the equation 
in order to calculate anything requires the receiver to know one of the 
secrets, which reduces it to 1x. (Grinding can bump that back up to a 
factor of 1.something) So that's the 32. On the other side, you need to 
transmit everything but m which is otherwise determined by the setup, 
so that's the 96. 

Yeah I think so, roughly. These aren't 100% watertight deductions but they
seem correct from where I'm sitting.
(I would only nit that 'm' isn't in consideration as it's implicit, not
published, in current signature usage; in a proposed signature-in-output, m
would obviously be constrained to something with no wiggle room (and
including P if we used ECDSA, but we wouldn't).)

> I think the logic of that is not quite right. Suppose I want to embed 
> pictures into the unpruneable utxo set specifically (and not only 'in 
> transactions'). 

Sure, but then I'll also suppose your goal is to harm Bitcoin by bloating 
the utxo set. If that weren't one of your fundamental goals, you'd use 
other, cheaper and easier, ways of encoding the data.

But the goal can be simply this: my data is more marketable if I can 
plausibly claim that it's embedded into bitcoin nodes for eternity (whether 
true or not, it's marketable). AFAIK this is indeed a thing, in the real 
world.

> Very nice example. I am glad you took the trouble to write it out, because
> I agree that examples like that are worth working through because as you
> say they lean closer to being properly indistinguishable from ordinary
> transaction patterns.

I think the (P,R,s) outputs could be an interesting design for a 
non-programmable system that was intended purely for payments -- a 
FEDwire/SWIFT replacement without the possibility of vaults, lightning, 
etc. Presumably more mimblewimble friendly etc too. Presumably the "R,s" 
values could also be a signature of P by the operator's well known pubkey, 
giving you a KYC/CBDC-like system too. 

You could get programmability back in this scenario by allow P to sign 
a script, which you then satisfy, rather than signing a payment directly 
(ie, the graftroot approach). 

I like this line of thought, and indeed I'd forgotten about graftroot and 
the whole delegation angle.
(and just to repeat the point made earlier: we'd only need to sign over a 
message including P for ecdsa, but we wouldn't use that.)
I guess if you're discussing a hypothetical permissioned system though it's 
a whole different world, so I'm going to sidestep that one.

But it does sound interesting to do delegation and then ZkPOK outputs even 
in a Bitcoin world. Albeit it's a long way from where we are today.

Of course we're firmly pie in the sky again here, but I think it helps 
inform thinking about Bitcoin as it is concretely today.

Anyway, once you make the system programmable in interesting ways, I 
think you get data embeddability pretty much immediately,

My main motivation in discussing this was indeed the extent to which you 
get embeddability even without any programmability; as we've established, 
it's not zero, and it's not restricted to grinding (exponential work). But 
in *pure* unprogrammable, ZkPOK outputs of form P, R,s and nothing else 
allowed, it *is*, I'm claiming, restricted to key leakage and doesn't 
surpass 33%.

and then it's 
just a matter of trading off the optimal encoding rate versus how easily 
identifiable your transactions can be. Forcing data to be hidden at a 
cost of making it less efficient just leaves less resources available 
to other users of the system, though, which doesn't seem like a win in 
any way to me. 

> Your points about limits, standardness constraints are well taken; those 
> are the kinds of things that do actually matter today, but I was not 
> thinking about. 

Note that I mentioned the standardness constraints not because they're 
limits today, but rather because they reflect the form existing txs take, 
so mimicing that form would allow txs embedding data via this scheme to 
be difficult to distinguish from other txs, and hence equally difficult 
to censor/filter. 

I see. Good point.

* Re: [bitcoindev] On (in)ability to embed data into Schnorr
  2025-10-01 14:24 [bitcoindev] On (in)ability to embed data into Schnorr waxwing/ AdamISZ
                   ` (2 preceding siblings ...)
  2025-10-07  8:22 ` Anthony Towns
@ 2025-10-31  9:10 ` Tim Ruffing
  2025-10-31 13:09   ` waxwing/ AdamISZ
  2025-10-31 13:19 ` Garlo Nicon
  4 siblings, 1 reply; 19+ messages in thread
From: Tim Ruffing @ 2025-10-31  9:10 UTC (permalink / raw)
  To: waxwing/ AdamISZ, Bitcoin Development Mailing List

Hey Adam,

I think something is wrong here. 

Assume a group of order n=p*2^t where p is a large enough prime such
that the DL problem is hard. For example, Curve25519 has t=3 but the DL
problem is still hard. Or, assuming n+1 is also prime, work in the
multiplicative group of integers modulo n+1 (which has group order n
then). I'm not aware of any obstacles to constructing such groups for
sufficiently large values of t. 

The crucial point is that, in these groups, the Pohlig-Hellman
algorithm can be used to compute the t least significant bits of the
discrete logarithm k of a group element R efficiently. So to embed t
bits in a Schnorr signature (R, s), simply pick k such that its t least
significant bits are exactly these bits.
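
A toy illustration of this extraction, assuming the multiplicative group modulo the prime 469762049 = 7*2^26 + 1 with base 3. These parameters are chosen here purely for illustration and are not from the message; the discrete log is of course not hard at this size, and only the 2-power part of the order matters for the demonstration. The names low_bits, ODD, T and GEN are hypothetical.

```python
Q = 469762049      # prime modulus; the group order is Q - 1 = 7 * 2**26
ODD, T = 7, 26     # odd part of the order and the exponent t of its 2-power part
GEN = 3            # assumed base whose order is divisible by 2**T

def low_bits(R):
    """Recover k mod 2**T from R = GEN**k mod Q (Pohlig-Hellman, 2-power part)."""
    h = pow(GEN, ODD, Q)        # element of order 2**T
    y = pow(R, ODD, Q)          # equals h**(k mod 2**T)
    h_inv = pow(h, -1, Q)
    known = 0
    for i in range(T):
        # Strip the bits found so far, then test whether bit i is set:
        # the result is 1 if bit i is 0 and -1 mod Q if bit i is 1.
        z = y * pow(h_inv, known, Q) % Q
        if pow(z, 1 << (T - 1 - i), Q) != 1:
            known |= 1 << i
    return known

# Embed 26 bits in the nonce: the high part is free, the low T bits carry the data.
data = 0x2ABCDEF
k = (5 << T) | data
R = pow(GEN, k, Q)
print(hex(low_bits(R)))
```

Anyone who sees R can run low_bits without knowing k, which is the sense in which those t bits are embedded without leaking the nonce or the key.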

Of course, this does not work in BIP340 because it uses the secp256k1
group for which t=0, i.e., the group has prime order. But it appears
that the reasoning in your write up is not specific to prime-order
groups. Thus I conclude that something must be wrong or insufficient in
your argument.

Let me clarify that I do not claim that data can be embedded in a
BIP340 signature. I only claim that your arguments for why data can't
be embedded do not appear to be sound. I believe any proof that data
cannot be embedded in a Schnorr signature (or in a group element R) in
a prime-order group must somehow exploit the fact that all bits of k
are hard to compute from R; see Section 10 in Håstad-Näslund 2003 [1]
for a proof that this is the case for prime-order groups.

Best,
Tim

[1] https://www.csc.kth.se/~johanh/hnrsaacm.pdf

On Wed, 2025-10-01 at 07:24 -0700, waxwing/ AdamISZ wrote:
> Hi all,
> 
> https://github.com/AdamISZ/schnorr-unembeddability/
> 
> Here I'm analyzing whether the following statement is true: "if you
> can embed data into a (P, R, s) tuple (Schnorr pubkey and signature,
> BIP340 style), without grinding or using a sidechannel to "inform"
> the reader, you must be leaking your private key".
> 
> See the abstract for a slightly more fleshed out context.
> 
> I'm curious about the case of P, R, s published in utxos to prevent
> usage of utxos as data. I think this answers in the half-affirmative:
> you can only embed data by leaking the privkey so that it (can)
> immediately fall out of the utxo set.
> 
> (To emphasize, this is different to the earlier observations
> (including by me!) that just say it is *possible* to leak data by
> leaking the private key; here I'm trying to prove that there is *no
> other way*).
> 
> However I still am probably in the large majority that thinks it's
> appalling to imagine a sig attached to every pubkey onchain.
> 
> Either way, I found it very interesting! Perhaps others will find the
> analysis valuable.
> 
> Feedback (especially of the "that's wrong/that's not meaningful"
> variety) appreciated.
> 
> Regards,
> AdamISZ/waxwing

* Re: [bitcoindev] On (in)ability to embed data into Schnorr
  2025-10-31  9:10 ` Tim Ruffing
@ 2025-10-31 13:09   ` waxwing/ AdamISZ
  0 siblings, 0 replies; 19+ messages in thread
From: waxwing/ AdamISZ @ 2025-10-31 13:09 UTC (permalink / raw)
  To: Bitcoin Development Mailing List


Hi Tim,

First, thanks for the considered reply! That is a very interesting point 
for sure.

I guess I have 2 or 3 responses:

First, my "theorem 1" was deliberately specific about BIP340. I am aware of 
the impact of Pohlig-Hellman on non prime order groups.

However despite me being able to "defend the thesis" in that literal sense, 
I still think your overall critique is valid. I think the "framework" (at 
least in the updated version of the paper; the first couple of drafts were 
a bit incoherent) makes sense, but it's too vague in the most important 
part of the reasoning, namely the invertibility of the functions described. 
But w.r.t. the values P and R, throughout, I was assuming pseudorandomness 
(uncontrollable output-ness) [1] of the mappings x -> P = xG and k -> R=kG. 
That assumption was both explicit and implicit in several steps (or perhaps 
leaps) I took (see e.g. how I refer to the function f(P, R, s) and in at 
least one place basically "ignore" the P, R dependency because they are 
uncontrollable); in my head, that was justifiable based on it being a
prime order group, but at the very least, I should have been explicit.

> I believe any proof that data
> cannot be embedded in a Schnorr signature (or in a group element R) in
> a prime-order group must somehow exploit the fact that all bits of k
> are hard to compute from R; see Section 10 in Håstad-Näslund 2003 [1]
> for a proof that this is the case for prime-order groups.

Nice reference, thanks! I definitely wouldn't have found that. As per 
above, I just assumed this without justifying it; so my end conclusion that 
there is a reduction to hash preimage resistance is I guess incomplete.

[1] so .. k -> kG is kind of a pseudorandom function, or generator, right? 
If this is a DDH assumption, then perhaps that's what we should really 
reduce to (well, plus hash preimage resistance)?

Cheers,
Adam

On Friday, October 31, 2025 at 7:51:48 AM UTC-3 Tim Ruffing wrote:

> Hey Adam,
>
> I think something is wrong here. 
>
> Assume a group of order n=p*2^t where p is a large enough prime such
> that the DL problem is hard. For example, Curve25519 has t=3 but the DL
> problem is still hard. Or, assuming n+1 is also prime, work in the
> multiplicative group of integers modulo n+1 (which has group order n
> then). I'm not aware of any obstacles to constructing such groups for
> sufficiently large values of t. 
>
> The crucial point is that, in these groups, the Pohlig-Hellman
> algorithm can be used to compute the t least significant bits of the
> discrete logarithm k of a group element R efficiently. So to embed t
> bits in a Schnorr signature (R, s), simply pick k such that its t least
> significant bits are exactly these bits.
>
> Of course, this does not work in BIP340 because it uses the secp256k1
> group for which t=0, i.e., the group has prime order. But it appears
> that the reasoning in your write up is not specific to prime-order
> groups. Thus I conclude that something must be wrong or insufficient in
> your argument.
>
> Let me clarify that I do not claim that data can be embedded in a
> BIP340 signature. I only claim that your arguments for why data can't
> be embedded do not appear to be sound. I believe any proof that data
> cannot be embedded in a Schnorr signature (or in a group element R) in
> a prime-order group must somehow exploit the fact that all bits of k
> are hard to compute from R; see Section 10 in Håstad-Näslund 2003 [1]
> for a proof that this is the case for prime-order groups.
>
> Best,
> Tim
>
> [1] https://www.csc.kth.se/~johanh/hnrsaacm.pdf

-- 
To view this discussion visit https://groups.google.com/d/msgid/bitcoindev/61eb9abe-3e26-495d-9d00-dbda69fe018bn%40googlegroups.com.

* Re: [bitcoindev] On (in)ability to embed data into Schnorr
  2025-10-01 14:24 [bitcoindev] On (in)ability to embed data into Schnorr waxwing/ AdamISZ
                   ` (3 preceding siblings ...)
  2025-10-31  9:10 ` Tim Ruffing
@ 2025-10-31 13:19 ` Garlo Nicon
  2025-11-01 14:49   ` waxwing/ AdamISZ
  4 siblings, 1 reply; 19+ messages in thread
From: Garlo Nicon @ 2025-10-31 13:19 UTC (permalink / raw)
  To: waxwing/ AdamISZ; +Cc: Bitcoin Development Mailing List


> if you can embed data into a (P, R, s) tuple (Schnorr pubkey and
> signature, BIP340 style), without grinding or using a sidechannel to
> "inform" the reader, you must be leaking your private key

You can embed data into a valid signature. For example:

R=k*G
P=d*G
k=first_chunk_of_data
d=second_chunk_of_data

The keys are then "weak", because people can use a "known plaintext attack"
to recover them. However, if you push random data that is unknown to the
reader, then it is known only to the holder of the data.

That means the efficiency of this encoding is around 66%; by grinding
SHA-256 hashes, it could probably reach around 70% in practice. Only the
s-value needs any grinding; for the k-value and d-value, you need only the
data and nothing else.

So I guess it is a spectrum: something like 70% efficiency means that the
reader needs a "known plaintext attack" to get the data. You can then use
fewer and fewer bits per public key to make it arbitrarily weaker. Then,
instead of relying on a timelock, you can rely on computational difficulty
for the reader, for example: "how many bits do I need to leak to make it
breakable by a lattice attack?"


-- 
To view this discussion visit https://groups.google.com/d/msgid/bitcoindev/CAN7kyNhE39gJyV7xCRNpZAu-jkP7bu2DvkhZ7FdLsGxa-QLjQw%40mail.gmail.com.

* Re: [bitcoindev] On (in)ability to embed data into Schnorr
  2025-10-31 13:19 ` Garlo Nicon
@ 2025-11-01 14:49   ` waxwing/ AdamISZ
  2025-11-02  9:11     ` Garlo Nicon
  0 siblings, 1 reply; 19+ messages in thread
From: waxwing/ AdamISZ @ 2025-11-01 14:49 UTC (permalink / raw)
  To: Bitcoin Development Mailing List


Hi Garlo Nicon,

Before I answer your point I want to mention (to readers): probably some 
things remained tacit in this thread but are worth emphasizing:

1. It's always trivial to get a 100% embedding rate if it's OK to assume 
the embedder is choosing to share data off-blockchain with others (just xor 
the real signature with their chosen data and call that the key). This is 
of course is a bit silly (though not entirely silly); if the purpose is to 
*communicate* then they can use the communication channel for the data, 
instead of the xor value, and forget about the blockchain. On the other 
hand if their purpose is to publish data, and rely on the immutability and 
persistence of the blockchain, then there is the problem that the xor key 
can be lost; it's that offchain data that represents the actual semantics 
of what they published, and so they're in rather the same position as they 
would have been without the blockchain existing at all. (insert 
finesses/caveats but, basically).

2. All of the above theoretical analysis doesn't work for ECDSA *as an 
algorithm outside of Bitcoin*. You get 32 bytes of embedding without 
leaking the private key, there. (The s-value can literally be made to say 
"hello world" 3 times or whatever.) This is the non-pubkey-committing 
nature of standard ECDSA. I *think* you can make it behave the same as 
Schnorr in terms of pubkey-unembeddability-without-key-leakage by putting 
the pubkey in the message, but it's even harder to analyze than Schnorr 
(which is already hard).

3. In contrast to 2., the pubkey is in fact embedded in the message 
(indirectly), at least usually, in Bitcoin (except sighash_noinput type 
stuff which isn't live), so you can't put hello world in the signatures for 
now, at least AFAIK. Still even then you're stuck at a 33% rate if we 
include all of P, R, s, which seems reasonable (in fact, that's a generous 
measure). Again, I am ignoring grinding which always adds a bit more.
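
A minimal sketch of point 2, under stated assumptions: this is one way to
read "non-pubkey-committing", not a construction spelled out in the thread.
The embedder fixes s to the payload, picks a secret nonce k, and solves the
textbook ECDSA equation s = (e + r*x)/k for the private key x. The message
hash e is a plain SHA-256 of an arbitrary byte string chosen independently
of the public key, which is exactly the freedom that point 3 says Bitcoin
does not normally give you. The function names are hypothetical and the
curve arithmetic is a slow, non-constant-time toy.

```python
import hashlib
import secrets

# secp256k1 parameters: field prime, group order, generator.
P_FIELD = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def ec_add(a, b):
    """Affine point addition; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P_FIELD == 0:
        return None
    if a == b:
        lam = 3 * x1 * x1 * pow(2 * y1, -1, P_FIELD) % P_FIELD
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P_FIELD, -1, P_FIELD) % P_FIELD
    x3 = (lam * lam - x1 - x2) % P_FIELD
    y3 = (lam * (x1 - x3) - y1) % P_FIELD
    return (x3, y3)


def ec_mul(k, point):
    """Double-and-add scalar multiplication (toy, not constant time)."""
    result = None
    while k:
        if k & 1:
            result = ec_add(result, point)
        point = ec_add(point, point)
        k >>= 1
    return result


def ecdsa_verify(pubkey, e, r, s):
    """Textbook ECDSA verification on an already-hashed message e."""
    if not (0 < r < N and 0 < s < N):
        return False
    w = pow(s, -1, N)
    pt = ec_add(ec_mul(e * w % N, G), ec_mul(r * w % N, pubkey))
    return pt is not None and pt[0] % N == r


def embed_in_s(payload, msg):
    """Set s = payload, then solve s = (e + r*x)/k for the private key x."""
    s = int.from_bytes(payload, "big")
    assert len(payload) == 32 and 0 < s < N
    e = int.from_bytes(hashlib.sha256(msg).digest(), "big") % N
    k = secrets.randbelow(N - 1) + 1         # secret nonce, never published
    r = ec_mul(k, G)[0] % N
    x = (s * k - e) * pow(r, -1, N) % N      # private key, stays secret
    return ec_mul(x, G), e, r, s


payload = b"hello world, hello world, hello!"   # exactly 32 bytes
pubkey, e, r, s = embed_in_s(payload, b"any message fixed in advance")
print(ecdsa_verify(pubkey, e, r, s), s.to_bytes(32, "big"))
```

The reader needs no key material at all here: the 32 bytes are read
straight out of s, and x stays known only to the embedder.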

Anyway, you say:

> So, I guess it is a spectrum: something like 70% efficiency means, that
> you need "known plaintext attack" to get the data. And then, you can use
> less and less bits per public key, to make it arbitrarily weaker. Then,
> instead of relying on a timelock, you can rely on computation difficulty
> for the reader, for example: "how many bits I need to leak, to make it
> breakable by lattice attack".

I think it's an interesting idea to use lattice attacks but I can't find a 
way to agree with 66 or 70%. Here's why:

We assume a "few" signatures are all on the same private key. If there are 
N such signatures, then once LLL or similar lattice method is successful, 
you retrieve the 1 private key (32 bytes) and the N * 27 bytes (or so; 
imagining 5 bytes are biased; it *can* go lower, requiring more signatures; 
doesn't change the situation).

So you successfully embedded 27N + 32 bytes (all the nonces and the private 
key) into 64N + 32N bytes [1], for a ratio that is a bit less than 33%. 
Compare with just using a repeated nonce in 2 equations, where you get 64 
bytes (nonce, privkey) from 2*P + 2*(R,s), a total of 192 bytes, i.e. 33% 
exactly. 
Basically, at least in a bitcoin context, there is no gain in doing a 
partial exposure of the nonce; you may as well just reveal all of it, 
either by repetition or as noted in the pdf, by using something public like 
a block hash. Notice that if my note [1] did not apply, then all the above 
isn't correct, the ratios work differently.
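
The arithmetic above can be checked with a short sketch. The byte counts
are the ones used in this message (32-byte pubkey, 64-byte signature, 27
recoverable bytes per nonce); the function names are made up for
illustration. Under this counting the lattice ratio drops below 1/3 from
N = 7 upward and tends towards 27/96, about 28%, as N grows.

```python
from fractions import Fraction

PUBKEY_BYTES = 32   # P
SIG_BYTES = 64      # (R, s)


def lattice_ratio(n_sigs, free_nonce_bytes=27):
    """Embedded bytes / published bytes when N biased-nonce signatures
    on one key are each published together with that key."""
    recovered = free_nonce_bytes * n_sigs + 32          # nonces + 1 privkey
    published = (SIG_BYTES + PUBKEY_BYTES) * n_sigs     # 64N + 32N
    return Fraction(recovered, published)


def repeated_nonce_ratio():
    """Embedded bytes / published bytes for two signatures sharing a nonce."""
    recovered = 32 + 32                                 # nonce + privkey
    published = 2 * PUBKEY_BYTES + 2 * SIG_BYTES        # 192 bytes
    return Fraction(recovered, published)


for n in (7, 20, 100):
    print(n, float(lattice_ratio(n)))
print("repeated nonce:", repeated_nonce_ratio())
```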

Can you let me know how you're getting 66%+? I'm guessing you're just 
saying "the k and the d values" but as per above I don't see it. Maybe 
write out concretely what the data-reader would be doing?

[1] It's easy to slip up here - I know I did - when considering publication 
*on bitcoin* compared with just publishing signatures. In the latter case, 
I can publish 100 signatures with the tacit assumption that they all refer 
to the same key (or, you can verify, to check). In bitcoin the pubkey is 
never tacit, it's always published in the scriptPubKey or scriptSig or 
whatever, so you can't gain efficiency from repeated uses of the same key 
(i.e. you can't write 64N + 32, it must be 64N + 32N for (P, R, s) tuples).

Cheers,
Adam


-- 
To view this discussion visit https://groups.google.com/d/msgid/bitcoindev/781840dd-b633-4d87-b05d-d389c6374d63n%40googlegroups.com.

* Re: [bitcoindev] On (in)ability to embed data into Schnorr
  2025-11-01 14:49   ` waxwing/ AdamISZ
@ 2025-11-02  9:11     ` Garlo Nicon
  2025-11-02 13:30       ` waxwing/ AdamISZ
  0 siblings, 1 reply; 19+ messages in thread
From: Garlo Nicon @ 2025-11-02  9:11 UTC (permalink / raw)
  To: waxwing/ AdamISZ; +Cc: Bitcoin Development Mailing List


> Can you let me know how you're getting 66%+?

You need three chunks: (P, R, s). You can control "P" and "R" directly and
fully by feeding them your data. That gives you 66%, because it is just
2/3, assuming all values have the same size.

To get 70% or more, you need to grind the s-value, which is doable if you
want to grind, for example, two or three bytes of the s-value and stop
there. But let's assume that you want to make it as fast as possible, so
you don't grind anything, and stop at 66%.

> Maybe write out concretely what the data-reader would be doing?

I already told you, when I said "known plaintext attack". If you put
random data into private keys or signatures, then things are hard to
break. However, if the data is useful to the reader, then it is usually
non-random. For example: some users store transactions inside OP_RETURNs
in ASCII hex representation. If they used binary encoding, they would save
50% of the space. But people simply don't care.

A similar case is possible here: if you want to store random data, then it
is hard to use this method. However, if you want to store ASCII text, where
many words can be found in a dictionary, or where the format of the data is
known upfront or can be easily guessed, then the security of the keys is
comparable to that of brainwallets.

That means you can just put your data into the user's private key and into
a "signature nonce" (which is nothing but yet another private key on
secp256k1). And if you know that your data is, for example, an ASCII
string, then each and every key you produce simply leaks at least 32 bits
per 256-bit key, if not more.

And if an attacker can take coins from brainwallets, then decoding such
data is not much harder than that. If your data contains simple words, then
even dictionary attacks can be used.

So, let's say that you want to encode 64 bytes in a signature:

d="This is a test of storing data i"
 =0x5468697320697320612074657374206f662073746f72696e6720646174612069
k="n private keys inside signatures"
 =0x6e2070726976617465206b65797320696e73696465207369676e617475726573
P=d*G=02A2EF730B26A905A7D91940E3A512C5771D8BC8BCCA153D714E328043856CBB2B
R=k*G=02E19FCA1025CFD67409309E2B1711D723BFB67EC520917D9A0AD9432414DA0D0A

The s-value then comes from SHA-256 hashing, so it is harder to control,
but grinding a few bytes can give something around 70%. Even if we stop at
66%, useful data is regular: there are many patterns. If something is an
ASCII string, then 1/8 of the bits are cleared, and it is known which ones
are zero. If it is in English, then the entropy is even lower. That means
the private key is not directly "leaked" by being passed to the reader, but
there is an assumption that it will be easy enough to get.
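
A minimal sketch of the text-to-scalar step and of the "1/8 of the bits"
observation, assuming big-endian encoding and space padding (both are
illustrative choices, and the helper names are made up). It reproduces the
d and k hex values above; it does not compute P or R.

```python
def text_to_scalars(text, width=32):
    """Split ASCII text into width-byte chunks and read each as a big-endian
    integer. Padding the last chunk with spaces is an arbitrary choice."""
    data = text.encode("ascii")
    data += b" " * (-len(data) % width)
    return [int.from_bytes(data[i:i + width], "big")
            for i in range(0, len(data), width)]


def known_zero_bits(scalar, width=32):
    """Count bytes whose top bit is zero. For 7-bit ASCII that is every
    byte, so 32 of the 256 bits are known to any reader in advance."""
    return sum(1 for b in scalar.to_bytes(width, "big") if b < 0x80)


text = "This is a test of storing data in private keys inside signatures"
d, k = text_to_scalars(text)
print(hex(d))
print(hex(k))
print(known_zero_bits(d), known_zero_bits(k))
```

Since the top bit of the first byte is always zero, such a chunk is below
2^255 and therefore always a valid secp256k1 scalar (as long as it is
nonzero).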

Also, if the key isn't leaked, that can be used as an advantage: first,
NFTs can be minted and transferred, and then you can pass the data directly
and say: "See? You can confirm that they are encoded into private keys
properly". As long as the data in question is difficult enough to guess in
full, the key is not revealed, even if it is quite weak.

So my answer to your question is: it is a spectrum. You can make a weak
signature, get 33% encoding efficiency, and leak every private key
immediately. Or you can make something between 33% and 66% that is "weak"
but won't be broken "on the spot, immediately after being broadcast" (so
you cannot really say that the keys are "leaked", because you need to know
"something" about the plaintext inside the private keys, or about its
format). That is good for spammers, because funds can be safely confirmed
and only later is it revealed: "hey, I encoded that data here, by wasting
3 MB of block space to encode 2 MB of ASCII strings, here is your NFT, that
you can buy here".


-- 
To view this discussion visit https://groups.google.com/d/msgid/bitcoindev/CAN7kyNgyoA5rb8hYuxai6bSaPdon%3Dy%3D9Z%2BdAfqP6Mf%3DPyniJLw%40mail.gmail.com.

* Re: [bitcoindev] On (in)ability to embed data into Schnorr
  2025-11-02  9:11     ` Garlo Nicon
@ 2025-11-02 13:30       ` waxwing/ AdamISZ
  0 siblings, 0 replies; 19+ messages in thread
From: waxwing/ AdamISZ @ 2025-11-02 13:30 UTC (permalink / raw)
  To: Bitcoin Development Mailing List


> I already told you, when I said "known plaintext attack". If you want to
> put random data into private keys or signatures, then things are hard to
> break. However, if it is something useful for the reader, then usually,
> that kind of data are non-random. For example: some users store
> transactions inside OP_RETURNs, and they use ASCII hex representation. If
> they would use binary encoding, then they would save 50% space. But people
> simply don't care.

> And the similar case is possible here: if you want to store random data,
> then it is hard to use this method. However, if you want to store ASCII
> text, where many words can be found in a dictionary, or where the format of
> the data is known upfront, or can be easily guessed, then the security of
> the keys, is comparable to the brainwallets.

> Which means, that you can just put your data into the private key of the
> user, and a "signature nonce" (which is nothing else, but yet another
> private key, placed on secp256k1). And then, if you know, that your data,
> is for example "ASCII string", then it means, that each and every key, that
> you produce, simply leaks at least 32 bits per 256-bit key, if not more.

Ah, right; I had originally written a response to this idea but then 
discarded it on the basis that it's kinda "obvious" that we shouldn't think 
about that, and focused on the more in-the-weeds concept of a lattice 
attack instead.

But it isn't obvious.

So let's think of the spectrum here. First, the most trivial nonce to 
break: one consisting of a single bit (OK technically you can't encode k=0, 
heh, but, whatever, put it in the second bit of the string). Obviously that 
is extractable, getting 32 bytes plus one bit. That one extra bit above the 
33% is achievable because of "grinding" except here grinding is the most 
trivial version possible: trying 2 alternatives. This still fits my 
original claim, which is "33% plus whatever you can get from grinding, and 
you leak the secret key in the process".

Other end of the spectrum: not 1 bit or 5 bytes but say 20 bytes represent 
an actual message, and let's say the rest of the 256 bit k-string is zero. 
Now clearly one can't grind that, if it's random. Which brings us to your 
point about weakness: let's say the 20 bytes of message comes from a space 
of possible messages, known to all potential readers, whose size is 
actually 40 bits. Because they can grind 40 bits, they can retrieve the 
message, but that message is only 40 bits of information. E.g. most crude 
idea; a table of 2^40 messages, you are picking one .. notice it doesn't 
matter if the length of each message is 40 bits or 160 bits or 256 bits; 
you are only conveying 40 bits of *information* if you do this.

From this point of view it's pretty clear that we haven't changed the 
general conclusion: you only get 33% (say 32 bytes), *plus* whatever you 
can get from grinding, and since that's exponential work, it's never going 
to be very big, say 5 bytes or possibly 6? And you leak the key of course.

I do agree with you that there could be scenarios where this "mode" of 
publication/embedding might be the preferable one, because we're gliding 
over that line between "pure publication" and "publication with 
sidechannels". As I argued here and elsewhere, if there is a proper, 
viable sidechannel, then most of this analysis doesn't apply, but a sort of 
hybrid is possible, where "if you know information X you can grind out more 
information Y from the onchain data".

But no, as per the above, you are definitely not conveying 66% (that is to 
say, 64 bytes out of 96) in the P, R, s tuple using this method. That'd 
only be true in the same sense as if the space of possible messages were 
just "hello world\n\n" and "goodbye world", and you claimed you were 
sending 13 bytes because a reader can find the message.
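
A toy sketch of that last example, with the two-message dictionary taken
from the paragraph above and everything else (function names, big-endian
text-to-scalar encoding, slow non-constant-time curve arithmetic) assumed
for illustration. The reader recovers a 13-byte string from R = kG, but
only by trying every entry of a table it already holds, so the information
conveyed is log2(table size) bits: here, 1 bit.

```python
import math

# secp256k1 field prime and generator.
P_FIELD = 2**256 - 2**32 - 977
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def ec_add(a, b):
    """Affine point addition; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P_FIELD == 0:
        return None
    if a == b:
        lam = 3 * x1 * x1 * pow(2 * y1, -1, P_FIELD) % P_FIELD
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P_FIELD, -1, P_FIELD) % P_FIELD
    x3 = (lam * lam - x1 - x2) % P_FIELD
    y3 = (lam * (x1 - x3) - y1) % P_FIELD
    return (x3, y3)


def ec_mul(k, point=G):
    """Double-and-add scalar multiplication (toy, not constant time)."""
    result = None
    while k:
        if k & 1:
            result = ec_add(result, point)
        point = ec_add(point, point)
        k >>= 1
    return result


def scalar_from_text(msg):
    """Read an ASCII message as a big-endian integer nonce."""
    return int.from_bytes(msg.encode("ascii"), "big")


def recover_message(R, candidates):
    """Reader side: grind through the known message space until k*G == R."""
    for msg in candidates:
        if ec_mul(scalar_from_text(msg)) == R:
            return msg
    return None


candidates = ["hello world\n\n", "goodbye world"]   # both 13 bytes long
R = ec_mul(scalar_from_text("goodbye world"))       # what goes on chain
found = recover_message(R, candidates)
print(repr(found), len(found), "bytes,", math.log2(len(candidates)), "bit(s)")
```

Scaling the table to 2^40 entries changes the loop count, not the
conclusion: the reader's work and the information conveyed both grow with
the size of the message space, not with the length of each message.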

Cheers,
AdamISZ/waxwing

-- 
To view this discussion visit https://groups.google.com/d/msgid/bitcoindev/31d18bd9-62e0-4035-b04f-f70ff4253257n%40googlegroups.com.

end of thread, other threads:[~2025-11-02 13:33 UTC | newest]

Thread overview: 19+ messages
2025-10-01 14:24 [bitcoindev] On (in)ability to embed data into Schnorr waxwing/ AdamISZ
2025-10-01 22:10 ` Greg Maxwell
2025-10-01 23:11   ` Andrew Poelstra
2025-10-02  0:25     ` waxwing/ AdamISZ
2025-10-02 15:56       ` waxwing/ AdamISZ
2025-10-02 19:49         ` Greg Maxwell
2025-10-06 13:04           ` waxwing/ AdamISZ
2025-10-03 13:24 ` Peter Todd
2025-10-04  2:39   ` waxwing/ AdamISZ
2025-10-07  8:22 ` Anthony Towns
2025-10-07 12:05   ` waxwing/ AdamISZ
2025-10-08  5:12     ` Anthony Towns
2025-10-08 12:55       ` waxwing/ AdamISZ
2025-10-31  9:10 ` Tim Ruffing
2025-10-31 13:09   ` waxwing/ AdamISZ
2025-10-31 13:19 ` Garlo Nicon
2025-11-01 14:49   ` waxwing/ AdamISZ
2025-11-02  9:11     ` Garlo Nicon
2025-11-02 13:30       ` waxwing/ AdamISZ
