Privacy on the Blockchain
The need for privacy in the financial
Supply chain privacy:
system
• A manufacturer does not want to reveal
how much it pays its supplier for parts.
Payment privacy:
• A company that pays its employees in crypto wants to keep list
of employees and salaries private.
• Endusers need privacy for rent, donations, purchases
Business logic privacy: Can the code of a smart contract be private?
The need for privacy in the financial
system
The bottom line:
Blockchains cannot reach their full potential
without some form of private transactions
Either (i) hide transaction amount, but not the end points, or
(ii) hide both the amount and the end points.
Types of Privacy
Pseudonymity: (weak privacy)
• Every user has a long-term consistent pseudonym (e.g. reddit)
• Pros: reputation
• Cons: link to real-world identity can leak over time
Full anonymity: User’s transactions are unlinkable
• No one can tell if two transactions are from the same address
A difficult question: privacy from
No privacy: who? Privacy from the
Everyone can see all public:
transactions in the Only a trusted operator
clear can see transactions
Semi-full privacy: full privacy:
only “local” law no one can see
enforcement can see transactions
transactions
Negative aspects of full privacy
How to prevent criminal activity?
The challenge:
• How to support positive applications of
private payments, but prevent the negative ones?
• Can we ensure legal compliance while preserving privacy?
• Yes! The key technology: zero knowledge proofs
Are Bitcoin and Ethereum Private?
The base systems are definitely not …
Privacy in Ethereum?
• Every account balance is public
• For Dapps: code and internal state are public
• All account transactions are linked to account
[Link]:
Address 0x1654b0c3f62902d7A86237
…
Privacy in Bitcoin?
from addresses amounts to addresses amounts
Alice can have many addresses (creating address is free) Transaction data can be used
Inputs: A1:4, A2: 5 out: B:6, A3:3 to link an address to a physical
Change address identity
(chainalysis)
Alice’s addresses Bob’s address
Linking an addresses to an identity
inputs: A1: 4, A2: 5 outputs: B: 6, A3: 3
Alice buys a book from a merchant:
• Alice learns one of merchant’s address (B)
• Merchant links three addresses to Alice (A1, A2, A3)
Alice uses an exchange (ETH ↔ USD)
• BSA: a US exchange must do KYC (know your customer)
… collect and verify Alice’s ID
• Exchange links Alice to her addresses (A1, A2, A3)
At the network layer …
Miners/validators learn Alice’s IP address
end users
skA
signed Tx
skB
skC
P2P network
Solution:
De-anonymization strategy: Idioms of use
A general strategy for de-anonymizing Bitcoin addresses
Heuristic 1:
Two addresses are input to a TX
⇒ both addresses are controlled by same entity
De-anonymization strategy: Idioms of use
Heuristic 2:
Change address is controlled by the same user as input address
Which is the change address?
• Heuristic: a new address that receives less than every input
A Bitcoin experiment [Meiklejohn, et al.]
step 1: Heuristic 1 and 2 ⇒ 3.3M clusters
step 2: 1070 addresses identified by interacting with merchants
• Coinbase, Kraken, …
step 3: now 15% of all addresses identified
• Learn total assets for all clusters
Commercial efforts: Chainalysis, Elliptic, …
Private coins on a Public Blockchain
Attempt 1: simple mixing
1 ETH to M fresh addr X mixer
from Alice from Alice TLS address: M
Send:
1 ETH to M fresh addr Y 1 ETH to X
from Bob from Bob TLS 1 ETH to Y
1 ETH to Z
blockchain
1 ETH to M fresh addr Z
from Carol from Carol has 3 ETH
TLS
Observer knows Y belongs to one of {Alice, Bob, Carol} but does not know which one
⟹ anonymity set of size 3.
Problems: (i) mixer M knows shuffle, (ii) mixer can abscond with 3 ETH !!
Increasing the anonymity set
Mixer Mixer
M1 M2
addr X X’ X’’
Privacy: as long as one of M1 or M2 are honest
Secure mixing without a mixer?
Problem: Mixer can abscond with funds or reveal the shuffle.
Can we securely mix without a trusted mixer? Answer: yes!
• on Bitcoin: CoinJoin (used by, e.g., Wasabi wallet)
• on Ethereum: Tornado cash, Railgun, Privacy Pools, …
… a single mixer using ZK proofs – next lecture
CoinJoin: Bitcoin Mixing without
Mixer
The setup: Alice, Bob, and Carol want to mix together.
Alice owns UTXO A1:5, Bob owns UTXO B1:3, Carol owns C1:2
A1: 5, A3 (chang
e addr)
A2 (post mix ad
dress over Tor) A1: 5, A3
B1: 3, B3 mix
B1: 3, B3 (change addr) C1: 2, C3 addresse
B2 (post mix address over Tor) s
B ob) B2, A2, C2
(sa m e a s A li c e a n d
public forum
CoinJoin: Bitcoin Mixing without
Mixer
CoinJoin TX: all three prepare and sign the following Tx:
inputs (not private): A1: 5, B1: 3, C1: 2
mix addresses
outputs (private): B2: 2, A2: 2, C2: 2
outputs (not private): A3: 3, B3: 1
Mixed UTXOs all have same value = min of inputs (2 in this case)
All three post sigs on Pastebin ⇒ one of them posts Tx on chain.
Coinjoin drawbacks
In practice: each CoinJoin Tx mixes about 40 inputs
• Large Tx: 40 inputs, 80 outputs
All participants must sign CoinJoin Tx for it to be valid
⇒ ensures all of them approve the CoinJoin Tx
… but any one of them can disrupt the process
Beyond simple mixing
Private Tx on a public blockchain
Can we have private transactions
on a public blockchain?
Naïve reasoning:
universal verifiability ⇒ transaction data must be public
otherwise, how we can verify Tx ??
crypto magic ⇒ private Tx on a publicly verifiable blockchain
Crypto tools: commitments and zero knowledge proofs
A paradigm for Private Tx
public blockchain
committed updated
state Tx committed state
committed
Tx data
(reveals nothing about Tx data or state)
Review: cryptographic commitments
Cryptographic commitment: emulates an envelope
data data
Many applications: e.g., a DAPP for a sealed bid auction
• Every participant commits to its bid,
• Once all bids are in, everyone opens their commitment
Cryptographic Commitments
Syntax: a commitment scheme is two algorithms
• commit(msg, r) ⇾ com
secret randomness commitment string
• verify(msg, com, r) ⇾ accept or reject
anyone can verify that commitment was opened correctly
Commitments: security properties
Example: hash-based commitment
•
What is a zk-SNARK?
Succinct zero knowledge proofs:
an important tool for privacy on the blockchain
What is a zk-SNARK ? (intuition)
•
Commercial interest in SNARKs
Many more companies using (zk)-SNARKs: [Link]
Blockchain Applications I
Outsourcing computation: (no need for zero knowledge)
L1 chain quickly verifies the work of an off-chain service
To minimize gas: need a short proof, fast to verify
Examples:
• Scalability: proof-based Rollups (zkRollup)
off-chain service processes a batch of Tx;
L1 chain verifies a succinct proof that Tx were processed correctly
• Bridging blockchains: proof of consensus (zkBridge)
Chain A produces a succinct proof about its state. Chain B verifies.
Blockchain Applications II
Some applications require zero knowledge (privacy):
• Private Tx on a public blockchain:
• zk proof that a private Tx is valid (Tornado cash, Railgun, Zcash, Aleo)
• Compliance:
• Proof that a private Tx is compliant with banking laws
• Proof that an exchange is solvent in zero-knowledge (Proven)
More on these blockchain applications in a minute
Many non-blockchain applications
Blockchains drive the development of SNARKs
… but many non-blockchain applications benefit
Arithmetic circuits
•
Interesting arithmetic circuits
Examples:
• Chash(h, m): outputs 0 if SHA256(m) = h , and ≠0 otherwise
Chash(h, m) = (h – SHA256(m)) , |Chash| ≈ 20K gates
• Csig(pk, m, σ): outputs 0 if σ is a valid ECDSA signature
on m with respect to pk
(preprocessing) NARK: Non-interactive ARgument of
Knowledge
Prover Verifier accept or
reject
(preprocessing) NARK: Non-interactive ARgument of
Knowledge
•
NARK: requirements (informal)
accept or reject
SNARK: a Succinct ARgument of
Knowledge
•
short “summary” of circuit
SNARK: a Succinct ARgument of
Knowledge
SNARK: a NARC (complete and knowledge sound) that is succinct
zk-SNARK: a SNARK that is also zero knowledge
The trivial SNARK is not a SNARK
•
The SNARK zoo (a sub-sample) … WHIR 🌪
FRI-Binius
Greyhound
Basefold STIR
Cambrian explosion of SNARKs Hyperplonk Lasso
🥣
Mangrove
Brakedown Gemini Jolt
Binius Ligetron
Ligero++fflonk Plonky2
SPARK Halo Infinite LaBRADORTinySpartan
SLAP
MIRAGE Orion
👋
Zeromorph
Dory
FRI
STARK
Ligero
Aurora
Bulletproofs
BCS
PST Hyrax
Groth16
KZG zk Virtual Machines
GGPR’13
2010
Credit: Giacomo Fenzi
SNARKs in practice: two
DSL
program
approaches
SNARK
friendly heavy
compiler format computation
Circom,
Gnark circuit, SNARK
Leo, R1CS, prover
Zinc, Plonkish,
Noir,
compiler RISC-V
Rust Virtual
program Machine
(CPU + memory)
END OF LECTURE
Next lecture:
more on zk-SNARKs and their applications