BitIodine: extracting intelligence from the Bitcoin network
Michele Spagnuolo
http://miki.it
@mikispag
Bitcoin
BitIodine
About Bitcoin
Decentralized, global digital currency
A global peer-to-peer network, across which
transactions are broadcast
Can support strong anonymity, but current implementations are not strongly anonymous
Recent clarification to US regulations: to be considered a foreign currency, not legal tender
“Open money”
trust-no-one currency/commodity that isn’t
subject to manipulation by central banks or
corporations
designed for security, founded on
distributed cryptography
“crypto-currency”
History
Digital currency based on cryptography: proposed in 1993 and 1998.
Bitcoin whitepaper - 2008, by "Satoshi
Nakamoto" (pseudonym)
First client in January 2009
Open Source
No single point of failure
Who uses Bitcoin?
A lot of “legitimate” uses of Bitcoin
http://weusecoins.org
Why use Bitcoin?
Speed and price
No central authority
No setup
Better privacy
No counterfeit, no chargebacks
No account freezing
Algorithmically known inflation
Trading goods for Bitcoin
Also accepting Bitcoin...
Who also uses Bitcoin?
Top 20 categories
$1.2M revenue/month
220 categories, ranging from digital goods to various kinds of narcotics or prescription medicines
Some stats about
Bitcoin
As of June 1:
1 BTC is traded for 128.4 USD
(max: 266 USD on April 11, min: 50 USD on April 16)
$1.5B market capitalization
30,000 BTC (about $3.8M) sent per hour
Approximately 60,000 transactions per day
Key concepts
Address
Wallet
Transaction
Block
Blockchain
Mining / Generation
Address and Wallet
An address is a string like
1MikiSPbrhCFk7S4wzZP7gQqhwWH866DCb
generated by a Bitcoin client together with the private key
needed to spend and redeem the coins sent to it. It is public, and
can be posted everywhere in order to receive payments.
A wallet is a file which stores addresses and the private keys
needed to use them.
Transaction
A “record” of coins moving from one or more input addresses to one or more
recipient addresses. It is saved in a block.
How spending works
Several cryptographic technologies
public key cryptography - ECDSA
hashing algorithms SHA-1, SHA-256, ...
Future proof - designed to be upgraded in a forward compatible way
Block
A block is a list of transactions.
Each block embeds new transactions that took place before it was
created.
A new block is generated, on average, every 10 minutes.
The first generated block is called genesis block.
Transactions are placed in blocks, which are linked by SHA-256
hashes.
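A minimal sketch of how blocks can be linked by SHA-256 hashes. The block layout (a dict with `prev_hash` and `transactions`) and the helper names are assumptions made for illustration; real Bitcoin hashes a binary block header with double SHA-256.

```python
import hashlib
import json

def block_hash(block):
    # Toy hash: SHA-256 over a canonical JSON encoding of the block.
    payload = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()

def make_block(prev_hash, transactions):
    # Hypothetical block layout, for illustration only.
    return {"prev_hash": prev_hash, "transactions": transactions}

def verify_chain(chain):
    # Every block must reference the hash of the block before it.
    for prev, cur in zip(chain, chain[1:]):
        if cur["prev_hash"] != block_hash(prev):
            return False
    return True

genesis = make_block(None, ["coinbase -> alice"])
b1 = make_block(block_hash(genesis), ["alice -> bob"])
b2 = make_block(block_hash(b1), ["bob -> carol"])
chain = [genesis, b1, b2]
```

Changing any transaction in an earlier block changes that block's hash, so the link stored in the next block no longer matches and `verify_chain` returns False.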
Blockchain
The blockchain is the linked list of all the blocks that have been
generated since day one.
A full copy of the blockchain contains every transaction ever
executed, and is distributed on every client.
For any block on the chain, there is only one path to the genesis
block. Coming from the genesis block, however, there can be
forks.
Mining / Generation
Users on the Bitcoin network compete to generate a new valid block
that satisfies a difficult requirement (a block hash with N leading zero
bytes). This is done by brute force. The lucky solver is then rewarded with
a fixed number of bitcoins for cryptographically validating
transactions on the network.
The parameter N is automatically adjusted by the protocol
(difficulty), making it scale with the number of miners and making
sure that, statistically, one block is generated every 10 minutes.
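The brute-force search can be sketched as follows. This is a toy under stated assumptions: the function name `mine`, the 8-byte nonce and the single SHA-256 over `data + nonce` are illustrative choices, not the actual Bitcoin block header format.

```python
import hashlib

def mine(data: bytes, zero_bytes: int):
    # Try nonces one by one until the hash starts with `zero_bytes`
    # zero bytes. Each extra zero byte makes the search about 256
    # times longer on average.
    prefix = b"\x00" * zero_bytes
    nonce = 0
    while True:
        digest = hashlib.sha256(data + nonce.to_bytes(8, "big")).digest()
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1

nonce, digest = mine(b"toy block header", 1)
```

Verifying a solution takes a single hash, while finding one takes many attempts. That asymmetry is what lets the network check a miner's work cheaply.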
Mining hardware
Nowadays mining with GPUs is no longer profitable: miners use FPGAs or
ASICs.
Distributed consensus problem
New blocks are linked to older blocks, forming a block chain that
is constantly being extended.
Miners may generate blocks at the same time or too close
together.
A participant choosing to extend an
existing path in the block chain
indicates a vote towards consensus on
that path. The longer the path,
the more computation was expended
building it.
Longest path = accepted chain
Bitcoin offers a unique solution to the consensus problem in distributed systems
since voting power is directly proportional to computing power.
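A small sketch of the "longest path" rule when forks exist. The `parent` mapping and the block names are hypothetical. The sketch counts blocks, as the slide does, whereas a real client compares the total work behind each path.

```python
def chain_length(tip, parent):
    # Walk back from a tip to the genesis block, counting blocks.
    length = 0
    while tip is not None:
        length += 1
        tip = parent[tip]
    return length

def best_tip(parent):
    # Tips are blocks that no other block points to as its parent.
    parents = set(parent.values())
    tips = [block for block in parent if block not in parents]
    return max(tips, key=lambda tip: chain_length(tip, parent))

# Hypothetical fork: two blocks (a1, b1) were built on the genesis block.
parent = {"genesis": None, "a1": "genesis", "a2": "a1", "b1": "genesis"}
```

Here `best_tip(parent)` picks `a2`, because the path through `a1` is longer than the path through `b1`.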
BitIodine
Is Bitcoin anonymous?
Yes
No
• Users don't need to
• Every transaction is
set up accounts
• No ID required
• Cash-like
• Tor / VPN help
stored in the blockchain
• Analysis of the
blockchain can correlate
addresses
BitIodine
a tool for analyzing and profiling the Bitcoin
network
Transaction graph
User graph
Techniques for creating the
User Graph
Two main techniques:
• Multi-input transactions grouping
• Shadow address guessing (change)
Multi-input transactions
When a transaction has multiple input addresses,
we can safely assume that those addresses belong to
the same wallet, thus to the same user.
Assumption: owners don't share private keys.
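The grouping can be sketched with a union-find structure. This is a minimal illustration, not BitIodine's actual code; the function name and the input format (one list of input addresses per transaction) are assumptions.

```python
def cluster_by_inputs(transactions):
    parent = {}

    def find(addr):
        # Return the representative of the cluster containing addr.
        parent.setdefault(addr, addr)
        while parent[addr] != addr:
            parent[addr] = parent[parent[addr]]  # path halving
            addr = parent[addr]
        return addr

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for inputs in transactions:
        for addr in inputs:
            find(addr)  # register every address, even single inputs
        for addr in inputs[1:]:
            union(inputs[0], addr)  # co-spent inputs share an owner

    clusters = {}
    for addr in list(parent):
        clusters.setdefault(find(addr), set()).add(addr)
    return sorted(sorted(cluster) for cluster in clusters.values())

# Hypothetical input address lists, one per transaction.
txs = [["A", "B"], ["B", "C"], ["D"], ["E", "F"]]
```

With this data, `A`, `B` and `C` end up in one cluster because `B` is co-spent with both, even though `A` and `C` never appear in the same transaction.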
Caveat: web wallets and pools would be
"mistakenly" grouped as a single user.
Shadow addresses
• the entire value of an unspent output of a prior
transaction must be spent and used as input for a
new transaction
• input is destroyed, and change should be sent back
to the user
• to improve anonymity, a "shadow" address is
automatically created and used to collect back the
"change"
Shadow addresses
When a Bitcoin transaction has exactly two output
addresses, O1 and O2, such that O2 is a new address
(i.e., an address that has never appeared before), and
O1 corresponds to an old address (an address that
has appeared previously), we can assume that O2
constitutes a shadow address for one of the input
addresses of that transaction.
Transactions with two outputs are the vast majority.
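A sketch of this rule, assuming transactions are processed in chronological order and given as dicts with hypothetical `txid`, `inputs` and `outputs` fields. It illustrates the heuristic as stated on the slide and is not BitIodine's implementation.

```python
def find_shadow_addresses(transactions):
    seen = set()      # addresses that have appeared so far
    shadows = {}      # txid -> guessed shadow (change) address
    for tx in transactions:
        outputs = tx["outputs"]
        if len(outputs) == 2 and outputs[0] != outputs[1]:
            new = [addr for addr in outputs if addr not in seen]
            old = [addr for addr in outputs if addr in seen]
            # Exactly one never-seen output and one known output.
            if len(new) == 1 and len(old) == 1:
                shadows[tx["txid"]] = new[0]
        seen.update(tx["inputs"])
        seen.update(outputs)
    return shadows

# Hypothetical transaction history, oldest first.
history = [
    {"txid": "t0", "inputs": [], "outputs": ["A"]},
    {"txid": "t1", "inputs": ["A"], "outputs": ["B", "C"]},
    {"txid": "t2", "inputs": ["B"], "outputs": ["C", "D"]},
]
```

In `t1` both outputs are new, so nothing can be concluded. In `t2` the address `C` is already known and `D` is new, so `D` is guessed to be the change address of the sender.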
[Figure: Transaction count per number of outputs. x-axis: number of outputs (1 to 50); y-axis: number of transactions (0 to 10,000,000)]
Why? One payee, one shadow address for change back.
[Figure: Transaction volume per number of outputs. x-axis: number of outputs (1 to 50); y-axis: volume in BTC (0 to 1,400,000,000)]
Shadow addresses
The official Bitcoin client tries to randomize the position of the change
output, but the code is flawed:
File: wallet.cpp
// Insert change txn at random position:
vector<CTxOut>::iterator position = wtxNew.vout.begin()+GetRandInt(wtxNew.vout.size());
wtxNew.vout.insert(position, CTxOut(nChange, scriptChange));
At this point wtxNew.vout.size() is the number of payees; the argument to GetRandInt should be size()+1.
If there are just two outputs (one payee), GetRandInt(1) always returns 0.
The change always ends up in the first output.
If there are multiple outputs, the change is never the last output.
Long left unfixed; recently fixed!
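The effect of the off-by-one can be reproduced with a short simulation. This is a Python model of the C++ logic above, assuming that GetRandInt(n) returns an integer in the range 0 to n-1; the function name `change_position` and the `fixed` flag are illustrative.

```python
import random

def change_position(num_payees, rng, fixed=False):
    # The change output is inserted into a vector that already holds
    # num_payees outputs, so there are num_payees + 1 possible slots.
    # The flawed code only draws from num_payees of them.
    slots = num_payees + 1 if fixed else num_payees
    return rng.randrange(slots)

rng = random.Random(0)
buggy_one_payee = {change_position(1, rng) for _ in range(1000)}
buggy_three_payees = {change_position(3, rng) for _ in range(1000)}
fixed_three_payees = {change_position(3, rng, fixed=True) for _ in range(1000)}
```

With one payee the flawed version only ever yields position 0. With three payees it yields positions 0 to 2 but never 3, the last slot, while the corrected version reaches all four.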
Shadow addresses
Given the bug in the official client, BitIodine checks
transactions with exactly two outputs, and also checks that
the first address was new at the time of the transaction.
If it is, chances are it's a shadow address.
It is a heuristic.
Evaluating heuristics
[Figure: Number of clusters. Heur. 1: 5,331,657; Heur. 1+2 det.: 2,175,621; Heur. 1+2 relaxed: 660,494]
[Figure: Cluster size (average and median) for Heur. 1, Heur. 1+2 det. and Heur. 1+2 relaxed. Averages shown: 14.32, 4.30, 1.74; medians shown: 1]
[Figure: Coverage (transaction coverage and addresses clustered) for Heur. 1, Heur. 1+2 det. and Heur. 1+2 relaxed. Values shown: 87.1%, 30%, 54.5%, 39%, 50%, 99.7%]
The Classifier: labels for addresses
The Classifier: labels for users
Example: classifying an address and its owner
• Classify an address we found in IRC chat logs of channel #bitcoin-otc
• Borrowed money from other Bitcoin-OTC users
Zero-balance address, exhausted in 84 transactions, belonging to user xisalty on the BitcoinTalk forum and user xisalty-otc on Bitcoin-OTC.
The owner is a known scammer!
Every address belonging to the user is empty; 4.5% are one-time addresses, 2.3% are zombies.
A real-world case: investigating the
Silk Road
• Identify the addresses owned by the Silk Road
• Investigate the activity of the addresses
One of the bitcoin addresses that moved
most funds on the network during 2012 is:
1DkyBEKt5S2GDtv7aQw6rQepAvnsRyHoYM
Using the Mt.Gox scraper, we find that:
• On July 17, 2012 this address had a balance of 517,825 BTC
• On the same day Bitcoin looked on the verge of breaking 10 USD/BTC
• At 02:00 AM, someone sold 10,000 BTC at 9 USD/BTC, driving the price down
• At 02:29, two large withdrawals of 20,000 BTC and 60,000 BTC, respectively, were made from this address one after the other, and included in a block at 02:32.
• Mt.Gox, at the time, needed 6 confirmations in order to allow the user to spend deposited bitcoins. 6 confirmations matured with the block at height 189421, relayed at 02:47:24 AM
• At 02:52 and 02:53, someone sold approximately 15,000 BTC at market price in several batches, causing the price to drop below 7.5 USD/BTC.
• Sign up to the Silk Road
• Deposit 0.001 BTC to a one-time deposit address
• The coins are mixed: the deposit address is provably in the same wallet as more than 25,000 other addresses
• Find a connection between the addresses in the
mixer and the large 1Dky... address
• The mixer is a cluster active since June 18, 2012, with more than 80,000 inputs/outputs
• Follow the flow of coins!
Multi-hop connection found: the 1Dky... address belongs to the Silk
Road.
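Following the flow of coins across several hops can be sketched as a breadth-first search over a directed payment graph. This is one plausible way to look for such a connection, not necessarily how BitIodine does it; the function name, the edge list and the node labels are hypothetical.

```python
from collections import deque

def find_path(edges, source, target):
    # Build an adjacency list: sender -> list of receivers.
    graph = {}
    for sender, receiver in edges:
        graph.setdefault(sender, []).append(receiver)
    # Breadth-first search returns a shortest path, or None.
    queue = deque([[source]])
    visited = {source}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            return path
        for nxt in graph.get(node, []):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None

# Hypothetical payment edges between address clusters.
edges = [
    ("deposit", "mixer-1"),
    ("mixer-1", "mixer-2"),
    ("mixer-2", "1Dky..."),
    ("deposit", "other"),
]
```

With this data, `find_path(edges, "deposit", "1Dky...")` returns the three-hop path through the two mixer nodes. No path exists in the opposite direction, because the edges follow the direction of payment.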
Conclusions
We present BitIodine
We test it on real-world use cases
we are able to get valuable
information
we get insights on the Silk Road
Plans for the future
add user-friendly front-end to
the framework
Thank you.
