twitter

Thursday, March 19, 2015

Machine learning to rapidly search for the correct bitcoin block header nonce

Using machine learning, is it possible to predict the bitcoin nonce that will satisfy the difficulty target condition?

In this blog, I will attempt to do this. It is a work in progress, but the results are promising.

Background

First, lets begin with a VERY brief introduction to the relevent portions of bitcoin and bitcoin mining to help understand the rest of the blog. Bitcoin is an online paymeny system, which operates in a completely decentralized manner. Bitcoins, the currency of the bitcoin system, are exchanged for goods and services. A good analogy to bitcoins in the conventional finalcial world is a cheque. Cheques can be used as payment for goods and services. However, the cheque has to be verified to ensure that the payor has sufficient funds to cover the cheque, a process that is performed by the banks. In the bitcoin system, the process of transaction verification (i.e., bitcoin mining) is performed by the bitcoin community (the equivelent of conventional banks, but completely decentralized).

Bitcoin mining, or transaction verification, is performed in chunks called "blocks". Successful verification of a block is rewarded by the bitcoin system with 25 bitcoins paid to the first person providing proof of verification. Each block is formed by collating all the transactions since the previous block, constructing a block header from these transactions, and then altering this header methodically until the SHA256 double hash of this header is less than the currently set "difficulty" target. Ken Shirriff's excellent blog post (http://www.righto.com/2014/02/bitcoin-mining-hard-way-algorithms.html) explains the specifics of this process well and is worth reading carefully.

For the purposes of this blog, here are the take-away: - if a bitcoin block is verified, the verifier gets paid 25 bitcoins (as of this writing) - the SHA256 double hash of the block header has to be below a predefined difficulty target for the verification to be deemed successful - bitcoin miner verify blocks and meet the difficulty threshold to get paid

Assumptions (or why bitcoin mining is deemed to be hard)

SHA256 hashing function is a one-way hash function. The only way to know if the hash is below the difficulty target set by the bitcoin network is to alter the input one by one and compare the resulting hash to the target - i.e., by brute force. Since the size of the search space is very very large, a lot of computational power is necessary to be successful at this task.

The current approach to mining

The block header is an 80 byte string made of 1. version (4 bytes) 2. previous block's hash (32 bytes) 3. Merkle root (32 bytes) 4. timestamp (4 bytes) 5. bits (4 bytes) 6. nonce (4 bytes)

Of these, miners systematically alter the nonce (and timestamp with some constraints), compute the double SHA256 hash for each change, and check if the result is below the target. Merkle root can also be changed (by changing some of the coinbase transaction fields).

The proposed machine learning approach

The idea is to take the previously solved block headers and learn where the nonce might be. To do this, take each solved header and build the training data table such as this:

-----------------------------------------------------------------------  
| ver | prev_hash | merkle_root | time | bits | candidate nonce | label|
|----------------------------------------------------------------------|  
| 3   | 46ae7...  | 8732ad...   | 843..| 1d.. | 00000001        | 0    |
| 3   | 46ae7...  | 8732ad...   | 843..| 1d.. | 00010001        | 0    |
| .   | ...       | ...         | ..   | ..   | .               | .    |
| 3   | 46ae7...  | 8732ad...   | 843..| 1d.. | 10000001        | 1    |
| 3   | 46ae7...  | 8732ad...   | 843..| 1d.. | 40000001        | 1    |
-----------------------------------------------------------------------       

The "candidate nonce" and "label" need explaining. The candidate nonces are numbers less than and greater than the true nonce - i.e., candidates that we are testing. The label is encoded as 0 if the candidate nonce is less than the true nonce, 1 if greater.

A machine learning classifier can be trained on this data. Once trained, the classifier will predict 0 if the true nonce of an unknown header is greater than the candidate nonce, and 1 if less. So for any given candidate nonce, this classifier will say if the true nonce is lower or higher.

Data for 49800 solved bitcoin headers is available here (python pickle format. See code below)

Python code implementing the above ideas

The necessary imports

import hashlib
import struct
import pandas as pd
import random
import os.path
import pickle
import numpy as np
from sklearn.cross_validation import train_test_split
from sklearn.ensemble import RandomForestClassifier
import json
import matplotlib.pyplot as plt
%matplotlib inline
from matplotlib.legend_handler import HandlerLine2D

Helper functions with some comments

# Helper functions with some commenting
#############################################################
# load block header data
# Note: the data has been preprocessed and has been pickled
def load_data(filename):
    with open(filename, 'rb') as f:
        return pickle.load(f)

# Takes true nonce, returns random nonce value and label indicating 
#   above/below true nonce
# Used to make the training labels
def get_nonce(n):
    r = random.randint(0, 4294967296)
    while r == n:
        r = random.randint(0, 4294967296)
    if r < n:
        l = 0
    elif r > n:
        l = 1
    return (r, l)

# Makes a training set from the block header data by adding 
#   random nonce candidates and label
def make_df(data_dict):
    ml_df = []
    row = []
    hex_int_dict = {'0':0, '1':1, '2':2, '3':3, '4':4, '5':5, 
                    '6':6, '7':7, '8':8, '9':9, 'a': 10, 
                    'b': 11, 'c':12, 'd':13, 'e':14, 'f':15}
    
    for i in data_dict:
        try:
            header = [str(i['ver'])]
            header.extend(list(i['prev_block']))
            header.extend(list(i['mrkl_root']))
            header.extend(list(str(i['time']).zfill(10))) 
            header.extend(list(str(i['bits'])))
            nonce = int(i['nonce'])

            for n in range(150):
                rand_test_nonce, label = get_nonce(nonce)
                row = list(header)
                row = [hex_int_dict[r] for r in row]
                row.extend([rand_test_nonce, label, nonce])
                ml_df.append(row)
        except:
            continue
    return ml_df

# The machine learning routine - uses a Random Forest classifier
# Runs 20 jobs - change this as necessary based on number of cores
# Warning - this takes time and memory to build the model
#     About 5 minutes on a 32 core 32GB machine
# Decrease sample size to reduce memory usage. Accuracy does not seem to suffer.
def train_randomforest(X_train, Y_train):
    clf = RandomForestClassifier(n_jobs=20)
    clf = clf.fit(X_train, Y_train)
    return clf

Load 20000 block headers to train on and a further 4000 for testing

t = load_data('bitcoinheaders.pkl')
# use only 20000 + 4000 rows instead of the ~49000 - makes the process faster
train = t[0:20000]
test = t[21000:25000]

# Convert it to dataframe in a format that sklearn can use
ml_df = pd.DataFrame(make_df(train))
ml_df_test = pd.DataFrame(make_df(test))

# Split the raw data into test and training sets
X = ml_df.columns[0:148]
Y = ml_df.columns[148]
X_train = ml_df[X]
Y_train = ml_df[Y]
X_test = ml_df_test[X]
Y_test = ml_df_test[Y]
pd.DataFrame(X_train).head()
0 1 2 3 4 5 6 7 8 9 ... 138 139 140 141 142 143 144 145 146 147
0 1 0 0 0 0 0 0 0 0 3 ... 9 1 13 0 0 15 15 15 15 612623580
1 1 0 0 0 0 0 0 0 0 3 ... 9 1 13 0 0 15 15 15 15 1273942402
2 1 0 0 0 0 0 0 0 0 3 ... 9 1 13 0 0 15 15 15 15 2892576921
3 1 0 0 0 0 0 0 0 0 3 ... 9 1 13 0 0 15 15 15 15 2279745217
4 1 0 0 0 0 0 0 0 0 3 ... 9 1 13 0 0 15 15 15 15 1415422173

5 rows × 148 columns

Train a classifier

# Train the classifier on the training data
clf = train_randomforest(X_train, Y_train)

# Print the training set accuracy
print "Training set accuracy : " + str(clf.score(X_train, Y_train))

# Print the test set accuracy
print "Test set accuracy : " + str(clf.score(X_test, Y_test))
Training set accuracy : 0.999698333333
Test set accuracy : 0.783788333333

Graph the results

# Graph a prediction in relation to the true nonce
# rerun this cell to get another random selection
unique_nonces = ml_df[149].unique()
true_nonce = unique_nonces[random.randint(0, unique_nonces.shape[0])]
df_to_graph = ml_df.loc[ml_df[149] == true_nonce]
p = clf.predict(df_to_graph[X])

plt.scatter(df_to_graph[147], p, label='Candidate nonces')
plt.scatter(true_nonce, 0.5, c='red', label='True nonce for reference')
plt.xlabel('Nonce - 0 to 2^32')
plt.ylabel('Probability(true nonce < selected nonce)' )
plt.legend(loc=2)
<matplotlib.legend.Legend at 0x7f886f8f98d0>

Feature importance

clf.feature_importances_
array([ 0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
        0.        ,  0.        ,  0.        ,  0.        ,  0.00505718,
        0.00411218,  0.00443182,  0.0042087 ,  0.0042018 ,  0.00401395,
        0.00403606,  0.00432378,  0.0043062 ,  0.00433245,  0.00423386,
        0.00406486,  0.00432232,  0.00410184,  0.00426883,  0.00427783,
        0.00415867,  0.00421704,  0.00432546,  0.00423495,  0.00417807,
        0.00435239,  0.00399234,  0.00408457,  0.00419569,  0.0042272 ,
        0.0041832 ,  0.00421131,  0.00436315,  0.00405665,  0.00407534,
        0.00405407,  0.00444397,  0.00426483,  0.00413364,  0.00441375,
        0.00413599,  0.0041483 ,  0.00430871,  0.00402952,  0.00445774,
        0.00407513,  0.00440545,  0.00401896,  0.0042067 ,  0.004515  ,
        0.00434291,  0.00427326,  0.00433099,  0.00432221,  0.00435674,
        0.00394147,  0.00419859,  0.00431308,  0.00405089,  0.00422883,
        0.00413344,  0.00419294,  0.00418918,  0.00421018,  0.00428869,
        0.00417185,  0.0039788 ,  0.00421298,  0.00437247,  0.00417522,
        0.004153  ,  0.00438604,  0.00432773,  0.0040969 ,  0.00437398,
        0.00419273,  0.00422763,  0.00438585,  0.00424765,  0.00435936,
        0.00417301,  0.00427413,  0.00409838,  0.00414348,  0.00424256,
        0.00421371,  0.00411336,  0.00425224,  0.00424889,  0.00431411,
        0.00396213,  0.00436847,  0.00416592,  0.00435117,  0.00420203,
        0.00440759,  0.00433886,  0.0042691 ,  0.00420313,  0.0044612 ,
        0.00468124,  0.00426538,  0.00428175,  0.00435905,  0.00416996,
        0.00421363,  0.00442613,  0.00445018,  0.00441981,  0.00429876,
        0.00421206,  0.00432622,  0.00442857,  0.00418831,  0.0042089 ,
        0.00401741,  0.00426253,  0.00422956,  0.00425827,  0.00441802,
        0.00439529,  0.00434705,  0.00417416,  0.00424235,  0.        ,
        0.        ,  0.02489109,  0.00509744,  0.00338871,  0.00344143,
        0.00355746,  0.00329374,  0.00357129,  0.00355075,  0.        ,
        0.00323213,  0.0033215 ,  0.00279406,  0.0077297 ,  0.01157081,
        0.00901339,  0.01932312,  0.38234826])

Each byte in the header contributes equally (about 0.004) to the prediction, except for the candidate nonce which contributes a lot more. Makes sense.

Concluding remarks

As can be seen from the above test set accuracy and the graph, the classifier does a very good job of identifying the nonce. Also, "feature importance" shows that most of the bytes in the header contribute equally to the prediction - an observation that makes intuitive sense. There is a tremendous amount of interraction between the input variables (bytes) in the SHA hashing algorithms. The candidate nonce contributes a lot more, which also makes intuitive sense.

In this blog, I have trained a classifier to search for the nonce. A classifier can obviously also be trained to search for the correct timestamp, or a combination of the two, by changing the data encoding. The next step is to use the classifier on live data from bitcoind and try to mine the next block.

A pot of bitcoins await at the end of the random forest!