Using machine learning, is it possible to predict the bitcoin nonce that will satisfy the difficulty target condition?

In this blog, I will attempt to do this. It is a work in progress, but the results are promising.

Background

First, let's begin with a VERY brief introduction to the relevant portions of bitcoin and bitcoin mining to help understand the rest of the blog. Bitcoin is an online payment system, which operates in a completely decentralized manner. Bitcoins, the currency of the bitcoin system, are exchanged for goods and services. A good analogy to bitcoins in the conventional financial world is a cheque. Cheques can be used as payment for goods and services. However, the cheque has to be verified to ensure that the payor has sufficient funds to cover the cheque, a process that is performed by the banks. In the bitcoin system, the process of transaction verification (i.e., bitcoin mining) is performed by the bitcoin community (the equivalent of conventional banks, but completely decentralized).

Bitcoin mining, or transaction verification, is performed in chunks called "blocks". Successful verification of a block is rewarded by the bitcoin system with 25 bitcoins paid to the first person providing proof of verification. Each block is formed by collating all the transactions since the previous block, constructing a block header from these transactions, and then altering this header methodically until the SHA256 double hash of this header is less than the currently set "difficulty" target. Ken Shirriff's excellent blog post (http://www.righto.com/2014/02/bitcoin-mining-hard-way-algorithms.html) explains the specifics of this process well and is worth reading carefully.

For the purposes of this blog, here are the takeaways:
- if a bitcoin block is verified, the verifier gets paid 25 bitcoins (as of this writing)
- the SHA256 double hash of the block header has to be below a predefined difficulty target for the verification to be deemed successful
- bitcoin miners verify blocks and meet the difficulty threshold to get paid

Assumptions (or why bitcoin mining is deemed to be hard)

SHA256 is a one-way hash function: there is no known way to work backwards from a desired output to an input that produces it. The only way to find a header whose hash is below the difficulty target set by the bitcoin network is to alter the input step by step and compare each resulting hash to the target, i.e., by brute force. Since the search space is very large (the nonce alone has 2^32 possible values), a lot of computational power is necessary to be successful at this task.
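The brute-force search described above can be sketched in a few lines of Python 3. This is a toy illustration, not a real miner: the all-zero 76-byte `toy_prefix` and the very easy `toy_target` are made-up values chosen so the loop finishes quickly, and the function names are hypothetical.

```python
import hashlib
import struct

def double_sha256(data):
    """SHA256 applied twice, as used for bitcoin block headers."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def brute_force_nonce(header_prefix, target, max_nonce=2**32 - 1):
    """Try nonces in order until the header hash is below the target."""
    for nonce in range(max_nonce + 1):
        header = header_prefix + struct.pack("<L", nonce)  # 4-byte little-endian nonce
        # The digest is compared to the target as a little-endian 256-bit integer.
        hash_int = int.from_bytes(double_sha256(header), "little")
        if hash_int < target:
            return nonce, hash_int
    return None  # nonce space exhausted without success

# Assumed toy inputs: an all-zero 76-byte prefix and a target that only
# demands 12 leading zero bits (about 1 success per 4096 tries).
toy_prefix = b"\x00" * 76
toy_target = 1 << 244
found_nonce, found_hash = brute_force_nonce(toy_prefix, toy_target)
print("nonce:", found_nonce, "hash: %064x" % found_hash)
```

Each additional leading zero bit demanded by the target doubles the expected number of tries, which is why real difficulty targets call for so much computational power.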

The current approach to mining

The block header is an 80 byte string made of:
1. version (4 bytes)
2. previous block's hash (32 bytes)
3. Merkle root (32 bytes)
4. timestamp (4 bytes)
5. bits (4 bytes)
6. nonce (4 bytes)
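As a concrete check of this layout, the Python 3 sketch below packs the six fields of the well-known genesis block into 80 bytes, double hashes them, and compares the result to the target encoded in "bits". The helper names (`pack_header`, `header_hash`, `bits_to_target`) are hypothetical, and `bits_to_target` is a simplified decoder that ignores the sign bit and very small exponents.

```python
import hashlib
import struct

def pack_header(ver, prev_block, mrkl_root, time, bits, nonce):
    """Serialize the six fields into the 80-byte header that gets hashed."""
    return (struct.pack("<L", ver)
            + bytes.fromhex(prev_block)[::-1]   # hashes are stored byte-reversed
            + bytes.fromhex(mrkl_root)[::-1]
            + struct.pack("<LLL", time, bits, nonce))

def header_hash(header):
    """Double SHA256 of the header, shown in the usual big-endian hex form."""
    digest = hashlib.sha256(hashlib.sha256(header).digest()).digest()
    return digest[::-1].hex()

def bits_to_target(bits):
    """Expand the compact 'bits' field into the full 256-bit target."""
    exponent = bits >> 24
    mantissa = bits & 0x007fffff
    return mantissa << (8 * (exponent - 3))

# Field values of the bitcoin genesis block.
genesis = pack_header(
    ver=1,
    prev_block="00" * 32,
    mrkl_root="4a5e1e4baab89f3a32518a88c31bc87f618f76673e2cc77ab2127b7afdeda33b",
    time=1231006505,
    bits=0x1d00ffff,
    nonce=2083236893,
)
genesis_hash = header_hash(genesis)
print(len(genesis), genesis_hash)
print(int(genesis_hash, 16) < bits_to_target(0x1d00ffff))
```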

Of these, miners systematically alter the nonce (and timestamp with some constraints), compute the double SHA256 hash for each change, and check if the result is below the target. Merkle root can also be changed (by changing some of the coinbase transaction fields).
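To illustrate why changing a coinbase field changes the Merkle root, here is a minimal Python 3 sketch of the Merkle root computation (pairwise double SHA256, with the last hash duplicated when a level has an odd count). The transaction ids are made up for illustration, and the "extra nonce" strings are hypothetical stand-ins for a real coinbase transaction.

```python
import hashlib

def double_sha256(data):
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def merkle_root(txids):
    """Merkle root of a list of transaction ids given as big-endian hex strings."""
    level = [bytes.fromhex(t)[::-1] for t in txids]
    while len(level) > 1:
        if len(level) % 2:              # odd count: the last hash is paired with itself
            level.append(level[-1])
        level = [double_sha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0][::-1].hex()

# Hypothetical transaction ids, made up for illustration only.
other_txids = [double_sha256(b"tx-%d" % i).hex() for i in range(3)]
coinbase_a = double_sha256(b"coinbase, extra nonce 0").hex()
coinbase_b = double_sha256(b"coinbase, extra nonce 1").hex()

root_a = merkle_root([coinbase_a] + other_txids)
root_b = merkle_root([coinbase_b] + other_txids)
print(root_a)
print(root_b)
```

Because the coinbase transaction id feeds into the root, any change to the coinbase gives the miner a fresh header and a fresh 2^32 nonce values to try.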

The proposed machine learning approach

The idea is to take the previously solved block headers and learn where the nonce might be. To do this, take each solved header and build a training data table such as this:

-----------------------------------------------------------------------  
| ver | prev_hash | merkle_root | time | bits | candidate nonce | label|
|----------------------------------------------------------------------|  
| 3   | 46ae7...  | 8732ad...   | 843..| 1d.. | 00000001        | 0    |
| 3   | 46ae7...  | 8732ad...   | 843..| 1d.. | 00010001        | 0    |
| .   | ...       | ...         | ..   | ..   | .               | .    |
| 3   | 46ae7...  | 8732ad...   | 843..| 1d.. | 10000001        | 1    |
| 3   | 46ae7...  | 8732ad...   | 843..| 1d.. | 40000001        | 1    |
-----------------------------------------------------------------------

The "candidate nonce" and "label" need explaining. The candidate nonces are numbers less than and greater than the true nonce - i.e., candidates that we are testing. The label is encoded as 0 if the candidate nonce is less than the true nonce, 1 if greater.

A machine learning classifier can be trained on this data. Once trained, the classifier will predict 0 if the true nonce of an unknown header is greater than the candidate nonce, and 1 if less. So for any given candidate nonce, this classifier will say if the true nonce is lower or higher.
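If such a classifier were perfectly accurate, its above/below answers could drive a binary search over the 2^32 nonce values. The Python 3 sketch below is an assumption about how the predictions might be used, and `perfect_oracle` is a hypothetical stand-in for the trained model that simply knows the answer.

```python
def locate_nonce(is_above, lo=0, hi=2**32 - 1):
    """Binary search using answers to 'is this candidate above the true nonce?'.

    is_above(candidate) must return 1 if candidate > true nonce, else 0.
    Returns the located nonce and the number of queries made.
    """
    queries = 0
    while lo < hi:
        mid = (lo + hi + 1) // 2
        queries += 1
        if is_above(mid):
            hi = mid - 1     # true nonce lies below the candidate
        else:
            lo = mid         # true nonce is the candidate or lies above it
    return lo, queries

true_nonce = 2083236893   # example value (the genesis block nonce)

def perfect_oracle(candidate):
    return 1 if candidate > true_nonce else 0

found, queries = locate_nonce(perfect_oracle)
print(found == true_nonce, queries)
```

With a perfect oracle the search needs 32 queries. An imperfect classifier can send the search into the wrong half, so any located nonce would still have to be verified by hashing the header.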

Data for 49800 solved bitcoin headers is available here (python pickle format. See code below)

Python code implementing the above ideas

The necessary imports

import hashlib
import struct
import pandas as pd
import random
import os.path
import pickle
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
import json
import matplotlib.pyplot as plt
%matplotlib inline
from matplotlib.legend_handler import HandlerLine2D

Helper functions with some comments

# Helper functions with some commenting
#############################################################
# load block header data
# Note: the data has been preprocessed and has been pickled
def load_data(filename):
    with open(filename, 'rb') as f:
        return pickle.load(f)

# Takes true nonce, returns random nonce value and label indicating 
#   above/below true nonce
# Used to make the training labels
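def get_nonce(n):
    # randint is inclusive at both ends, so cap at 2^32 - 1 (the largest 32-bit nonce)
    r = random.randint(0, 4294967295)
    while r == n:
        r = random.randint(0, 4294967295)
    # label 0: candidate is below the true nonce; label 1: candidate is above it
    l = 0 if r < n else 1
    return (r, l)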

# Makes a training set from the block header data by adding 
#   random nonce candidates and label
def make_df(data_dict):
    ml_df = []
    row = []
    hex_int_dict = {'0':0, '1':1, '2':2, '3':3, '4':4, '5':5, 
                    '6':6, '7':7, '8':8, '9':9, 'a': 10, 
                    'b': 11, 'c':12, 'd':13, 'e':14, 'f':15}
    
    for i in data_dict:
        try:
            header = [str(i['ver'])]
            header.extend(list(i['prev_block']))
            header.extend(list(i['mrkl_root']))
            header.extend(list(str(i['time']).zfill(10))) 
            header.extend(list(str(i['bits'])))
            nonce = int(i['nonce'])

            for n in range(150):
                rand_test_nonce, label = get_nonce(nonce)
                row = list(header)
                row = [hex_int_dict[r] for r in row]
                row.extend([rand_test_nonce, label, nonce])
                ml_df.append(row)
        except Exception:
            # skip headers with missing or malformed fields
            continue
    return ml_df

# The machine learning routine - uses a Random Forest classifier
# Runs 20 jobs - change this as necessary based on number of cores
# Warning - this takes time and memory to build the model
#     About 5 minutes on a 32 core 32GB machine
# Decrease sample size to reduce memory usage. Accuracy does not seem to suffer.
def train_randomforest(X_train, Y_train):
    clf = RandomForestClassifier(n_jobs=20)
    clf = clf.fit(X_train, Y_train)
    return clf

Load 20000 block headers to train on and a further 4000 for testing

t = load_data('bitcoinheaders.pkl')
# use only 20000 + 4000 rows instead of the ~49000 - makes the process faster
train = t[0:20000]
test = t[21000:25000]

# Convert it to dataframe in a format that sklearn can use
ml_df = pd.DataFrame(make_df(train))
ml_df_test = pd.DataFrame(make_df(test))

# Select the feature columns (header digits + candidate nonce) and the label column
X = ml_df.columns[0:148]
Y = ml_df.columns[148]
X_train = ml_df[X]
Y_train = ml_df[Y]
X_test = ml_df_test[X]
Y_test = ml_df_test[Y]

pd.DataFrame(X_train).head()

	0	9	...	138	139	140	143	144	145	146	147
0	1	3	...	9	1	13	15	15	15	15	612623580
1	1	3	...	9	1	13	15	15	15	15	1273942402
2	1	3	...	9	1	13	15	15	15	15	2892576921
3	1	3	...	9	1	13	15	15	15	15	2279745217
4	1	3	...	9	1	13	15	15	15	15	1415422173

5 rows × 148 columns
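The column indices used above follow from how make_df encodes each header, one digit per column. The short Python sketch below works out that layout. It assumes that 'bits' is stored as an 8-character hex string (which is what columns 139 to 146 in the output suggest) and that the two hashes are 64-character hex strings. `FIELD_WIDTHS` and the variable names are introduced here for illustration only.

```python
# Number of feature columns each header field occupies after encoding.
# Assumption: 'bits' is stored as an 8-character hex string (e.g. '1d00ffff').
FIELD_WIDTHS = [
    ("ver", 1),          # a single digit
    ("prev_block", 64),  # 32 bytes as 64 hex digits
    ("mrkl_root", 64),   # 32 bytes as 64 hex digits
    ("time", 10),        # zero-padded decimal digits
    ("bits", 8),
]

offsets = {}
start = 0
for name, width in FIELD_WIDTHS:
    offsets[name] = (start, start + width - 1)  # inclusive column range
    start += width

candidate_col = start        # random candidate nonce, the last feature
label_col = start + 1        # 0/1 label (the prediction target)
true_nonce_col = start + 2   # true nonce, kept only for plotting

print(offsets)
print(candidate_col, label_col, true_nonce_col)
```

Under these assumptions the features are columns 0 to 147, the label is column 148, and the true nonce in column 149 is never shown to the classifier.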

Train a classifier

# Train the classifier on the training data
clf = train_randomforest(X_train, Y_train)

# Print the training set accuracy
print("Training set accuracy : " + str(clf.score(X_train, Y_train)))

# Print the test set accuracy
print("Test set accuracy : " + str(clf.score(X_test, Y_test)))

Training set accuracy : 0.999698333333
Test set accuracy : 0.783788333333
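One way to put an accuracy figure like this in context is a baseline that never looks at the header: it labels a candidate 1 whenever the candidate exceeds the mean of the true nonces seen in training. The Python 3 sketch below runs that baseline on synthetic, uniformly distributed nonces. This is an assumption made so the example is self-contained, and it does not use the real header data. `baseline_accuracy` is a hypothetical helper name.

```python
import random

NONCE_MAX = 2**32 - 1

def baseline_accuracy(train_nonces, test_pairs):
    """Accuracy of predicting 'candidate is above the true nonce' from the candidate alone."""
    threshold = sum(train_nonces) / len(train_nonces)  # mean true nonce in training
    correct = 0
    for candidate, true_nonce in test_pairs:
        predicted = 1 if candidate > threshold else 0
        actual = 1 if candidate > true_nonce else 0
        correct += predicted == actual
    return correct / len(test_pairs)

rng = random.Random(0)  # fixed seed so the run is repeatable
train_nonces = [rng.randint(0, NONCE_MAX) for _ in range(20000)]
# Each test pair is (random candidate nonce, true nonce), both synthetic.
test_pairs = [(rng.randint(0, NONCE_MAX), rng.randint(0, NONCE_MAX))
              for _ in range(100000)]

accuracy = baseline_accuracy(train_nonces, test_pairs)
print("Header-blind baseline accuracy: %.3f" % accuracy)
```

On uniformly distributed nonces this baseline is right about 75% of the time, so a test accuracy is best judged against that level rather than against 50%.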

Graph the results

# Graph a prediction in relation to the true nonce
# rerun this cell to get another random selection
unique_nonces = ml_df[149].unique()
true_nonce = unique_nonces[random.randrange(unique_nonces.shape[0])]
df_to_graph = ml_df.loc[ml_df[149] == true_nonce]
p = clf.predict(df_to_graph[X])

plt.scatter(df_to_graph[147], p, label='Candidate nonces')
plt.scatter(true_nonce, 0.5, c='red', label='True nonce for reference')
plt.xlabel('Nonce - 0 to 2^32')
plt.ylabel('Predicted label (1 = true nonce < selected nonce)')
plt.legend(loc=2)

[Figure: scatter plot of the predicted label for each candidate nonce, with the true nonce marked in red for reference]

Feature importance

clf.feature_importances_

array([ 0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
        0.        ,  0.        ,  0.        ,  0.        ,  0.00505718,
        0.00411218,  0.00443182,  0.0042087 ,  0.0042018 ,  0.00401395,
        0.00403606,  0.00432378,  0.0043062 ,  0.00433245,  0.00423386,
        0.00406486,  0.00432232,  0.00410184,  0.00426883,  0.00427783,
        0.00415867,  0.00421704,  0.00432546,  0.00423495,  0.00417807,
        0.00435239,  0.00399234,  0.00408457,  0.00419569,  0.0042272 ,
        0.0041832 ,  0.00421131,  0.00436315,  0.00405665,  0.00407534,
        0.00405407,  0.00444397,  0.00426483,  0.00413364,  0.00441375,
        0.00413599,  0.0041483 ,  0.00430871,  0.00402952,  0.00445774,
        0.00407513,  0.00440545,  0.00401896,  0.0042067 ,  0.004515  ,
        0.00434291,  0.00427326,  0.00433099,  0.00432221,  0.00435674,
        0.00394147,  0.00419859,  0.00431308,  0.00405089,  0.00422883,
        0.00413344,  0.00419294,  0.00418918,  0.00421018,  0.00428869,
        0.00417185,  0.0039788 ,  0.00421298,  0.00437247,  0.00417522,
        0.004153  ,  0.00438604,  0.00432773,  0.0040969 ,  0.00437398,
        0.00419273,  0.00422763,  0.00438585,  0.00424765,  0.00435936,
        0.00417301,  0.00427413,  0.00409838,  0.00414348,  0.00424256,
        0.00421371,  0.00411336,  0.00425224,  0.00424889,  0.00431411,
        0.00396213,  0.00436847,  0.00416592,  0.00435117,  0.00420203,
        0.00440759,  0.00433886,  0.0042691 ,  0.00420313,  0.0044612 ,
        0.00468124,  0.00426538,  0.00428175,  0.00435905,  0.00416996,
        0.00421363,  0.00442613,  0.00445018,  0.00441981,  0.00429876,
        0.00421206,  0.00432622,  0.00442857,  0.00418831,  0.0042089 ,
        0.00401741,  0.00426253,  0.00422956,  0.00425827,  0.00441802,
        0.00439529,  0.00434705,  0.00417416,  0.00424235,  0.        ,
        0.        ,  0.02489109,  0.00509744,  0.00338871,  0.00344143,
        0.00355746,  0.00329374,  0.00357129,  0.00355075,  0.        ,
        0.00323213,  0.0033215 ,  0.00279406,  0.0077297 ,  0.01157081,
        0.00901339,  0.01932312,  0.38234826])

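Most digits of the header contribute about equally (roughly 0.004 each) to the prediction. A few columns have zero importance, presumably because they never vary in this dataset. The candidate nonce contributes far more (about 0.38), which makes sense.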

Concluding remarks

As the test set accuracy and the graph above show, the classifier predicts on which side of a candidate the true nonce lies with about 78% accuracy on unseen headers. Also, "feature importance" shows that most of the digits in the header contribute equally to the prediction, an observation that makes intuitive sense: there is a tremendous amount of interaction between the input variables in the SHA hashing algorithms. The candidate nonce contributes a lot more, which also makes intuitive sense.

In this blog, I have trained a classifier to search for the nonce. A classifier can obviously also be trained to search for the correct timestamp, or a combination of the two, by changing the data encoding. The next step is to use the classifier on live data from bitcoind and try to mine the next block.

A pot of bitcoins awaits at the end of the random forest!

16 comments:

Anonymous, March 20, 2015 at 5:40 AM
Silly bear, you left the labels in your training data. Good name for your blog.
Noir Wons, April 16, 2015 at 11:51 AM
Hello Arun! Thank you for the thought-provoking experiment. I really appreciate the amount of detail and repeatability you've provided.

Have you considered comparing your test set performance to the "naive baseline" that classifies a nonce based on whether its value is greater than or less than the average value of all "true" nonces in your training set?
__, August 18, 2015 at 8:24 PM
After more testing, I found an error in the article. The learning algorithm does not learn to localize the nonce. All that it has done is learn the (somewhat skewed) distribution of the nonces in the dataset I used. It uses this information to "guess" the nonce's location. It does not really learn the nonce's location from the header.

Well... it was too good to be true anyway ;-)
Fei, March 20, 2018 at 12:26 AM
Wondering where did you download the blocks data from?
srikanth, September 26, 2018 at 2:34 PM
Masterfully done.

Careless Learner

Thursday, March 19, 2015

Machine learning to rapidly search for the correct bitcoin block header nonce