The outcome of any one round in blackjack depends on
- the current shoe (the cards that are left in it)
- the dealer's face card
- the player's hand
The player's decision points are dependent on the composition of the shoe. The player has to decide:
- How much to bet before the round begins (a function of expected returns)
- What stay threshold to use (i.e., when to stop requesting more cards) There are others, such as split/double/insurance, but I have ignored these for now.
In this NB, I wrote a basic BJ simulator and tested the relation between the current deck, dealer's face card, the player's stay threshold and the expected returns. Plots the result using seaborn heatmap. The simulator does not impliment doubling, spliting or insurance (all of which could increase the player's odds if done right)
As an aside, I don't know much about numba. I just added @jit, and magically, the code speed up achieved was 100x; INCREDIBLY fast; so fast it's almost VOODOO.
The code for the simulations is presented first. The analysis follows.
import numpy as np
import pandas as pd
from numba import jit
%matplotlib inline
import seaborn as sns; sns.set()
from matplotlib import pyplot as plt
The code has the following functions:
- new_deck - makes a deck of 6 packs. All Aces count as 11.
- payout - calculates who won, and how much
- player - pops cards from the show until the sum of the cards is above the requested threshold
- deal_hand - deals 2 cards for the player and 1 for the dealer
- play_out_round - finishes the round, giving cards to the player until the sum of the cards crosses the requested threshold; then the dealer with a threshold of 17. Then determines payout.
- tabulate and tabulate_threshold are helper functions to aggregate the results
@jit
def new_deck(takeout=[]):
'''make a 6 pack deck, removing cards from the takeout list
the takeout list is used to deliberately bias the deck if needed'''
cards = np.repeat([2,3,4,5,6,7,8,9,10,10,10,10,11], 4 * 6)
for i in takeout:
cards = cards[cards != i]
np.random.shuffle(cards)
return cards
@jit
def payout(p_sum, d_sum):
'''takes player's card sum and dealers card sum
returns the player's payout'''
if p_sum == 21:
return 1.5 # 3:2 payment if blackjack
elif p_sum == 21 and d_sum == 21:
return 0
elif p_sum > 21:
return -1
elif d_sum > 21:
return 1
elif p_sum > d_sum:
return 1
elif p_sum == d_sum:
return 0
else:
return -1
@jit
def player(cards, stay_threshold = 13, current_cards = None):
'''Takes a list of cards, a threshold above which to stay and the sum of current cards
returns the cards sum after the dealing'''
count = current_cards
while count < stay_threshold:
count += cards[0]
cards = cards[1:]
return cards, count
@jit
def deal_hand(deck):
'''Deals player's 2 cards and the dealer's one
Returns these and the deck'''
player_2, deck = deck[0] + deck[1], deck[2:]
dealer_1, deck = deck[2], deck[3:]
return deck, player_2, dealer_1
@jit
def play_out_round(deck, p_sum, d_sum, threshold, hit = False):
deck, p_sum = player(deck, stay_threshold = threshold, current_cards = p_sum)
deck, d_sum = player(deck, stay_threshold=17, current_cards = d_sum)
if hit:
return payout(p_sum + deck[0], d_sum), p_sum, d_sum, len(deck)-1
else:
return payout(p_sum, d_sum), p_sum, d_sum, len(deck)
@jit()
def tabulate(table):
''' takes what simulate returns and aggregates'''
# Convert the results to pandas DF, pivot and aggregate.
df = pd.DataFrame(table)
df.columns = ['DealerHand', 'PlayerHand', 'Payoff', 'Threshold', "Cards Left In Deck"]
pvt = pd.pivot_table(df, index='DealerHand', columns='PlayerHand',
values='Payoff', aggfunc='mean', fill_value=0)
return pvt.loc[:,:21]
@jit()
def tabulate_threshold(table):
''' takes what simulate returns and aggregates'''
# Convert the results to pandas DF, pivot and aggregate.
df = pd.DataFrame(table)
df.columns = ['DealerHand', 'PlayerHand', 'Payoff', 'Threshold', "Cards Left In Deck"]
pvt = pd.pivot_table(df, index='DealerHand', columns='Threshold',
values='Payoff', aggfunc='mean', fill_value=0)
#values='Payoff', aggfunc='std', fill_value=0)
return pvt.loc[:,:21]
The following functions simulate 1000000 single rounds using the given shoe.
- simulate_stay simulates the results of staying when the threshold is crossed
- simulate_hit does one more hit
# Simulate 1000000 rounds from the same deck of cards
@jit(nopython = True)
def simulate_stay(passed_deck):
''' take a deck, simulate 1 round many times with staying at random values
return the results'''
table = []
for _ in range(1000000):
deck = np.copy(passed_deck)
np.random.shuffle(deck)
rand_threshold = np.random.randint(5, 22)
deck, p, de = deal_hand(deck)
pay, p_sum, d_sum, test = play_out_round(deck, p, de, rand_threshold)
#table.append([de, rand_threshold, pay])
table.append([de, p_sum, pay, rand_threshold, test])
return table
@jit(nopython = True)
def simulate_hit(passed_deck):
''' Given a hand for the player and the dealer, should I hit or stay?
This shows what will happen if you hit (the previous function simulates staying)'''
table = []
for _ in range(1000000):
deck = np.copy(passed_deck)
np.random.shuffle(deck)
rand_threshold = np.random.randint(5, 22)
deck, p, de = deal_hand(deck)
pay, p_sum, d_sum, test = play_out_round(deck, p, de, rand_threshold, hit=True)
table.append([de, p_sum, pay, rand_threshold, test])
return table
The final result is a table of expected returns (from -1.0 to +1.5) for every dealer hand and every threshold. Plotting the table as a heatmap is useful to identify trends.
I should explain what I mean by threshold - this is the count that the player chooses at or above which to stand. For e.g., if the dealer has 7 and the player chooses a threshold of 15, once the player's cards sum to 15 or greater, he/she will stand.
Simulation of expected return for various stay thresholds in a full shoe (6 decks)¶
deck = new_deck(takeout=[])
simulation = simulate_stay(deck)
table = tabulate_threshold(simulation)
sns.heatmap(table)
table
The table above shows the result of 1 million simulations. An example interpretation: if the dealer has 6, and the player sets her threshold to 12 (i.e., stands at or above 12), her expected return is \$0.242316 for every \$1.00 invested. If the dealer has 9 or above, the player has no threshold that has a positive expected value.
The best threshold for each dealer hand¶
If the dealer has 2, the most profitable threshold is 12 (i.e., stay at or above 12). Dealer has 8 = the most profitable threshold is at 17.
df=pd.concat([table.idxmax(axis=1), table.max(axis=1)], axis=1)
df.columns = ['Optimal Threshold', 'Expected Return']
df_optimal = df['Optimal Threshold']
df_optimal
The mean and standard deviation of expected returns¶
If a player plays a 100 hands, the expected return is \$2.377 with a standard deviation of \$10.28
# Simulate 1 million trials
t = simulate_stay(new_deck(takeout=[]))
df = pd.DataFrame(t)
df.columns = ['DealerHand', 'PlayerHand', 'Payoff', 'Threshold', 'Test']
# Only choose optimal thresholds (optimal defined using previous simulations)
l = []
for k, v in df_optimal.to_dict().items():
l.append(df[((df.DealerHand==k) & (df.Threshold==v))])
optimal_subset = pd.concat(l)
# Randomly take a 100 results and determine cumsum(). Do this 10000 times. Describe.
# The idea is to see what happens if you play many sittings of a 100 rounds each
x = pd.Series([optimal_subset.sample(100).reset_index().Payoff.cumsum().iloc[-1] for _ in range(10000)])
x.hist(bins=50)
x.describe()
The results are very optimistic. I am convinced that the code is correct. I wonder if the rules as I have implimented them are incomplete/incorrect. I hope this is useful to someone. I welcome comment/suggestions for improvement.
No comments:
Post a Comment