Calculating fantasy baseball auction values with Jupyter Notebook

I’ve been playing fantasy baseball for over ten years, and this geeky pastime has taught me a lot of valuable lessons and skills about leadership, management, negotiation, and of course, data manipulation.

Much of the Excel work I’ve done at my various jobs has come from things I learned playing fantasy baseball. My interest in the game has been waning in the past few years, but in the meantime I’ve been learning Python as a way to use deep learning for generative creative writing. Last year, I saw what GPT-2 can do and I thought “I have to learn this stuff!”

Turns out, there are a few foundational skills that lead up to working with and understanding NLP, and Python is core to that. And what better way to learn than by building real-world applications? So this is my first side project, which would have saved me so many hours of Excel manipulation, had I only learned it a few years ago.

I took the Codecademy courses on Python 3 and Pandas and did some googling to figure out the NumPy and pyplot stuff. Both courses are excellent introductions and I have to say that Python has been much, much easier to learn than JavaScript, a pleasant surprise.

Here’s the Jupyter notebook. You can use it with your own league data and projections (I’m using Steamer projections from Fangraphs). Every league is a little different, so you’ll have to do some customization. I tried to document things in a way that makes it easy for anyone with a little bit of Python/Pandas/Jupyter experience to use.

You can access the full project on GitHub.

Project Overview

This notebook was created to calculate fantasy baseball auction draft values. The idea was to automate (to the extent possible) the work that I used to do in Excel.

The formulas and methods come from the book Winning Fantasy Baseball, which became an essential resource for me once I started playing in an auction league.

Full disclosure: I haven’t won my league in a while, not since 2007 to be exact. I just don’t have the time to keep up with the day-to-day transactions and strategy anymore, but I do draft pretty well.

There are other tools out there that automate this sort of thing, but I think there’s value in understanding the math and process behind how you get to your valuations, so you know when to override the system and when to trust it. And if you play in a league that’s non-standard in any way, the $ values you get online may not fit your league.

Also, if your league uses different scoring categories, you’ll have to customize things a bit, especially for the ratio stats.

And if you’re experienced with Python, Pandas, or coding in general, you might find some things that could be done better or more elegantly. Feel free to contact me with corrections for formulas or just better ways to do things. In [1]:

# imports!
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

Set up your league

Start by inputting the various league settings: In [2]:

# League Settings
hitter_categories = ['R','HR','RBI','SB','OBP']
pitcher_categories = ['QS','SO','SV','ERA','WHIP']
league_categories = hitter_categories + pitcher_categories

teams_in_league = 14
roster_size_total = 29
draft_budget = 300

league_positions = {
    'hitters': 13, # the number of starting hitters (don't include bench players)
    'pitchers': 12 # the number of pitchers you generally draft
}

# The Draft Pool
hitters_drafted = 15 
pitchers_drafted = roster_size_total - hitters_drafted 

hitters_in_pool = hitters_drafted * teams_in_league
pitchers_in_pool = pitchers_drafted * teams_in_league

draft_pool_total = hitters_in_pool + pitchers_in_pool

What does it take to win in your league?

The draft values for each player come from something called Standings Gained Points (SGPs). The idea is that for each scoring category, a certain quantity of that stat will move you up one place in the standings. What we’re looking for is the slope of a given category: how much of that stat separates one place in the standings from the next.

If we take runs as an example, you might need 23 runs at the margin to move up one place in the standings. By calculating SGPs, we can estimate the value of a given player’s runs and how they will directly contribute to your team.
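As a rough illustration of what “slope” means here, the sketch below fits a line through one category’s sorted team totals and converts a player’s projection into SGPs. The five run totals and the 100-run player are made-up numbers, not data from my league, and `np.polyfit` is only a stand-in for the scikit-learn regression used in the notebook.

```python
import numpy as np

# Hypothetical end-of-season run totals for a five-team league (made-up numbers)
team_runs = [850, 910, 880, 940, 820]

# Sort the totals from worst to best and pair each one with its place in the standings
sorted_runs = np.sort(team_runs)
places = np.arange(len(sorted_runs))

# The slope of the fitted line is the number of runs that separates one place from the next
slope, intercept = np.polyfit(places, sorted_runs, 1)

# A player's SGPs in a counting category are his projected total divided by that slope
player_runs = 100  # hypothetical projection
player_run_sgps = player_runs / slope

print(f"Runs per standings place: {slope:.1f}")
print(f"SGPs from {player_runs} runs: {player_run_sgps:.2f}")
```

In this toy league the sorted totals are evenly spaced, so the slope is simply the gap between neighboring teams.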

Some caveats:

  1. If this is the first year of your league, you won’t be able to calculate last year’s stats. My advice is to go and find another league that you’ve played in and use the historical stats from that league. Or you can make an educated guess.
  2. Historical data has been a little more volatile in the last few years with the massive surge in home runs. So you may want to manually adjust your league history if you think the next season’s scoring environment will be different.

In [3]:

# Create a DataFrame based on last year's CSV
last_year_stats = pd.read_csv('League_Data/last_year_stats.csv',thousands=',')

# Initialize the dictionary that will store the slopes for each category
my_league_history = {}

# The x axis is each team's place in the sorted standings (0 through teams_in_league - 1).
x_axis_league_history = np.reshape(range(0,teams_in_league),(-1,1))

for category in league_categories:
    # get last year's data for the category, sort it, and shape it
    y_axis = last_year_stats[category].sort_values().values.reshape(-1,1)

    # create a new LinearRegression() object
    linear_regression_category = LinearRegression()

    # run the regression
    linear_regression_category.fit(x_axis_league_history,y_axis)

    # assign the regression coefficient to my_league_history dictionary
    # (coef_ is a 1x1 array here, so pull out the single value before converting to float)
    my_league_history[category] = float(linear_regression_category.coef_[0][0])

# Create averages for hitter ratio stats (OBP)
historical_mean_obp = last_year_stats['OBP'].mean()

# Calculate the number of PAs for x-1 slots, where x = hitter_roster_size (used later in SGP calculations for OBP)
first_x_hitter_plate_appearances = 600 * (league_positions['hitters'] - 1)

# (used for calculating OBP SGPs later) 
team_historical_on_bases = first_x_hitter_plate_appearances * historical_mean_obp

# Create averages for pitcher ratio stats (WHIP, ERA). This varies by league. Easier if you have an IP limit but I don't
estimated_mean_ip_per_team = 1400 

# WHIP
historical_mean_whip = last_year_stats['WHIP'].mean()
team_historical_walks_plus_hits = estimated_mean_ip_per_team * historical_mean_whip

# ERA
historical_mean_era = last_year_stats['ERA'].mean()
team_historical_earned_runs = (estimated_mean_ip_per_team / 9) * historical_mean_era

print(my_league_history)
print('\n')
print(historical_mean_whip)
print(team_historical_walks_plus_hits)
print('\n')
print(historical_mean_era)
print(team_historical_earned_runs)
{'R': 23.017582417582414, 'HR': 11.195604395604395, 'RBI': 26.758241758241752, 'SB': 7.026373626373626, 'OBP': 0.002072527472527469, 'QS': 5.032967032967033, 'SO': 44.83956043956044, 'SV': 8.980219780219779, 'ERA': 0.06898901098901097, 'WHIP': 0.01375824175824175}


1.2704285714285715
1778.6000000000001


3.9635714285714294
616.5555555555557

Organize our player data

Next we’re going to import our hitter and pitcher projections and start to manipulate the data. I’m using Steamer projections from Fangraphs. This will work for any set of projections, although you may have to change the column labels to match the code below. In [4]:

# Create a dataframe for the hitter data and the pitcher data (we will combine them later)
hitters = pd.read_csv('Projections/Steamer_Hitters.csv')
pitchers = pd.read_csv('Projections/Steamer_Pitchers.csv')

# You should see two dataframes, one for hitters and one for pitchers information.
print(hitters.head())
print(pitchers.head())
               Name     Team    G   PA   AB    H  2B  3B  HR    R  ...   wOBA  \
0        Mike Trout   Angels  150  675  528  157  29   3  44  124  ...  0.427   
1      Mookie Betts  Red Sox  150  688  585  174  41   4  32  118  ...  0.387   
2      Alex Bregman   Astros  150  658  549  157  38   2  32  106  ...  0.391   
3  Francisco Lindor  Indians  150  692  616  178  39   2  35  112  ...  0.365   
4    Cody Bellinger  Dodgers  150  640  544  156  30   3  42   99  ...  0.392   

   -1.1  wRC+  BsR   Fld  -1.2   Off   Def  WAR  playerid  
0   NaN   173  3.1  -1.7   NaN  66.1   0.6  8.8     10155  
1   NaN   139  4.6  12.3   NaN  39.0   4.9  6.6     13611  
2   NaN   150  0.1  -1.5   NaN  42.0   0.6  6.4     17678  
3   NaN   125  0.7   7.8   NaN  22.5  15.3  6.0     12916  
4   NaN   147  1.2   2.0   NaN  40.8  -2.5  5.7     15998  

[5 rows x 31 columns]
               Name       Team   W  L   ERA  GS   G  SV     IP    H  ...  HR  \
0      Jacob deGrom       Mets  14  9  3.13  32  32   0  205.0  168  ...  24   
1       Gerrit Cole    Yankees  15  8  3.26  32  32   0  202.0  154  ...  28   
2      Max Scherzer  Nationals  15  9  3.28  32  32   0  202.0  161  ...  27   
3  Justin Verlander     Astros  16  8  3.46  32  32   0  204.0  165  ...  32   
4        Chris Sale    Red Sox  14  7  3.22  29  29   0  177.0  140  ...  23   

    SO  BB  WHIP    K/9  BB/9   FIP  WAR  RA9-WAR  playerid  
0  258  49  1.06  11.31  2.16  3.01  6.1      5.4     10954  
1  280  57  1.04  12.46  2.52  3.17  6.0      5.8     13125  
2  268  50  1.04  11.91  2.22  3.19  5.7      5.9      3137  
3  266  47  1.04  11.71  2.05  3.42  5.3      5.0      8700  
4  233  42  1.03  11.86  2.12  3.13  4.9      5.4     10603  

[5 rows x 21 columns]

In [5]:

# Create a list of labels that we want to preserve
column_labels = ['Name','PA','IP']

# Gah! Lots of columns we don't need. Let's drop them!
for column in hitters:
    # If the column name doesn't match an item in the hitter_categories or our labels, drop it.
    # You can add logic here to preserve some columns for stuff like Team or whatever else you want to keep.
    if column not in hitter_categories and column not in column_labels:
        hitters.drop(labels=column,axis=1,inplace=True)

# Create a lambda function to calculate QS. Filter for pitchers with more than 90 IP to limit it to starters.
# Since Steamer doesn't include QS projections, I multiply wins by 1.5, which has generally worked in the past.
calculate_qs = lambda row: ( row['W'] * 1.5 if row['IP'] > 90 else 0 )

# Create a new column for QS with our lambda function
pitchers['QS'] = pitchers.apply(calculate_qs,axis=1)

# If the column name doesn't match an item in the pitcher_categories or our new labels, drop it
for column in pitchers:
    if column not in pitcher_categories and column not in column_labels:
        pitchers.drop(labels=column,axis=1,inplace=True)

# Now our dataframes should look nice and clean, with our league data plus some essential info. You can of course
# choose to leave more info in, like team, league, WAR, whatever you want to have in front of you while drafting.
print(hitters.head())
print(pitchers.head())
               Name   PA  HR    R  RBI  SB    OBP
0        Mike Trout  675  44  124  112  14  0.439
1      Mookie Betts  688  32  118   95  18  0.390
2      Alex Bregman  658  32  106  103   6  0.396
3  Francisco Lindor  692  35  112   95  22  0.354
4    Cody Bellinger  640  42   99  115  12  0.385
               Name   ERA  SV     IP   SO  WHIP    QS
0      Jacob deGrom  3.13   0  205.0  258  1.06  21.0
1       Gerrit Cole  3.26   0  202.0  280  1.04  22.5
2      Max Scherzer  3.28   0  202.0  268  1.04  22.5
3  Justin Verlander  3.46   0  204.0  266  1.04  24.0
4        Chris Sale  3.22   0  177.0  233  1.03  21.0

Calculating SGPs

Now we can calculate our SGPs. For counting stats, it’s pretty straightforward: you just divide a player’s projected total for a given category by the slope that we calculated earlier.

For ratio stats, it’s a bit trickier. We are trying to calculate the marginal impact that a player would have on a team’s OBP, ERA, or WHIP, if we added him to our starting lineup. In [6]:

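# We start with the hitters...
for category in hitter_categories:
    # Calculate the SGPs for the counting stats
    if category != 'OBP':
        hitters[category + '_sgp'] = hitters[category] / my_league_history[category]

    # Calculating SGPs for OBP is kind of a pain! Add the player's on-base events to a team of
    # average hitters, recompute the team OBP, and divide the change by the OBP slope.
    elif category == 'OBP':
        hitters[category + '_sgp'] = (
            ((team_historical_on_bases + (hitters['PA'] * hitters['OBP']))
             / (first_x_hitter_plate_appearances + hitters['PA']) - historical_mean_obp)
            / my_league_history['OBP'])

# Sum them up...
hitters['Total_SGPs'] = hitters[[category + '_sgp' for category in hitter_categories]].sum(axis=1)

# And take a look at what we have.
print(hitters.head())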
               Name   PA  HR    R  RBI  SB    OBP     R_sgp    HR_sgp  \
0        Mike Trout  675  44  124  112  14  0.439  5.387186  3.930114   
1      Mookie Betts  688  32  118   95  18  0.390  5.126516  2.858265   
2      Alex Bregman  658  32  106  103   6  0.396  4.605175  2.858265   
3  Francisco Lindor  692  35  112   95  22  0.354  4.865846  3.126227   
4    Cody Bellinger  640  42   99  115  12  0.385  4.301060  3.751472   

    RBI_sgp    SB_sgp   OBP_sgp  Total_SGPs  
0  4.185626  1.992493  4.209589   19.705009  
1  3.550308  2.561777  2.221456   16.318321  
2  3.849281  0.853926  2.375119   14.541766  
3  3.550308  3.131060  0.710164   15.383605  
4  4.297741  1.707851  1.882182   15.940307  
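To make the marginal-impact idea for ratio stats more concrete, here is a small standalone sketch for OBP. The function name `obp_sgp` and the league context (a 0.337 league-average OBP and a slope of 0.002) are assumptions for illustration only. The 600 plate appearances for each of 12 other lineup spots mirrors the setup used earlier in the notebook.

```python
def obp_sgp(player_pa, player_obp, team_pa, mean_obp, obp_slope):
    """Estimate the standings points a hitter adds through OBP (illustrative helper)."""
    # On-base events for a lineup of league-average hitters
    team_on_bases = team_pa * mean_obp
    # On-base events the new player contributes
    player_on_bases = player_pa * player_obp
    # Team OBP once the player is added to the lineup
    new_team_obp = (team_on_bases + player_on_bases) / (team_pa + player_pa)
    # The change in team OBP, expressed in standings places
    return (new_team_obp - mean_obp) / obp_slope


# Hypothetical league context (assumed values, not taken from real league data)
TEAM_PA = 600 * 12   # 12 other lineup spots at 600 PA each
MEAN_OBP = 0.337
OBP_SLOPE = 0.002

star = obp_sgp(675, 0.439, TEAM_PA, MEAN_OBP, OBP_SLOPE)
average = obp_sgp(675, 0.337, TEAM_PA, MEAN_OBP, OBP_SLOPE)

print(f"High-OBP hitter: {star:.2f} SGPs")
print(f"League-average hitter: {abs(average):.2f} SGPs")
```

A hitter whose OBP matches the league mean moves the team OBP nowhere, so he earns essentially zero SGPs in this category no matter how many plate appearances he gets.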

Now let’s do the pitchers… In [7]:

for category in pitcher_categories:
    # Run the counting stats
    if category != 'WHIP' and category != 'ERA':
        pitchers[category + '_sgp'] = pitchers[category] / my_league_history[category]

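    # Calculate WHIP. Lower is better, so we subtract the new team WHIP from the historical mean.
    elif category == 'WHIP':
        pitchers[category + '_sgp'] = \
        ((historical_mean_whip - ((team_historical_walks_plus_hits + (pitchers['IP'] * pitchers['WHIP'])) \
                                  /( pitchers['IP'] + estimated_mean_ip_per_team)))/ my_league_history['WHIP'])

    # Calculate ERA. Same idea: add the pitcher's earned runs and innings to an average staff.
    elif category == 'ERA':
        pitchers[category + '_sgp'] = \
        ((historical_mean_era - ((team_historical_earned_runs + (pitchers['IP'] * pitchers['ERA'] / 9)) * 9 \
                                 / (pitchers['IP'] + estimated_mean_ip_per_team))) / my_league_history['ERA'])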

# Inelegantly sum them up...
sum_the_pitcher_sgps = lambda row: ( row['QS_sgp'] + row['SO_sgp'] + row['SV_sgp'] + row['ERA_sgp'] + row['WHIP_sgp'])
pitchers['Total_SGPs'] = pitchers.apply(sum_the_pitcher_sgps,axis=1)

# And take a look at what we have. It's likely that the SGPs for pitchers will be lower than for the hitters.
print(pitchers.head())
               Name   ERA  SV     IP   SO  WHIP    QS    QS_sgp    SO_sgp  \
0      Jacob deGrom  3.13   0  205.0  258  1.06  21.0  4.172489  5.753848   
1       Gerrit Cole  3.26   0  202.0  280  1.04  22.5  4.470524  6.244486   
2      Max Scherzer  3.28   0  202.0  268  1.04  22.5  4.470524  5.976865   
3  Justin Verlander  3.46   0  204.0  266  1.04  24.0  4.768559  5.932262   
4        Chris Sale  3.22   0  177.0  233  1.03  21.0  4.172489  5.196304   

   SV_sgp   ERA_sgp  WHIP_sgp  Total_SGPs  
0     0.0  1.543269  1.953532   13.423138  
1     0.0  1.285929  2.111846   14.112785  
2     0.0  1.249375  2.111846   13.808610  
3     0.0  0.928340  2.130096   13.759256  
4     0.0  1.209719  1.961393   12.539905  
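For the pitching ratio stats the direction flips, because a lower ERA or WHIP is better. The sketch below shows that sign convention for ERA. The helper name `era_sgp`, the 3.96 league-average ERA, and the 0.069 slope are rounded, assumed values for illustration. The 1,400 team innings matches the estimate used earlier.

```python
def era_sgp(player_ip, player_era, team_ip, mean_era, era_slope):
    """Estimate the standings points a pitcher adds through ERA (illustrative helper)."""
    # Earned runs allowed by a league-average staff
    team_earned_runs = team_ip / 9 * mean_era
    # Earned runs the new pitcher is projected to allow
    player_earned_runs = player_ip / 9 * player_era
    # Staff ERA once the pitcher's innings and earned runs are added
    new_team_era = 9 * (team_earned_runs + player_earned_runs) / (team_ip + player_ip)
    # Lower is better, so the improvement is the mean minus the new ERA
    return (mean_era - new_team_era) / era_slope


# Assumed league context: 1,400 team innings, 3.96 mean ERA, slope of 0.069
ace = era_sgp(205, 3.13, 1400, 3.96, 0.069)
innings_eater = era_sgp(181, 5.17, 1400, 3.96, 0.069)

print(f"205 IP at a 3.13 ERA: {ace:+.2f} SGPs")
print(f"181 IP at a 5.17 ERA: {innings_eater:+.2f} SGPs")
```

Note that a below-average pitcher hurts more the more innings he throws, which is why a high-volume starter with a bad ERA can end up with a negative ERA SGP.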

Preparing the draft pool and a bit of cleanup

Before we can calculate $ values, we need to remove players that won’t be in the draft pool.

We’ll start by figuring out exactly how many pitchers and hitters should be in our draft pool and removing the rest. To do this, we’re going to revisit two variables that we defined up top, hitters_in_pool and pitchers_in_pool.

These numbers may be somewhat arbitrary because many leagues have a certain number of bench spots that can be taken by either a hitter or pitcher. So how many hitters should you account for in your pool?

Generally I estimate the number of hitters per team by looking at how many starting spots there are for each hitter and then adding the number of bench spots that the average team will use for hitters. So you might have 13 starting hitters and then 3 out of the 5 bench spots also allocated for hitters.

If you’re not sure, you can guess, or you can look at the previous year’s draft to see how people drafted.
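The pool arithmetic for that example (13 starting hitters plus 3 bench hitters per team) can be sketched as a small helper. The function name `pool_sizes` and the `bench_hitters` parameter are illustrative, not part of the notebook, which uses 15 hitters per team rather than 16.

```python
def pool_sizes(teams, roster_size, starting_hitters, bench_hitters):
    """Return (hitters_in_pool, pitchers_in_pool) for a league (illustrative helper)."""
    # Hitters each team is expected to draft: starters plus the bench spots spent on hitters
    hitters_per_team = starting_hitters + bench_hitters
    # Every remaining roster spot is assumed to go to a pitcher
    pitchers_per_team = roster_size - hitters_per_team
    return hitters_per_team * teams, pitchers_per_team * teams


# 14 teams with 29-man rosters, 13 starting hitters, and an assumed 3 bench hitters
hitter_pool, pitcher_pool = pool_sizes(teams=14, roster_size=29,
                                       starting_hitters=13, bench_hitters=3)

print(hitter_pool, pitcher_pool, hitter_pool + pitcher_pool)  # 224 182 406
```

Whatever split you choose, the two pools should always add up to the number of teams times the roster size.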

Getting this 100% correct isn’t the goal. And it’s impossible, unless your league has very strict roster rules. In [8]:

# Sort by SGPs so we can keep just the best players in the pool
pitchers.sort_values(by=['Total_SGPs'],ascending=False,inplace=True,ignore_index=True)
hitters.sort_values(by=['Total_SGPs'],ascending=False,inplace=True,ignore_index=True)

# Select only the players we want in the pool
draft_pool_hitters = hitters.iloc[:hitters_in_pool]
draft_pool_pitchers = pitchers.iloc[:pitchers_in_pool]

# Now we can join the hitters and pitchers to create our draft pool
draft_pool = pd.concat([draft_pool_hitters,draft_pool_pitchers],axis=0, ignore_index=True, sort=True)

# Print your draft pool and make sure you have the right number of players (teams in your league x roster size)
print(draft_pool)
      ERA   ERA_sgp    HR    HR_sgp     IP              Name    OBP   OBP_sgp  \
0     NaN       NaN  44.0  3.930114    NaN        Mike Trout  0.439  4.209589   
1     NaN       NaN  36.0  3.215548    NaN  Christian Yelich  0.400  2.586097   
2     NaN       NaN  37.0  3.304868    NaN  Ronald Acuna Jr.  0.363  1.073647   
3     NaN       NaN  32.0  2.858265    NaN      Mookie Betts  0.390  2.221456   
4     NaN       NaN  42.0  3.751472    NaN    Cody Bellinger  0.385  1.882182   
..    ...       ...   ...       ...    ...               ...    ...       ...   
401  3.40  0.324967   NaN       NaN   58.0     Drew Pomeranz    NaN       NaN   
402  5.17 -2.002020   NaN       NaN  181.0         Alex Cobb    NaN       NaN   
403  3.85  0.067653   NaN       NaN   60.0    Shaun Anderson    NaN       NaN   
404  4.85 -0.920209   NaN       NaN  108.0          Joe Ross    NaN       NaN   
405  3.53  0.195716   NaN       NaN   45.0      Corey Knebel    NaN       NaN   

        PA    QS  ...     R_sgp    SB    SB_sgp     SO    SO_sgp   SV  \
0    675.0   NaN  ...  5.387186  14.0  1.992493    NaN       NaN  NaN   
1    672.0   NaN  ...  4.865846  22.0  3.131060    NaN       NaN  NaN   
2    680.0   NaN  ...  4.648620  29.0  4.127307    NaN       NaN  NaN   
3    688.0   NaN  ...  5.126516  18.0  2.561777    NaN       NaN  NaN   
4    640.0   NaN  ...  4.301060  12.0  1.707851    NaN       NaN  NaN   
..     ...   ...  ...       ...   ...       ...    ...       ...  ...   
401    NaN   0.0  ...       NaN   NaN       NaN   73.0  1.628027  1.0   
402    NaN  13.5  ...       NaN   NaN       NaN  129.0  2.876924  0.0   
403    NaN   0.0  ...       NaN   NaN       NaN   55.0  1.226595  8.0   
404    NaN   9.0  ...       NaN   NaN       NaN   90.0  2.007156  0.0   
405    NaN   0.0  ...       NaN   NaN       NaN   62.0  1.382708  4.0   

       SV_sgp  Total_SGPs  WHIP  WHIP_sgp  
0         NaN   19.705009   NaN       NaN  
1         NaN   17.535716   NaN       NaN  
2         NaN   16.667379   NaN       NaN  
3         NaN   16.318321   NaN       NaN  
4         NaN   15.940307   NaN       NaN  
..        ...         ...   ...       ...  
401  0.111356    2.296901  1.19  0.232551  
402  0.000000    2.229399  1.43 -1.327819  
403  0.890847    2.216245  1.26  0.031150  
404  0.000000    2.148623  1.41 -0.726534  
405  0.445423    2.137992  1.22  0.114145  

[406 rows x 24 columns]

A bit of cleanup

The hard work is ahead, but the good news is that if you play in a snake draft, you would do pretty well with just what we have so far.

Before we calculate $ values, let’s clean up our data a bit. In [9]:

# Replace the NaNs with something more aesthetically pleasing
draft_pool.fillna(value="x",inplace=True)

# Get rid of these _sgp columns
for column in draft_pool:
    if "_sgp" in column:
        draft_pool.drop(labels=column,axis=1,inplace=True)

# And let's add a column to tell us whether or not the player is a hitter or pitcher.
draft_pool['Class'] = draft_pool.IP.apply(lambda x: 'Hitter' if x == 'x' else 'Pitcher')

# Sort by total SGPs. NB: the ignore_index parameter requires pandas 1.0 or higher to work
draft_pool.sort_values(by=['Total_SGPs'],ascending=False,inplace=True,ignore_index=True)

# And let's reorder the columns for better readability
new_columns = ['Class','Name','PA','R','HR','RBI','SB','OBP','IP','QS','SV','SO','ERA','WHIP','Total_SGPs']
draft_pool = draft_pool[new_columns]

# Now let's see what we have. It should look like a ranking of the best players in baseball, with the first pitcher 
# showing up somewhere around 20 - 25. 
print(draft_pool.head(25))
      Class               Name   PA    R  HR  RBI  SB    OBP   IP    QS SV  \
0    Hitter         Mike Trout  675  124  44  112  14  0.439    x     x  x   
1    Hitter   Christian Yelich  672  112  36  100  22    0.4    x     x  x   
2    Hitter   Ronald Acuna Jr.  680  107  37   94  29  0.363    x     x  x   
3    Hitter       Mookie Betts  688  118  32   95  18   0.39    x     x  x   
4    Hitter     Cody Bellinger  640   99  42  115  12  0.385    x     x  x   
5    Hitter        Trea Turner  687  103  22   75  39  0.354    x     x  x   
6    Hitter   Francisco Lindor  692  112  35   95  22  0.354    x     x  x   
7    Hitter       Bryce Harper  664  102  41  102  11  0.383    x     x  x   
8    Hitter          Juan Soto  646   97  34  106   9  0.406    x     x  x   
9    Hitter       Jose Ramirez  648   95  31  101  23  0.362    x     x  x   
10   Hitter       Alex Bregman  658  106  32  103   6  0.396    x     x  x   
11   Hitter      J.D. Martinez  649  102  39  120   3  0.378    x     x  x   
12   Hitter       Trevor Story  652   98  36  101  20  0.346    x     x  x   
13   Hitter        Aaron Judge  670  107  41  101   6  0.373    x     x  x   
14  Pitcher        Gerrit Cole    x    x   x    x   x      x  202  22.5  0   
15   Hitter  Giancarlo Stanton  612  101  48  115   3  0.355    x     x  x   
16   Hitter      Nolan Arenado  657  100  40  114   3  0.371    x     x  x   
17   Hitter    Freddie Freeman  662   98  33  103   6  0.385    x     x  x   
18  Pitcher       Max Scherzer    x    x   x    x   x      x  202  22.5  0   
19  Pitcher   Justin Verlander    x    x   x    x   x      x  204    24  0   
20   Hitter    George Springer  687  112  35   91   8  0.365    x     x  x   
21   Hitter      Anthony Rizzo  660   97  32   99   6  0.388    x     x  x   
22   Hitter     Josh Donaldson  644  100  36  103   4  0.379    x     x  x   
23  Pitcher       Jacob deGrom    x    x   x    x   x      x  205    21  0   
24   Hitter  Adalberto Mondesi  649   82  20   76  49  0.293    x     x  x   

     SO   ERA  WHIP  Total_SGPs  
0     x     x     x   19.705009  
1     x     x     x   17.535716  
2     x     x     x   16.667379  
3     x     x     x   16.318321  
4     x     x     x   15.940307  
5     x     x     x   15.498767  
6     x     x     x   15.383605  
7     x     x     x   15.336310  
8     x     x     x   15.225999  
9     x     x     x   14.931597  
10    x     x     x   14.541766  
11    x     x     x   14.453657  
12    x     x     x   14.446120  
13    x     x     x   14.409209  
14  280  3.26  1.04   14.112785  
15    x     x     x   14.072346  
16    x     x     x   13.967811  
17    x     x     x   13.849842  
18  268  3.28  1.04   13.808610  
19  266  3.46  1.04   13.759256  
20    x     x     x   13.699255  
21    x     x     x   13.683763  
22    x     x     x   13.633914  
23  258  3.13  1.06   13.423138  
24    x     x     x   13.398905  

In [10]:

# If you want to export your current status, uncomment the next line and run it. 
#draft_pool.to_csv('Exports/my_draft_pool.csv')

Translating SGPs into dollar values

OK, this is cool and all, but how much should I pay for weather-loving Mike Trout and his sexy SGPs?

First we need to see how many SGPs we’re working with… In [11]:

# Get total SGPs in pool by player class 
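# Sum only the Total_SGPs column; the stat columns now hold 'x' placeholders and can't be added up
sgp_sum_by_class = draft_pool.groupby('Class')['Total_SGPs'].sum()
hitter_pool_sgps = sgp_sum_by_class.loc['Hitter']
pitcher_pool_sgps = sgp_sum_by_class.loc['Pitcher']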

print(float(hitter_pool_sgps))
print(float(pitcher_pool_sgps))
2010.121852576441
1033.4382229317596

A somewhat long-winded note about the subtle art of draft budget allocation

Allocating your draft budget between hitters and pitchers is a bit of an art because we’re looking to accurately price the players based on their true value (i.e. how much they will help us in the standings) within the context of their market value, or how much other players in our league are willing to pay for a given player.

For instance, if you plan to draft an equal number of hitters and pitchers (e.g. 10 of each), you might think “I’ll just allocate half my budget to hitters and half to pitchers! I have five scoring categories for hitters and five for pitchers, obviously they’re of equal value.”

Well yes, except that your dollar values will be way out of sync with the rest of the league — you’ll end up drafting a lot of top pitchers, thinking each one is a steal, while all the hitters will seem wildly overvalued by your leaguemates.

The problem is that you’ll end up with a great pitching staff and terrible hitters and your team will be mediocre. We’re trying to assemble a great roster, not optimize for some abstract idea of value.

If this doesn’t make sense now, don’t worry. The best way to grok it is to play with the value allocations to see how the values change. In [12]:

# Define the percentage of your budget for hitters and pitchers
hitter_allocation_pct = 0.66
pitcher_allocation_pct = 1 - hitter_allocation_pct
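To get a feel for how much this split matters, here is a standalone sketch that prices one SGP under a few different allocations. It plugs in the pool totals printed above (rounded to 2010.12 hitter SGPs and 1033.44 pitcher SGPs) and my league’s $4,200 total budget. The function name `sgp_prices` is just an illustrative helper.

```python
def sgp_prices(hitter_pct, total_dollar_pool, hitter_pool_sgps, pitcher_pool_sgps):
    """Return the dollar price of one hitter SGP and one pitcher SGP (illustrative helper)."""
    hitter_price = total_dollar_pool * hitter_pct / hitter_pool_sgps
    pitcher_price = total_dollar_pool * (1 - hitter_pct) / pitcher_pool_sgps
    return hitter_price, pitcher_price


POOL = 14 * 300          # 14 teams with a $300 budget each
HITTER_SGPS = 2010.12    # rounded from the output above
PITCHER_SGPS = 1033.44   # rounded from the output above

for pct in (0.50, 0.60, 0.66, 0.70):
    hitter_price, pitcher_price = sgp_prices(pct, POOL, HITTER_SGPS, PITCHER_SGPS)
    print(f"{pct:.0%} to hitters: ${hitter_price:.2f} per hitter SGP, "
          f"${pitcher_price:.2f} per pitcher SGP")
```

With a 50/50 split, a pitcher SGP costs roughly twice as much as a hitter SGP, which is exactly the distortion described above. At 66% the two prices come out nearly equal.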

Up top we defined two variables related to your league settings: draft_budget and teams_in_league.

draft_budget is how much money each team in your league gets for the draft.

In a typical auction draft, each team gets $260.

Teams in my league get $300, so that’s what I’m going to use. In [13]:

total_dollar_pool = teams_in_league * draft_budget
print("total_dollar_pool: " + str(total_dollar_pool))

# Now, some maths 
hitter_dollar_allocation = total_dollar_pool * hitter_allocation_pct
pitcher_dollar_allocation = total_dollar_pool * pitcher_allocation_pct
print("\nhitter_dollar_allocation: " + str(hitter_dollar_allocation))
print("pitcher_dollar_allocation: " + str(pitcher_dollar_allocation))

# How much is an SGP worth for a hitter and for a pitcher?
hitter_sgp_value = float(hitter_dollar_allocation / hitter_pool_sgps)
pitcher_sgp_value = float(pitcher_dollar_allocation / pitcher_pool_sgps)

# Check our values
print("\nhitter_sgp_value: " + str(hitter_sgp_value))
print("pitcher_sgp_value: " + str(pitcher_sgp_value))

# You can play with the hitter_allocation_pct above to see how the SGP value for hitters and pitchers changes
total_dollar_pool: 4200

hitter_dollar_allocation: 2772.0
pitcher_dollar_allocation: 1427.9999999999998

hitter_sgp_value: 1.3790208769916283
pitcher_sgp_value: 1.3817952232779898

Let’s give our players their values…

To do this, we’re going to multiply each player’s SGPs by the dollar value of one SGP for his class (hitter or pitcher). In [14]:

calculate_dollar_values = \
lambda row: ( row['Total_SGPs'] * hitter_sgp_value \
             if row['Class'] == 'Hitter' \
             else row['Total_SGPs'] * pitcher_sgp_value)

draft_pool['Dollar_Value'] = draft_pool.apply(calculate_dollar_values,axis=1)

# And sort...
draft_pool.sort_values(by=['Dollar_Value'],ascending=False,inplace=True,ignore_index=True)

# Inspect the results. Depending on your draft allocation, pitcher values may now be higher, relative to hitters
# Try changing the pitcher budget allocation to 50%. Are the most valuable players all pitchers now? 
print(draft_pool.head(25))
print(draft_pool.tail(5))
      Class               Name   PA    R  HR  RBI  SB    OBP   IP    QS SV  \
0    Hitter         Mike Trout  675  124  44  112  14  0.439    x     x  x   
1    Hitter   Christian Yelich  672  112  36  100  22    0.4    x     x  x   
2    Hitter   Ronald Acuna Jr.  680  107  37   94  29  0.363    x     x  x   
3    Hitter       Mookie Betts  688  118  32   95  18   0.39    x     x  x   
4    Hitter     Cody Bellinger  640   99  42  115  12  0.385    x     x  x   
5    Hitter        Trea Turner  687  103  22   75  39  0.354    x     x  x   
6    Hitter   Francisco Lindor  692  112  35   95  22  0.354    x     x  x   
7    Hitter       Bryce Harper  664  102  41  102  11  0.383    x     x  x   
8    Hitter          Juan Soto  646   97  34  106   9  0.406    x     x  x   
9    Hitter       Jose Ramirez  648   95  31  101  23  0.362    x     x  x   
10   Hitter       Alex Bregman  658  106  32  103   6  0.396    x     x  x   
11   Hitter      J.D. Martinez  649  102  39  120   3  0.378    x     x  x   
12   Hitter       Trevor Story  652   98  36  101  20  0.346    x     x  x   
13   Hitter        Aaron Judge  670  107  41  101   6  0.373    x     x  x   
14  Pitcher        Gerrit Cole    x    x   x    x   x      x  202  22.5  0   
15   Hitter  Giancarlo Stanton  612  101  48  115   3  0.355    x     x  x   
16   Hitter      Nolan Arenado  657  100  40  114   3  0.371    x     x  x   
17   Hitter    Freddie Freeman  662   98  33  103   6  0.385    x     x  x   
18  Pitcher       Max Scherzer    x    x   x    x   x      x  202  22.5  0   
19  Pitcher   Justin Verlander    x    x   x    x   x      x  204    24  0   
20   Hitter    George Springer  687  112  35   91   8  0.365    x     x  x   
21   Hitter      Anthony Rizzo  660   97  32   99   6  0.388    x     x  x   
22   Hitter     Josh Donaldson  644  100  36  103   4  0.379    x     x  x   
23  Pitcher       Jacob deGrom    x    x   x    x   x      x  205    21  0   
24   Hitter  Adalberto Mondesi  649   82  20   76  49  0.293    x     x  x   

     SO   ERA  WHIP  Total_SGPs  Dollar_Value  
0     x     x     x   19.705009     27.173618  
1     x     x     x   17.535716     24.182119  
2     x     x     x   16.667379     22.984664  
3     x     x     x   16.318321     22.503305  
4     x     x     x   15.940307     21.982016  
5     x     x     x   15.498767     21.373124  
6     x     x     x   15.383605     21.214312  
7     x     x     x   15.336310     21.149092  
8     x     x     x   15.225999     20.996970  
9     x     x     x   14.931597     20.590985  
10    x     x     x   14.541766     20.053398  
11    x     x     x   14.453657     19.931895  
12    x     x     x   14.446120     19.921501  
13    x     x     x   14.409209     19.870600  
14  280  3.26  1.04   14.112785     19.500979  
15    x     x     x   14.072346     19.406060  
16    x     x     x   13.967811     19.261903  
17    x     x     x   13.849842     19.099221  
18  268  3.28  1.04   13.808610     19.080672  
19  266  3.46  1.04   13.759256     19.012475  
20    x     x     x   13.699255     18.891559  
21    x     x     x   13.683763     18.870194  
22    x     x     x   13.633914     18.801452  
23  258  3.13  1.06   13.423138     18.548028  
24    x     x     x   13.398905     18.477370  
       Class            Name PA  R HR RBI SB OBP   IP    QS SV   SO   ERA  \
401  Pitcher   Drew Pomeranz  x  x  x   x  x   x   58     0  1   73   3.4   
402  Pitcher       Alex Cobb  x  x  x   x  x   x  181  13.5  0  129  5.17   
403  Pitcher  Shaun Anderson  x  x  x   x  x   x   60     0  8   55  3.85   
404  Pitcher        Joe Ross  x  x  x   x  x   x  108     9  0   90  4.85   
405  Pitcher    Corey Knebel  x  x  x   x  x   x   45     0  4   62  3.53   

     WHIP  Total_SGPs  Dollar_Value  
401  1.19    2.296901      3.173846  
402  1.43    2.229399      3.080573  
403  1.26    2.216245      3.062397  
404  1.41    2.148623      2.968957  
405  1.22    2.137992      2.954267  

Smoothing out the player values

We’re close, but there’s a problem. Look at the head and tail of the draft pool above and you’ll notice that the bottom players are worth around $3 and the top players are worth around $25.

These might be “accurate” values for the players, but they don’t reflect the reality of draft economics.

In reality, a certain percentage of players will be drafted for just $1 and the top players will go for as much as $40 or $50 or even more.

To address this, we need to smooth out the dollar values and redistribute value from the bottom of the pool to the top. In [15]:

# Before we smooth things out, let's visualize the distribution of our player values.

# Convert our dollar values into a numpy array and then round them to the nearest integer. 
dollar_values_array = np.rint(draft_pool['Dollar_Value'].to_numpy())

# And convert the numpy array to a normal list because I want to use list.count() below.
dollar_values_list = [int(value) for value in dollar_values_array]

# x_values are the dollar values we want to chart ($1 through $50)
x_values = list(range(1,51)) # Manually entering this will make it easier when we chart these again later

# y_values are the number of players at each dollar value
# (list.count() already returns 0 when no player has that value)
y_values = []
for number in x_values:
    y_values.append(dollar_values_list.count(number))

# Plot the distribution
plt.bar(x_values, y_values)

plt.xlabel('Dollar Values')
plt.ylabel('Players')
plt.title("Player Value Distribution")

plt.show()

Looking at this distribution, does it match how you think of your draft? Probably not. For one thing, there are no $1 players (in my league, usually 40-60 players get drafted at that price) and only one player, Mike Trout at about $27, is above $25. In my league, the top guys go for $50 or more, plus inflation, which we’ll factor in at the end.

Here are the steps:

  1. Take the lowest-valued player’s `Dollar_Value` and work out the amount you would need to subtract to make it exactly equal to $1. For example, if the lowest-ranked player’s `Dollar_Value` is $3.50, you would subtract $2.50 to get to $1.
  2. Subtract that same amount ($2.50 in our example) from every player’s `Dollar_Value`.
  3. Find the sum of every player’s `Dollar_Value`.
  4. Divide the `total_dollar_pool` by the sum from step 3. The result should be some number greater than 1.
  5. Multiply every player’s value by the result from step 4.

Basically, we are redistributing value from the bottom of the pool to the top of the pool.
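To make the five steps concrete, here is a minimal sketch of a single pass on a made-up pool of four players. The player names, their values, and the budget are all hypothetical, and I'm using a plain dictionary instead of the `draft_pool` dataframe so the arithmetic is easy to follow.

```python
# Hypothetical toy pool: four made-up players whose values add up to the budget.
dollar_values = {"Player A": 20.0, "Player B": 12.0, "Player C": 6.0, "Player D": 3.5}
total_dollar_pool = sum(dollar_values.values())  # 41.5

# Steps 1-2: shift everyone down so the lowest player sits at exactly $1.
amount_to_subtract = min(dollar_values.values()) - 1.0
shifted = {name: value - amount_to_subtract for name, value in dollar_values.items()}

# Steps 3-4: compare the budget with the money left after the shift.
dollar_value_multiplier = total_dollar_pool / sum(shifted.values())

# Step 5: scale everyone back up so the pool adds up to the budget again.
smoothed = {name: value * dollar_value_multiplier for name, value in shifted.items()}

for name, value in smoothed.items():
    print(f"{name}: {dollar_values[name]:.2f} -> {value:.2f}")
```

In this toy example the top player climbs from $20 to about $23, but the bottom player lands at about $1.32 rather than $1, because step 5 scales him up along with everyone else. That is why one pass is not enough and the steps have to be repeated until the bottom value settles at $1.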

NB: there is almost certainly a fancy statistical method for doing this, but I wasn’t able to find it. In [16]:

# Create a new column for our smoothed values
draft_pool['True_Dollar_Value'] = draft_pool['Dollar_Value']

# Define how many $1 players you want in your pool 
one_dollar_players = 50

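# Start at the last row (the lowest-valued player) and work upward
i = -1

while abs(i) <= one_dollar_players: 
    # True_Dollar_Value was just added, so it is the last column (index -1)
    lowest_player_dollar_value = draft_pool.iloc[i,-1]   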

    # run the loop until the lowest value gets to 1.0     
    while lowest_player_dollar_value > 1.0:
        # 1. Subtract an amount that would make the `Dollar_value` exactly equal to $1. 
        amount_to_subtract = lowest_player_dollar_value - 1

        # 2. Subtract that same amount from every player's `Dollar_value`.
        subtract_it = lambda x: x - amount_to_subtract if x > 1 else x
        draft_pool['True_Dollar_Value'] = draft_pool.True_Dollar_Value.apply(subtract_it)

        # 3. Find the sum of every player's `Dollar_Value`.
        adjusted_dollar_value_after_subtraction = draft_pool.True_Dollar_Value.sum() 

        # 4. Divide the `total_dollar_pool` by the sum from step 3. The result should be some number greater than 1.
        dollar_value_multiplier = total_dollar_pool / adjusted_dollar_value_after_subtraction

        # 5. Multiply every player's value by the result from step 4. 
        multiply_it = lambda x: x * dollar_value_multiplier
        draft_pool['True_Dollar_Value'] = draft_pool.True_Dollar_Value.apply(multiply_it)

        # Update the lowest_player_dollar_value so the loop condition works
        lowest_player_dollar_value = draft_pool.iloc[i,-1]

    # Decrease i to move up one row to the next-lowest player
    i -= 1

Let’s take a look at what we have now: In [17]:

print(draft_pool.head(60))
print(draft_pool.tail(50))

# Uncomment to save a new CSV.
# draft_pool.to_csv('Exports/my_draft_pool_smoothed.csv')
      Class                Name   PA    R  HR  RBI  SB    OBP   IP    QS SV  \
0    Hitter          Mike Trout  675  124  44  112  14  0.439    x     x  x   
1    Hitter    Christian Yelich  672  112  36  100  22    0.4    x     x  x   
2    Hitter    Ronald Acuna Jr.  680  107  37   94  29  0.363    x     x  x   
3    Hitter        Mookie Betts  688  118  32   95  18   0.39    x     x  x   
4    Hitter      Cody Bellinger  640   99  42  115  12  0.385    x     x  x   
5    Hitter         Trea Turner  687  103  22   75  39  0.354    x     x  x   
6    Hitter    Francisco Lindor  692  112  35   95  22  0.354    x     x  x   
7    Hitter        Bryce Harper  664  102  41  102  11  0.383    x     x  x   
8    Hitter           Juan Soto  646   97  34  106   9  0.406    x     x  x   
9    Hitter        Jose Ramirez  648   95  31  101  23  0.362    x     x  x   
10   Hitter        Alex Bregman  658  106  32  103   6  0.396    x     x  x   
11   Hitter       J.D. Martinez  649  102  39  120   3  0.378    x     x  x   
12   Hitter        Trevor Story  652   98  36  101  20  0.346    x     x  x   
13   Hitter         Aaron Judge  670  107  41  101   6  0.373    x     x  x   
14  Pitcher         Gerrit Cole    x    x   x    x   x      x  202  22.5  0   
15   Hitter   Giancarlo Stanton  612  101  48  115   3  0.355    x     x  x   
16   Hitter       Nolan Arenado  657  100  40  114   3  0.371    x     x  x   
17   Hitter     Freddie Freeman  662   98  33  103   6  0.385    x     x  x   
18  Pitcher        Max Scherzer    x    x   x    x   x      x  202  22.5  0   
19  Pitcher    Justin Verlander    x    x   x    x   x      x  204    24  0   
20   Hitter     George Springer  687  112  35   91   8  0.365    x     x  x   
21   Hitter       Anthony Rizzo  660   97  32   99   6  0.388    x     x  x   
22   Hitter      Josh Donaldson  644  100  36  103   4  0.379    x     x  x   
23  Pitcher        Jacob deGrom    x    x   x    x   x      x  205    21  0   
24   Hitter   Adalberto Mondesi  649   82  20   76  49  0.293    x     x  x   
25   Hitter         Nelson Cruz  641   97  40  114   1  0.363    x     x  x   
26   Hitter          Joey Gallo  616   90  43  101   8  0.349    x     x  x   
27   Hitter      Yordan Alvarez  602   93  39  107   3  0.366    x     x  x   
28   Hitter  Fernando Tatis Jr.  671   93  31   82  23  0.333    x     x  x   
29   Hitter       Rafael Devers  636   95  32  103   9  0.355    x     x  x   
30   Hitter          Tommy Pham  652   92  25   76  18  0.366    x     x  x   
31   Hitter         Kris Bryant  669  100  31   91   4  0.377    x     x  x   
32   Hitter         Jose Altuve  673  100  24   93  12  0.361    x     x  x   
33   Hitter        Peter Alonso  663   98  44  105   2  0.343    x     x  x   
34  Pitcher          Chris Sale    x    x   x    x   x      x  177    21  0   
35   Hitter        Rhys Hoskins  656   93  36   96   4  0.365    x     x  x   
36   Hitter       Manny Machado  664   93  37  102   7  0.343    x     x  x   
37   Hitter      Starling Marte  649   85  22   81  26  0.337    x     x  x   
38   Hitter      Anthony Rendon  655   93  28   99   4  0.374    x     x  x   
39   Hitter       Carlos Correa  637   93  33  105   3  0.359    x     x  x   
40   Hitter    Paul Goldschmidt  663   91  31   91   5  0.369    x     x  x   
41   Hitter    Charlie Blackmon  660  101  30   84   8  0.356    x     x  x   
42   Hitter      Carlos Santana  654   92  29   93   3  0.375    x     x  x   
43   Hitter     Xander Bogaerts  649   92  25   96   6  0.368    x     x  x   
44   Hitter       Shohei Ohtani  555   80  29   89  13  0.353    x     x  x   
45   Hitter         Bo Bichette  663   92  22   74  24  0.328    x     x  x   
46   Hitter      Kyle Schwarber  585   86  37   90   5  0.354    x     x  x   
47   Hitter          Matt Olson  642   91  38  103   1  0.344    x     x  x   
48   Hitter       Marcus Semien  678   98  25   79  12  0.347    x     x  x   
49   Hitter      Austin Meadows  624   87  28   84  15  0.337    x     x  x   
50   Hitter        Yoan Moncada  662   92  28   86  12   0.34    x     x  x   
51   Hitter        Matt Chapman  656   95  34   97   3  0.343    x     x  x   
52   Hitter     Jonathan Villar  657   78  17   64  32  0.327    x     x  x   
53   Hitter      Gleyber Torres  618   86  34   98   6   0.34    x     x  x   
54   Hitter         Jorge Soler  637   87  35   96   3  0.349    x     x  x   
55  Pitcher      Walker Buehler    x    x   x    x   x      x  194    21  0   
56   Hitter        Ozzie Albies  652   89  24   82  14  0.344    x     x  x   
57   Hitter      Eugenio Suarez  651   87  36   96   3  0.345    x     x  x   
58   Hitter         Luis Robert  578   77  26   83  23  0.317    x     x  x   
59   Hitter       Marcell Ozuna  634   82  31  100   7  0.343    x     x  x   

     SO   ERA  WHIP  Total_SGPs  Dollar_Value  True_Dollar_Value  
0     x     x     x   19.705009     27.173618          36.122559  
1     x     x     x   17.535716     24.182119          31.516706  
2     x     x     x   16.667379     22.984664          29.673048  
3     x     x     x   16.318321     22.503305          28.931925  
4     x     x     x   15.940307     21.982016          28.129324  
5     x     x     x   15.498767     21.373124          27.191845  
6     x     x     x   15.383605     21.214312          26.947331  
7     x     x     x   15.336310     21.149092          26.846916  
8     x     x     x   15.225999     20.996970          26.612702  
9     x     x     x   14.931597     20.590985          25.987627  
10    x     x     x   14.541766     20.053398          25.159934  
11    x     x     x   14.453657     19.931895          24.972861  
12    x     x     x   14.446120     19.921501          24.956859  
13    x     x     x   14.409209     19.870600          24.878488  
14  280  3.26  1.04   14.112785     19.500979          24.309404  
15    x     x     x   14.072346     19.406060          24.163261  
16    x     x     x   13.967811     19.261903          23.941311  
17    x     x     x   13.849842     19.099221          23.690837  
18  268  3.28  1.04   13.808610     19.080672          23.662278  
19  266  3.46  1.04   13.759256     19.012475          23.557279  
20    x     x     x   13.699255     18.891559          23.371111  
21    x     x     x   13.683763     18.870194          23.338218  
22    x     x     x   13.633914     18.801452          23.232379  
23  258  3.13  1.06   13.423138     18.548028          22.842196  
24    x     x     x   13.398905     18.477370          22.733407  
25    x     x     x   13.206795     18.212446          22.325518  
26    x     x     x   13.112132     18.081904          22.124529  
27    x     x     x   13.021316     17.956666          21.931706  
28    x     x     x   12.973844     17.891202          21.830915  
29    x     x     x   12.812234     17.668338          21.487783  
30    x     x     x   12.785291     17.631184          21.430579  
31    x     x     x   12.715604     17.535083          21.282617  
32    x     x     x   12.652668     17.448294          21.148992  
33    x     x     x   12.631782     17.419491          21.104646  
34  233  3.22  1.03   12.539905     17.327581          20.963138  
35    x     x     x   12.532399     17.282440          20.893637  
36    x     x     x   12.389126     17.084863          20.589438  
37    x     x     x   12.376786     17.067846          20.563237  
38    x     x     x   12.290491     16.948844          20.380016  
39    x     x     x   12.193362     16.814901          20.173791  
40    x     x     x   12.128043     16.724825          20.035106  
41    x     x     x   12.106473     16.695078          19.989307  
42    x     x     x   12.007923     16.559176          19.780065  
43    x     x     x   11.899798     16.410070          19.550494  
44    x     x     x   11.787255     16.254871          19.311542  
45    x     x     x   11.768332     16.228775          19.271364  
46    x     x     x   11.724804     16.168749          19.178946  
47    x     x     x   11.607333     16.006755          18.929532  
48    x     x     x   11.557202     15.937623          18.823092  
49    x     x     x   11.546488     15.922848          18.800345  
50    x     x     x   11.532918     15.904135          18.771533  
51    x     x     x   11.449309     15.788836          18.594013  
52    x     x     x   11.441110     15.777530          18.576605  
53    x     x     x   11.395779     15.715017          18.480358  
54    x     x     x   11.382805     15.697126          18.452812  
55  225  3.45  1.13   11.338634     15.667670          18.407460  
56    x     x     x   11.339148     15.636922          18.360119  
57    x     x     x   11.321407     15.612456          18.322451  
58    x     x     x   11.318038     15.607811          18.315299  
59    x     x     x   11.290775     15.570214          18.257413  
       Class               Name PA  R HR RBI SB OBP   IP    QS  SV   SO   ERA  \
356  Pitcher         Jose Urena  x  x  x   x  x   x  149  10.5   0  119  4.55   
357  Pitcher    Jordan Yamamoto  x  x  x   x  x   x  135  10.5   0  128  4.72   
358  Pitcher        Brad Keller  x  x  x   x  x   x  182    15   0  137  4.74   
359  Pitcher      Daniel Norris  x  x  x   x  x   x  147    12   0  127  4.93   
360  Pitcher        Matt Barnes  x  x  x   x  x   x   68     0   4   93  3.38   
361  Pitcher      Merrill Kelly  x  x  x   x  x   x  109     9   0  101  4.57   
362  Pitcher  Elieser Hernandez  x  x  x   x  x   x   93   7.5   0   94  4.48   
363  Pitcher    Dellin Betances  x  x  x   x  x   x   55     0   4   78  3.07   
364  Pitcher      Jose Alvarado  x  x  x   x  x   x   60     0   9   76  3.32   
365  Pitcher         Logan Webb  x  x  x   x  x   x   99     9   0   87   4.2   
366  Pitcher     Brett Anderson  x  x  x   x  x   x  143  13.5   0   95  4.54   
367  Pitcher          Chad Kuhl  x  x  x   x  x   x  108     9   0  105  4.61   
368  Pitcher         Mike Fiers  x  x  x   x  x   x  188    15   0  144  5.21   
369  Pitcher     Diego Castillo  x  x  x   x  x   x   60     0   5   73  3.26   
370  Pitcher      Dakota Hudson  x  x  x   x  x   x  166  13.5   0  126  4.54   
371  Pitcher       Emilio Pagan  x  x  x   x  x   x   65     0   1   83  3.51   
372  Pitcher         Zach Eflin  x  x  x   x  x   x  146    12   0  120  5.05   
373  Pitcher        Ryne Stanek  x  x  x   x  x   x   65     0   5   83  3.58   
374  Pitcher   Yoshihisa Hirano  x  x  x   x  x   x   60     0  21   55  4.62   
375  Pitcher       Ryan Pressly  x  x  x   x  x   x   65     0   2   78  3.36   
376  Pitcher         Chad Green  x  x  x   x  x   x   60     0   0   81  3.44   
377  Pitcher         Ty Buttrey  x  x  x   x  x   x   65     0   5   75  3.56   
378  Pitcher     Trent Thornton  x  x  x   x  x   x   93   7.5   0   90  4.64   
379  Pitcher      Julio Teheran  x  x  x   x  x   x  179    15   0  155   5.3   
380  Pitcher      Daniel Hudson  x  x  x   x  x   x   67     0  15   69  4.36   
381  Pitcher         Wade Davis  x  x  x   x  x   x   65     0  24   65  5.11   
382  Pitcher       Joshua James  x  x  x   x  x   x   60     0   1   84  3.38   
383  Pitcher    Trevor Williams  x  x  x   x  x   x  167    12   0  135  5.02   
384  Pitcher       Homer Bailey  x  x  x   x  x   x  124    12   0   99  5.02   
385  Pitcher          Seth Lugo  x  x  x   x  x   x   65     0   2   75  3.54   
386  Pitcher     Michael Kopech  x  x  x   x  x   x   91   7.5   0  103  4.78   
387  Pitcher       Gio Gonzalez  x  x  x   x  x   x  133    12   0  114   4.9   
388  Pitcher   Kendall Graveman  x  x  x   x  x   x  137  10.5   0  103  4.78   
389  Pitcher     Tyler Chatwood  x  x  x   x  x   x  116  10.5   0  109  4.51   
390  Pitcher         Alec Mills  x  x  x   x  x   x   95   7.5   0   85  4.64   
391  Pitcher   Justus Sheffield  x  x  x   x  x   x  148  10.5   0  131   4.8   
392  Pitcher        Will Harris  x  x  x   x  x   x   67     0   4   70  3.56   
393  Pitcher       Tommy Kahnle  x  x  x   x  x   x   68     0   0   86  3.61   
394  Pitcher      Kyle Freeland  x  x  x   x  x   x  154  13.5   0  128  5.08   
395  Pitcher     Ross Stripling  x  x  x   x  x   x   80     0   0   80   3.7   
396  Pitcher         John Means  x  x  x   x  x   x  183    12   0  142  5.33   
397  Pitcher      Andrew Miller  x  x  x   x  x   x   68     0   2   80  3.65   
398  Pitcher      Hunter Harvey  x  x  x   x  x   x   60     0  14   62  4.61   
399  Pitcher      Blake Treinen  x  x  x   x  x   x   62     0   4   67  3.45   
400  Pitcher        Zach Plesac  x  x  x   x  x   x  119  10.5   0   97  5.07   
401  Pitcher      Drew Pomeranz  x  x  x   x  x   x   58     0   1   73   3.4   
402  Pitcher          Alex Cobb  x  x  x   x  x   x  181  13.5   0  129  5.17   
403  Pitcher     Shaun Anderson  x  x  x   x  x   x   60     0   8   55  3.85   
404  Pitcher           Joe Ross  x  x  x   x  x   x  108     9   0   90  4.85   
405  Pitcher       Corey Knebel  x  x  x   x  x   x   45     0   4   62  3.53   

     WHIP  Total_SGPs  Dollar_Value  True_Dollar_Value  
356  1.38    3.156425      4.361533                1.0  
357   1.4    3.148294      4.350298                1.0  
358  1.47    3.072151      4.245083                1.0  
359  1.39    3.059649      4.227809                1.0  
360  1.23    3.047430      4.210924                1.0  
361  1.34    3.040476      4.201315                1.0  
362  1.29    3.031641      4.189106                1.0  
363  1.15    3.005444      4.152908                1.0  
364   1.3    2.992172      4.134569                1.0  
365  1.38    2.976146      4.112425                1.0  
366  1.43    2.951746      4.078709                1.0  
367  1.37    2.940514      4.063188                1.0  
368   1.4    2.937932      4.059621                1.0  
369  1.17    2.903895      4.012588                1.0  
370   1.5    2.837873      3.921360                1.0  
371  1.09    2.835963      3.918720                1.0  
372  1.38    2.821200      3.898321                1.0  
373  1.22    2.817134      3.892702                1.0  
374  1.39    2.816882      3.892354                1.0  
375  1.15    2.738786      3.784442                1.0  
376  1.07    2.717006      3.754346                1.0  
377   1.2    2.716080      3.753066                1.0  
378  1.31    2.707418      3.741097                1.0  
379  1.46    2.679097      3.701963                1.0  
380  1.36    2.649379      3.660899                1.0  
381   1.5    2.644517      3.654180                1.0  
382  1.17    2.632308      3.637311                1.0  
383  1.42    2.604461      3.598832                1.0  
384   1.4    2.579940      3.564948                1.0  
385  1.15    2.556118      3.532032                1.0  
386  1.39    2.534549      3.502228                1.0  
387  1.47    2.490584      3.441477                1.0  
388   1.4    2.489041      3.439345                1.0  
389  1.53    2.467459      3.409523                1.0  
390  1.34    2.441441      3.373571                1.0  
391  1.48    2.392291      3.305657                1.0  
392  1.24    2.374723      3.281381                1.0  
393  1.21    2.358801      3.259380                1.0  
394  1.49    2.351697      3.249563                1.0  
395  1.18    2.345932      3.241597                1.0  
396  1.38    2.340762      3.234454                1.0  
397  1.24    2.319840      3.205544                1.0  
398  1.35    2.318940      3.204300                1.0  
399  1.25    2.318301      3.203417                1.0  
400  1.39    2.312246      3.195050                1.0  
401  1.19    2.296901      3.173846                1.0  
402  1.43    2.229399      3.080573                1.0  
403  1.26    2.216245      3.062397                1.0  
404  1.41    2.148623      2.968957                1.0  
405  1.22    2.137992      2.954267                1.0  

Further smoothing

Things look pretty good! We have about 50 players valued at $1 (a few more if we round down, but whatever). And everyone else’s values have increased.

But something is still off. If you’re following along with my data, Mike Trout is only at $36, which feels too low to me. Once I add in inflation of around 20%, that will go up to $43. But last year in my league he was drafted for $70! Now, I didn’t value him that high, but someone did and I’m worried that if I run with these numbers, I’m going to end up with none of the top-25 hitters because everyone is paying more.

So I’m going to reallocate some of the money at the top and move it away from the middle. This follows my general stars & scrubs strategy. If you prefer a more balanced approach, then you might want to skip this step (and decrease the number of $1 players). In [18]:

# Note: you may have to run this a few times and then restart the kernel and redo it a few times to get things right.
# Until I find a more magical way to do this, it's going to involve some trial and error.
# For this example, I'm going to run it once with a lower_bound of 15 and then again with a lower_bound of 25. This
# will push more money into the top of the distribution.

upper_bound = 100 # players with True_Dollar_Value BELOW this number will be adjusted
lower_bound = 25  # players with True_Dollar_Value ABOVE this number will be adjusted

adjustment_rate = 1.4 # the multiplier to apply to the players in the range

adjust_top_players = lambda x: x * adjustment_rate if x < upper_bound and x > lower_bound else x
draft_pool['True_Dollar_Value'] = draft_pool.True_Dollar_Value.apply(adjust_top_players)

adjusted_dollar_value_after_subtraction = draft_pool.True_Dollar_Value.sum()

dollar_value_multiplier = total_dollar_pool / adjusted_dollar_value_after_subtraction

multiply_it = lambda x: x * dollar_value_multiplier if x > 1 else x
draft_pool['True_Dollar_Value'] = draft_pool.True_Dollar_Value.apply(multiply_it)
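One thing to watch in the cell above: the multiplier is computed from the sum of every player, but it is only applied to players above $1, so the final total ends up close to `total_dollar_pool` rather than exactly equal to it. Here is a hedged sketch of a variation that balances exactly by setting aside the $1 players' money before computing the multiplier. The function name `bump_and_rescale` and the toy numbers are my own inventions for illustration, and it works on a plain list rather than the dataframe.

```python
def bump_and_rescale(values, total_dollar_pool, lower_bound, upper_bound, adjustment_rate):
    """Bump values inside (lower_bound, upper_bound), then rescale so the pool balances.

    Players at or below $1 are left alone, so the multiplier is computed
    only from the money that is left after paying for them.
    """
    bumped = [v * adjustment_rate if lower_bound < v < upper_bound else v for v in values]
    floor_money = sum(v for v in bumped if v <= 1)
    flexible_money = sum(v for v in bumped if v > 1)
    multiplier = (total_dollar_pool - floor_money) / flexible_money
    return [v * multiplier if v > 1 else v for v in bumped]


# Made-up numbers for illustration only: six players and a $98 pool.
toy_values = [36.0, 30.0, 20.0, 10.0, 1.0, 1.0]
adjusted = bump_and_rescale(toy_values, total_dollar_pool=98.0,
                            lower_bound=25, upper_bound=100, adjustment_rate=1.4)
print([round(v, 2) for v in adjusted], round(sum(adjusted), 2))
```

With these toy numbers the two players above $25 go up, the middle players give up some value, and the $1 players stay put. Whether the small drift in the original cell matters is a judgment call; for a draft sheet that gets rounded to whole dollars it probably doesn't.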

In [19]:

print(draft_pool.head(50))
print(draft_pool.tail())
print(draft_pool.True_Dollar_Value.sum())
      Class                Name   PA    R  HR  RBI  SB    OBP   IP    QS SV  \
0    Hitter          Mike Trout  675  124  44  112  14  0.439    x     x  x   
1    Hitter    Christian Yelich  672  112  36  100  22    0.4    x     x  x   
2    Hitter    Ronald Acuna Jr.  680  107  37   94  29  0.363    x     x  x   
3    Hitter        Mookie Betts  688  118  32   95  18   0.39    x     x  x   
4    Hitter      Cody Bellinger  640   99  42  115  12  0.385    x     x  x   
5    Hitter         Trea Turner  687  103  22   75  39  0.354    x     x  x   
6    Hitter    Francisco Lindor  692  112  35   95  22  0.354    x     x  x   
7    Hitter        Bryce Harper  664  102  41  102  11  0.383    x     x  x   
8    Hitter           Juan Soto  646   97  34  106   9  0.406    x     x  x   
9    Hitter        Jose Ramirez  648   95  31  101  23  0.362    x     x  x   
10   Hitter        Alex Bregman  658  106  32  103   6  0.396    x     x  x   
11   Hitter       J.D. Martinez  649  102  39  120   3  0.378    x     x  x   
12   Hitter        Trevor Story  652   98  36  101  20  0.346    x     x  x   
13   Hitter         Aaron Judge  670  107  41  101   6  0.373    x     x  x   
14  Pitcher         Gerrit Cole    x    x   x    x   x      x  202  22.5  0   
15   Hitter   Giancarlo Stanton  612  101  48  115   3  0.355    x     x  x   
16   Hitter       Nolan Arenado  657  100  40  114   3  0.371    x     x  x   
17   Hitter     Freddie Freeman  662   98  33  103   6  0.385    x     x  x   
18  Pitcher        Max Scherzer    x    x   x    x   x      x  202  22.5  0   
19  Pitcher    Justin Verlander    x    x   x    x   x      x  204    24  0   
20   Hitter     George Springer  687  112  35   91   8  0.365    x     x  x   
21   Hitter       Anthony Rizzo  660   97  32   99   6  0.388    x     x  x   
22   Hitter      Josh Donaldson  644  100  36  103   4  0.379    x     x  x   
23  Pitcher        Jacob deGrom    x    x   x    x   x      x  205    21  0   
24   Hitter   Adalberto Mondesi  649   82  20   76  49  0.293    x     x  x   
25   Hitter         Nelson Cruz  641   97  40  114   1  0.363    x     x  x   
26   Hitter          Joey Gallo  616   90  43  101   8  0.349    x     x  x   
27   Hitter      Yordan Alvarez  602   93  39  107   3  0.366    x     x  x   
28   Hitter  Fernando Tatis Jr.  671   93  31   82  23  0.333    x     x  x   
29   Hitter       Rafael Devers  636   95  32  103   9  0.355    x     x  x   
30   Hitter          Tommy Pham  652   92  25   76  18  0.366    x     x  x   
31   Hitter         Kris Bryant  669  100  31   91   4  0.377    x     x  x   
32   Hitter         Jose Altuve  673  100  24   93  12  0.361    x     x  x   
33   Hitter        Peter Alonso  663   98  44  105   2  0.343    x     x  x   
34  Pitcher          Chris Sale    x    x   x    x   x      x  177    21  0   
35   Hitter        Rhys Hoskins  656   93  36   96   4  0.365    x     x  x   
36   Hitter       Manny Machado  664   93  37  102   7  0.343    x     x  x   
37   Hitter      Starling Marte  649   85  22   81  26  0.337    x     x  x   
38   Hitter      Anthony Rendon  655   93  28   99   4  0.374    x     x  x   
39   Hitter       Carlos Correa  637   93  33  105   3  0.359    x     x  x   
40   Hitter    Paul Goldschmidt  663   91  31   91   5  0.369    x     x  x   
41   Hitter    Charlie Blackmon  660  101  30   84   8  0.356    x     x  x   
42   Hitter      Carlos Santana  654   92  29   93   3  0.375    x     x  x   
43   Hitter     Xander Bogaerts  649   92  25   96   6  0.368    x     x  x   
44   Hitter       Shohei Ohtani  555   80  29   89  13  0.353    x     x  x   
45   Hitter         Bo Bichette  663   92  22   74  24  0.328    x     x  x   
46   Hitter      Kyle Schwarber  585   86  37   90   5  0.354    x     x  x   
47   Hitter          Matt Olson  642   91  38  103   1  0.344    x     x  x   
48   Hitter       Marcus Semien  678   98  25   79  12  0.347    x     x  x   
49   Hitter      Austin Meadows  624   87  28   84  15  0.337    x     x  x   

     SO   ERA  WHIP  Total_SGPs  Dollar_Value  True_Dollar_Value  
0     x     x     x   19.705009     27.173618          49.107161  
1     x     x     x   17.535716     24.182119          42.845690  
2     x     x     x   16.667379     22.984664          40.339311  
3     x     x     x   16.318321     22.503305          39.331784  
4     x     x     x   15.940307     21.982016          38.240681  
5     x     x     x   15.498767     21.373124          36.966216  
6     x     x     x   15.383605     21.214312          36.633809  
7     x     x     x   15.336310     21.149092          36.497298  
8     x     x     x   15.225999     20.996970          36.178894  
9     x     x     x   14.931597     20.590985          35.329130  
10    x     x     x   14.541766     20.053398          34.203915  
11    x     x     x   14.453657     19.931895          24.249712  
12    x     x     x   14.446120     19.921501          24.234173  
13    x     x     x   14.409209     19.870600          24.158072  
14  280  3.26  1.04   14.112785     19.500979          23.605466  
15    x     x     x   14.072346     19.406060          23.463555  
16    x     x     x   13.967811     19.261903          23.248032  
17    x     x     x   13.849842     19.099221          23.004812  
18  268  3.28  1.04   13.808610     19.080672          22.977080  
19  266  3.46  1.04   13.759256     19.012475          22.875121  
20    x     x     x   13.699255     18.891559          22.694344  
21    x     x     x   13.683763     18.870194          22.662403  
22    x     x     x   13.633914     18.801452          22.559630  
23  258  3.13  1.06   13.423138     18.548028          22.180745  
24    x     x     x   13.398905     18.477370          22.075106  
25    x     x     x   13.206795     18.212446          21.679029  
26    x     x     x   13.112132     18.081904          21.483860  
27    x     x     x   13.021316     17.956666          21.296621  
28    x     x     x   12.973844     17.891202          21.198748  
29    x     x     x   12.812234     17.668338          20.865553  
30    x     x     x   12.785291     17.631184          20.810005  
31    x     x     x   12.715604     17.535083          20.666328  
32    x     x     x   12.652668     17.448294          20.536572  
33    x     x     x   12.631782     17.419491          20.493510  
34  233  3.22  1.03   12.539905     17.327581          20.356100  
35    x     x     x   12.532399     17.282440          20.288611  
36    x     x     x   12.389126     17.084863          19.993221  
37    x     x     x   12.376786     17.067846          19.967779  
38    x     x     x   12.290491     16.948844          19.789864  
39    x     x     x   12.193362     16.814901          19.589610  
40    x     x     x   12.128043     16.724825          19.454941  
41    x     x     x   12.106473     16.695078          19.410468  
42    x     x     x   12.007923     16.559176          19.207286  
43    x     x     x   11.899798     16.410070          18.984362  
44    x     x     x   11.787255     16.254871          18.752330  
45    x     x     x   11.768332     16.228775          18.713315  
46    x     x     x   11.724804     16.168749          18.623573  
47    x     x     x   11.607333     16.006755          18.381382  
48    x     x     x   11.557202     15.937623          18.278024  
49    x     x     x   11.546488     15.922848          18.255936  
       Class            Name PA  R HR RBI SB OBP   IP    QS SV   SO   ERA  \
401  Pitcher   Drew Pomeranz  x  x  x   x  x   x   58     0  1   73   3.4   
402  Pitcher       Alex Cobb  x  x  x   x  x   x  181  13.5  0  129  5.17   
403  Pitcher  Shaun Anderson  x  x  x   x  x   x   60     0  8   55  3.85   
404  Pitcher        Joe Ross  x  x  x   x  x   x  108     9  0   90  4.85   
405  Pitcher    Corey Knebel  x  x  x   x  x   x   45     0  4   62  3.53   

     WHIP  Total_SGPs  Dollar_Value  True_Dollar_Value  
401  1.19    2.296901      3.173846                1.0  
402  1.43    2.229399      3.080573                1.0  
403  1.26    2.216245      3.062397                1.0  
404  1.41    2.148623      2.968957                1.0  
405  1.22    2.137992      2.954267                1.0  
4201.4478703686655

Looks much better! Now let’s see how it looks when graphed out. In [20]:

# First, let's convert our dollar values into a numpy array and then round them to the nearest integer. 
dollar_values_array = np.rint(draft_pool['Dollar_Value'].to_numpy())
true_dollar_values_array = np.rint(draft_pool['True_Dollar_Value'].to_numpy())

# print(dollar_values_array)
# print(true_dollar_values_array)

# And convert the numpy array to a normal list because I want to use list.count() below.
dollar_values_list = [int(value) for value in dollar_values_array]
true_dollar_values_list = [int(value) for value in true_dollar_values_array]

# x_values are the dollar values we want to chart ($1 through $50)
x_values = list(range(1,51))

# y_values and y2_values are the number of players at each dollar value
# (list.count() already returns 0 when no player has that value)
y_values = []
y2_values = []
for number in x_values:
    y_values.append(dollar_values_list.count(number))
    y2_values.append(true_dollar_values_list.count(number))

# Plot the distribution
plt.plot(x_values, y_values, label="Original Dollar Values")
plt.bar(x_values, y2_values, color="orange",label="New Dollar Values")

plt.xlabel('Dollar Values')
plt.ylabel('Players')
plt.title("Player Value Distribution")
plt.legend()

plt.show()

So, this method isn’t perfect. It’s looking better, although there are some weird gaps in there. I’m feeling OK with it for now but I might come back and tinker with it to smooth things out a bit.

You’ll have to play with it a bit to get numbers that work for you, or email me and tell me of a better way to handle this. In [21]:

# Uncomment to export your data as a CSV and you're ready to draft...
# draft_pool.to_csv('Exports/my_draft_pool_smoothed_adjusted.csv')

Calculating Inflation

A lot of auction leagues are also keeper leagues. The players held out of the draft have a huge impact on auction prices, and if you’ve ever drafted in a keeper league without accounting for inflation, you may have had a wicked surprise: the market prices are way different from a player’s ‘true’ value.

Sure, paying $70 for Mike Trout might seem outrageous, but the money all has to get spent and leaving the draft with unspent money is a mistake that could haunt you. Better to ‘overpay’ and build a better roster than refuse to overpay and leave with money that you can’t take with you.
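To make the arithmetic concrete, here’s a minimal sketch of the inflation idea. The budget, salaries, and values are made-up numbers (not from my league), and the function name is just something I picked for the example. It assumes the pre-draft values of all players add up to the total budget.

```python
def inflation_rate(total_budget, kept_salary, kept_value):
    """Dollars left to spend divided by the value left to buy.

    Assumes the pre-draft values of all players sum to total_budget.
    """
    remaining_dollars = total_budget - kept_salary
    remaining_value = total_budget - kept_value
    return remaining_dollars / remaining_value


# Hypothetical league: $3,120 total budget (12 teams x $260).
# Keepers cost $400 in salary but are worth $900 in projected value.
rate = inflation_rate(total_budget=3120, kept_salary=400, kept_value=900)

# A player valued at $40 before inflation should go for about this much:
inflated_price = 40 * rate
print(round(rate, 3), round(inflated_price, 2))
```

The cheaper the keepers are relative to their value, the more money is chasing less talent, and the higher the rate goes.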

Preparing inflation data

Before we can begin, we need to leave the notebook and do some work in Excel so we know which players were kept. Since my league hasn’t set its keepers yet, I’m going to create a fake version on my own. In [22]:

from shutil import copyfile
# Uncomment this and run it to create your file. Be careful not to run it again or you might overwrite your data!
# copyfile("Exports/my_draft_pool_smoothed_adjusted.csv", "League_Data/blank_keeper_data.csv")

Now that you have your file, open it up in Excel. Before doing anything else, save the file under a new name, so you don’t accidentally overwrite it.

Next, you’re going to manually add your league’s keepers in the spreadsheet. Create two new columns, one called ‘Salary’ and one called ‘Keeper’ (you can look at my_keeper_data.csv to see how I filled it out).

In the Keeper column, add True if the player was kept, then add their salary in the Salary column. In [23]:

# Import the keeper list CSV as a dataframe
my_league_keeper_list = pd.read_csv('League_Data/my_keeper_data.csv',thousands=',')
my_league_keeper_list.drop(labels="Unnamed: 0",axis=1,inplace=True)
print(my_league_keeper_list.head())
    Class              Name   PA    R  HR  RBI  SB    OBP IP QS SV SO ERA  \
0  Hitter        Mike Trout  675  124  44  112  14  0.439  x  x  x  x   x   
1  Hitter  Christian Yelich  672  112  36  100  22    0.4  x  x  x  x   x   
2  Hitter  Ronald Acuna Jr.  680  107  37   94  29  0.363  x  x  x  x   x   
3  Hitter      Mookie Betts  688  118  32   95  18   0.39  x  x  x  x   x   
4  Hitter    Cody Bellinger  640   99  42  115  12  0.385  x  x  x  x   x   

  WHIP  Total_SGPs  Dollar_Value  True_Dollar_Value Keeper  Salary  
0    x   19.705009     27.173618          51.206485    NaN     NaN  
1    x   17.535716     24.182119          44.677337   True    20.0  
2    x   16.667379     22.984664          42.063811   True     1.0  
3    x   16.318321     22.503305          41.013213   True     3.0  
4    x   15.940307     21.982016          39.875465    NaN     NaN  
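If you’d rather not leave the notebook, the same two columns could probably be added with pandas instead of Excel. This is only a sketch: the tiny dataframe and the keepers dictionary are placeholders I made up, and you’d swap in your own draft pool and your league’s keeper list.

```python
import numpy as np
import pandas as pd

# Stand-in for the exported draft pool (values are made up).
pool = pd.DataFrame({
    "Name": ["Player A", "Player B", "Player C"],
    "True_Dollar_Value": [40.0, 25.0, 10.0],
})

# Hypothetical keepers: player name -> keeper salary.
keepers = {"Player B": 12.0}

# Salary is NaN for players who weren't kept.
pool["Salary"] = pool["Name"].map(keepers)

# Keeper is True for kept players and NaN otherwise, like the CSV above.
pool["Keeper"] = [True if name in keepers else np.nan for name in pool["Name"]]

print(pool)
```

Matching on names only works if they’re spelled exactly the same way in both places, so it’s worth eyeballing the result.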

The dataframe should now have columns for Keeper and Salary. Now we can calculate inflation. In [24]:

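# Create a dataframe to get the keeper salary/value aggregates.
# Rows with a blank Keeper cell are NaN and groupby drops them, so only kept players are totaled.
# Only the two dollar columns are summed, so the text columns (Name, stats) are left out.
keeper_data = my_league_keeper_list.groupby(['Class','Keeper'])[['True_Dollar_Value','Salary']].sum()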

# Hitters
hitter_kept_value = keeper_data.loc['Hitter','True_Dollar_Value']
hitter_kept_salary = keeper_data.loc['Hitter','Salary']

hitter_remaining_dollars = hitter_dollar_allocation - hitter_kept_salary
hitter_remaining_value = hitter_dollar_allocation - hitter_kept_value

# If you don't have past league data, you can manually assign a multiplier (e.g. 1.25) to hitter_inflation_rate
hitter_inflation_rate = hitter_remaining_dollars / hitter_remaining_value

# Pitchers
pitcher_kept_value = keeper_data.loc['Pitcher','True_Dollar_Value']
pitcher_kept_salary = keeper_data.loc['Pitcher','Salary']

pitcher_remaining_dollars = pitcher_dollar_allocation - pitcher_kept_salary
pitcher_remaining_value = pitcher_dollar_allocation - pitcher_kept_value

# If you don't have past league data, you can manually assign a multiplier (e.g. 1.25) to pitcher_inflation_rate
pitcher_inflation_rate = pitcher_remaining_dollars / pitcher_remaining_value

# Let's apply the inflation rates
apply_inflation = lambda row: row['True_Dollar_Value'] * hitter_inflation_rate \
    if row['Class'] == 'Hitter' \
    else row['True_Dollar_Value'] * pitcher_inflation_rate

my_league_keeper_list['Inflated_Value'] = my_league_keeper_list.apply(apply_inflation,axis=1)

# Export a new spreadsheet
# my_league_keeper_list.to_csv('Exports/my_draft_pool_smoothed_adjusted_inflated.csv')

print(my_league_keeper_list.head(3))
print(my_league_keeper_list.tail(3))
    Class              Name   PA    R  HR  RBI  SB    OBP IP QS SV SO ERA  \
0  Hitter        Mike Trout  675  124  44  112  14  0.439  x  x  x  x   x   
1  Hitter  Christian Yelich  672  112  36  100  22    0.4  x  x  x  x   x   
2  Hitter  Ronald Acuna Jr.  680  107  37   94  29  0.363  x  x  x  x   x   

  WHIP  Total_SGPs  Dollar_Value  True_Dollar_Value Keeper  Salary  \
0    x   19.705009     27.173618          51.206485    NaN     NaN   
1    x   17.535716     24.182119          44.677337   True    20.0   
2    x   16.667379     22.984664          42.063811   True     1.0   

   Inflated_Value  
0       63.661140  
1       55.543945  
2       52.294746  
       Class            Name PA  R HR RBI SB OBP   IP QS SV  SO   ERA  WHIP  \
403  Pitcher  Shaun Anderson  x  x  x   x  x   x   60  0  8  55  3.85  1.26   
404  Pitcher        Joe Ross  x  x  x   x  x   x  108  9  0  90  4.85  1.41   
405  Pitcher    Corey Knebel  x  x  x   x  x   x   45  0  4  62  3.53  1.22   

     Total_SGPs  Dollar_Value  True_Dollar_Value Keeper  Salary  \
403    2.216245      3.062397                1.0    NaN     NaN   
404    2.148623      2.968957                1.0    NaN     NaN   
405    2.137992      2.954267                1.0    NaN     NaN   

     Inflated_Value  
403        1.066684  
404        1.066684  
405        1.066684  
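One way to sanity-check an inflation rate is to confirm that the inflated values of the players still available add up to the money still available. Here’s a small sketch with toy data; the four players, their values, and the $100 budget are all invented, and the budget is chosen to equal the sum of the values.

```python
import pandas as pd

# Toy data (made up): four hitters, one of them kept at a discount.
players = pd.DataFrame({
    "Name": ["A", "B", "C", "D"],
    "True_Dollar_Value": [40.0, 30.0, 20.0, 10.0],
    "Keeper": [False, True, False, False],
    "Salary": [0.0, 10.0, 0.0, 0.0],
})
budget = 100.0  # hypothetical hitter allocation; equals the sum of the values

kept = players[players["Keeper"]]
remaining_dollars = budget - kept["Salary"].sum()            # 90.0
remaining_value = budget - kept["True_Dollar_Value"].sum()   # 70.0
rate = remaining_dollars / remaining_value

players["Inflated_Value"] = players["True_Dollar_Value"] * rate

# The players still available should add up to the money still available.
available = players[~players["Keeper"]]
print(available["Inflated_Value"].sum(), remaining_dollars)
```

If the two numbers are far apart with your real data, the player values probably don’t sum to the dollar allocation you’re using.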

Finishing up

The dataframe should now have a column called Inflated_Value. If you uncommented the export line, you’ll find the same column in “my_draft_pool_smoothed_adjusted_inflated.csv”.

Before we call it a day, let’s remove the keepers and drop a couple of unnecessary columns. In [25]:

# First, let's get rid of the Keepers, they're not going to be in our final draft pool.
# A boolean mask does this in one step and keeps the original row labels.
is_keeper = my_league_keeper_list['Keeper'] == True
my_league_keeper_list = my_league_keeper_list[~is_keeper].copy()

my_league_keeper_list.drop(['Keeper','Salary','Total_SGPs','Dollar_Value'], inplace=True,axis=1)
print(my_league_keeper_list.head())
     Class              Name   PA    R  HR  RBI  SB    OBP IP QS SV SO ERA  \
0   Hitter        Mike Trout  675  124  44  112  14  0.439  x  x  x  x   x   
4   Hitter    Cody Bellinger  640   99  42  115  12  0.385  x  x  x  x   x   
5   Hitter       Trea Turner  687  103  22   75  39  0.354  x  x  x  x   x   
6   Hitter  Francisco Lindor  692  112  35   95  22  0.354  x  x  x  x   x   
10  Hitter      Alex Bregman  658  106  32  103   6  0.396  x  x  x  x   x   

   WHIP  True_Dollar_Value  Inflated_Value  
0     x          51.206485       63.661140  
4     x          39.875465       49.574141  
5     x          38.546517       47.921961  
6     x          38.199899       47.491037  
10    x          35.666127       44.340991  

In [26]:

# Finally, we can export our clean data into a new file
#my_league_keeper_list.to_csv('Exports/my_draft_pool_smoothed_adjusted_inflated_clean.csv')

And, we’re done! Going through it the first time may take a while, but once you understand how to use it, you should be saving many hours every year. And there are a lot of ways you can further customize the notebook.

Here are some ideas:

  • Creating composite projections with multiple projection systems.
  • Adjusting the value distribution based on different strategies.
  • Handling position scarcity, with rules to make sure that enough players at every position are in the draft pool (e.g. catchers).
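As a starting point for the first idea, here’s a rough sketch of a composite projection: stack two projection sets and average each stat per player. The two dataframes are invented numbers, not real projections, and a plain average is my assumption. It’s reasonable for counting stats, but rate stats like OBP would probably need to be weighted by plate appearances.

```python
import pandas as pd

# Two made-up projection sets for the same players (not real projections).
system_a = pd.DataFrame({"Name": ["Player A", "Player B"], "HR": [30, 20], "SB": [10, 25]})
system_b = pd.DataFrame({"Name": ["Player A", "Player B"], "HR": [34, 16], "SB": [8, 27]})

# Stack the systems and average each stat per player.
composite = (
    pd.concat([system_a, system_b])
    .groupby("Name", as_index=False)
    .mean()
)
print(composite)
```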

