
The data used to predict each game includes each team's performance across a range of statistics in the season up to the date of the given game. Older games are given less weight, and the weight decays linearly. For example, when predicting the result of a team's 41st game of the season, the team's 40th game is given twice the weight of its 20th game. Several game weighting techniques were tested while building the model, from weighting each game equally to weighting recent games exponentially more. Using only the last 20 or 30 games of data was also evaluated in combination with each of these methods. Ultimately, using full season-to-date data with a linear decay of game importance proved to have the most predictive power.
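The linear weighting can be sketched as follows. This is a minimal illustration, assuming the weight of a game is simply its game number within the season (so game 40 gets weight 40 and game 20 gets weight 20). The function name and the sample values are hypothetical and are not taken from the model's own code.

```python
def linear_weighted_average(values):
    """Season-to-date average where the k-th game played gets weight k.

    `values` holds one statistic per game in chronological order, so the
    most recent game carries the largest weight.
    """
    if not values:
        raise ValueError("need at least one game")
    # Assumed weighting: game 1 -> weight 1, game 40 -> weight 40.
    weights = range(1, len(values) + 1)
    weighted_sum = sum(w * v for w, v in zip(weights, values))
    return weighted_sum / sum(weights)


# Hypothetical example: a team's shot share over its first three games.
recent_form = linear_weighted_average([0.40, 0.50, 0.60])
```

With these weights, a recent strong game pulls the average up more than an early-season game of the same quality would.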

The win prediction model has three main components: the Home submodel, the Away submodel, and the Meta Model.

The Home submodel uses only statistics describing the home team and predicts the likelihood the home team will win the game.

The Away submodel similarly uses only statistics describing the away team and predicts the likelihood the away team will win the game.

The Meta Model combines the Home submodel, the Away submodel, home ice advantage, and each team's days of rest into one overall score.

We also use a simple 'Tie Game' model, which predicts the probability that the game will go to overtime. This model is simply a function of the Meta Model score. The default Tie Game model score is 25%. For every percentage point the Meta Model score is away from a 50/50 game, the chance of the game going to OT goes down by 0.2 percentage points. For example, a game with 55/45 odds is given a 24% chance of going to OT. If a game goes to OT, we give each team equal odds of winning, whether in regular OT or a shootout.

The home team's overall odds of winning the game are then calculated as follows:

Home Team chances of winning in regulation = (1 - [Tie Game Model Score]) * [Meta Model Score]

Home Team chances of winning in OT = [Tie Game Model Score] * 50%

Home Team chances of winning = Chance of winning in regulation + chance of winning in OT
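The three formulas above, together with the Tie Game rule, can be written as a short sketch. The function names are hypothetical. The constants (25% base, 0.2 drop per percentage point, 50% in OT) come from the description above, and the Meta Model score is assumed to be expressed as a probability between 0 and 1.

```python
def tie_game_probability(meta_score, base=0.25, slope=0.2):
    """Chance of overtime: 25% for a 50/50 game, dropping by 0.2 points
    for every percentage point the Meta Model score is away from 50%."""
    return base - slope * abs(meta_score - 0.5)


def home_win_probability(meta_score):
    """Combine the regulation and overtime chances for the home team."""
    p_tie = tie_game_probability(meta_score)
    p_regulation = (1 - p_tie) * meta_score
    p_overtime = p_tie * 0.5  # each team gets equal odds once in OT
    return p_regulation + p_overtime
```

For a Meta Model score of 0.55, this gives a 24% chance of overtime and an overall home win probability of 0.76 * 0.55 + 0.24 * 0.5 = 0.538.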

When the 2015-2016 season is used as a test of whether the model works, the 15% of shots the model rated the highest accounted for over 50% of the goals that season:

In general, the shots with the highest goal probability are quick rebound shots close to the net where there has been a large change in shot angle from the original shot:

1.) Shot Distance From Net

2.) Time Since Last Game Event

3.) Shot Type (Slap, Wrist, Backhand, etc)

4.) Speed From Previous Event

5.) Shot Angle

6.) East-West Location on Ice of Last Event Before the Shot

7.) If Rebound, difference in shot angle divided by time since last shot

8.) Last Event That Happened Before the Shot (Faceoff, Hit, etc)

9.) Other team's # of skaters on ice

10.) East-West Location on Ice of Shot

11.) Man Advantage Situation

12.) Time since current Powerplay started

13.) Distance From Previous Event

14.) North-South Location on Ice of Shot

15.) Shooting on Empty Net

The definition of a flurry adjusted expected goal is:

Flurry Adjusted Expected Goal Value = Chance of Not Scoring in Flurry Yet * Regular Expected Goal Value of Shot

The video below shows an example from the Boston Bruins vs. Ottawa Senators game on March 6th, 2017. On the first shot, the Bruins have a 33% chance of scoring. That means there's a 67% chance of not having scored after the first shot. The rebound shot has an 82% chance of being a goal, thanks to a 77° change in direction from the first shot. The expected goal value of the rebound shot is multiplied by 0.67 to get its adjusted expected goal value. Instead of being worth 0.82 expected goals, the rebound shot is worth 0.55 expected goals. The flurry adjusted expected goal value of the whole flurry is 0.88, compared to 1.15 for regular expected goals. The flurry adjusted metric has the nice property that a single flurry can never be worth more than 1.0 flurry adjusted expected goals. This video is also an example of the limitations of expected goals, as the slap shot was recorded as being closer to the net than it actually was, increasing its expected goal value.
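The flurry adjustment can be sketched in a few lines. This is an illustration only: it assumes that the 'chance of not scoring in the flurry yet' is the product of (1 - expected goal value) over the earlier shots in the flurry, which matches the two-shot example above. The function name is hypothetical.

```python
def flurry_adjusted_xgoals(shot_xgoals):
    """Return the flurry adjusted expected goal value of each shot.

    `shot_xgoals` lists the regular expected goal values of the shots in
    one flurry, in the order they were taken.
    """
    adjusted = []
    p_no_goal_yet = 1.0  # chance the flurry has not produced a goal so far
    for xg in shot_xgoals:
        adjusted.append(p_no_goal_yet * xg)
        p_no_goal_yet *= 1.0 - xg
    return adjusted


# The Bruins example: a 0.33 shot followed by a 0.82 rebound.
bruins_flurry = flurry_adjusted_xgoals([0.33, 0.82])
flurry_total = sum(bruins_flurry)
```

Under this assumption the flurry total equals one minus the chance that no shot in the flurry scores, which is why it can never exceed 1.0.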

We can also calculate the expected goals that are likely to come from a rebound of a shot. This metric is called 'expected goals of expected rebounds' (xGoals of xRebounds). The rebound shot does not need to be taken by the same player. In fact, the rebound does not even need to occur. The shot just needs to have attributes that make it more likely to generate a rebound. As there is a lot of luck in whether a rebound occurs, this metric credits players whose shots are likely to produce rebounds in general.

Expected Goals Of Expected Rebounds = Probability of the Shot Generating a Rebound * The Expected Goals of The Possible Rebound Shot

Some shots actually have a higher xGoals of xRebounds than the xGoals of the shot itself. These are usually shots that occur far from the net by defensemen.

By combining xGoals from non-rebound shots and xGoals of xRebounds, we can create a metric called 'Created Expected Goals'. This metric attempts to give credit to the player who does the work of generating the xGoals. Compared to the xGoals metric, it punishes players who just feed on the rebounds of others' shots. Defensemen tend to do better in this metric than in xGoals, while some centres often do worse. While we cannot always accurately assign credit for 'creating' an xGoal, this metric tries to be fairer than just giving all the credit to the shooter. xGoals from rebounds are given no direct credit in this metric. Rather, credit is given to players who take shots that are likely to generate juicy rebounds.

Created Expected Goals = xGoals of Non-Rebound Shots + xGoals of xRebounds
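The two formulas can be combined in a short sketch. The `Shot` fields and the sample numbers are hypothetical, and the sketch assumes that xGoals of xRebounds is credited for every shot a player takes, which the text does not state explicitly.

```python
from dataclasses import dataclass


@dataclass
class Shot:
    xgoals: float             # regular expected goal value of the shot
    is_rebound: bool          # True if the shot itself was a rebound
    p_rebound: float          # probability the shot generates a rebound
    xgoals_of_rebound: float  # expected goals of the possible rebound shot


def xgoals_of_xrebounds(shot):
    """Expected goals of the rebound this shot is likely to create."""
    return shot.p_rebound * shot.xgoals_of_rebound


def created_xgoals(shots):
    """xGoals of non-rebound shots plus xGoals of xRebounds."""
    non_rebound = sum(s.xgoals for s in shots if not s.is_rebound)
    expected_rebounds = sum(xgoals_of_xrebounds(s) for s in shots)
    return non_rebound + expected_rebounds


# Hypothetical player: one point shot and one rebound tap-in.
shots = [
    Shot(xgoals=0.03, is_rebound=False, p_rebound=0.10, xgoals_of_rebound=0.40),
    Shot(xgoals=0.30, is_rebound=True, p_rebound=0.0, xgoals_of_rebound=0.0),
]
player_created_xgoals = created_xgoals(shots)
```

In this made-up case the point shot is worth more through its likely rebound (0.04) than on its own (0.03), while the tap-in earns no direct credit.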

By running the season simulations for each possible result of a game (regulation win, regulation loss, OT win, or OT loss), we can see the impact on playoff odds in real time as the odds of the different outcomes of the game change.
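One plausible way to combine these pieces is a probability-weighted average over the four outcomes. The text does not spell out the exact calculation, so the sketch below is an assumption, and the function name, dictionary keys, and numbers are all hypothetical.

```python
def live_playoff_odds(outcome_probs, playoff_odds_given_outcome):
    """Weight the simulated playoff odds for each game result by the
    current in-game probability of that result."""
    total = sum(outcome_probs.values())
    if abs(total - 1.0) > 1e-6:
        raise ValueError("outcome probabilities must sum to 1")
    return sum(
        prob * playoff_odds_given_outcome[outcome]
        for outcome, prob in outcome_probs.items()
    )


# Hypothetical mid-game snapshot for one team.
current_probs = {"reg_win": 0.4, "ot_win": 0.1, "ot_loss": 0.1, "reg_loss": 0.4}
simulated_odds = {"reg_win": 0.80, "ot_win": 0.75, "ot_loss": 0.70, "reg_loss": 0.60}
odds_now = live_playoff_odds(current_probs, simulated_odds)
```

Because the per-outcome playoff odds come from simulations run ahead of time, only the outcome probabilities need to be updated as the game unfolds.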

Below is a graph of the percentage of the time that the team considered the favorite ended up winning the game. Before games start, we can predict ~57% of outcomes correctly. As games progress, our confidence gradually rises to ~89% at the end of regulation. For games that go to OT, the outcome is basically a coin flip.

For the 2015-2016 season, the MoneyPuck model had the Penguins as the most likely to win the Stanley Cup from March onwards. This was partly due to them having the likely match-up of the New York Rangers in the first round, which greatly improved their Cup chances.

#Penguins now have highest Cup odds. #LAKings still best team but have harder path to finals https://t.co/Xm8baqGqGI pic.twitter.com/apeEpiTlO7

— MoneyPuck.com (@MoneyPuckdotcom) March 27, 2016

How our playoff odds did this season. Had all the playoff teams right at Christmas except for the #LAKings. Sorry Kings! pic.twitter.com/sXMXE3VDrP

— MoneyPuck.com (@MoneyPuckdotcom) April 9, 2017

A data dictionary which explains all of the columns in the datasets can be downloaded here.

Data is available summarized on the season level and on a game by game level going back to 2008-2009. Season level data is below:

| Year | Skaters | Goalies | Lines/Pairings | Team Level |
|---|---|---|---|---|
| 2008-2009 | | | | |
| 2009-2010 | | | | |
| 2010-2011 | | | | |
| 2011-2012 | | | | |
| 2012-2013 | | | | |
| 2013-2014 | | | | |
| 2014-2015 | | | | |
| 2015-2016 | | | | |
| 2016-2017 | | | | |
| 2017-2018 | | | | |
| 2018-2019 | | | | |

You can also download all game level data for all teams for all seasons in one file here.

| Atlantic | Metro | Central | Pacific |
|---|---|---|---|

A full description of all the variables and more details can be found in the data dictionary. The data is provided as CSV files contained within zip files.

Download the Files Below:

All Past Seasons (2007-2017 Seasons) (1,173,844 Shots)

Recent Seasons (2010-2017 Seasons) (846,683 Shots, Recommended For Excel Users)

2007-2008 Season (106,243 Shots)

2008-2009 Season (110,023 Shots)

2009-2010 Season (110,895 Shots)

2010-2011 Season (111,405 Shots)

2011-2012 Season (108,753 Shots)

2012-2013 Season (66,087 Shots)

2013-2014 Season (110,682 Shots)

2014-2015 Season (109,627 Shots)

2015-2016 Season (109,461 Shots)

2016-2017 Season (110,953 Shots)

2017-2018 Season (119,715 Shots)
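Since each download is a CSV inside a zip archive, the file can be read without unpacking it by hand. The sketch below uses only the Python standard library. The member name `shots_sample.csv`, the column names, and the UTF-8 encoding are assumptions for illustration; check the data dictionary for the real columns.

```python
import csv
import io
import zipfile


def read_first_csv(zip_source):
    """Return the rows of the first CSV file in a zip archive as dicts.

    `zip_source` may be a path to a downloaded zip or a file-like object.
    """
    with zipfile.ZipFile(zip_source) as archive:
        csv_names = [n for n in archive.namelist() if n.lower().endswith(".csv")]
        if not csv_names:
            raise ValueError("no CSV file found in archive")
        with archive.open(csv_names[0]) as raw:
            # Encoding is an assumption; adjust if the file differs.
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            return list(csv.DictReader(text))


# Build a tiny in-memory archive so the example runs without a download.
# The file name and columns here are made up.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("shots_sample.csv", "shotID,xGoal\n1,0.05\n2,0.31\n")
buffer.seek(0)
rows = read_first_csv(buffer)
```

For the larger files, streaming rows instead of building a full list would keep memory use down.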

To determine the optimal time to pull the goalie, we simulate tens of millions of games in scenarios where a team is trailing with 500 seconds left in the game. Before each simulation starts, we decide at what time we'll pull the goalie if the team is still losing at that point. We then calculate the average number of points the trailing team earned based on its strategy (2 points for a regulation win, 1.5 points if the game went to OT, and 0 points for a regulation loss). The goalie pull time at which the trailing team earns the most points on average is taken to be the optimal time to pull the goalie.
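A heavily simplified version of this simulation is sketched below. The per-second scoring probabilities are placeholder values chosen for illustration and are not the rates used by the actual model. The sketch also ignores penalties, home and away differences, and the opponent pulling its own goalie. All names are hypothetical.

```python
import random

# Hypothetical per-second scoring probabilities (NOT from the source).
EVEN_FOR = 0.0007        # trailing team scores, goalie in net
EVEN_AGAINST = 0.0007    # opponent scores, goalie in net
PULLED_FOR = 0.0018      # trailing team scores with the extra attacker
PULLED_AGAINST = 0.0050  # opponent scores into the empty net


def simulate_game(pull_time, deficit=1, seconds_left=500, rng=random):
    """Simulate one game and return the points the trailing team earns."""
    diff = -deficit  # goal differential from the trailing team's view
    for t in range(seconds_left, 0, -1):
        # The goalie is out only while still losing, at or after the pull time.
        pulled = diff < 0 and t <= pull_time
        p_for = PULLED_FOR if pulled else EVEN_FOR
        p_against = PULLED_AGAINST if pulled else EVEN_AGAINST
        r = rng.random()
        if r < p_for:
            diff += 1
        elif r < p_for + p_against:
            diff -= 1
    if diff > 0:
        return 2.0  # regulation win
    if diff == 0:
        return 1.5  # game goes to OT
    return 0.0      # regulation loss


def expected_points(pull_time, n_games=20000, seed=0, **kwargs):
    """Average points earned when pulling the goalie at `pull_time`."""
    rng = random.Random(seed)
    total = sum(simulate_game(pull_time, rng=rng, **kwargs) for _ in range(n_games))
    return total / n_games
```

Calling `expected_points` for a grid of candidate pull times and keeping the one with the highest average mirrors the selection step described above, although far more simulated games than the default here would be needed for a stable answer.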

For example, the graph below shows the points a home team trailing by one goal with 500 seconds left can expect to earn under different goalie pulling strategies. If they pull their goalie immediately, they get 0.40 points on average, and if they never pull their goalie, they get 0.36 points on average. The optimal time is with 231 seconds left in the game, where they get 0.46 points on average.

Here is a summary of the optimal goalie pull times:

Home Team Down By One Goal: 231 Seconds Left (3:51 remaining)

Home Team Down By Two Goals: 329 Seconds Left (5:29 remaining)

Away Team Down By One Goal: 268 Seconds Left (4:28 remaining)

Away Team Down By Two Goals: 427 Seconds Left (7:07 remaining)

Overall, these times are significantly earlier than when most teams pull their goalie, though teams have been getting more aggressive in recent years. Coaches may be discouraged from pulling their goalie this early because the average incremental benefit to the team is small compared to the risk of looking foolish in the likely scenario that the strategy does not work out. Also, teams may not be factoring in the incentive of drawing penalties, which Beaudoin and Swartz found. However, by pulling their goalies just 30 seconds sooner than usual, teams could reap most of the upside of the more aggressive strategy.

The optimal pull time for home teams comes with less time remaining than for away teams, as scoring rates for home teams at 5-on-5 are higher than for away teams. The goalie pull bot also makes slight adjustments depending on the relative strength of the teams going into the game, which usually changes the recommended pull time by only a few seconds. The Pull Bot can be followed at @ThePullBot on Twitter.
