-
THEORETICAL RESULTS
To test the in-game win odds model, I used six years of data to develop the model and six years as a validation set to see how well the model predicted new data. Odd-numbered years (2003, 2005, 2007, etc.) were used to build the model, and the even-numbered years were used as the validation set.
The regression shown here predicts actual game outcomes from the in-game win odds results. As you can see, the p-value indicates statistical significance, and the estimate of 0.9925 indicates a strong positive relationship between the in-game win odds and the actual game outcomes. Specifically, this tells us that a 10-percentage-point increase in the in-game win odds corresponds to a 9.9-percentage-point increase in the team's win probability in the validation set, which is nearly a one-to-one relationship.
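As a rough sketch of what this kind of check looks like, the snippet below regresses 0/1 game outcomes on predicted win probabilities and reports the slope. It assumes a simple least-squares fit, which may not be the exact regression on the slide, and the data are simulated purely for illustration; the function name `calibration_slope` is hypothetical.

```python
import random

def calibration_slope(predicted, outcomes):
    """Least-squares slope and intercept of 0/1 outcomes on predicted win probabilities."""
    n = len(predicted)
    mean_x = sum(predicted) / n
    mean_y = sum(outcomes) / n
    sxx = sum((x - mean_x) ** 2 for x in predicted)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(predicted, outcomes))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Simulated stand-in for a validation set (NOT the real data):
# each play gets a predicted win probability, and the game outcome
# is drawn so that the predictions are perfectly calibrated.
rng = random.Random(0)
predicted = [rng.random() for _ in range(20000)]
outcomes = [1 if rng.random() < p else 0 for p in predicted]

slope, intercept = calibration_slope(predicted, outcomes)
print(f"slope={slope:.4f}, intercept={intercept:.4f}")
```

A slope near 1 with an intercept near 0 is what a well-calibrated model should produce, which is the sense in which 0.9925 is "nearly one-to-one."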
[CLICK]
-
WP of different score-states across time (A)
To further illustrate the results of the in-game win odds model, here is a graph that shows a team's predicted in-game win probability across five different score-states when it has a first down with 10 yards to go at its own 20-yard line, over the course of a game. This situation was chosen because it is the most common situation in football (n=11,423), with nearly four times as many occurrences as the second most common situation (n=3,079).
[CLICK]
-
WP of different score-states across time (B)
The five score-states are leading by two possessions, leading by one possession, tied, down by one possession, and down by two possessions. The maximum number of points a team can score on a single possession is eight, so being up or down by two possessions means a margin of nine or more points, one possession means a margin of one to eight points, and tied means the score margin is zero.
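The score-state definitions above can be sketched as a small function. The function name and the label strings are my own, chosen for illustration; only the point thresholds come from the definitions.

```python
def score_state(margin):
    """Map a score margin (team's points minus opponent's points) to a score-state."""
    if margin >= 9:
        return "up two possessions"
    if margin >= 1:
        return "up one possession"
    if margin == 0:
        return "tied"
    if margin >= -8:
        return "down one possession"
    return "down two possessions"

# A few example margins, including the boundary values 8 and 9.
for m in (14, 9, 8, 3, 0, -1, -8, -9):
    print(m, score_state(m))
```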
This graph behaves exactly as we would expect, which supports the reliability of the in-game win odds model. The five lines never cross: a team up by two possessions always has a higher win probability than a team up by one possession, which is always better off than a team that is tied, and so on.
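One way to check the "lines never cross" property programmatically is sketched below. The win probability values are made up for illustration and are not read off the graph; the function name `lines_never_cross` is hypothetical.

```python
# Score-states ordered from best to worst for the team with the ball.
ORDER = [
    "up two possessions",
    "up one possession",
    "tied",
    "down one possession",
    "down two possessions",
]

def lines_never_cross(curves, order=ORDER):
    """Return True if, at every time point, each state's WP is strictly above the next state's."""
    n_points = len(curves[order[0]])
    for t in range(n_points):
        values = [curves[state][t] for state in order]
        if any(better <= worse for better, worse in zip(values, values[1:])):
            return False
    return True

# Made-up win probabilities at four points in a game (illustration only).
example_curves = {
    "up two possessions": [0.80, 0.88, 0.95, 0.99],
    "up one possession": [0.62, 0.68, 0.75, 0.85],
    "tied": [0.48, 0.48, 0.47, 0.45],
    "down one possession": [0.34, 0.29, 0.22, 0.12],
    "down two possessions": [0.17, 0.10, 0.04, 0.01],
}

print(lines_never_cross(example_curves))
```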
[CLICK]
-
WP reliability test across different validation sets (A)
As I mentioned before, the in-game win odds model was developed using six years of data and validated using a separate six years of data. This was done in four different ways to ensure the best model was selected. The model was built using the first six years of data, the last six years, the odd-numbered years, and the even-numbered years. In each case the validation set consisted of the other six years.
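The four training/validation splits can be sketched as follows. The season range 2003 to 2014 is an assumption made only so the example runs (the notes say just that there are twelve seasons and that 2003, 2005, and 2007 are among the odd years), and the function and key names are hypothetical.

```python
def make_splits(seasons):
    """Return the four (training, validation) season splits described above."""
    seasons = sorted(seasons)
    half = len(seasons) // 2
    first, last = seasons[:half], seasons[half:]
    odd = [y for y in seasons if y % 2 == 1]
    even = [y for y in seasons if y % 2 == 0]
    return {
        "first_six": (first, last),
        "last_six": (last, first),
        "odd_years": (odd, even),
        "even_years": (even, odd),
    }

# ASSUMPTION: twelve consecutive seasons starting in 2003.
seasons = list(range(2003, 2015))
splits = make_splits(seasons)

for name, (train, validate) in splits.items():
    print(name, "train:", train, "validate:", validate)
```

Each split trains on six seasons and validates on the remaining six, so no season is ever used for both.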
[CLICK]
-
WP reliability test across different validation sets (B)
As you can see, all variations yielded similar results in terms of their parameter estimates, standard errors, and p-values. I ended up choosing the odd years for the training set to help negate any underlying effects that could have directly or indirectly affected the coaches' decision-making, such as rule changes and the improved quality of kickers.
The same tests were conducted for the inclusion (or exclusion) of certain variables as well. A further example of this is shown below, where you can see the results for when the model was developed with respect to the home team or with respect to the possession team.
[CLICK]