Thesis Proposal (all)

  1. TITLE FRAME
    Good morning. For those of you who don't know me, my name is Weller Ross, and I am a master's student in sport management, studying under Dr. Kevin Mongeon. Today I am proposing the topic of my thesis research, titled “An Examination of Decision-Making Biases On Fourth Down in the NFL.”

    [CLICK]
  2. OUTLINE
    I will start by discussing the motivation and will then go over the related research before discussing the methodology and reviewing the data set that I will be working with. At that point I will provide some concluding remarks and open it up to questions.

    [CLICK]
  3. MOTIVATION 1A
    When I came to Brock I had some ideas about what I could potentially do for a thesis project and once I started talking with Kevin and learning more from him, that list continued to grow. Basically all of these thoughts and ideas that I had could fall under one umbrella of how to evaluate teams' performance.
  4. MOTIVATION 1B
    Breaking that down a little further, the real question was what impacts a team's ability to win? To answer that question, I needed to think about who the people are that influence the outcome of a game and why teams perform differently. This can be addressed with a wide variety of answers. The first and most obvious answer is the players, who certainly impact the outcome of a game.
  5. MOTIVATION 1C
    What about the refs? They can impact the outcome of a game, and there is certainly research being done on officiating biases, and of course the fans of the losing team will gladly tell you just how much of an impact the refs had on the outcome of a game. Speaking of the fans, do they influence the outcome of a game? They seem to think so and there has been research done on the impact of home field advantage, but does that impact have more to do with the fans or with the team not having to travel and play in an unfamiliar location/climate/stadium/etc?
  6. MOTIVATION 1D
    What about the front office personnel? The owner? The general manager? Team doctors? How about the marketing departments? Event management? Stadium ops? We're getting further and further away from the action (so to speak), but the point I'm trying to illustrate is that players are just a small component of the overall process. There's more to sport analytics than just player evaluation, despite what Brad Pitt and Jonah Hill may have led you to believe in Moneyball. Player evaluation gets the most attention because it's what's “sexy.”
  7. MOTIVATION 1E
    People want to know who got the better deal in a headline trade, whether a team overpaid for its big free agency acquisition, and which players their team should hope to land in the draft, but there is so much more to examine beyond the quality of players. What I will be focusing on is decision-making. Some of you may have noticed that there is a person (or a group of people), a role, that I failed to mention when going through those who could influence the outcome of a game: the coaches.

    [CLICK]
  8. MOTIVATION 2A
    The argument could be made that coaches' decisions have the largest impact on game outcomes. At the very least, most people would agree that the coaches and their decisions would rank pretty high on the list, especially in key moments and high-pressure situations. An example of a situation like this in football is when a coach must decide whether to kick or go for it on fourth down. This is the decision-making process that I will be evaluating.
  9. MOTIVATION 2B
    I will be finding out whether or not coaches in the National Football League succumb to a subconscious psychological bias when making decisions. This bias is called the representativeness heuristic. So what is the representativeness heuristic? It's part of Prospect Theory, which, in short, explains why people make the decisions that they do. Prospect Theory comprises four well-established psychological biases (or heuristics), and the representativeness heuristic focuses on people overweighting new information relative to prior information.
  10. MOTIVATION 2C
    A very simplified example of this is if somebody tells me they are going to flip a coin 10 times and have me make a guess before each flip and no matter what I guess they flip heads. It might be after three heads in a row or five, but eventually I will just start guessing heads. I knew going into it that the probability of a coin landing on heads was 0.5 but the new information made me biased toward picking heads. This is the bias that I will be focusing on.
  11. MOTIVATION 2D
    That example illustrates the representativeness heuristic on a conscious level. I KNEW that I decided to start picking heads because of the recent turn of events. That's why I used it as an example, because with it being on a conscious level, it's something people can easily relate to, but it reflects what happens on a subconscious level all the time, and it regularly impacts the decisions that people make on a daily basis without them even realizing it.
  12. MOTIVATION 2E
    This research will be useful for general managers' evaluations because if general managers and other high-level decision-makers are more aware of this bias then it could aid them in their decisions when choosing to hire and/or fire a coach. Beyond just general managers and high-level decision-makers, representativeness impacts everybody so if HR people are made aware of these issues it could help them as well, because coaches and players are employees too.

    [CLICK]
  13. PREVIOUS RESEARCH 1A
    In the previous literature I read that focused on fourth down decision-making in the NFL, all of the researchers came to the same conclusion: that coaches make suboptimal decisions by acting too conservatively and opting to kick on fourth down more often than they should.
  14. PREVIOUS RESEARCH 1B
    They speculated that coaches could be profit-maximizing rather than win-maximizing, or that they might just be systematically imperfect maximizers, or that they might prefer to lose as a result of playing it safe rather than lose from the result of taking a gamble.
  15. PREVIOUS RESEARCH 1C
    The problem was that none of the researchers were able to figure out why coaches were making suboptimal decisions. Instead, they had to make inferences from the contexts they analyzed rather than being able to directly test for specific causes.

    [CLICK]
  16. PREVIOUS RESEARCH 2A
    I included a few quotes that I thought best captured this point. The first is from David Romer when he stated:

    “there is little evidence about whether conservative behaviors arise because individuals have nonstandard objective functions or because they are imperfect maximizers.”

    He's saying they act conservatively but there's little evidence as to WHY this is happening.
  17. PREVIOUS RESEARCH 2B
    Carter & Machol wrote:

    “We believe the reason for this paradox is that coaches do not have sufficient intuitive feel for the negative value imposed on the opposition.”

    Notice that they say that they "BELIEVE" this is the reason, but they weren't able to directly test for it.

    [CLICK]
  18. PREVIOUS RESEARCH 3A
    These two quotes are both from Soham Patel, who wrote:

    “individuals are more sensitive to losses than to gains. In football terms, a coach might find the disutility of a play allowing the opponent to score points surpasses the utility of a play that allows his own team to score.”
  19. PREVIOUS RESEARCH 3B
    He later states:

    “coaches might value losses of a play higher than they would the corresponding gains of a play”

    Again, he says they "MIGHT" be doing this. Patel actually mentions Prospect Theory specifically in his paper, but he states it as a quote-“potential” reason for the suboptimal decisions.
  20. PREVIOUS RESEARCH 3C
    All of the previous researchers were handcuffed by their methodologies, which allowed them to determine whether or not coaches were making optimal decisions but limited them to guessing why coaches were making those suboptimal decisions. They were, of course, educated and well-informed guesses, but guesses nonetheless.

    [CLICK]
  21. THEORETICAL MODEL 1A
    For my research, I will use a Bayesian approach, which provides the flexibility to keep the prior odds separate from the conditional likelihood. This will make it possible to measure how much weight the coaches are giving new information compared to the original information when making decisions, in turn allowing me to directly test for the representativeness heuristic.
  22. THEORETICAL MODEL 1B
    This is Bayes' rule in odds form.

    On the left hand side of the equation we have what is called the posterior, which in this circumstance is the in-game odds of team C winning the game given the fourth down decision that the coach made.
  23. THEORETICAL MODEL 1C
    The first term on the right hand side is called the prior, which is the odds of team C winning the game given the game-state, while the second term is called the conditional likelihood, which is the inverse conditional odds of team C making that decision given that they won the game.

    In other words, these two components are the original information (the coin having a 50/50 chance of landing on heads) and the new information (the fact that the first five tosses happened to land on heads).
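
    For reference, a minimal sketch of the odds form described on this frame. The slide's equation is not reproduced in this script, so the notation is assumed: W and L denote team C winning or losing, D the fourth down decision, and S the game-state.

```latex
\underbrace{\frac{P(W \mid D, S)}{P(L \mid D, S)}}_{\text{posterior odds}}
  \;=\;
\underbrace{\frac{P(W \mid S)}{P(L \mid S)}}_{\text{prior odds}}
  \times
\underbrace{\frac{P(D \mid W, S)}{P(D \mid L, S)}}_{\text{conditional likelihood}}
```

    The first term is the original information and the second term is the new information.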
  24. THEORETICAL MODEL 1D
    It's the second term on the right-hand side (the conditional likelihood) that can be hard to wrap your mind around, because it's rather counter-intuitive. You aren't looking at the odds of winning the game, you're looking at the odds of that decision being made GIVEN that they actually win the game.

    [pause here – let it sink in]

    So you take teams that have been in similar situations in other games and find how likely (or unlikely) they were to make that decision given that they did or didn't win the game.
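
    A small worked example may help here. This is only a sketch with made-up numbers (a 0.6 prior win probability, and a decision made 30% of the time by eventual winners and 20% of the time by eventual losers); the function names are hypothetical and none of the values come from the data set.

```python
def odds(p):
    """Convert a probability into odds (p against 1 - p)."""
    return p / (1.0 - p)


def prob(o):
    """Convert odds back into a probability."""
    return o / (1.0 + o)


def posterior_odds(prior_odds, p_decision_given_win, p_decision_given_loss):
    """Bayes' rule in odds form: prior odds times the likelihood ratio."""
    likelihood_ratio = p_decision_given_win / p_decision_given_loss
    return prior_odds * likelihood_ratio


# Assumed, illustrative inputs (not estimates from the thesis data).
prior = odds(0.6)  # odds of winning given the game-state
post = posterior_odds(prior, p_decision_given_win=0.3, p_decision_given_loss=0.2)
post_prob = prob(post)  # posterior win probability implied by the odds
```

    Because eventual winners made this decision more often than eventual losers in the toy numbers, the likelihood ratio is above one and the posterior odds rise above the prior odds.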

    [CLICK]
  25. EMPIRICAL MODEL 1A
    In this frame, we have the equation for testing the hypothesis. In this equation, Beta-One represents the parameter estimate for the prior information while Beta-Two represents the parameter estimate for the new information. To test for representativeness, we will compare those two numbers.
  26. EMPIRICAL MODEL 1B
    As you can see down here, the null hypothesis is that coaches are equally weighting the two components and are therefore not guilty of the representativeness heuristic. The first alternative hypothesis is that they are under-weighting new information, while the second alternative hypothesis is that they are over-weighting new information, which indicates the presence of the representativeness heuristic in their decision-making process.
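
    The equation and hypotheses on this frame are not reproduced in the script. One plausible way to write them, assuming a log-linear version of Bayes' rule in odds form (the notation W, L, D, S for win, loss, decision, and game-state, and the intercept and error term, are assumptions):

```latex
\log\frac{P(W \mid D, S)}{P(L \mid D, S)}
  = \beta_0
  + \beta_1 \log\frac{P(W \mid S)}{P(L \mid S)}
  + \beta_2 \log\frac{P(D \mid W, S)}{P(D \mid L, S)}
  + \varepsilon

H_0:\ \beta_1 = \beta_2
\qquad
H_{A1}:\ \beta_2 < \beta_1
\qquad
H_{A2}:\ \beta_2 > \beta_1
```

    Taking logs of Bayes' rule exactly gives coefficients of one on both terms, so unequal estimates indicate that one source of information is being weighted more heavily than the other.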

    [CLICK]
  27. PRIOR PRE-DECISION PROBABILITY 1A
    If this frame looks familiar, it's because we are again utilizing Bayes' rule, but this time the posterior is the PRE-DECISION odds of team C winning the game given the game-state, which is shown on the left-hand side of the equation.
  28. PRIOR PRE-DECISION PROBABILITY 1B
    On the right-hand side we again have the prior and conditional likelihood (original and new information). The first term is literally the prior (pre-game) odds of team C winning the game while the second term is the odds of team C being in that game-state given that they ended up winning the game.
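
    A sketch of this second application of Bayes' rule, again with assumed notation since the slide's equation is not in the script: W and L denote team C winning or losing, and S the game-state.

```latex
\underbrace{\frac{P(W \mid S)}{P(L \mid S)}}_{\text{pre-decision odds}}
  \;=\;
\underbrace{\frac{P(W)}{P(L)}}_{\text{pre-game odds}}
  \times
\underbrace{\frac{P(S \mid W)}{P(S \mid L)}}_{\text{odds of the game-state given the outcome}}
```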
  29. PRIOR PRE-DECISION PROBABILITY 1C
    The next step is to calculate these two components. Down here you can see that to estimate the first component (the prior) we will be using a logistic regression to predict game outcomes from closing point spreads.
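
    To make the first component concrete, here is a minimal sketch of a one-variable logistic regression of game outcome on closing point spread, fitted by Newton's method in plain Python. The toy data are generated from an assumed slope of -0.135 per point (negative spreads mean the team is favored); that value and all names are illustrative assumptions, not estimates from the thesis data.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, iterations=25):
    """Fit P(win) = sigmoid(b0 + b1 * spread) by Newton-Raphson."""
    b0 = b1 = 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = sigmoid(b0 + b1 * x)
            w = p * (1.0 - p)
            g0 += y - p            # gradient of the log-likelihood
            g1 += (y - p) * x
            h00 += w               # entries of X'WX (negative Hessian)
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Toy data: 100 hypothetical games at each whole-point spread from -14 to +14,
# with the number of wins set by the assumed slope of -0.135.
spreads, outcomes = [], []
for s in range(-14, 15):
    wins = round(100 * sigmoid(-0.135 * s))
    spreads += [s] * 100
    outcomes += [1] * wins + [0] * (100 - wins)

intercept, slope = fit_logistic(spreads, outcomes)


def win_probability(spread):
    """Estimated pre-game win probability for a given closing spread."""
    return sigmoid(intercept + slope * spread)
```

    Calling `win_probability(-3)` on the toy fit gives the estimated chance that a three-point favorite wins; converting that to odds gives the prior term.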
  30. PRIOR PRE-DECISION PROBABILITY 1D
    For the second component we will use a multinomial logistic regression to predict the probability of the team being in that situation given that they won the game. In this case X-Prime represents the group of variables that make up the game-state: the score margin, time remaining in the game, field position with respect to the offensive team, current down, yards to go to gain a new set of downs, and an indicator variable for possession with respect to the home team.
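
    As a sketch of how the game-state variables in X-Prime might be held together before being passed to a regression, the six variables can be grouped in one record. The field names, units, and the `features` helper are hypothetical; this shows only the data structure, not the multinomial logistic regression itself.

```python
from dataclasses import dataclass, astuple


@dataclass(frozen=True)
class GameState:
    """One fourth-down situation; names and units are assumed for illustration."""
    score_margin: int          # offense's points minus opponent's points
    seconds_remaining: int     # time left in the game
    yards_to_goal: int         # field position relative to the offense
    down: int                  # current down (1 through 4)
    yards_to_go: int           # yards needed for a new set of downs
    home_has_possession: bool  # indicator for possession by the home team

    def features(self):
        """Return the state as a list of floats for use as regressors."""
        return [float(v) for v in astuple(self)]


# Hypothetical example: home team trailing by 3, fourth-and-2 at the opponent's 35.
example = GameState(-3, 120, 35, 4, 2, True)
row = example.features()
```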

    [CLICK]
  31. GRAPH (WIN PROB from POINT SPREAD)
    This graph illustrates the results from the equation used to estimate the prior. On the y-axis we have the estimated probability that the team will win the game given the closing point spread, which is on the x-axis.

    Basically, I'm using this to tell us the probability of a team winning a game based on what the closing point spread was. For example if we have a team that is favored by three points (-3), we can see that the probability that they win the game is almost exactly 0.6 or a 60% chance.

    [CLICK]
  32. 3D GRAPH (WINNERS)
    These next two frames show 3D graphs to illustrate the number of times a score margin occurred across the minutes remaining in a game, given that the home team won or lost the game. This first one is for teams that won. So you have minutes remaining on the x-axis with the score margin on the y-axis and the number of occurrences on the z-axis.

    Notice how it spreads to the right where the positive score-margins are as the time remaining in the game gets closer to zero.

    [CLICK]
  33. 3D GRAPH (LOSERS)
    You can see that this graph does the opposite by spreading to the negative score-margins as time goes on. That's because this graph is the exact same as the previous one but is for teams that lost rather than won.

    These graphs aren't actually showing the results of the conditional likelihood equation, because in reality, a graph representing that equation would require seven dimensions instead of just three. I generated these graphs because I wanted to at least give you an idea of what the spread of the data looks like with respect to two of the variables across wins and losses.
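
    The counts behind graphs like these can be tallied in a few lines. This is only a sketch on made-up rows of (minutes remaining, score margin, whether the team won); the rows and the `tally` helper are hypothetical and are not drawn from the data set.

```python
from collections import Counter

# Toy observations: (minutes_remaining, score_margin, team_won)
plays = [
    (58, 0, True), (58, 0, False),
    (30, 7, True), (30, -7, False),
    (2, 10, True), (2, 10, True), (2, -10, False),
]


def tally(rows, won):
    """Count occurrences of each (minutes, margin) pair for winners or losers."""
    return Counter((minutes, margin) for minutes, margin, w in rows if w == won)


winner_counts = tally(plays, won=True)   # heights for the winners' graph
loser_counts = tally(plays, won=False)   # heights for the losers' graph
```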

    [CLICK]
  34. DATA 1A
    Speaking of the data, these graphs and the research I am conducting are all based on a data set composed of readily available NFL play-by-play data for every regular season game played in the past 13 years, other than this most recent season, because I started the research before that season began.
  35. DATA 1B
    That gives us data for every year from 2003 through 2014. Those 12 seasons provided every play from 3,072 games for a total of 468,699 observations.
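
    As a quick check on those totals, the game count follows from the league format in those seasons (32 teams playing 16 regular season games each, with two teams per game). The variable names below are just for illustration.

```python
SEASONS = range(2003, 2015)   # 2003 through 2014 inclusive
TEAMS = 32
GAMES_PER_TEAM = 16

games_per_season = TEAMS * GAMES_PER_TEAM // 2   # each game involves two teams
total_games = len(SEASONS) * games_per_season

observations = 468_699                           # total stated in the script
observations_per_game = observations / total_games
```

    That works out to a little over 150 observations per game.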

    [CLICK]
  36. NEXT STEPS
    The next steps for this research are to generate the estimates for the conditional likelihood regressions and the hypothesis test; to consider potential extensions of this study, such as looking at individual coaches or specific characteristics of coaches (like how many years of experience they have); and to put the findings into a format that is easier for teams to consume and implement.

    Thank you for listening and I will now open it up to questions.

    [CLICK]
Author
wellerross