Good morning dear readers (i.e., dummies in my league and maybe one other person who found this by mistake),
Ever wondered whether the day of the week has an effect on Fantasy Football data? I know I have! I’m going to chop up the data a little here and see if we can find anything interesting. I’ll look first at whether day and/or home vs. away has any effect on team score, then I’ll look for any effect on fantasy DEF scores.
[Note that this is just 2015 (weeks 1 through 17) and 2016 (weeks 1 through 9) with the one Saturday (Christmas 2015) game removed. It includes the 3 Thanksgiving games, so that’s 3 out of 27 Thursday games in the last year and a half. It might throw things off a little if Thanksgiving is indeed a weird day, but I’d doubt I have enough statistics with just those 3 points to argue one way or another.]
Team score by day of the week
Above is the team score (not fantasy points, reality points) by day of the week. The means of the 3 days are slightly different (Thu = 22.6, Sun = 23.1, Mon = 20.4) so let’s do a quick 1-way ANOVA to see whether any of these are statistically different.
## Analysis of Variance Table
##
## Response: Score
##            Df Sum Sq Mean Sq F value Pr(>F)
## day         2    345 172.307  1.8615 0.1561
## Residuals 773  71550  92.562
The 1-way ANOVA test tells you whether the mean of at least one of your levels (i.e., Sun, Mon, or Thu) is significantly different from the others. Even though the mean score of Monday games is about 2.5 points lower than Sun or Thu games, that p-value of 0.1561 makes me think it’s not statistically significant. It also makes me realize I really need to nail down a significance threshold. Some scientific fields (usually ones with living subjects and smaller data sets) tend to use looser thresholds, so in that case a p-value below 0.2 might be considered significant, or at least it might be worth a further look.
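For anyone who wants to see where that F value comes from, here is a minimal sketch of the 1-way ANOVA arithmetic. It is in plain Python rather than the R used for this post, the function name `one_way_anova` is my own, and the scores in `toy_scores` are made up purely for illustration (they are not from the data set). It stops at the F value because turning F into a p-value needs an F-distribution, which the standard library doesn't provide.

```python
from statistics import fmean


def one_way_anova(groups):
    """Return (df_between, df_within, F) for a dict of label -> list of scores."""
    all_scores = [x for scores in groups.values() for x in scores]
    grand_mean = fmean(all_scores)

    # Between-group sum of squares: how far each group mean sits from the grand mean.
    ss_between = sum(
        len(scores) * (fmean(scores) - grand_mean) ** 2
        for scores in groups.values()
    )
    # Within-group sum of squares: spread of the scores around their own group mean.
    ss_within = sum(
        (x - fmean(scores)) ** 2
        for scores in groups.values()
        for x in scores
    )

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return df_between, df_within, f_value


# Made-up scores, only to show the call; NOT the real 2015/2016 data.
toy_scores = {
    "Thu": [20, 24, 27, 17],
    "Sun": [23, 30, 26, 13, 31],
    "Mon": [14, 21, 25, 19],
}
print(one_way_anova(toy_scores))

# Sanity check against the table above: F is just Mean Sq (day) / Mean Sq (Residuals).
print(round(172.307 / 92.562, 4))
```

The last line is the useful part for reading the table: the two Mean Sq entries are all you need to recover the reported F value of 1.8615.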
Team score by Home/Away
##  Welch Two Sample t-test
##
## data:  filter(data_Def, HomeAway == "Home")$Score and filter(data_Def, HomeAway == "Away")$Score
## t = 2.7737, df = 773.65, p-value = 0.005676
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.5581635 3.2614241
## sample estimates:
## mean of x mean of y
##  23.86598  21.95619
KABOOM! Finally, a textbook example of statistical significance with no caveats! Teams score about 2 points less in away games than in home games (p = 0.0057). I showed in another post that this doesn’t have a significant effect on the DEF score, but it is interesting.
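If you're curious what the Welch test is doing under the hood, the sketch below computes the t statistic and the Welch-Satterthwaite degrees of freedom from two samples. Again this is plain Python rather than R, `welch_t` is a name I made up, and the two sample lists are invented for illustration. The p-value at the end uses a normal approximation, which is only an assumption that works reasonably when df is large (as it is here, about 774); it is not what R's `t.test` does, and it comes out a little below the reported 0.005676 because the normal curve has thinner tails than the t-distribution.

```python
from statistics import NormalDist, fmean, variance


def welch_t(x, y):
    """Return (t, df) for Welch's unequal-variance two-sample t-test."""
    # Squared standard error contributed by each sample.
    vx = variance(x) / len(x)
    vy = variance(y) / len(y)
    t = (fmean(x) - fmean(y)) / (vx + vy) ** 0.5
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df


# Invented home/away scores, only to show the call; NOT the real data.
home = [27, 31, 17, 24, 20, 34]
away = [21, 13, 24, 16, 27, 10]
print(welch_t(home, away))

# Large-df shortcut for the reported t = 2.7737: two-sided p from the normal curve.
reported_t = 2.7737
approx_p = 2 * (1 - NormalDist().cdf(reported_t))
print(approx_p)
```

You can also read the effect size straight off the output above: 23.86598 - 21.95619 is about 1.91 points, which sits at the center of the reported confidence interval.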
Team score by day of the week and home vs. away
Huh. Here I’ve split by home vs. away for all 3 days (Sun, Mon, Thu). On all 3 days the home team is a little more likely to have a higher score, but on Monday and Thursday this effect is increased, making the home team that much more likely to win on those days. I guess that’s why the expression starts with “any given Sunday…” and not “any given Thursday or Monday…”, right?
Definitely interesting in terms of predicting scores and which team will win, but what effect does the day of the week have on fantasy DEF scores?
DEF score by day of the week
None. No effect.
The mean DEF score for all 3 days is almost exactly the same. I’m not even bothering with an ANOVA test here given how close all the means are. There’s a hint of a higher peak on Monday than Sunday which probably leads to an increase in anecdotal evidence, but the statistical difference is negligible to nonexistent.
This is telling me that there is no effect of day of the week on fantasy DEF score. Definitely a helpful conclusion, as it might let me leave another potential factor out of my models. However, just a minute ago we found that on Monday and Thursday the split between home and away scores increased. Let’s try that again with fantasy DEF scores:
DEF score by day and home/away
Sure enough, it looks like on Mondays and Thursdays team defenses might score about a point more when at home. Keep in mind, though, that these numbers come from smaller data sets (about 25 games each) and are likely not significant. In fact, we have a way to test for significance:
t-test for Thursday:
##
##  Welch Two Sample t-test
##
## data:  ThuHome and ThuAway
## t = 0.69595, df = 48.065, p-value = 0.4898
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -2.098816  4.321038
## sample estimates:
## mean of x mean of y
##  7.222222  6.111111
t-test for Monday:
##
##  Welch Two Sample t-test
##
## data:  MonHome and MonAway
## t = 1.0932, df = 45.749, p-value = 0.28
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -1.380237  4.660237
## sample estimates:
## mean of x mean of y
##      7.68      6.04
So the t-test from Thursday reveals a difference in the means of home and away DEF points of 1.11 with a p-value of 0.49. The t-test from the Monday data has a slightly higher effect size (1.64 points) and a slightly lower (more significant) p-value (p = 0.28), but I’d still hesitate to call either significant, especially once we correct for the family-wise error rate (FWER). Remember that we’re looking for lower p-values to show that there is actually a difference. If I just give up and admit that football players are people and not just weighted random number generators, I’d probably accept a p-value of 0.2 for significance. A conservative correction for FWER (Bonferroni correction) would then say that I have to be lower than 0.2/3 or 0.067, which I’m very clearly not.
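The Bonferroni arithmetic is simple enough to spell out. This is a small Python sketch (not the R used for the post) with a helper name, `bonferroni_threshold`, that I made up. The 0.2 threshold and the divisor of 3 are the ones from the paragraph above, and the p-values are copied from the two t-test outputs.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


alpha = 0.2    # the loose threshold discussed in the post
n_tests = 3    # divisor used in the post
p_values = {"Thu": 0.4898, "Mon": 0.28}  # from the Welch t-test outputs above

threshold = bonferroni_threshold(alpha, n_tests)
print(f"corrected threshold: {threshold:.3f}")

for day, p in p_values.items():
    verdict = "significant" if p < threshold else "not significant"
    print(f"{day}: p = {p} -> {verdict}")
```

Neither day clears even the uncorrected 0.2 bar, so the correction only widens a gap that was already there.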
Teams in real life tend to do a little better at home, and this effect is increased a little for Thursday and Monday games. It might have something to do with players only getting 4 days between games, but that only explains Thursday while on Mondays they would actually have an extra day of rest. It may also be related to the effect of throwing off the players’ schedules. I remember reading something about there being an even larger difference when the travel is from East coast to West coast or vice versa, but I didn’t actually see the analysis of that data. I wonder if I could do that analysis, actually. I could bin teams into their time zones and see what happens when they play away games in other time zones…
Dammit Kevin! Focus!
As for fantasy football team defenses, however, I really don’t see a significant effect of day of the week. There’s maybe a little something to defenses doing better at home on Thursday or Monday games, but with only about 25 games per day the difference isn’t statistically significant.
Because I’m concerned about overfitting I doubt I’ll include day of the week in future models. Neither the day of the week term nor the home/away term alone would actually be of any use, but if I included the day:HomeAway interaction term in the model I might tease out a little more accuracy for Monday or Thursday games. Then again, I might also just overfit. When I ultimately try to add this term to future models (I know I just said I wouldn’t use this term, but I’m definitely going to try it out sometime. Don’t you judge me) I’ll have to be very careful to make sure that the effect size is large enough to warrant the addition.
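To make "interaction term" a bit more concrete: an interaction between day and home/away just means the home-minus-away gap is allowed to differ from day to day. The Python sketch below (not R, and not a fitted model) computes that gap per day from raw rows. The function name `home_away_gap_by_day`, the row layout, and every score in `toy_rows` are assumptions of mine for illustration only; it also assumes every day has at least one Home and one Away row, and would raise an error otherwise.

```python
from collections import defaultdict
from statistics import fmean


def home_away_gap_by_day(rows):
    """rows: iterable of (day, home_away, score). Returns {day: mean(Home) - mean(Away)}."""
    cells = defaultdict(list)
    for day, home_away, score in rows:
        cells[(day, home_away)].append(score)

    days = sorted({day for day, _ in cells})
    return {
        day: fmean(cells[(day, "Home")]) - fmean(cells[(day, "Away")])
        for day in days
    }


# Invented DEF scores, only to show the shape of the input; NOT the real data.
toy_rows = [
    ("Sun", "Home", 7), ("Sun", "Away", 6), ("Sun", "Home", 5), ("Sun", "Away", 7),
    ("Mon", "Home", 9), ("Mon", "Away", 5), ("Mon", "Home", 8), ("Mon", "Away", 6),
    ("Thu", "Home", 8), ("Thu", "Away", 6), ("Thu", "Home", 6), ("Thu", "Away", 6),
]
gaps = home_away_gap_by_day(toy_rows)
print(gaps)
```

If the gaps come out roughly equal across days, the interaction term has nothing to add; the worry in this post is that with so few Monday and Thursday games, unequal gaps could easily be noise.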
Some of these arguments about significance might be solved if I had more data, but since there are rule, strategy, and even equipment changes from year to year who knows what I’d find. 2015 and 2016 will have to suffice for now, though I’d love to pick up 2014 as well some time. In theory I’ll be writing a Methodology post one of these days to talk about how I got all this data and why I haven’t taken a few hours to collect and then tidy up 2014. Again, if anyone knows a good way to get a lot of fantasy football (or better yet, real football) data I’d love to know.
X’s and O’s,