Data scientist, physicist, and fantasy football champion

DEF Home vs Away: Does it matter?

DEF: Home vs Away

Kevin

November 12, 2016

Hello dataphiles (dataphiliacs?),

I haven’t posted much about how I’m actually collecting the data or scoring it or anything, but I’ll get to that later. What I want to talk about right now is whether a team defense being home or away has any effect on its weekly fantasy score. Conventional wisdom says that it does and that teams do better at home. But is that true and how do we tell?

One way to tell is by actually comparing the data for the past year and a half and seeing if there are any consistent differences between playing at home or away. To this end I have collected football scoring data from the 2015 and 2016 seasons. As an aside: if anyone knows a fast, easy, cheap way to get more data, please let me know. I’ve been using import.io and various statistics sites, and it’s fine but it’s still a little time consuming. I could write something myself, but that would take even more time.

Let’s start first with the quickest cut of the data that you can do.

 

Well that’s… what is that? Those really look almost exactly the same to me. There are a few Home outliers higher than the Away ones, but that’s about it. Then again, “those look similar” isn’t super scientific. So let’s talk about what tests we can do to determine if two distributions (or means) are the same.

A very common test is the t-test. It tests whether the means of two distributions are the same. Imagine that you had two normal distributions, x and y, each with 40 points with the given means (vertical lines):

 

Obviously the two lines aren’t on top of each other and the means are different, but are they statistically significantly different? By which I mean could the difference in means just be due to random noise? To answer that we use the t-test.

## 
##  Welch Two Sample t-test
## 
## data:  filter(normdf, dist == "x")$val and filter(normdf, dist == "y")$val
## t = -2.6061, df = 77.855, p-value = 0.01097
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.9317785 -0.1247028
## sample estimates:
##  mean of x  mean of y 
## 0.09202618 0.62026684

That code output is from R (awesome statistical software that does most of this work for me). It says that the mean of distribution x is 0.092 and the mean of y is 0.62. The relevant number that I want you to look at is the “p-value”; here it’s p = 0.01097. The p-value is the probability that you would see a difference in means at least that large if the two samples (x and y) actually came from the same distribution. A low p-value means that a gap this big would be unlikely (about a 1% chance here) if x and y really were the same. Therefore, we conclude that they are actually different.
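If you want to see what is going on under the hood, here is a minimal sketch in plain Python (not the R code behind the output above). `welch_t` computes the Welch t statistic and its approximate degrees of freedom from two samples, and `two_sided_p` turns a t and df into a two-sided p-value by numerically integrating the t distribution. The function names and the Simpson's-rule integration are my own illustrative choices; R uses its own routines for this.

```python
import math

def welch_t(x, y):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    nx, ny = len(x), len(y)
    mean_x, mean_y = sum(x) / nx, sum(y) / ny
    var_x = sum((v - mean_x) ** 2 for v in x) / (nx - 1)
    var_y = sum((v - mean_y) ** 2 for v in y) / (ny - 1)
    se2_x, se2_y = var_x / nx, var_y / ny  # squared standard errors of each mean
    t = (mean_x - mean_y) / math.sqrt(se2_x + se2_y)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (se2_x + se2_y) ** 2 / (se2_x ** 2 / (nx - 1) + se2_y ** 2 / (ny - 1))
    return t, df

def t_pdf(t, df):
    """Density of Student's t distribution (log-gamma avoids overflow for large df)."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(t * t / df))

def two_sided_p(t, df, steps=2000):
    """Two-sided p-value via Simpson's rule on [0, |t|]; steps must be even."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * total * h / 3  # probability that T lands between -|t| and |t|
    return max(0.0, 1.0 - central_area)

# Plug in the t and df that R reported for the 40-point example.
p_40_points = two_sided_p(-2.6061, 77.855)
print(round(p_40_points, 4))
```

Feeding in the reported t and df should land very close to the p-value R printed, which is a nice sanity check that the p-value really is just the tail area of the t distribution.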

This takes a few steps, but it’s worth it. If I repeated the experiment with fewer points, let’s say 10 for each distribution, we would get the following:

 

## 
##  Welch Two Sample t-test
## 
## data:  filter(normdf, dist == "x")$val and filter(normdf, dist == "y")$val
## t = -1.4727, df = 16.469, p-value = 0.1597
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -1.5022169  0.2689325
## sample estimates:
## mean of x mean of y 
## 0.1322028 0.7488450
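One way to get a feel for a p-value like that is to brute-force it. The sketch below (plain Python, standard library only) pulls 20 values from a single normal distribution, splits them into two sets of 10, and counts how often the two means end up at least as far apart as the 10-point example. Assumptions to flag: I am using a standard deviation of 1 for the simulated distribution (the real value isn't stated here), and it compares raw gaps in the means rather than t statistics, so it should land in the same neighborhood as the t-test's p-value without matching it exactly. The seed and trial count are arbitrary.

```python
import random

def split_mean_gap(rng, n_per_group=10):
    """Draw 2*n values from one N(0, 1) distribution and return the gap between the two halves' means."""
    values = [rng.gauss(0.0, 1.0) for _ in range(2 * n_per_group)]
    first, second = values[:n_per_group], values[n_per_group:]
    return sum(first) / n_per_group - sum(second) / n_per_group

rng = random.Random(2016)               # fixed seed so the run is repeatable
observed_gap = 0.7488450 - 0.1322028    # mean(y) - mean(x) from the 10-point output
trials = 20000

# Count the trials where pure noise produces a gap at least as large as the observed one.
hits = sum(abs(split_mean_gap(rng)) >= observed_gap for _ in range(trials))
fraction = hits / trials
print(f"fraction of random splits with a gap this big: {fraction:.3f}")
```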

Okay, the difference in the means is still about the same size (mean(x) = 0.13, mean(y) = 0.75), but now the p-value is much higher (p = 0.1597). So the odds that you could just randomly pull 20 values from the same distribution, split them into two sets of 10, and get a difference at least as big as the one above are about 16%. That’s not great. In science we commonly look for p-values below 0.05 to judge statistical significance, meaning that a difference that large would show up less than 5% of the time if the means were actually the same. Let’s revisit that Home/Away data set again:

 

## 
##  Welch Two Sample t-test
## 
## data:  filter(data_Def, HomeAway == "Home")$Totalfpts.number and filter(data_Def, HomeAway == "Away")$Totalfpts.number
## t = 0.73422, df = 775.99, p-value = 0.463
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.5248915  1.1521409
## sample estimates:
## mean of x mean of y 
##  6.889460  6.575835

The means are on top of each other and the p-value is 0.463. Somewhat against conventional wisdom, there is no statistically significant difference between teams playing at home or away. Caveats: remember that these are fantasy football scores that we’re talking about. Teams may be more likely to win at home or just generally “play better”, but as far as fantasy scores are concerned, overall there isn’t much of a difference.

“But wait Kevin,” you say, “I love team [X] and they always play better at home. Teams [32 - X] are all a mess, but team [X] on its own isn’t. What about them?” Well Hypothetical Football Fan, let’s break it out a little more:

 

(First of all, shout out to ggplot2, a package for R that makes these beautiful figures so easily. 4 lines of code for that figure! And the colors, Duke, the colors!)

OK, now we’re on to something here. It looks like DAL, SEA, and DEN might actually play a little better away while NYJ and to a lesser degree LA and CLE might play better at home. If Hypothetical Football Fan was talking about the Jets then he/she may have a point. Let’s look at those p-values again:

##    Teams       pval signif0.1
## 1    DAL 0.01365525      TRUE
## 2    NYJ 0.04591227      TRUE
## 3    JAC 0.06261760      TRUE
## 4    SEA 0.07438312      TRUE
## 5    OAK 0.12329439     FALSE
## 6    ATL 0.14402383     FALSE
## 7     SF 0.20161027     FALSE
## 8    DEN 0.21359620     FALSE
## 9    PIT 0.21943126     FALSE
## 10   CLE 0.27745805     FALSE
## 11    TB 0.32464553     FALSE
## 12   HOU 0.35624119     FALSE
## 13   BAL 0.40248681     FALSE
## 14   IND 0.44427961     FALSE
## 15    LA 0.45435839     FALSE
## 16   WAS 0.47550466     FALSE
## 17   CIN 0.48891861     FALSE
## 18   MIN 0.48913982     FALSE
## 19    KC 0.51300888     FALSE
## 20   TEN 0.53812159     FALSE
## 21    NE 0.61239688     FALSE
## 22   BUF 0.63870379     FALSE
## 23   ARI 0.64745226     FALSE
## 24   CAR 0.67633961     FALSE
## 25    NO 0.71138942     FALSE
## 26   DET 0.71472802     FALSE
## 27    GB 0.72155855     FALSE
## 28   PHI 0.76834697     FALSE
## 29   NYG 0.77276710     FALSE
## 30   MIA 0.90139795     FALSE
## 31    SD 0.92565653     FALSE
## 32   CHI 0.95271840     FALSE

So if we define significance as p values less than 0.1 (let’s say 0.1 because we’re dealing with humans and because I said so, dammit) then DAL and NYJ both have p-values less than 0.05 and JAC and SEA both less than 0.1. They may actually play better or worse at home, right?

BUT WAIT! There’s another holdup (because of course there is). Remember that a significance threshold of p = 0.1 means that if you broke up a data set with no real effect into smaller groups, you could expect about 10% of them to come out statistically significant just by chance (illustrated wonderfully in this XKCD comic). This is the multiple comparisons problem. The chance of getting at least one such false positive across all the tests is called the family-wise error rate (FWER), and we need to take it into account before we start declaring things significant all willy-nilly.
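To put rough numbers on that for 32 teams, here is a back-of-the-envelope sketch in plain Python. It assumes the 32 tests are independent and that no team has a real home/away effect, both of which are simplifications of mine rather than anything established from the data.

```python
alpha = 0.1     # per-test significance threshold
n_tests = 32    # one test per team

# Expected number of teams flagged "significant" purely by chance.
expected_false_positives = alpha * n_tests

# Chance that at least one team gets flagged by chance (the family-wise error rate).
fwer = 1 - (1 - alpha) ** n_tests

print(f"expected false positives: {expected_false_positives:.1f}")
print(f"chance of at least one:   {fwer:.3f}")
```

Under those assumptions you would expect around three teams to look significant by luck alone, so four teams under 0.1 is not much to get excited about.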

I’m losing steam here, so let’s just look at two corrections: Bonferroni and Benjamini-Hochberg (B-H). In the Bonferroni correction you take the significance threshold (let’s say 0.1) and divide it by the number of groups (32 teams here) to get the new threshold, p = 0.0031. You can also try the B-H correction, where you put the p-values in order from smallest to largest and scale the threshold by rank: the k-th smallest p-value is compared against (k/32) × 0.1.
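Both corrections are short enough to write out by hand. The sketch below is plain Python (not the R code used for the table that follows) and applies them to the 32 per-team p-values listed earlier, with a threshold of 0.1. The function names are made up for illustration.

```python
# Per-team p-values copied from the table above (already sorted smallest first).
PVALS = [
    ("DAL", 0.01365525), ("NYJ", 0.04591227), ("JAC", 0.06261760), ("SEA", 0.07438312),
    ("OAK", 0.12329439), ("ATL", 0.14402383), ("SF", 0.20161027), ("DEN", 0.21359620),
    ("PIT", 0.21943126), ("CLE", 0.27745805), ("TB", 0.32464553), ("HOU", 0.35624119),
    ("BAL", 0.40248681), ("IND", 0.44427961), ("LA", 0.45435839), ("WAS", 0.47550466),
    ("CIN", 0.48891861), ("MIN", 0.48913982), ("KC", 0.51300888), ("TEN", 0.53812159),
    ("NE", 0.61239688), ("BUF", 0.63870379), ("ARI", 0.64745226), ("CAR", 0.67633961),
    ("NO", 0.71138942), ("DET", 0.71472802), ("GB", 0.72155855), ("PHI", 0.76834697),
    ("NYG", 0.77276710), ("MIA", 0.90139795), ("SD", 0.92565653), ("CHI", 0.95271840),
]

def bonferroni_flags(pvalues, alpha):
    """Flag p-values that beat alpha divided by the number of tests."""
    cutoff = alpha / len(pvalues)
    return [p <= cutoff for p in pvalues]

def benjamini_hochberg_flags(pvalues, alpha):
    """Flag p-values up to the largest rank k whose p-value is <= (k/m) * alpha."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices, smallest p first
    largest_passing_rank = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * alpha:
            largest_passing_rank = rank
    flags = [False] * m
    for rank, idx in enumerate(order, start=1):
        flags[idx] = rank <= largest_passing_rank
    return flags

values = [p for _, p in PVALS]
bonf = bonferroni_flags(values, alpha=0.1)
bh = benjamini_hochberg_flags(values, alpha=0.1)

# Show the four teams that looked significant before any correction.
for (team, p), b_flag, bh_flag in zip(PVALS, bonf, bh):
    if p < 0.1:
        print(team, p, "Bonferroni:", b_flag, "BH:", bh_flag)
```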

##    Teams       pval signif0.1 Bonferonni    BH
## 1    DAL 0.01365525      TRUE      FALSE FALSE
## 2    NYJ 0.04591227      TRUE      FALSE FALSE
## 3    JAC 0.06261760      TRUE      FALSE FALSE
## 4    SEA 0.07438312      TRUE      FALSE FALSE
## 5    OAK 0.12329439     FALSE      FALSE FALSE
## 6    ATL 0.14402383     FALSE      FALSE FALSE
## 7     SF 0.20161027     FALSE      FALSE FALSE
## 8    DEN 0.21359620     FALSE      FALSE FALSE
## 9    PIT 0.21943126     FALSE      FALSE FALSE
## 10   CLE 0.27745805     FALSE      FALSE FALSE
## 11    TB 0.32464553     FALSE      FALSE FALSE
## 12   HOU 0.35624119     FALSE      FALSE FALSE
## 13   BAL 0.40248681     FALSE      FALSE FALSE
## 14   IND 0.44427961     FALSE      FALSE FALSE
## 15    LA 0.45435839     FALSE      FALSE FALSE
## 16   WAS 0.47550466     FALSE      FALSE FALSE
## 17   CIN 0.48891861     FALSE      FALSE FALSE
## 18   MIN 0.48913982     FALSE      FALSE FALSE
## 19    KC 0.51300888     FALSE      FALSE FALSE
## 20   TEN 0.53812159     FALSE      FALSE FALSE
## 21    NE 0.61239688     FALSE      FALSE FALSE
## 22   BUF 0.63870379     FALSE      FALSE FALSE
## 23   ARI 0.64745226     FALSE      FALSE FALSE
## 24   CAR 0.67633961     FALSE      FALSE FALSE
## 25    NO 0.71138942     FALSE      FALSE FALSE
## 26   DET 0.71472802     FALSE      FALSE FALSE
## 27    GB 0.72155855     FALSE      FALSE FALSE
## 28   PHI 0.76834697     FALSE      FALSE FALSE
## 29   NYG 0.77276710     FALSE      FALSE FALSE
## 30   MIA 0.90139795     FALSE      FALSE FALSE
## 31    SD 0.92565653     FALSE      FALSE FALSE
## 32   CHI 0.95271840     FALSE      FALSE FALSE

All those FALSEs are telling me that nothing here is actually significant. Finally, let’s look at a linear model with Team and Home/Away terms:

## 
## Call:
## lm(formula = Totalfpts.number ~ (Team * HomeAway), data = data_Def)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -13.7692  -3.8333  -0.7596   2.8750  23.2500 
## 
## Coefficients:
##                      Estimate Std. Error t value Pr(>|t|)
## (Intercept)            9.1818     1.7369   5.286 1.66e-07 ***
## TeamATL               -4.7972     2.3600  -2.033  0.04245 *
## TeamBAL               -2.6818     2.4046  -1.115  0.26510
## TeamBUF               -1.5664     2.3600  -0.664  0.50706
## TeamCAR               -0.1818     2.4046  -0.076  0.93975
## TeamCHI               -4.2652     2.4046  -1.774  0.07653 .
## TeamCIN               -2.8485     2.4046  -1.185  0.23657
## TeamCLE               -6.1049     2.3600  -2.587  0.00988 **
## TeamDAL               -2.5985     2.4046  -1.081  0.28023
## TeamDEN                2.9015     2.4046   1.207  0.22797
## TeamDET               -3.1818     2.3600  -1.348  0.17800
## TeamGB                -2.8182     2.4563  -1.147  0.25163
## TeamHOU               -2.8182     2.4563  -1.147  0.25163
## TeamIND               -2.1049     2.3600  -0.892  0.37274
## TeamJAC               -7.4318     2.4046  -3.091  0.00207 **
## TeamKC                 0.2348     2.4046   0.098  0.92222
## TeamLA                -2.3485     2.4046  -0.977  0.32907
## TeamMIA               -3.5455     2.4563  -1.443  0.14935
## TeamMIN                1.5682     2.4046   0.652  0.51451
## TeamNE                -2.4318     2.4046  -1.011  0.31221
## TeamNO                -4.6818     2.4046  -1.947  0.05192 .
## TeamNYG               -2.5985     2.4046  -1.081  0.28023
## TeamNYJ               -5.1818     2.3210  -2.233  0.02589 *
## TeamOAK               -2.4126     2.3600  -1.022  0.30698
## TeamPHI               -1.1049     2.3600  -0.468  0.63980
## TeamPIT               -3.3485     2.4046  -1.393  0.16420
## TeamSD                -3.1818     2.3600  -1.348  0.17800
## TeamSEA                1.7348     2.4046   0.721  0.47086
## TeamSF                -6.8182     2.4563  -2.776  0.00565 **
## TeamTB                -2.5985     2.4046  -1.081  0.28023
## TeamTEN               -4.0985     2.4046  -1.704  0.08874 .
## TeamWAS               -1.5985     2.4046  -0.665  0.50642
## HomeAwayHome           1.5874     2.3600   0.673  0.50139
## TeamATL:HomeAwayHome   1.7780     3.2996   0.539  0.59016
## TeamBAL:HomeAwayHome  -3.1707     3.3317  -0.952  0.34157
## TeamBUF:HomeAwayHome  -2.7861     3.2996  -0.844  0.39874
## TeamCAR:HomeAwayHome  -0.2541     3.3317  -0.076  0.93923
## TeamCHI:HomeAwayHome  -1.6707     3.3317  -0.501  0.61619
## TeamCIN:HomeAwayHome  -0.2541     3.3317  -0.076  0.93923
## TeamCLE:HomeAwayHome   0.9190     3.2996   0.279  0.78070
## TeamDAL:HomeAwayHome  -5.2541     3.3317  -1.577  0.11524
## TeamDEN:HomeAwayHome  -4.9784     3.2996  -1.509  0.13179
## TeamDET:HomeAwayHome  -2.3374     3.2996  -0.708  0.47893
## TeamGB:HomeAwayHome   -0.7972     3.3375  -0.239  0.81128
## TeamHOU:HomeAwayHome   1.5105     3.3375   0.453  0.65099
## TeamIND:HomeAwayHome  -3.0810     3.2996  -0.934  0.35075
## TeamJAC:HomeAwayHome   3.4959     3.3317   1.049  0.29440
## TeamKC:HomeAwayHome    0.7459     3.3317   0.224  0.82291
## TeamLA:HomeAwayHome    0.4126     3.3317   0.124  0.90148
## TeamMIA:HomeAwayHome  -1.2238     3.3375  -0.367  0.71397
## TeamMIN:HomeAwayHome  -3.5041     3.3317  -1.052  0.29327
## TeamNE:HomeAwayHome   -0.4207     3.3317  -0.126  0.89954
## TeamNO:HomeAwayHome   -2.3374     3.3317  -0.702  0.48318
## TeamNYG:HomeAwayHome  -2.3374     3.3317  -0.702  0.48318
## TeamNYJ:HomeAwayHome   2.2308     3.3101   0.674  0.50057
## TeamOAK:HomeAwayHome  -4.0233     3.2996  -1.219  0.22312
## TeamPHI:HomeAwayHome  -0.6643     3.3375  -0.199  0.84228
## TeamPIT:HomeAwayHome   1.5793     3.3317   0.474  0.63564
## TeamSD:HomeAwayHome   -1.7541     3.2996  -0.532  0.59517
## TeamSEA:HomeAwayHome  -4.9207     3.3317  -1.477  0.14013
## TeamSF:HomeAwayHome    0.8951     3.3375   0.268  0.78862
## TeamTB:HomeAwayHome   -3.7541     3.3317  -1.127  0.26021
## TeamTEN:HomeAwayHome  -2.5938     3.2996  -0.786  0.43207
## TeamWAS:HomeAwayHome  -2.6707     3.3317  -0.802  0.42304
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 5.761 on 714 degrees of freedom
## Multiple R-squared:  0.1402, Adjusted R-squared:  0.06438 
## F-statistic: 1.849 on 63 and 714 DF,  p-value: 0.0001301
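That table is a lot to take in, so here is a small sketch of how to turn the coefficients back into per-team home and away averages. It assumes R's default coding, where the first team alphabetically (ARI) and "Away" are the baselines folded into the intercept; that is consistent with ARI being the only team without its own row. The coefficients are copied from the output above for a handful of teams, and the helper name is just for illustration.

```python
INTERCEPT = 9.1818   # fitted mean for the baseline: ARI playing away
HOME_MAIN = 1.5874   # HomeAwayHome: ARI's home-minus-away gap

# Team main effects and Team:HomeAwayHome interactions, copied from the table.
TEAM_MAIN = {"DAL": -2.5985, "DEN": 2.9015, "SEA": 1.7348, "NYJ": -5.1818}
INTERACTION = {"DAL": -5.2541, "DEN": -4.9784, "SEA": -4.9207, "NYJ": 2.2308}

def fitted_means(team):
    """Return (away, home) fitted mean fantasy points for one team."""
    if team == "ARI":
        away = INTERCEPT
        home = away + HOME_MAIN
    else:
        away = INTERCEPT + TEAM_MAIN[team]
        home = away + HOME_MAIN + INTERACTION[team]
    return away, home

for team in ["ARI", "DAL", "DEN", "SEA", "NYJ"]:
    away, home = fitted_means(team)
    print(f"{team}: away {away:.2f}, home {home:.2f}, home - away {home - away:+.2f}")
```

Note that a team's own home-minus-away gap is the HomeAwayHome coefficient plus that team's interaction term, so a negative interaction only means the gap is smaller than the baseline team's.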

Just look at the p-values for the TeamXXX:HomeAwayHome terms. Again, none are statistically significant at p < 0.1. DAL, DEN, and SEA are close (p < 0.2), but not quite there. (One caveat: ARI is the baseline team in this fit, so each interaction term measures how a team’s home/away gap differs from Arizona’s, not from zero.) Now we can get a little weird and try a mixed effects model (pretty cool tutorial here), which lets me sort of compartmentalize the errors.

## Linear mixed model fit by REML ['lmerMod']
## Formula: Totalfpts.number ~ HomeAway + (1 | Team)
##    Data: data_Def
## 
## REML criterion at convergence: 4961.9
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.2680 -0.6944 -0.1834  0.5327  4.4579 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  Team     (Intercept)  2.455   1.567   
##  Residual             33.108   5.754   
## Number of obs: 778, groups:  Team, 32
## 
## Fixed effects:
##              Estimate Std. Error t value
## (Intercept)    6.5880     0.4024  16.371
## HomeAwayHome   0.2992     0.4129   0.725
## 
## Correlation of Fixed Effects:
##             (Intr)
## HomeAwayHom -0.513
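One rough way to read those numbers, sketched in plain Python: treat the HomeAwayHome t value as approximately normal to get a ballpark p-value and 95% interval, and compare the Team variance to the total to see how much of the spread is between teams. The normal approximation is an assumption on my part (lmer deliberately doesn't print p-values), so take this as a sanity check rather than a proper test.

```python
import math

# Fixed effect for playing at home, copied from the output above.
estimate, std_error = 0.2992, 0.4129
z = estimate / std_error

# Two-sided p-value, treating the t value as a standard normal (an approximation).
p_approx = math.erfc(abs(z) / math.sqrt(2))

# Approximate 95% interval for the home effect, in fantasy points.
ci_low = estimate - 1.96 * std_error
ci_high = estimate + 1.96 * std_error

# Share of the total variance that sits between teams rather than within them.
team_var, resid_var = 2.455, 33.108
team_share = team_var / (team_var + resid_var)

print(f"approx p-value for Home: {p_approx:.3f}")
print(f"approx 95% interval:     {ci_low:.2f} to {ci_high:.2f}")
print(f"between-team share:      {team_share:.3f}")
```

Read this way, the mixed model tells the same story as the plain t-test: a home bump of about 0.3 points with an interval that comfortably includes zero.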

Mixed effects models are new to me and I have a personal history of misapplying them or just misinterpreting them. If anyone knows how to properly interpret this, how to properly code the fit I’m trying to do, or whether I should even be using a mixed effects model, then please let me know. I always feel like there’s some stats magic in these things, but I’ve never successfully applied it in a useful way.

Conclusions

I can’t find any statistical significance here. It’s a little weird given that teams do tend to score more at home and win more at home. I took a quick look at that and it looks like teams tend to score about 2 more points at home (and subsequently win more) and that the effect is statistically significant (p = 0.0067). That’s another analysis for another time though. For now it seems like whether a team is home or away shouldn’t weigh too heavily on whether you expect its fantasy DEF to produce a good score that week. Playing at home will probably help a team win and score a little more, but that may not translate to DEF scores.

I have 2 DEF models that I’ve been looking at the last few weeks, one of which uses Home/Away as a factor and one which doesn’t. The model that includes Home/Away has been a little better in weeks 8 and 9, but the effect size is small. It may be that including it has just helped for 2 weeks and won’t continue to do so, but it’s something to keep an eye out for. Adding too many non-significant factors can lead to overfitting, but at least for now this effect is so marginal that it doesn’t make a huge difference either way.
