I’ve been dreading writing this. For one, nobody likes reliving their failures. For another, my models were really dull this year and hardly ever beat Yahoo or FantasyPros, so I can’t picture anyone even caring about a rehashing. For a third, I didn’t have a good way to automatically figure out which kicker was playing for each team each week (there were a few roster changes throughout the year), so automating an entire year is nontrivial. So here we are: a boring, difficult rehashing of my year-long failure to predict kickers. I may try to make this one quick, so hang on to your hats.
Let me see if I can just get the accuracy for one week first. This will be model D from last year. It used data from 2015 and 2016 and included terms for:
- Predicted team score
- Predicted opponent score
- Which team is the home team (stadium)
- The player
- The opponent
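For the curious, here’s roughly what a model with those terms looks like as a plain linear fit: the two predicted scores and the home flag as numeric terms, plus one-hot dummies for player and opponent. This is just a sketch with made-up names and numbers, not the actual model code.

```python
# Sketch of a Model-D-style linear fit. All data below is invented.
import numpy as np

# (team_pts, opp_pts, is_home, player, opponent) -> kicker fantasy points
rows = [
    (24, 17, 1, "tucker",     "CLE", 11),
    (20, 27, 0, "tucker",     "PIT",  7),
    (31, 14, 1, "gostkowski", "NYJ", 13),
    (17, 23, 0, "gostkowski", "MIA",  6),
]

players = sorted({r[3] for r in rows})
opponents = sorted({r[4] for r in rows})

def encode(team_pts, opp_pts, is_home, player, opponent):
    # intercept + numeric terms, then one-hot dummies for player and opponent
    x = [1.0, float(team_pts), float(opp_pts), float(is_home)]
    x += [1.0 if player == p else 0.0 for p in players]
    x += [1.0 if opponent == o else 0.0 for o in opponents]
    return x

X = np.array([encode(*r[:5]) for r in rows])
y = np.array([r[5] for r in rows], dtype=float)

# ordinary least squares on the toy design matrix
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.round(X @ beta, 1))
```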
##  "Model D accuracy for week 2 = 41"
Good, that worked. What you didn’t see was me trying to get model C (2016 only) working, because I never could: it kept failing whenever a new kicker came in. It’ll still have that problem when I go to run every week from 2016, but that’s slightly easier to work around.
A quick reminder about accuracy: the accuracy score is actually more of an inaccuracy score, where lower is better.
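If you want a feel for how a lower-is-better score like that can work, here’s a toy example: sum up how far each kicker’s predicted rank is from his actual rank, so a perfect ordering scores 0. This is just an illustration, not my actual scoring algorithm.

```python
# Toy "inaccuracy" score, for illustration only (not the real algorithm):
# total distance between each kicker's predicted rank and his actual rank.
def rank_inaccuracy(predicted_order, actual_order):
    actual_rank = {name: i for i, name in enumerate(actual_order)}
    return sum(abs(i - actual_rank[name]) for i, name in enumerate(predicted_order))

print(rank_inaccuracy(["a", "b", "c"], ["a", "b", "c"]))  # 0: perfect ordering
print(rank_inaccuracy(["c", "b", "a"], ["a", "b", "c"]))  # 4: fully reversed
```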
Let me try and run this for weeks 2 through 17 and see how my accuracy changes throughout the year:
Yeah, that’s about what I thought. The model may have even gotten just a little bit worse over the year, though if you look at the confidence intervals (the grey ribbon) it’s not clear that the model improved or worsened at all. This is exactly what I saw all year: some weeks were good, some were bad, and I had nothing to do with it either way.
Comparing model D to a random model
Yeah, that looks about right, actually. If I had just picked a random order, I would have done almost exactly as well. I ran it a handful of times, and sometimes the random order actually beat my carefully thought-out model. That’s… not good.
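To make the random-baseline check concrete: draw a pile of random orderings, score each one with the same style of inaccuracy metric, and see where chance lands on average. Again, this uses a made-up rank-distance metric as a stand-in, not my real scoring code.

```python
# Illustration of a random baseline: average inaccuracy of pure chance,
# using a toy rank-distance metric (lower is better).
import random

def rank_inaccuracy(predicted, actual):
    actual_rank = {name: i for i, name in enumerate(actual)}
    return sum(abs(i - actual_rank[name]) for i, name in enumerate(predicted))

random.seed(0)
actual = [f"kicker{i}" for i in range(10)]   # stand-in for a week's real ranking
scores = []
for _ in range(1000):
    guess = actual[:]
    random.shuffle(guess)                    # a "model" that knows nothing
    scores.append(rank_inaccuracy(guess, actual))

print(sum(scores) / len(scores))             # what chance alone scores, on average
```

A model that can’t beat that average by a comfortable margin isn’t adding anything, which is exactly the uncomfortable spot model D landed in.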
I can’t be the only one with this problem, can I?
Comparing to the pros
Phew! At least the pros over at Yahoo are having the same issues as me. I pulled the average of 6 Yahoo pros (they do the averaging for you!) off the Yahoo sports site for all of 2016 and used the same accuracy algorithm to score them as my model D and the random draw. These are all the exact same thing, statistically. So at least I share my failures with others. Misery loves company, and it seems like I’ll never be lonely as long as we’re all trying to predict kicker performance.
That last paragraph got a little grim toward the end. Sorry about that. It’s probably a good thing I’m ending this post here.
Within the error all three models are the same. So now what? I can think of three options for what to do next:
- Keep going with this analysis and take a few hours to see if model C was any better (it won’t be)
- Stop torturing myself by trying to predict kickers. Give up altogether. Focus on defenses, QBs, and maybe TEs one of these days.
- Create new models and see if they would have been any better
I can’t do option 1 without feeling really sorry for myself, and I can’t do option 2 because that’s for quitters. I can just see my brother and his friends now, mocking me, laughing at me as the sad Charlie Brown music plays. Or worse, the sad trombone.
So all that’s left is option 3. I’m going to put together models that resemble the ones I ran as well as ones that I wish I had run. More than anything I’m writing this post so that I can make up new models and say “hey look, if I had done this, I wouldn’t have been so terrible”. Yes, Ben, that is almost the definition of p-hacking, but I don’t care. Hopefully I’ll find a model that would have done well in 2016 and if I do I’ll try it out next year.
As before, keep checking in during the offseason. I’ll periodically post new articles and probably let you know about them on Twitter, @FFdatastream. You know, probably. I’m still figuring out this whole Twitter thing.