Cubs Quick Hits – 01/20/2017: What Exactly Does ‘Small Sample Size’ Mean?

The term “small sample size” has become ubiquitous in baseball writing, synonymous with luck — either good or bad — generated by an unreliable data set. But what actually is a small sample size?

When I went looking for the answer years ago, I was confused by writers who talked about stats “stabilizing” or “normalizing” in the context of sample sizes. To spare you that same path, I want to share what I learned. Stabilization is the point at which a sample is no longer too small.

This doesn’t mean, however, that luck is completely out of the picture. Quite the opposite, actually. Stabilization is the point at which talent and randomness/luck are equally responsible for a statistic. In statisticians’ terms, a player’s talent accounts for 50 percent of the given metric at that sample size, and luck accounts for the other 50 percent.
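One common way to put that definition into a formula is sketched below. It is offered as an assumption, not something this piece or FanGraphs spells out: treat the talent share of a stat after n opportunities as n / (n + N), where N is that stat's stabilization point. The symbols n, N, and w are introduced here only for illustration.

```latex
w(n) = \frac{n}{n + N}, \qquad w(N) = \frac{N}{N + N} = \frac{1}{2}
```

Under this sketch, a sample sitting exactly at the stabilization point is half talent and half luck, and a sample three times that size would be about 75 percent talent.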

To put that into more concrete terms, here are the sample sizes at which offensive stats stabilize (via FanGraphs):

  • 50 FB: HR per FB
  • 60 PA: Strikeout rate
  • 80 BIP: GB rate
  • 80 BIP: FB rate
  • 120 PA: Walk rate
  • 160 AB: ISO
  • 170 PA: HR rate
  • 240 PA: HBP rate
  • 290 PA: Single rate
  • 320 AB: SLG
  • 460 PA: OBP
  • 600 BIP: LD rate
  • 820 BIP: BABIP
  • 910 AB: AVG
  • 1610 PA: XBH rate

 

Glossary: AB = at-bat; BIP = ball in play; FB = fly ball; PA = plate appearance
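To see how these thresholds might be used, here is a minimal Python sketch. It assumes a simple weighting model that the source does not state: the talent share of a stat after n opportunities is n / (n + N), where N is the stabilization point from the list. The function names `talent_share` and `regressed_estimate` and the league-average figure in the usage lines are hypothetical, chosen only for illustration.

```python
# Stabilization points copied from the list above: stat -> (sample size, unit).
STABILIZATION = {
    "HR per FB": (50, "FB"),
    "Strikeout rate": (60, "PA"),
    "GB rate": (80, "BIP"),
    "FB rate": (80, "BIP"),
    "Walk rate": (120, "PA"),
    "ISO": (160, "AB"),
    "HR rate": (170, "PA"),
    "HBP rate": (240, "PA"),
    "Single rate": (290, "PA"),
    "SLG": (320, "AB"),
    "OBP": (460, "PA"),
    "LD rate": (600, "BIP"),
    "BABIP": (820, "BIP"),
    "AVG": (910, "AB"),
    "XBH rate": (1610, "PA"),
}


def talent_share(stat, sample):
    """Share of the observed stat attributed to talent.

    ASSUMPTION (not from the article): share = n / (n + N), which equals
    0.5 exactly when the sample matches the stabilization point.
    """
    if sample < 0:
        raise ValueError("sample must be non-negative")
    point, _unit = STABILIZATION[stat]
    return sample / (sample + point)


def regressed_estimate(stat, sample, observed, league_average):
    """Blend the observed value with a league average by talent share."""
    share = talent_share(stat, sample)
    return share * observed + (1 - share) * league_average


if __name__ == "__main__":
    # Hypothetical hitter: .400 OBP over 460 PA; .320 league OBP is made up.
    print(talent_share("OBP", 460))
    print(round(regressed_estimate("OBP", 460, 0.400, 0.320), 3))
```

At exactly 460 PA, this sketch weights the hitter's own OBP and the league figure equally, so the estimate lands halfway between them. A smaller sample pulls the estimate closer to the league average.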
