Here are the slides from a nice talk by ace statistician Andrew Gelman, on how traditional statistical ideas, originally developed for dealing with small datasets, remain relevant in a world of Big Data.
It always annoys me when people argue that traditional statistics was developed in the days of mechanical adding machines, so must be outdated now. It’s like arguing that growing crops was developed in ancient Mesopotamia, so must be outdated now. True, the fact that things have long been done a certain way isn’t a reason to keep doing them that way. But it’s not a reason to do things differently, either.
Traditional statistics was invented to keep us from fooling ourselves, to keep us from seeing “patterns” that aren’t really there. It’s just as important that we not fool ourselves today as it was back in Fisher’s day. Ideas about statistical sampling remain relevant even if your sample is huge: a large sample shrinks variance, but it does nothing to fix a biased sampling procedure, and searching a big dataset for patterns multiplies the chances of finding spurious ones.
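To make the “patterns that aren’t really there” point concrete, here is a minimal simulation (my own sketch in NumPy, not from Gelman’s slides): generate a dataset of purely independent noise, then search it for the strongest pairwise correlation. With enough variables to compare, something will always look like a pattern.

```python
import numpy as np

rng = np.random.default_rng(0)

# 1000 "observations" of 200 completely independent random variables --
# by construction there is no real structure here at all.
n_obs, n_vars = 1000, 200
data = rng.standard_normal((n_obs, n_vars))

# Correlation of every variable with every other (about 20,000 pairs).
corr = np.corrcoef(data, rowvar=False)
np.fill_diagonal(corr, 0.0)  # ignore trivial self-correlations

# The strongest correlation found by searching pure noise.
max_corr = np.abs(corr).max()
print(f"strongest correlation among pure noise: {max_corr:.3f}")
```

Each individual sample correlation has a standard deviation of roughly 1/√n ≈ 0.03 here, yet the maximum over thousands of comparisons comes out several times larger, comfortably “significant” if you tested it naively. That multiple-comparisons trap is exactly the kind of self-deception traditional statistics guards against, and more data columns make it worse, not better.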