unreasonable, effective

I don’t want to bore you with the technical, academic stuff I’ve been wrestling with lately, but there is one paper that is probably worth checking out even if you’re not a Machine Learning person. The Unreasonable Effectiveness of Data is noteworthy because (i) it’s geared to a general (but science-literate) audience; (ii) it’s provocative; and (iii) one of the authors is Peter Norvig, Google Director of Research and one of the most prominent people in AI today.

The most interesting insight to me is that the authors come down against the kind of elegant, engineer-driven (parametric) models that are widely associated with AI, and embrace simple, data-driven (nonparametric) models. In machine translation, say, it’s the difference between designing a system that “understands” the grammar and semantics of the two languages and translates from one to the other while trying to preserve them, and one that looks up words and phrases in an enormous table (which kind of reminds me of the Chinese Room thought experiment, though the point is somewhat different). It’s not exactly a new argument, but it’s great to see it so strongly and clearly expressed, and to hear how it arose from Google’s experience.