Google Ngram

This is super fun. Google has just released this tool for playing with word-frequency data from a huge collection of scanned literature (5 million books, some dating back as far as 500 years). You can read more about it here, including some nice research that’s already being done with the full data set, which has also been released (also here).

For example, here’s a graph of the appearance of the word “homeschool” in the collective Google corpus.
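
If you’d rather poke at the released data set than the web tool, a curve like this one boils down to a word’s count in each year divided by the total number of words counted for that year. Here’s a minimal sketch of that calculation. The row layout (tab-separated `ngram`, `year`, `match_count`, then other columns) is my assumption about the downloadable files, and the function names are made up for illustration, so check both against the actual data before relying on this.

```python
from collections import defaultdict


def yearly_counts(lines, word):
    """Sum match counts per year for `word` from tab-separated rows.

    Assumed row layout (not confirmed here): ngram, year, match_count, ...
    """
    counts = defaultdict(int)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed rows
        ngram, year, match_count = fields[0], fields[1], fields[2]
        if ngram == word:
            counts[int(year)] += int(match_count)
    return dict(counts)


def relative_frequency(word_counts, total_counts):
    """Divide each year's count by that year's total word count.

    Years with a missing or zero total are left out of the result.
    """
    return {
        year: count / total_counts[year]
        for year, count in sorted(word_counts.items())
        if total_counts.get(year)
    }
```

You would feed `yearly_counts` the lines of a data file (for example, an open file object) and pass `relative_frequency` a dictionary of per-year totals. Dividing by the yearly total matters because far more books were printed in recent years, so raw counts alone would make almost every word look like it’s on the rise.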

You can also compare how often different words appear. For example, here’s informal evidence that we care less about ancient Greek mathematicians (BC) and more about 17th- and 18th-century European mathematicians than we did 100 years ago.

Not very rigorous, I’ll admit. But it’s an example of the kind of interesting trends that can be teased out instantly. As this article quotes Erez Lieberman-Aiden of Harvard University, “It’s not just an answer machine. It’s a question machine.” I think that’s a nice way to put it.