Random Walks

Mr. Chase blogs about math

Google Ngram

Posted on December 20, 2010 by Mr. Chase

This is super fun. Google has just released a tool for playing with word frequency data drawn from a huge amount of scanned literature (5 million books, some dating back as far as 500 years). You can read more about it here, including some nice research that’s already being done with the full data set, which has also been released (also here).

For example, here’s a graph of the appearance of the word “homeschool” in the collective Google corpus.

You can also compare how often different words appear. For example, here’s informal evidence that we care less about ancient Greek mathematicians (BC) and more about 17th- and 18th-century European mathematicians than we did 100 years ago.

Not very rigorous, I’ll admit. But it’s an example of the kind of interesting trends that can be teased out instantly. As this article quotes Erez Lieberman-Aiden of Harvard University, “It’s not just an answer machine. It’s a question machine.” I think that’s a nice way to put it.
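If you're curious what a graph like this is measuring, here's a minimal sketch in Python. As I understand it, the tool plots, for each year, roughly the share of all words printed that year that match your search word. The tiny corpus and the `relative_frequency` helper below are made up for illustration; this is not Google's code or data, just the basic idea.

```python
from collections import Counter

# Toy corpus, invented for illustration: year -> text "published" that year.
toy_corpus = {
    1900: "euclid wrote the elements and euclid proved theorems",
    2000: "euler and newton and euler again shaped modern analysis",
}

def relative_frequency(corpus, word):
    """Return {year: fraction of that year's words equal to `word`}."""
    result = {}
    for year, text in sorted(corpus.items()):
        tokens = text.lower().split()
        counts = Counter(tokens)
        # Dividing by the year's total word count is what makes years
        # with very different amounts of text comparable.
        result[year] = counts[word.lower()] / len(tokens) if tokens else 0.0
    return result

euclid = relative_frequency(toy_corpus, "euclid")
euler = relative_frequency(toy_corpus, "euler")
for year in sorted(toy_corpus):
    print(year, f"euclid={euclid[year]:.3f}", f"euler={euler[year]:.3f}")
```

Dividing by the total number of words in each year matters: far more books were printed in 2000 than in 1900, so raw counts alone would make almost every word look like it's trending upward.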

2 thoughts on “Google Ngram”

Mr. Chase on December 22, 2010 at 12:59 pm said:

Another nice post that came through today about this Google tool:

http://www.squarecirclez.com/blog/500-billion-words-visual-stats-give-us-cultural-insights/5519

Pingback: Pythagorean Theorem « Random Walks
