This is the dataset Google generated from the books it scanned: http://storage.googleapis.com/books/ngrams/books/datasetsv2.html
The 1-gram dataset records the frequency of individual words, for example:
circumvallate 1978 313 215 85
circumvallate 1979 183 147 77
The first line tells us that in 1978, the word "circumvallate" (which means "surround with a rampart or other fortification", in case you were wondering) occurred 313 times overall, on 215 distinct pages and in 85 distinct books from our sample.
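As a minimal sketch of how such records could be read, the snippet below parses the two sample lines into named fields. The names `NgramRecord` and `parse_line` are hypothetical, and the five-column layout (n-gram, year, occurrences, pages, books) is assumed from the example above; the downloadable files may be tab-separated or use a different column layout depending on the dataset version, so check the format before relying on this.

```python
from typing import NamedTuple


class NgramRecord(NamedTuple):
    """One row of the 1-gram sample (field names are illustrative)."""
    ngram: str
    year: int
    match_count: int   # total occurrences in that year
    page_count: int    # distinct pages containing the word
    volume_count: int  # distinct books containing the word


def parse_line(line: str) -> NgramRecord:
    # Split from the right on whitespace so that an n-gram containing
    # spaces (2-grams and up) would still stay in one piece.
    parts = line.strip().rsplit(None, 4)
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}: {line!r}")
    ngram, year, matches, pages, volumes = parts
    return NgramRecord(ngram, int(year), int(matches), int(pages), int(volumes))


sample = [
    "circumvallate 1978 313 215 85",
    "circumvallate 1979 183 147 77",
]

records = [parse_line(line) for line in sample]
# Sum the occurrences over all years present in the sample.
total_matches = sum(r.match_count for r in records)

for r in records:
    print(r.year, r.match_count, r.page_count, r.volume_count)
print("total occurrences:", total_matches)
```

Splitting from the right is a design choice: the four trailing fields are always numeric, so the same function keeps working if the first field contains spaces.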
For an introduction to N-grams, see: http://blog.sina.com.cn/s/blog_4b2ddd15010151th.html