twitter-ebooks/NOTES.md
2013-11-08 06:02:05 +11:00

  • Files in text/ are preprocessed by rake consume and serialized
  • e.g. text/foo.tweets becomes consumed/foo.corpus
  • rake consume compares file hashes to decide which corpus files need to be regenerated
  • Preprocessed corpus files are loaded at runtime by Corpus.load('foo')
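
The workflow in the notes above can be sketched roughly as follows. This is a minimal illustration, not the project's actual Rakefile or Corpus class: the helper names `consume_all` and `load_corpus`, the `.sha1` sidecar file, the choice of SHA-1, the use of `Marshal`, and the "one non-empty line per entry" preprocessing are all assumptions made for the example.

```ruby
require 'digest'
require 'fileutils'

# Hypothetical stand-in for `rake consume`: preprocesses every file in
# text_dir and writes a serialized corpus into consumed_dir, skipping
# sources whose hash has not changed since the last run.
# Returns the corpus paths that were (re)built.
def consume_all(text_dir, consumed_dir)
  FileUtils.mkdir_p(consumed_dir)
  rebuilt = []

  Dir.glob(File.join(text_dir, '*')).sort.each do |src|
    next unless File.file?(src)

    name        = File.basename(src, '.*')           # text/foo.tweets -> "foo"
    corpus_path = File.join(consumed_dir, "#{name}.corpus")
    hash_path   = File.join(consumed_dir, "#{name}.sha1") # assumed sidecar file
    digest      = Digest::SHA1.file(src).hexdigest

    # Up to date: the corpus exists and the stored hash matches the source.
    if File.exist?(corpus_path) && File.exist?(hash_path) &&
       File.read(hash_path) == digest
      next
    end

    # Placeholder preprocessing: keep the non-empty lines.
    lines = File.readlines(src, chomp: true).reject(&:empty?)
    File.binwrite(corpus_path, Marshal.dump(lines))
    File.write(hash_path, digest)
    rebuilt << corpus_path
  end

  rebuilt
end

# Hypothetical stand-in for Corpus.load('foo'): reads the serialized corpus
# back at runtime.
def load_corpus(consumed_dir, name)
  Marshal.load(File.binread(File.join(consumed_dir, "#{name}.corpus")))
end
```

With this sketch, calling `consume_all('text', 'consumed')` twice in a row rebuilds each corpus on the first call and nothing on the second, and `load_corpus('consumed', 'foo')` returns the preprocessed lines from `text/foo.tweets`.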