75 commits

Author SHA1 Message Date
Jaiden Mispy
a2ca0da967 3.0.3 2014-12-06 22:37:10 +11:00
Stawberri
6bc89cd4fb Switched to .match for matching
Mispy taught me something about ruby precedence today!
2014-12-06 03:29:29 -08:00
Stawberri
c4ba9e139f Regex soft-retweet detection
I added a regex to detect 'RT @' anywhere in the content of a tweet, in
case someone hides a RT in the middle of a tweet! As a bonus, it also
detects a bunch of different types of quotes as well as other things
people might use, like 'via,' 'by,' and 'from'!
2014-12-06 01:38:38 -08:00
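The soft-retweet check described in c4ba9e139f can be sketched roughly as follows. This is a minimal illustration, not the regex from the commit: the constant `SOFT_RT_PATTERN` and the helper `soft_retweet?` are hypothetical names, and the set of quote characters and lead-in words is an assumption based only on the commit message. It uses `.match`, in line with the follow-up commit 6bc89cd4fb.

```ruby
# Hypothetical pattern (not the one from the commit): a mention counts as a
# soft retweet when it is introduced by "RT", "via", "by" or "from", or when
# it directly follows a straight or curly quote character.
SOFT_RT_PATTERN = /(?:\b(?:RT|via|by|from)\s*|["\u201C\u201D'\u2018\u2019]\s*)@\w+/i

# Returns true if the tweet text appears to quote or retweet another user
# anywhere in its content, not only at the start.
def soft_retweet?(text)
  !SOFT_RT_PATTERN.match(text).nil?
end

puts soft_retweet?("so true RT @someone: hello")
puts soft_retweet?("hello @friend how are you")
```

A pattern this loose will also flag ordinary sentences such as "a gift from @friend", so a real bot would likely need to weigh false positives against missed retweets.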
Jaiden Mispy
458b94a4c3 3.0.2 - Handle rate limitation in archiver 2014-12-06 00:07:49 +11:00
Jaiden Mispy
b738f1fe3a 3.0.1 2014-12-05 23:50:59 +11:00
Jaiden Mispy
5617544a30 Make sure we really have the right username 2014-12-05 23:50:07 +11:00
Jaiden Mispy
822f5e4c6c More cleanup 2014-12-05 22:57:32 +11:00
Jaiden Mispy
1977445b1c Lots of documentation and cleanup 2014-12-05 21:12:39 +11:00
Jaiden Mispy
efde0fd16f Conversation-based bot detection and politeness 2014-12-05 19:04:15 +11:00
Jaiden Mispy
daeda5d7eb Add ebooks auth command 2014-12-05 14:03:11 +11:00
Jaiden Mispy
401586471b Separate unprompted tracking from mention includes 2014-12-04 05:32:41 +11:00
Jaiden Mispy
1a40ef85f9 Slightly different pesters model 2014-11-19 10:08:32 +11:00
Jaiden Mispy
24e8ce5ae3 Block blacklisted users on contact 2014-11-18 14:31:59 +11:00
Jaiden Mispy
2e336fb9be On second thought, we can't use a cache system
Simply because the corpuses are too darn big to keep around
2014-11-18 13:51:31 +11:00
Jaiden Mispy
8135aaaabb Use can_pester? logic for timeline tweets 2014-11-18 13:31:59 +11:00
Jaiden Mispy
9d8e30d7f6 Don't be so hasty to consider people bots 2014-11-18 13:26:06 +11:00
Jaiden Mispy
b72a6db0e1 Threading! 2014-11-18 13:24:59 +11:00
Jaiden Mispy
29beb23502 Bot anti-bot measures
We assume a user is a bot if it has 'ebooks' in the name
or if it replies more than once in a 30-second window
2014-11-18 12:00:34 +11:00
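The heuristic in 29beb23502 can be sketched as below, under stated assumptions. The class name `BotGuess` and its methods are hypothetical and do not come from the repository; only the two rules ('ebooks' in the name, or more than one reply within 30 seconds) are taken from the commit message. Later commits in this log (9d8e30d7f6, efde0fd16f) suggest the actual logic was relaxed and reworked afterwards.

```ruby
# Hypothetical sketch of the two rules in the commit message.
class BotGuess
  WINDOW = 30 # seconds, as stated in the commit message

  def initialize
    @last_reply_at = {} # username => Time of the most recent reply
    @suspected = {}     # username => true once two replies fell within WINDOW
  end

  # Record a reply from `username` at time `at`.
  # Returns true if the user is now considered a bot.
  def saw_reply(username, at = Time.now)
    key = username.downcase
    previous = @last_reply_at[key]
    @suspected[key] = true if previous && (at - previous) <= WINDOW
    @last_reply_at[key] = at
    bot?(username)
  end

  # A user is assumed to be a bot if the name contains 'ebooks'
  # or if it has replied more than once inside the window.
  def bot?(username)
    key = username.downcase
    key.include?('ebooks') || @suspected.fetch(key, false)
  end
end

guess = BotGuess.new
puts guess.bot?("horse_ebooks")
puts guess.saw_reply("alice", Time.at(0))
puts guess.saw_reply("alice", Time.at(10))
```

In this sketch a user stays flagged once suspected; whether the original code ever cleared that state is not something the commit message says.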
Jaiden Mispy
5bfaac99de Some actual tests for the bot response logic 2014-11-15 03:55:32 +11:00
Jaiden Mispy
746d218896 Fix tweet event handling 2014-11-14 22:59:39 +11:00
Jaiden Mispy
e646e24744 Use new twitter gem streaming support
Made more complicated by the fact that this
is not inherently eventmachine-based, unlike
tweetstream
2014-11-14 22:46:07 +11:00
Geoffroy Couprie
0fe4b627d0 update deprecated code 2014-10-29 19:00:49 +01:00
Geoffroy Couprie
2698963fb1 consume multiple corpuses 2014-10-29 18:56:37 +01:00
Geoffroy Couprie
9731575a3d update twitter gem to 5.0 2014-10-27 13:57:07 +01:00
Jaiden Mispy
0cb7abcb52 Test that models save and load correctly 2014-10-25 06:59:34 -07:00
Jaiden Mispy
7e62000e37 2.3.2 2014-10-25 05:49:49 -07:00
Jaiden Mispy
302ea0229d grr stupid mistake 2014-10-25 05:49:23 -07:00
Jaiden Mispy
1f8ea676bd 2.3.1 2014-10-25 05:19:03 -07:00
Jaiden Mispy
4052d534b2 Save only necessary data into model 2014-10-25 04:26:52 -07:00
Jaiden Mispy
81b4f78187 2.3.0 2014-10-25 03:46:46 -07:00
Jaiden Mispy
3b1d6f856d Switch to using token indexes instead of strings 2014-10-24 09:55:49 -07:00
Jaiden Mispy
6ae1dd5dac 2.2.9 2014-10-20 00:16:10 -07:00
Jaiden Mispy
203c20f6f3 Seems we can't go to twitter 5 yet 2014-10-20 00:16:10 -07:00
Paul Friedman
927efe7f07 Fix parser swapping mentions and sentences 2014-10-19 22:33:17 -07:00
Jaiden Mispy
6f27d32bf1 2.2.8 2014-10-19 01:22:42 -07:00
Jaiden Mispy
2662558e1a Use new twitter gem style in archiver 2014-10-19 00:19:44 -07:00
Jaiden Mispy
bf1d6ae8a4 2.2.7 2014-10-18 23:00:50 -07:00
Jaiden Mispy
f4dbf89c15 Fix deprecations for twitter gem 5 2014-10-18 22:59:28 -07:00
Jaiden Mispy
228e0caa65 More memory profiling 2014-10-18 22:21:50 -07:00
Jaiden Mispy
b7f67ec0a6 Memory optimization 2014-10-16 03:02:39 -07:00
Jaiden Mispy
d09d968915 rspec and memory_profiler 2014-10-14 01:02:08 -07:00
Pira Wetton
3706ae0bbb picture shortcut 2014-09-23 20:26:48 -04:00
Mispy
a2b374f48c 2.2.6 2014-06-28 18:51:43 -07:00
Joel McCoy
be6ac9127f MODEL: Read in utf-8, only parse CSV once
Ran into `Encoding::CompatibilityError` issue trying to consume my corpus (tweets.csv) on Windows 7, but this likely affects other environments as well. 

Fix: force reading corpus file contents as utf-8.

Also a quick clean-up of the CSV flow to only parse the content once instead of double-dipping.
2014-06-27 18:42:51 -04:00
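The fix described in be6ac9127f, reading the corpus as UTF-8 and parsing the CSV only once, might look something like this. It is a sketch rather than the code from the commit: `read_corpus` and `texts_from_csv` are hypothetical helper names, and the `text` column is assumed from the earlier commit 2aac54c7aa.

```ruby
require 'csv'

# Parse the CSV content a single time and pull out the tweet bodies.
# Assumes a header row with a "text" column; rows without one are skipped.
def texts_from_csv(content)
  rows = CSV.parse(content, headers: true)
  rows.map { |row| row['text'] }.compact
end

# Read the file with an explicit encoding so the platform default
# (for example a Windows code page) is not applied to the corpus.
def read_corpus(path)
  texts_from_csv(File.read(path, encoding: 'utf-8'))
end

sample = "tweet_id,text\n1,hello world\n2,caf\u00E9 time\n"
puts texts_from_csv(sample).inspect
```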
Mispy
8a5c4831ad 2.2.5 - encoding: utf-8 2014-05-07 16:45:17 +10:00
Brett O'Connor
2aac54c7aa csv import now looks for text column 2014-05-03 16:44:07 -06:00
Joel McCoy
872dabdbf8 Support consuming tweets.csv from official twitter archives 2014-04-30 20:32:51 -04:00
Mispy
17ef359de2 2.2.3 - Avoid some mention edge cases 2014-04-28 10:57:14 -07:00
Mispy
a836e00a87 2.2.2 2014-04-24 21:43:22 -07:00
Mispy
5d55d90f85 Be more paranoid about identifying mentions 2014-04-24 20:55:53 -07:00