11 Commits

Author SHA1 Message Date
Jake Poznanski
212d391933 More conservative filtering 2024-11-21 18:39:21 +00:00
Jake Poznanski
98e40143dd Adding mass filtering script 2024-11-21 16:56:19 +00:00
Jake Poznanski
dd4f9670b5 Filter refactor 2024-10-17 22:36:38 +00:00
Jake Poznanski
4bf6e7a430 Refactoring 2024-10-09 18:11:18 +00:00
Jake Poznanski
549e07bed0 filtering out stupid ads 2024-10-02 15:36:41 +00:00
Jake Poznanski
bab32aa9b3 Formatting 2024-09-18 22:52:42 +00:00
Jake Poznanski
af2126df99 450tok/sec/core with smollm that appears to work well 2024-09-17 19:59:02 +00:00
Jake Poznanski
2f71cb9232 Using SmolLM, seems a lot better and is able to pass some tests 2024-09-17 18:47:27 +00:00
Jake Poznanski
57e80aacd2 Testing coherence with distilgpt2, but it doesn't work great 2024-09-17 16:58:45 +00:00
Jake Poznanski
cb9b6efb3c Trying distilgpt2 instead of kenlm 2024-09-17 16:50:01 +00:00
Jake Poznanski
01bc0b2f10 Moving a whole bunch of code over, still broken 2024-09-17 16:26:55 +00:00