1528 Commits

Author SHA1 Message Date
Jake Poznanski
5556e204cf Testing 2025-08-21 17:17:43 +00:00
Jake Poznanski
892429965a Hmm 2025-08-21 17:14:14 +00:00
Jake Poznanski
de719edf49 Logs 2025-08-21 17:11:33 +00:00
Jake Poznanski
6fe630516a Score longer 2025-08-21 17:09:09 +00:00
Jake Poznanski
7f2dd85e5b More configs 2025-08-21 17:06:18 +00:00
Jake Poznanski
dc4c0864ac Don't default to tensorboard 2025-08-21 17:03:01 +00:00
Jake Poznanski
a443b89854 Cleanup 2025-08-21 17:01:40 +00:00
Jake Poznanski
d08068218c Setting some defaults 2025-08-21 16:59:16 +00:00
Jake Poznanski
f865e49624 Options 2025-08-21 16:55:24 +00:00
Jake Poznanski
3c8410f22c fix 2025-08-21 16:51:51 +00:00
Jake Poznanski
1dbb4332c0 Fixing up 2025-08-21 16:50:56 +00:00
Jake Poznanski
7c446e1679 Trying to fix script 2025-08-20 22:44:10 +00:00
Jake Poznanski
a2ee4d46c0 grpo trainer test 1 2025-08-20 22:35:19 +00:00
Jake Poznanski
77164e909f Decent grpo script 2025-08-20 22:25:26 +00:00
Jake Poznanski
cc918ca03e Setting up GRPO trainer 2025-08-20 22:18:38 +00:00
Jake Poznanski
d046ba554a Mining math tests too 2025-08-20 21:05:56 +00:00
Jake Poznanski
c32dced59c More fixes to data gen script 2025-08-20 20:24:38 +00:00
Jake Poznanski
becbfdc62d adding basic markdown output, will need to adjust it 2025-08-20 19:54:21 +00:00
Jake Poznanski
ce86aff80a Refreshing the claude sonnet synth miner 2025-08-20 16:23:33 +00:00
Jake Poznanski
34d7f6e1c5 Preempt 2025-08-19 22:02:14 +00:00
Jake Poznanski
1a4cf6d8e1 A few more nice configs to test 2025-08-19 21:48:49 +00:00
Jake Poznanski
3e6be9ad5f Merge branch 'jakep/new_data_promptv4' into jakep/new_data 2025-08-19 21:46:28 +00:00
Jake Poznanski
9868a63756 Adding a new pipeline 2025-08-19 21:46:12 +00:00
Jake Poznanski
41201b6317 Lints 2025-08-19 21:30:41 +00:00
Jake Poznanski
768cb33937 Better filtering coming in 2025-08-19 21:22:54 +00:00
Jake Poznanski
1cafa779a3 More filtering stages 2025-08-19 20:09:41 +00:00
Jake Poznanski
4d837b7db2 More filter rules 2025-08-19 20:01:42 +00:00
Jake Poznanski
17d131fce0 Some more filtering stuff 2025-08-19 18:54:04 +00:00
Jake Poznanski
a3d23d7de1 Adding a part of code to dataloader so you can see what is getting filtered out of your dataset 2025-08-19 18:45:01 +00:00
Jake Poznanski
84a0c432e7 Adding some filtering rules and tests for them 2025-08-19 18:14:15 +00:00
Jake Poznanski
cd09e190b5 Fixes 2025-08-19 17:50:23 +00:00
Jake Poznanski
798335c88e Setting pipeline to use new prompt too 2025-08-19 17:46:23 +00:00
Jake Poznanski
f2db62b0f8 Train a run with adjusted prompt 2025-08-19 17:45:41 +00:00
Jake Poznanski
1be5cea567 Merge branch 'main' into jakep/new_data 2025-08-19 17:41:45 +00:00
Jake Poznanski
702f8996a9 2epoch 2025-08-16 21:34:16 +00:00
Jake Poznanski
c075f3071f New configs for new data 2025-08-16 17:31:42 +00:00
Jake Poznanski
cffbb82b0b Fix for iabooks 2025-08-16 17:26:51 +00:00
Jake Poznanski
0a9c82927f Adding strip 2025-08-16 17:05:09 +00:00
Jake Poznanski
c492615355 Bump version to v0.3.3 for release v0.3.3 2025-08-15 19:45:17 +00:00
Jake Poznanski
cee12ccc9f New version 2025-08-15 19:45:07 +00:00
Jake Poznanski
76405b53db Lints 2025-08-15 19:44:47 +00:00
Jake Poznanski
69c33abfcc Trying to keep queue loaded more 2025-08-15 18:44:45 +00:00
Jake Poznanski
7c98673972 Pipeline fixes for OMP_NUM_THREADS 2025-08-15 18:30:00 +00:00
Jake Poznanski
b9238b8638 Fix for floaty amount 2025-08-14 22:27:26 +00:00
Jake Poznanski
618777c17e Bump version to v0.3.2 for release v0.3.2 2025-08-14 20:58:11 +00:00
Jake Poznanski
5532493ec8 Pipeline should be improved to limit CPU usage on page renders 2025-08-14 20:57:57 +00:00
Jake Poznanski
3a36ee239d Cleanup 2025-08-14 20:13:52 +00:00
Jake Poznanski
a863d04e6e Cleanup page rendering cpu limits 2025-08-14 20:11:26 +00:00
Jake Poznanski
482030f286 Script to process batch outputs 2025-08-14 19:54:29 +00:00
Jake Poznanski
53c0e57e4a openai batch data writer 2025-08-14 19:40:36 +00:00