mirror of
https://github.com/allenai/olmocr.git
synced 2025-08-15 20:32:45 +00:00
11 lines
449 B
Python
# Step 1. Load the data
# Most likely the input is just a folder containing the OpenAI batch input JSONL files plus the matching batch output JSONL files
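# A minimal sketch of the loading step, using only the standard library.
# Assumptions (not from the original plan): every *.jsonl file sits directly
# in one folder, records that carry a "response" key are batch outputs, all
# other records are batch inputs, and the two are joined on "custom_id".
# The names read_jsonl and load_batch_folder are hypothetical.

```python
import json
from pathlib import Path


def read_jsonl(path):
    """Yield one parsed JSON object per non-empty line of a .jsonl file."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                yield json.loads(line)


def load_batch_folder(folder):
    """Pair batch requests with their responses by custom_id.

    Assumption: a record with a "response" key is a batch output;
    anything else is treated as a batch input.
    """
    requests, responses = {}, {}
    for path in sorted(Path(folder).glob("*.jsonl")):
        for record in read_jsonl(path):
            target = responses if "response" in record else requests
            target[record["custom_id"]] = record

    # Keep only requests that received a non-empty response.
    return [
        {"custom_id": cid, "request": requests[cid], "response": responses[cid]}
        for cid in requests
        if cid in responses and responses[cid].get("response")
    ]
```

# Requests whose response is missing or null (for example, failed batch
# items) are dropped here; a real loader may prefer to log them instead.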
# TODO: Figure out hyperparameters for image sizing
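# One possible shape for the image-sizing hyperparameter: a single target
# for the longest edge, with the aspect ratio preserved. The function name
# scaled_size and the default of 1024 are placeholders for illustration,
# not values chosen by this project.

```python
def scaled_size(width, height, target_longest_dim=1024):
    """Return (new_width, new_height) with the longest edge set to the target.

    The aspect ratio is preserved up to rounding, and each side is at
    least one pixel.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    scale = target_longest_dim / max(width, height)
    return max(1, round(width * scale)), max(1, round(height * scale))


# A 2000x1000 page scaled to a longest edge of 1000 becomes 1000x500.
print(scaled_size(2000, 1000, target_longest_dim=1000))
```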
# Step 2. Run those prompts through the model and do a forward pass to calculate the loss
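# A framework-free sketch of what the loss computation could look like,
# assuming (this is an assumption, not stated in the plan) that the loss
# covers only response tokens and that prompt positions are masked with the
# label -100, the default ignore_index of PyTorch's CrossEntropyLoss. In
# real training the model's forward pass would produce the logits; here they
# are plain lists so the arithmetic is visible.

```python
import math

IGNORE_INDEX = -100  # label value that excludes a position from the loss


def masked_cross_entropy(logits, labels):
    """Mean negative log-likelihood over positions not labelled IGNORE_INDEX.

    logits: list of per-position score lists, one score per vocabulary entry
    labels: list of target token ids, IGNORE_INDEX for masked positions
    """
    total, count = 0.0, 0
    for scores, label in zip(logits, labels):
        if label == IGNORE_INDEX:
            continue
        # log-sum-exp, subtracting the max score for numerical stability
        m = max(scores)
        log_z = m + math.log(sum(math.exp(s - m) for s in scores))
        total += log_z - scores[label]
        count += 1
    if count == 0:
        raise ValueError("every position is masked; the loss is undefined")
    return total / count


# Two positions over a two-token vocabulary; the first one is a prompt token.
loss = masked_cross_entropy([[0.0, 0.0], [0.0, 0.0]], [IGNORE_INDEX, 1])
print(loss)  # uniform scores over 2 tokens, so the loss is ln(2)
```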
# Step 3. Add Hugging Face Accelerate for training
# Step 4. Add checkpointing code, both saving and reloading, so a run can resume after a restart
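# A minimal sketch of save-and-resume logic, using only the standard
# library. Assumptions: the file naming scheme (step_XXXXXXXX.json) and the
# helper names save_checkpoint / load_latest_checkpoint are hypothetical,
# and JSON stands in for whatever serializer the training framework
# provides for model and optimizer state.

```python
import json
import os
from pathlib import Path


def save_checkpoint(ckpt_dir, step, state):
    """Write a checkpoint atomically and return its path."""
    ckpt_dir = Path(ckpt_dir)
    ckpt_dir.mkdir(parents=True, exist_ok=True)
    final = ckpt_dir / f"step_{step:08d}.json"
    tmp = final.with_suffix(".tmp")
    with open(tmp, "w", encoding="utf-8") as f:
        json.dump({"step": step, "state": state}, f)
    # os.replace is atomic, so a crash never leaves a half-written checkpoint
    # under the final name.
    os.replace(tmp, final)
    return final


def load_latest_checkpoint(ckpt_dir):
    """Return the newest checkpoint as a dict, or None if there is none."""
    ckpt_dir = Path(ckpt_dir)
    if not ckpt_dir.is_dir():
        return None
    # Zero-padded step numbers make lexicographic order match numeric order.
    paths = sorted(ckpt_dir.glob("step_*.json"))
    if not paths:
        return None
    with open(paths[-1], encoding="utf-8") as f:
        return json.load(f)
```

# On startup, a training loop could call load_latest_checkpoint and begin
# at checkpoint["step"] + 1 when a checkpoint exists, or at step 0 otherwise.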
# Step 5. Move from the interactive session to a gantry launch script