mirror of https://github.com/allenai/olmocr.git (synced 2025-11-19 20:08:19 +00:00)
Added deepinfra usage to readme
This commit is contained in: parent 2a5792e5ed, commit 90589e16de
@@ -1,64 +0,0 @@
# Using olmOCR with DeepInfra

This guide explains how to use olmOCR with DeepInfra's hosted API service for cloud-based inference.

## Prerequisites

1. **DeepInfra Account**: Sign up at https://deepinfra.com/
2. **API Key**: Get your API key from the DeepInfra dashboard
3. **olmOCR**: Ensure you have the modified version with authentication support

## Setup

### 1. Get your DeepInfra API Key

1. Log in to https://deepinfra.com/
2. Navigate to your dashboard
3. Generate or copy your API key
4. Store it securely (recommended: as an environment variable)

```bash
export DEEPINFRA_API_KEY="your-api-key-here"
```
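A script that talks to DeepInfra can read the key from the environment and fail fast when it is missing. A minimal sketch (the helper name `get_api_key` is illustrative, not part of olmOCR):

```python
import os

def get_api_key() -> str:
    """Read the DeepInfra API key exported above; fail fast if it is missing."""
    key = os.environ.get("DEEPINFRA_API_KEY")
    if not key:
        raise RuntimeError(
            "DEEPINFRA_API_KEY is not set; export it before running olmOCR"
        )
    return key
```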
### 2. Usage

Run olmOCR with the DeepInfra server endpoint:

```bash
python -m olmocr.pipeline ./localworkspace \
  --server https://api.deepinfra.com/v1/openai \
  --api_key $DEEPINFRA_API_KEY \
  --model allenai/olmOCR-7B-0725-FP8 \
  --markdown \
  --pdfs path/to/your/*.pdf
```

### Command Line Arguments

- `--server`: DeepInfra's OpenAI-compatible endpoint: `https://api.deepinfra.com/v1/openai`
- `--api_key`: Your DeepInfra API key (or use environment variable)
- `--model`: The model identifier on DeepInfra: `allenai/olmOCR-7B-0725-FP8`
- Other arguments work the same as with local inference
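Because the endpoint is OpenAI-compatible, any OpenAI-style client can target it. As a rough sketch of the kind of request such an endpoint accepts, here is one built with the standard library only; the `/chat/completions` route and message shape are assumptions based on the OpenAI API convention, not taken from olmOCR's source:

```python
import json
import os
import urllib.request

SERVER = "https://api.deepinfra.com/v1/openai"          # from the --server flag above
MODEL = "allenai/olmOCR-7B-0725-FP8"                    # from the --model flag above

def build_chat_request(prompt: str) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style chat completion request."""
    body = json.dumps({
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        SERVER + "/chat/completions",   # route assumed from OpenAI API convention
        data=body,
        headers={
            "Content-Type": "application/json",
            # Bearer auth, matching the OpenAI API convention
            "Authorization": f"Bearer {os.environ.get('DEEPINFRA_API_KEY', '')}",
        },
        method="POST",
    )

# To actually send it: urllib.request.urlopen(build_chat_request("..."))
```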
### Example with S3 Storage

For large-scale processing with S3:

```bash
python -m olmocr.pipeline s3://your-bucket/workspace \
  --server https://api.deepinfra.com/v1/openai \
  --api_key $DEEPINFRA_API_KEY \
  --model allenai/olmOCR-7B-0725-FP8 \
  --pdfs s3://your-bucket/pdfs/*.pdf \
  --workers 10 \
  --markdown
```
## Pricing

As of 2024, DeepInfra charges for the olmOCR model:

- Input tokens: ~$0.27 per million tokens
- Output tokens: ~$0.81 per million tokens

Check current pricing at: https://deepinfra.com/pricing
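Using the approximate rates quoted above, a back-of-the-envelope cost estimate is straightforward (prices are the ~2024 figures from this guide; check the pricing page for current numbers):

```python
# Approximate DeepInfra rates for olmOCR quoted above (USD per million tokens).
INPUT_PER_M = 0.27
OUTPUT_PER_M = 0.81

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate USD cost for a batch, given total input and output token counts."""
    return (input_tokens * INPUT_PER_M + output_tokens * OUTPUT_PER_M) / 1_000_000

# e.g. 10M input tokens and 2M output tokens:
# 10 * 0.27 + 2 * 0.81 = 2.70 + 1.62 = 4.32 USD
```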
README.md (20 lines changed)
@@ -249,6 +249,26 @@ For example:
```bash
python -m olmocr.pipeline s3://my_s3_bucket/pdfworkspaces/exampleworkspace --pdfs s3://my_s3_bucket/jakep/gnarly_pdfs/*.pdf --beaker --beaker_gpus 4
```
### Using DeepInfra

Sign up at [DeepInfra](https://deepinfra.com/) and get your API key from the DeepInfra dashboard.

Store the API key as an environment variable.

```bash
export DEEPINFRA_API_KEY="your-api-key-here"
```

#### Run olmOCR with the DeepInfra server endpoint:

```bash
python -m olmocr.pipeline ./localworkspace \
  --server https://api.deepinfra.com/v1/openai \
  --api_key $DEEPINFRA_API_KEY \
  --model allenai/olmOCR-7B-0725-FP8 \
  --markdown \
  --pdfs path/to/your/*.pdf
```

- `--server`: DeepInfra's OpenAI-compatible endpoint: `https://api.deepinfra.com/v1/openai`
- `--api_key`: Your DeepInfra API key
- `--model`: The model identifier on DeepInfra: `allenai/olmOCR-7B-0725-FP8`
- Other arguments work the same as with local inference
### Using Docker