DeepInfra README
This commit is contained in:
parent 0c6d889863
commit a0bc5a4690

README.md (44 lines changed)
@@ -210,6 +210,29 @@ The served model name should be `olmocr`. An example vLLM launch command would be

```bash
vllm serve allenai/olmOCR-7B-0825-FP8 --served-model-name olmocr --max-model-len 16384
```
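Before pointing the pipeline at a local server, it can help to confirm the endpoint is actually serving the model. A minimal sanity check, assuming vLLM's default OpenAI-compatible port of 8000 (an assumption; the launch command above does not set a port):

```bash
# List models on the local vLLM server; the served model name
# "olmocr" should appear in the JSON response.
curl http://localhost:8000/v1/models
```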
#### Run olmOCR with the DeepInfra server endpoint:

Sign up at [DeepInfra](https://deepinfra.com/) and get your API key from the DeepInfra dashboard.

Store the API key as an environment variable:

```bash
export DEEPINFRA_API_KEY="your-api-key-here"
```
```bash
python -m olmocr.pipeline ./localworkspace \
  --server https://api.deepinfra.com/v1/openai \
  --api_key $DEEPINFRA_API_KEY \
  --pages_per_group 100 \
  --model allenai/olmOCR-7B-0725-FP8 \
  --markdown \
  --pdfs path/to/your/*.pdf
```

- `--server`: DeepInfra's OpenAI-compatible endpoint: `https://api.deepinfra.com/v1/openai`
- `--api_key`: Your DeepInfra API key (see the sanity check sketched after this list)
- `--pages_per_group`: You may want a smaller number of pages per group, as many external providers have lower concurrent request limits
- `--model`: The model identifier on DeepInfra: `allenai/olmOCR-7B-0725-FP8`
- Other arguments work the same as with local inference
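Before kicking off a large batch, you can verify that the endpoint and key respond. A minimal sketch, assuming the OpenAI-compatible base URL also exposes the standard `/models` listing route (an assumption, not something this README documents):

```bash
# A 200 response with a JSON list of models confirms that both the
# endpoint URL and DEEPINFRA_API_KEY are working.
curl -H "Authorization: Bearer $DEEPINFRA_API_KEY" \
  https://api.deepinfra.com/v1/openai/models
```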

#### Viewing Results

The `./localworkspace/` workspace folder will then have both [Dolma](https://github.com/allenai/dolma) and markdown files (if using `--markdown`).
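For a quick look at what was produced, something like the sketch below works; the `results/` and `markdown/` subfolder names are assumptions about the default workspace layout rather than paths stated here, so adjust to what the pipeline actually wrote:

```bash
# Dolma-format JSONL output (assumed location)
ls ./localworkspace/results/
# Per-PDF markdown files, present when --markdown was passed (assumed location)
ls ./localworkspace/markdown/
```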
@@ -249,27 +272,6 @@ For example:

```bash
python -m olmocr.pipeline s3://my_s3_bucket/pdfworkspaces/exampleworkspace --pdfs s3://my_s3_bucket/jakep/gnarly_pdfs/*.pdf --beaker --beaker_gpus 4
```
(Removed: the previous "### Using DeepInfra" section, a verbatim duplicate of the content added above.)

### Using Docker