
import importlib.util
import logging
import subprocess
import sys
logger = logging.getLogger(__name__)
def check_poppler_version():
    """Verify that poppler's `pdftoppm` binary is installed and usable.

    Logs the outcome and exits the process with status 1 when the binary
    is missing or misbehaving.
    """
    try:
        proc = subprocess.run(["pdftoppm", "-h"], stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)
    except FileNotFoundError:
        # The binary is not on PATH at all.
        logger.error("pdftoppm is not installed.")
        logger.error("Check the README in the https://github.com/allenai/olmocr/blob/main/README.md for installation instructions")
        sys.exit(1)

    # pdftoppm prints its usage banner to stderr; a healthy install starts it
    # with the program name.
    healthy = proc.returncode == 0 and proc.stderr.startswith("pdftoppm")
    if healthy:
        logger.info("pdftoppm is installed and working.")
    else:
        logger.error("pdftoppm is installed but returned an error.")
        sys.exit(1)
2025-01-29 15:30:39 -08:00
def check_sglang_version():
    """Exit with installation instructions if the sglang package is absent."""
    sglang_spec = importlib.util.find_spec("sglang")
    if sglang_spec is not None:
        return
    logger.error("Please make sure sglang is installed according to the latest instructions here: https://docs.sglang.ai/start/install.html")
    logger.error("Sglang needs to be installed with a separate command in order to find all dependencies properly.")
    sys.exit(1)
2025-01-29 15:30:39 -08:00
def check_torch_gpu_available(min_gpu_memory: int = 20 * 1024**3):
    """Ensure torch can see a CUDA GPU with at least ``min_gpu_memory`` bytes.

    Args:
        min_gpu_memory: Minimum required total GPU memory in bytes
            (default 20 GiB).

    Raises:
        ImportError: if torch is not installed.
        RuntimeError: if the first CUDA device has less than
            ``min_gpu_memory`` bytes of total memory.
        Exception: whatever torch raises when no usable CUDA device exists.
    """
    try:
        import torch
    except ImportError:
        # Narrowed from a bare `except:`, which would also swallow
        # KeyboardInterrupt/SystemExit.
        logger.error("Pytorch must be installed, visit https://pytorch.org/ for installation instructions")
        raise
    try:
        gpu_memory = torch.cuda.get_device_properties(0).total_memory
        # Explicit check instead of `assert`, which is silently stripped
        # when Python runs with -O.
        if gpu_memory < min_gpu_memory:
            raise RuntimeError(f"Insufficient GPU memory: {gpu_memory} bytes available, {min_gpu_memory} required")
    except Exception:
        logger.error(f"Torch was not able to find a GPU with at least {min_gpu_memory // (1024 ** 3)} GB of RAM.")
        raise
2024-11-01 17:13:11 +00:00
if __name__ == "__main__":
    # Run the dependency checks directly when executed as a script.
    # NOTE(review): the GPU check is intentionally not invoked here — confirm
    # callers that need it run check_torch_gpu_available() themselves.
    check_poppler_version()
    check_sglang_version()