Description
Toolkit for linearizing PDFs for LLM datasets/training
License: Apache-2.0
Repository size: 373 MiB
Languages
Python 87.7%
Shell 6.5%
HTML 5.7%
Dockerfile 0.1%