## Get started
Docling is used by the [Data Prep Kit](https://ibm.github.io/data-prep-kit/), an open-source toolkit for preparing unstructured data for LLM application development, at scales ranging from a laptop to a datacenter.

The Data Prep Kit modules powered by Docling are listed below.
## PDF ingestion to Parquet
- 💻 [PDF-to-Parquet GitHub](https://github.com/IBM/data-prep-kit/tree/dev/transforms/language/pdf2parquet)
- 📖 [PDF-to-Parquet Docs](https://ibm.github.io/data-prep-kit/transforms/language/pdf2parquet/python/)
## Document chunking
- 💻 [Doc Chunking GitHub](https://github.com/IBM/data-prep-kit/tree/dev/transforms/language/doc_chunk)
- 📖 [Doc Chunking Docs](https://ibm.github.io/data-prep-kit/transforms/language/doc_chunk/python/)