docling/docs/concepts/architecture.md
Panos Vagenas 25fd149c38
docs: add architecture outline (#341)
Signed-off-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com>
2024-11-15 12:52:41 +01:00

[Figure: Docling architecture diagram]

In a nutshell, Docling's architecture is outlined in the diagram above.

For each document format, the document converter knows which format-specific backend to employ for parsing the document and which pipeline to use for orchestrating the execution, along with any relevant options.

!!! tip

    While the document converter holds a default mapping, this configuration is parametrizable: for the PDF format, for example, different backends and different pipeline options can be used (see [Usage](../usage.md#adjust-pipeline-features)).
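
The format-to-backend-and-pipeline mapping described above can be illustrated with a small, self-contained toy model. This is only a sketch of the idea, not Docling's actual API: every name in it (`ToyConverter`, `FormatOption`, `Backend`, `SimplePipeline`, `PipelineOptions`, `max_items`) is hypothetical and chosen for illustration.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Type


class Backend(ABC):
    """Hypothetical base class for format-specific parsers."""

    @abstractmethod
    def parse(self, raw: str) -> List[str]:
        ...


class PlainTextBackend(Backend):
    def parse(self, raw: str) -> List[str]:
        # One item per non-empty line.
        return [line.strip() for line in raw.splitlines() if line.strip()]


class UpperCaseBackend(Backend):
    """Alternative backend, used to show overriding the default mapping."""

    def parse(self, raw: str) -> List[str]:
        return [line.strip().upper() for line in raw.splitlines() if line.strip()]


@dataclass
class PipelineOptions:
    max_items: Optional[int] = None  # None means "keep everything"


class SimplePipeline:
    """Orchestrates execution: runs the backend, then applies the options."""

    def __init__(self, options: PipelineOptions):
        self.options = options

    def run(self, backend: Backend, raw: str) -> List[str]:
        items = backend.parse(raw)
        return items[: self.options.max_items]


@dataclass
class FormatOption:
    backend_cls: Type[Backend]
    pipeline_cls: type = SimplePipeline
    options: PipelineOptions = field(default_factory=PipelineOptions)


class ToyConverter:
    # Default mapping from format to backend, pipeline, and options.
    DEFAULTS: Dict[str, FormatOption] = {"txt": FormatOption(PlainTextBackend)}

    def __init__(self, format_options: Optional[Dict[str, FormatOption]] = None):
        # User-supplied entries override the defaults per format.
        self.format_options = {**self.DEFAULTS, **(format_options or {})}

    def convert(self, fmt: str, raw: str) -> List[str]:
        opt = self.format_options[fmt]
        pipeline = opt.pipeline_cls(opt.options)
        return pipeline.run(opt.backend_cls(), raw)


text = "alpha\n\nbeta\ngamma\n"
default_result = ToyConverter().convert("txt", text)
custom = ToyConverter(
    {"txt": FormatOption(UpperCaseBackend, options=PipelineOptions(max_items=2))}
)
custom_result = custom.convert("txt", text)
print(default_result)  # ['alpha', 'beta', 'gamma']
print(custom_result)   # ['ALPHA', 'BETA']
```

The point of the sketch is the shape of the configuration: the converter owns a per-format default, and a caller replaces the backend, the pipeline options, or both for a single format without touching the others.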

The conversion result contains the Docling document, Docling's fundamental document representation.

Typical ways to use a Docling document include calling its export methods directly (for example, to Markdown or to a dictionary) or passing it to a chunker.
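
To make these two usage patterns concrete, here is a minimal toy sketch of a document object with export methods and a chunker that consumes it. The classes (`ToyDocument`, `ToyItem`, `ToyHeadingChunker`) are hypothetical stand-ins rather than Docling's own types, and the heading-based chunking rule is an assumption made purely for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List


@dataclass
class ToyItem:
    kind: str  # "heading" or "paragraph"
    text: str


@dataclass
class ToyDocument:
    """Stand-in for a unified document representation."""

    name: str
    items: List[ToyItem] = field(default_factory=list)

    def export_to_markdown(self) -> str:
        # Headings become "## ..." lines; paragraphs are emitted as-is.
        lines = [
            f"## {it.text}" if it.kind == "heading" else it.text
            for it in self.items
        ]
        return "\n\n".join(lines)

    def export_to_dict(self) -> Dict:
        return {
            "name": self.name,
            "items": [{"kind": it.kind, "text": it.text} for it in self.items],
        }


class ToyHeadingChunker:
    """Groups each heading with the paragraphs that follow it."""

    def chunk(self, doc: ToyDocument) -> Iterator[str]:
        current: List[str] = []
        for it in doc.items:
            # A new heading closes the chunk collected so far.
            if it.kind == "heading" and current:
                yield "\n".join(current)
                current = []
            current.append(it.text)
        if current:
            yield "\n".join(current)


doc = ToyDocument(
    "demo",
    [
        ToyItem("heading", "Intro"),
        ToyItem("paragraph", "Docling converts documents."),
        ToyItem("heading", "Usage"),
        ToyItem("paragraph", "Export or chunk the result."),
    ],
)
markdown = doc.export_to_markdown()
as_dict = doc.export_to_dict()
chunks = list(ToyHeadingChunker().chunk(doc))
print(markdown)
print(chunks)
```

Both patterns start from the same document object: exports serialize it whole, while a chunker walks its structure and yields smaller pieces for downstream use.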

For more details on Docling's architecture, check out the Docling Technical Report.

!!! note

    Components drawn with a dashed outline are base classes that can be subclassed for specialized implementations.