Maxim Lysak 2f72167ff6
feat: updated vlm pipeline (with latest changes from docling-core) (#1158)
* Draft implementation of Doctag backend

Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>

* Updated VLM pipeline doctags to docling conversion, now properly supports lists

Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>

* preparing to migrate to new doctags deserializer

Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>

* re-using DocTagsDocument.from_doctags_and_image_pairs

Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>

* satisfying mypy and other checks

Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>

* Added support for force_backend_text parameter

Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>

* removed unnecessary transformation

Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>

* Cleaned up

Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>

* Update tests

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Updated readme

Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>

---------

Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
Co-authored-by: Maksym Lysak <mly@zurich.ibm.com>
Co-authored-by: Christoph Auer <cau@zurich.ibm.com>
2025-03-18 15:44:51 +01:00
..
2024-08-30 14:08:20 +02:00