unstructured/example-docs/grid_offset_error.docx
Klaijan dab79b0c83
fix: add try/except wrap over row.cells to failproof tc grid_offset (#4033)
This PR fixes an issue with `docx` files that contain
complex/recursive/merged/malformed tables by skipping cells that cannot
be traced back to a valid `<w:tc>` element, which `python-docx` expects,
because of missing or improperly merged rows.

Accessing `row.cells` in such cases can raise a `ValueError` when
`python-docx` fails to resolve the full logical table layout. This PR
wraps those accesses in `try/except` so that problematic rows are skipped
while usable content is still extracted from the rest of the document.
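
A minimal sketch of the skip-on-error pattern described above, not the code from this PR. The helper `iter_rows_with_cells` and the stand-in classes `GoodRow` and `BrokenRow` are hypothetical names used for illustration only; the sketch assumes just that each row exposes a `cells` attribute whose access may raise `ValueError`.

```python
from typing import Any, Iterable, Iterator, List


def iter_rows_with_cells(rows: Iterable[Any]) -> Iterator[List[Any]]:
    """Yield the cell list of each row, skipping rows whose cells cannot be resolved."""
    for row in rows:
        try:
            # On malformed tables this access is where a ValueError can surface.
            cells = list(row.cells)
        except ValueError:
            # Skip the problematic row and keep processing the remaining rows.
            continue
        yield cells


# Hypothetical stand-ins for table rows, so the sketch runs without a .docx file.
class GoodRow:
    def __init__(self, cells: List[str]) -> None:
        self.cells = cells


class BrokenRow:
    @property
    def cells(self) -> List[str]:
        raise ValueError("cell layout could not be resolved")


rows = [GoodRow(["a", "b"]), BrokenRow(), GoodRow(["c"])]
extracted = list(iter_rows_with_cells(rows))
```

With `python-docx`, the same helper could in principle be given `table.rows`; only `ValueError` is caught here, so other errors would still propagate.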
2025-06-30 14:20:18 +00:00

810 KiB