unstructured/test_unstructured
Sebastian Laverde Alfonso e90a979f45
fix: Better logic for setting category_depth metadata for Title elements (#1517)
This PR promotes the `category_depth` metadata for `Title` elements from
`None` to 0 whenever `Headline` and/or `Subheadline` types (which are
also mapped to `Title` elements, with depths 1 and 2 respectively) are
present. A test has been added to `test_common.py` to check the
improvement. More tests showing how this logic fixes the behaviour can be
found in an adapted version of the Colab notebook
[here](https://colab.research.google.com/drive/1LoScFJBYUhkM6X7pMp8cDaJLC_VoxGci?usp=sharing).
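
The promotion rule described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from this PR: `Element`, `NESTED_TITLE_DEPTHS`, and `set_title_depths` are hypothetical names, and the library's own element classes are not used.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Element:
    """Hypothetical stand-in for a parsed element (not the library's class)."""
    source_type: str                      # e.g. "Title", "Headline", "Subheadline"
    category_depth: Optional[int] = None  # None means "no depth assigned"


# Assumed mapping: source types that become Title elements at a nested depth.
NESTED_TITLE_DEPTHS = {"Headline": 1, "Subheadline": 2}


def set_title_depths(elements: List[Element]) -> List[Element]:
    """Assign depths 1 and 2 to Headline/Subheadline, and promote plain
    Title elements from None to 0 only when a nested type is present."""
    has_nested = any(e.source_type in NESTED_TITLE_DEPTHS for e in elements)
    for e in elements:
        if e.source_type in NESTED_TITLE_DEPTHS:
            e.category_depth = NESTED_TITLE_DEPTHS[e.source_type]
        elif e.source_type == "Title" and e.category_depth is None and has_nested:
            e.category_depth = 0
    return elements


doc = set_title_depths([
    Element("Title"),
    Element("Headline"),
    Element("Subheadline"),
    Element("NarrativeText"),
])
depths = [e.category_depth for e in doc]

# With no Headline/Subheadline present, a lone Title keeps depth None.
lone = set_title_depths([Element("Title")])
```

In this sketch, the depth of a plain `Title` depends on what else is in the document, which is why the check for nested types runs once over the whole list before any depth is assigned.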

---------

Co-authored-by: qued <64741807+qued@users.noreply.github.com>
2023-10-05 17:51:06 +00:00