mirror of
https://github.com/Unstructured-IO/unstructured.git
synced 2025-06-27 02:30:08 +00:00

This PR fixes a bug that occurs when `partition` is used on an email with image attachments, with the `hi_res` strategy and table structure inference enabled. Partitioning the image attachment failed with the error `got multiple values for keyword argument 'infer_table_structure'`.

This happens because, when `partition` passes `kwargs` on to partition "other" types of files in this [block](50ea6fe7fc/unstructured/partition/auto.py (L270-L280)), `infer_table_structure` is packaged into `partitioning_kwargs`. For email at least, when there are attachments that can be partitioned with `hi_res`, that dict of `kwargs` is passed right back into the `partition` entry point. So when we get [here](50ea6fe7fc/unstructured/partition/auto.py (L222-L235)), `infer_table_structure` is both specified explicitly and present in the `kwargs` variable.

The fix is to first check whether `kwargs` already contains `infer_table_structure` and, if it does, use that value and pop it from `kwargs`.

---------

Co-authored-by: Kamil Plucinski <kamil.plucinski@deepsense.ai>
Co-authored-by: christinestraub <christinemstraub@gmail.com>
Co-authored-by: ryannikolaidis <1208590+ryannikolaidis@users.noreply.github.com>
Co-authored-by: christinestraub <christinestraub@users.noreply.github.com>
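
The duplicate-keyword failure and the pop-based fix described in this PR can be illustrated with a minimal sketch. The functions below are hypothetical, simplified stand-ins and not the library's actual code: `partition_image_stub`, `partition_buggy`, `partition_fixed`, and the `computed_default` parameter are all names assumed for illustration. In plain Python, passing the same keyword both explicitly and through `**kwargs` raises a `TypeError` at call time.

```python
# Hypothetical stand-ins for illustration only; the real functions in
# unstructured/partition/auto.py have different signatures.

def partition_image_stub(filename, infer_table_structure=False, **kwargs):
    """Pretend file-type partitioner that just reports what it received."""
    return {"filename": filename, "infer_table_structure": infer_table_structure}


def partition_buggy(filename, computed_default=False, **kwargs):
    # The entry point computes its own value and passes it explicitly...
    infer_table_structure = computed_default
    # ...but if kwargs also carries 'infer_table_structure' (as happens when
    # the kwargs dict is passed back in for an attachment), the keyword is
    # supplied twice and the call fails.
    return partition_image_stub(
        filename, infer_table_structure=infer_table_structure, **kwargs
    )


def partition_fixed(filename, computed_default=False, **kwargs):
    # Prefer the value already present in kwargs, and pop it so that the
    # keyword reaches the partitioner exactly once.
    infer_table_structure = kwargs.pop("infer_table_structure", computed_default)
    return partition_image_stub(
        filename, infer_table_structure=infer_table_structure, **kwargs
    )


if __name__ == "__main__":
    try:
        partition_buggy("attachment.png", infer_table_structure=True)
    except TypeError as exc:
        print(f"buggy call failed: {exc}")
    print(partition_fixed("attachment.png", infer_table_structure=True))
```

Using `kwargs.pop` with a fallback keeps the caller's value when one was forwarded and falls back to the locally computed one otherwise, which is the behavior the fix describes.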