Sri Sudarsan 349728162e
Match file name prefixes to detect DOCX, PPTX, and XLSX files instead of exact file names (#3959)
Instead of looking for `word/document.xml`, `ppt/presentation.xml`, and
`xl/workbook.xml` to identify DOCX, PPTX, and XLSX files, we now match
the patterns `word/document*.xml`, `ppt/presentation*.xml`, and
`xl/workbook*.xml`, because some files generated by Office 365 use
different names for these archive members.
Fixes https://github.com/Unstructured-IO/unstructured/issues/3937
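
The matching described above can be sketched roughly as follows. This is a minimal illustration, not the code from this change: the names `OOXML_PATTERNS`, `detect_ooxml_type`, and `make_zip` are hypothetical, and using `fnmatch` for the wildcard match is an assumption about how the patterns could be applied.

```python
import fnmatch
import io
import zipfile
from typing import Dict, Optional

# Hypothetical mapping built from the patterns named in the commit message.
OOXML_PATTERNS: Dict[str, str] = {
    "docx": "word/document*.xml",
    "pptx": "ppt/presentation*.xml",
    "xlsx": "xl/workbook*.xml",
}


def detect_ooxml_type(data: bytes) -> Optional[str]:
    """Return "docx", "pptx", or "xlsx" if a matching member exists, else None."""
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            names = archive.namelist()
    except zipfile.BadZipFile:
        # Not a ZIP archive at all, so it cannot be an OOXML file.
        return None

    for file_type, pattern in OOXML_PATTERNS.items():
        # fnmatchcase keeps the comparison case-sensitive on every platform.
        if any(fnmatch.fnmatchcase(name, pattern) for name in names):
            return file_type
    return None


def make_zip(*member_names: str) -> bytes:
    """Demo helper: build an in-memory ZIP containing empty members."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name in member_names:
            archive.writestr(name, "")
    return buffer.getvalue()


if __name__ == "__main__":
    # An archive whose main part is named document2.xml is still detected.
    print(detect_ooxml_type(make_zip("word/document2.xml")))
```

An exact-name check would miss `word/document2.xml`, while the wildcard pattern still accepts the standard `word/document.xml`.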

---------

Co-authored-by: Yao You <theyaoyou@gmail.com>
2025-03-21 16:27:13 +00:00