fix: initialize uri before try-except (#1690)

Fixes GitHub issue:
https://github.com/Unstructured-IO/unstructured/issues/1686
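
A minimal sketch of the failure mode this commit addresses, assuming the reported error was an unbound local variable: when the action type matches neither branch (for example `GoToR`) and nothing raises, `uri` is never assigned. The function names `extract_uri_unsafe` and `extract_uri_safe` and the plain-dict inputs are hypothetical stand-ins for illustration, not the library's `get_uris_from_annots`.

```python
def extract_uri_unsafe(uri_type, uri_dict):
    """Hypothetical pre-fix shape: `uri` is bound only inside a matching branch."""
    try:
        if uri_type == "/'URI'":
            uri = uri_dict["URI"].decode("utf-8")
        if uri_type == "/'GoTo'":
            uri = uri_dict["D"].decode("utf-8")
    except (KeyError, AttributeError, TypeError, UnicodeDecodeError):
        uri = None
    # If no branch matched and nothing was raised, `uri` was never assigned.
    return uri


def extract_uri_safe(uri_type, uri_dict):
    """Hypothetical post-fix shape: `uri` is bound before any branch is checked."""
    uri = None
    try:
        if uri_type == "/'URI'":
            uri = uri_dict["URI"].decode("utf-8")
        if uri_type == "/'GoTo'":
            uri = uri_dict["D"].decode("utf-8")
    except Exception:
        # Any lookup or decoding failure leaves `uri` as None.
        pass
    return uri


if __name__ == "__main__":
    print(extract_uri_safe("/'GoToR'", {}))
    try:
        extract_uri_unsafe("/'GoToR'", {})
    except UnboundLocalError as err:
        print("unsafe version failed:", type(err).__name__)
```

Initializing the variable up front means every path through the function returns a defined value, so unhandled action types fall through to `None` instead of raising.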
Klaijan 2023-10-10 13:29:10 -04:00 committed by GitHub
parent 3e101d3e4f
commit 1d80beaaf2
2 changed files with 4 additions and 2 deletions


@@ -25,6 +25,7 @@ setting UNSTRUCTURED_INCLUDE_DEBUG_METADATA=true is needed.
* **Fixes badly initialized Formula** Problem: YoloX contains new types of elements. When loading a document that contains formulas, a new element of that class
should be generated; however, the Formula class inherits from Element instead of Text. After this change the element is created with the correct class,
allowing the document to be loaded. Fix: Change the parent class for Formula to Text. Importance: Crucial to be able to load documents that contain formulas.
* **Fixes pdf uri error** An error was raised when a URI of type `GoToR`, which refers to PDF resources outside the current document, was detected, because no condition handled that case. The fix initializes the URI before any condition check.
## 0.10.19


@@ -959,13 +959,14 @@ def get_uris_from_annots(
         uri_dict = try_resolve(annotation_dict["A"])
         uri_type = str(uri_dict["S"])
+        uri = None
         try:
             if uri_type == "/'URI'":
                 uri = try_resolve(try_resolve(uri_dict["URI"])).decode("utf-8")
             if uri_type == "/'GoTo'":
                 uri = try_resolve(try_resolve(uri_dict["D"])).decode("utf-8")
-        except (KeyError, AttributeError, TypeError, UnicodeDecodeError):
-            uri = None
+        except Exception:
+            pass
         points = ((x1, y1), (x1, y2), (x2, y2), (x2, y1))