Update pdfminer_utils.py (#3974)

Fix for 'PSSyntaxError' import error:
"cannot import name 'PSSyntaxError' from 'pdfminer.pdfparser'"

The latest pdfminer.six no longer imports PSSyntaxError into
`pdfminer.pdfparser`. It must now be imported directly from its
source module (`pdfminer.psexceptions`).
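
For code that has to run against both older and newer pdfminer.six releases, a fallback import is one possible alternative to pinning a single location. The sketch below is illustrative only and is not what this commit does (the commit simply switches the import). `import_first_available` is a hypothetical helper, not part of pdfminer.six or unstructured, and the two module paths are the ones named in the message above.

```python
import importlib
from typing import Any, Sequence


def import_first_available(name: str, modules: Sequence[str]) -> Any:
    """Return attribute `name` from the first module in `modules` that provides it.

    Hypothetical helper for illustration; modules are tried in the given order.
    """
    for module_name in modules:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue  # module (or its parent package) is not installed
        if hasattr(module, name):
            return getattr(module, name)
    raise ImportError(f"cannot import name {name!r} from any of {list(modules)}")


# Prefer the new location, fall back to the old one named in the error message.
try:
    PSSyntaxError = import_first_available(
        "PSSyntaxError",
        ("pdfminer.psexceptions", "pdfminer.pdfparser"),
    )
except ImportError:
    PSSyntaxError = None  # pdfminer.six is not installed in this environment
```

Listing the new module first means current releases resolve on the first attempt, and the older path is only consulted when the new module is missing.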
Nathan 2025-04-08 17:47:24 +10:00 committed by GitHub
parent d570f4624b
commit 27f503ce31
2 changed files with 2 additions and 1 deletions


@@ -19,6 +19,7 @@ This makes it impossible to write stable unit tests, for example, or to obtain r
### Fixes
- **Removed out of date ubuntu Dockerfile.** The Dockerfile was out of date and non-functional.
+- **Fix for 'PSSyntaxError' import error: "cannot import name 'PSSyntaxError' from 'pdfminer.pdfparser'"** PSSyntaxError needed to be imported from its source 'pdfminer.psexceptions'.
## 0.17.4


@@ -6,7 +6,7 @@ from pdfminer.converter import PDFPageAggregator
from pdfminer.layout import LAParams, LTContainer, LTImage, LTItem, LTTextLine
from pdfminer.pdfinterp import PDFPageInterpreter, PDFResourceManager
from pdfminer.pdfpage import PDFPage
-from pdfminer.pdfparser import PSSyntaxError
+from pdfminer.psexceptions import PSSyntaxError
from pydantic import BaseModel
from unstructured.logger import logger