Migrate to modern bs4 interface (#4025)

## PR Summary
This small PR fixes the bs4 deprecation warnings that appear in the
[CI logs](https://github.com/Unstructured-IO/unstructured/actions/runs/15491657572/job/43729960936#step:3:2615):
```text
/app/unstructured/metrics/table/table_extraction.py:53: DeprecationWarning: Call to deprecated method findAll. (Replaced by find_all) -- Deprecated since version 4.0.0.
/app/unstructured/metrics/table/table_extraction.py:57: DeprecationWarning: Call to deprecated method findAll. (Replaced by find_all) -- Deprecated since version 4.0.0.
```
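For reference, the replacement can be sketched as follows. This is a minimal illustration, not the project's code: `table_cells` and the sample `HTML` string are hypothetical names introduced here, and the snippet assumes `beautifulsoup4` is installed. It calls `find_all()` the same way the patched lines do and records any warnings raised along the way, so a leftover `findAll()` call would show up as a `DeprecationWarning`.

```python
import warnings

from bs4 import BeautifulSoup

# Hypothetical sample input, used only for this illustration.
HTML = """
<table>
  <tr><th>Name</th><th>Qty</th></tr>
  <tr><td>apple</td><td>3</td></tr>
</table>
"""


def table_cells(content: str) -> list[list[str]]:
    """Return the text of every table cell, row by row (hypothetical helper)."""
    soup = BeautifulSoup(content, "html.parser")
    table = soup.find("table")
    if table is None:
        return []
    # find_all() is the snake_case replacement for the deprecated findAll().
    rows = table.find_all("tr")
    return [
        [cell.get_text(strip=True) for cell in row.find_all(["th", "td"])]
        for row in rows
    ]


# Record every warning raised while parsing so deprecations can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    cells = table_cells(HTML)

deprecations = [w for w in caught if issubclass(w.category, DeprecationWarning)]

print(cells)
print(f"deprecation warnings raised: {len(deprecations)}")
```

Swapping either `find_all` call back to `findAll` should reproduce a warning like the ones in the log on bs4 releases that emit it.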

---------

Signed-off-by: Emmanuel Ferdman <emmanuelferdman@gmail.com>
Emmanuel Ferdman 2025-06-16 21:44:20 +03:00 committed by GitHub
parent 6ef2fc1ec6
commit 531490d013
3 changed files with 5 additions and 4 deletions


@@ -1,4 +1,4 @@
-## 0.17.11-dev2
+## 0.17.11-dev3
 ### Enhancements
@@ -8,6 +8,7 @@
 - Fix type error when `result_file_type` is expected to be a `FileType` but is `None`
 - Fix chunking for elements with None text that has AttributeError 'NoneType' object has no attribute 'strip'.
 - Invalid elements IDs are not visible in VLM output. Parent-child hierarchy is now retrieved based on unstructured element ID, instead of id injected into HTML code of element.
+- Fix bs4 deprecation warnings by updating `findAll()` with `find_all()`.
 ## 0.17.10
 - Drop Python 3.9 support as it reaches EOL in October 2025


@@ -1 +1 @@
-__version__ = "0.17.11-dev2" # pragma: no cover
+__version__ = "0.17.11-dev3" # pragma: no cover


@@ -50,11 +50,11 @@ def html_table_to_deckerd(content: str) -> List[Dict[str, Any]]:
     soup = BeautifulSoup(content, "html.parser")
     table = soup.find("table")
-    rows = table.findAll(["tr"])
+    rows = table.find_all(["tr"])
     table_data = []
     for i, row in enumerate(rows):
-        cells = row.findAll(["th", "td"])
+        cells = row.find_all(["th", "td"])
         for j, cell_data in enumerate(cells):
             cell = {
                 "y": i,