haystack/releasenotes/notes/trafilatura-html-conversion-e9b9044d31fec794.yaml
Stefano Fiorucci 7181f6b7e9
feat: change HTML conversion backend from boilerpy3 to Trafilatura (#7705)
* change HTML conversion backend to Trafilatura

* rm unused var
2024-05-17 10:38:47 +02:00

---
enhancements:
  - |
    `HTMLToDocument`: change the HTML conversion backend from `boilerpy3` to `trafilatura`,
    which is more robust and better maintained.
deprecations:
  - |
    The following parameters of `HTMLToDocument` are ignored and will be removed in Haystack 2.4.0:
    `extractor_type` and `try_others`.