mirror of
https://github.com/deepset-ai/haystack.git
synced 2025-07-24 17:30:38 +00:00

* feat: add unicode normalization & ascii_only mode for DocumentCleaner.
* feat: add unicode_normalization parameter validation to DocumentCleaner.
* test: fix the unit test to work after code linting.
7 lines
318 B
YAML
---
enhancements:
  - |
    Added the `unicode_normalization` parameter to the DocumentCleaner, which normalizes the text to one of the Unicode forms NFC, NFD, NFKC, or NFKD.
  - |
    Added the `ascii_only` parameter to the DocumentCleaner, which converts letters with diacritics to their ASCII equivalents and removes all other non-ASCII characters.
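
The behavior described by the two notes above can be approximated with Python's standard `unicodedata` module. This is a minimal sketch, not the DocumentCleaner implementation: the helper names `normalize_unicode` and `to_ascii_only` are hypothetical, and the use of NFKD decomposition followed by an ASCII encode for the `ascii_only` case is an assumption about one reasonable way to get the stated effect.

```python
import unicodedata

# Normalization forms named in the release note.
VALID_FORMS = ("NFC", "NFD", "NFKC", "NFKD")


def normalize_unicode(text: str, form: str) -> str:
    """Hypothetical helper: normalize `text` to one of the four Unicode forms."""
    if form not in VALID_FORMS:
        # Mirrors the idea of parameter validation; the real error message may differ.
        raise ValueError(f"form must be one of {VALID_FORMS}, got {form!r}")
    return unicodedata.normalize(form, text)


def to_ascii_only(text: str) -> str:
    """Hypothetical helper: map accented letters to ASCII and drop the rest.

    Assumption: NFKD splits a letter such as 'é' into 'e' plus a combining
    accent; encoding to ASCII with errors="ignore" then drops the accent and
    any character that has no ASCII decomposition.
    """
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


if __name__ == "__main__":
    composed = normalize_unicode("e\u0301", "NFC")  # 'e' + combining acute accent
    print(len(composed))                # the two code points compose into one
    print(to_ascii_only("naïve café"))  # diacritics are stripped
```

Note that with this approach characters without an ASCII decomposition (for example CJK text) are removed entirely rather than transliterated, which matches the "removing other non-ASCII characters" wording of the note.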