haystack/docs/pydoc/config/preprocessors_api.yml
Sebastian Husch Lee 19cf220136
feat: integrate two ready-made SuperComponents from haystack-experimental (#9235)
* Add super component decorator

* Add reno

* MultiFileConverter

* Add DocumentPreprocessor

* Add reno

* Add tests and change doc preprocessor to split first then clean

* Remove code from merge

* Add to pydoc and missing test file

* PR comments

* Lint fix

* Fix mypy

* Fix mypy

* Add comment

* PR comments

* Update haystack/components/converters/multi_file_converter.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* Update haystack/components/preprocessors/document_preprocessor.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* Update haystack/components/preprocessors/document_preprocessor.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* Update haystack/components/preprocessors/document_preprocessor.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* Update haystack/components/preprocessors/document_preprocessor.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* Update haystack/components/preprocessors/document_preprocessor.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* Update haystack/components/preprocessors/document_preprocessor.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* Update haystack/components/preprocessors/document_preprocessor.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* Update haystack/components/preprocessors/document_preprocessor.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* Update haystack/components/preprocessors/document_preprocessor.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* Update haystack/components/preprocessors/document_preprocessor.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* Update haystack/components/converters/multi_file_converter.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* PR comments

* PR comment

---------

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
2025-04-17 10:02:26 +00:00

36 lines
1.0 KiB
YAML

loaders:
- type: haystack_pydoc_tools.loaders.CustomPythonLoader
search_path: [../../../haystack/components/preprocessors]
modules: [
"csv_document_cleaner",
"csv_document_splitter",
"document_cleaner",
"document_preprocessor",
"document_splitter",
"hierarchical_document_splitter",
"recursive_splitter",
"text_cleaner"]
ignore_when_discovered: ["__init__"]
processors:
- type: filter
expression:
documented_only: true
do_not_filter_modules: false
skip_empty_modules: true
- type: smart
- type: crossref
renderer:
type: haystack_pydoc_tools.renderers.ReadmeCoreRenderer
excerpt: Preprocess your Documents and texts. Clean, split, and more.
category_slug: haystack-api
title: PreProcessors
slug: preprocessors-api
order: 100
markdown:
descriptive_class_title: false
classdef_code_block: false
descriptive_module_title: true
add_method_class_prefix: true
add_member_class_prefix: false
filename: preprocessors_api.md