haystack/docs/v1.3.0/_src/api/api/file_classifier.md
Julian Risch bf71f03ff2
release v1.3.0 and re-add Makefile (#2354)
2022-03-23 17:22:06 +01:00

Module file_type

FileTypeClassifier

class FileTypeClassifier(BaseComponent)

Route files in an Indexing Pipeline to corresponding file converters.

__init__

def __init__(supported_types: List[str] = DEFAULT_TYPES)

A node that sends each file to a different output edge depending on its extension.

Arguments:

  • supported_types: the file types that this node can distinguish. The node is limited to a maximum of 10 outgoing edges, each of which corresponds to one file extension. By default, these extensions are txt, pdf, md, docx, and html. Lists containing more than 10 elements are not allowed. Lists with duplicate elements are also rejected.
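The two constraints on supported_types can be illustrated with a small standalone sketch. This is not the library's implementation: the helper name `validate_supported_types`, the constant `MAX_EDGES`, the error messages, and the choice of `ValueError` are assumptions made for illustration. Only the limit of 10 entries, the duplicate check, and the default extensions come from the description above.

```python
from typing import List

# Default extensions as listed in the argument description.
DEFAULT_TYPES = ["txt", "pdf", "md", "docx", "html"]

# Documented upper bound on outgoing edges (the constant name is hypothetical).
MAX_EDGES = 10


def validate_supported_types(supported_types: List[str]) -> List[str]:
    """Hypothetical helper: enforce the two documented constraints."""
    # More than 10 entries would need more than 10 outgoing edges.
    if len(supported_types) > MAX_EDGES:
        raise ValueError(f"supported_types can have at most {MAX_EDGES} values.")
    # Each extension must map to exactly one edge, so duplicates are rejected.
    if len(set(supported_types)) != len(supported_types):
        raise ValueError("supported_types cannot contain duplicate values.")
    return list(supported_types)
```

Under these assumptions, passing `DEFAULT_TYPES` returns the list unchanged, while `["txt", "txt"]` or a list of 11 extensions raises an error.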

run

def run(file_paths: Union[Path, List[Path], str, List[str], List[Union[Path, str]]])

Sends out files on a different output edge depending on their extension.

Arguments:

  • file_paths: the path or list of paths to route to different output edges.
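The routing idea can be sketched without the library as a lookup from file extension to edge number. The helper `edges_for` is hypothetical, and several details are assumptions rather than documented behavior: edges are numbered from 1 in the order of supported_types, matching is case-sensitive, and an unsupported extension raises `ValueError`. The sketch also shows how a single path or a list of `Path` and `str` values, as accepted by the run signature, might be normalized.

```python
from pathlib import Path
from typing import List, Union

PathLike = Union[Path, str]


def edges_for(
    file_paths: Union[PathLike, List[PathLike]],
    supported_types: List[str],
) -> List[int]:
    """Hypothetical helper: return an edge number for each given path."""
    # Accept a single path as well as a list of paths.
    if not isinstance(file_paths, list):
        file_paths = [file_paths]

    edges = []
    for raw in file_paths:
        # Path.suffix includes the leading dot, e.g. ".pdf".
        extension = Path(raw).suffix.lstrip(".")
        if extension not in supported_types:
            raise ValueError(f"Unsupported file type: '{extension}'")
        # Assumption: edge numbers start at 1 and follow the list order.
        edges.append(supported_types.index(extension) + 1)
    return edges
```

With the default extensions, `edges_for("report.pdf", ["txt", "pdf", "md", "docx", "html"])` yields `[2]` under this numbering, since pdf is the second entry.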