haystack/docs/_src/api/api/file_classifier.md
Branden Chan caf1336424
Adjust pydoc markdown config so methods shown with classes (#2511)
* add_member_class_prefix: true

* Update Documentation & Code Style

* Trigger redeploy

* Trigger redeploy

* Fix pydoc param

* Update Documentation & Code Style

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-05-06 16:00:08 +02:00

1.1 KiB

Module file_type

FileTypeClassifier

class FileTypeClassifier(BaseComponent)

Route files in an Indexing Pipeline to corresponding file converters.

FileTypeClassifier.__init__

def __init__(supported_types: List[str] = DEFAULT_TYPES)

Node that sends out files on a different output edge depending on their extension.

Arguments:

  • supported_types: the file types that this node can distinguish. Note that it's limited to a maximum of 10 outgoing edges, which correspond each to a file extension. Such extension are, by default txt, pdf, md, docx, html. Lists containing more than 10 elements will not be allowed. Lists with duplicate elements will also be rejected.

FileTypeClassifier.run

def run(file_paths: Union[Path, List[Path], str, List[str], List[Union[Path, str]]])

Sends out files on a different output edge depending on their extension.

Arguments:

  • file_paths: paths to route on different edges.