<a id="file_type"></a>

# Module file\_type

<a id="file_type.FileTypeClassifier"></a>

## FileTypeClassifier

```python
class FileTypeClassifier(BaseComponent)
```

Route files in an Indexing Pipeline to corresponding file converters.

<a id="file_type.FileTypeClassifier.__init__"></a>

#### FileTypeClassifier.\_\_init\_\_

```python
def __init__(supported_types: List[str] = DEFAULT_TYPES)
```

Node that sends out files on a different output edge depending on their extension.

**Arguments**:

- `supported_types`: The file types that this node can distinguish.
The node is limited to a maximum of 10 outgoing edges, each of which
corresponds to one file extension. The default extensions are
`txt`, `pdf`, `md`, `docx`, and `html`. Lists containing more than 10
elements are not allowed. Lists with duplicate elements are
also rejected.

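
The two constraints on `supported_types` (at most 10 entries, no duplicates) can be illustrated with a minimal standalone sketch. This is not the library's implementation: `validate_supported_types` and `MAX_EDGES` are hypothetical names introduced here, and raising `ValueError` is an assumption about how a rejection might be reported.

```python
from typing import List

# Default extensions as listed in the documentation above.
DEFAULT_TYPES = ["txt", "pdf", "md", "docx", "html"]

# Hypothetical constant for the documented limit of 10 outgoing edges.
MAX_EDGES = 10


def validate_supported_types(supported_types: List[str] = DEFAULT_TYPES) -> List[str]:
    """Hypothetical helper that enforces the two documented constraints."""
    # Each file type needs its own outgoing edge, and only 10 edges exist.
    if len(supported_types) > MAX_EDGES:
        raise ValueError(
            f"At most {MAX_EDGES} file types are allowed, got {len(supported_types)}."
        )
    # A duplicate would make the extension-to-edge mapping ambiguous.
    if len(set(supported_types)) != len(supported_types):
        raise ValueError("supported_types must not contain duplicates.")
    # Return a copy so later changes by the caller do not affect the node.
    return list(supported_types)
```

Under these assumptions, `validate_supported_types(["txt", "txt"])` raises an error, while a call without arguments returns the five default extensions.
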
<a id="file_type.FileTypeClassifier.run"></a>

#### FileTypeClassifier.run

```python
def run(file_paths: Union[Path, List[Path], str, List[str], List[Union[Path, str]]])
```

Sends out files on a different output edge depending on their extension.

**Arguments**:

- `file_paths`: The path or list of paths to route to an output edge.

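
Routing by extension can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the node's actual code: the function name `route_by_extension`, the edge naming scheme `output_1`, `output_2`, ..., the lowercasing of extensions, and the rejection of batches with mixed extensions are all assumptions made for this example.

```python
from pathlib import Path
from typing import Dict, List, Tuple, Union

# Default extensions as listed in the documentation above.
SUPPORTED_TYPES = ["txt", "pdf", "md", "docx", "html"]

PathLike = Union[Path, str]


def route_by_extension(
    file_paths: Union[PathLike, List[PathLike]],
    supported_types: List[str] = SUPPORTED_TYPES,
) -> Tuple[Dict[str, List[Path]], str]:
    """Hypothetical sketch: pick one output edge for a batch of files."""
    # Accept a single path as well as a list, mirroring the run() signature.
    if not isinstance(file_paths, list):
        file_paths = [file_paths]
    paths = [Path(p) for p in file_paths]

    # Assumption: every file in one call must share the same extension.
    extensions = {p.suffix.lstrip(".").lower() for p in paths}
    if len(extensions) != 1:
        raise ValueError(f"Expected exactly one extension, got {sorted(extensions)}.")
    extension = extensions.pop()

    if extension not in supported_types:
        raise ValueError(f"Unsupported file type: '{extension}'.")

    # Assumption: edges are numbered from 1 in the order of supported_types.
    edge = f"output_{supported_types.index(extension) + 1}"
    return {"file_paths": paths}, edge
```

With the default list, a call such as `route_by_extension("report.pdf")` selects the second edge in this sketch, because `pdf` is the second entry of `SUPPORTED_TYPES`.
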