haystack/docs/_src/api/api/file_classifier.md


<a id="file_type"></a>
# Module file\_type
<a id="file_type.FileTypeClassifier"></a>
## FileTypeClassifier
```python
class FileTypeClassifier(BaseComponent)
```
Routes files in an indexing pipeline to their corresponding file converters.
<a id="file_type.FileTypeClassifier.__init__"></a>
#### FileTypeClassifier.\_\_init\_\_
```python
def __init__(supported_types: List[str] = DEFAULT_TYPES)
```
Node that sends out files on a different output edge depending on their extension.
**Arguments**:
- `supported_types`: The file types that this node can distinguish between.
The default values are: `txt`, `pdf`, `md`, `docx`, and `html`.
Lists with duplicate elements are not allowed.
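
The duplicate check described above could look roughly like the following. This is a minimal sketch, not the library's implementation: `validate_supported_types` is a hypothetical helper, and `DEFAULT_TYPES` is reconstructed here from the default values listed above.

```python
from typing import List

# Assumption: reconstructed from the defaults documented above.
DEFAULT_TYPES = ["txt", "pdf", "md", "docx", "html"]


def validate_supported_types(supported_types: List[str] = DEFAULT_TYPES) -> List[str]:
    """Hypothetical helper: reject lists that contain duplicate file types."""
    if len(set(supported_types)) != len(supported_types):
        # Collect the offending entries so the error message is actionable.
        duplicates = sorted({t for t in supported_types if supported_types.count(t) > 1})
        raise ValueError(f"supported_types contains duplicates: {duplicates}")
    # Return a copy so callers cannot mutate the shared default list.
    return list(supported_types)
```

Because each supported type presumably corresponds to its own output edge, a duplicate entry would make the routing ambiguous, which is a plausible reason for rejecting such lists up front.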
<a id="file_type.FileTypeClassifier.run"></a>
#### FileTypeClassifier.run
```python
def run(file_paths: Union[Path, List[Path], str, List[str], List[Union[Path,
str]]])
```
Sends out files on a different output edge depending on their extension.
**Arguments**:
- `file_paths`: The path or list of paths of the files to route to the output edges.
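
To illustrate the routing idea, here is a self-contained sketch that mimics the behavior described above using only the standard library. It is not the node's actual code: `route_by_extension` is a hypothetical function, and three details are assumptions not stated in this reference: that edges are named `output_N` after the 1-based position of the extension in `supported_types`, that all files in one call must share the same extension, and that unsupported extensions raise a `ValueError`.

```python
from pathlib import Path
from typing import Dict, List, Tuple, Union

# Assumption: reconstructed from the defaults documented for __init__.
DEFAULT_TYPES = ["txt", "pdf", "md", "docx", "html"]

PathLike = Union[Path, str]


def route_by_extension(
    file_paths: Union[PathLike, List[PathLike]],
    supported_types: List[str] = DEFAULT_TYPES,
) -> Tuple[Dict[str, List[Path]], str]:
    """Hypothetical stand-in for the routing step of `run`."""
    # Accept a single path as well as a list of paths.
    if not isinstance(file_paths, list):
        file_paths = [file_paths]
    paths = [Path(p) for p in file_paths]

    # Assumption: every file in one call must have the same extension.
    extensions = {p.suffix.lstrip(".") for p in paths}
    if len(extensions) != 1:
        raise ValueError(f"Expected exactly one extension, got: {sorted(extensions)}")

    extension = extensions.pop()
    if extension not in supported_types:
        raise ValueError(f"Unsupported file type: '{extension}'")

    # Assumption: edge numbering is 1-based and follows the list order.
    edge = supported_types.index(extension) + 1
    return {"file_paths": paths}, f"output_{edge}"
```

Under these assumptions, `route_by_extension("report.pdf")` selects the second edge because `pdf` is the second entry in the default list, so reordering `supported_types` changes which edge each file type uses.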