haystack/releasenotes/notes/named-entity-extractor-component-8fd647ee748892ca.yaml
Madeesh Kannan e6d6ce1c73
feat: Add NamedEntityExtractor component (#6689)
* feat: Add `NamedEntityExtractor` component

This component accepts a list of `Document`s which it annotates with named entities. The annotations are stored in the `meta` dictionary of each `Document` under a specific key.

The component currently supports two backends for the annotation models: Hugging Face `transformers` and spaCy.
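
The annotate-and-store flow described above can be sketched without either backend installed. This is a minimal illustration, not the component's actual implementation: `Document` and `Annotation` here are toy stand-ins, `toy_backend` is a hypothetical regex placeholder for a real NER model, and the only detail taken from this change is that annotations end up in `meta` under the `named_entities` key.

```python
import re
from dataclasses import dataclass, field
from typing import Any, Dict, List

# Key under which annotations are stored in each document's meta dict.
META_KEY = "named_entities"


@dataclass
class Document:
    """Toy stand-in for a Document: text plus a free-form meta dict."""
    content: str
    meta: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Annotation:
    """Hypothetical annotation record: a label and a character span."""
    entity: str
    start: int
    end: int


def toy_backend(text: str) -> List[Annotation]:
    """Placeholder for a real NER model: tags runs of capitalized words."""
    pattern = re.compile(r"[A-Z][a-z]+(?: [A-Z][a-z]+)*")
    return [Annotation("MISC", m.start(), m.end()) for m in pattern.finditer(text)]


def annotate(documents: List[Document]) -> List[Document]:
    """Annotate each document and store the result under META_KEY."""
    for doc in documents:
        doc.meta[META_KEY] = toy_backend(doc.content)
    return documents


docs = annotate([Document("Ada Lovelace lived in London.")])
# Recover the annotated text from the stored character offsets.
spans = [docs[0].content[a.start:a.end] for a in docs[0].meta[META_KEY]]
```

Storing character offsets rather than copied strings keeps the annotations small and lets callers slice the original text themselves, as the last line does.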

* Address comments

* Expand release note

* Add the `[torch]` extra package specifier to the lazy import

* Remove dead code

---------

Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2024-01-09 17:56:20 +01:00

---
features:
  - |
    Added a new extractor component, NamedEntityExtractor. It accepts a list of Documents as input, annotates the raw text of each document with named entities, and stores the annotations in that document's meta dictionary under the key named_entities.
    The component is designed to support multiple NER backends; two are currently implemented: Hugging Face and spaCy. The Hugging Face backend works with any Hugging Face model that supports token classification, and the spaCy backend works with any spaCy model that supports NER.