haystack/requirements.txt
Lalit Pagaria f46b09c756
Using text hash as id to prevent document duplication (#1000)
* using text hash as id to prevent document duplication. Also providing a way to customize it.

* Add latest docstring and tutorial changes

* Fixing duplicate value test when text is same

* Adding test for duplicate ids in document store

* Changing exception to generic Exception type

* Add exception for in-memory document store. Update Document docstring. Remove id_hash_keys from object attributes

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>
2021-05-17 17:51:52 +02:00
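
The commit message above describes deriving a document id from a hash of its text, with `id_hash_keys` as the customization point and a generic `Exception` on duplicate ids. The following is a minimal sketch of that idea, not the code from this commit: the helper names `make_document_id` and `write_documents`, the dict-based store, and the use of `hashlib.sha256` are all assumptions made for illustration (the requirements below list `mmh3`, which may be what the project actually hashes with).

```python
import hashlib


def make_document_id(doc, id_hash_keys=None):
    """Hypothetical helper: build a deterministic id from selected fields.

    By default only the "text" field is hashed, so two documents with the
    same text get the same id regardless of their other fields.
    """
    keys = id_hash_keys or ["text"]
    payload = ":".join(str(doc.get(key, "")) for key in keys)
    # Assumption: sha256 stands in for whatever hash the project uses.
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def write_documents(store, docs, id_hash_keys=None):
    """Hypothetical in-memory store (a plain dict keyed by document id)."""
    for doc in docs:
        doc_id = make_document_id(doc, id_hash_keys)
        if doc_id in store:
            # Generic exception type, as the commit message mentions.
            raise Exception(f"Duplicate document id: {doc_id}")
        store[doc_id] = doc
```

Passing a longer key list, for example `["text", "meta"]`, lets documents with identical text but different metadata coexist under distinct ids.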

farm==0.7.1
--find-links=https://download.pytorch.org/whl/torch_stable.html
fastapi
uvicorn
gunicorn
pandas
scikit-learn
psycopg2-binary; sys_platform != 'win32' and sys_platform != 'cygwin'
elasticsearch>=7.7,<=7.10
elastic-apm
tox
coverage
langdetect # for PDF conversions
# optional: sentence-transformers
python-multipart
python-docx
sqlalchemy>=1.4.2
sqlalchemy_utils
# for using FAISS with GPUs, install faiss-gpu
faiss-cpu>=1.6.3
tika
uvloop==0.14; sys_platform != 'win32' and sys_platform != 'cygwin'
httptools
nltk
more_itertools
networkx
# Refer to the Milvus version support matrix at https://github.com/milvus-io/pymilvus#install-pymilvus
pymilvus
# Optional: For crawling
#selenium
#webdriver-manager
SPARQLWrapper
mmh3