Timo Moeller
6892955e95
Add execute permissions ( #1666 )
2021-10-27 17:35:34 +02:00
Timo Moeller
6da2c73611
Add nltk download, add folder for file upload ( #1633 )
2021-10-22 16:03:33 +02:00
Malte Pietsch
bb9ec90d3c
Fix tesseract installation in Dockerfile ( #1405 )
...
* Fix Dockerfile
* Update Dockerfile-GPU
2021-09-02 11:09:30 +02:00
Shahrukh Khan
4822536886
Add ImageToTextConverter and PDFToTextOCRConverter that utilize OCR ( #1349 )
...
* add image.py converter
* add PDFtoImageConverter
* add init to PDFtoImageConverter and classes to __init__
* update imagetotext pipeline
* update imagetotext pipeline
* update imagetotext pipeline
* update imagetotext pipeline
* update imagetotext pipeline
* update imagetotext pipeline
* update imagetotext pipeline
* revert change in base.py in file_conv
* Update base.py
* Update pdf.py
* add ocr file_converter testcase & update dockerfile
* fix tesseract exception message typo
* fix _image_to_text docstring
* add tesseract installation to CI
* add tesseract installation to CI
* add content test for PDF OCR converter
* update PDFToTextOCRConverter constructor docstring
* replace image files with tmp paths for image.py convert
* replace image files with tmp paths for image.py convert
* Update README.md
Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>
2021-09-01 16:42:25 +02:00
Alvise Sembenico
6326cf5710
🐳 add PDF converter dependencies to Docker ( #1107 )
2021-05-31 19:01:02 +02:00
oryx1729
406f7fa679
Disable Gunicorn preload option ( #960 )
2021-04-12 12:46:52 +02:00
oryx1729
6d00eff796
Add PDF converter in Dockerfiles ( #877 )
2021-03-08 09:55:11 +01:00
Malte Pietsch
46530e86f8
Fix sentencepiece dependency in dockerfiles ( #553 )
2020-11-05 12:01:27 +01:00
Guillim
7a43d1a72d
Update readme path in Dockerfile ( #537 )
...
* Update Dockerfile
forgot to change the extension, I believe
* Update Dockerfile
* Update Dockerfile-GPU
2020-11-03 10:19:18 +01:00
Malte Pietsch
a92ca04648
Update GPU docker & fix race condition with multiple workers ( #436 )
...
* fix gpu CMD and set tag to latest
* update dockerfiles. resolve race condition of index creation with multiple workers
* update dockerfiles for preload. remove try catch for elastic index creation
* add back try/catch. disable multiproc in default config to comply with --preload of gunicorn
* change to pip3 for GPU dockerfile
* remove --preload for gpu
2020-09-29 21:12:44 +02:00
Malte Pietsch
9727829cc6
Rename and restructure modules (database, indexing, schemas) ( #379 )
...
* rename database to documentstore
* move document, label, multilabel to haystack/schema.py
* rename documentstore -> document_store
* split indexing modules -> file_converter + preprocessor
* fix order of imports
* Update tutorial notebooks
* fix torch version in tutorial 4
2020-09-16 18:33:23 +02:00
Malte Pietsch
4da480aa15
Fix dockerfiles
2020-07-16 15:58:49 +02:00
Guillim
c45d54959f
Fix Dockerfile to build successfully without models directory ( #210 )
2020-07-08 17:12:20 +02:00
Guillim
8a616dae75
Adjust Docker and REST API to allow TransformersReader Class ( #180 )
2020-07-07 16:25:36 +02:00
Guillim
27b8c98227
Fix rest api in Docker image after refactoring ( #178 )
2020-06-26 17:52:46 +02:00
Tanay Soni
51a3851f93
Update Dockerfiles to use Gunicorn for deployment ( #69 )
2020-04-21 16:14:51 +02:00
Malte Pietsch
76c5c1d6aa
Improve deployment of REST API (Configs, logging, minor bugs) ( #40 )
...
* remove env variables from dockerfiles
* add more config options to rest api. make fields optional. change to elasticsearch as default
* skip reader if retriever doesn't return anything
* add more config params to farm reader. fix top_k_per_sample
* update FARM version
2020-03-18 12:26:13 +01:00
Malte Pietsch
2164e8550f
Add gpu dockerfile, improve logging, fix minor bug with filtering ( #36 )
...
* add gpu dockerfile. improve logging. fix minor bug with filtering
* fix path
2020-03-12 18:30:42 +01:00
Malte Pietsch
eee2676cb0
update docker for fastAPI
2020-02-28 17:49:08 +01:00
Malte Pietsch
3367b46348
switch name from farm_haystack to haystack
2019-11-27 13:56:03 +01:00
Tanay Soni
f5921548ba
Initial Commit
2019-11-14 11:42:51 +01:00