25 Commits

Author SHA1 Message Date
Sara Zan
be8f50c9e3
Add DELETE /feedback for testing and make the label's id generate server-side (#2159)
* Add DELETE /feedback for testing and make the ID generate server-side

* Make sure to delete only user generated labels

* Reduce fixture scope, was too broad

* Make test a bit more generic

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-02-14 11:43:26 +01:00
Sara Zan
40328a57b6
Introduce pylint & other improvements on the CI (#2130)
* Make mypy check also ui and rest_api, fix ui

* Remove explicit type packages from extras, mypy now downloads them

* Make pylint and mypy run on every file except tests

* Rename tasks

* Change cache key

* Fix mypy errors in rest_api

* Normalize python versions to avoid cache misses

* Add all exclusions to make pylint pass

* Run mypy on rest_api and ui as well

* test if installing the package really changes outcome

* Comment out installation of packages

* Experiment: randomize tests

* Add fallback installation steps on cache misses

* Remove randomization

* Add comment on cache

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-02-09 18:27:12 +01:00
Sara Zan
9dc89d2bd2
Fix dependency related build issues in Dockerfiles (#2135)
* Fix a path issue in Dockerfile-GPU

* Fix paths in Dockerfile-GPU

* Add workflow_dispatch to docker build task

* Remove reference to optional component from ui/, not needed anymore

* Move pytorch installation last to avoid replacing it later

* Remove optional import from rest_api too, no longer needed

* Change path in ui/Dockerfile

* ui container works again

* Complete review of import paths

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-02-09 17:35:18 +01:00
Sara Zan
a59bca3661
Apply black formatting (#2115)
* Testing black on ui/

* Applying black on docstores

* Add latest docstring and tutorial changes

* Create a single GH action for Black and docs to reduce commit noise to the minimum, slightly refactor the OpenAPI action too

* Remove comments

* Relax constraints on pydoc-markdown

* Split temporary black from the docs. Pydoc-markdown was obsolete and needs a separate PR to upgrade

* Fix a couple of bugs

* Add a type: ignore that was missing somehow

* Give path to black

* Apply Black

* Apply Black

* Relocate a couple of type: ignore

* Update documentation

* Make Linux CI run after applying Black

* Triggering Black

* Apply Black

* Remove dependency, does not work well

* Manually remove double trailing commas

* Update documentation

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-02-03 13:43:18 +01:00
Sara Zan
d470b9d0bd
Improve dependency management (#1994)
* First attempt at using setup.cfg for dependency management

* Trying the new package on the CI and in Docker too

* Add composite extras_require

* Add the safe_import function for document store imports and add some try-catch statements on rest_api and ui imports

* Fix bug on class import and rephrase error message

* Introduce typing for optional modules and add type: ignore in sparse.py

* Include importlib_metadata backport for py3.7

* Add colab group to extra_requires

* Fix pillow version

* Fix grpcio

* Separate out the crawler as another extra

* Make paths relative in rest_api and ui

* Update the test matrix in the CI

* Add try catch statements around the optional imports too to account for direct imports

* Never mix direct deps with self-references and add ES deps to the base install

* Refactor several paths in tests to make them insensitive to the execution path

* Include tstadel review and re-introduce Milvus1 in the tests suite, to fix

* Wrap pdf conversion utils into safe_import

* Update some tutorials and revert Milvus1 as default for now, see #2067

* Fix mypy config


Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-01-26 18:12:55 +01:00
Fabrice Depaulis
77d52ad215
Base API healthcheck on status code rather than JSON decoding (#1871)
* Base API healthcheck on status code rather than JSON decoding

* Install UI dependencies on the Linux and Windows CI

Co-authored-by: Fabrice Depaulis <fabrice.depaulis@orange.com>
Co-authored-by: ZanSara <sarazanzo94@gmail.com>
2021-12-10 18:05:23 +01:00
Sara Zan
983b20f28d
Demo UI fix debug info (#1846)
* Fix debug info

* Make enter to run work better

* Reintroduce default question in the eval dataset

* Outputting valid json instead of a Python dict
2021-12-06 18:55:39 +01:00
Sara Zan
99365e1d8e
Add backlink below the context, if available in the doc's meta (#1834)
2021-12-02 13:37:23 +01:00
Sara Zan
935689e630
Demo UI add env vars & other small fixes (#1828)
* Add more env vars to the streamlit ui

* Add some more questions to the random ones

* Relax a statuscode check and rename env vars

* Make query error message more descriptive

* Add log message

* Align docker-compose with and without GPU

* Typo in pipeline filename

* Remove prefix from var in docker_compose

* Align docker-compose.yml and add small sleep to the initialized poller to prevent spamming

* Fix the name of the dockerfile used to build the GPU image
2021-11-30 18:11:54 +01:00
Sara Zan
fb511dc4a3
Remove feedback from no-answers (#1827)
* Fix some miscopied code

* Remove feedback from the no-answer, seems the backend can't take it

* Try to raise concurrent requests per worker

* Remove the actual number of workers
2021-11-29 19:42:10 +01:00
Sara Zan
c29f960c47
Fix UI demo feedback (#1816)
* Fix the feedback function of the demo with a workaround

* Some docstring

* Update tests and rename methods in feedback.py

* Fix tests

* Remove operation_ids

* Add a couple of status code checks
2021-11-29 17:03:54 +01:00
Sara Zan
7167a26483
Small fixes to the public demo (#1781)
* Make Streamlit tolerant to haystack not knowing its version, and add a special error for docstore issues

* Add workaround for a Streamlit bug

* Make default filters value an empty dict

* Return more context for each answer in the rest api

* Make the hs_version call non-blocking by adding a very quick timeout

* Add disclaimer on low confidence answer

* Use the no-answer feature of the reader to highlight questions with no good answer
2021-11-22 19:06:08 +01:00
Sara Zan
d81897535e
Public demo (#1747)
* Queries now run only when pressing RUN. File upload hidden. Question is not sent if the textbox is empty.

* Add latest docstring and tutorial changes

* Tidy up: remove needless state, add comments, fix minor bugs

* Had to add results to the status to avoid some bugs in eval mode

* Added 'credits'

* Add footers, update requirements, some random questions for the evaluation

* Add requested changes

* Temporarily roll back the UI to the old GoT dataset

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2021-11-19 11:34:32 +01:00
Sara Zan
bb066c0a2c
Fix for the Streamlit demo (was sending parameters to a non-existing node of the pipeline) (#1620)
2021-10-20 11:55:29 +02:00
Malte Pietsch
caba590576
Fix answer format in ui (#1591)
* fix answer format in ui

* Add latest docstring and tutorial changes

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2021-10-13 16:48:33 +02:00
Malte Pietsch
4a6c9302b3
Redesign primitives - Document, Answer, Label (#1398)
* first draft / notes on new primitives

* wip label / feedback refactor

* rename doc.text -> doc.content. add doc.content_type

* add datatype for content

* remove faq_question_field from ES and weaviate. rename text_field -> content_field in docstores. update tutorials for content field

* update converters for . Add warning for empty

* rename label.question -> label.query. Allow sorting of Answers.

* WIP primitives

* update ui/reader for new Answer format

* Improve Label. First refactoring of MultiLabel. Adjust eval code

* fixed workflow conflict with introducing new one (#1472)

* Add latest docstring and tutorial changes

* make add_eval_data() work again

* fix reader formats. WIP fix _extract_docs_and_labels_from_dict

* fix test reader

* Add latest docstring and tutorial changes

* fix another test case for reader

* fix mypy in farm reader.eval()

* fix mypy in farm reader.eval()

* WIP ORM refactor

* Add latest docstring and tutorial changes

* fix mypy weaviate

* make label and multilabel dataclasses

* bump mypy env in CI to python 3.8

* WIP refactor Label ORM

* WIP refactor Label ORM

* simplify tests for individual doc stores

* WIP refactoring markers of tests

* test alternative approach for tests with existing parametrization

* WIP refactor ORMs

* fix skip logic of already parametrized tests

* fix weaviate behaviour in tests - not parametrizing it in our general test cases.

* Add latest docstring and tutorial changes

* fix some tests

* remove sql from document_store_types

* fix markers for generator and pipeline test

* remove inmemory marker

* remove unneeded elasticsearch markers

* add dataclasses-json dependency. adjust ORM to just store JSON repr

* ignore type as dataclasses_json seems to miss functionality here

* update readme and contributing.md

* update contributing

* adjust example

* fix duplicate doc handling for custom index

* Add latest docstring and tutorial changes

* fix some ORM issues. fix get_all_labels_aggregated.

* update drop flags where get_all_labels_aggregated() was used before

* Add latest docstring and tutorial changes

* add to_json(). add + fix tests

* fix no_answer handling in label / multilabel

* fix duplicate docs in memory doc store. change primary key for sql doc table

* fix mypy issues

* fix mypy issues

* haystack/retriever/base.py

* fix test_write_document_meta[elastic]

* fix test_elasticsearch_custom_fields

* fix test_labels[elastic]

* fix crawler

* fix converter

* fix docx converter

* fix preprocessor

* fix test_utils

* fix tfidf retriever. fix selection of docstore in tests with multiple fixtures / parameterizations

* Add latest docstring and tutorial changes

* fix crawler test. fix ocrconverter attribute

* fix test_elasticsearch_custom_query

* fix generator pipeline

* fix ocr converter

* fix ragenerator

* Add latest docstring and tutorial changes

* fix test_load_and_save_yaml for elasticsearch

* fixes for pipeline tests

* fix faq pipeline

* fix pipeline tests

* Add latest docstring and tutorial changes

* fix weaviate

* Add latest docstring and tutorial changes

* trigger CI

* satisfy mypy

* Add latest docstring and tutorial changes

* satisfy mypy

* Add latest docstring and tutorial changes

* trigger CI

* fix question generation test

* fix ray. fix Q-generation

* fix translator test

* satisfy mypy

* wip refactor feedback rest api

* fix rest api feedback endpoint

* fix doc classifier

* remove relation of Labels -> Docs in SQL ORM

* fix faiss/milvus tests

* fix doc classifier test

* fix eval test

* fixing eval issues

* Add latest docstring and tutorial changes

* fix mypy

* WIP replace dataclasses-json with manual serialization

* Add latest docstring and tutorial changes

* revert to dataclass-json serialization for now. remove debug prints.

* update docstrings

* fix extractor. fix Answer Span init

* fix api test

* keep meta data of answers in reader.run()

* fix meta handling

* address review feedback

* Add latest docstring and tutorial changes

* make document=None for open domain labels

* add import

* fix print utils

* fix rest api

* address review feedback

* Add latest docstring and tutorial changes

* fix mypy

Co-authored-by: Markus Paff <markuspaff.mp@gmail.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2021-10-13 14:23:23 +02:00
Sara Zan
2de5385ac2
Add "API is loading" message in the UI (#1493)
* Create the /initialized endpoint

* Now showing an error message if the connection fails, and a 'Haystack is loading' message while workers are starting up

* Improve the appearance of the various messages

* Newline at the end of file
2021-09-27 16:40:25 +02:00
oryx1729
9dd7c74f4f
Refactor communication between Pipeline Components (#1321)
2021-09-10 11:41:16 +02:00
Julian Risch
eb990c9688
Removing probability field from answers in favor of score field (#1340)
* Removing probability field from reader and from test cases

* Add switch to FARMReader to choose score/probability

* Remove probability field from doc returned by doc store

* Relax assertion testing joined es and dpr predictions

* Use switch for confidence scores also for no_answer

* Add test that checks switching to old answer scores > 10

* Normalize score in elastic doc store and reset reader.md

* Scale weights of JoinDocuments to sum to 1 and adapt test case
2021-08-17 10:27:11 +02:00
Ikram Ali
29e140196b
[pipeline] Allow for batch indexing when using Pipelines fix #1168 (#1231)
* [pipeline] Allow for batch indexing when using Pipelines fix #1168

* [pipeline] Test case fixed fix #1168

* [file_converter] Path.suffix updated #1168

* [file_converter] meta can be one of these three cases:
                 A single dict that is applied to all files
                 One dict for each file being converted
                 None #1168

* [file_converter] mypy error fixed.

* [file_converter] mypy error fixed.

* [rest_api] batch file upload introduced in indexing API.

* [test_case] Test_api file upload parameter name updated.

* [ui] Streamlit file upload parameter updated.
2021-06-30 14:13:46 +02:00
Bhadresh Savani
37a72d2f45
Add File Upload Functionality in UI (#995)
2021-04-30 10:46:30 +02:00
Markus Paff
cf8a622e35
Streamlit UI Evaluation mode (#920)
* first running version of eval mode

* restructuring, new naming of elements and testing

* add new files to Docker, how to start with Haystack reference, remove not needed dependencies

* Add latest docstring and tutorial changes

* merged changes

* fixing bugs after breaking changes from last release

* newer version of states in streamlit, more docs for eval mode, eval file as env variable

* eval file as env variable

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2021-04-22 17:30:17 +02:00
oryx1729
8c68699e1c
Refactor REST APIs to use Pipelines (#922)
2021-04-07 17:53:32 +02:00
Malte Pietsch
0eaae3c0dd
Fix UI when API returns fewer answers than expected (#828)
* fix ui for few answers from api. add top_k_per_sample env

* Add latest docstring and tutorial changes

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2021-02-15 14:27:17 +01:00
Tanmay Laud
7cd9e09491
Add basic demo UI via streamlit (#671)
* Added starter code for frontend demo

* worked on comments

* Added Docker config for frontend

* update docker file. restructure folder structure. minimal renamings and defaults

* add screenshot to readme

Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>
2020-12-27 13:36:09 +01:00