Sebastian Husch Lee
2bc7fe1a08
test: reactivate unit tests in test_eval.py ( #5255 )
...
* Activate tests that follow unit test and integration test rules
* Adding more integration labels
* Change name to better reflect complexity of test
* Remove mark integration tags, move test to doc store test for add_eval_data
* Removing incorrect integration label
* Deactivated document store test b/c it fails for Weaviate and pinecone
* Remove unit label since test needs to be refactored to be considered a unit test
* Undo changes
* Undo change
* Check every field in the load evaluation result
* Add back label and add skip reason
* Use pytest skip instead of TODO
2023-07-24 17:07:45 +02:00
bogdankostic
0697f5c63e
fix: Support isolated node eval in run_batch in Generators ( #5291 )
...
* Add isolated node eval to BaseGenerator's run_batch
* Add unit tests
2023-07-07 10:32:43 +02:00
Massimiliano Pippi
4974bf7ab3
chore: remove deprecated MilvusDocumentStore ( #4951 )
...
* remove deprecated MilvusDocumentStore
* remove leftovers
* fix pylint
2023-05-19 16:37:38 +02:00
tstadel
7625829684
fix: EvaluationResult serialization changes dataframes ( #4906 )
...
* fix nan and index values
* add test
* make test for None values after evalresult read explicit
2023-05-16 16:03:09 +02:00
Vladimir Blagojevic
aebc22d27e
Upgrade transformers to 4.28.1 ( #4665 )
...
* Upgrade to transformers 4.28.1
* Commenting out failing piece of test
* trailing-whitespace
* Adjust regex for error match - it changed between releases
* Remove RAG tests failing with transformers update
2023-04-27 12:55:21 +02:00
Sebastian
8d9136bad4
feat: Implementation of Table Cell Proposal ( #4616 )
...
* Starting adding support for TableCell
* Update tests to use row and col
* Added schema test to check to_dict and from_dict works for Table documents. Also updated Doc.__eq__ to work for tables.
* Update eval test to use TableCell
* Added more schema tests for table docs, labels and answers.
* Add boolean to toggle between Span and TableCell
* Add deprecation message
* Test that table answers work as responses in the rest API
---------
Co-authored-by: agnieszka-m <amarzec13@gmail.com>
2023-04-19 13:14:49 +02:00
Silvano Cerza
5ac3dffbef
test: Rework conftest ( #4614 )
...
* Split root conftest into multiple ones and remove unused fixtures
* Remove some constants and make them fixtures
* Remove unnecessary fixture scoping
* Fix failing whisper tests
* Fix image_file_paths fixture
2023-04-11 10:33:43 +02:00
tstadel
4f90e59796
feat: expose prompts to Answer and EvaluationResult ( #4341 )
...
* store prompt in Answer
* store prompt in eval csv
* fix tests
* chore: fix context offset loading
* add tests
* add test from PR #4476
* fix tests after merge
2023-03-27 17:54:20 +02:00
ju-gu
a3409c7da6
fix: issue evaluation check for content type ( #4181 )
...
* fix: issue evaluation check for content type
Evaluation currently breaks, when the content type is not a str.
* add black
* add test table eval
* add black formatting
* Expand integration test
---------
Co-authored-by: Sebastian Lee <sebastian.lee@deepset.ai>
2023-03-16 17:36:53 +01:00
tstadel
19311119db
fix: EvalResult load migration ( #4289 )
...
* fix evalresult load migration
* handle none values correctly
* better None check
* improve logic and add test
2023-03-06 20:05:02 +01:00
Stefano Fiorucci
e8f9b1b65d
test: replace ElasticsearchDS with InMemoryDS when it makes sense; support scale_score in InMemoryDS ( #4283 )
...
* replace elasticds with imds - first draft
* fix
* fix tests and implement scale_score in imds bm25
* add docstrings for scale_score
2023-03-01 11:35:10 +01:00
Stefano Fiorucci
5e85f33bd3
refactor: Remove deprecated nodes EvalDocuments and EvalAnswers ( #4194 )
...
* remove deprecated classes and update test
* remove deprecated classes and update test
* remove unused code
* remove unused import
* remove empty evaluator node
* unused import :-)
* move sas to metrics
2023-02-23 15:26:17 +01:00
bogdankostic
05950719ba
fix: Deduplicate same Documents in isolated evaluation of Reader ( #4114 )
...
* Deduplicate same Documents in one MultiLabel
* Add tests
* Update label
* Update label
* Update test
* Update test
* Revert change to check CI
* Revert reversion
* Use deepcopy
* Update tests
2023-02-10 13:55:14 +01:00
Silvano Cerza
274746db07
style: Update black ( #4101 )
...
* Update black version
* Format file with new black style
* Update black pre-commit hook version
2023-02-08 15:34:43 +01:00
tstadel
92c58cfda1
feat: Support multiple document_ids in Answer object (for generative QA) ( #4062 )
...
* initial version without shapers
* set document_ids for BaseGenerator
* introduce question-answering-with-references template
* better prompt
* make PromptTemplate control output_variable
* update schema
* fix add_doc_meta_data_to_answer
* Revert "fix add_doc_meta_data_to_answer"
This reverts commit b994db423ad8272c140ce2b785cf359d55383ff9.
* fix add_doc_meta_data_to_answer
* fix eval
* fix pylint
* fix pinecone
* fix other tests
* fix test
* fix flaky test
* Revert "fix flaky test"
This reverts commit 7ab04275ffaaaca96b4477325ba05d5f34d38775.
* adjust docstrings
* make Label loading backward-compatible
* fix Label backward compatibility for pinecone
* fix Label backward compatibility for search engines
* fix Label backward compatibility for deepset Cloud
* fix tests
* fix None issue
* fix test_write_feedback
* add tests for legacy label support
* add document_id test for pinecone
* reduce unnecessary contents
* add comment to pinecone test
2023-02-08 08:37:22 +01:00
tstadel
9611b64ec5
fix: document retrieval metrics for non-document_id document_relevance_criteria ( #3885 )
...
* fix document retrieval metrics for all document_relevance_criteria
* fix tests
* fix eval_batch metrics
* small refactorings
* evaluate metrics on label level
* document retrieval tests added
* fix pylint
* fix test
* support file retrieval
* add comment about threshold
* rename test
2023-02-02 15:00:07 +01:00
Julian Risch
a2c160e7d8
bug: skip empty documents in reader ( #3773 )
...
* skip empty documents
* test eval_batch and account for tables
2023-01-03 15:50:14 +01:00
Julian Risch
adb580b6b7
feat: add offsets_in_context to evaluation result ( #3640 )
...
* add offsets_in_context to eval result
* extend test case
2022-11-30 11:43:42 +01:00
Stefano Fiorucci
3040e59c63
feat: add support for BM25Retriever in InMemoryDocumentStore ( #3561 )
...
* very first draft
* implement query and query_batch
* add more bm25 parameters
* add rank_bm25 dependency
* fix mypy
* remove tokenizer callable parameter
* remove unused import
* only json serializable attributes
* try to fix: pylint too-many-public-methods / R0904
* bm25 attribute always present
* convert errors into warnings to make the tutorial 1 work
* add docstrings; tests
* try to make tests run
* better docstrings; revert not running tests
* some suggestions from review
* rename elasticsearch retriever as bm25 in tests; try to test memory_bm25
* exclude tests with filters
* change elasticsearch to bm25 retriever in test_summarizer
* add tests
* try to improve tests
* better type hint
* adapt test_table_text_retriever_embedding
* handle non-textual docs
* query only textual documents
2022-11-22 09:24:52 +01:00
tstadel
0d45cbce56
convert eval metrics to python float ( #3612 )
2022-11-22 09:05:10 +01:00
Massimiliano Pippi
6a48ace9b9
BREAKING CHANGE: remove Milvus1DocumentStore along with support for Milvus < 2.x ( #3552 )
...
* remove milvus1
* leftover
* revert deprecation process
2022-11-15 09:54:55 +01:00
Stefano Fiorucci
1a60e21137
refactor: simplify Summarizer, add Document Merger ( #3452 )
...
* remove generate_single_summary
* update schemas
* remove unused import
* fix mypy
* fix mypy
* test: summarizer doesn't change content
* other test correction
* move test_summarizer_translation to test_extractor_translation
* fix test
* first try for doc merger
* reintroduce and deprecate generate_single_summary
* progress in document merger
* document merger!
* mypy, pylint fixes
* use generator
* added test that will fail in 1.12
* adapt to review
* extended deprecation docstring
* Update test/nodes/test_extractor_translation.py
* Update test/nodes/test_summarizer.py
* Update test/nodes/test_summarizer.py
* black
* documents fixture
Co-authored-by: Sara Zan <sarazanzo94@gmail.com>
2022-11-03 16:04:53 +01:00
Sebastian
59857cb492
feat: Speed up reader tests ( #3476 )
...
* Use a smaller reader where possible
* Change scope to module of reader to get faster load times
2022-10-26 19:04:18 +02:00
tstadel
7fe5003c97
fix: eval() with add_isolated_node_eval=True breaks if no node supports it ( #3347 )
...
* fix isolated eval for pipelines without a node supporting isolated mode
* reformat
* add test
2022-10-10 20:48:13 +02:00
Vladimir Blagojevic
938e6fda5b
Classify pipeline's type based on its components ( #3132 )
...
* Add pipeline get_type method
* Add pipeline uptime
* Add pipeline telemetry event sending
* Send pipeline telemetry once a day (at most)
* Add pipeline invocation counter, change invocation counter logic
* Update allowed telemetry parameters - allow pipeline parameters
* PR review: add unit test
2022-09-21 14:53:42 +02:00
Julian Risch
3e3ff33cdd
feat: add batch evaluation method for pipelines ( #2942 )
...
* add basic pipeline.eval_batch for qa without filters
* black formatting
* pydoc-markdown
* remove batch eval tests failing due to bugs
* remove comment
* explain commented out tests
* avoid code duplication
* black
* mypy
* pydoc markdown
* add batch option to execute_eval_run
* pydoc markdown
* Apply documentation suggestions from code review
Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>
* Apply documentation suggestion from code review
Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>
* add documentation based on review comments
* black
* black
* schema updates
* remove duplicate tests
* add separate method for column reordering
* merge _build_eval_dataframe methods
* pylint ignore in function
* change type annotation of queries to list only
* one-liner addressing review comment on params dict
* markdown files updated
Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>
2022-08-25 17:50:57 +02:00
tstadel
0efad96e08
DC SDK: Add possibility to upload evaluation sets to DC ( #2610 )
...
* Add possibility to upload evaluation sets to DC
* fix test_eval sas comparisons
* quickwin docstring feedback changes
* Add hint about annotation tool and mark optional and required columns
* minor changes to docstrings
2022-05-31 17:08:19 +02:00
tstadel
7caca41c5d
Support context matching in pipeline.eval() ( #2482 )
...
* calculate context pred metrics
* Update Documentation & Code Style
* extend doc_relevance_col values
* fix import order
* Update Documentation & Code Style
* fix mypy
* fix typings literal import
* add option for custom document_id_field
* Update Documentation & Code Style
* fix tests and dataframe col-order
* Update Documentation & Code Style
* rename content to context in eval dataframe
* add backward compatibility to EvaluationResult.load()
* Update Documentation & Code Style
* add docstrings
* Update Documentation & Code Style
* support sas
* Update Documentation & Code Style
* add answer_scope param
* Update Documentation & Code Style
* rework doc_relevance_col and keep document_id col in case of custom_document_id_field
* Update Documentation & Code Style
* improve docstrings
* Update Documentation & Code Style
* rename document_relevance_criterion into document_scope
* Update Documentation & Code Style
* add document_scope and answer_scope to print_eval_report
* support all new features in execute_eval_run()
* fix imports
* fix mypy
* Update Documentation & Code Style
* rename pred_label_sas_grid into pred_label_matrix
* update dataframe schema and sorting
* Update Documentation & Code Style
* pass through context_matching params and extend document_scope test
* Update Documentation & Code Style
* add answer_scope tests
* fix context_matching_threshold for document metrics
* shorten dataframe apply calls
* Update Documentation & Code Style
* fix queries getting lost if nothing was retrieved
* Update Documentation & Code Style
* Update Documentation & Code Style
* use document_id scopes
* Update Documentation & Code Style
* fix answer_scope literal
* Update Documentation & Code Style
* update the docs (lg changes)
* Update Documentation & Code Style
* update tutorial 5
* Update Documentation & Code Style
* fix tests
* Add minor lg updates
* final docstring changes
* fix single quotes in docstrings
* Update Documentation & Code Style
* dataframe scopes added for each column
* better docstrings for context_matching params
* Update Documentation & Code Style
* fix summarizer eval test
* Update Documentation & Code Style
* fix test
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: agnieszka-m <amarzec13@gmail.com>
2022-05-24 18:11:52 +02:00
Sara Zan
ff4303c51b
[CI refactoring] Categorize tests into folders ( #2554 )
...
* Categorize tests into folders
* Fix linux_ci.yml and an import
* Wrong path
2022-05-17 09:55:53 +01:00