tstadel
9611b64ec5
fix: document retrieval metrics for non-document_id document_relevance_criteria ( #3885 )
...
* fix document retrieval metrics for all document_relevance_criteria
* fix tests
* fix eval_batch metrics
* small refactorings
* evaluate metrics on label level
* document retrieval tests added
* fix pylint
* fix test
* support file retrieval
* add comment about threshold
* rename test
2023-02-02 15:00:07 +01:00
Julian Risch
a2c160e7d8
bug: skip empty documents in reader ( #3773 )
...
* skip empty documents
* test eval_batch and account for tables
2023-01-03 15:50:14 +01:00
Julian Risch
adb580b6b7
feat: add offsets_in_context to evaluation result ( #3640 )
...
* add offsets_in_context to eval result
* extend test case
2022-11-30 11:43:42 +01:00
Stefano Fiorucci
3040e59c63
feat: add support for BM25Retriever in InMemoryDocumentStore ( #3561 )
...
* very first draft
* implement query and query_batch
* add more bm25 parameters
* add rank_bm25 dependency
* fix mypy
* remove tokenizer callable parameter
* remove unused import
* only json serializable attributes
* try to fix: pylint too-many-public-methods / R0904
* bm25 attribute always present
* convert errors into warnings to make the tutorial 1 work
* add docstrings; tests
* try to make tests run
* better docstrings; revert not running tests
* some suggestions from review
* rename elasticsearch retriever as bm25 in tests; try to test memory_bm25
* exclude tests with filters
* change elasticsearch to bm25 retriever in test_summarizer
* add tests
* try to improve tests
* better type hint
* adapt test_table_text_retriever_embedding
* handle non-textual docs
* query only textual documents
2022-11-22 09:24:52 +01:00
tstadel
0d45cbce56
convert eval metrics to python float ( #3612 )
2022-11-22 09:05:10 +01:00
Massimiliano Pippi
6a48ace9b9
BREAKING CHANGE: remove Milvus1DocumentStore along with support for Milvus < 2.x ( #3552 )
...
* remove milvus1
* leftover
* revert deprecation process
2022-11-15 09:54:55 +01:00
Stefano Fiorucci
1a60e21137
refactor: simplify Summarizer, add Document Merger ( #3452 )
...
* remove generate_single_summary
* update schemas
* remove unused import
* fix mypy
* fix mypy
* test: summarizer doesn't change content
* other test correction
* move test_summarizer_translation to test_extractor_translation
* fix test
* first try for doc merger
* reintroduce and deprecate generate_single_summary
* progress in document merger
* document merger!
* mypy, pylint fixes
* use generator
* added test that will fail in 1.12
* adapt to review
* extended deprecation docstring
* Update test/nodes/test_extractor_translation.py
* Update test/nodes/test_summarizer.py
* Update test/nodes/test_summarizer.py
* black
* documents fixture
Co-authored-by: Sara Zan <sarazanzo94@gmail.com>
2022-11-03 16:04:53 +01:00
Sebastian
59857cb492
feat: Speed up reader tests ( #3476 )
...
* Use a smaller reader where possible
* Change scope to module of reader to get faster load times
2022-10-26 19:04:18 +02:00
tstadel
7fe5003c97
fix: eval() with add_isolated_node_eval=True breaks if no node supports it ( #3347 )
...
* fix isolated eval for pipelines without a node supporting isolated mode
* reformat
* add test
2022-10-10 20:48:13 +02:00
Vladimir Blagojevic
938e6fda5b
Classify pipeline's type based on its components ( #3132 )
...
* Add pipeline get_type method
* Add pipeline uptime
* Add pipeline telemetry event sending
* Send pipeline telemetry once a day (at most)
* Add pipeline invocation counter, change invocation counter logic
* Update allowed telemetry parameters - allow pipeline parameters
* PR review: add unit test
2022-09-21 14:53:42 +02:00
Julian Risch
3e3ff33cdd
feat: add batch evaluation method for pipelines ( #2942 )
...
* add basic pipeline.eval_batch for qa without filters
* black formatting
* pydoc-markdown
* remove batch eval tests failing due to bugs
* remove comment
* explain commented out tests
* avoid code duplication
* black
* mypy
* pydoc markdown
* add batch option to execute_eval_run
* pydoc markdown
* Apply documentation suggestions from code review
Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>
* Apply documentation suggestion from code review
Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>
* add documentation based on review comments
* black
* black
* schema updates
* remove duplicate tests
* add separate method for column reordering
* merge _build_eval_dataframe methods
* pylint ignore in function
* change type annotation of queries to list only
* one-liner addressing review comment on params dict
* markdown files updated
Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>
2022-08-25 17:50:57 +02:00
tstadel
0efad96e08
DC SDK: Add possibility to upload evaluation sets to DC ( #2610 )
...
* Add possibility to upload evaluation sets to DC
* fix test_eval sas comparisons
* quickwin docstring feedback changes
* Add hint about annotation tool and mark optional and required columns
* minor changes to docstrings
2022-05-31 17:08:19 +02:00
tstadel
7caca41c5d
Support context matching in pipeline.eval() ( #2482 )
...
* calculate context pred metrics
* Update Documentation & Code Style
* extend doc_relevance_col values
* fix import order
* Update Documentation & Code Style
* fix mypy
* fix typings literal import
* add option for custom document_id_field
* Update Documentation & Code Style
* fix tests and dataframe col-order
* Update Documentation & Code Style
* rename content to context in eval dataframe
* add backward compatibility to EvaluationResult.load()
* Update Documentation & Code Style
* add docstrings
* Update Documentation & Code Style
* support sas
* Update Documentation & Code Style
* add answer_scope param
* Update Documentation & Code Style
* rework doc_relevance_col and keep document_id col in case of custom_document_id_field
* Update Documentation & Code Style
* improve docstrings
* Update Documentation & Code Style
* rename document_relevance_criterion into document_scope
* Update Documentation & Code Style
* add document_scope and answer_scope to print_eval_report
* support all new features in execute_eval_run()
* fix imports
* fix mypy
* Update Documentation & Code Style
* rename pred_label_sas_grid into pred_label_matrix
* update dataframe schema and sorting
* Update Documentation & Code Style
* pass through context_matching params and extend document_scope test
* Update Documentation & Code Style
* add answer_scope tests
* fix context_matching_threshold for document metrics
* shorten dataframe apply calls
* Update Documentation & Code Style
* fix queries getting lost if nothing was retrieved
* Update Documentation & Code Style
* Update Documentation & Code Style
* use document_id scopes
* Update Documentation & Code Style
* fix answer_scope literal
* Update Documentation & Code Style
* update the docs (lg changes)
* Update Documentation & Code Style
* update tutorial 5
* Update Documentation & Code Style
* fix tests
* Add minor lg updates
* final docstring changes
* fix single quotes in docstrings
* Update Documentation & Code Style
* dataframe scopes added for each column
* better docstrings for context_matching params
* Update Documentation & Code Style
* fix summarizer eval test
* Update Documentation & Code Style
* fix test
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: agnieszka-m <amarzec13@gmail.com>
2022-05-24 18:11:52 +02:00
Sara Zan
ff4303c51b
[CI refactoring] Categorize tests into folders ( #2554 )
...
* Categorize tests into folders
* Fix linux_ci.yml and an import
* Wrong path
2022-05-17 09:55:53 +01:00