* Add knowledge graph module
* Fix type hint
* Add graph retriever module
* Change type annotations, change return format
* Add graph retriever that executes questions as sparql queries
* Linking only those entities that are in the knowledge graph
* Added logging and using relations extracted from Knowledge graph for linking
* Preventing entity linking from linking the same token to multiple entities
* Pruning triples that have no variables for select and count queries
* Support knowledge graphs with Pipelines
* Add text2sparql
* Entity linking and relation linking consider more special cases now based on evaluation on labelled data
* Separating example code from KGQA implementation
* Add eval on combined extractive and kg questions
* Remove references to hp-test
* Add fields sparql_query and long_answer_list to metadata
* Removing modular Question2SPARQL approach
* Removing additional classes used for modular kgqa approach
* preparing lcquad data
* change graph db
* Translating namespaces in knowledge graph queries
* Creating graphdb index and loading triples from .ttl file
* Fetching graph config files, triples and model from S3
* Fix incompatibility issues with BaseGraphRetriever and BaseComponent
* Removing unused utility functions
* Adding doc strings and tutorial header
* Adding sparqlwrapper dependency
* Moving tutorial header
* Sorting tutorials by number within name of notebook
* Add latest docstring and tutorial changes
* Creating test cases for knowledge graph
* Changing knowledge graph example to harry potter
* Add latest docstring and tutorial changes
* Adapting the tutorial notebook to harry potter example
* Add GraphDB fixture for tests
* Add latest docstring and tutorial changes
* Added GraphDB docker launch to CI
* Use correct GraphDB fixture
* Check if GraphDB instance is already running
* Renaming question/query and incorporating other feedback from Timo and Tanay
* Removed type annotation
* Add latest docstring and tutorial changes
Co-authored-by: oryx1729 <oryx1729@protonmail.com>
Co-authored-by: Timo Moeller <timo.moeller@deepset.ai>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
A proposal to reduce the precision printed by `EvalRetriever.print` and `EvalReader.print` to 4 significant figures. Users who want the full precision can access the class attributes directly.
Before
```
Retriever
-----------------
has_answer recall: 0.8739495798319328 (208/238)
no_answer recall: 1.00 (120/120) (no_answer samples are always treated as correctly retrieved)
recall: 0.9162011173184358 (328 / 358)
```
After
```
Retriever
-----------------
has_answer recall: 0.8739 (208/238)
no_answer recall: 1.00 (120/120) (no_answer samples are always treated as correctly retrieved)
recall: 0.9162 (328 / 358)
```
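The change in output above can be sketched roughly as follows. This is a minimal illustration, not the actual `EvalRetriever.print` code: `format_recall` is a hypothetical helper, and it assumes fixed four-decimal formatting, which coincides with 4 significant figures for recall values between 0.1 and 1.

```python
def format_recall(label: str, correct: int, total: int, digits: int = 4) -> str:
    """Format a recall line with a limited number of decimal places.

    Hypothetical helper for illustration; the real print methods may differ.
    """
    recall = correct / total
    # The nested format spec limits the printed decimals; the raw value is untouched.
    return f"{label}: {recall:.{digits}f} ({correct}/{total})"


print(format_recall("has_answer recall", 208, 238))
print(format_recall("recall", 328, 358))
```

Only the printed string is rounded, so the full-precision value stays available to callers.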
If the first query in the evaluation returns a document with `no_answer=True`, a division-by-zero error occurs because neither `self.has_answer_correct` nor `self.has_answer_count` has been incremented yet. This fix moves the `self.has_answer_recall` calculation inside the if-else block.
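A minimal sketch of the idea, assuming a simplified counter class: `RecallCounter` and its `update` method are hypothetical stand-ins, and only the three attribute names come from the description above. The recall is recomputed only in the branch that increments `has_answer_count`, so the denominator can never be zero.

```python
class RecallCounter:
    """Hypothetical, simplified stand-in for the evaluation node's counters."""

    def __init__(self) -> None:
        self.has_answer_count = 0
        self.has_answer_correct = 0
        self.has_answer_recall = 0.0

    def update(self, no_answer: bool, correct: bool) -> None:
        if no_answer:
            # no_answer samples do not touch the has_answer counters,
            # so the recall must not be recomputed here.
            pass
        else:
            self.has_answer_count += 1
            if correct:
                self.has_answer_correct += 1
            # Safe: has_answer_count is at least 1 in this branch.
            self.has_answer_recall = self.has_answer_correct / self.has_answer_count


counter = RecallCounter()
counter.update(no_answer=True, correct=True)   # first query is a no_answer sample
counter.update(no_answer=False, correct=True)
```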
* add fetch_data_from_url to extract data and store as files
* corrected a typo
* corrected variable name error
* correction of urlparse error
* type error
* added selenium, urllib to requirements
* removed urllib
* minor changes and added function to find out inpage navigation links
* quick duplicate links fix
* quick type annotation fix
* created separate module for crawler
* type error fix
* type error fix
* import fix
* quick type error fix
* added return description
* updated include type to list
* refactor modules. Add Crawler class. rename params.
* add basic pipeline compatibility
* update docstrings
* fix mypy issues
* update args, docstrings, return filepaths
* fix mypy
* make urls optional in init
Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>