Add debug and debug_logs params to standard pipelines (#1586)

* add debug and debug_logs to standard pipelines

* Add latest docstring and tutorial changes

* fix params

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Malte Pietsch 2021-10-12 16:00:48 +02:00 committed by GitHub
parent 6354528336
commit 9650f7aed1
2 changed files with 132 additions and 29 deletions

@ -352,14 +352,21 @@ Initialize a Pipeline for Extractive Question Answering.
#### run
```python
| run(query: str, params: Optional[dict] = None)
| run(query: str, params: Optional[dict] = None, debug: Optional[bool] = None, debug_logs: Optional[bool] = None)
```
**Arguments**:
- `query`: the query string.
- `params`: params for the `retriever` and `reader`. For instance,
params={"retriever": {"top_k": 10}, "reader": {"top_k": 5}}
- `query`: The search query string.
- `params`: Params for the `retriever` and `reader`. For instance,
params={"Retriever": {"top_k": 10}, "Reader": {"top_k": 5}}
- `debug`: Whether the pipeline should instruct nodes to collect debug information
about their execution. By default these include the input parameters
they received, the output they generated, and eventual logs (of any severity)
emitted. All debug information can then be found in the dict returned
by this method under the key "_debug"
- `debug_logs`: Whether all the logs of the node should be printed in the console,
regardless of their severity and of the existing logger's settings.
<a name="pipeline.DocumentSearchPipeline"></a>
## DocumentSearchPipeline Objects
@ -385,13 +392,20 @@ Initialize a Pipeline for semantic document search.
#### run
```python
| run(query: str, params: Optional[dict] = None)
| run(query: str, params: Optional[dict] = None, debug: Optional[bool] = None, debug_logs: Optional[bool] = None)
```
**Arguments**:
- `query`: the query string.
- `params`: params for the `retriever`. For instance, params={"Retriever": {"top_k": 10}}
- `debug`: Whether the pipeline should instruct nodes to collect debug information
about their execution. By default these include the input parameters
they received, the output they generated, and eventual logs (of any severity)
emitted. All debug information can then be found in the dict returned
by this method under the key "_debug"
- `debug_logs`: Whether all the logs of the node should be printed in the console,
regardless of their severity and of the existing logger's settings.
<a name="pipeline.GenerativeQAPipeline"></a>
## GenerativeQAPipeline Objects
@ -418,14 +432,21 @@ Initialize a Pipeline for Generative Question Answering.
#### run
```python
| run(query: str, params: Optional[dict] = None)
| run(query: str, params: Optional[dict] = None, debug: Optional[bool] = None, debug_logs: Optional[bool] = None)
```
**Arguments**:
- `query`: the query string.
- `params`: params for the `retriever` and `generator`. For instance,
params={"retriever": {"top_k": 10}, "generator": {"top_k": 5}}
params={"Retriever": {"top_k": 10}, "Generator": {"top_k": 5}}
- `debug`: Whether the pipeline should instruct nodes to collect debug information
about their execution. By default these include the input parameters
they received, the output they generated, and eventual logs (of any severity)
emitted. All debug information can then be found in the dict returned
by this method under the key "_debug"
- `debug_logs`: Whether all the logs of the node should be printed in the console,
regardless of their severity and of the existing logger's settings.
<a name="pipeline.SearchSummarizationPipeline"></a>
## SearchSummarizationPipeline Objects
@ -455,7 +476,7 @@ Initialize a Pipeline that retrieves documents for a query and then summarizes t
#### run
```python
| run(query: str, params: Optional[dict] = None)
| run(query: str, params: Optional[dict] = None, debug: Optional[bool] = None, debug_logs: Optional[bool] = None)
```
**Arguments**:
@ -463,6 +484,13 @@ Initialize a Pipeline that retrieves documents for a query and then summarizes t
- `query`: the query string.
- `params`: params for the `retriever` and `summarizer`. For instance,
params={"retriever": {"top_k": 10}, "summarizer": {"generate_single_summary": True}}
- `debug`: Whether the pipeline should instruct nodes to collect debug information
about their execution. By default these include the input parameters
they received, the output they generated, and eventual logs (of any severity)
emitted. All debug information can then be found in the dict returned
by this method under the key "_debug"
- `debug_logs`: Whether all the logs of the node should be printed in the console,
regardless of their severity and of the existing logger's settings.
<a name="pipeline.FAQPipeline"></a>
## FAQPipeline Objects
@ -488,13 +516,20 @@ Initialize a Pipeline for finding similar FAQs using semantic document search.
#### run
```python
| run(query: str, params: Optional[dict] = None)
| run(query: str, params: Optional[dict] = None, debug: Optional[bool] = None, debug_logs: Optional[bool] = None)
```
**Arguments**:
- `query`: the query string.
- `params`: params for the `retriever`. For instance, params={"retriever": {"top_k": 10}}
- `debug`: Whether the pipeline should instruct nodes to collect debug information
about their execution. By default these include the input parameters
they received, the output they generated, and eventual logs (of any severity)
emitted. All debug information can then be found in the dict returned
by this method under the key "_debug"
- `debug_logs`: Whether all the logs of the node should be printed in the console,
regardless of their severity and of the existing logger's settings.
<a name="pipeline.TranslationWrapperPipeline"></a>
## TranslationWrapperPipeline Objects

@ -605,13 +605,24 @@ class ExtractiveQAPipeline(BaseStandardPipeline):
self.pipeline.add_node(component=retriever, name="Retriever", inputs=["Query"])
self.pipeline.add_node(component=reader, name="Reader", inputs=["Retriever"])
def run(self, query: str, params: Optional[dict] = None):
def run(self,
query: str,
params: Optional[dict] = None,
debug: Optional[bool] = None,
debug_logs: Optional[bool] = None):
"""
:param query: the query string.
:param params: params for the `retriever` and `reader`. For instance,
params={"retriever": {"top_k": 10}, "reader": {"top_k": 5}}
:param query: The search query string.
:param params: Params for the `retriever` and `reader`. For instance,
params={"Retriever": {"top_k": 10}, "Reader": {"top_k": 5}}
:param debug: Whether the pipeline should instruct nodes to collect debug information
about their execution. By default these include the input parameters
they received, the output they generated, and eventual logs (of any severity)
emitted. All debug information can then be found in the dict returned
by this method under the key "_debug"
:param debug_logs: Whether all the logs of the node should be printed in the console,
regardless of their severity and of the existing logger's settings.
"""
output = self.pipeline.run(query=query, params=params)
output = self.pipeline.run(query=query, params=params, debug=debug, debug_logs=debug_logs)
return output
@ -625,12 +636,23 @@ class DocumentSearchPipeline(BaseStandardPipeline):
self.pipeline = Pipeline()
self.pipeline.add_node(component=retriever, name="Retriever", inputs=["Query"])
def run(self, query: str, params: Optional[dict] = None):
def run(self,
query: str,
params: Optional[dict] = None,
debug: Optional[bool] = None,
debug_logs: Optional[bool] = None):
"""
:param query: the query string.
:param params: params for the `retriever`. For instance, params={"Retriever": {"top_k": 10}}
:param debug: Whether the pipeline should instruct nodes to collect debug information
about their execution. By default these include the input parameters
they received, the output they generated, and eventual logs (of any severity)
emitted. All debug information can then be found in the dict returned
by this method under the key "_debug"
:param debug_logs: Whether all the logs of the node should be printed in the console,
regardless of their severity and of the existing logger's settings.
"""
output = self.pipeline.run(query=query, params=params)
output = self.pipeline.run(query=query, params=params, debug=debug, debug_logs=debug_logs)
document_dicts = [doc.to_dict() for doc in output["documents"]]
output["documents"] = document_dicts
return output
@ -648,13 +670,24 @@ class GenerativeQAPipeline(BaseStandardPipeline):
self.pipeline.add_node(component=retriever, name="Retriever", inputs=["Query"])
self.pipeline.add_node(component=generator, name="Generator", inputs=["Retriever"])
def run(self, query: str, params: Optional[dict] = None):
def run(self,
query: str,
params: Optional[dict] = None,
debug: Optional[bool] = None,
debug_logs: Optional[bool] = None):
"""
:param query: the query string.
:param params: params for the `retriever` and `generator`. For instance,
params={"retriever": {"top_k": 10}, "generator": {"top_k": 5}}
params={"Retriever": {"top_k": 10}, "Generator": {"top_k": 5}}
:param debug: Whether the pipeline should instruct nodes to collect debug information
about their execution. By default these include the input parameters
they received, the output they generated, and eventual logs (of any severity)
emitted. All debug information can then be found in the dict returned
by this method under the key "_debug"
:param debug_logs: Whether all the logs of the node should be printed in the console,
regardless of their severity and of the existing logger's settings.
"""
output = self.pipeline.run(query=query, params=params)
output = self.pipeline.run(query=query, params=params, debug=debug, debug_logs=debug_logs)
return output
@ -674,13 +707,24 @@ class SearchSummarizationPipeline(BaseStandardPipeline):
self.pipeline.add_node(component=summarizer, name="Summarizer", inputs=["Retriever"])
self.return_in_answer_format = return_in_answer_format
def run(self, query: str, params: Optional[dict] = None):
def run(self,
query: str,
params: Optional[dict] = None,
debug: Optional[bool] = None,
debug_logs: Optional[bool] = None):
"""
:param query: the query string.
:param params: params for the `retriever` and `summarizer`. For instance,
params={"retriever": {"top_k": 10}, "summarizer": {"generate_single_summary": True}}
:param debug: Whether the pipeline should instruct nodes to collect debug information
about their execution. By default these include the input parameters
they received, the output they generated, and eventual logs (of any severity)
emitted. All debug information can then be found in the dict returned
by this method under the key "_debug"
:param debug_logs: Whether all the logs of the node should be printed in the console,
regardless of their severity and of the existing logger's settings.
"""
output = self.pipeline.run(query=query, params=params)
output = self.pipeline.run(query=query, params=params, debug=debug, debug_logs=debug_logs)
# Convert to answer format to allow "drop-in replacement" for other QA pipelines
if self.return_in_answer_format:
@ -714,12 +758,23 @@ class FAQPipeline(BaseStandardPipeline):
self.pipeline = Pipeline()
self.pipeline.add_node(component=retriever, name="Retriever", inputs=["Query"])
def run(self, query: str, params: Optional[dict] = None):
def run(self,
query: str,
params: Optional[dict] = None,
debug: Optional[bool] = None,
debug_logs: Optional[bool] = None):
"""
:param query: the query string.
:param params: params for the `retriever`. For instance, params={"retriever": {"top_k": 10}}
:param debug: Whether the pipeline should instruct nodes to collect debug information
about their execution. By default these include the input parameters
they received, the output they generated, and eventual logs (of any severity)
emitted. All debug information can then be found in the dict returned
by this method under the key "_debug"
:param debug_logs: Whether all the logs of the node should be printed in the console,
regardless of their severity and of the existing logger's settings.
"""
output = self.pipeline.run(query=query, params=params)
output = self.pipeline.run(query=query, params=params, debug=debug, debug_logs=debug_logs)
documents = output["documents"]
results: Dict = {"query": query, "answers": []}
@ -795,8 +850,13 @@ class QuestionGenerationPipeline(BaseStandardPipeline):
self.pipeline = Pipeline()
self.pipeline.add_node(component=question_generator, name="QuestionGenerator", inputs=["Query"])
def run(self, documents, params: Optional[dict] = None):
output = self.pipeline.run(documents=documents, params=params)
def run(self,
documents,
params: Optional[dict] = None,
debug: Optional[bool] = None,
debug_logs: Optional[bool] = None
):
output = self.pipeline.run(documents=documents, params=params, debug=debug, debug_logs=debug_logs)
return output
@ -810,8 +870,12 @@ class RetrieverQuestionGenerationPipeline(BaseStandardPipeline):
self.pipeline.add_node(component=retriever, name="Retriever", inputs=["Query"])
self.pipeline.add_node(component=question_generator, name="Question Generator", inputs=["Retriever"])
def run(self, query, params: Optional[dict] = None):
output = self.pipeline.run(query=query, params=params)
def run(self,
query: str,
params: Optional[dict] = None,
debug: Optional[bool] = None,
debug_logs: Optional[bool] = None):
output = self.pipeline.run(query=query, params=params, debug=debug, debug_logs=debug_logs)
return output
@ -842,8 +906,12 @@ class QuestionAnswerGenerationPipeline(BaseStandardPipeline):
return kwargs, output_stream
return wrapper
def run(self, documents: List[Document], params: Optional[dict] = None): # type: ignore
output = self.pipeline.run(documents=documents, params=params)
def run(self,
documents: List[Document], # type: ignore
params: Optional[dict] = None,
debug: Optional[bool] = None,
debug_logs: Optional[bool] = None):
output = self.pipeline.run(documents=documents, params=params, debug=debug, debug_logs=debug_logs)
return output
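
The change above follows one pattern throughout: each standard pipeline's `run` accepts `debug` and `debug_logs` and forwards them unchanged to the wrapped `self.pipeline.run`. The sketch below illustrates that pattern in isolation and does not import Haystack. `StubPipeline` and `StubExtractiveQAPipeline` are hypothetical stand-ins, and the layout of the `"_debug"` entry is an assumption made for illustration; the docstrings only state that debug information is returned under that key.

```python
from typing import Any, Dict, List, Optional


class StubPipeline:
    """Hypothetical stand-in for the wrapped pipeline; not Haystack's implementation."""

    def __init__(self, node_names: List[str]):
        self.node_names = list(node_names)

    def run(self,
            query: str,
            params: Optional[dict] = None,
            debug: Optional[bool] = None,
            debug_logs: Optional[bool] = None) -> Dict[str, Any]:
        params = params or {}
        output: Dict[str, Any] = {"query": query}
        if debug:
            # Assumed layout: one entry per node, keyed by the node name.
            output["_debug"] = {
                name: {"input": params.get(name, {}), "debug_logs": bool(debug_logs)}
                for name in self.node_names
            }
        return output


class StubExtractiveQAPipeline:
    """Mirrors the wrapper shape in the diff: run() only forwards its arguments."""

    def __init__(self):
        self.pipeline = StubPipeline(["Retriever", "Reader"])

    def run(self,
            query: str,
            params: Optional[dict] = None,
            debug: Optional[bool] = None,
            debug_logs: Optional[bool] = None) -> Dict[str, Any]:
        return self.pipeline.run(query=query, params=params, debug=debug, debug_logs=debug_logs)


qa = StubExtractiveQAPipeline()

# Without debug, the returned dict carries no "_debug" key.
plain = qa.run(query="Who wrote Faust?", params={"Retriever": {"top_k": 10}})

# With debug=True, per-node information appears under "_debug".
traced = qa.run(
    query="Who wrote Faust?",
    params={"Retriever": {"top_k": 10}, "Reader": {"top_k": 5}},
    debug=True,
)
print(sorted(traced["_debug"].keys()))
```

In this sketch the `params` keys match the names passed to `add_node` in the diff (`"Retriever"`, `"Reader"`), which is consistent with the updated docstring examples using capitalized keys.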