Add debug and debug_logs params to standard pipelines (#1586)

* add debug and debug_logs to standard pipelines
* Add latest docstring and tutorial changes
* fix params

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2026-01-06 03:57:19 +00:00 · 2021-10-12 16:00:48 +02:00 · 2021-10-12 16:00:48 +02:00 · 9650f7aed1
commit 9650f7aed1
parent 6354528336
2 changed files with 132 additions and 29 deletions
--- a/docs/_src/api/api/pipelines.md
+++ b/docs/_src/api/api/pipelines.md
@@ -352,14 +352,21 @@ Initialize a Pipeline for Extractive Question Answering.
 #### run

 ```python
- | run(query: str, params: Optional[dict] = None)
+ | run(query: str, params: Optional[dict] = None, debug: Optional[bool] = None, debug_logs: Optional[bool] = None)
 ```

 **Arguments**:

- `query`: the query string.
- `params`: params for the `retriever` and `reader`. For instance,
-               params={"retriever": {"top_k": 10}, "reader": {"top_k": 5}}
+- `query`: The search query string.
+- `params`: Params for the `retriever` and `reader`. For instance,
+               params={"Retriever": {"top_k": 10}, "Reader": {"top_k": 5}}
+- `debug`: Whether the pipeline should instruct nodes to collect debug information
+              about their execution. By default these include the input parameters
+              they received, the output they generated, and eventual logs (of any severity)
+              emitted. All debug information can then be found in the dict returned
+              by this method under the key "_debug"
+- `debug_logs`: Whether all the logs of the node should be printed in the console,
+                   regardless of their severity and of the existing logger's settings.

 <a name="pipeline.DocumentSearchPipeline"></a>
 ## DocumentSearchPipeline Objects
@@ -385,13 +392,20 @@ Initialize a Pipeline for semantic document search.
 #### run

 ```python
- | run(query: str, params: Optional[dict] = None)
+ | run(query: str, params: Optional[dict] = None, debug: Optional[bool] = None, debug_logs: Optional[bool] = None)
 ```

 **Arguments**:

 - `query`: the query string.
 - `params`: params for the `retriever` and `reader`. For instance, params={"retriever": {"top_k": 10}}
+- `debug`: Whether the pipeline should instruct nodes to collect debug information
+      about their execution. By default these include the input parameters
+      they received, the output they generated, and eventual logs (of any severity)
+      emitted. All debug information can then be found in the dict returned
+      by this method under the key "_debug"
+- `debug_logs`: Whether all the logs of the node should be printed in the console,
+                   regardless of their severity and of the existing logger's settings.

 <a name="pipeline.GenerativeQAPipeline"></a>
 ## GenerativeQAPipeline Objects
@@ -418,14 +432,21 @@ Initialize a Pipeline for Generative Question Answering.
 #### run

 ```python
- | run(query: str, params: Optional[dict] = None)
+ | run(query: str, params: Optional[dict] = None, debug: Optional[bool] = None, debug_logs: Optional[bool] = None)
 ```

 **Arguments**:

 - `query`: the query string.
 - `params`: params for the `retriever` and `generator`. For instance,
-               params={"retriever": {"top_k": 10}, "generator": {"top_k": 5}}
+               params={"Retriever": {"top_k": 10}, "Generator": {"top_k": 5}}
+- `debug`: Whether the pipeline should instruct nodes to collect debug information
+      about their execution. By default these include the input parameters
+      they received, the output they generated, and eventual logs (of any severity)
+      emitted. All debug information can then be found in the dict returned
+      by this method under the key "_debug"
+- `debug_logs`: Whether all the logs of the node should be printed in the console,
+                   regardless of their severity and of the existing logger's settings.

 <a name="pipeline.SearchSummarizationPipeline"></a>
 ## SearchSummarizationPipeline Objects
@@ -455,7 +476,7 @@ Initialize a Pipeline that retrieves documents for a query and then summarizes t
 #### run

 ```python
- | run(query: str, params: Optional[dict] = None)
+ | run(query: str, params: Optional[dict] = None, debug: Optional[bool] = None, debug_logs: Optional[bool] = None)
 ```

 **Arguments**:
@@ -463,6 +484,13 @@ Initialize a Pipeline that retrieves documents for a query and then summarizes t
 - `query`: the query string.
 - `params`: params for the `retriever` and `summarizer`. For instance,
               params={"retriever": {"top_k": 10}, "summarizer": {"generate_single_summary": True}}
+- `debug`: Whether the pipeline should instruct nodes to collect debug information
+      about their execution. By default these include the input parameters
+      they received, the output they generated, and eventual logs (of any severity)
+      emitted. All debug information can then be found in the dict returned
+      by this method under the key "_debug"
+- `debug_logs`: Whether all the logs of the node should be printed in the console,
+                   regardless of their severity and of the existing logger's settings.

 <a name="pipeline.FAQPipeline"></a>
 ## FAQPipeline Objects
@@ -488,13 +516,20 @@ Initialize a Pipeline for finding similar FAQs using semantic document search.
 #### run

 ```python
- | run(query: str, params: Optional[dict] = None)
+ | run(query: str, params: Optional[dict] = None, debug: Optional[bool] = None, debug_logs: Optional[bool] = None)
 ```

 **Arguments**:

 - `query`: the query string.
 - `params`: params for the `retriever`. For instance, params={"retriever": {"top_k": 10}}
+- `debug`: Whether the pipeline should instruct nodes to collect debug information
+      about their execution. By default these include the input parameters
+      they received, the output they generated, and eventual logs (of any severity)
+      emitted. All debug information can then be found in the dict returned
+      by this method under the key "_debug"
+- `debug_logs`: Whether all the logs of the node should be printed in the console,
+                   regardless of their severity and of the existing logger's settings.

 <a name="pipeline.TranslationWrapperPipeline"></a>
 ## TranslationWrapperPipeline Objects
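
The `debug_logs` description above (print every log of a node "regardless of their severity and of the existing logger's settings") can be illustrated with the standard `logging` module. This is only a sketch of the general technique, not Haystack's implementation: the `print_all_logs` helper and the logger name `hypothetical.retriever` are assumptions made for this example.

```python
import io
import logging
from contextlib import contextmanager


@contextmanager
def print_all_logs(logger: logging.Logger, stream=None):
    """Temporarily emit every record of `logger` to `stream`, whatever its level."""
    handler = logging.StreamHandler(stream)  # falls back to sys.stderr when stream is None
    handler.setLevel(logging.DEBUG)
    previous_level = logger.level
    logger.addHandler(handler)
    logger.setLevel(logging.DEBUG)
    try:
        yield
    finally:
        # Restore the logger to the configuration it had before the override.
        logger.setLevel(previous_level)
        logger.removeHandler(handler)


# Hypothetical node logger whose existing settings hide DEBUG records.
node_logger = logging.getLogger("hypothetical.retriever")
node_logger.setLevel(logging.ERROR)

console = io.StringIO()  # stand-in for the console so the output can be inspected
node_logger.debug("hidden: logger level is ERROR")
with print_all_logs(node_logger, stream=console):
    node_logger.debug("visible: debug_logs-style override is active")
node_logger.debug("hidden again: original level restored")

captured = console.getvalue()
```

Only the record logged inside the `with` block reaches `console`; the logger's original level and handlers are restored afterwards, which is one way to avoid permanently changing the user's logging configuration.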
--- a/haystack/pipeline.py
+++ b/haystack/pipeline.py
@@ -605,13 +605,24 @@ class ExtractiveQAPipeline(BaseStandardPipeline):
        self.pipeline.add_node(component=retriever, name="Retriever", inputs=["Query"])
        self.pipeline.add_node(component=reader, name="Reader", inputs=["Retriever"])

-    def run(self, query: str, params: Optional[dict] = None):
+    def run(self,
+            query: str,
+            params: Optional[dict] = None,
+            debug: Optional[bool] = None,
+            debug_logs: Optional[bool] = None):
        """
-        :param query: the query string.
-        :param params: params for the `retriever` and `reader`. For instance,
-                       params={"retriever": {"top_k": 10}, "reader": {"top_k": 5}}
+        :param query: The search query string.
+        :param params: Params for the `retriever` and `reader`. For instance,
+                       params={"Retriever": {"top_k": 10}, "Reader": {"top_k": 5}}
+        :param debug: Whether the pipeline should instruct nodes to collect debug information
+                      about their execution. By default these include the input parameters
+                      they received, the output they generated, and eventual logs (of any severity)
+                      emitted. All debug information can then be found in the dict returned
+                      by this method under the key "_debug"
+        :param debug_logs: Whether all the logs of the node should be printed in the console,
+                           regardless of their severity and of the existing logger's settings.
        """
-        output = self.pipeline.run(query=query, params=params)
+        output = self.pipeline.run(query=query, params=params, debug=debug, debug_logs=debug_logs)
        return output


@@ -625,12 +636,23 @@ class DocumentSearchPipeline(BaseStandardPipeline):
        self.pipeline = Pipeline()
        self.pipeline.add_node(component=retriever, name="Retriever", inputs=["Query"])

-    def run(self, query: str, params: Optional[dict] = None):
+    def run(self,
+            query: str,
+            params: Optional[dict] = None,
+            debug: Optional[bool] = None,
+            debug_logs: Optional[bool] = None):
        """
        :param query: the query string.
        :param params: params for the `retriever` and `reader`. For instance, params={"retriever": {"top_k": 10}}
+        :param debug: Whether the pipeline should instruct nodes to collect debug information
+              about their execution. By default these include the input parameters
+              they received, the output they generated, and eventual logs (of any severity)
+              emitted. All debug information can then be found in the dict returned
+              by this method under the key "_debug"
+        :param debug_logs: Whether all the logs of the node should be printed in the console,
+                           regardless of their severity and of the existing logger's settings.
        """
-        output = self.pipeline.run(query=query, params=params)
+        output = self.pipeline.run(query=query, params=params, debug=debug, debug_logs=debug_logs)
        document_dicts = [doc.to_dict() for doc in output["documents"]]
        output["documents"] = document_dicts
        return output
@@ -648,13 +670,24 @@ class GenerativeQAPipeline(BaseStandardPipeline):
        self.pipeline.add_node(component=retriever, name="Retriever", inputs=["Query"])
        self.pipeline.add_node(component=generator, name="Generator", inputs=["Retriever"])

-    def run(self, query: str, params: Optional[dict] = None):
+    def run(self,
+            query: str,
+            params: Optional[dict] = None,
+            debug: Optional[bool] = None,
+            debug_logs: Optional[bool] = None):
        """
        :param query: the query string.
        :param params: params for the `retriever` and `generator`. For instance,
-                       params={"retriever": {"top_k": 10}, "generator": {"top_k": 5}}
+                       params={"Retriever": {"top_k": 10}, "Generator": {"top_k": 5}}
+        :param debug: Whether the pipeline should instruct nodes to collect debug information
+              about their execution. By default these include the input parameters
+              they received, the output they generated, and eventual logs (of any severity)
+              emitted. All debug information can then be found in the dict returned
+              by this method under the key "_debug"
+        :param debug_logs: Whether all the logs of the node should be printed in the console,
+                           regardless of their severity and of the existing logger's settings.
        """
-        output = self.pipeline.run(query=query, params=params)
+        output = self.pipeline.run(query=query, params=params, debug=debug, debug_logs=debug_logs)
        return output


@@ -674,13 +707,24 @@ class SearchSummarizationPipeline(BaseStandardPipeline):
        self.pipeline.add_node(component=summarizer, name="Summarizer", inputs=["Retriever"])
        self.return_in_answer_format = return_in_answer_format

-    def run(self, query: str, params: Optional[dict] = None):
+    def run(self,
+            query: str,
+            params: Optional[dict] = None,
+            debug: Optional[bool] = None,
+            debug_logs: Optional[bool] = None):
        """
        :param query: the query string.
        :param params: params for the `retriever` and `summarizer`. For instance,
                       params={"retriever": {"top_k": 10}, "summarizer": {"generate_single_summary": True}}
+        :param debug: Whether the pipeline should instruct nodes to collect debug information
+              about their execution. By default these include the input parameters
+              they received, the output they generated, and eventual logs (of any severity)
+              emitted. All debug information can then be found in the dict returned
+              by this method under the key "_debug"
+        :param debug_logs: Whether all the logs of the node should be printed in the console,
+                           regardless of their severity and of the existing logger's settings.
                """
-        output = self.pipeline.run(query=query, params=params)
+        output = self.pipeline.run(query=query, params=params, debug=debug, debug_logs=debug_logs)

        # Convert to answer format to allow "drop-in replacement" for other QA pipelines
        if self.return_in_answer_format:
@@ -714,12 +758,23 @@ class FAQPipeline(BaseStandardPipeline):
        self.pipeline = Pipeline()
        self.pipeline.add_node(component=retriever, name="Retriever", inputs=["Query"])

-    def run(self, query: str, params: Optional[dict] = None):
+    def run(self,
+            query: str,
+            params: Optional[dict] = None,
+            debug: Optional[bool] = None,
+            debug_logs: Optional[bool] = None):
        """
        :param query: the query string.
        :param params: params for the `retriever`. For instance, params={"retriever": {"top_k": 10}}
+        :param debug: Whether the pipeline should instruct nodes to collect debug information
+              about their execution. By default these include the input parameters
+              they received, the output they generated, and eventual logs (of any severity)
+              emitted. All debug information can then be found in the dict returned
+              by this method under the key "_debug"
+        :param debug_logs: Whether all the logs of the node should be printed in the console,
+                           regardless of their severity and of the existing logger's settings.
        """
-        output = self.pipeline.run(query=query, params=params)
+        output = self.pipeline.run(query=query, params=params, debug=debug, debug_logs=debug_logs)
        documents = output["documents"]

        results: Dict = {"query": query, "answers": []}
@@ -795,8 +850,13 @@ class QuestionGenerationPipeline(BaseStandardPipeline):
        self.pipeline = Pipeline()
        self.pipeline.add_node(component=question_generator, name="QuestionGenerator", inputs=["Query"])

-    def run(self, documents, params: Optional[dict] = None):
-        output = self.pipeline.run(documents=documents, params=params)
+    def run(self,
+            documents,
+            params: Optional[dict] = None,
+            debug: Optional[bool] = None,
+            debug_logs: Optional[bool] = None
+            ):
+        output = self.pipeline.run(documents=documents, params=params, debug=debug, debug_logs=debug_logs)
        return output


@@ -810,8 +870,12 @@ class RetrieverQuestionGenerationPipeline(BaseStandardPipeline):
        self.pipeline.add_node(component=retriever, name="Retriever", inputs=["Query"])
        self.pipeline.add_node(component=question_generator, name="Question Generator", inputs=["Retriever"])

-    def run(self, query, params: Optional[dict] = None):
-        output = self.pipeline.run(query=query, params=params)
+    def run(self,
+            query: str,
+            params: Optional[dict] = None,
+            debug: Optional[bool] = None,
+            debug_logs: Optional[bool] = None):
+        output = self.pipeline.run(query=query, params=params, debug=debug, debug_logs=debug_logs)
        return output


@@ -842,8 +906,12 @@ class QuestionAnswerGenerationPipeline(BaseStandardPipeline):
            return kwargs, output_stream
        return wrapper

-    def run(self, documents: List[Document], params: Optional[dict] = None):  # type: ignore
-        output = self.pipeline.run(documents=documents, params=params)
+    def run(self,
+            documents: List[Document], # type: ignore
+            params: Optional[dict] = None,
+            debug: Optional[bool] = None,
+            debug_logs: Optional[bool] = None):
+        output = self.pipeline.run(documents=documents, params=params, debug=debug, debug_logs=debug_logs)
        return output
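
The forwarding pattern this commit applies to every standard pipeline can be sketched without Haystack installed. `StubPipeline` and `StubStandardPipeline` below are hypothetical stand-ins, not the real classes, and the per-node layout of `"_debug"` is an assumption. The docstrings above only state that debug information is returned under the `"_debug"` key, and that `params` is keyed by node name (`"Retriever"`, `"Reader"`).

```python
from typing import Any, Dict, List, Optional


class StubPipeline:
    """Hypothetical stand-in for haystack's Pipeline; not the real implementation."""

    def __init__(self) -> None:
        self.nodes: List[str] = []

    def add_node(self, name: str) -> None:
        self.nodes.append(name)

    def run(self,
            query: str,
            params: Optional[dict] = None,
            debug: Optional[bool] = None,
            debug_logs: Optional[bool] = None) -> Dict[str, Any]:
        params = params or {}
        output: Dict[str, Any] = {"query": query}
        if debug:
            # Assumed layout: one entry per node holding the params it received.
            output["_debug"] = {name: {"input": params.get(name, {})} for name in self.nodes}
        return output


class StubStandardPipeline:
    """Mirrors the wrappers in the diff: run() forwards both debug flags unchanged."""

    def __init__(self) -> None:
        self.pipeline = StubPipeline()
        self.pipeline.add_node("Retriever")
        self.pipeline.add_node("Reader")

    def run(self,
            query: str,
            params: Optional[dict] = None,
            debug: Optional[bool] = None,
            debug_logs: Optional[bool] = None) -> Dict[str, Any]:
        return self.pipeline.run(query=query, params=params, debug=debug, debug_logs=debug_logs)


pipe = StubStandardPipeline()
params = {"Retriever": {"top_k": 10}, "Reader": {"top_k": 5}}  # keys match node names
plain = pipe.run(query="Who wrote Faust?", params=params)
traced = pipe.run(query="Who wrote Faust?", params=params, debug=True)
```

Leaving both flags at their `None` default keeps the previous behaviour of the wrappers, so existing callers of `run(query=..., params=...)` are unaffected; only `traced` carries a `"_debug"` entry.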