Prajwal214 1f27bb7feb
Doc: Adding How-to Guide for Incident Manager (#16674)
* Doc: Adding Docs for Incident Manager

---------

Co-authored-by: Prajwal Pandit <prajwalpandit@Prajwals-MacBook-Air.local>
Co-authored-by: Shilpa Vernekar <94032785+ShilpaVernekar@users.noreply.github.com>
2024-06-17 09:28:50 -07:00

title: Incident Manager
slug: /how-to-guides/data-observability/incident-manager

Incident Manager

Incident Manager streamlines the handling of data quality issues. By centralizing the resolution process, assigning tasks, and logging root causes, it helps your team address and resolve failures quickly. The historical record of past incidents serves as a troubleshooting guide, and all the necessary context is available in one place, making it easier to maintain high data quality standards.

Overview of the Incident Manager

The Incident Manager serves as a centralized hub to handle the resolution flow of failed Data Quality Tests. When a test fails, users can:

  • Acknowledge the Issue: Recognize and confirm that there is a problem that needs attention.
  • Assign Responsibility: Designate a specific person or team to address the errors.
  • Log the Root Cause: Document the underlying cause of the failure for future reference and analysis.

Using the Test Resolution Flow

The Test Resolution flow is a critical feature of the Incident Manager. Here's how it works:

  1. Failure Notification: When a Data Quality Test fails, the system generates a notification.
  2. Acknowledge the Failure: The designated user acknowledges the issue within the Incident Manager.
  3. Assignment: The issue is then assigned to a knowledgeable user or team responsible for resolving it.
  4. Status Updates: The assigned user can update the status of the issue, keeping the organization informed about progress and any developments.
  5. Sharing Updates: All impacted users receive updates, ensuring everyone stays informed about the resolution process.
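
The resolution flow above can be pictured as a small state machine. The following is a minimal sketch for illustration only, not OpenMetadata's implementation: the `Status` values, the `Incident` class, its method names, and the sample test name are all hypothetical, chosen to mirror the steps listed above.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Status(Enum):
    # Illustrative states only; the names are assumptions, not the product's API.
    NEW = "new"                    # test failed, notification generated
    ACKNOWLEDGED = "acknowledged"  # someone confirmed the problem
    ASSIGNED = "assigned"          # an owner is working on it
    RESOLVED = "resolved"          # root cause logged, incident closed


# Allowed moves between states (hypothetical, for illustration).
ALLOWED = {
    Status.NEW: {Status.ACKNOWLEDGED},
    Status.ACKNOWLEDGED: {Status.ASSIGNED},
    Status.ASSIGNED: {Status.ASSIGNED, Status.RESOLVED},  # reassignment is allowed
    Status.RESOLVED: set(),
}


@dataclass
class Incident:
    test_name: str
    status: Status = Status.NEW
    assignee: Optional[str] = None
    root_cause: Optional[str] = None
    updates: list = field(default_factory=list)  # notes shared with impacted users

    def _move(self, new_status, note):
        # Reject any transition the table above does not allow.
        if new_status not in ALLOWED[self.status]:
            raise ValueError(
                f"cannot go from {self.status.value} to {new_status.value}"
            )
        self.status = new_status
        self.updates.append(note)

    def acknowledge(self, user):
        self._move(Status.ACKNOWLEDGED, f"{user} acknowledged the failure")

    def assign(self, assignee):
        self._move(Status.ASSIGNED, f"assigned to {assignee}")
        self.assignee = assignee

    def resolve(self, root_cause):
        self._move(Status.RESOLVED, f"resolved: {root_cause}")
        self.root_cause = root_cause


incident = Incident(test_name="orders_row_count_check")
incident.acknowledge("alice")
incident.assign("data-platform-team")
incident.resolve("upstream job skipped a partition")
```

Keeping the allowed transitions in one table makes the order of the flow explicit: in this sketch an incident cannot be resolved before it has been acknowledged and assigned, and every successful transition appends a note that could be shared with impacted users.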

Building a Troubleshooting Handbook

One of the powerful features of the Incident Manager is its ability to store all past failures. This historical data becomes a valuable troubleshooting handbook for your team. Here's how you can leverage it:

  • Explore Similar Scenarios: Review previous incidents to understand how similar issues were resolved.
  • Contextual Information: Access all necessary context directly within OpenMetadata, including previous resolutions, root causes, and responsible teams.
  • Continuous Improvement: Use historical data to improve data quality tests and prevent future failures.
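
As a rough sketch of how a record of past incidents can act as a handbook, the example below filters a list of past incidents for similar scenarios and ranks the root causes that were logged most often. The `history` records, their field names, and both helper functions are hypothetical. They stand in for whatever incident data your team can review and are not an OpenMetadata API.

```python
from collections import Counter

# Hypothetical record of past incidents; field names are illustrative.
history = [
    {"table": "orders", "test": "row_count_check",
     "root_cause": "late upstream load", "team": "ingestion"},
    {"table": "orders", "test": "null_check_customer_id",
     "root_cause": "schema change", "team": "platform"},
    {"table": "payments", "test": "row_count_check",
     "root_cause": "late upstream load", "team": "ingestion"},
]


def similar_incidents(history, table=None, test=None):
    """Return past incidents matching the given table and/or test name."""
    return [
        record for record in history
        if (table is None or record["table"] == table)
        and (test is None or record["test"] == test)
    ]


def most_common_root_causes(history, top=3):
    """Rank logged root causes by how often they occurred."""
    return Counter(record["root_cause"] for record in history).most_common(top)


matches = similar_incidents(history, test="row_count_check")
ranking = most_common_root_causes(history)
```

Here `matches` holds the earlier incidents for the same test, including their root causes and responsible teams, and `ranking` points to the causes worth addressing first when improving data quality tests.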

Steps to Get Started

  1. Access the Incident Manager: Navigate to the Incident Manager within the OpenMetadata platform.
  2. Monitor Data Quality Tests: Keep an eye on your data quality tests to quickly identify any failures.
  3. Acknowledge and Assign: Acknowledge any issues promptly and assign them to the appropriate team members.
  4. Log and Learn: Document the root cause of each failure and use the stored information to learn and improve.

Following these steps helps your organization manage data quality issues consistently, maintain high standards, and improve its data quality processes over time.

{%inlineCalloutContainer%} {%inlineCallout color="violet-70" bold="How to work with Incident Manager" icon="MdMenuBook" href="/how-to-guides/data-observability/incident-manager/workflow"%} Incident Manager Workflow {%/inlineCallout%} {%/inlineCalloutContainer%}