Pere Miquel Brull 34fbe5d64c
Docs - Prepare 1.7 docs and 1.8 snapshot (#20882)
* DOCS - Prepare 1.7 Release and 1.8 SNAPSHOT

2025-04-18 12:12:17 +05:30


title: Set Up Anomaly Detection in Collate for Data Quality
slug: /how-to-guides/data-quality-observability/anomaly-detection/setting-up

Steps to Set Up Anomaly Detection

1. Create a Test from the UI

  • Select the dataset and navigate to the Tests section in the Collate UI.
  • Define your test parameters. You can create a static test (e.g., "no null values" or "values must stay within a given range") or configure dynamic assertions so that the system learns the expected behavior from the data.

{% image src="/images/v1.7/how-to-guides/anomaly-detection/set-up-anomaly-detection-1.png" alt="Manual Configuration of Tests" caption="Manual Configuration of Tests" /%}

{% image src="/images/v1.7/how-to-guides/anomaly-detection/set-up-anomaly-detection-2.png" alt="Manual Configuration of Tests" caption="Manual Configuration of Tests" /%}
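To make the idea of a static test concrete, here is a minimal Python sketch of the "no null values" rule. It is an illustration of the logic only, not Collate's implementation or API; the function names and the sample column `order_totals` are hypothetical.

```python
from typing import Iterable, Optional


def count_nulls(values: Iterable[Optional[float]]) -> int:
    """Count entries that are None (treated here as null)."""
    return sum(1 for v in values if v is None)


def no_null_values(values: Iterable[Optional[float]]) -> bool:
    """Static assertion: passes only when the column contains no nulls."""
    return count_nulls(values) == 0


# Hypothetical column with one missing value.
order_totals = [19.99, None, 42.50, 7.25]
print(no_null_values(order_totals))  # False
```

The rule is fixed up front and does not depend on historical data, which is what distinguishes a static test from a dynamic assertion.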

2. Configure Manual Tests

  • For tighter control, set manual thresholds (e.g., sales should not exceed a maximum value of 100). Fixed thresholds give you explicit control over the validation criteria.
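The threshold example above (sales capped at 100) can be sketched in a few lines of Python. This is an assumed, simplified model of a manual threshold test rather than Collate's own code; `ThresholdResult` and `check_max_value` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class ThresholdResult:
    passed: bool
    failed_indexes: List[int]  # positions of values above the threshold


def check_max_value(values: Sequence[float], max_value: float) -> ThresholdResult:
    """Manual threshold: every value must be less than or equal to max_value."""
    failed = [i for i, v in enumerate(values) if v > max_value]
    return ThresholdResult(passed=not failed, failed_indexes=failed)


# Hypothetical sales figures checked against the maximum of 100.
sales = [40, 75, 120, 99, 101]
result = check_max_value(sales, max_value=100)
print(result.passed, result.failed_indexes)  # False [2, 4]
```

Returning the failing positions, not just a pass/fail flag, makes it easier to trace a failed test back to specific rows.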

3. Enable Dynamic Assertions

  • For data that naturally fluctuates or evolves, enable dynamic assertions. Collate will start profiling your data regularly to learn its normal behavior.
  • Over time (e.g., five weeks), the system will establish expected value ranges and detect any deviations from these patterns.

{% image src="/images/v1.7/how-to-guides/anomaly-detection/set-up-anomaly-detection-3.png" alt="Enabling Dynamic Assertions" caption="Enabling Dynamic Assertions" /%}
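The source does not describe how Collate learns expected ranges, so the following Python sketch uses a deliberately simple stand-in: the mean of past observations plus or minus `k` sample standard deviations. The function names, the `k = 3.0` default, and the sample row counts are all assumptions for illustration.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def learn_expected_range(history: Sequence[float], k: float = 3.0) -> Tuple[float, float]:
    """Derive an expected range as mean +/- k sample standard deviations."""
    if len(history) < 2:
        raise ValueError("need at least two observations to estimate spread")
    center = mean(history)
    spread = stdev(history)
    return center - k * spread, center + k * spread


def is_anomaly(value: float, expected_range: Tuple[float, float]) -> bool:
    """A value is anomalous when it falls outside the learned range."""
    low, high = expected_range
    return value < low or value > high


# Hypothetical profile metric (weekly row counts) collected over five weeks.
weekly_row_counts = [1000, 1020, 980, 1010, 990]
expected = learn_expected_range(weekly_row_counts)
print(is_anomaly(1005, expected))  # False
print(is_anomaly(1500, expected))  # True
```

The longer the history, the more stable the learned range becomes, which is why a dynamic assertion needs a warm-up period before its results are reliable.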

4. Monitor Incidents

  • After configuring tests, watch for incidents, which are raised when a test detects an anomaly.
  • Investigate significant spikes, drops, or unusual behaviors in the data, which may indicate system errors, backend failures, or unexpected external factors.

{% image src="/images/v1.7/how-to-guides/anomaly-detection/set-up-anomaly-detection-4.png" alt="Monitoring Incidents" caption="Monitoring Incidents" /%}
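When triaging, it can help to separate spikes from drops. The sketch below is a hypothetical illustration in Python, not Collate's incident model: it labels each out-of-range value given an expected range. The `Incident` class, the `find_incidents` function, and the sample data are assumptions.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Incident:
    index: int    # position of the offending observation
    value: float
    kind: str     # "spike" (above range) or "drop" (below range)


def find_incidents(values: Sequence[float], expected_range: Tuple[float, float]) -> List[Incident]:
    """Label every value outside the expected range as a spike or a drop."""
    low, high = expected_range
    incidents = []
    for i, v in enumerate(values):
        if v > high:
            incidents.append(Incident(i, v, "spike"))
        elif v < low:
            incidents.append(Incident(i, v, "drop"))
    return incidents


# Hypothetical daily order counts with an assumed expected range of 400 to 600.
daily_orders = [510, 495, 1200, 505, 20]
for incident in find_incidents(daily_orders, (400, 600)):
    print(incident)
```

A sudden drop to near zero often points to a pipeline or backend failure, while a spike may reflect duplicated loads or a genuine external event, so the two usually call for different follow-up.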

Best Practices

  • Use Static Assertions for Simple Rules: For basic data validation, such as preventing null values or enforcing a minimum threshold, static assertions are effective and straightforward to configure.
  • Leverage Dynamic Assertions for Evolving Data: When dealing with datasets that naturally fluctuate (e.g., sales or user activity), dynamic assertions save you from maintaining thresholds by hand and ensure incidents are triggered only when significant anomalies occur.
  • Regularly Review Incidents: Stay on top of incidents generated by anomaly detection to promptly identify and address data quality issues.
  • Combine Manual and Dynamic Methods: For datasets that have both hard limits and evolving characteristics, combining manual thresholds with dynamic assertions provides more complete anomaly detection coverage.
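As a rough illustration of the last practice, the Python sketch below evaluates a new value against both a fixed manual maximum and a range learned from history. It assumes a simple mean plus or minus `k` standard deviations rule, which is a stand-in and not necessarily what Collate uses; `evaluate`, `hard_max`, and the sample history are hypothetical.

```python
from statistics import mean, stdev
from typing import List, Sequence


def evaluate(value: float, history: Sequence[float], hard_max: float, k: float = 3.0) -> List[str]:
    """Return the reasons a value fails; an empty list means it passes both checks."""
    reasons = []
    # Manual threshold: a fixed business limit that never changes.
    if value > hard_max:
        reasons.append("exceeds manual maximum")
    # Dynamic check: compare against a range learned from past observations.
    if len(history) >= 2:
        center, spread = mean(history), stdev(history)
        if abs(value - center) > k * spread:
            reasons.append("outside learned range")
    return reasons


history = [1000, 1020, 980, 1010, 990]
print(evaluate(1005, history, hard_max=2000))  # []
print(evaluate(1500, history, hard_max=2000))  # ['outside learned range']
print(evaluate(2500, history, hard_max=2000))  # ['exceeds manual maximum', 'outside learned range']
```

The second call shows why the combination matters: 1500 is well under the hard limit, yet it is far from anything seen before, so only the dynamic check catches it.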