time, notifying you when a breaking change occurs.
In this article, we'll cover the basics of monitoring Schema Assertions - what they are, how to configure them, and more - so that you and your team can
start building trust in your most important data assets.
Let's get started!
## Support
Schema Assertions are currently supported for all data sources that provide a schema via the normal ingestion process.
## What is a Schema Assertion?
A **Schema Assertion** is a Data Quality rule used to monitor the columns in a particular table and their data types.
It allows you to define a set of "required" columns for the table along with their expected types, and then be notified
via a failing assertion if anything changes.
This type of assertion is particularly useful when you want to monitor the structure of a table that is outside of your
direct control, for example the result of an ETL process from an upstream application or tables provided by a third-party data vendor. It
lets you get ahead of potentially breaking schema changes by alerting you as soon as they occur, before
they have a chance to negatively impact downstream assets.
### Anatomy of a Schema Assertion
At the most basic level, **Schema Assertions** consist of a few important parts:
1. A **Condition Type**
2. A set of **Expected Columns**
In this section, we'll give an overview of each.
#### 1. Condition Type
The **Condition Type** defines the conditions under which the Assertion will **fail**. More concretely, it determines
how the _expected_ columns should be compared to the _actual_ columns found in the schema to determine a passing or failing
state for the data quality check.
The list of supported condition types:
- **Contains**: The assertion will fail if the actual schema does not contain all expected columns and their types.
- **Exact Match**: The assertion will fail if the actual schema does not EXACTLY match the expected columns and their types. No
additional columns will be permitted.
Schema Assertions will be evaluated whenever a change in the schema of the underlying table is detected.
Schema Assertions can also be started or stopped at any time by pressing the start (play) or stop (pause) buttons.
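
The difference between the two condition types can be illustrated with a minimal Python sketch. This is not DataHub's implementation: the function name `schema_assertion_passes`, the condition labels `"CONTAINS"` and `"EXACT_MATCH"`, and the representation of a schema as a name-to-type dictionary are all assumptions made for illustration. Column names are lowercased before comparison because the name comparison is described as not case-sensitive.

```python
from typing import Dict


def schema_assertion_passes(
    expected: Dict[str, str],
    actual: Dict[str, str],
    condition: str,
) -> bool:
    """Return True if the (hypothetical) assertion passes, False if it fails.

    Both schemas map a column name to a high-level type such as "STRING".
    """
    # Assumption: names compare case-insensitively, so normalize both sides.
    exp = {name.lower(): col_type.upper() for name, col_type in expected.items()}
    act = {name.lower(): col_type.upper() for name, col_type in actual.items()}

    if condition == "CONTAINS":
        # Every expected column must be present with the expected type;
        # extra columns in the actual schema are allowed.
        return all(act.get(name) == col_type for name, col_type in exp.items())
    if condition == "EXACT_MATCH":
        # The two column sets must be identical; extra columns cause a failure.
        return exp == act
    raise ValueError(f"Unknown condition type: {condition}")


expected = {"id": "NUMBER", "email": "STRING"}
actual = {"ID": "NUMBER", "email": "STRING", "created_at": "TIMESTAMP"}

print(schema_assertion_passes(expected, actual, "CONTAINS"))     # True
print(schema_assertion_passes(expected, actual, "EXACT_MATCH"))  # False
```

In this example the extra `created_at` column is tolerated by **Contains** but causes **Exact Match** to fail.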
#### 2. Expected Columns
The **Expected Columns** are a set of column **names** along with their high-level **data
types** that should be used to compare against the _actual_ columns found in the table. By default, the expected column
set will be derived from the current set of columns found in the table. This conveniently allows you to "freeze" or "lock"
the current schema of a table in just a few clicks.
Each "expected column" is composed of two parts:
1. **Name**: The name of the column that should be present in the table. Nested columns are supported in a flattened
fashion by simply providing a dot-separated path to the nested column. For example, `user.id` would be a nested column `id`.
In the case of a complex array or map, each field in the elements of the array or map will be treated as a dot-delimited column.
Note that verifying the specific type of object in primitive arrays or maps is not currently supported, and that the comparison performed
is currently not case-sensitive.
2. **Type**: The high-level data type of the column in the table. This type is intentionally "high level" to allow for normal column widening practices
without the risk of failing the assertion unnecessarily. For example, a `varchar(64)` and a `varchar(256)` will both resolve to the same high-level
"STRING" type. The currently supported set of data types includes the following:
- String
- Number
- Boolean
- Date
- Timestamp
- Struct
- Array
- Map
- Union
- Bytes
- Enum
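
To make the flattening and type-widening behavior concrete, here is a minimal Python sketch. It is not DataHub's implementation: the `NATIVE_TO_HIGH_LEVEL` mapping, the function names, and the nested-dictionary representation of a table schema are hypothetical, and the choice to also list the parent of a nested column as a `STRUCT` column is an assumption.

```python
import re
from typing import Any, Dict

# Hypothetical mapping from a few native SQL types to high-level types.
NATIVE_TO_HIGH_LEVEL = {
    "varchar": "STRING",
    "char": "STRING",
    "text": "STRING",
    "int": "NUMBER",
    "bigint": "NUMBER",
    "decimal": "NUMBER",
    "boolean": "BOOLEAN",
    "date": "DATE",
    "timestamp": "TIMESTAMP",
}


def high_level_type(native_type: str) -> str:
    """Drop any length/precision suffix, then look up the high-level type."""
    base = re.sub(r"\(.*\)$", "", native_type.strip().lower())
    return NATIVE_TO_HIGH_LEVEL[base]  # raises KeyError for unmapped types


def flatten_columns(schema: Dict[str, Any], prefix: str = "") -> Dict[str, str]:
    """Flatten a nested schema into dot-separated column paths."""
    flat: Dict[str, str] = {}
    for name, spec in schema.items():
        path = f"{prefix}{name}"
        if isinstance(spec, dict):
            # Assumption: the parent is itself recorded as a STRUCT column.
            flat[path] = "STRUCT"
            flat.update(flatten_columns(spec, prefix=f"{path}."))
        else:
            flat[path] = high_level_type(spec)
    return flat


table = {"user": {"id": "bigint", "email": "varchar(256)"}, "active": "boolean"}
print(flatten_columns(table))
# {'user': 'STRUCT', 'user.id': 'NUMBER', 'user.email': 'STRING', 'active': 'BOOLEAN'}
```

Because `varchar(64)` and `varchar(256)` both map to `STRING` in this sketch, widening a column from one to the other leaves the flattened result unchanged.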
## Creating a Schema Assertion
### Prerequisites
- **Permissions**: To create or delete Schema Assertions for a specific entity on DataHub, you'll need to be granted the
`Edit Assertions` and `Edit Monitors` privileges for the entity. These are granted to Entity owners by default as part of the
`Asset Owners - Metadata Policy`.
Once these are in place, you're ready to create your Schema Assertions!
6. Define the **expected columns** that will be continually compared against the actual column set. This defaults to the current columns for the table.
- **Raise incident**: Automatically raise a new DataHub Incident for the Table whenever the Schema Assertion is failing. This
may indicate that the Table is unfit for consumption. Configure Slack Notifications under **Settings** to be notified when
an incident is created due to an Assertion failure.
- **Resolve incident**: Automatically resolve any incidents that were raised due to failures in this Schema Assertion. Note that
any other incidents will not be impacted.
Then click **Next**.
7. (Optional) Add a human-readable **description** for the assertion. If you do not provide one, a description will be generated for you.