Using the OpenMetadata Azure SQL connector requires supporting services and software. Please ensure your host system meets the requirements listed below, then follow the procedure to install and configure this connector.
If you have not already deployed OpenMetadata, please follow the instructions to [Run OpenMetadata](../../../try-openmetadata/run-openmetadata.md) to get up and running.
In this step, we'll create a Python virtual environment. Using a virtual environment enables us to avoid conflicts with other Python installations and packages on your host system.
Throughout the docs, we use a consistent directory structure for OpenMetadata services and connector installation. If you have not already done so by following another guide, please create an `openmetadata` directory now and change into that directory in your command line environment.
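A minimal sketch of these setup steps, assuming a Unix-like shell with Python 3 available as `python3` (the virtual environment name `env` is arbitrary):

```bash
# Create the working directory used throughout the docs and change into it
mkdir openmetadata
cd openmetadata

# Create and activate a Python virtual environment for the connector
python3 -m venv env
source env/bin/activate
```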
Ensure that you have the latest version of pip by running the following command. If you have followed the steps above, this will upgrade pip in your virtual environment.
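For example, with the virtual environment from Step 1 active:

```bash
# Upgrade pip inside the active virtual environment
python3 -m pip install --upgrade pip
```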
Once the virtual environment is set up and activated as described in Step 1, run the following command to install the Python module for the Azure SQL connector.
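Using the package name referenced in the Troubleshooting section below, the install command looks like this:

```bash
# Install the OpenMetadata ingestion module with the Azure SQL extra
pip install 'openmetadata-ingestion[azuresql]'
```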
Create a new file called `azuresql.json` in the current directory. Note that the current directory should be the `openmetadata` directory you created in Step 1.
Note: The `source.config` field in the configuration JSON will include the majority of the settings for your connector. In the steps below we describe how to customize the key-value pairs in the `source.config` field to meet your needs.
In this step we will configure the Azure SQL service settings required for this connector. Please follow the instructions below to ensure that you've configured the connector to read from your Azure SQL service as desired.
Edit the value for `source.config.host_port` in `azuresql.json` for your Azure SQL deployment. Use the `host:port` format illustrated in the example below.
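The hostname and port below are placeholders; substitute the values for your own Azure SQL deployment.

```json
"host_port": "mydbserver.database.windows.net:1433"
```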
OpenMetadata uniquely identifies services by their `service_name`. Edit the value for `source.config.service_name` with a name that distinguishes this deployment from other services, including other Azure SQL services that you might be ingesting metadata from.
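For example (the name itself is a placeholder; choose one that is meaningful for your deployment):

```json
"service_name": "azuresql_prod"
```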
If you want to limit metadata ingestion to a single database, include the `source.config.database` field in your configuration file. If this field is not included, the connector will ingest metadata from all databases that the specified user is authorized to read.
To specify a single database to ingest metadata from, provide the name of the database as the value for the `source.config.database` key as illustrated in the example below.
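For example, assuming a database named `warehouse` (a placeholder; substitute your own database name):

```json
"database": "warehouse"
```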
When enabled, the data profiler will run as part of metadata ingestion. Running the data profiler increases the amount of time it takes for metadata ingestion, but provides the benefits mentioned above.
You may disable the data profiler by setting the value for the key `source.config.data_profiler_enabled` to `"false"` as follows. We've done this in the configuration template provided.
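```json
"data_profiler_enabled": "false"
```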
If you've enabled the data profiler in Step 5, run the following command to install the Python module for the data profiler. You'll need this to run the ingestion workflow.
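Assuming the profiler extra follows the same naming convention as the connector package (confirm against the release you are installing), the command is along these lines:

```bash
# Install the optional data profiler module
pip install 'openmetadata-ingestion[data-profiler]'
```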
Use `source.config.table_filter_pattern.excludes` to exclude all tables with names matching one or more of the supplied regular expressions. All other tables will be included. See below for an example. This example is also included in the configuration template provided.
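A sketch of an exclusion pattern; the regular expressions shown here are illustrative placeholders, so check your copied template for the exact example it ships with.

```json
"table_filter_pattern": {
  "excludes": ["information_schema.*", "[\\w]*event_vw.*"]
}
```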
Use `source.config.table_filter_pattern.includes` to include all tables with names matching one or more of the supplied regular expressions. All other tables will be excluded. See below for an example.
```json
"table_filter_pattern": {
"includes": ["corp.*", "dept.*"]
}
```
See the documentation for the [Python re module](https://docs.python.org/3/library/re.html) for information on how to construct regular expressions.
{% hint style="info" %}
You may use either `excludes` or `includes` but not both in `table_filter_pattern`.
{% endhint %}
Use the `source.config.schema_filter_pattern.excludes` and `source.config.schema_filter_pattern.includes` fields to select the schemas for metadata ingestion by name. The configuration template provides an example.
The syntax and semantics for `schema_filter_pattern` are the same as for [`table_filter_pattern`](azure-sql.md#table\_filter\_pattern-optional). Please check that section for details.
Use the `source.config.generate_sample_data` field to control whether or not to generate sample data to include in table views in the OpenMetadata user interface. The image below provides an example.
If set to true, the connector will collect the first 50 rows of data from each table included in ingestion, and catalog that data as sample data, which users can refer to in the OpenMetadata user interface.
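In the configuration file this is a simple flag; assuming the same string-valued style as `data_profiler_enabled`, it looks like this:

```json
"generate_sample_data": "true"
```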
DBT provides transformation logic that creates tables and views from raw data. OpenMetadata includes an integration for DBT that enables you to see the models used to generate a table from that table's details page in the OpenMetadata user interface. The image below provides an example.
To include DBT models and metadata in your ingestion workflows, specify the location of the DBT manifest and catalog files as fields in your configuration file.
#### dbt\_manifest\_file (optional)
Use the field `source.config.dbt_manifest_file` to specify the location of your DBT manifest file. See below for an example.
```json
"dbt_manifest_file": "./dbt/manifest.json"
```
#### dbt\_catalog\_file (optional)
Use the field `source.config.dbt_catalog_file` to specify the location of your DBT catalog file. See below for an example.
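Mirroring the manifest example above (the path is a placeholder for wherever your DBT catalog file lives):

```json
"dbt_catalog_file": "./dbt/catalog.json"
```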
You need not make any changes to the fields defined for `sink` in the template code you copied into `azuresql.json` in Step 4. This part of your configuration file should be as follows.
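A sketch of the standard REST sink used by the templates; confirm against the template you copied.

```json
"sink": {
  "type": "metadata-rest",
  "config": {}
}
```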
You need not make any changes to the fields defined for `metadata_server` in the template code you copied into `azuresql.json` in Step 4. This part of your configuration file should be as follows.
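A sketch of the typical `metadata_server` block, assuming a local OpenMetadata server at `http://localhost:8585` and the default no-auth provider; confirm against the template you copied.

```json
"metadata_server": {
  "type": "metadata-server",
  "config": {
    "api_endpoint": "http://localhost:8585/api",
    "auth_provider_type": "no-auth"
  }
}
```

With the configuration complete, the ingestion workflow is run with the `metadata` CLI installed by the connector package. Assuming the file name used above:

```bash
# Run the ingestion workflow defined in azuresql.json
metadata ingest -c ./azuresql.json
```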
As the ingestion workflow runs, you may observe progress both from the command line and from the OpenMetadata user interface. To view the metadata ingested from Azure SQL, visit [http://localhost:8585/explore/tables](http://localhost:8585/explore/tables). Select the Azure SQL service to filter for the data you've ingested using the workflow you configured and ran following this guide. The image below provides an example.
When attempting to install the `openmetadata-ingestion[azuresql]` Python package in Step 2, you might encounter the following error. The error might include a mention of a Rust compiler.
If you encounter the following error when attempting to run the ingestion workflow in Step 12, this is probably because there is no OpenMetadata server running at http://localhost:8585.
To correct this problem, please follow the steps in the [Run OpenMetadata](../../../try-openmetadata/run-openmetadata.md) guide to deploy OpenMetadata in Docker on your local machine.