Docs - Add update highlights (#7148)

* Add update highlights

* Update openmetadata-docs/content/deployment/upgrade/versions/011-to-012.md

Co-authored-by: Teddy <teddy.crepineau@gmail.com>

* Added note regarding Snowflake multithreading for the profiler

* Airflow version upgrade

* Add link

Co-authored-by: Teddy <teddy.crepineau@gmail.com>
Pere Miquel Brull 2022-09-06 07:19:27 +02:00 committed by GitHub
parent f463cf48c0
commit bc3934f668
4 changed files with 86 additions and 1 deletions


@ -16,7 +16,7 @@ now allows all users to perform backups regardless of the underlying infrastruct
## Requirements
-The backup CLI needs to be used with `openmetadata-ingestion` version 0.12 or higher.
+The backup CLI needs to be used with `openmetadata-ingestion` version 0.11.5 or higher.
## Installation


@ -0,0 +1,75 @@
---
title: Upgrade 0.11 to 0.12
slug: /deployment/upgrade/versions/011-to-012
---
# Upgrade from 0.11 to 0.12
Upgrading from 0.11 to 0.12 can be done directly on your instances. This page lists a few details that you
should take into consideration when running the upgrade.
## Highlights
### Data Profiler and Data Quality Tests
On 0.11, the Profiler Workflow handled two things:
- Computing metrics on the data
- Running the configured Data Quality Tests
There has been a major overhaul of both the UI, which now shows all historical data, and the internals.
The main points to consider:
1. Tests now run with the Test Suite workflow and cannot be configured in the Profiler Workflow
2. Any past test data will be cleaned up during the upgrade to 0.12.0, as the internal data storage has been improved
3. The Profiler Ingestion Pipelines will be cleaned up during the upgrade to 0.12.0 as well.
### Profiler Workflow Updates
On top of the information above, the `fqnFilterPattern` has been replaced by the same filter patterns we use for ingestion:
`databaseFilterPattern`, `schemaFilterPattern` and `tableFilterPattern`.
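
As a rough sketch of what this could look like in a Profiler workflow, assuming the three patterns sit under `sourceConfig.config` and take the same `includes` / `excludes` lists of regular expressions as the metadata ingestion filters. The regular expressions below are placeholders, not values from a real deployment:

```yaml
# Sketch only: placement under sourceConfig.config is an assumption,
# and the regexes are illustrative placeholders.
sourceConfig:
  config:
    type: Profiler
    # These three patterns take over from the single fqnFilterPattern used in 0.11
    databaseFilterPattern:
      includes:
        - ^analytics$
    schemaFilterPattern:
      excludes:
        - ^information_schema$
    tableFilterPattern:
      includes:
        - ^fact_.*
```

Splitting the filter by level means a pattern written for table names does not need to account for the database or schema part of the fully qualified name.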
In the `processor` you can now configure:
- `profileSample` to specify the % of the table to run the profiling on
- `columnConfig.profileQuery` as a query to use to sample the data of the table
- `columnConfig.excludeColumns` and `columnConfig.includeColumns` to mark which columns to skip or include.
  - In `columnConfig.includeColumns` you can also specify a list of `metrics` to run from our supported metrics.
### Profiler Multithreading for Snowflake users
In OpenMetadata 0.12 we have migrated the metrics computation to multithreading. This migration reduced metrics computation time by 70%.
For Snowflake users, there is a known issue with the Python package `snowflake-connector-python` in Python 3.9, where multithreading creates a circular import of the package. We highly recommend that you either 1) run the ingestion workflow in a Python 3.8 environment or 2) if you cannot manage your environment, set `ThreadCount` to 1. You can find more information on the profiler settings [here](/openmetadata/ingestion/workflows/profiler).
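
A minimal sketch of the second option, assuming the thread count is set in the Profiler `sourceConfig` and is spelled `threadCount` in the YAML. Both the placement and the spelling are assumptions, so check them against the profiler settings page linked above. The service name is a placeholder:

```yaml
# Sketch only: key placement and lowerCamelCase spelling are assumptions.
source:
  type: snowflake
  serviceName: my_snowflake   # placeholder service name
  sourceConfig:
    config:
      type: Profiler
      threadCount: 1          # single thread, to sidestep the circular import on Python 3.9
```

Running with a single thread gives up the multithreading speed-up, so the Python 3.8 environment is the better choice when you control the runtime.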
### Airflow Version
The Airflow version from the Ingestion container image has been upgraded to `2.3.3`.
Note that this is now the version used to run the Airflow metadata extraction. This has an impact,
for example, when ingesting status from Airflow 2.1.4 (see [issue 7228](https://github.com/open-metadata/OpenMetadata/issues/7228)).
Moreover, the authentication mechanism that Airflow exposes for the custom plugins has changed. This required
us to fully update how we were handling the managed APIs, both on the plugin side and the OpenMetadata API (which is
the one sending the authentication).
To continue working with your own Airflow linked to the OpenMetadata UI for ingestion management, we recommend migrating
to Airflow 2.3.3.
If the Airflow instance you use to prepare the ingestion from the UI is stuck on version 2.1.4 and cannot be
upgraded, but you still want to use OpenMetadata 0.12, reach out to us.
### Service Connection Updates
- DynamoDB
  - Removed: `database`
- Deltalake
  - Removed: `connectionOptions` and `supportsProfiler`
- Looker
  - Renamed `username` to `clientId` and `password` to `clientSecret` to align on the internals required for the metadata extraction.
  - Removed: `env`
- Oracle
  - Removed: `databaseSchema` and `oracleServiceName` from the root.
  - Added: `oracleConnectionType` which will either contain `oracleServiceName` or `databaseSchema`. This will reduce confusion on setting up the connection.
- Athena
  - Removed: `hostPort`
- Databricks
  - Removed: `username` and `password`


@ -24,4 +24,12 @@ You can find further information about specific version upgrades in the followin
>
Upgrade from 0.10 to 0.11 inplace.
</InlineCallout>
<InlineCallout
color="violet-70"
icon="10k"
bold="Upgrade 0.12"
href="/deployment/upgrade/versions/011-to-012"
>
Upgrade from 0.11 to 0.12 inplace.
</InlineCallout>
</InlineCalloutContainer>


@ -146,6 +146,8 @@ site_menu:
  url: /deployment/upgrade/versions/090-to-010
- category: Deployment / Upgrade OpenMetadata / Upgrade Version Instructions / 0.10 to 0.11
  url: /deployment/upgrade/versions/010-to-011
- category: Deployment / Upgrade OpenMetadata / Upgrade Version Instructions / 0.11 to 0.12
  url: /deployment/upgrade/versions/011-to-012
- category: Deployment / Server Configuration Reference
  url: /deployment/configuration