mirror of
https://github.com/open-metadata/OpenMetadata.git
synced 2025-08-31 12:39:01 +00:00
275 lines
12 KiB
Markdown
275 lines
12 KiB
Markdown
---
|
|
title: Upgrade 1.1 to 1.2
|
|
slug: /deployment/upgrade/versions/110-to-120
|
|
---
|
|
|
|
# Prerequisites
|
|
|
|
Everytime that you plan on upgrading OpenMetadata to a newer version, make sure to go over all these steps:
|
|
|
|
### Backup your Metadata
|
|
|
|
Before upgrading your OpenMetadata version we strongly recommend backing up the metadata.
|
|
|
|
The source of truth is stored in the underlying database (MySQL and Postgres supported). During each version upgrade there
|
|
is a database migration process that needs to run. It will directly attack your database and update the shape of the
|
|
data to the newest OpenMetadata release.
|
|
|
|
It is important that we backup the data because if we face any unexpected issues during the upgrade process,
|
|
you will be able to get back to the previous version without any loss.
|
|
|
|
{% note %}
|
|
|
|
You can learn more about how the migration process works [here](/deployment/upgrade/how-does-it-work).
|
|
|
|
{% /note %}
|
|
|
|
- To run the backup and restore commands, please make sure that you are always in the latest `openmetadata-ingestion` version to have all the improvements shipped in the CLI.
|
|
- Also, make sure you have connectivity between your database (MySQL / PostgreSQL) and the host machine where you will be running the below commands.
|
|
|
|
**1. Create a Virtual Environment and Install the Backup CLI**
|
|
|
|
```python
|
|
python -m venv venv
|
|
source venv/bin/activate
|
|
pip install openmetadata-ingestion~=1.2.0
|
|
```
|
|
|
|
Validate the installed metadata version with `python -m metadata --version`
|
|
|
|
**2. Run the Backup**
|
|
|
|
If using MySQL:
|
|
|
|
```bash
|
|
python -m metadata backup -u openmetadata_user -p openmetadata_password -H mysql -d openmetadata_db --port 3306
|
|
```
|
|
|
|
If using Postgres:
|
|
|
|
```bash
|
|
python -m metadata backup -u openmetadata_user -p openmetadata_password -H postgresql -d openmetadata_db --port 5432 -s public
|
|
```
|
|
|
|
**3. Store the backup file somewhere safe**
|
|
|
|
The above command will generate a backup file with extension as `.sql`. You can copy the name from the backup command output.
|
|
|
|
Make sure to store it somewhere safe in case you need to restore the data later.
|
|
|
|
You can refer to the following guide to get more details about the backup and restore:
|
|
|
|
{% inlineCalloutContainer %}
|
|
{% inlineCallout
|
|
color="violet-70"
|
|
icon="luggage"
|
|
bold="Backup Metadata"
|
|
href="/deployment/backup-restore-metadata" %}
|
|
Learn how to back up MySQL or Postgres data.
|
|
{% /inlineCallout %}
|
|
{% /inlineCalloutContainer %}
|
|
|
|
### Update `sort_buffer_size` (MySQL) or `work_mem` (Postgres)
|
|
|
|
Before running the migrations, it is important to update these parameters to ensure there are no runtime errors.
|
|
A safe value would be setting them to 20MB.
|
|
|
|
**If using MySQL**
|
|
|
|
You can update it via SQL (note that it will reset after the server restarts):
|
|
|
|
```sql
|
|
SET GLOBAL sort_buffer_size = 20971520
|
|
```
|
|
|
|
To make the configuration persistent, you'd need to navigate to your MySQL Server install directory and update the
|
|
`my.ini` or `my.cnf` [files](https://dev.mysql.com/doc/refman/8.0/en/option-files.html) with `sort_buffer_size = 20971520`.
|
|
|
|
If using RDS, you will need to update your instance's [Parameter Group](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_WorkingWithParamGroups.html)
|
|
to include the above change.
|
|
|
|
**If using Postgres**
|
|
|
|
You can update it via SQL (not that it will reset after the server restarts):
|
|
|
|
```sql
|
|
SET work_mem = '20MB';
|
|
```
|
|
|
|
To make the configuration persistent, you'll need to update the `postgresql.conf` [file](https://www.postgresql.org/docs/9.3/config-setting.html)
|
|
with `work_mem = 20MB`.
|
|
|
|
If using RDS, you will need to update your instance's [Parameter Group](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_WorkingWithParamGroups.html)
|
|
to include the above change.
|
|
|
|
Note that this value would depend on the size of your `query_entity` table. If you still see `Out of Sort Memory Error`s
|
|
during the migration after bumping this value, you can increase them further.
|
|
|
|
After the migration is finished, you can revert this changes.
|
|
|
|
# Deprecation Notice
|
|
|
|
- OpenMetadata only supports Python version 3.8 to 3.10. We will add support for 3.11 in the release 1.3.
|
|
- OpenMetadata version 0.13.x is deprecated.
|
|
|
|
# Breaking Changes
|
|
|
|
## 1.2.1
|
|
|
|
### Application Logo and Login Configuration Migrated to UI
|
|
|
|
The following configuration block has been removed from `openmetadata.yaml`:
|
|
|
|
```yaml
|
|
applicationConfig:
|
|
logoConfig:
|
|
customLogoUrlPath: ${OM_CUSTOM_LOGO_URL_PATH:-""}
|
|
customMonogramUrlPath: ${OM_CUSTOM_MONOGRAM_URL_PATH:-""}
|
|
loginConfig:
|
|
maxLoginFailAttempts: ${OM_MAX_FAILED_LOGIN_ATTEMPTS:-3}
|
|
accessBlockTime: ${OM_LOGIN_ACCESS_BLOCK_TIME:-600}
|
|
jwtTokenExpiryTime: ${OM_JWT_EXPIRY_TIME:-3600}
|
|
```
|
|
|
|
This change removes the traditional way of providing **Custom URL** logos configurations as part of OpenMetadata Configurations
|
|
file and migrate this to be driven and configured right from UI from `Settings` > `OpenMetadata` > `Custom Logo`.
|
|
|
|
The same applies to the **Login Configuration**, which can now be configured under `Settings` > `OpenMetadata` > `Login Configuration`.
|
|
|
|
Note that these environment variables will now have no effect. If you are deploying on Bare Metal, make sure to use the latest `openmetadata.yaml` file.
|
|
|
|
|
|
### OpenMetadata Helm Chart Dependencies Migrate from ElasticSearch to OpenSearch Charts
|
|
|
|
As part of `1.2.1`, we migrated the base dependencies for OpenMetadata Helm Chart to use OpenSearch version `2.7` instead of ElasticSearch `8.X`. This is a reactive change done as community driven [ElasticSearch Helm Chart](https://github.com/elastic/helm-charts) project has been deprecated in the favor of Elastic Stack Operator which cannot be added as an helm chart dependency.
|
|
|
|
For new users, this is an unnoticeable change who will be installing the OpenMetadata dependencies using quickstart guides.
|
|
|
|
For existing users, who have their proof-of-concept environments using the OpenMetadata Dependencies and are looking to upgrade to newer helm release -
|
|
- The default OpenMetadata helm values for `openmetadata.config.elasticsearch.*` has been updated to connect to OpenSearch from OpenMetadata Dependencies Helm Chart. Please refer to the [helm values](/deployment/kubernetes/helm-values) and update your custom installation accordingly.
|
|
- Post upgrade, you will need to follow the steps here to [rebuild and reindex your search indexing](/deployment/upgrade#reindex).
|
|
|
|
## 1.2.0
|
|
|
|
### Database connection SSL Configuration
|
|
|
|
With 1.2.X, the environment variable `DB_USE_SSL` is deprecated in favour of `DB_PARAMS`.
|
|
For Bare Metal and Docker Deployment, Add / Update the variable `DB_PARAMS` to `allowPublicKeyRetrieval=true&useSSL=true&serverTimezone=UTC` to enable ssl security to connect to database.
|
|
For Kubernetes Deployment, `openmetadata.config.database.dbParams` is available to pass the above values as helm values.
|
|
|
|
### Version Upgrades
|
|
|
|
- The OpenMetadata Server is now based on **JDK 17**
|
|
- OpenMetadata now **requires** **Elasticsearch** version **8.10.2** or **Opensearch** version **2.7**
|
|
|
|
There is no direct migration to bump the indexes to the new supported versions. You might see errors like:
|
|
|
|
```
|
|
java.lang.IllegalStateException: cannot upgrade a node from version [7.16.3] directly to version [8.5.1]
|
|
ERROR: Elasticsearch did not exit normally - check the logs at /usr/share/elasticsearch/logs/elasticsearch.log
|
|
ERROR: Elasticsearch exited unexpectedly
|
|
```
|
|
|
|
In order to move forward, **you must remove the volumes or delete the indexes** directly from your search instances. Note that
|
|
OpenMetadata stores everything in the database, so indexes can be recreated from the UI. We will
|
|
show you how in the [Post-Upgrade Steps](/deployment/upgrade#reindex).
|
|
|
|
### Helm Chart Values
|
|
|
|
- Added a new key `openmetadata.config.database.dbParams` to pass extra database parameters as string format, e.g., `useSSL=true&serverTimezone=UTC`.
|
|
- Removed the entry for `openmetadata.config.database.dbUseSSL`. You should use `openmetadata.config.database.dbParams` instead.
|
|
- Updated the ElasticSearch Helm Chart Dependencies to version 8.5.1
|
|
|
|
### Query Entity
|
|
|
|
The Query Entity now has the `service` property, linking the Query to the Database Service that it belongs to. Note
|
|
that `service` is a required property both for the Query Entity and the Create Query Entity.
|
|
|
|
During the migrations, we pick up the service from the tables from `queryUsedIn`. If this information is not available,
|
|
then there is no way to link a query to a service and the query will be removed.
|
|
|
|
### Service Connection Changes
|
|
|
|
- Domo Database, Dashboard and Pipeline renamed the `sandboxDomain` in favor of `instanceDomain`.
|
|
- The `DatabaseMetadata` configuration renamed `viewParsingTimeoutLimit` to `queryParsingTimeoutLimit`.
|
|
- The `DatabaseMetadata` configuration removed the `markAllDeletedTables` option. For simplicity, we'll only
|
|
mark as deleted the tables coming from the filtered ingestion results.
|
|
|
|
### Ingestion Framework Changes
|
|
|
|
We have reorganized the structure of the `Workflow` classes, which requires updated imports:
|
|
|
|
- **Metadata Workflow**
|
|
- From: `from metadata.ingestion.api.workflow import Workflow`
|
|
- To: `from metadata.workflow.metadata import MetadataWorkflow`
|
|
|
|
- **Lineage Workflow**
|
|
- From: `from metadata.ingestion.api.workflow import Workflow`
|
|
- To: `from metadata.workflow.metadata import MetadataWorkflow` (same as metadata)
|
|
|
|
- **Usage Workflow**
|
|
- From: `from metadata.ingestion.api.workflow import Workflow`
|
|
- To: `from metadata.workflow.usage import UsageWorkflow`
|
|
|
|
- **Profiler Workflow**
|
|
- From: `from metadata.profiler.api.workflow import ProfilerWorkflow`
|
|
- To: `from metadata.workflow.profiler import ProfilerWorkflow`
|
|
|
|
- **Data Quality Workflow**
|
|
- From: `from metadata.data_quality.api.workflow import TestSuiteWorkflow`
|
|
- To: `from metadata.workflow.data_quality import TestSuiteWorkflow`
|
|
|
|
- **Data Insights Workflow**
|
|
- From: `from metadata.data_insight.api.workflow import DataInsightWorkflow`
|
|
- To: `from metadata.workflow.data_insight import DataInsightWorkflow`
|
|
|
|
- **Elasticsearch Reindex Workflow**
|
|
- From: `from metadata.ingestion.api.workflow import Workflow`
|
|
- To: `from metadata.workflow.metadata import MetadataWorkflow` (same as metadata)
|
|
|
|
The `Workflow` class that you import can then be called as follows:
|
|
|
|
```python
|
|
|
|
|
|
workflow = workflow_class.create(workflow_config)
|
|
workflow.execute()
|
|
workflow.raise_from_status()
|
|
workflow.print_status() # This method has been updated. Before it was `workflow.print_status()`
|
|
workflow.stop()
|
|
```
|
|
|
|
If you try to run your workflows externally and start noticing `ImportError`s, you will need to review the points above.
|
|
|
|
### Metadata CLI Changes
|
|
|
|
In 1.1.7 and below you could run the Usage Workflow as `metadata ingest -c <path to yaml>`. Now, the Usage Workflow
|
|
has its own command `metadata usage -c <path to yaml>`.
|
|
|
|
### Custom Connectors
|
|
|
|
In 1.2.0 we have reorganized the internals of our Workflow handling to centralize status & exception management. This
|
|
will simplify how you need to take care of status and exceptions on your Custom Connectors code, while helping developers
|
|
to make decisions on those errors that need to be shared in the Workflow.
|
|
|
|
{% note %}
|
|
|
|
If you want to take a look at an updated Custom Connector and its changes, you can review the demo [PR](https://github.com/open-metadata/openmetadata-demo/pull/34/files).
|
|
|
|
{% /note %}
|
|
|
|
Let's list the changes down:
|
|
1. You don't need to handle the `SourceStatus` anymore. The new basic Workflow class will take care of things for you. Therefore, this import
|
|
`from metadata.ingestion.api.source import SourceStatus` is deprecated.
|
|
2. The `Source` class is now imported from `from metadata.ingestion.api.steps import Source` (instead of `from metadata.ingestion.api.source import Source`)
|
|
3. We are now initializing the `OpenMetadata` object at the Workflow level (to share it better in each step). Therefore,
|
|
the source `__init__` method signature is now `def __init__(self, config: WorkflowSource, metadata: OpenMetadata):`. Make sure to store the `self.metadata` object
|
|
during the `__init__` and don't forget to call `super().__init__()`.
|
|
4. We are updating how the status & exception management happens in the connectors. Now each `yield` result is wrapped by
|
|
an `Either` (imported from `from metadata.ingestion.api.models import Either`). Your correct data will be `yield`ed in a `right`, while
|
|
the errors are tracked in a `left`. Read more about the Workflow management [here](https://github.com/open-metadata/OpenMetadata/blob/main/ingestion/src/metadata/workflow/README.md).
|
|
|
|
### Other Changes
|
|
|
|
- Pipeline Status are now timestamps in milliseconds.
|