Docs: Updated the Getting Started Collate Docs (#17824)

Co-authored-by: Prajwal Pandit <prajwalpandit@Prajwals-MacBook-Air.local>
Prajwal214 2024-09-13 00:03:03 +05:30 committed by GitHub
parent 681a15c4a5
commit 3f582e3063
11 changed files with 345 additions and 100 deletions


@@ -3,39 +3,25 @@ site_menu:
- category: Home
url: /
- category: Enable Security
url: /security
- category: Getting Started
url: /getting-started
color: violet-70
icon: deployment
- category: Enable Security / Basic Authentication
url: /security/basic-auth
- category: Enable Security / Ldap Authentication
url: /security/ldap
- category: Enable Security / Auth0 SSO
url: /security/auth0
- category: Enable Security / Azure SSO
url: /security/azure
- category: Enable Security / Custom OIDC SSO
url: /security/custom-oidc
- category: Enable Security / OIDC SSO
url: /security/oidc
- category: Enable Security / Google SSO
url: /security/google
- category: Enable Security / Okta SSO
url: /security/okta
- category: Enable Security / Amazon Cognito SSO
url: /security/amazon-cognito
- category: Enable Security / One Login SSO
url: /security/one-login
- category: Enable Security / Keycloak SSO
url: /security/keycloak
- category: Enable Security / Saml
url: /security/saml
- category: Enable Security / Saml / AWS
url: /security/saml/aws
- category: Enable Security / Saml / Azure
url: /security/saml/azure
icon: openmetadata
- category: Getting Started / Day 1
url: /getting-started/day-1
- category: Getting Started / Day 1 / Hybrid SaaS
url: /getting-started/day-1/hybrid-saas
- category: Getting Started / Day 1 / Hybrid SaaS / Airflow
url: /getting-started/day-1/hybrid-saas/airflow
- category: Getting Started / Day 1 / Hybrid SaaS / MWAA
url: /getting-started/day-1/hybrid-saas/mwaa
- category: Getting Started / Day 1 / Hybrid SaaS / GCS Composer
url: /getting-started/day-1/hybrid-saas/gcs-composer
- category: Getting Started / Day 1 / Hybrid SaaS / GitHub Actions
url: /getting-started/day-1/hybrid-saas/github-actions
- category: Getting Started / Day 1 / Hybrid SaaS / Credentials
url: /getting-started/day-1/hybrid-saas/credentials
- category: Connectors
url: /connectors
@@ -718,27 +704,41 @@ site_menu:
url: /how-to-guides/data-governance/domains-&-data-products/domains
- category: How-to Guides / Data Governance / Domains & Data Product / How to Use Data Products
url: /how-to-guides/data-governance/domains-&-data-products/data-products
- category: Getting Started
url: /getting-started
color: violet-70
icon: openmetadata
- category: Getting Started / Day 1
url: /getting-started/day-1
- category: Getting Started / Day 1 / Hybrid SaaS
url: /getting-started/day-1/hybrid-saas
- category: Getting Started / Day 1 / Hybrid SaaS / Airflow
url: /getting-started/day-1/hybrid-saas/airflow
- category: Getting Started / Day 1 / Hybrid SaaS / MWAA
url: /getting-started/day-1/hybrid-saas/mwaa
- category: Getting Started / Day 1 / Hybrid SaaS / GCS Composer
url: /getting-started/day-1/hybrid-saas/gcs-composer
- category: Getting Started / Day 1 / Hybrid SaaS / GitHub Actions
url: /getting-started/day-1/hybrid-saas/github-actions
- category: Getting Started / Day 1 / Hybrid SaaS / Credentials
url: /getting-started/day-1/hybrid-saas/credentials
- category: Enable Security
url: /security
color: violet-70
icon: deployment
- category: Enable Security / Basic Authentication
url: /security/basic-auth
- category: Enable Security / Ldap Authentication
url: /security/ldap
- category: Enable Security / Auth0 SSO
url: /security/auth0
- category: Enable Security / Azure SSO
url: /security/azure
- category: Enable Security / Custom OIDC SSO
url: /security/custom-oidc
- category: Enable Security / OIDC SSO
url: /security/oidc
- category: Enable Security / Google SSO
url: /security/google
- category: Enable Security / Okta SSO
url: /security/okta
- category: Enable Security / Amazon Cognito SSO
url: /security/amazon-cognito
- category: Enable Security / One Login SSO
url: /security/one-login
- category: Enable Security / Keycloak SSO
url: /security/keycloak
- category: Enable Security / Saml
url: /security/saml
- category: Enable Security / Saml / AWS
url: /security/saml/aws
- category: Enable Security / Saml / Azure
url: /security/saml/azure
- category: Releases
url: /releases
color: violet-70
@@ -749,6 +749,34 @@ site_menu:
url: /releases/supported
- category: Releases / All Releases
url: /releases/all-releases
- category: Releases / All Releases / 1.5.2 Release
url: /releases/all-releases/#1.5.2-release
- category: Releases / All Releases / 1.5.1 Release
url: /releases/all-releases/#1.5.1-release
- category: Releases / All Releases / 1.4.8 Release
url: /releases/all-releases/#1.4.8-release
- category: Releases / All Releases / 1.4.7 Release
url: /releases/all-releases/#1.4.7-release
- category: Releases / All Releases / 1.4.6 Release
url: /releases/all-releases/#1.4.6-release
- category: Releases / All Releases / 1.4.5 Release
url: /releases/all-releases/#1.4.5-release
- category: Releases / All Releases / 1.4.4 Release
url: /releases/all-releases/#1.4.4-release
- category: Releases / All Releases / 1.4.3 Release
url: /releases/all-releases/#1.4.3-release
- category: Releases / All Releases / 1.4.2 Release
url: /releases/all-releases/#1.4.2-release
- category: Releases / All Releases / 1.4.1 Release
url: /releases/all-releases/#1.4.1-release
- category: Releases / All Releases / 1.4.0 Release
url: /releases/all-releases/#1.4.0-release
- category: Releases / All Releases / 1.3.4 Release
url: /releases/all-releases/#1.3.4-release
- category: Releases / All Releases / 1.3.3 Release
url: /releases/all-releases/#1.3.3-release
- category: Releases / All Releases / 1.3.2 Release
url: /releases/all-releases/#1.3.2-release
- category: Releases / All Releases / 1.3.1 Release
url: /releases/all-releases/#1.3.1-release
- category: Releases / All Releases / 1.3.0 Release


@@ -0,0 +1,54 @@
---
title: Running Connector using Collate SaaS
slug: /getting-started/day-1/collate-saas
collate: true
---
## Setting Up a Database Service for Metadata Extraction
You can set up a database service for metadata extraction from Collate SaaS in just a few minutes. For example, here's how to set up a connection using the `Snowflake` connector:
1. Log in to your Collate SaaS instance, navigate to **Settings > Services > Databases**, and click **Add New Service**.
{% image
src="/images/v1.5/getting-started/add-service.png"
alt="Adding Database Service"
height="450px"
caption="Adding Database Service" /%}
2. **Select the database type** you want to use. Enter details such as the name and description to identify the database. In this case, we are selecting `Snowflake`.
{% image
src="/images/v1.5/getting-started/select-service.png"
alt="Selecting Database Service"
height="850px"
caption="Selecting Database Service" /%}
3. **Enter the connection details.** You can view the available documentation in the side panel for guidance. Also, refer to the connector [documentation](/connectors).
{% image
src="/images/v1.5/getting-started/configure-connector.png"
alt="Updating Connection Details"
height="850px"
caption="Updating Connection Details" /%}
4. **Allow the Collate SaaS IP.** In the Connection Details, you will see the IP address unique to your cluster. You need to allow this IP to access the data source.
{% note %}
This step is required only for Collate SaaS. If you are using Hybrid SaaS, you will not see the IP address in the Service Connection details.
{% /note %}
{% image
src="/images/v1.5/getting-started/collate-saas-ip.png"
alt="Collate SaaS IP"
height="850px"
caption="Collate SaaS IP" /%}
5. **Test the connection** to verify the status. The test connection checks whether the service is reachable from Collate.
{% image
src="/images/v1.5/getting-started/test-connection.png"
alt="Verifying the Test Connection"
height="850px"
caption="Verifying the Test Connection" /%}
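
If the test connection fails, it can help to first rule out basic network problems. The following is a minimal sketch, not part of Collate: `is_reachable` is a hypothetical helper that only checks whether a TCP connection to a host and port can be opened. Running it from your own machine does not prove that the Collate SaaS IP has been allowed; it only confirms that the endpoint accepts connections at all.

```python
import socket


def is_reachable(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection resolves the host and opens a TCP socket.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False
```

For a Snowflake account you might call `is_reachable("<your-account>.snowflakecomputing.com", 443)`, where the hostname is a placeholder for your own account URL.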


@@ -6,12 +6,14 @@ collate: true
# Getting Started: Day 1
Lets get started with your Collate service in five steps:
1. Set up a data connector
2. Ingest metadata
3. Invite users
4. Add roles
5. Create teams and add users
Get started with your Collate service in just a few simple steps:
1. Set up a Data Connector: Connect your data sources to begin collecting metadata.
2. Ingest Metadata: Run the metadata ingestion to gather and push data insights.
3. Invite Users: Add team members to collaborate and manage metadata together.
4. Explore the Features: Dive into Collate's rich feature set to unlock the full potential of your data.
**Ready to begin? Let's get started!**
## Requirements
@@ -29,17 +31,17 @@ Connections to [custom data sources](/connectors/custom-connectors) can also be
There are two options for setting up a data connector:
1. **Run the connector in Collate SaaS**: In this scenario, you'll get an IP when you add the service. You need to give
access to this IP in your data sources.
2. **Run the connector in your infrastructure or laptop**: In this case, Collate won't be accessing the data, but rather
you'd control where and how the process is executed and Collate will only receive the output of the metadata extraction.
This is an interesting option for sources lying behind private networks or when external SaaS services are not allowed to
connect to your data sources. You can read more about how to extract metadata in these cases [here](/getting-started/day-1/hybrid-saas).
{% tilesContainer %}
{% tile
title="Run the connector in Collate SaaS"
description="Guide to start ingesting metadata seamlessly from your data sources."
link="/getting-started/day-1/collate-saas"
icon="discovery"
/%}
{% /tilesContainer %}
You can easily set up a database service in minutes to run the metadata extraction directly from Collate SaaS:
- Navigate to **Settings > Services > Databases**.
- Click on **Add New Service**.
- Select the database type you want. Enter the information, like name and description, to identify the database.
- Enter the Connection Details. You can view the documentation available in the side panel.
- Test the connection to verify the connection status.
2. **Run the connector in your infrastructure or laptop**: The hybrid model offers organizations the flexibility to run metadata ingestion components within their own infrastructure. This approach ensures that Collate's managed service doesn't require direct access to the underlying data. Instead, only the metadata is collected locally and securely transmitted to our SaaS platform, maintaining data privacy and security while still enabling robust metadata management. You can read more about how to extract metadata in these cases [here](/getting-started/day-1/hybrid-saas).
## Step 2: Ingest Metadata
@@ -60,53 +62,93 @@ Once the metadata is ingested into the platform, you can [invite users](/how-to-
to collaborate on the data and assign different roles.
- Navigate to **Settings > Team & User Management > Users**.
{% image
src="/images/v1.5/getting-started/users.png"
alt="Users Navigation"
height="450px"
caption="Users Navigation" /%}
- Click on **Add User**, and enter their email and other details to provide access to the platform.
{% image
src="/images/v1.5/getting-started/add-users.png"
alt="Adding New User"
height="750px"
caption="Adding New User" /%}
- You can organize users into different Teams, as well as assign them to different Roles.
- Users will inherit the access defined for their assigned Teams and Roles.
- Admin access can also be granted. Admins will have access to all settings and can invite other users.
- New users will receive an email invitation to set up their account.
## Step 4: Add Roles and Policies
## Step 4: Explore Features of OpenMetadata
Add well-defined roles based on the users job description, such as Data Scientist or Data Steward.
Each role can be associated with certain policies, such as the Data Consumer Policy. These policies further comprise
fine-grained Rules to define access.
OpenMetadata provides a comprehensive solution for data teams to break down silos, securely share data assets across various sources, foster collaboration around trusted data, and establish a documentation-first data culture within the organization.
- Navigate to **Settings > Access Control** to define the Rules, Policies, and Roles.
- Refer to [this use case guide](/how-to-guides/admin-guide/roles-policies/use-cases) to understand the configuration for different circumstances.
- Start by creating a Policy. Define the rules for the policy.
- Then, create a Role and apply the related policies.
- Navigate to **Settings > Team & User Management** to assign roles to users or teams.
{% tilesContainer %}
{% tile
title="Data Discovery"
description="Discover the right data assets to make timely business decisions."
link="/how-to-guides/data-discovery"
icon="discovery"
/%}
{% tile
title="Data Collaboration"
description="Foster data team collaboration to enhance data understanding."
link="/how-to-guides/data-collaboration"
icon="collaboration"
/%}
{% tile
title="Data Quality & Observability"
description="Trust your data with quality tests & monitor the health of your data systems."
link="/how-to-guides/data-quality-observability"
icon="observability"
/%}
{% tile
title="Data Lineage"
description="Trace the path of data across tables, pipelines, and dashboards."
link="/how-to-guides/data-lineage"
icon="lineage"
/%}
{% tile
title="Data Insights"
description="Define KPIs and set goals to proactively hone the data culture of your company."
link="/how-to-guides/data-insights"
icon="discovery"
/%}
{% tile
title="Data Governance"
description="Enhance your data platform governance using OpenMetadata."
link="/how-to-guides/data-governance"
icon="governance"
/%}
{% /tilesContainer %}
For more detailed instructions, refer to the [Advanced Guide for Roles and Policies](/how-to-guides/admin-guide/roles-policies).
## Deep Dive into OpenMetadata: Guides for Admins and Data Users
## Step 5: Create Teams and Assign Users
Now that you have users added and roles defined, grant users access to the data assets they need. The easiest way to
manage this at scale is to create teams with the appropriate permissions, and to invite users to their assigned teams.
- Collate supports a hierarchical team structure with [multiple team types](/how-to-guides/admin-guide/teams-and-users/team-structure-openmetadata).
- The root team-type Organization supports other child teams and users within it.
- Business Units, Divisions, Departments, and Groups are the other team types in the hierarchy.
- Note: Only the team-type Organization and Groups can have users. Only the team-type Groups can own data assets.
Planning the [team hierarchy](/how-to-guides/admin-guide/teams-and-users/team-structure-openmetadata) can help save time
later, when creating the teams structure in **Settings > Team and User Management > Teams**. Continue to invite additional
users to onboard them to Collate, with their assigned teams and roles.
## Next Steps
You now have data sources loaded into Collate, and team structure set up. Continue to add more data sources to gain a
more complete view of your data estate, and invite users to foster broader collaboration. You can check out
the [advanced guide to roles and policies](/how-to-guides/admin-guide/roles-policies) to fine-tune role or team access to data.
{% tilesContainer %}
{% tile
title="Admin Guide"
description="Admin users can get started with OpenMetadata with just three quick and easy steps & know-it-all with the advanced guides."
link="/how-to-guides/admin-guide"
icon="administration"
/%}
{% tile
title="Guide for Data Users"
description="Get to know the basics of OpenMetadata and about the data assets that you can explore in the all-in-one platform."
link="/how-to-guides/guide-for-data-users"
icon="steward"
/%}
{% /tilesContainer %}
From here, you can further your understanding and management of your data with Collate:
- You can check out the [advanced guide to roles and policies](/how-to-guides/admin-guide/roles-policies) to fine-tune role or team access to data.
- Trace your data flow with [column-level lineage](/how-to-guides/data-lineage) graphs to understand where your data comes from, how it is used, and how it is managed.
- Build [no-code data quality tests](/how-to-guides/data-quality-observability/quality/tab) to ensure the technical and
business quality of your data, and set up an [alert](/how-to-guides/data-quality-observability/observability) on test case failures so that you are quickly notified of critical data issues.
- Write [Knowledge Center](/how-to-guides/data-collaboration/knowledge-center) articles associated with data assets to document key information for your team, such as technical details, business context, and best practices.
- Review the different [Data Insights Reports](/how-to-guides/data-insights/report) on Data Assets, App Analytics, KPIs, and [Cost Analysis](/how-to-guides/data-insights/cost-analysis) to understand the health, utilization, and costs of your data estate.
- Build no-code workflows with [Metadata Automations](https://www.youtube.com/watch?v=ug08aLUyTyE&ab_channel=OpenMetadata) to add attributes like owners, tiers, domains, descriptions, glossary terms, and more to data assets, as well as propagate them using column-level lineage for more automated data management.
You can also review additional [How-To Guides](/how-to-guides) on popular topics like data discovery, data quality, and data governance.
- Build no-code workflows with [Metadata Automations](https://www.youtube.com/watch?v=ug08aLUyTyE&ab_channel=OpenMetadata) to add attributes like owners, tiers, domains, descriptions, glossary terms, and more to data assets, as well as propagate them using column-level lineage for more automated data management.


@@ -0,0 +1,121 @@
---
title: Basic Authentication
slug: /deployment/security/basic-auth
collate: false
---
# Username/Password Login
Out of the box, OpenMetadata comes with a Username & Password Login Mechanism.
The default Username and Password for Login are:
```commandline
Username - admin@open-metadata.org
Password - admin
```
When using a custom domain, configure the principal domain as follows:
```yaml
config:
  authorizer:
    adminPrincipals: [admin]
    principalDomain: "yourdomain.com"
```
With this setup, the default Username will be `admin@yourdomain.com`.
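
As a rough illustration of how the login username follows from these two settings, the sketch below joins each admin principal with the principal domain. The function name `admin_emails` is hypothetical and only mirrors the `name@domain` pattern described above; it is not OpenMetadata code.

```python
def admin_emails(admin_principals, principal_domain):
    """Build the login email for each admin principal as `name@domain`."""
    return [f"{name.strip()}@{principal_domain}" for name in admin_principals]


print(admin_emails(["admin"], "yourdomain.com"))  # ['admin@yourdomain.com']
```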
{%important%}
Security requirements for your **production** environment:
- **DELETE** the admin default account shipped by OM.
- **UPDATE** the Private / Public keys used for the [JWT Tokens](/deployment/security/enable-jwt-tokens) in case it is enabled.
{%/important%}
# Setting up Basic Auth Manually
Below are the required steps to set up the Basic Login:
## Set up Configurations in openmetadata.yaml
### Authentication Configuration
The following configuration controls the auth mechanism for OpenMetadata. Update the mentioned fields as required.
```yaml
authenticationConfiguration:
  provider: ${AUTHENTICATION_PROVIDER:-basic}
  publicKeyUrls: ${AUTHENTICATION_PUBLIC_KEYS:-[{your domain}/api/v1/system/config/jwks]} # Update with your domain, and make sure "/api/v1/system/config/jwks" is always configured to enable JWT tokens
  authority: ${AUTHENTICATION_AUTHORITY:-https://accounts.google.com}
  enableSelfSignup: ${AUTHENTICATION_ENABLE_SELF_SIGNUP:-true}
```
For Basic Auth, set the following:
- `provider`: basic
- `publicKeyUrls`: {http|https}://{your_domain}:{port}/api/v1/system/config/jwks
- `authority`: {your_domain}
- `enableSelfSignup`: Indicates whether users can sign up for OpenMetadata by themselves.
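
The `publicKeyUrls` pattern above can be sketched as a small helper. `jwks_url` is a hypothetical function used only to show how the scheme, domain, and port combine with the fixed `/api/v1/system/config/jwks` path; the domain and port in the example call are placeholders.

```python
JWKS_PATH = "/api/v1/system/config/jwks"


def jwks_url(scheme, domain, port=None):
    """Return the JWKS URL for the given server address."""
    if scheme not in ("http", "https"):
        raise ValueError(f"unsupported scheme: {scheme!r}")
    # The port is optional, for deployments served on the default port.
    host = f"{domain}:{port}" if port is not None else domain
    return f"{scheme}://{host}{JWKS_PATH}"


print(jwks_url("https", "yourdomain.com", 8585))
# https://yourdomain.com:8585/api/v1/system/config/jwks
```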
### Authorizer Configuration
This configuration controls the authorizer for OpenMetadata:
```yaml
authorizerConfiguration:
  adminPrincipals: ${AUTHORIZER_ADMIN_PRINCIPALS:-[admin]}
  allowedEmailRegistrationDomains: ${AUTHORIZER_ALLOWED_REGISTRATION_DOMAIN:-["all"]}
  principalDomain: ${AUTHORIZER_PRINCIPAL_DOMAIN:-"open-metadata.org"}
```
For Basic Auth, set the following:
- `adminPrincipals`: Admin usernames to bootstrap the server with, as comma-separated values.
- `allowedEmailRegistrationDomains`: The email domains that are allowed to register, for example gmail.com or outlook.com. This can include your {principalDomain} as well.
- `principalDomain`: The domain appended to each admin principal to form the login email. For example, `open-metadata.org` gives `admin@open-metadata.org`.
{%note%}
Use the following format to bootstrap admins on server startup: `[admin1,admin2,admin3]`
On SMTP-enabled servers, the login passwords for these users are generated randomly and sent to `adminName`@`principalDomain`.
If SMTP is not enabled for OpenMetadata, use the same format to create admin users: `[admin1, admin2, admin3]`. The default password for all admin users will be admin.
After logging into the OpenMetadata UI, admin users can change their default password by navigating to `Settings > Members > Admins`.
{%/note%}
## Metadata Ingestion
To ingest metadata when Basic Auth is enabled, you must configure the `ingestion-bot` account with the JWT
configuration. To learn how to enable it, follow the [Enable JWT Tokens](/deployment/security/enable-jwt-tokens) documentation.
### Setting up SMTP Server
Basic Authentication is now set up. For a better login experience, you can also configure the SMTP server so that
users can reset their passwords, receive account status updates, and so on.
```yaml
email:
  emailingEntity: ${OM_EMAIL_ENTITY:-"OpenMetadata"} # Company name (optional)
  supportUrl: ${OM_SUPPORT_URL:-"https://slack.open-metadata.org"} # Support URL (optional)
  enableSmtpServer: ${AUTHORIZER_ENABLE_SMTP:-false} # true/false
  openMetadataUrl: ${OPENMETADATA_SERVER_URL:-""} # {http|https}://{your_domain}
  senderMail: ${OPENMETADATA_SMTP_SENDER_MAIL:-""} # Sender's email
  serverEndpoint: ${SMTP_SERVER_ENDPOINT:-""} # For example, smtp.gmail.com
  serverPort: ${SMTP_SERVER_PORT:-""} # SSL/TLS port
  username: ${SMTP_SERVER_USERNAME:-""} # SMTP server username
  password: ${SMTP_SERVER_PWD:-""} # SMTP server password
  transportationStrategy: ${SMTP_SERVER_STRATEGY:-"SMTP_TLS"}
```
The following are the valid values for `transportationStrategy`:
- `SMTP`: Use this if the SMTP port is 25.
- `SMTPS`: Use this if the SMTP port is 465.
- `SMTP_TLS`: Use this if the SMTP port is 587.
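
The port-to-strategy mapping above can be written as a small lookup, which may be handy when generating configuration from a script. This is only a sketch: `transportation_strategy` is a hypothetical helper, and it covers just the three ports listed here.

```python
# Mapping taken from the list above: SMTP port -> transportationStrategy value.
STRATEGY_BY_PORT = {25: "SMTP", 465: "SMTPS", 587: "SMTP_TLS"}


def transportation_strategy(port):
    """Return the transportationStrategy value for a known SMTP port."""
    try:
        return STRATEGY_BY_PORT[port]
    except KeyError:
        raise ValueError(f"no documented strategy for port {port}") from None


print(transportation_strategy(587))  # SMTP_TLS
```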
{% partial file="/v1.5/deployment/configure-ingestion.md" /%}
