feat(docs) Add documentation on authorization & authentication (#5265)
@ -62,7 +62,7 @@ WHZ-Authentication {
|
||||
|
||||
### Authentication in React
|
||||
The React app supports both JAAS as described above and separately OIDC authentication. To learn about configuring OIDC for React,
|
||||
see the [OIDC in React](../docs/how/auth/sso/configure-oidc-react.md) document.
|
||||
see the [OIDC in React](../docs/authentication/guides/sso/configure-oidc-react.md) document.
|
||||
|
||||
|
||||
### API Debugging
|
||||
|
||||
@ -61,6 +61,54 @@ module.exports = {
|
||||
"releases",
|
||||
],
|
||||
"Getting Started": ["docs/quickstart", "docs/debugging"],
|
||||
Authentication: [
|
||||
{
|
||||
type: "doc",
|
||||
id: "docs/authentication/README",
|
||||
label: "Overview",
|
||||
},
|
||||
{
|
||||
type: "doc",
|
||||
id: "docs/authentication/concepts",
|
||||
label: "Concepts",
|
||||
},
|
||||
{
|
||||
"Frontend Authentication": [
|
||||
"docs/authentication/guides/jaas",
|
||||
{
|
||||
"OIDC Authentication": [
|
||||
"docs/authentication/guides/sso/configure-oidc-react",
|
||||
"docs/authentication/guides/sso/configure-oidc-react-google",
|
||||
"docs/authentication/guides/sso/configure-oidc-react-okta",
|
||||
"docs/authentication/guides/sso/configure-oidc-react-azure",
|
||||
],
|
||||
},
|
||||
"docs/authentication/guides/add-users",
|
||||
],
|
||||
},
|
||||
{
|
||||
type: "doc",
|
||||
id: "docs/authentication/introducing-metadata-service-authentication",
|
||||
label: "Metadata Service Authentication",
|
||||
},
|
||||
{
|
||||
type: "doc",
|
||||
id: "docs/authentication/personal-access-tokens",
|
||||
label: "Personal Access Tokens",
|
||||
},
|
||||
],
|
||||
Authorization: [
|
||||
{
|
||||
type: "doc",
|
||||
id: "docs/authorization/README",
|
||||
label: "Overview",
|
||||
},
|
||||
{
|
||||
type: "doc",
|
||||
id: "docs/authorization/policies",
|
||||
label: "Access Policies",
|
||||
},
|
||||
],
|
||||
Ingestion: [
|
||||
// add a custom label since the default is 'Metadata Ingestion'
|
||||
// note that we also have to add the path to this file in sidebarsjs_hardcoded_titles in generateDocsDir.ts
|
||||
@ -274,13 +322,11 @@ module.exports = {
|
||||
},
|
||||
],
|
||||
"Usage Guides": [
|
||||
"docs/policies",
|
||||
"docs/domains",
|
||||
"docs/ui-ingestion",
|
||||
"docs/tags",
|
||||
"docs/schema-history",
|
||||
"docs/how/search",
|
||||
"docs/how/auth/add-users",
|
||||
"docs/how/ui-tabs-guide",
|
||||
"docs/how/business-glossary-guide",
|
||||
],
|
||||
@ -297,15 +343,6 @@ module.exports = {
|
||||
//"docs/how/build-metadata-service",
|
||||
//"docs/how/graph-onboarding",
|
||||
//"docs/demo/graph-onboarding",
|
||||
{
|
||||
Authentication: [
|
||||
"docs/how/auth/jaas",
|
||||
"docs/how/auth/sso/configure-oidc-react",
|
||||
"docs/how/auth/sso/configure-oidc-react-google",
|
||||
"docs/how/auth/sso/configure-oidc-react-okta",
|
||||
"docs/how/auth/sso/configure-oidc-react-azure",
|
||||
],
|
||||
},
|
||||
"docs/what/mxe",
|
||||
"docs/how/restore-indices",
|
||||
"docs/dev-guides/timeline",
|
||||
|
||||
@ -24,7 +24,7 @@ Today, DataHub's GraphQL endpoint is available for use in multiple places. The o
|
||||
|
||||
1. **Metadata Service**: The DataHub Metadata Service (backend) is the source-of-truth for the GraphQL endpoint. The endpoint is located at `/api/graphql` path of the DNS address
|
||||
where your instance of the `datahub-gms` container is deployed. For example, in local deployments it is typically located at `http://localhost:8080/api/graphql`. By default,
|
||||
the Metadata Service has no explicit authentication checks. However, it does have *Authorization checks*. DataHub [Access Policies](../../../docs/policies.md) will be enforced by the GraphQL API. This means you'll need to provide an actor identity when querying the GraphQL API.
|
||||
the Metadata Service has no explicit authentication checks. However, it does have *Authorization checks*. DataHub [Access Policies](../../authorization/policies.md) will be enforced by the GraphQL API. This means you'll need to provide an actor identity when querying the GraphQL API.
|
||||
To do so, include the `X-DataHub-Actor` header with an Authorized Corp User URN as the value in your request. Because anyone is able to set the value of this header, we recommend using this endpoint only in trusted environments, either by administrators themselves or programs that they own directly.
|
||||
|
||||
2. **Frontend Proxy**: The DataHub Frontend Proxy Service (frontend) is a basic web server & reverse proxy to the Metadata Service. As such, the
|
||||
|
||||
@ -122,7 +122,7 @@ DataHub provides the following GraphQL mutations for updating entities in your M
|
||||
|
||||
### Authorization
|
||||
|
||||
Mutations which change Entity metadata are subject to [DataHub Access Policies](../../../docs/policies.md). This means that DataHub's server
|
||||
Mutations which change Entity metadata are subject to [DataHub Access Policies](../../authorization/policies.md). This means that DataHub's server
|
||||
will check whether the requesting actor is authorized to perform the action. If you're querying the GraphQL endpoint via the DataHub
|
||||
Proxy Server, which is discussed more in [Getting Started](./getting-started.md), then the Session Cookie provided will carry the actor information.
|
||||
If you're querying the Metadata Service API directly, then you'll have to provide this via a special `X-DataHub-Actor` HTTP header, which should
|
||||
|
||||
41
docs/authentication/README.md
Normal file
@ -0,0 +1,41 @@
|
||||
# Overview
|
||||
|
||||
Authentication is the process of verifying the identity of a user or service. In DataHub this can be split into 2 main components:
|
||||
- How to log in to DataHub.
|
||||
- How to perform an action within DataHub on **behalf** of a user/service.
|
||||
|
||||
:::note
|
||||
|
||||
Authentication in DataHub does not necessarily mean that the user/service being authenticated will be part of the metadata graph within DataHub itself, as other concepts like Datasets or Dashboards are.
|
||||
In other words, a user called `john.smith` logging into DataHub does not mean that `john.smith` appears as a CorpUser Entity within DataHub.
|
||||
|
||||
For a quick video on that subject, have a look at our video on [DataHub Basics — Users, Groups, & Authentication 101
|
||||
](https://youtu.be/8Osw6p9vDYY)
|
||||
|
||||
:::
|
||||
|
||||
### Authentication in the Frontend
|
||||
|
||||
Authentication in DataHub happens at 2 possible moments, if enabled.
|
||||
|
||||
The first happens in the **DataHub Frontend** component when you access the UI.
|
||||
You will be prompted with a login screen, upon which you must supply a username/password combo or OIDC login to access DataHub's UI.
|
||||
This is the typical scenario for a human interacting with DataHub.
|
||||
|
||||
DataHub provides 2 methods of authentication:
|
||||
- [JaaS Authentication](guides/jaas.md) for simple deployments where authenticated users are part of some known list or invited as a [Native DataHub User](guides/add-users.md).
|
||||
- [OIDC Authentication](guides/sso/configure-oidc-react.md) to delegate authentication responsibility to third party systems like Okta or Google/Azure Authentication. This is the recommended approach for production systems.
|
||||
|
||||
Upon validation of a user's credentials through one of these authentication systems, DataHub will generate a session token with which all subsequent requests will be made.
|
||||
|
||||
### Authentication in the Backend
|
||||
|
||||
The second way in which authentication occurs is within DataHub's Backend (Metadata Service) when a user makes a request either through the UI or through APIs.
|
||||
In this case DataHub makes use of Personal Access Tokens or session HTTP headers to apply actions on behalf of some user.
|
||||
To learn more about DataHub's backend authentication have a look at our docs on [Introducing Metadata Service Authentication](introducing-metadata-service-authentication.md).
|
||||
|
||||
Note that while authentication can happen in both the frontend and backend components of DataHub, these are separate but related processes.
|
||||
The former authenticates users/services through a third-party system (OpenID Connect or Java-based authentication), while the latter ensures that DataHub accepts only identified requests, via access tokens or bearer cookies.
|
||||
|
||||
If you only want some users to interact with DataHub's UI, enable authentication in the Frontend and manage who is allowed either through JaaS or OIDC login methods.
|
||||
If you want users to be able to access DataHub's backend directly without going through the UI in an authenticated manner, then enable authentication in the backend and generate access tokens for them.
|
||||
123
docs/authentication/concepts.md
Normal file
@ -0,0 +1,123 @@
|
||||
# Concepts & Key Components
|
||||
|
||||
We introduced a few important concepts to the Metadata Service to make authentication work:
|
||||
|
||||
1. Actor
|
||||
2. Authenticator
|
||||
3. AuthenticatorChain
|
||||
4. AuthenticationFilter
|
||||
5. DataHub Access Token
|
||||
6. DataHub Token Service
|
||||
|
||||
In the following sections, we'll take a closer look at each individually.
|
||||
|
||||

|
||||
*High level overview of Metadata Service Authentication*
|
||||
|
||||
## What is an Actor?
|
||||
|
||||
An **Actor** is a concept within the new Authentication subsystem to represent a unique identity / principal that is initiating actions (e.g. read & write requests)
|
||||
on the platform.
|
||||
|
||||
An actor can be characterized by 2 attributes:
|
||||
|
||||
1. **Type**: The "type" of the actor making a request. The purpose is, for example, to distinguish between a "user" & "service" actor. Currently, the "user" actor type is the only one
|
||||
formally supported.
|
||||
2. **Id**: A unique identifier for the actor within DataHub. This is commonly known as a "principal" in other systems. In the case of users, this
|
||||
represents a unique "username". This username is in turn used when converting from the "Actor" concept into a Metadata Entity Urn (e.g. CorpUserUrn).
|
||||
|
||||
For example, the root "datahub" super user would have the following attributes:
|
||||
|
||||
```
|
||||
{
|
||||
"type": "USER",
|
||||
"id": "datahub"
|
||||
}
|
||||
```
|
||||
|
||||
Which is mapped to the CorpUser urn:
|
||||
|
||||
```
|
||||
urn:li:corpuser:datahub
|
||||
```
|
||||
|
||||
for Metadata retrieval.
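The mapping above can be sketched in a few lines of Python. This is only an illustration of the idea, not DataHub's implementation: the `Actor` class and the `actor_to_urn` helper are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Actor:
    """Hypothetical stand-in for the Actor concept: a type plus a unique id."""
    type: str
    id: str


def actor_to_urn(actor: Actor) -> str:
    """Map a USER actor to its CorpUser urn (hypothetical helper)."""
    if actor.type != "USER":
        # The text above notes that "user" is the only formally supported type.
        raise ValueError(f"unsupported actor type: {actor.type}")
    return f"urn:li:corpuser:{actor.id}"


root = Actor(type="USER", id="datahub")
print(actor_to_urn(root))  # urn:li:corpuser:datahub
```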
|
||||
|
||||
## What is an Authenticator?
|
||||
|
||||
An **Authenticator** is a pluggable component inside the Metadata Service that is responsible for authenticating an inbound request provided context about the request (currently, the request headers).
|
||||
Authentication boils down to successfully resolving an **Actor** to associate with the inbound request.
|
||||
|
||||
There can be many types of Authenticator. For example, there can be Authenticators that
|
||||
|
||||
- Verify the authenticity of access tokens (i.e. issued by either DataHub itself or a 3rd-party IdP)
|
||||
- Authenticate username / password credentials against a remote database (e.g. LDAP)
|
||||
|
||||
and more! A key goal of the abstraction is *extensibility*: a custom Authenticator can be developed to authenticate requests
|
||||
based on an organization's unique needs.
|
||||
|
||||
DataHub ships with 2 Authenticators by default:
|
||||
|
||||
- **DataHubSystemAuthenticator**: Verifies that inbound requests have originated from inside DataHub itself using a shared system identifier
|
||||
and secret. This authenticator is always present.
|
||||
|
||||
- **DataHubTokenAuthenticator**: Verifies that inbound requests contain a DataHub-issued Access Token (discussed further in the "DataHub Access Token" section below) in their
|
||||
'Authorization' header. This authenticator is required if Metadata Service Authentication is enabled.
|
||||
|
||||
## What is an AuthenticatorChain?
|
||||
|
||||
An **AuthenticatorChain** is a series of **Authenticators** that are configured to run one-after-another. This allows
|
||||
for configuring multiple ways to authenticate a given request, for example via LDAP OR via local key file.
|
||||
|
||||
Only if each Authenticator within the chain fails to authenticate a request will it be rejected.
|
||||
|
||||
The Authenticator Chain can be configured in the `application.yml` file under `authentication.authenticators`:
|
||||
|
||||
```
|
||||
authentication:
|
||||
....
|
||||
authenticators:
|
||||
# Configure the Authenticators in the chain
|
||||
- type: com.datahub.authentication.Authenticator1
|
||||
...
|
||||
- type: com.datahub.authentication.Authenticator2
|
||||
....
|
||||
```
|
||||
|
||||
## What is the AuthenticationFilter?
|
||||
|
||||
The **AuthenticationFilter** is a [servlet filter](http://tutorials.jenkov.com/java-servlets/servlet-filters.html) that authenticates each and every request to the Metadata Service.
|
||||
It does so by constructing and invoking an **AuthenticatorChain**, described above.
|
||||
|
||||
If an Actor is unable to be resolved by the AuthenticatorChain, then a 401 unauthorized exception will be returned by the filter.
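The behaviour of the chain and the filter described in the two sections above can be sketched as follows. This is a simplified model in Python, not the Metadata Service's Java code: the two toy authenticators, the `demo-token` value, and the function names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Mapping, Optional, Sequence


@dataclass(frozen=True)
class Actor:
    type: str
    id: str


# An authenticator inspects the request headers and returns an Actor, or None on failure.
Authenticator = Callable[[Mapping[str, str]], Optional[Actor]]


def always_fails(headers: Mapping[str, str]) -> Optional[Actor]:
    """Hypothetical authenticator that never recognises a request."""
    return None


def demo_token_authenticator(headers: Mapping[str, str]) -> Optional[Actor]:
    """Hypothetical authenticator that accepts one hard-coded bearer token."""
    if headers.get("Authorization") == "Bearer demo-token":
        return Actor(type="USER", id="datahub")
    return None


def run_chain(chain: Sequence[Authenticator], headers: Mapping[str, str]) -> Optional[Actor]:
    """Try each authenticator in order; the first resolved Actor wins."""
    for authenticate in chain:
        actor = authenticate(headers)
        if actor is not None:
            return actor
    return None  # every authenticator in the chain failed


def filter_status(chain: Sequence[Authenticator], headers: Mapping[str, str]) -> int:
    """Model of the filter: 401 when no Actor can be resolved, 200 otherwise."""
    return 200 if run_chain(chain, headers) is not None else 401


chain = [always_fails, demo_token_authenticator]
print(filter_status(chain, {"Authorization": "Bearer demo-token"}))  # 200
print(filter_status(chain, {}))  # 401
```

The request is rejected only when the loop finishes without any authenticator returning an Actor, which mirrors the "only if each Authenticator fails" rule.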
|
||||
|
||||
|
||||
## What is a DataHub Token Service? What are Access Tokens?
|
||||
|
||||
Along with Metadata Service Authentication comes an important new component called the **DataHub Token Service**. The purpose of this
|
||||
component is twofold:
|
||||
|
||||
1. Generate Access Tokens that grant access to the Metadata Service
|
||||
2. Verify the validity of Access Tokens presented to the Metadata Service
|
||||
|
||||
**Access Tokens** granted by the Token Service take the form of [JSON Web Tokens](https://jwt.io/introduction), a type of stateless token which
|
||||
has a finite lifespan & is verified using a unique signature. JWTs can also contain a set of claims embedded within them. Tokens issued by the Token
|
||||
Service contain the following claims:
|
||||
|
||||
- exp: the expiration time of the token
|
||||
- version: version of the DataHub Access Token for purposes of evolvability (currently 1)
|
||||
- type: The type of token, currently SESSION (used for UI-based sessions) or PERSONAL (used for personal access tokens)
|
||||
- actorType: The type of the **Actor** associated with the token. Currently, USER is the only type supported.
|
||||
- actorId: The id of the **Actor** associated with the token.
|
||||
|
||||
Today, Access Tokens are granted by the Token Service under two scenarios:
|
||||
|
||||
1. **UI Login**: When a user logs into the DataHub UI, for example via [JaaS](guides/jaas.md) or
|
||||
[OIDC](guides/sso/configure-oidc-react.md), the `datahub-frontend` service issues an
|
||||
request to the Metadata Service to generate a SESSION token *on behalf of* the user logging in. (Only the frontend service is authorized to perform this action.)
|
||||
2. **Generating Personal Access Tokens**: When a user requests to generate a Personal Access Token (described below) from the UI.
|
||||
|
||||
> At present, the Token Service supports the symmetric signing method `HS256` to generate and verify tokens.
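
To make the token format concrete, here is a minimal sketch of HS256 signing and verification using only the Python standard library. It is not the Token Service's code: the function names, the secret, and the exact encoding of each claim value are assumptions. Only the claim names and the `HS256` method come from the text above.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Decode base64url, restoring the padding that JWTs strip."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def issue_token(secret: bytes, actor_id: str, token_type: str, ttl_seconds: int) -> str:
    """Build and sign a JWT carrying the claims listed above (hypothetical helper)."""
    header = {"alg": "HS256", "typ": "JWT"}
    claims = {
        "exp": int(time.time()) + ttl_seconds,
        "version": "1",  # assumed encoding of "currently 1"
        "type": token_type,  # SESSION or PERSONAL
        "actorType": "USER",
        "actorId": actor_id,
    }
    signing_input = _b64url(json.dumps(header).encode()) + "." + _b64url(json.dumps(claims).encode())
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + _b64url(signature)


def verify_token(secret: bytes, token: str) -> dict:
    """Check the signature and expiry, then return the claims (hypothetical helper)."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("malformed token")
    signing_input = parts[0] + "." + parts[1]
    expected = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    if not hmac.compare_digest(expected, _b64url_decode(parts[2])):
        raise ValueError("signature mismatch")
    claims = json.loads(_b64url_decode(parts[1]))
    if claims["exp"] <= time.time():
        raise ValueError("token expired")
    return claims


token = issue_token(b"demo-secret", "datahub", "PERSONAL", ttl_seconds=3600)
print(verify_token(b"demo-secret", token)["actorId"])  # datahub
```

Because HS256 is symmetric, the same secret signs and verifies tokens, so anyone holding the signing key can mint valid tokens.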
|
||||
|
||||
Now that we're familiar with the concepts, we will talk concretely about what new capabilities have been built on top
|
||||
of Metadata Service Authentication.
|
||||
@ -182,7 +182,7 @@ Setting up SSO via OpenID Connect means that users will be able to login to Data
|
||||
and more.
|
||||
|
||||
This option is recommended for production deployments of DataHub. For detailed information about configuring DataHub to use OIDC to
|
||||
perform authentication, check out [OIDC Authentication](./sso/configure-oidc-react.md).
|
||||
perform authentication, check out [OIDC Authentication](sso/configure-oidc-react.md).
|
||||
|
||||
## URNs
|
||||
|
||||
@ -193,7 +193,7 @@ when a user logs into DataHub via OIDC is used to construct a unique identifier
|
||||
urn:li:corpuser:<extracted-username>
|
||||
```
|
||||
|
||||
For information about configuring which OIDC claim should be used as the username for Datahub, check out the [OIDC Authentication](./sso/configure-oidc-react.md) doc.
|
||||
For information about configuring which OIDC claim should be used as the username for Datahub, check out the [OIDC Authentication](sso/configure-oidc-react.md) doc.
|
||||
|
||||
|
||||
## FAQ
|
||||
@ -1,4 +1,4 @@
|
||||
# OIDC Authentication
|
||||
# Overview
|
||||
|
||||
The DataHub React application supports OIDC authentication built on top of the [Pac4j Play](https://github.com/pac4j/play-pac4j) library.
|
||||
This enables operators of DataHub to integrate with 3rd party identity providers like Okta, Google, Keycloak, & more to authenticate their users.
|
||||
@ -188,5 +188,4 @@ A brief summary of the steps that occur when the user navigates to the React app
|
||||
Even if OIDC is configured the root user can still login without OIDC by going
|
||||
to `/login` URL endpoint. It is recommended that you don't use the default
|
||||
credentials by mounting a different file in the front end container. To do this
|
||||
please see [jaas](https://datahubproject.io/docs/how/auth/jaas/#mount-a-custom-userprops-file-docker-compose) -
|
||||
"Mount a custom user.props file".
|
||||
please see how to mount a custom user.props file for a JAAS authenticated deployment.
|
||||
@ -19,7 +19,7 @@ when a user navigated to `http://localhost:9002/`:
|
||||
|
||||
b. If cookie was present + valid, redirect to the home page
|
||||
|
||||
c. If cookie was invalid, redirect to either a) the DataHub login screen (for [JAAS authentication](https://datahubproject.io/docs/how/auth/jaas/) or b) a [configured OIDC Identity Provider](https://datahubproject.io/docs/how/auth/sso/configure-oidc-react/) to perform authentication.
|
||||
c. If cookie was invalid, redirect to either a) the DataHub login screen (for [JAAS authentication](guides/jaas.md)) or b) a [configured OIDC Identity Provider](guides/sso/configure-oidc-react.md) to perform authentication.
|
||||
|
||||
Once authentication had succeeded at the frontend proxy layer, a stateless (token-based) session cookie (PLAY_SESSION) would be set in the user's browser.
|
||||
All subsequent requests, including the GraphQL requests issued by the React UI, would be authenticated using this session cookie. Once a request had made it beyond
|
||||
@ -42,177 +42,9 @@ To address these problems, we introduced configurable Authentication inside the
|
||||
meaning that requests are no longer considered trusted until they are authenticated by the Metadata Service.
|
||||
|
||||
Why push Authentication down? In addition to the problems described above, we wanted to plan for a future
|
||||
where Authentication of Kafka-based-writes could be performed in the same manner as Rest writes.
|
||||
where Authentication of Kafka-based-writes could be performed in the same manner as Rest writes.
|
||||
|
||||
Next, we'll cover the components being introduced to support Authentication inside the Metadata Service.
|
||||
|
||||
### Concepts & Key Components
|
||||
|
||||
We introduced a few important concepts to the Metadata Service to make authentication work:
|
||||
|
||||
1. Actor
|
||||
2. Authenticator
|
||||
3. AuthenticatorChain
|
||||
4. AuthenticationFilter
|
||||
5. DataHub Access Token
|
||||
6. DataHub Token Service
|
||||
|
||||
In the following sections, we'll take a closer look at each individually.
|
||||
|
||||

|
||||
*High level overview of Metadata Service Authentication*
|
||||
|
||||
#### What is an Actor?
|
||||
|
||||
An **Actor** is a concept within the new Authentication subsystem to represent a unique identity / principal that is initiating actions (e.g. read & write requests)
|
||||
on the platform.
|
||||
|
||||
An actor can be characterized by 2 attributes:
|
||||
|
||||
1. **Type**: The "type" of the actor making a request. The purpose is, for example, to distinguish between a "user" & "service" actor. Currently, the "user" actor type is the only one
|
||||
formally supported.
|
||||
2. **Id**: A unique identifier for the actor within DataHub. This is commonly known as a "principal" in other systems. In the case of users, this
|
||||
represents a unique "username". This username is in turn used when converting from the "Actor" concept into a Metadata Entity Urn (e.g. CorpUserUrn).
|
||||
|
||||
For example, the root "datahub" super user would have the following attributes:
|
||||
|
||||
```
|
||||
{
|
||||
"type": "USER",
|
||||
"id": "datahub"
|
||||
}
|
||||
```
|
||||
|
||||
Which is mapped to the CorpUser urn:
|
||||
|
||||
```
|
||||
urn:li:corpuser:datahub
|
||||
```
|
||||
|
||||
for Metadata retrieval.
|
||||
|
||||
#### What is an Authenticator?
|
||||
|
||||
An **Authenticator** is a pluggable component inside the Metadata Service that is responsible for authenticating an inbound request provided context about the request (currently, the request headers).
|
||||
Authentication boils down to successfully resolving an **Actor** to associate with the inbound request.
|
||||
|
||||
There can be many types of Authenticator. For example, there can be Authenticators that
|
||||
|
||||
- Verify the authenticity of access tokens (i.e. issued by either DataHub itself or a 3rd-party IdP)
|
||||
- Authenticate username / password credentials against a remote database (e.g. LDAP)
|
||||
|
||||
and more! A key goal of the abstraction is *extensibility*: a custom Authenticator can be developed to authenticate requests
|
||||
based on an organization's unique needs.
|
||||
|
||||
DataHub ships with 2 Authenticators by default:
|
||||
|
||||
- **DataHubSystemAuthenticator**: Verifies that inbound requests have originated from inside DataHub itself using a shared system identifier
|
||||
and secret. This authenticator is always present.
|
||||
|
||||
- **DataHubTokenAuthenticator**: Verifies that inbound requests contain a DataHub-issued Access Token (discussed further in the "DataHub Access Token" section below) in their
|
||||
'Authorization' header. This authenticator is required if Metadata Service Authentication is enabled.
|
||||
|
||||
#### What is an AuthenticatorChain?
|
||||
|
||||
An **AuthenticatorChain** is a series of **Authenticators** that are configured to run one-after-another. This allows
|
||||
for configuring multiple ways to authenticate a given request, for example via LDAP OR via local key file.
|
||||
|
||||
Only if each Authenticator within the chain fails to authenticate a request will it be rejected.
|
||||
|
||||
The Authenticator Chain can be configured in the `application.yml` file under `authentication.authenticators`:
|
||||
|
||||
```
|
||||
authentication:
|
||||
....
|
||||
authenticators:
|
||||
# Configure the Authenticators in the chain
|
||||
- type: com.datahub.authentication.Authenticator1
|
||||
...
|
||||
- type: com.datahub.authentication.Authenticator2
|
||||
....
|
||||
```
|
||||
|
||||
#### What is the AuthenticationFilter?
|
||||
|
||||
The **AuthenticationFilter** is a [servlet filter](http://tutorials.jenkov.com/java-servlets/servlet-filters.html) that authenticates each and every request to the Metadata Service.
|
||||
It does so by constructing and invoking an **AuthenticatorChain**, described above.
|
||||
|
||||
If an Actor is unable to be resolved by the AuthenticatorChain, then a 401 unauthorized exception will be returned by the filter.
|
||||
|
||||
|
||||
#### What is a DataHub Token Service? What are Access Tokens?
|
||||
|
||||
Along with Metadata Service Authentication comes an important new component called the **DataHub Token Service**. The purpose of this
|
||||
component is twofold:
|
||||
|
||||
1. Generate Access Tokens that grant access to the Metadata Service
|
||||
2. Verify the validity of Access Tokens presented to the Metadata Service
|
||||
|
||||
**Access Tokens** granted by the Token Service take the form of [JSON Web Tokens](https://jwt.io/introduction), a type of stateless token which
|
||||
has a finite lifespan & is verified using a unique signature. JWTs can also contain a set of claims embedded within them. Tokens issued by the Token
|
||||
Service contain the following claims:
|
||||
|
||||
- exp: the expiration time of the token
|
||||
- version: version of the DataHub Access Token for purposes of evolvability (currently 1)
|
||||
- type: The type of token, currently SESSION (used for UI-based sessions) or PERSONAL (used for personal access tokens)
|
||||
- actorType: The type of the **Actor** associated with the token. Currently, USER is the only type supported.
|
||||
- actorId: The id of the **Actor** associated with the token.
|
||||
|
||||
Today, Access Tokens are granted by the Token Service under two scenarios:
|
||||
|
||||
1. **UI Login**: When a user logs into the DataHub UI, for example via [JaaS](https://datahubproject.io/docs/how/auth/jaas/) or
|
||||
[OIDC](https://datahubproject.io/docs/how/auth/sso/configure-oidc-react/), the `datahub-frontend` service issues an
|
||||
request to the Metadata Service to generate a SESSION token *on behalf of* the user logging in. (Only the frontend service is authorized to perform this action.)
|
||||
2. **Generating Personal Access Tokens**: When a user requests to generate a Personal Access Token (described below) from the UI.
|
||||
|
||||
> At present, the Token Service supports the symmetric signing method `HS256` to generate and verify tokens.
|
||||
|
||||
Now that we're familiar with the concepts, we will talk concretely about what new capabilities have been built on top
|
||||
of Metadata Service Authentication.
|
||||
|
||||
### New Capabilities
|
||||
|
||||
#### Personal Access Tokens
|
||||
|
||||
With these changes, we introduced a way to generate a "Personal Access Token" suitable for programmatic use with both the DataHub GraphQL
|
||||
and DataHub Rest.li (Ingestion) APIs.
|
||||
|
||||
Personal Access Tokens have a finite lifespan (default 3 months) and currently cannot be revoked without changing the signing key that
|
||||
DataHub uses to generate these tokens (via the TokenService described above). Most importantly, they inherit the permissions
|
||||
granted to the user who generates them.
|
||||
|
||||
##### Generating Personal Access Tokens
|
||||
|
||||
To generate a personal access token, users must have been granted the "Generate Personal Access Tokens" (GENERATE_PERSONAL_ACCESS_TOKENS) Privilege via a [DataHub Policy](./policies.md). Once
|
||||
they have this permission, users can navigate to **'Settings'** > **'Access Tokens'** > **'Generate Personal Access Token'** to generate a token.
|
||||
|
||||

|
||||
|
||||
The token expiration dictates how long the token will be valid for. We recommend setting the shortest duration possible, as tokens are not currently
|
||||
revokable once granted (without changing the signing key).
|
||||
|
||||
|
||||
#### Using a Personal Access Token
|
||||
|
||||
The user will subsequently be able to make authenticated requests to DataHub frontend proxy or DataHub GMS directly by providing
|
||||
the generated Access Token as a Bearer token in the `Authorization` header:
|
||||
|
||||
```
|
||||
Authorization: Bearer <generated-access-token>
|
||||
```
|
||||
|
||||
For example, using a curl to the frontend proxy (preferred in production):
|
||||
|
||||
`curl 'http://localhost:9002/api/gms/entities/urn:li:corpuser:datahub' -H 'Authorization: Bearer <access-token>'`
|
||||
|
||||
or to Metadata Service directly:
|
||||
|
||||
`curl 'http://localhost:8080/entities/urn:li:corpuser:datahub' -H 'Authorization: Bearer <access-token>'`
|
||||
|
||||
Without an access token, making programmatic requests will result in a 401 result from the server if Metadata Service Authentication
|
||||
is enabled.
|
||||
|
||||
### Configuring Metadata Service Authentication
|
||||
## Configuring Metadata Service Authentication
|
||||
|
||||
Metadata Service Authentication is currently **opt-in**. This means that you may continue to use DataHub without Metadata Service Authentication without interruption.
|
||||
To enable Metadata Service Authentication:
|
||||
@ -238,7 +70,7 @@ contains a valid Access Token for the Metadata Service. When browsing the UI, th
|
||||
to authenticate each request.
|
||||
|
||||
For users who want to access the Metadata Service programmatically, i.e. for running ingestion, the current recommendation is to generate
|
||||
a **Personal Access Token** (described above) from the root "datahub" user account, and using this token when configuring your [Ingestion Recipes](https://datahubproject.io/docs/metadata-ingestion/#recipes).
|
||||
a **Personal Access Token** (described above) from the root "datahub" user account, and to use this token when configuring your [Ingestion Recipes](../../metadata-ingestion/README.md#recipes).
|
||||
To configure the token for use in ingestion, simply populate the "token" configuration for the `datahub-rest` sink:
|
||||
|
||||
```
|
||||
@ -339,7 +171,7 @@ to the **DataHub Frontend Proxy**, as routing to Metadata Service endpoints is c
|
||||
This recommendation is an effort to minimize the exposed surface area of DataHub, making securing, operating, maintaining, and developing
|
||||
the platform simpler.
|
||||
|
||||
In practice, this will require migrating Metadata [Ingestion Recipes](https://datahubproject.io/docs/metadata-ingestion/#recipes) that use the `datahub-rest` sink to point at a slightly different
|
||||
In practice, this will require migrating Metadata [Ingestion Recipes](../../metadata-ingestion/README.md#recipes) that use the `datahub-rest` sink to point at a slightly different
|
||||
host + path.
|
||||
|
||||
Example recipe that proxies through DataHub Frontend
|
||||
47
docs/authentication/personal-access-tokens.md
Normal file
@ -0,0 +1,47 @@
|
||||
# Personal Access Tokens
|
||||
|
||||
With these changes, we introduced a way to generate a "Personal Access Token" suitable for programmatic use with both the DataHub GraphQL
|
||||
and DataHub Rest.li (Ingestion) APIs.
|
||||
|
||||
Personal Access Tokens have a finite lifespan (default 3 months) and currently cannot be revoked without changing the signing key that
|
||||
DataHub uses to generate these tokens (via the TokenService described above). Most importantly, they inherit the permissions
|
||||
granted to the user who generates them.
|
||||
|
||||
## Generating Personal Access Tokens
|
||||
|
||||
To generate a personal access token, users must have been granted the "Generate Personal Access Tokens" (GENERATE_PERSONAL_ACCESS_TOKENS) or "Manage All Access Tokens" Privilege via a [DataHub Policy](../authorization/policies.md). Once
|
||||
they have this permission, users can navigate to **'Settings'** > **'Access Tokens'** > **'Generate Personal Access Token'** to generate a token.
|
||||
|
||||

|
||||
|
||||
The token expiration dictates how long the token will be valid for. We recommend setting the shortest duration possible, as tokens are not currently
|
||||
revocable once granted (without changing the signing key).

## Using Personal Access Tokens

The user will subsequently be able to make authenticated requests to DataHub frontend proxy or DataHub GMS directly by providing
the generated Access Token as a Bearer token in the `Authorization` header:

```
Authorization: Bearer <generated-access-token>
```

For example, using curl against the frontend proxy (preferred in production):

`curl 'http://localhost:9002/api/gms/entities/urn:li:corpuser:datahub' -H 'Authorization: Bearer <access-token>'`

or against the Metadata Service directly:

`curl 'http://localhost:8080/entities/urn:li:corpuser:datahub' -H 'Authorization: Bearer <access-token>'`
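
The same header can be attached from a script. The following is a minimal Python sketch using only the standard library. The helper name `build_entity_request` and the placeholder token are illustrative assumptions, not part of any DataHub client; the hosts and paths are the ones used in the curl examples.

```python
import urllib.request

FRONTEND_PROXY = "http://localhost:9002/api/gms"  # frontend proxy, as in the curl example
METADATA_SERVICE = "http://localhost:8080"        # Metadata Service (GMS) directly


def build_entity_request(base_url: str, urn: str, token: str) -> urllib.request.Request:
    """Build a GET request for an entity, carrying the token as a Bearer header.

    Hypothetical helper for illustration only.
    """
    url = f"{base_url}/entities/{urn}"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


request = build_entity_request(FRONTEND_PROXY, "urn:li:corpuser:datahub", "<access-token>")
print(request.full_url)
print(request.get_header("Authorization"))

# To send it against a running deployment:
# with urllib.request.urlopen(request) as response:
#     print(response.status, response.read().decode("utf-8"))
```

Passing `METADATA_SERVICE` instead of `FRONTEND_PROXY` targets the Metadata Service directly, mirroring the second curl command.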

Since authorization now happens at the GMS level, ingestion is also protected behind access tokens. To use a token, add a `token` property to the sink config, as seen below:

![](../imgs/ingestion-with-token.png)

:::note

Without an access token, making programmatic requests will result in a 401 result from the server if Metadata Service Authentication
is enabled.

:::
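
A client can surface this case explicitly. The sketch below, again standard-library Python, shows one way to turn a 401 into an actionable message; the function names `describe_http_error` and `fetch` and the message wording are assumptions made for illustration.

```python
import urllib.error
import urllib.request
from typing import Optional


def describe_http_error(error: urllib.error.HTTPError) -> str:
    """Turn an HTTP error into a hint; 401 is the missing-or-invalid-token case."""
    if error.code == 401:
        return "401 Unauthorized: supply a valid Personal Access Token as a Bearer token"
    return f"HTTP {error.code}: {error.reason}"


def fetch(url: str, token: Optional[str] = None) -> str:
    """GET a URL, optionally with a Bearer token; return the body or an error hint."""
    headers = {"Authorization": f"Bearer {token}"} if token else {}
    request = urllib.request.Request(url, headers=headers)
    try:
        with urllib.request.urlopen(request) as response:
            return response.read().decode("utf-8")
    except urllib.error.HTTPError as error:
        return describe_http_error(error)
```

Calling `fetch` without a token against a deployment that has Metadata Service Authentication enabled would be expected to return the 401 hint rather than raise.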
18
docs/authorization/README.md
Normal file
@ -0,0 +1,18 @@
# Overview

Authorization specifies _what_ access an _authenticated_ user has within a system.
This section is all about how DataHub authorizes a given user/service that wants to interact with the system.

:::note

Authorization only makes sense in the context of an **Authenticated** DataHub deployment. To use DataHub's authorization features,
please first make sure that the system has been configured from an authentication perspective as you intend.

:::

Once the identity of a user or service has been established, DataHub determines what access the authenticated request has.

This is done by checking what operation a given user/service wants to perform within DataHub and whether it is allowed to do so.
The set of operations that are allowed in DataHub is what we call **Policies**.

Policies specify fine-grained access control for _who_ can do _what_ to _which_ resources. For more details on the set of Policies that DataHub provides, please see the [Policies Guide](../authorization/policies.md).
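
To make the who / what / which framing concrete, here is a toy Python model of a policy check. It is a conceptual sketch only and does not reflect DataHub's actual policy schema or evaluation logic; the `Policy` class, the `is_allowed` function, and the resource name `example-dataset` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """Toy policy: which actors hold which privileges on which resources."""
    actors: frozenset      # who
    privileges: frozenset  # what
    resources: frozenset   # which; an empty set here means "all resources"


def is_allowed(policies, actor: str, privilege: str, resource: str) -> bool:
    """Allow the operation if at least one policy grants it."""
    return any(
        actor in p.actors
        and privilege in p.privileges
        and (not p.resources or resource in p.resources)
        for p in policies
    )


policies = [
    Policy(
        actors=frozenset({"urn:li:corpuser:datahub"}),
        privileges=frozenset({"Edit Tags"}),
        resources=frozenset({"example-dataset"}),  # hypothetical resource name
    )
]

print(is_allowed(policies, "urn:li:corpuser:datahub", "Edit Tags", "example-dataset"))
```

A request is denied by default in this sketch: it passes only when some policy matches the actor, the privilege, and the resource together.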
|
||||
@ -162,7 +162,7 @@ You need to request a certificate in the AWS Certificate Manager by following th
|
||||
the ARN of the new certificate. You also need to replace host-name with the hostname of choice like
|
||||
demo.datahubproject.io.
|
||||
|
||||
To have the metadata [authentication service](https://datahubproject.io/docs/introducing-metadata-service-authentication/#configuring-metadata-service-authentication) enable and use [API tokens](https://datahubproject.io/docs/introducing-metadata-service-authentication/#generating-personal-access-tokens) from the UI you will need to set the configuration in the values.yaml for the `gms` and the `frontend` deployments. This could be done by enabling the `metadata_service_authentication`:
|
||||
To enable the metadata [authentication service](../authentication/introducing-metadata-service-authentication.md#configuring-metadata-service-authentication) and use [API tokens](../authentication/personal-access-tokens.md#generating-personal-access-tokens) from the UI, you will need to set the configuration in the values.yaml for the `gms` and the `frontend` deployments. This can be done by enabling `metadata_service_authentication`:
|
||||
|
||||
```
|
||||
datahub:
|
||||
|
||||
@ -18,7 +18,7 @@ DataHub supports Tags, Glossary Terms, & Domains as distinct types of Metadata t
|
||||
## Creating a Domain
|
||||
|
||||
To create a Domain, first navigate to the **Domains** tab in the top-right menu of DataHub. Users must have the Platform Privilege
|
||||
called `Manage Domains` to view this tab, which can be granted by creating a new Platform [Policy](./policies.md).
|
||||
called `Manage Domains` to view this tab, which can be granted by creating a new Platform [Policy](authorization/policies.md).
|
||||
|
||||

|
||||
|
||||
@ -57,7 +57,7 @@ see a 'Domain' section. Click 'Set Domain', and then search for the Domain you'd
|
||||
To remove an asset from a Domain, click the 'x' icon on the Domain tag.
|
||||
|
||||
> Notice: Adding or removing an asset from a Domain requires the `Edit Domain` Metadata Privilege, which can be granted
|
||||
> by a [Policy](./policies.md).
|
||||
> by a [Policy](authorization/policies.md).
|
||||
|
||||
|
||||
## Searching by Domain
|
||||
|
||||
@ -4,7 +4,7 @@ This guide shares how you can add user metadata in DataHub. Usually you would wa
|
||||
|
||||
:::note
|
||||
|
||||
This does not allow you to add new users for Authentication. If you want to add a new user in DataHub for Login please refer to [JaaS Authentication](./auth/jaas.md)
|
||||
This does not allow you to add new users for Authentication. If you want to add a new user in DataHub for Login please refer to [JaaS Authentication](../authentication/guides/jaas.md)
|
||||
|
||||
:::
|
||||
|
||||
|
||||
@ -19,7 +19,7 @@ For Glossary Terms, you are also able to establish relationships between differe
|
||||
|
||||
## Getting to your Glossary
|
||||
|
||||
In order to view a Business Glossary, users must have the Platform Privilege called `Manage Glossaries` which can be granted by creating a new Platform [Policy](../policies.md).
|
||||
In order to view a Business Glossary, users must have the Platform Privilege called `Manage Glossaries` which can be granted by creating a new Platform [Policy](../authorization/policies.md).
|
||||
|
||||
Once granted this privilege, you can access your Glossary by clicking the dropdown at the top of the page called **Govern** and then click **Glossary**:
|
||||
|
||||
|
||||
|
@ -55,7 +55,7 @@ To deploy a new instance of DataHub, perform the following steps.
|
||||
|
||||
:::note
|
||||
|
||||
If you've enabled [Metadata Service Authentication](./introducing-metadata-service-authentication.md), you'll need to provide a Personal Access Token
|
||||
If you've enabled [Metadata Service Authentication](authentication/introducing-metadata-service-authentication.md), you'll need to provide a Personal Access Token
|
||||
using the `--token <token>` parameter in the command.
|
||||
|
||||
:::
|
||||
@ -70,13 +70,13 @@ To start pushing your company's metadata into DataHub, take a look at the [Metad
|
||||
|
||||
### Invite Users
|
||||
|
||||
To add users to your deployment to share with your team check out our [Adding Users to DataHub](./how/auth/add-users.md)
|
||||
To add users to your deployment to share with your team check out our [Adding Users to DataHub](authentication/guides/add-users.md)
|
||||
|
||||
### Enable Authentication
|
||||
|
||||
To enable SSO, check out [Configuring OIDC Authentication](./how/auth/sso/configure-oidc-react.md) or [Configuring JaaS Authentication](./how/auth/jaas.md).
|
||||
To enable SSO, check out [Configuring OIDC Authentication](authentication/guides/sso/configure-oidc-react.md) or [Configuring JaaS Authentication](authentication/guides/jaas.md).
|
||||
|
||||
To enable backend Authentication, check out [authentication in DataHub's backend](./introducing-metadata-service-authentication.md#Configuring Metadata Service Authentication).
|
||||
To enable backend Authentication, check out [authentication in DataHub's backend](authentication/introducing-metadata-service-authentication.md#configuring-metadata-service-authentication).
|
||||
|
||||
### Move to Production
|
||||
|
||||
|
||||
@ -17,7 +17,7 @@ DataHub supports Tags, Glossary Terms, & Domains as distinct types of Metadata t
|
||||
## Adding a Tag
|
||||
|
||||
Users must have the Metadata Privilege called `Edit Tags` to add tags at the entity level, and the Privilege called `Edit Dataset Column Tags` to edit tags at the column level. These Privileges
|
||||
can be granted by creating a new Metadata [Policy](./policies.md).
|
||||
can be granted by creating a new Metadata [Policy](authorization/policies.md).
|
||||
|
||||
To add a tag at the dataset or container level, simply navigate to the page for that entity and click on the "Add Tag" button.
|
||||
|
||||
|
||||
@ -12,7 +12,7 @@ This document will describe the steps required to configure, schedule, and execu
|
||||
### Prerequisites
|
||||
|
||||
To view & manage UI-based metadata ingestion, you must have the `Manage Metadata Ingestion` & `Manage Secrets`
|
||||
privileges assigned to your account. These can be granted by a [Platform Policy](./policies.md).
|
||||
privileges assigned to your account. These can be granted by a [Platform Policy](authorization/policies.md).
|
||||
|
||||

|
||||
|
||||
@ -112,7 +112,7 @@ _Referencing DataHub Secrets from a Recipe definition_
|
||||
When the Ingestion Source with this Recipe executes, DataHub will attempt to 'resolve' Secrets found within the YAML. If a secret can be resolved, the reference is substituted for its decrypted value prior to execution.
|
||||
Secret values are not persisted to disk beyond execution time, and are never transmitted outside DataHub.
|
||||
|
||||
> **Attention**: Any DataHub users who have been granted the `Manage Secrets` [Platform Privilege](./policies.md) will be able to retrieve plaintext secret values using the GraphQL API.
|
||||
> **Attention**: Any DataHub users who have been granted the `Manage Secrets` [Platform Privilege](authorization/policies.md) will be able to retrieve plaintext secret values using the GraphQL API.
|
||||
|
||||
|
||||
#### Step 3: Schedule Execution
|
||||
@ -191,7 +191,7 @@ A variety of things can cause an ingestion run to fail. Common reasons for failu
|
||||
failures, metadata ingestion will fail. Ensure that the network where DataHub is deployed has access to the data source which
|
||||
you are trying to reach.
|
||||
|
||||
4. **Authentication**: If you've enabled [Metadata Service Authentication](https://datahubproject.io/docs/introducing-metadata-service-authentication/), you'll need to provide a Personal Access Token
|
||||
4. **Authentication**: If you've enabled [Metadata Service Authentication](authentication/introducing-metadata-service-authentication.md), you'll need to provide a Personal Access Token
|
||||
in your Recipe Configuration. To do this, set the 'token' field of the sink configuration to contain a Personal Access Token:
|
||||

|
||||
|
||||
|
||||