mirror of
https://github.com/datahub-project/datahub.git
synced 2025-06-27 05:03:31 +00:00
319 lines
16 KiB
Markdown
319 lines
16 KiB
Markdown
# Plugins Guide
|
|
|
|
Plugins are way to enhance the basic DataHub functionality in a custom manner.
|
|
|
|
Currently, DataHub formally supports 2 types of plugins:
|
|
|
|
- [Authentication](#authentication)
|
|
- [Authorization](#authorization)
|
|
|
|
## Authentication
|
|
|
|
> **Note:** This is in <b>BETA</b> version
|
|
|
|
> It is recommend that you do not do this unless you really know what you are doing
|
|
|
|
Custom authentication plugin makes it possible to authenticate DataHub users against any Identity Management System.
|
|
Choose your Identity Management System and write custom authentication plugin as per detail mentioned in this section.
|
|
|
|
> Currently, custom authenticators cannot be used to authenticate users of DataHub's web UI. This is because the DataHub web app expects the presence of 2 special cookies PLAY_SESSION and actor which are explicitly set by the server when a login action is performed.
|
|
> Instead, custom authenticators are useful for authenticating API requests to DataHub's backend (GMS), and can stand in addition to the default Authentication performed by DataHub, which is based on DataHub-minted access tokens.
|
|
|
|
The sample authenticator implementation can be found at [Authenticator Sample](../metadata-service/plugin/src/test/sample-test-plugins)
|
|
|
|
### Implementing an Authentication Plugin
|
|
|
|
1. Add _datahub-auth-api_ as compileOnly dependency: Maven coordinates of _datahub-auth-api_ can be found at [Maven](https://mvnrepository.com/artifact/io.acryl/datahub-auth-api)
|
|
|
|
Example of gradle dependency is given below.
|
|
|
|
```groovy
|
|
dependencies {
|
|
|
|
def auth_api = 'io.acryl:datahub-auth-api:0.9.3-3rc3'
|
|
compileOnly "${auth_api}"
|
|
testImplementation "${auth_api}"
|
|
|
|
}
|
|
```
|
|
|
|
2. Implement the Authenticator interface: Refer [Authenticator Sample](../metadata-service/plugin/src/test/sample-test-plugins)
|
|
|
|
<details>
|
|
<summary>Sample class which implements the Authenticator interface</summary>
|
|
|
|
```java
|
|
public class GoogleAuthenticator implements Authenticator {
|
|
|
|
@Override
|
|
public void init(@Nonnull Map<String, Object> authenticatorConfig, @Nullable AuthenticatorContext context) {
|
|
// Plugin initialization code will go here
|
|
// DataHub will call this method on boot time
|
|
}
|
|
|
|
@Nullable
|
|
@Override
|
|
public Authentication authenticate(@Nonnull AuthenticationRequest authenticationRequest)
|
|
throws AuthenticationException {
|
|
// DataHub will call this method whenever authentication decisions are need to be taken
|
|
// Authenticate the request and return Authentication
|
|
}
|
|
}
|
|
```
|
|
|
|
</details>
|
|
|
|
3. Use `getResourceAsStream` to read files: If your plugin read any configuration file like properties or YAML or JSON or xml then use `this.getClass().getClassLoader().getResourceAsStream("<file-name>")` to read that file from DataHub GMS plugin's class-path. For DataHub GMS resource look-up behavior please refer [Plugin Installation](#plugin-installation) section. Sample code of `getResourceAsStream` is available in sample Authenticator plugin [TestAuthenticator.java](../metadata-service/plugin/src/test/sample-test-plugins/src/main/java/com/datahub/plugins/test/TestAuthenticator.java).
|
|
|
|
4. Bundle your Jar: Use `com.gradleup.shadow` gradle plugin to create an uber jar.
|
|
|
|
To see an example of building an uber jar, check out the `build.gradle` file for the apache-ranger-plugin file of [Apache Ranger Plugin](https://github.com/acryldata/datahub-ranger-auth-plugin/tree/main/apache-ranger-plugin) for reference.
|
|
|
|
Exclude signature files as shown in below `shadowJar` task.
|
|
|
|
```groovy
|
|
apply plugin: 'com.gradleup.shadow';
|
|
shadowJar {
|
|
// Exclude com.datahub.plugins package and files related to jar signature
|
|
exclude "META-INF/*.RSA", "META-INF/*.SF","META-INF/*.DSA"
|
|
}
|
|
```
|
|
|
|
5. Refer section [Plugin Installation](#plugin-installation) for plugin installation in DataHub environment
|
|
|
|
## Enable GMS Authentication
|
|
|
|
By default, authentication is disabled in DataHub GMS.
|
|
|
|
Follow below steps to enable GMS authentication
|
|
|
|
1. Download docker-compose.quickstart.yml: Download docker compose file [docker-compose.quickstart.yml](../docker/quickstart/docker-compose.quickstart.yml)
|
|
|
|
2. Set environment variable: Set `METADATA_SERVICE_AUTH_ENABLED` environment variable to `true`
|
|
|
|
3. Redeploy DataHub GMS: Below is quickstart command to redeploy DataHub GMS
|
|
|
|
```shell
|
|
datahub docker quickstart -f docker-compose.quickstart.yml
|
|
```
|
|
|
|
## Authorization
|
|
|
|
> **Note:** This is in <b>BETA</b> version
|
|
|
|
> It is recommend that you do not do this unless you really know what you are doing
|
|
|
|
Custom authorization plugin makes it possible to authorize DataHub users against any Access Management System.
|
|
Choose your Access Management System and write custom authorization plugin as per detail mentioned in this section.
|
|
|
|
The sample authorizer implementation can be found at [Authorizer Sample](https://github.com/acryldata/datahub-ranger-auth-plugin/tree/main/apache-ranger-plugin)
|
|
|
|
### Implementing an Authorization Plugin
|
|
|
|
1. Add _datahub-auth-api_ as compileOnly dependency: Maven coordinates of _datahub-auth-api_ can be found at [Maven](https://mvnrepository.com/artifact/io.acryl/datahub-auth-api)
|
|
|
|
Example of gradle dependency is given below.
|
|
|
|
```groovy
|
|
dependencies {
|
|
|
|
def auth_api = 'io.acryl:datahub-auth-api:0.9.3-3rc3'
|
|
compileOnly "${auth_api}"
|
|
testImplementation "${auth_api}"
|
|
|
|
}
|
|
```
|
|
|
|
2. Implement the Authorizer interface: [Authorizer Sample](https://github.com/acryldata/datahub-ranger-auth-plugin/tree/main/apache-ranger-plugin)
|
|
|
|
<details>
|
|
<summary>Sample class which implements the Authorization interface </summary>
|
|
|
|
```java
|
|
public class ApacheRangerAuthorizer implements Authorizer {
|
|
@Override
|
|
public void init(@Nonnull Map<String, Object> authorizerConfig, @Nonnull AuthorizerContext ctx) {
|
|
// Plugin initialization code will go here
|
|
// DataHub will call this method on boot time
|
|
}
|
|
|
|
@Override
|
|
public AuthorizationResult authorize(@Nonnull AuthorizationRequest request) {
|
|
// DataHub will call this method whenever authorization decisions are need be taken
|
|
// Authorize the request and return AuthorizationResult
|
|
}
|
|
|
|
@Override
|
|
public AuthorizedActors authorizedActors(String privilege, Optional<ResourceSpec> resourceSpec) {
|
|
// Need to add doc
|
|
}
|
|
}
|
|
```
|
|
|
|
</details>
|
|
|
|
3. Use `getResourceAsStream` to read files: If your plugin read any configuration file like properties or YAML or JSON or xml then use `this.getClass().getClassLoader().getResourceAsStream("<file-name>")` to read that file from DataHub GMS plugin's class-path. For DataHub GMS resource look-up behavior please refer [Plugin Installation](#plugin-installation) section. Sample code of `getResourceAsStream` is available in sample Authenticator plugin [TestAuthenticator.java](../metadata-service/plugin/src/test/sample-test-plugins/src/main/java/com/datahub/plugins/test/TestAuthenticator.java).
|
|
|
|
4. Bundle your Jar: Use `com.gradleup.shadow` gradle plugin to create an uber jar.
|
|
|
|
To see an example of building an uber jar, check out the `build.gradle` file for the apache-ranger-plugin file of [Apache Ranger Plugin](https://github.com/acryldata/datahub-ranger-auth-plugin/tree/main/apache-ranger-plugin) for reference.
|
|
|
|
Exclude signature files as shown in below `shadowJar` task.
|
|
|
|
```groovy
|
|
apply plugin: 'com.gradleup.shadow';
|
|
shadowJar {
|
|
// Exclude com.datahub.plugins package and files related to jar signature
|
|
exclude "META-INF/*.RSA", "META-INF/*.SF","META-INF/*.DSA"
|
|
}
|
|
```
|
|
|
|
5. Install the Plugin: Refer to the section (Plugin Installation)[#plugin_installation] for plugin installation in DataHub environment
|
|
|
|
## Plugin Installation
|
|
|
|
DataHub's GMS Service searches for the plugins in container's local directory at location `/etc/datahub/plugins/auth/`. This location will be referred as `plugin-base-directory` hereafter.
|
|
|
|
For docker, we set docker-compose to mount `${HOME}/.datahub` directory to `/etc/datahub` directory within the GMS containers.
|
|
|
|
### Docker
|
|
|
|
Follow below steps to install plugins:
|
|
|
|
Lets consider you have created an uber jar for authorizer plugin and jar name is apache-ranger-authorizer.jar and class com.abc.RangerAuthorizer has implemented the [Authorizer](../metadata-auth/auth-api/src/main/java/com/datahub/plugins/auth/authorization/Authorizer.java) interface.
|
|
|
|
1. Create a plugin configuration file: Create a `config.yml` file at `${HOME}/.datahub/plugins/auth/`. For more detail on configuration refer [Config Detail](#config-detail) section
|
|
|
|
2. Create a plugin directory: Create plugin directory as `apache-ranger-authorizer`, this directory will be referred as `plugin-home` hereafter
|
|
|
|
```shell
|
|
mkdir -p ${HOME}/.datahub/plugins/auth/apache-ranger-authorizer
|
|
```
|
|
|
|
3. Copy plugin jar to `plugin-home`: Copy `apache-ranger-authorizer.jar` to `plugin-home`
|
|
|
|
```shell
|
|
copy apache-ranger-authorizer.jar ${HOME}/.datahub/plugins/auth/apache-ranger-authorizer
|
|
```
|
|
|
|
4. Update plugin configuration file: Add below entry in `config.yml` file, the plugin can take any arbitrary configuration under the "configs" block. in our example, there is username and password
|
|
|
|
```yaml
|
|
plugins:
|
|
- name: "apache-ranger-authorizer"
|
|
type: "authorizer"
|
|
enabled: "true"
|
|
params:
|
|
className: "com.abc.RangerAuthorizer"
|
|
configs:
|
|
username: "foo"
|
|
password: "fake"
|
|
```
|
|
|
|
5. Restart datahub-gms container:
|
|
|
|
On startup DataHub GMS service performs below steps
|
|
|
|
1. Load `config.yml`
|
|
2. Prepare list of plugin where `enabled` is set to `true`
|
|
3. Look for directory equivalent to plugin `name` in `plugin-base-directory`. In this case it is `/etc/datahub/plugins/auth/apache-ranger-authorizer/`, this directory will become `plugin-home`
|
|
4. Look for `params.jarFileName` attribute otherwise look for jar having name as <plugin-name>.jar. In this case it is `/etc/datahub/plugins/auth/apache-ranger-authorizer/apache-ranger-authorizer.jar`
|
|
5. Load class given in plugin `params.className` attribute from the jar, here load class `com.abc.RangerAuthorizer` from `apache-ranger-authorizer.jar`
|
|
6. Call `init` method of plugin
|
|
|
|
<br/>On method call of `getResourceAsStream` DataHub GMS service looks for the resource in below order.
|
|
|
|
1. Look for the requested resource in plugin-jar file. if found then return the resource as InputStream.
|
|
2. Look for the requested resource in `plugin-home` directory. if found then return the resource as InputStream.
|
|
3. Look for the requested resource in application class-loader. if found then return the resource as InputStream.
|
|
4. Return `null` as requested resource is not found.
|
|
|
|
By default, authentication is disabled in DataHub GMS, Please follow section [Enable GMS Authentication](#enable-gms-authentication) to enable authentication.
|
|
|
|
### Kubernetes
|
|
|
|
Helm support is coming soon.
|
|
|
|
## Config Detail
|
|
|
|
A sample `config.yml` can be found at [config.yml](../metadata-service/plugin/src/test/resources/valid-base-plugin-dir1/config.yml).
|
|
|
|
`config.yml` structure:
|
|
|
|
| Field | Required | Type | Default | Description |
|
|
| ---------------------------- | -------- | ------------------------------- | ------------------------------- | ---------------------------------------------------------------------------------------- |
|
|
| plugins[].name | ✅ | string | | name of the plugin |
|
|
| plugins[].type | ✅ | enum[authenticator, authorizer] | | type of plugin, possible values are authenticator or authorizer |
|
|
| plugins[].enabled | ✅ | boolean | | whether this plugin is enabled or disabled. DataHub GMS wouldn't process disabled plugin |
|
|
| plugins[].params.className | ✅ | string | | Authenticator or Authorizer implementation class' fully qualified class name |
|
|
| plugins[].params.jarFileName | | string | default to `plugins[].name`.jar | jar file name in `plugin-home` |
|
|
| plugins[].params.configs | | map<string,object> | default to empty map | Runtime configuration required for plugin |
|
|
|
|
> plugins[] is an array of plugin, where you can define multiple authenticator and authorizer plugins. plugin name should be unique in plugins array.
|
|
|
|
## Plugin Permissions
|
|
|
|
Adhere to below plugin access control to keep your plugin forward compatible.
|
|
|
|
- Plugin should read/write file to and from `plugin-home` directory only. Refer [Plugin Installation](#plugin-installation) step2 for `plugin-home` definition
|
|
- Plugin should access port 80 or 443 or port higher than 1024
|
|
|
|
All other access are forbidden for the plugin.
|
|
|
|
> Disclaimer: In BETA version your plugin can access any port and can read/write to any location on file system, however you should implement the plugin as per above access permission to keep your plugin compatible with upcoming release of DataHub.
|
|
|
|
## Migration Of Plugins From application.yaml
|
|
|
|
If you have any custom Authentication or Authorization plugin define in `authorization` or `authentication` section of [application.yaml](../metadata-service/configuration/src/main/resources/application.yaml) then migrate them as per below steps.
|
|
|
|
1. Implement Plugin: For Authentication Plugin follow steps of [Implementing an Authentication Plugin](#implementing-an-authentication-plugin) and for Authorization Plugin follow steps of [Implementing an Authorization Plugin](#implementing-an-authorization-plugin)
|
|
2. Install Plugin: Install the plugins as per steps mentioned in [Plugin Installation](#plugin-installation). Here you need to map the configuration from [application.yaml](../metadata-service/configuration/src/main/resources/application.yaml) to configuration in `config.yml`. This mapping from `application.yaml` to `config.yml` is described below
|
|
|
|
**Mapping for Authenticators**
|
|
|
|
a. In `config.yml` set `plugins[].type` to `authenticator`
|
|
|
|
b. `authentication.authenticators[].type` is mapped to `plugins[].params.className`
|
|
|
|
c. `authentication.authenticators[].configs` is mapped to `plugins[].params.configs`
|
|
|
|
Example Authenticator Plugin configuration in `config.yml`
|
|
|
|
```yaml
|
|
plugins:
|
|
- name: "apache-ranger-authenticator"
|
|
type: "authenticator"
|
|
enabled: "true"
|
|
params:
|
|
className: "com.abc.RangerAuthenticator"
|
|
configs:
|
|
username: "foo"
|
|
password: "fake"
|
|
|
|
```
|
|
|
|
**Mapping for Authorizer**
|
|
|
|
a. In `config.yml` set `plugins[].type` to `authorizer`
|
|
|
|
b. `authorization.authorizers[].type` is mapped to `plugins[].params.className`
|
|
|
|
c. `authorization.authorizers[].configs` is mapped to `plugins[].params.configs`
|
|
|
|
Example Authorizer Plugin configuration in `config.yml`
|
|
|
|
```yaml
|
|
plugins:
|
|
- name: "apache-ranger-authorizer"
|
|
type: "authorizer"
|
|
enabled: "true"
|
|
params:
|
|
className: "com.abc.RangerAuthorizer"
|
|
configs:
|
|
username: "foo"
|
|
password: "fake"
|
|
|
|
```
|
|
|
|
3. Move any other configurations files of your plugin to `plugin_home` directory. The detail about `plugin_home` is mentioned in [Plugin Installation](#plugin-installation) section.
|