parthp2107 b8531e7561
changes in documentation (#531)
Co-authored-by: parthp2107 <parth@getcollate.io>
2021-09-20 10:57:50 +05:30


description
This guide will help you install the Athena connector and run it manually.

Athena

{% hint style="info" %} Prerequisites

OpenMetadata is built using Java, DropWizard, Jetty, and MySQL.

  1. Python 3.7 or above {% endhint %}
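A quick way to confirm the Python prerequisite before installing is sketched below. The helper name `meets_minimum` is hypothetical and not part of OpenMetadata; it only compares the running interpreter's version against 3.7.

```python
import sys


def meets_minimum(version_info, minimum=(3, 7)):
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    # Reports whether the interpreter running this script is new enough.
    print("Python 3.7+ available:", meets_minimum(sys.version_info))
```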

Install from PyPI or Source

{% tabs %} {% tab title="Install Using PyPI" %}

pip install 'openmetadata-ingestion[athena]'

{% endtab %} {% endtabs %}

Configuration

{% code title="athena.json" %}

{
  "source": {
    "type": "athena",
    "config": {
      "host_port":"host_port",
      "username": "openmetadata_user",
      "password": "openmetadata_password",
      "service_name": "local_athena",
      "service_type": "Athena"
    }
  },
 ...

{% endcode %}

  1. username - the Athena username. We recommend creating a user with read-only permissions on all the databases in your Athena installation.
  2. password - the password for that username.
  3. service_name - the service name for this Athena cluster. If you added the Athena cluster through the OpenMetadata UI, make sure this value matches the service name used there.
  4. filter_pattern - accepts includes and excludes options that select which datasets, by name pattern, are ingested into OpenMetadata. It is not shown in the example above.
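Since the example config does not show `filter_pattern`, here is a minimal sketch of how it might sit inside the source config and how include/exclude patterns could be evaluated. The key names `includes` and `excludes` come from the description above; treating the values as lists of regular expressions, the sample patterns, and the helper `is_allowed` are assumptions for illustration, not the connector's actual implementation.

```python
import re

# Source section from athena.json with an assumed filter_pattern shape.
source = {
    "type": "athena",
    "config": {
        "host_port": "host_port",
        "username": "openmetadata_user",
        "password": "openmetadata_password",
        "service_name": "local_athena",
        "service_type": "Athena",
        # Assumption: each option holds a list of regex patterns.
        "filter_pattern": {
            "includes": ["sales.*"],
            "excludes": [".*_tmp"],
        },
    },
}


def is_allowed(name, pattern):
    """Hypothetical helper: excludes win, then includes (if any) must match."""
    includes = pattern.get("includes") or []
    excludes = pattern.get("excludes") or []
    if any(re.match(p, name) for p in excludes):
        return False
    # An empty includes list is treated as "allow everything not excluded".
    return not includes or any(re.match(p, name) for p in includes)


filter_pattern = source["config"]["filter_pattern"]
```

Under these assumptions, `is_allowed("sales.orders", filter_pattern)` accepts the dataset, while a name ending in `_tmp` or one outside `sales` is rejected.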

Publish to OpenMetadata

Below is the configuration to publish Athena data into the OpenMetadata service.

Optionally add the pii processor and the metadata-rest sink, along with the metadata-server config.

{% code title="athena.json" %}

{
  "source": {
    "type": "athena",
    "config": {
      "host_port":"host_port",
      "username": "openmetadata_user",
      "password": "openmetadata_password",
      "service_name": "local_athena",
      "service_type": "Athena"
    }
  },
  "processor": {
    "type": "pii",
    "config": {
      "api_endpoint": "http://localhost:8585/api"
    }
  },
  "sink": {
    "type": "metadata-rest",
    "config": {
    }
  },
  "metadata_server": {
    "type": "metadata-server",
    "config": {
      "api_endpoint": "http://localhost:8585/api",
      "auth_provider_type": "no-auth"
    }
  },
  "cron": {
    "minute": "*/5",
    "hour": null,
    "day": null,
    "month": null,
    "day_of_week": null
  }
}

{% endcode %}
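Before running the connector manually, it can help to check that `athena.json` parses as JSON and contains the sections used above. The sketch below is one way to do that. The list of expected sections is inferred from the example and is an assumption, not an official schema, and the helper names are hypothetical.

```python
import json

# Assumption: sections inferred from the example config, not from a schema.
# "processor" and "cron" are left out because the text treats them as optional
# or does not say they are required.
EXPECTED_SECTIONS = ("source", "sink", "metadata_server")


def missing_sections(config):
    """Return the expected top-level sections absent from the config dict."""
    return [name for name in EXPECTED_SECTIONS if name not in config]


def load_config(path):
    """Parse a JSON config file and fail early if a section is missing."""
    with open(path) as handle:
        config = json.load(handle)  # raises json.JSONDecodeError on invalid JSON
    missing = missing_sections(config)
    if missing:
        raise ValueError("missing sections: " + ", ".join(missing))
    return config
```

Calling `load_config("athena.json")` returns the parsed dictionary or raises with the names of the missing sections. A file that still contains a literal `...` placeholder fails at the parsing step.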