mirror of
https://github.com/open-metadata/OpenMetadata.git
synced 2025-12-05 03:54:23 +00:00
GitBook: [main] 3 pages and one asset modified
This commit is contained in:
parent
d818829416
commit
31a6ae4420
BIN
docs/.gitbook/assets/fork-github (1).png
Normal file
Binary file not shown.
Size after change: 277 KiB
@ -56,6 +56,7 @@
* [Metadata Ingestion](install/metadata-ingestion/README.md)
* [Ingest Sample Data](install/metadata-ingestion/ingest-sample-data.md)
* [Connectors](install/metadata-ingestion/connectors/README.md)
* [Hive](install/metadata-ingestion/connectors/hive.md)
* [Athena](install/metadata-ingestion/connectors/athena.md)
* [BigQuery](install/metadata-ingestion/connectors/bigquery.md)
* [ElasticSearch](install/metadata-ingestion/connectors/elastic-search.md)
105
docs/install/metadata-ingestion/connectors/hive.md
Normal file
@ -0,0 +1,105 @@
---
description: This guide will help you install the Hive connector and run it manually
---

# Hive

{% hint style="info" %}
**Prerequisites**

OpenMetadata is built using Java, DropWizard, Jetty, and MySQL.

1. Python 3.7 or above
2. Library: **libsasl2-dev**
{% endhint %}
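
As a quick way to confirm the Python prerequisite before installing, the following minimal sketch compares the running interpreter against the 3.7 minimum listed above. The helper name `python_ok` is hypothetical and not part of OpenMetadata; the check does not cover the `libsasl2-dev` system library.

```python
import sys
from typing import Tuple

# Minimum version taken from the prerequisites listed above.
MIN_PYTHON: Tuple[int, int] = (3, 7)


def python_ok(version: Tuple[int, ...] = tuple(sys.version_info[:2])) -> bool:
    """Return True if the given (major, minor, ...) version meets the minimum."""
    # Only major and minor are compared; patch level is ignored.
    return tuple(version[:2]) >= MIN_PYTHON


if __name__ == "__main__":
    print("Python version OK" if python_ok() else "Python 3.7 or above is required")
```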

### Install from PyPI or Source

{% tabs %}
{% tab title="Install Using PyPI" %}
```bash
# Install the SASL development library (libsasl2-dev)
sudo apt-get install libsasl2-dev
pip install 'openmetadata-ingestion[hive]'
python -m spacy download en_core_web_sm
```
{% endtab %}

{% tab title="Build from source" %}
```bash
# Check out OpenMetadata
git clone https://github.com/open-metadata/OpenMetadata.git
cd OpenMetadata/ingestion
# Install the SASL development library (libsasl2-dev)
sudo apt-get install libsasl2-dev
# Create and activate a virtual environment, then install the Hive extras
python3 -m venv env
source env/bin/activate
pip install '.[hive]'
```
{% endtab %}
{% endtabs %}

### Configuration

{% code title="hive.json" %}
```javascript
{
  "source": {
    "type": "hive",
    "config": {
      "service_name": "local_hive",
      "service_type": "Hive",
      "host_port": "localhost:10000"
    }
  },
...
```
{% endcode %}

1. **service\_name** - Service name for this Hive cluster. If you added the Hive cluster through the OpenMetadata UI, make sure this service name matches the one used there.
2. **filter\_pattern** - Contains `includes` and `excludes` options that select which datasets, by name pattern, are ingested into OpenMetadata.
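
To illustrate how include/exclude filtering of this kind generally behaves, here is a minimal Python sketch. It is not the connector's implementation: the functions `is_allowed` and `filter_datasets`, the use of regular expressions, and the rule that an exclude match wins over an include match are all assumptions made for illustration.

```python
import re
from typing import List, Optional, Sequence


def is_allowed(name: str,
               includes: Optional[Sequence[str]] = None,
               excludes: Optional[Sequence[str]] = None) -> bool:
    """Return True if `name` passes the (hypothetical) include/exclude rules."""
    # Assumption: any matching exclude pattern rejects the name outright.
    if excludes and any(re.match(p, name) for p in excludes):
        return False
    # Assumption: when includes are given, at least one must match.
    if includes:
        return any(re.match(p, name) for p in includes)
    # No includes configured: everything not excluded is kept.
    return True


def filter_datasets(names: Sequence[str],
                    includes: Optional[Sequence[str]] = None,
                    excludes: Optional[Sequence[str]] = None) -> List[str]:
    """Keep only the dataset names that pass the rules, preserving order."""
    return [n for n in names if is_allowed(n, includes, excludes)]


if __name__ == "__main__":
    tables = ["sales.orders", "sales.tmp_orders", "hr.people"]
    print(filter_datasets(tables,
                          includes=[r"sales\..*"],
                          excludes=[r".*\.tmp_.*"]))
```

Under these assumed rules, only names that match an include pattern and no exclude pattern are kept.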

## Publish to OpenMetadata

Below is the configuration to publish Hive data into the OpenMetadata service.

Optionally add the `pii` processor and the `metadata-rest-tables` sink, along with the `metadata-server` config.

{% code title="hive.json" %}
```javascript
{
  "source": {
    "type": "hive",
    "config": {
      "service_name": "local_hive",
      "service_type": "Hive",
      "host_port": "localhost:10000"
    }
  },
  "processor": {
    "type": "pii",
    "config": {}
  },
  "sink": {
    "type": "metadata-rest-tables",
    "config": {}
  },
  "metadata_server": {
    "type": "metadata-server",
    "config": {
      "api_endpoint": "http://localhost:8585/api",
      "auth_provider_type": "no-auth"
    }
  },
  "cron": {
    "minute": "*/5",
    "hour": null,
    "day": null,
    "month": null,
    "day_of_week": null
  }
}
```
{% endcode %}
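
Before running a workflow, it can help to sanity-check the JSON file. The sketch below uses only the Python standard library. The helpers `missing_sections` and `load_config`, and the choice of which top-level sections to require, are assumptions drawn from the example above, not a schema published by OpenMetadata.

```python
import json
from typing import Any, Dict, List

# Assumption: these top-level sections, taken from the example config above,
# are treated as required. "processor" and "cron" are treated as optional.
REQUIRED_SECTIONS = ("source", "sink", "metadata_server")


def missing_sections(config: Dict[str, Any]) -> List[str]:
    """Return the required top-level sections that are absent from `config`."""
    return [key for key in REQUIRED_SECTIONS if key not in config]


def load_config(text: str) -> Dict[str, Any]:
    """Parse a workflow config from JSON text and fail early if sections are missing."""
    config = json.loads(text)
    missing = missing_sections(config)
    if missing:
        raise ValueError("missing sections: " + ", ".join(missing))
    return config


if __name__ == "__main__":
    # Assumes a hive.json file like the one above exists in the working directory.
    with open("hive.json") as handle:
        cfg = load_config(handle.read())
    print("source type:", cfg["source"]["type"])
```

Note that the partial `hive.json` shown under Configuration contains a literal `...` and will not parse as JSON; only the complete file is valid input here.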
@ -17,15 +17,19 @@ OpenMetadata Github repository can be accessed here [https://github.com/open-met
![](../.gitbook/assets/fork-github.png)

Create a local clone of your fork

```bash
git clone https://github.com/<username>/OpenMetadata.git
```

Add a remote named `upstream` that points to the OpenMetadata repository so you can pull changes from the open source codebase into your clone

```bash
cd OpenMetadata/
git remote add upstream https://github.com/open-metadata/OpenMetadata.git
git remote -v
```

## Create a branch in your fork

```bash