From acb9293642c92237fe269804a7b60ff4b2d73f43 Mon Sep 17 00:00:00 2001
From: OpenMetadata
Date: Fri, 3 Dec 2021 21:17:53 +0000
Subject: [PATCH] GitBook: [#61] Fix elastic search connector config

---
 docs/SUMMARY.md                               |  1 +
 .../developer/build-a-connector/README.md     |  4 +-
 .../developer/build-a-connector/setup.md      | 40 +++++++++----------
 .../openmetadata/connectors/elastic-search.md | 21 +++++-----
 4 files changed, 34 insertions(+), 32 deletions(-)

diff --git a/docs/SUMMARY.md b/docs/SUMMARY.md
index aa19124495c..7e7622c1b0e 100644
--- a/docs/SUMMARY.md
+++ b/docs/SUMMARY.md
@@ -125,6 +125,7 @@
 * [Coding Style](open-source-community/developer/coding-style.md)
 * [Build the code & run tests](open-source-community/developer/build-code-run-tests.md)
 * [Build a Connector](open-source-community/developer/build-a-connector/README.md)
+  * [Setup](open-source-community/developer/build-a-connector/setup.md)
   * [Source](open-source-community/developer/build-a-connector/source.md)
   * [Sink](open-source-community/developer/build-a-connector/sink.md)
   * [Stage](open-source-community/developer/build-a-connector/stage.md)
diff --git a/docs/open-source-community/developer/build-a-connector/README.md b/docs/open-source-community/developer/build-a-connector/README.md
index be94f72812e..4f9af09ed3a 100644
--- a/docs/open-source-community/developer/build-a-connector/README.md
+++ b/docs/open-source-community/developer/build-a-connector/README.md
@@ -16,9 +16,9 @@ A workflow consists of [Source](source.md) and [Sink](sink.md). It also provides
 
 Workflow execution happens in a serial fashion.
 
-1. The **Workflow** runs the **source** component first. The **source** retrieves a record from external sources and emits the record downstream. 
+1. The **Workflow** runs the **source** component first. The **source** retrieves a record from external sources and emits the record downstream.
 2. If the **processor** component is configured, the **workflow** sends the record to the **processor** next.
-3. 
There can be multiple **processor** components attached to the **workflow**. The **workflow** passes a record to each **processor** in the order they are configured.
+3. There can be multiple **processor** components attached to the **workflow**. The **workflow** passes a record to each **processor** in the order they are configured.
 4. Once a **processor** is finished, it sends the modified record to the **sink**.
 5. The above steps are repeated for each record emitted from the **source**.
 
diff --git a/docs/open-source-community/developer/build-a-connector/setup.md b/docs/open-source-community/developer/build-a-connector/setup.md
index 99c01019fe1..9f214965428 100644
--- a/docs/open-source-community/developer/build-a-connector/setup.md
+++ b/docs/open-source-community/developer/build-a-connector/setup.md
@@ -1,36 +1,36 @@
-# Local Setup
+---
+description: Let's review the Python tooling to start working on the Ingestion Framework.
+---
 
-## Generated Sources
+# Setup
 
-The backbone of OpenMetadata are the series of JSON schemas defining the Entities and their properties.
+### Generated Sources
 
-All different parts of the code rely on those definitions. The first step to start developing new connectors
-is to properly set up your local environment to interact with the Entities.
+The backbone of OpenMetadata is the series of JSON schemas defining the Entities and their properties.
 
-In the Ingestion Framework, this process is handled with `datamodel-code-generator`, which is able
-to read JSON schemas and automatically prepare `pydantic` models representing the input definitions. Please, make sure to
-run `make install_dev generate` from the project root to fill the `ingestion/src/metadata/generated` directory with the required models.
+All different parts of the code rely on those definitions. The first step to start developing new connectors is to properly set up your local environment to interact with the Entities. 
 
-## Quality tools
+In the Ingestion Framework, this process is handled with `datamodel-code-generator`, which is able to read JSON schemas and automatically prepare `pydantic` models representing the input definitions. Please, make sure to run `make install_dev generate` from the project root to fill the `ingestion/src/metadata/generated` directory with the required models.
+
+### Quality tools
 
 When working on the Ingestion Framework, you might want to take into consideration the following style-check tooling:
-- [pylint](www.pylint.org) is a Static Code Analysis tool to catch errors, align coding standards and help us follow conventions and apply improvements.
-- [black](https://black.readthedocs.io/en/stable/) can be used to both autoformat the code and validate that the codebase is compliant.
-- [isort](https://pycqa.github.io/isort/) helps us not lose time trying to find the proper combination of importing from `stdlib`, requirements, project files…
+
+* [pylint](https://www.pylint.org) is a Static Code Analysis tool to catch errors, align coding standards and help us follow conventions and apply improvements.
+* [black](https://black.readthedocs.io/en/stable/) can be used to both autoformat the code and validate that the codebase is compliant.
+* [isort](https://pycqa.github.io/isort/) helps us not lose time trying to find the proper combination of importing from `stdlib`, requirements, project files…
 
 The main goal is to ensure standardised formatting throughout the codebase.
 
-When developing, you can run this tools with `make` recipes: `make lint`, `make black` and `make isort`. Note that we are excluding the generated sources
-from the JSON Schema standards.
+When developing, you can run these tools with `make` recipes: `make lint`, `make black` and `make isort`. Note that we are excluding the generated sources from the JSON Schema standards. 
 
-If you want to take this one step further and make sure that you are not commiting any malformed changes, you can use [pre-commit hooks](https://pre-commit.com/).
-This is a powerful tool that allows us to run specific validations at commit-time. If those validations fail, the commit won't proceed. The interesting point
-is that the tools are going to fix your code for you, so you can freely try to commit again!
+If you want to take this one step further and make sure that you are not committing any malformed changes, you can use [pre-commit hooks](https://pre-commit.com). This is a powerful tool that allows us to run specific validations at commit time. If those validations fail, the commit won't proceed. The interesting point is that the tools are going to fix your code for you, so you can freely try to commit again!
 
 You can install our hooks via `make precommit_install`.
 
-### Tooling Status
+#### Tooling Status
 
 We are currently using:
-- `pylint` & `black` in the CI validations, so make sure to review your PRs for any warnings you generated.
-- `black` & `isort` in the pre-commit hooks.
+
+* `pylint` & `black` in the CI validations, so make sure to review your PRs for any warnings you generated.
+* `black` & `isort` in the pre-commit hooks. 
diff --git a/docs/openmetadata/connectors/elastic-search.md b/docs/openmetadata/connectors/elastic-search.md
index cba0d86d47a..dc0a25859df 100644
--- a/docs/openmetadata/connectors/elastic-search.md
+++ b/docs/openmetadata/connectors/elastic-search.md
@@ -53,21 +53,22 @@ Add Optionally `file` stage and `elasticsearch` bulk\_sink along with `metadata-
 
 ```javascript
 {
   "source": {
-    "type": "metadata_es",
-    "config": {}
-  },
-  "stage": {
-    "type": "file",
+    "type": "metadata",
     "config": {
-      "filename": "/tmp/tables.txt"
+      "include_tables": "true",
+      "include_topics": "true",
+      "include_dashboards": "true",
+      "limit_records": 10
     }
   },
-  "bulk_sink": {
+  "sink": {
     "type": "elasticsearch",
     "config": {
-      "filename": "/tmp/tables.txt",
-      "es_host_port": "localhost",
-      "index_name": "table_search_index"
+      "index_tables": "true",
+      "index_topics": "true",
+      "index_dashboards": "true",
+      "es_host": "localhost",
+      "es_port": 9200
     }
   },
   "metadata_server": {
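The README.md hunk in this patch walks through the workflow's serial execution: the source emits records, each configured processor transforms them in order, and the sink receives the result. That loop can be sketched in plain Python. Note that the class and method names below are illustrative only, not the Ingestion Framework's real API:

```python
# Minimal sketch of the serial workflow described in the README.md hunk.
# All names here are hypothetical, not the actual framework classes.

class Source:
    """Step 1: retrieve records from an external source and emit them."""

    def next_record(self):
        for i in range(3):
            yield {"name": f"table_{i}"}


class Processor:
    """Steps 2-3: each configured processor modifies the record in turn."""

    def process(self, record):
        record["processed"] = True
        return record


class Sink:
    """Step 4: receive the modified record."""

    def __init__(self):
        self.received = []

    def write_record(self, record):
        self.received.append(record)


class Workflow:
    """Step 5: repeat the pipeline for every record the source emits."""

    def __init__(self, source, processors, sink):
        self.source = source
        self.processors = processors
        self.sink = sink

    def execute(self):
        for record in self.source.next_record():
            for processor in self.processors:  # in configured order
                record = processor.process(record)
            self.sink.write_record(record)


sink = Sink()
Workflow(Source(), [Processor()], sink).execute()
print(len(sink.received))  # → 3
```

A single `execute()` call drains the source once; the step comments map onto the numbered list in the README.md hunk.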
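The elastic-search.md hunk replaces the old `metadata_es` source, `file` stage, and `elasticsearch` bulk\_sink with a `metadata` source feeding an `elasticsearch` sink directly. A quick stdlib check that the corrected shape parses as valid JSON; since the `metadata_server` block is truncated in the patch, an empty placeholder object stands in for it here:

```python
import json

# The corrected workflow config from the patch. The "metadata_server"
# section is truncated in the diff, so an empty placeholder is used.
config_text = """
{
  "source": {
    "type": "metadata",
    "config": {
      "include_tables": "true",
      "include_topics": "true",
      "include_dashboards": "true",
      "limit_records": 10
    }
  },
  "sink": {
    "type": "elasticsearch",
    "config": {
      "index_tables": "true",
      "index_topics": "true",
      "index_dashboards": "true",
      "es_host": "localhost",
      "es_port": 9200
    }
  },
  "metadata_server": {}
}
"""

config = json.loads(config_text)
# The old "stage"/"bulk_sink" pair is gone; source and sink suffice.
assert "stage" not in config and "bulk_sink" not in config
print(config["sink"]["config"]["es_port"])  # → 9200
```

Note how the host and port, previously fused into a single `es_host_port` value, are now separate `es_host` and `es_port` fields in the sink config.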