GitBook: [#61] Fix elastic search connector config

OpenMetadata 2021-12-03 21:17:53 +00:00 committed by Sriharsha Chintalapani
parent b7994a0639
commit acb9293642
4 changed files with 34 additions and 32 deletions


@@ -125,6 +125,7 @@
* [Coding Style](open-source-community/developer/coding-style.md)
* [Build the code & run tests](open-source-community/developer/build-code-run-tests.md)
* [Build a Connector](open-source-community/developer/build-a-connector/README.md)
* [Setup](open-source-community/developer/build-a-connector/setup.md)
* [Source](open-source-community/developer/build-a-connector/source.md)
* [Sink](open-source-community/developer/build-a-connector/sink.md)
* [Stage](open-source-community/developer/build-a-connector/stage.md)


@@ -16,9 +16,9 @@ A workflow consists of [Source](source.md) and [Sink](sink.md). It also provides

Workflow execution happens in a serial fashion.

1. The **Workflow** runs the **source** component first. The **source** retrieves a record from external sources and emits the record downstream.
2. If the **processor** component is configured, the **workflow** sends the record to the **processor** next.
3. There can be multiple **processor** components attached to the **workflow**. The **workflow** passes a record to each **processor** in the order they are configured.
4. Once a **processor** is finished, it sends the modified record to the **sink**.
5. The above steps are repeated for each record emitted from the **source**.
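The execution loop described above can be sketched roughly as follows. The class names and method signatures here are illustrative stand-ins, not the real Ingestion Framework interfaces:

```python
from typing import Iterator, List


class Source:
    """Hypothetical source: emits records downstream, one at a time."""

    def next_record(self) -> Iterator[dict]:
        yield {"name": "users", "entity": "table"}
        yield {"name": "orders", "entity": "table"}


class Processor:
    """Hypothetical processor: may modify each record it receives."""

    def process(self, record: dict) -> dict:
        record["processed"] = True
        return record


class Sink:
    """Hypothetical sink: receives the final record."""

    def __init__(self) -> None:
        self.written: List[dict] = []

    def write(self, record: dict) -> None:
        self.written.append(record)


def run_workflow(source: Source, processors: List[Processor], sink: Sink) -> None:
    # 1. The source emits records one by one.
    for record in source.next_record():
        # 2-3. Each configured processor sees the record, in order.
        for processor in processors:
            record = processor.process(record)
        # 4. The (possibly modified) record reaches the sink.
        sink.write(record)
    # 5. The loop repeats until the source is exhausted.


sink = Sink()
run_workflow(Source(), [Processor()], sink)
print(len(sink.written))  # 2 records flow through the pipeline
```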


@@ -1,36 +1,36 @@
---
description: Let's review the Python tooling to start working on the Ingestion Framework.
---

# Setup

### Generated Sources

The backbone of OpenMetadata is the series of JSON schemas defining the Entities and their properties.

All different parts of the code rely on those definitions. The first step to start developing new connectors is to properly set up your local environment to interact with the Entities.

In the Ingestion Framework, this process is handled with `datamodel-code-generator`, which is able to read JSON schemas and automatically prepare `pydantic` models representing the input definitions. Please make sure to run `make install_dev generate` from the project root to fill the `ingestion/src/metadata/generated` directory with the required models.

### Quality tools

When working on the Ingestion Framework, you might want to consider the following style-check tooling:

* [pylint](https://www.pylint.org) is a Static Code Analysis tool to catch errors, align coding standards, and help us follow conventions and apply improvements.
* [black](https://black.readthedocs.io/en/stable/) can be used both to autoformat the code and to validate that the codebase is compliant.
* [isort](https://pycqa.github.io/isort/) helps us not lose time trying to find the proper combination of importing from `stdlib`, requirements, project files…

The main goal is to ensure standardised formatting throughout the codebase.

When developing, you can run these tools with `make` recipes: `make lint`, `make black` and `make isort`. Note that the sources generated from the JSON Schemas are excluded from these checks.

If you want to take this one step further and make sure that you are not committing any malformed changes, you can use [pre-commit hooks](https://pre-commit.com). This is a powerful tool that allows us to run specific validations at commit time. If those validations fail, the commit won't proceed. The interesting point is that the tools are going to fix your code for you, so you can freely try to commit again!

You can install our hooks via `make precommit_install`.

#### Tooling Status

We are currently using:

* `pylint` & `black` in the CI validations, so make sure to review your PRs for any warnings you generated.
* `black` & `isort` in the pre-commit hooks.
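To illustrate what the generated sources look like, here is a hand-written sketch of the kind of class that `datamodel-code-generator` produces from an Entity's JSON schema. The schema and field names below are simplified examples, not the real OpenMetadata definitions, and plain stdlib dataclasses are used so the sketch stays dependency-free; the real generated code under `ingestion/src/metadata/generated` uses `pydantic.BaseModel` instead:

```python
from dataclasses import dataclass, field
from typing import List, Optional

# A simplified JSON schema such as:
#   {"title": "Table", "properties": {"name": {"type": "string"}, ...}}
# gets turned into a typed Python model along these lines, so the rest
# of the codebase can work with validated objects instead of raw dicts.


@dataclass
class Column:
    name: str
    dataType: str


@dataclass
class Table:
    id: str
    name: str
    description: Optional[str] = None
    columns: List[Column] = field(default_factory=list)


table = Table(
    id="1234",
    name="dim_users",
    columns=[Column(name="user_id", dataType="BIGINT")],
)
print(table.name)  # dim_users
```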


@@ -53,21 +53,22 @@ Add Optionally `file` stage and `elasticsearch` bulk\_sink along with `metadata-rest` sink
```javascript
{
  "source": {
    "type": "metadata",
    "config": {
      "include_tables": "true",
      "include_topics": "true",
      "include_dashboards": "true",
      "limit_records": 10
    }
  },
  "sink": {
    "type": "elasticsearch",
    "config": {
      "index_tables": "true",
      "index_topics": "true",
      "index_dashboards": "true",
      "es_host": "localhost",
      "es_port": 9200
    }
  },
  "metadata_server": {