Mirror of https://github.com/open-metadata/OpenMetadata.git, synced 2026-01-06 04:26:57 +00:00

GitBook: [#61] Fix elastic search connector config

This commit is contained in:
parent 36e935340f
commit 0ce8d6d99c
@ -117,6 +117,7 @@
* [Coding Style](open-source-community/developer/coding-style.md)
* [Build the code & run tests](open-source-community/developer/build-code-run-tests.md)
* [Build a Connector](open-source-community/developer/build-a-connector/README.md)
  * [Setup](open-source-community/developer/build-a-connector/setup.md)
  * [Source](open-source-community/developer/build-a-connector/source.md)
  * [Processor](open-source-community/developer/build-a-connector/processor.md)
  * [Sink](open-source-community/developer/build-a-connector/sink.md)
@ -16,9 +16,9 @@ A workflow consists of [Source](source.md), [Processor](processor.md) and [Sink]
Workflow execution happens in a serial fashion.

1. The **Workflow** runs the **source** component first. The **source** retrieves a record from external sources and emits the record downstream.
2. If the **processor** component is configured, the **workflow** sends the record to the **processor** next.
3. There can be multiple **processor** components attached to the **workflow**. The **workflow** passes a record to each **processor** in the order they are configured.
4. Once a **processor** is finished, it sends the modified record to the **sink**.
5. The above steps are repeated for each record emitted from the **source**.
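The serial flow described above can be sketched as a simple loop. This is an illustrative stand-in, not the actual OpenMetadata ingestion interfaces; the class and method names here are hypothetical:

```python
# Minimal sketch of serial workflow execution: each record flows
# source -> processors (in configured order) -> sink before the next
# record is pulled. All names below are illustrative stand-ins.

class Source:
    """Retrieves records from an external system and emits them downstream."""
    def records(self):
        yield {"name": "dim_users"}
        yield {"name": "fact_orders"}

class Tagger:
    """An example processor that modifies each record."""
    def process(self, record):
        record["tagged"] = True
        return record

class Sink:
    """Receives the fully processed records."""
    def __init__(self):
        self.written = []
    def write(self, record):
        self.written.append(record)

class Workflow:
    def __init__(self, source, processors, sink):
        self.source = source
        self.processors = processors  # applied in the order configured
        self.sink = sink

    def execute(self):
        # Serial fashion: one record at a time, start to finish.
        for record in self.source.records():
            for processor in self.processors:
                record = processor.process(record)
            self.sink.write(record)

sink = Sink()
Workflow(Source(), [Tagger()], sink).execute()
```

Note how a record only reaches the sink after every configured processor has run on it, which is exactly the ordering the steps above describe.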
@ -0,0 +1,36 @@
---
description: Let's review the Python tooling to start working on the Ingestion Framework.
---

# Setup

### Generated Sources
The backbone of OpenMetadata is the series of JSON schemas defining the Entities and their properties.

All the different parts of the code rely on those definitions. The first step toward developing new connectors is to properly set up your local environment to interact with the Entities.

In the Ingestion Framework, this process is handled with `datamodel-code-generator`, which reads JSON schemas and automatically prepares `pydantic` models representing the input definitions. Make sure to run `make install_dev generate` from the project root to fill the `ingestion/src/metadata/generated` directory with the required models.
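As a rough illustration of what that generation step produces, a JSON schema describing an entity comes out of `datamodel-code-generator` as a `pydantic` model along these lines. The `Table` class and its fields here are simplified, hypothetical stand-ins, not the real generated classes:

```python
# Simplified sketch of the kind of pydantic model that
# datamodel-code-generator emits from an entity JSON schema.
# "Table" and its fields are hypothetical, not actual generated code.
from typing import Optional

from pydantic import BaseModel

class Table(BaseModel):
    id: str
    name: str
    description: Optional[str] = None

# The rest of the codebase can then work with validated, typed Entities.
table = Table(id="0001", name="dim_users")
```

The point is that every Entity the code touches is a typed, validated model derived directly from the JSON schema definitions.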
### Quality tools

When working on the Ingestion Framework, consider the following style-check tooling:

* [pylint](https://www.pylint.org) is a static code analysis tool that catches errors, aligns coding standards, and helps us follow conventions and apply improvements.
* [black](https://black.readthedocs.io/en/stable/) can be used both to autoformat the code and to validate that the codebase is compliant.
* [isort](https://pycqa.github.io/isort/) saves us from hunting for the proper ordering of imports from the `stdlib`, requirements, and project files.
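For instance, isort leaves a module header grouped by origin. The commented lines below stand in for groups this self-contained snippet cannot actually import, and the project-local path shown is hypothetical:

```python
# Illustrative of isort's grouping: standard library first, then
# third-party requirements, then project files.
import json
from pathlib import Path

# import pydantic                                  # third-party requirements
# from metadata.generated.schema import ...        # project files (hypothetical)

payload = json.dumps({"tool": "isort"})
```

Keeping these groups consistent means import diffs stay small and reviews stay focused on real changes.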
The main goal is to ensure standardised formatting throughout the codebase.

When developing, you can run these tools with `make` recipes: `make lint`, `make black` and `make isort`. Note that the sources generated from the JSON Schemas are excluded from these checks.
If you want to take this one step further and make sure that you are not committing any malformed changes, you can use [pre-commit hooks](https://pre-commit.com). This is a powerful tool that runs specific validations at commit time; if those validations fail, the commit won't proceed. The interesting part is that the tools fix your code for you, so you can simply try to commit again!

You can install our hooks via `make precommit_install`.
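As a sketch of how such hooks are typically wired up, a `.pre-commit-config.yaml` running `black` and `isort` might look like the following. The repository URLs and `rev` pins below are illustrative, not the project's actual configuration:

```yaml
# Hypothetical .pre-commit-config.yaml sketch; the project's real hooks
# are installed via `make precommit_install`.
repos:
  - repo: https://github.com/psf/black
    rev: 21.9b0
    hooks:
      - id: black
  - repo: https://github.com/PyCQA/isort
    rev: 5.9.3
    hooks:
      - id: isort
```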
#### Tooling Status

We are currently using:

* `pylint` & `black` in the CI validations, so make sure to review your PRs for any warnings you may have generated.
* `black` & `isort` in the pre-commit hooks.
@ -53,21 +53,22 @@ Add Optionally `file` stage and `elasticsearch` bulk\_sink along with `metadata-rest` source

```javascript
{
  "source": {
    "type": "metadata",
    "config": {
      "include_tables": "true",
      "include_topics": "true",
      "include_dashboards": "true",
      "limit_records": 10
    }
  },
  "sink": {
    "type": "elasticsearch",
    "config": {
      "index_tables": "true",
      "index_topics": "true",
      "index_dashboards": "true",
      "es_host": "localhost",
      "es_port": 9200
    }
  },
  "metadata_server": {
```