From 9ff6b928e55c3acc6c477dc1e91fe382bad90213 Mon Sep 17 00:00:00 2001
From: pmbrull
Date: Tue, 22 Mar 2022 15:33:07 +0000
Subject: [PATCH] GitBook: [#110] Python dev docs

---
 docs/SUMMARY.md | 5 ++--
 .../open-source-community/developer/README.md | 4 +--
 .../developer/build-a-connector/README.md | 4 ---
 .../developer/build-code-run-tests/README.md | 13 ++++++++
 .../ingestion-framework.md} | 19 ++++++++++--
 .../openmetadata-server.md} | 30 +++++++++++--------
 .../developer/how-to-contribute.md | 2 +-
 .../developer/python-api.md | 2 +-
 .../developer/run-integration-tests.md | 14 ++++-----
 9 files changed, 61 insertions(+), 32 deletions(-)
 create mode 100644 docs/open-source-community/developer/build-code-run-tests/README.md
 rename docs/open-source-community/developer/{build-a-connector/setup.md => build-code-run-tests/ingestion-framework.md} (65%)
 rename docs/open-source-community/developer/{build-code-run-tests.md => build-code-run-tests/openmetadata-server.md} (80%)

diff --git a/docs/SUMMARY.md b/docs/SUMMARY.md
index ee8da69ef75..21b6bbf360a 100644
--- a/docs/SUMMARY.md
+++ b/docs/SUMMARY.md
@@ -223,10 +223,11 @@
  * [How to Contribute](open-source-community/developer/how-to-contribute.md)
  * [Prerequisites](open-source-community/developer/prerequisites.md)
  * [Backend](open-source-community/developer/backend/README.md)
- * [Build the code & run tests](open-source-community/developer/build-code-run-tests.md)
+ * [Build the code & run tests](open-source-community/developer/build-code-run-tests/README.md)
+ * [OpenMetadata Server](open-source-community/developer/build-code-run-tests/openmetadata-server.md)
+ * [Ingestion Framework](open-source-community/developer/build-code-run-tests/ingestion-framework.md)
  * [Quick Start Guide](open-source-community/developer/quick-start-guide.md)
  * [Build a Connector](open-source-community/developer/build-a-connector/README.md)
- * [Setup](open-source-community/developer/build-a-connector/setup.md)
  * [Source](open-source-community/developer/build-a-connector/source.md)
  * [Sink](open-source-community/developer/build-a-connector/sink.md)
  * [Stage](open-source-community/developer/build-a-connector/stage.md)
diff --git a/docs/open-source-community/developer/README.md b/docs/open-source-community/developer/README.md
index 443f6c6be30..01ca63412ab 100644
--- a/docs/open-source-community/developer/README.md
+++ b/docs/open-source-community/developer/README.md
@@ -18,8 +18,8 @@ This document summarizes information relevant to OpenMetadata committers and con
 [backend](backend/)
 {% endcontent-ref %}
 
-{% content-ref url="build-code-run-tests.md" %}
-[build-code-run-tests.md](build-code-run-tests.md)
+{% content-ref url="build-code-run-tests/" %}
+[build-code-run-tests](build-code-run-tests/)
 {% endcontent-ref %}
 
 {% content-ref url="quick-start-guide.md" %}
diff --git a/docs/open-source-community/developer/build-a-connector/README.md b/docs/open-source-community/developer/build-a-connector/README.md
index 7248e8037ce..d39376d8e26 100644
--- a/docs/open-source-community/developer/build-a-connector/README.md
+++ b/docs/open-source-community/developer/build-a-connector/README.md
@@ -24,10 +24,6 @@ Workflow execution happens in a serial fashion.
 
 In the cases where we need aggregation over the records, we can use the **stage** to write to a file or other store. Use the file written to in **stage** and pass it to **bulk sink** to publish to external services such as **OpenMetadata** or **Elasticsearch**.
 
-{% content-ref url="setup.md" %}
-[setup.md](setup.md)
-{% endcontent-ref %}
-
 {% content-ref url="source.md" %}
 [source.md](source.md)
 {% endcontent-ref %}
diff --git a/docs/open-source-community/developer/build-code-run-tests/README.md b/docs/open-source-community/developer/build-code-run-tests/README.md
new file mode 100644
index 00000000000..6ba8cee04e8
--- /dev/null
+++ b/docs/open-source-community/developer/build-code-run-tests/README.md
@@ -0,0 +1,13 @@
+---
+description: Learn how to build and run the building blocks of OpenMetadata
+---
+
+# Build the code & run tests
+
+{% content-ref url="openmetadata-server.md" %}
+[openmetadata-server.md](openmetadata-server.md)
+{% endcontent-ref %}
+
+{% content-ref url="ingestion-framework.md" %}
+[ingestion-framework.md](ingestion-framework.md)
+{% endcontent-ref %}
diff --git a/docs/open-source-community/developer/build-a-connector/setup.md b/docs/open-source-community/developer/build-code-run-tests/ingestion-framework.md
similarity index 65%
rename from docs/open-source-community/developer/build-a-connector/setup.md
rename to docs/open-source-community/developer/build-code-run-tests/ingestion-framework.md
index 9f214965428..569d2ec7a10 100644
--- a/docs/open-source-community/developer/build-a-connector/setup.md
+++ b/docs/open-source-community/developer/build-code-run-tests/ingestion-framework.md
@@ -1,8 +1,18 @@
 ---
-description: Let's review the Python tooling to start working on the Ingestion Framework.
+description: Configure Python and test the Ingestion Framework
 ---
 
-# Setup
+# Ingestion Framework
+
+## Prerequisites
+
+The Ingestion Framework is a Python module that wraps the OpenMetadata API and builds workflows and utilities on top of it. Therefore, you need to make sure that you have the complete OpenMetadata stack running: MySQL + ElasticSearch + OpenMetadata Server.
+
+To do so, you can either build and run the [OpenMetadata Server](openmetadata-server.md) locally as well, or use the `metadata` CLI to spin up the [Docker containers](../../../overview/run-openmetadata/).
+
+## Python Setup
+
+We recommend using `pyenv` to properly install and manage different Python versions in your system. Note that OpenMetadata requires Python version +3.8. This [doc](https://python-docs.readthedocs.io/en/latest/dev/virtualenvs.html) might be helpful to set up the environment virtualization.
 
 ### Generated Sources
 
@@ -12,6 +22,8 @@ All different parts of the code rely on those definitions. The first step to sta
 
 In the Ingestion Framework, this process is handled with `datamodel-code-generator`, which is able to read JSON schemas and automatically prepare `pydantic` models representing the input definitions. Please, make sure to run `make install_dev generate` from the project root to fill the `ingestion/src/metadata/generated` directory with the required models.
 
+Once you have generated the sources, you should be able to run the tests and the `metadata` CLI. You can test your setup by running `make coverage` and see if you get any errors.
+
 ### Quality tools
 
 When working on the Ingestion Framework, you might want to take into consideration the following style-check tooling:
@@ -20,7 +32,7 @@ When working on the Ingestion Framework, you might want to take into considerati
 * [black](https://black.readthedocs.io/en/stable/) can be used to both autoformat the code and validate that the codebase is compliant.
 * [isort](https://pycqa.github.io/isort/) helps us not lose time trying to find the proper combination of importing from `stdlib`, requirements, project files…
 
-The main goal is to ensure standardised formatting throughout the codebase.
+The main goal is to ensure standardized formatting throughout the codebase.
 
 When developing, you can run these tools with `make` recipes: `make lint`, `make black` and `make isort`. Note that we are excluding the generated sources from the JSON Schema standards.
 
@@ -34,3 +46,4 @@ We are currently using:
 
 * `pylint` & `black` in the CI validations, so make sure to review your PRs for any warnings you generated.
 * `black` & `isort` in the pre-commit hooks.
+
diff --git a/docs/open-source-community/developer/build-code-run-tests.md b/docs/open-source-community/developer/build-code-run-tests/openmetadata-server.md
similarity index 80%
rename from docs/open-source-community/developer/build-code-run-tests.md
rename to docs/open-source-community/developer/build-code-run-tests/openmetadata-server.md
index ec77940fca4..67918f71aa5 100644
--- a/docs/open-source-community/developer/build-code-run-tests.md
+++ b/docs/open-source-community/developer/build-code-run-tests/openmetadata-server.md
@@ -1,4 +1,10 @@
-# Build the code & run tests
+---
+description: >-
+ Learn how to run the OpenMetadata server in development mode by using Docker
+ and IntelliJ.
+---
+
+# OpenMetadata Server
 
 ## Prerequisites
 
@@ -11,7 +17,7 @@
 ```
 
 * Bootstrap MySQL with tables
- 1. Create a distribution as explained [here](build-code-run-tests.md#create-a-distribution-packaging)
+ 1. Create a distribution as explained [here](openmetadata-server.md#create-a-distribution-packaging)
  2. Extract the distribution tar.gz file and run the following command
 
 ```
@@ -20,7 +26,7 @@
 ```
 
 * Bootstrap ES with indexes and load sample data into MySQL
- 1. Run OpenMetadata service instances through IntelliJ IDEA following the instructions [here](build-code-run-tests.md#run-instance-through-intellij-idea)
+ 1. Run OpenMetadata service instances through IntelliJ IDEA following the instructions [here](openmetadata-server.md#run-instance-through-intellij-idea)
  2. Once the logs indicate that the instance is up, run the following commands from the top-level directory
 
 ```
@@ -34,7 +40,7 @@
 metadata ingest -c ./pipelines/sample_usage.json
 metadata ingest -c ./pipelines/metadata_to_es.json
 ```
-* You are now ready to explore the app by going to http://localhost:8585 \*If the web page doesn't work as intended, please take a look at the troubleshooting steps [here](build-code-run-tests.md#troubleshooting)
+* You are now ready to explore the app by going to http://localhost:8585 \*If the web page doesn't work as intended, please take a look at the troubleshooting steps [here](openmetadata-server.md#troubleshooting)
 
 ## Building
 
@@ -67,33 +73,33 @@ Add a new Run/Debug configuration like the below screenshot.
 2. Click on "Edit Configurations"
 3. Click + sign and Select Application and make sure your config looks similar to the below image
 
-![Intellij Runtime Configuration](<../../.gitbook/assets/Intellij-Runtime Config.png>)
+![Intellij Runtime Configuration](<../../../.gitbook/assets/Intellij-Runtime Config.png>)
 
 ## Add missing dependency
 
 Right-click on catalog-rest-service
 
-![](../../../.gitbook/assets/image-1-.png)
+![](../../../../.gitbook/assets/image-1-.png)
 
 Click on "Open Module Settings"
 
-![](../../../.gitbook/assets/image-2-.png)
+![](../../../../.gitbook/assets/image-2-.png)
 
 Go to "Dependencies"
 
-![](../../../.gitbook/assets/image-3-.png)
+![](../../../../.gitbook/assets/image-3-.png)
 
 Click “+” at the bottom of the dialog box and click "Add"
 
-![](../../../.gitbook/assets/image-4-.png)
+![](../../../../.gitbook/assets/image-4-.png)
 
 Click on Library
 
-![](../../../.gitbook/assets/image-5-.png)
+![](../../../../.gitbook/assets/image-5-.png)
 
 In that list look for "jersey-client:2.25.1"
 
-![](../../../.gitbook/assets/image-6-.png)
+![](../../../../.gitbook/assets/image-6-.png)
 
 Select it and click "OK". Now run/debug the application.
 
@@ -104,7 +110,7 @@ Select it and click "OK". Now run/debug the application.
  * If ElasticSearch in Docker on Mac is crashing, try changing Preferences -> Resources -> Memory to 4GB
  * If ElasticSearch logs show `high disk watermark [90%] exceeded`, try changing Preferences -> Resources -> Disk Image Size to at least 16GB
  * `Public Key Retrieval is not allowed` - verify that the JDBC connect URL in `conf/openmetadata.yaml` is configured with the parameter `allowPublicKeyRetrieval=true`
- * Browser console shows javascript errors, try doing a [clean build](build-code-run-tests.md#building). Some npm packages may not have been built properly.
+ * Browser console shows javascript errors, try doing a [clean build](openmetadata-server.md#building). Some npm packages may not have been built properly.
 
 ## Coding Style
 
diff --git a/docs/open-source-community/developer/how-to-contribute.md b/docs/open-source-community/developer/how-to-contribute.md
index 7af156a6bd2..da2a0b4c162 100644
--- a/docs/open-source-community/developer/how-to-contribute.md
+++ b/docs/open-source-community/developer/how-to-contribute.md
@@ -36,7 +36,7 @@ git remote -v
 git checkout -b ISSUE-200
 ```
 
-Make changes. Follow the [Coding Style](https://github.com/open-metadata/OpenMetadata/blob/main/docs/open-source-community/developer/docs/open-source-community/developer/backend/coding-style.md) Guide on best practices and [Build the code & run tests](build-code-run-tests.md) on how to set up IntelliJ, Maven.
+Make changes. Follow the [Coding Style](https://github.com/open-metadata/OpenMetadata/blob/main/docs/open-source-community/developer/docs/open-source-community/developer/backend/coding-style.md) Guide on best practices and [Build the code & run tests](build-code-run-tests/) on how to set up IntelliJ, Maven.
 
 ## Push your changes to Github
 
diff --git a/docs/open-source-community/developer/python-api.md b/docs/open-source-community/developer/python-api.md
index ed5a82475a6..4368f643b66 100644
--- a/docs/open-source-community/developer/python-api.md
+++ b/docs/open-source-community/developer/python-api.md
@@ -170,7 +170,7 @@ The same would happen if, inside the actual OpenMetadata code, there was not a w
 
 As OpenMetadata is a data-centric solution, we need to make sure we have the right ingredients at all times. That is why we have developed a high-level Python API, using `pydantic` models automatically generated from the JSON Schemas.
 
-> OBS: If you are using a [published](https://pypi.org/project/openmetadata-ingestion/) version of the Ingestion Framework, you are already good to go, as we package the code with the `metadata.generated` module. If you are developing a new feature, you can get more information [here](build-a-connector/setup.md).
+> OBS: If you are using a [published](https://pypi.org/project/openmetadata-ingestion/) version of the Ingestion Framework, you are already good to go, as we package the code with the `metadata.generated` module. If you are developing a new feature, you can get more information [here](broken-reference).
 
 This API wrapper helps developers and consumers in:
 
diff --git a/docs/open-source-community/developer/run-integration-tests.md b/docs/open-source-community/developer/run-integration-tests.md
index 76f504dda0b..d15c42e2300 100644
--- a/docs/open-source-community/developer/run-integration-tests.md
+++ b/docs/open-source-community/developer/run-integration-tests.md
@@ -3,13 +3,14 @@
 {% hint style="info" %}
 **The integration tests don't work at the moment.**
 
-Make sure OpenMetadata is up and running. Refer to instructions [building and running](build-code-run-tests.md).
+Make sure OpenMetadata is up and running. Refer to instructions [building and running](build-code-run-tests/).
 {% endhint %}
 
 ## Run MySQL test
 
 Run the following commands from the top-level directory
-```text
+
+```
 python3 -m venv /tmp/venv
 source /tmp/venv/bin/activate
 pip install -r ingestion/requirements.txt
@@ -22,7 +23,7 @@ pytest -s -c /dev/null
 
 ## Run MsSQL test
 
-```text
+```
 cd ingestion
 source env/bin/activate
 cd tests/integration/mssql
@@ -31,7 +32,7 @@ pytest -s -c /dev/null
 
 ## Run Postgres test
 
-```text
+```
 cd ingestion
 source env/bin/activate
 cd tests/integration/postgres
@@ -40,7 +41,7 @@ pytest -s -c /dev/null
 
 ## Run LDAP test
 
-```text
+```
 python3 -m venv /tmp/venv
 source /tmp/venv/bin/activate
 pip install -r ingestion/requirements.txt
@@ -53,7 +54,7 @@ pytest -s -c /dev/null
 
 ## Run Hive test
 
-```text
+```
 python3 -m venv /tmp/venv
 source /tmp/venv/bin/activate
 pip install -r ingestion/requirements.txt
@@ -64,4 +65,3 @@ pip install pyhive thrift sasl thrift_sasl
 cd ingestion/tests/integration/hive
 pytest -s -c /dev/null
 ```
-