Doc: Update Custom Connector Docs (#17002)

* Doc: Update Custom Connector Docs

* Doc: Update Custom Connector Docs

---------

Co-authored-by: Prajwal Pandit <prajwalpandit@Prajwals-MacBook-Air.local>
Prajwal214 2024-07-12 14:03:47 +05:30 committed by GitHub
parent 2aef457785
commit 19c43273dc
4 changed files with 112 additions and 16 deletions

@@ -14,9 +14,9 @@ process is the same for Pipelines, Dashboard or Messaging services.
{% note %}
This guide is based on a working example in the OpenMetadata Demos repository: [link](https://github.com/open-metadata/openmetadata-demo/tree/main/custom-connector).
#### This guide is based on a working example in the OpenMetadata Demos repository: [link](https://github.com/open-metadata/openmetadata-demo/tree/main/custom-connector).
We'd recommend to go through the example to better understand how all the pieces should look like.
#### We recommend going through the example to better understand how all the pieces should look.
{% /note %}
@@ -28,13 +28,13 @@ Watch OpenMetadata's [Webinar on Custom Connectors](https://www.youtube.com/watc
## Step 1 - Prepare your Connector
A connector is a class that extends from `metadata.ingestion.api.source.Source`. It should implement
all the required methods ([docs](/sdk/python/build-connector/source#for-consumers-of-openmetadata-ingestion-to-define-custom-connectors-in-their-own-package-with-same-namespace)).
A connector is a class that extends `Source` (`from metadata.ingestion.api.steps import Source`). It should implement
all the required methods ([docs](https://docs.open-metadata.org/sdk/python/build-connector/source#for-consumers-of-openmetadata-ingestion-to-define-custom-connectors-in-their-own-package-with-same-namespace)).
In [connector/my_awesome_connector.py](https://github.com/open-metadata/openmetadata-demo/blob/main/custom-connector/connector/my_awesome_connector.py) you have a minimal example of it.
Note how the important method is the `next_record`. This is the generator function that will be iterated over
to send all the Create Entity Requests to the `Sink`. Read more about the `Workflow` [here](/sdk/python/build-connector).
Note how the important method is `_iter`. This is the generator function that will be iterated over
to send all the Create Entity Requests to the `Sink`. Read more about the `Workflow` [here](https://docs.open-metadata.org/sdk/python/build-connector).
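To make the generator idea concrete, here is a minimal sketch of the pattern. It does not import OpenMetadata: `SourceStub` and `CreateTableRequestStub` are hypothetical stand-ins for the real `Source` base class and Create Entity Requests, and `run` only imitates a workflow handing records to a sink. The real base class requires more methods than shown here.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass
class CreateTableRequestStub:
    """Hypothetical stand-in for a Create Entity Request."""

    name: str


class SourceStub:
    """Hypothetical stand-in for the Source base class (not the real API)."""

    def run(self) -> List[CreateTableRequestStub]:
        # A real workflow would pass each yielded request on to the Sink;
        # this sketch simply collects them in a list.
        return list(self._iter())

    def _iter(self) -> Iterator[CreateTableRequestStub]:
        raise NotImplementedError


class MyConnector(SourceStub):
    """Connector that yields one request per table name it was given."""

    def __init__(self, table_names: Iterable[str]):
        self.table_names = list(table_names)

    def _iter(self) -> Iterator[CreateTableRequestStub]:
        # The generator: one request per row of source data.
        for table_name in self.table_names:
            yield CreateTableRequestStub(name=table_name)


requests = MyConnector(["orders", "customers"]).run()
```

The connector only decides what to yield. Iterating and delivering the records is left to whoever calls `_iter`.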
## Step 2 - Yield the Data
@@ -44,6 +44,54 @@ the different Entities, a recommended read is the Python SDK [docs](/sdk/python)
We do not have docs and examples of all the supported Services. A way to get examples on how to create and fetch
other types of Entities is to directly refer to the `ometa` [integration tests](https://github.com/open-metadata/OpenMetadata/tree/main/ingestion/tests/integration/ometa).
### Either & StackTraceError
When we `yield` the data, we are now wrapping the state of the execution being correct or not with an `Either` class:
```python
from metadata.ingestion.api.models import Either, StackTraceError
```
This `Either` will have a `left` or `right`, and we will either return:
- `right` with the correct `CreateEntityRequest`
- `left` with the exception that we want to track with `StackTraceError`.
For example:
```python
try:
    1 / 0
except Exception:
    yield Either(
        left=StackTraceError(
            name="My Error",
            error="Demoing one error",
            stack_trace=traceback.format_exc(),
        )
    )

for row in self.data:
    yield Either(
        right=CreateTableRequest(
            ...
        )
    )
```
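The same pattern can be tried without an OpenMetadata installation. The sketch below uses hypothetical stand-ins (`EitherStub`, `StackTraceErrorStub`, and a plain dict in place of a `CreateTableRequest`) to show one way a generator might report a bad row as `left` while still yielding the good rows as `right`. The row layout (`name`, `columns`) is assumed for illustration only.

```python
import traceback
from dataclasses import dataclass
from typing import Any, Dict, Iterable, Iterator, Optional


@dataclass
class StackTraceErrorStub:
    """Hypothetical stand-in for StackTraceError."""

    name: str
    error: str
    stack_trace: str


@dataclass
class EitherStub:
    """Hypothetical stand-in for Either: exactly one side should be set."""

    left: Optional[StackTraceErrorStub] = None
    right: Optional[Any] = None


def yield_data(rows: Iterable[Dict[str, str]]) -> Iterator[EitherStub]:
    for row in rows:
        try:
            # Building the request may fail, e.g. on a non-numeric column count.
            request = {"name": row["name"], "columns": int(row["columns"])}
        except Exception as exc:
            # Report the failure and move on to the next row.
            yield EitherStub(
                left=StackTraceErrorStub(
                    name=str(row.get("name")),
                    error=f"Could not parse row: {exc}",
                    stack_trace=traceback.format_exc(),
                )
            )
            continue
        yield EitherStub(right=request)


results = list(
    yield_data(
        [
            {"name": "orders", "columns": "3"},
            {"name": "broken", "columns": "not-a-number"},
        ]
    )
)
```

Building the request inside the `try` and yielding outside it keeps one failing row from stopping the rest of the ingestion.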
Note that with the new structure, any errors are going to be properly logged at the end of the execution as:
```
+--------+---------------+-------------------+--------------------------------------------------------------------------------------------------------------------------+
| From | Entity Name | Message | Stack Trace |
+========+===============+===================+==========================================================================================================================+
| Source | My Error | Demoing one error | Traceback (most recent call last): |
| | | | File "/Users/pmbrull/github/openmetadata-demo/custom-connector/connector/my_csv_connector.py", line 182, in yield_data |
| | | | 1 / 0 |
| | | | ZeroDivisionError: division by zero |
+--------+---------------+-------------------+--------------------------------------------------------------------------------------------------------------------------+
```
## Step 3 - Prepare the Package Installation
We'll need to package the code so that it can be shipped to the ingestion container and used there. In this demo
@@ -55,11 +103,11 @@ If you want to use the connector from the UI, the Python environment running the
the new code you just created. For example, if running via Docker, the `openmetadata-ingestion` image should be
aware of your new package.
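A quick way to check that an environment is "aware" of the package is to ask Python whether the module can be found. This is a small sketch using only the standard library. The module name `connector` is an assumption based on the demo repository layout and should be replaced with your own package name.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if `module_name` can be located in the current environment."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except ModuleNotFoundError:
        # Raised for dotted names when a parent package is missing.
        return False


# Assumed package name, taken from the demo repository layout.
print("connector importable:", is_importable("connector"))
```

Running this inside the ingestion container (or whichever Python environment runs the ingestion) before deploying a workflow should show whether the installation step worked.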
We will be running the demo against the OpenMetadata version `0.13.2`, therefore, our Dockerfile looks like:
We will be running the demo against the OpenMetadata version `1.4.4`, therefore, our Dockerfile looks like:
```Dockerfile
# Base image from the right version
FROM openmetadata/ingestion:0.13.2
FROM openmetadata/ingestion:1.4.4
# Let's use the same workdir as the ingestion image
WORKDIR ingestion

@@ -14,9 +14,9 @@ process is the same for Pipelines, Dashboard or Messaging services.
{% note %}
This guide is based on a working example in the OpenMetadata Demos repository: [link](https://github.com/open-metadata/openmetadata-demo/tree/main/custom-connector).
#### This guide is based on a working example in the OpenMetadata Demos repository: [link](https://github.com/open-metadata/openmetadata-demo/tree/main/custom-connector).
We'd recommend to go through the example to better understand how all the pieces should look like.
#### We recommend going through the example to better understand how all the pieces should look.
{% /note %}
@@ -28,13 +28,13 @@ Watch OpenMetadata's [Webinar on Custom Connectors](https://www.youtube.com/watc
## Step 1 - Prepare your Connector
A connector is a class that extends from `metadata.ingestion.api.source.Source`. It should implement
all the required methods ([docs](/sdk/python/build-connector/source#for-consumers-of-openmetadata-ingestion-to-define-custom-connectors-in-their-own-package-with-same-namespace)).
A connector is a class that extends `Source` (`from metadata.ingestion.api.steps import Source`). It should implement
all the required methods ([docs](https://docs.open-metadata.org/sdk/python/build-connector/source#for-consumers-of-openmetadata-ingestion-to-define-custom-connectors-in-their-own-package-with-same-namespace)).
In [connector/my_awesome_connector.py](https://github.com/open-metadata/openmetadata-demo/blob/main/custom-connector/connector/my_awesome_connector.py) you have a minimal example of it.
Note how the important method is the `next_record`. This is the generator function that will be iterated over
to send all the Create Entity Requests to the `Sink`. Read more about the `Workflow` [here](/sdk/python/build-connector).
Note how the important method is `_iter`. This is the generator function that will be iterated over
to send all the Create Entity Requests to the `Sink`. Read more about the `Workflow` [here](https://docs.open-metadata.org/sdk/python/build-connector).
## Step 2 - Yield the Data
@@ -44,6 +44,54 @@ the different Entities, a recommended read is the Python SDK [docs](/sdk/python)
We do not have docs and examples of all the supported Services. A way to get examples on how to create and fetch
other types of Entities is to directly refer to the `ometa` [integration tests](https://github.com/open-metadata/OpenMetadata/tree/main/ingestion/tests/integration/ometa).
### Either & StackTraceError
When we `yield` the data, we are now wrapping the state of the execution being correct or not with an `Either` class:
```python
from metadata.ingestion.api.models import Either, StackTraceError
```
This `Either` will have a `left` or `right`, and we will either return:
- `right` with the correct `CreateEntityRequest`
- `left` with the exception that we want to track with `StackTraceError`.
For example:
```python
try:
    1 / 0
except Exception:
    yield Either(
        left=StackTraceError(
            name="My Error",
            error="Demoing one error",
            stack_trace=traceback.format_exc(),
        )
    )

for row in self.data:
    yield Either(
        right=CreateTableRequest(
            ...
        )
    )
```
Note that with the new structure, any errors are going to be properly logged at the end of the execution as:
```
+--------+---------------+-------------------+--------------------------------------------------------------------------------------------------------------------------+
| From | Entity Name | Message | Stack Trace |
+========+===============+===================+==========================================================================================================================+
| Source | My Error | Demoing one error | Traceback (most recent call last): |
| | | | File "/Users/pmbrull/github/openmetadata-demo/custom-connector/connector/my_csv_connector.py", line 182, in yield_data |
| | | | 1 / 0 |
| | | | ZeroDivisionError: division by zero |
+--------+---------------+-------------------+--------------------------------------------------------------------------------------------------------------------------+
```
## Step 3 - Prepare the Package Installation
We'll need to package the code so that it can be shipped to the ingestion container and used there. In this demo
@@ -55,11 +103,11 @@ If you want to use the connector from the UI, the Python environment running the
the new code you just created. For example, if running via Docker, the `openmetadata-ingestion` image should be
aware of your new package.
We will be running the demo against the OpenMetadata version `0.13.2`, therefore, our Dockerfile looks like:
We will be running the demo against the OpenMetadata version `1.4.4`, therefore, our Dockerfile looks like:
```Dockerfile
# Base image from the right version
FROM openmetadata/ingestion:0.13.2
FROM openmetadata/ingestion:1.4.4
# Let's use the same workdir as the ingestion image
WORKDIR ingestion

Binary file not shown. Size before: 110 KiB. Size after: 130 KiB.

Binary file not shown. Size before: 110 KiB. Size after: 130 KiB.