mirror of https://github.com/datahub-project/datahub.git, synced 2025-08-15 12:46:53 +00:00
docs(ingest): clarify adding source guide (#9161)
parent 81daae815a
commit 02156662b5
@@ -6,7 +6,7 @@ There are two ways of adding a metadata ingestion source.
2. You are writing the custom source for yourself and are not going to contribute back (yet).

If you are going for case (1) just follow the steps 1 to 9 below. In case you are building it for yourself you can skip
-steps 4-9 (but maybe write tests and docs for yourself as well) and follow the documentation
+steps 4-8 (but maybe write tests and docs for yourself as well) and follow the documentation
on [how to use custom ingestion sources](../docs/how/add-custom-ingestion-source.md)
without forking Datahub.

@@ -27,6 +27,7 @@ from `ConfigModel`. The [file source](./src/datahub/ingestion/source/file.py) is
We use [pydantic](https://pydantic-docs.helpmanual.io) conventions for documenting configuration flags. Use the `description` attribute to write rich documentation for your configuration field.

For example, the following code:

```python
from pydantic import Field
from datahub.api.configuration.common import ConfigModel
@@ -49,12 +50,10 @@ generates the following documentation:
<img width="70%" src="https://raw.githubusercontent.com/datahub-project/static-assets/main/imgs/metadata-ingestion/generated_config_docs.png"/>
</p>


:::note
Inline markdown or code snippets are not yet supported for field level documentation.
:::


### 2. Set up the reporter

The reporter interface enables the source to report statistics, warnings, failures, and other information about the run.
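
As a rough illustration of what such a reporter tracks, here is a minimal, self-contained sketch. It does not use DataHub's own classes: `SketchReport` and its method names are hypothetical stand-ins chosen for this example, so check the actual reporter base class in the repository before relying on any of them.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SketchReport:
    """Hypothetical stand-in for a source reporter (not DataHub's real class)."""

    events_produced: int = 0
    # Map a short key (for example a table name) to the messages recorded for it.
    warnings: Dict[str, List[str]] = field(default_factory=dict)
    failures: Dict[str, List[str]] = field(default_factory=dict)

    def report_event(self) -> None:
        # Count one successfully produced unit of metadata.
        self.events_produced += 1

    def report_warning(self, key: str, reason: str) -> None:
        # Record a non-fatal problem; the run continues.
        self.warnings.setdefault(key, []).append(reason)

    def report_failure(self, key: str, reason: str) -> None:
        # Record a problem that caused part of the run to fail.
        self.failures.setdefault(key, []).append(reason)


# Example run: one event emitted, one warning, one failure.
report = SketchReport()
report.report_event()
report.report_warning("db.table_a", "column type could not be mapped")
report.report_failure("db.table_b", "permission denied")
```

Keeping counters and keyed message lists on a single object means the source can update it as it goes and the framework can summarize the run at the end.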
@@ -71,6 +70,8 @@ some [convenience methods](./src/datahub/emitter/mce_builder.py) for commonly us

### 4. Set up the dependencies

Note: Steps 4-8 are only required if you intend to contribute the source back to the Datahub project.

Declare the source's pip dependencies in the `plugins` variable of the [setup script](./setup.py).
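
A sketch of what such a declaration might look like, assuming `plugins` is a dictionary that maps each plugin name to a set of pip requirement strings. The plugin name `my-source` and its requirements are placeholders for illustration, not entries from the real setup script. Check the existing entries in `setup.py` for the exact shape.

```python
from typing import Dict, Set

# Assumed shape: plugin name -> pip requirement strings needed by that plugin.
plugins: Dict[str, Set[str]] = {
    # Hypothetical entry; replace with your source's name and real dependencies.
    "my-source": {"requests", "sqlalchemy"},
}

# Flatten every plugin's requirements, e.g. to build an "install everything" extra.
all_requirements = sorted(set().union(*plugins.values()))
```

Declaring dependencies per plugin keeps the base install small, since users only pull in the packages for the sources they enable.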

### 5. Enable discoverability
@@ -131,7 +132,6 @@ class FileSource(Source):

```


#### 7.2 Write custom documentation

- Create a copy of [`source-docs-template.md`](./source-docs-template.md) and edit all relevant components.
@@ -144,12 +144,14 @@ class FileSource(Source):
Documentation for the source can be viewed by running the documentation generator from the `docs-website` module.

##### Step 1: Build the Ingestion docs

```console
# From the root of DataHub repo
./gradlew :metadata-ingestion:docGen
```

If this finishes successfully, you will see output messages like:

```console
Ingestion Documentation Generation Complete
############################################
@@ -170,6 +172,7 @@ Ingestion Documentation Generation Complete
You can also find documentation files generated at `./docs/generated/ingestion/sources` relative to the root of the DataHub repo. You should be able to locate your specific source's markdown file here and investigate it to make sure things look as expected.

#### Step 2: Build the Entire Documentation

To view how this documentation looks in the browser, there is one more step. Just build the entire docusaurus page from the `docs-website` module.

```console
@@ -178,6 +181,7 @@ To view how this documentation looks in the browser, there is one more step. Jus
```

This will generate messages like:

```console
...
> Task :docs-website:yarnGenerate
@@ -220,6 +224,7 @@ BUILD SUCCESSFUL in 35s
```

After this you need to run the following script from the `docs-website` module.

```console
cd docs-website
npm run serve
@@ -228,7 +233,6 @@ npm run serve
Now, browse to http://localhost:3000 or whichever port npm is running on, to browse the docs.
Your source should show up on the left sidebar under `Metadata Ingestion / Sources`.


### 8. Add SQL Alchemy mapping (if applicable)

Add the source in `get_platform_from_sqlalchemy_uri` function
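
The idea is to map a SQLAlchemy connection URI to a platform name. The sketch below is a standalone illustration of that kind of prefix-based lookup, not the body of the real function. `sketch_platform_from_uri`, the `mydb` scheme, and the `external` fallback are all assumptions made for the example.

```python
def sketch_platform_from_uri(sqlalchemy_uri: str) -> str:
    """Illustrative only: guess a platform name from the URI scheme."""
    # Hypothetical mapping from URI prefix to platform name.
    prefix_to_platform = {
        "postgresql": "postgres",
        "mysql": "mysql",
        "mydb": "mydb",  # the new source you are adding (placeholder name)
    }
    for prefix, platform in prefix_to_platform.items():
        if sqlalchemy_uri.startswith(prefix):
            return platform
    # Assumed fallback for URIs that match no known prefix.
    return "external"


# A URI with a driver suffix still matches on its leading scheme.
platform = sketch_platform_from_uri("mydb+driver://user@host/db")
```

Matching on the URI prefix means driver variants such as `mydb+driver://` resolve to the same platform without extra entries.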