mirror of
https://github.com/datahub-project/datahub.git
synced 2025-12-26 01:18:20 +00:00
docs(feature-guide) Impact Analysis (#5765)
* update sidebar titles to remove About DataHub
* move impact analysis guide to new folder; update links
* update copy in Understand Data in Context section
* adding feature guide template to sidebar
* adding feature guide template
* update docs readme to link to feature guide template
* enhance docs-website readme
* add comments to feature guide template
* add links to graphql and lineage resources
* linter cleanup
* updating reference links
* update to graphql reference links
* add image and gif best practices
* update feature guide template with image details
* fix link
* update template from YouTube -> Videos
* Update docs-website/README.md

Co-authored-by: Harshal Sheth <hsheth2@gmail.com>

* update feature to Lineage Impact Analysis

Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
This commit is contained in:
parent
91f6084473
commit
4956f5a165
@ -35,4 +35,106 @@ To regenerate GraphQL API docs, simply rebuild the docs-website directory.
```console
./gradlew docs-website:build
```
```

## Managing Content

Please use the following steps when adding/managing content for the docs site.

### Leverage Documentation Templates

* [Feature Guide Template](./docs/_feature-guide-template.md)
* [Metadata Ingestion Source Template](./metadata-ingestion/source-docs-template.md)

### Self-Hosted vs. Managed DataHub

The docs site includes resources for both self-hosted (aka open-source) DataHub and Managed DataHub alike.

* All Feature Guides should include the `FeatureAvailability` component within the markdown file itself
* Features only available via Managed DataHub should have the `saasOnly` class if they are included in `sidebar.js` to display the small "cloud" icon:

```
{
  type: "doc",
  id: "path/to/document",
  className: "saasOnly",
},
```
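
As a rough illustration of the entry shape shown above, the sketch below builds a `saasOnly` sidebar item in TypeScript. The `SidebarDocItem` interface and the `saasOnlyDoc` helper are hypothetical names introduced here for illustration; they are not part of `sidebar.js` or of Docusaurus.

```typescript
// Hypothetical type describing one "doc" entry as it appears in sidebar.js.
interface SidebarDocItem {
  type: "doc";
  id: string;
  className?: string;
}

// Hypothetical helper: build an entry for a page that only applies to
// Managed DataHub, so the sidebar can render the small "cloud" icon.
function saasOnlyDoc(id: string): SidebarDocItem {
  return { type: "doc", id, className: "saasOnly" };
}

// Same placeholder id as in the snippet above.
const managedOnlyEntry: SidebarDocItem = saasOnlyDoc("path/to/document");
```

A helper like this would only be a convenience; writing the object literal directly in `sidebar.js`, as the README shows, is equivalent.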

### Sidebar Display Options

`generateDocsDir.ts` has a bunch of logic to auto-generate the docs site Sidebar; here are a few ways to manage how documents are displayed.

1. Leverage the document's H1 value

By default, the Sidebar will display the H1 value of the Markdown file, not the file name itself.

**NOTE:** `generateDocsDir.ts` will strip leading values of `DataHub ` and `About DataHub ` to minimize repetitive values of DataHub in the sidebar

2. Hard-code the section title in `generateDocsDir.ts`

Map the file to a hard-coded value in `const hardcoded_titles`

3. Assign a `title` separate from the H1 value

You can add the following details at the top of the markdown file:

```
---
title: [value to display in the sidebar]
---
```

*This will be ignored if your H1 value begins with `DataHub ` or `About DataHub `*

**NOTE:** Assigning a value for `label:` in `sidebar.js` is not reliable, e.g.

```
{ // Don't do this
  label: "Usage Guide",
  type: "doc",
  id: "path/to/document",
},
```
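
To make the H1-stripping rule from option 1 concrete, here is a minimal TypeScript sketch. It is a simplified stand-in, not the code in `generateDocsDir.ts`: the name `sidebarLabelFromH1` and the `STRIPPED_PREFIXES` constant are assumptions made for this example, and the sketch removes at most one prefix.

```typescript
// Prefixes removed from an H1 before it is used as the sidebar label.
// "About DataHub " is listed first so the longer prefix is tried first.
const STRIPPED_PREFIXES: string[] = ["About DataHub ", "DataHub "];

// Hypothetical helper: derive the sidebar label from a document's H1 text.
function sidebarLabelFromH1(h1: string): string {
  for (const prefix of STRIPPED_PREFIXES) {
    if (h1.startsWith(prefix)) {
      // Drop the prefix and any whitespace left around the remainder.
      return h1.slice(prefix.length).trim();
    }
  }
  // No known prefix: the H1 is used unchanged.
  return h1;
}

const featureLabel: string = sidebarLabelFromH1("About DataHub Lineage Impact Analysis");
// featureLabel is "Lineage Impact Analysis"
```

This is why a Feature Guide can keep the SEO-friendly `About DataHub ` heading while the sidebar shows only the feature name.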

### Determine the Appropriate Sidebar Section

When adding a new document to the site, determine the appropriate sidebar section:

**What is DataHub?**

By the end of this section, readers should understand the core use cases that DataHub addresses, target end-users, high-level architecture, & hosting options.

**Get Started**

The goal of this section is to provide the bare-minimum steps required to:
- Get DataHub Running
- Optionally configure SSO
- Add/invite Users
- Create Policies & assign roles
- Ingest at least one source (e.g., a data warehouse)
- Understand high-level options for enriching metadata

**Ingest Metadata**

This section aims to provide a deeper understanding of how ingestion works. Readers should be able to find details for ingesting from all systems, apply transformers, understand sinks, and understand key concepts of the Ingestion Framework (Sources, Sinks, Transformers, and Recipes).

**Enrich Metadata**

The purpose of this section is to provide direction on how to enrich metadata when shift-left isn’t an option.

**Act on Metadata**

This section provides concrete examples of acting on metadata changes in real-time and enabling Active Metadata workflows/practices.

**Deploy DataHub**

The purpose of this section is to provide the minimum steps required to deploy DataHub to the vendor of your choosing.

**Developer Guides**

The purpose of this section is to provide developers & technical users with concrete tutorials on how to work with the DataHub CLI & APIs.

**Feature Guides**

This section aims to provide plain-language feature overviews for both technical and non-technical readers alike.
@ -246,6 +246,9 @@ function markdown_guess_title(
if (sidebar_label.startsWith("DataHub ")) {
  sidebar_label = sidebar_label.slice(8).trim();
}
if (sidebar_label.startsWith("About DataHub ")) {
  sidebar_label = sidebar_label.slice(14).trim();
}
if (sidebar_label != title) {
  contents.data.sidebar_label = sidebar_label;
}

@ -217,7 +217,7 @@ module.exports = {
// className: "saasOnly",
// },
// "docs/wip/metadata-analytics",
// "docs/wip/impact-analysis",
"docs/act-on-metadata/impact-analysis",
// {
// type: "doc",
// id: "docs/wip/events-bridge",
@ -513,6 +513,7 @@ module.exports = {
// - "perf-test/README",
// "metadata-jobs/README",
// "docs/how/add-user-data",
// "docs/_feature-guide-template"
// ],
},
};

@ -98,7 +98,7 @@ const featureGuideContent = [
{ title: "UI-Based Ingestion", icon: <ApiTwoTone />, to: "docs/ui-ingestion" },
{ title: "Search", icon: <SearchOutlined />, to: "docs/how/search" },
// { title: "Browse", icon: <CompassTwoTone />, to: "/docs/quickstart" },
{ title: "Impact Analysis", icon: <NodeExpandOutlined />, to: "docs/wip/impact-analysis" },
{ title: "Lineage Impact Analysis", icon: <NodeExpandOutlined />, to: "docs/act-on-metadata/impact-analysis" },
{ title: "Metadata Tests", icon: <CheckCircleTwoTone />, to: "docs/wip/metadata-tests" },
{ title: "Approval Flows", icon: <SafetyCertificateTwoTone />, to: "docs/wip/approval-workflows" },
{ title: "Personal Access Tokens", icon: <LockTwoTone />, to: "docs/authentication/personal-access-tokens" },

@ -104,8 +104,8 @@ function Home() {
</h2>
<p>
DataHub is the one-stop shop for documentation, schemas,
ownership, lineage, pipelines and usage information. Data
quality and data preview information coming soon.
ownership, lineage, pipelines, data quality, usage information,
and more.
</p>
</div>
<div className="col col--6 col--offset-1">

@ -1 +1,51 @@
# DataHub Docs Overview

DataHub's project documentation is hosted at [datahubproject.io](https://datahubproject.io/docs)

## Types of Documentation

### Feature Guide

A Feature Guide should follow the [Feature Guide Template](/_feature-guide-template.md), and should provide the following value:

* At a high level, what is the concept/feature within DataHub?
* Why is the feature useful?
* What are the common use cases of the feature?
* What are the simple steps one needs to take to use the feature?

When creating a Feature Guide, please remember to:

* Provide plain-language descriptions for both technical and non-technical readers
* Avoid using industry jargon, abbreviations, or acronyms
* Provide descriptive screenshots, links out to relevant YouTube videos, and any other relevant resources
* Provide links out to Tutorials for advanced use cases

*Not all Feature Guides will require a Tutorial.*

### Tutorial

A Tutorial is meant to provide very specific steps to accomplish complex workflows and advanced use cases that are out of scope of a Feature Guide.

Tutorials should be written to accommodate the targeted persona, i.e. Developer, Admin, End-User, etc.

*Not all Tutorials require an associated Feature Guide.*

## Docs Best Practices

### Embedding GIFs and/or Screenshots

* Store GIFs and screenshots in [datahub-project/static-assets](https://github.com/datahub-project/static-assets); this minimizes unnecessarily large image/file sizes in the main repo
* Center-align screenshots and size down to 70% - this improves readability/skimmability within the site

Example snippet:

```
<p align="center">
  <img width="70%" src="https://raw.githubusercontent.com/datahub-project/static-assets/main/imgs/impact-analysis-export-full-list.png"/>
</p>
```

* Use the "raw" GitHub image link (right click image from GitHub > Open in New Tab > copy URL):

  * Good: https://raw.githubusercontent.com/datahub-project/static-assets/main/imgs/dbt-test-logic-view.png
  * Bad: https://github.com/datahub-project/static-assets/blob/main/imgs/dbt-test-logic-view.png
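
The difference between the "Good" and "Bad" links is mechanical, so it can be sketched as a small TypeScript conversion. The `toRawGitHubUrl` helper below is a hypothetical example written for this guide, not a tool that exists in the repo, and it assumes the branch or tag name contains no `/`.

```typescript
// Matches https://github.com/<owner>/<repo>/blob/<ref>/<path>
const GITHUB_BLOB_URL = /^https:\/\/github\.com\/([^/]+)\/([^/]+)\/blob\/([^/]+)\/(.+)$/;

// Hypothetical helper: turn a GitHub "blob" page URL into the raw file URL.
// Assumption: <ref> is a single path segment (e.g. "main"), so branch names
// that contain "/" are not handled.
function toRawGitHubUrl(blobUrl: string): string {
  const match = GITHUB_BLOB_URL.exec(blobUrl);
  if (match === null) {
    throw new Error(`Not a GitHub blob URL: ${blobUrl}`);
  }
  const [, owner, repo, ref, path] = match;
  return `https://raw.githubusercontent.com/${owner}/${repo}/${ref}/${path}`;
}

const rawImageUrl: string = toRawGitHubUrl(
  "https://github.com/datahub-project/static-assets/blob/main/imgs/dbt-test-logic-view.png"
);
// rawImageUrl is the "Good" link listed above.
```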
83
docs/_feature-guide-template.md
Normal file
@ -0,0 +1,83 @@
import FeatureAvailability from '@site/src/components/FeatureAvailability';

# About DataHub [Feature Name]

<!-- All Feature Guides should begin with `About DataHub ` to improve SEO -->

<!--
Update feature availability; by default, feature availability is Self-Hosted and Managed DataHub

Add in `saasOnly` for Managed DataHub-only features
-->

<FeatureAvailability/>

<!-- This section should provide a plain-language overview of the feature. Consider the following:

* What does this feature do? Why is it useful?
* What are the typical use cases?
* Who are the typical users?
* In which DataHub Version did this become available? -->

## [Feature Name] Setup, Prerequisites, and Permissions

<!-- This section should provide plain-language instructions on how to configure the feature:

* What special configuration is required, if any?
* How can you confirm you configured it correctly? What is the expected behavior?
* What access levels/permissions are required within DataHub? -->

## Using [Feature Name]

<!-- Plain-language instructions of how to use the feature

Provide a step-by-step guide to using the feature, including relevant screenshots and/or GIFs

* Where/how do you access it?
* What best practices exist?
* What are common code snippets?
-->

## Additional Resources

<!-- Comment out any irrelevant or empty sections -->

### Videos

<!-- Use the following format to embed YouTube videos:

**Title of YouTube video in bold text**

<p align="center">
<iframe width="560" height="315" src="https://www.youtube.com/embed/VIDEO_ID" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
</p>

-->

<!--
NOTE: Find the iframe details in YouTube by going to Share > Embed
-->

### GraphQL

<!-- Bulleted list of relevant GraphQL docs; comment out section if none -->

### DataHub Blog

<!-- Bulleted list of relevant DataHub Blog posts; comment out section if none -->

## FAQ and Troubleshooting

<!-- Use the following format:

**Question in bold text**

Response in plain text

-->

*Need more help? Join the conversation in [Slack](http://slack.datahubproject.io)!*

### Related Features

<!-- Bulleted list of related features; comment out section if none -->
93
docs/act-on-metadata/impact-analysis.md
Normal file
@ -0,0 +1,93 @@
import FeatureAvailability from '@site/src/components/FeatureAvailability';

# About DataHub Lineage Impact Analysis

<FeatureAvailability/>

Lineage Impact Analysis is a powerful workflow for understanding the complete set of upstream and downstream dependencies of a Dataset, Dashboard, Chart, and many other DataHub Entities.

This allows Data Practitioners to proactively identify the impact of breaking schema changes or failed data pipelines on downstream dependencies, rapidly discover which upstream dependencies may have caused unexpected data quality issues, and more.

Lineage Impact Analysis is available via the DataHub UI and GraphQL endpoints, supporting manual and automated workflows.

## Lineage Impact Analysis Setup, Prerequisites, and Permissions

Lineage Impact Analysis is enabled for any Entity that has associated Lineage relationships with other Entities and does not require any additional configuration.

Any DataHub user with “View Entity Page” permissions is able to view the full set of upstream or downstream Entities and export results to CSV from the DataHub UI.

## Using Lineage Impact Analysis

Follow these simple steps to understand the full dependency chain of your data entities.

1. On a given Entity Page, select the **Lineage** tab

<p align="center">
  <img width="70%" src="https://raw.githubusercontent.com/datahub-project/static-assets/main/imgs/impact-analysis-lineage-tab.png"/>
</p>

2. Easily toggle between **Upstream** and **Downstream** dependencies

<p align="center">
  <img width="70%" src="https://raw.githubusercontent.com/datahub-project/static-assets/main/imgs/impact-analysis-choose-upstream-downstream.png"/>
</p>

3. Choose the **Degree of Dependencies** you are interested in. The default filter is “1 Degree of Dependency” to minimize processor-intensive queries.

<p align="center">
  <img width="70%" src="https://raw.githubusercontent.com/datahub-project/static-assets/main/imgs/impact-analysis-filter-dependencies.png"/>
</p>

4. Slice and dice the result list by Entity Type, Platform, Owner, and more to isolate the relevant dependencies

<p align="center">
  <img width="70%" src="https://raw.githubusercontent.com/datahub-project/static-assets/main/imgs/impact-analysis-apply-filters.png"/>
</p>

5. Export the full list of dependencies to CSV

<p align="center">
  <img width="70%" src="https://raw.githubusercontent.com/datahub-project/static-assets/main/imgs/impact-analysis-export-full-list.png"/>
</p>

6. View the filtered set of dependencies via CSV, with details about assigned ownership, domain, tags, terms, and quick links back to those entities within DataHub

<p align="center">
  <img width="70%" src="https://raw.githubusercontent.com/datahub-project/static-assets/main/imgs/impact-analysis-view-export-results.png"/>
</p>

## Additional Resources

### Videos

**DataHub 201: Impact Analysis**

<p align="center">
<iframe width="560" height="315" src="https://www.youtube.com/embed/BHG_kzpQ_aQ" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
</p>

### GraphQL

* [searchAcrossLineage](../../graphql/queries.md#searchacrosslineage)
* [searchAcrossLineageInput](../../graphql/inputObjects.md#searchacrosslineageinput)
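
For automated workflows, a request to `searchAcrossLineage` might be assembled roughly as in the TypeScript sketch below. Treat every field name here (`urn`, `direction`, `query`, `start`, `count`, `searchResults`, `degree`, `entity`) as an assumption to verify against the GraphQL reference linked above; `buildImpactQuery` and the sample URN are hypothetical and only illustrate the request shape.

```typescript
type LineageDirection = "UPSTREAM" | "DOWNSTREAM";

// Assumed subset of the input fields; check the linked reference for the full list.
interface LineageSearchInput {
  urn: string;
  direction: LineageDirection;
  query: string;
  start: number;
  count: number;
}

interface GraphQLRequest {
  query: string;
  variables: { input: LineageSearchInput };
}

// Assumed selection set: paging info plus each result's degree and entity URN.
const SEARCH_ACROSS_LINEAGE_QUERY = `
  query impactAnalysis($input: SearchAcrossLineageInput!) {
    searchAcrossLineage(input: $input) {
      start
      count
      total
      searchResults {
        degree
        entity { urn type }
      }
    }
  }
`;

// Hypothetical helper: build the JSON body for one page of results.
function buildImpactQuery(
  urn: string,
  direction: LineageDirection,
  start: number = 0,
  count: number = 100
): GraphQLRequest {
  return {
    query: SEARCH_ACROSS_LINEAGE_QUERY,
    variables: { input: { urn, direction, query: "*", start, count } },
  };
}

// Illustrative URN only; substitute an entity from your own instance.
const downstreamRequest: GraphQLRequest = buildImpactQuery(
  "urn:li:dataset:(urn:li:dataPlatform:hive,SampleHiveDataset,PROD)",
  "DOWNSTREAM"
);
```

The resulting object would typically be serialized with `JSON.stringify` and POSTed to the DataHub GraphQL endpoint along with an access token; the exact endpoint path and authentication header depend on your deployment.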

### DataHub Blog

* [Dependency Impact Analysis, Data Validation Outcomes, and MORE! - Highlights from DataHub v0.8.27 & v.0.8.28](https://blog.datahubproject.io/dependency-impact-analysis-data-validation-outcomes-and-more-1302604da233)

### FAQ and Troubleshooting

**The Lineage Tab is greyed out - why can’t I click on it?**

This means you have not yet ingested Lineage metadata for that entity. Please see the Lineage Guide to get started.

**Why is my list of exported dependencies incomplete?**

We currently limit the list of dependencies to 10,000 records; we suggest applying filters to narrow the result set if you hit that limit.

*Need more help? Join the conversation in [Slack](http://slack.datahubproject.io)!*

### Related Features

* [DataHub Lineage](./docs/lineage/intro.md)
@ -1,7 +0,0 @@
import FeatureAvailability from '@site/src/components/FeatureAvailability';

# Impact Analysis

<FeatureAvailability/>

This page is under construction - more details coming soon!