diff --git a/docs/.gitbook/assets/conversation-threads.gif b/docs/.gitbook/assets/conversation-threads.gif new file mode 100644 index 00000000000..e6e4cec7409 Binary files /dev/null and b/docs/.gitbook/assets/conversation-threads.gif differ diff --git a/docs/.gitbook/assets/glossary.gif b/docs/.gitbook/assets/glossary.gif new file mode 100644 index 00000000000..261ce006651 Binary files /dev/null and b/docs/.gitbook/assets/glossary.gif differ diff --git a/docs/SUMMARY.md b/docs/SUMMARY.md index 8d5a7172c7b..1a3587c9b6d 100644 --- a/docs/SUMMARY.md +++ b/docs/SUMMARY.md @@ -172,10 +172,31 @@ * [Backup Metadata](../upgrade/upgrade-on-kubernetes/backup-metadata.md) * [Upgrade OpenMetadata on Kubernetes](../upgrade/upgrade-on-kubernetes/upgrade-openmetadata-on-kubernetes.md) -## Metadata Ingestion +## Data Discovery -* [Metadata Ingestion Overview](../metadata-ingestion/metadata-ingestion.md) -* [Ingest Sample Data](../metadata-ingestion/ingest-sample-data.md) +* [Keyword Search](data-discovery/keyword-search.md) +* [Discovery through Association](data-discovery/discovery-through-association.md) +* [Advanced Search](data-discovery/advanced-search.md) +* [Complex Data Types](data-discovery/complex-data-types.md) +* [Deleted Entity Metadata](data-discovery/deleted-entity-metadata.md) + +## Data Collaboration + +* [Activity Feeds](data-collaboration/activity-feeds.md) +* [Conversation Threads](data-collaboration/conversation-threads.md) +* [Descriptions and Tags](data-collaboration/descriptions-and-tags.md) +* [Data Importance](data-collaboration/data-importance.md) +* [Data Ownership](data-collaboration/data-ownership.md) + +## Data Lineage + +* [Lineage Ingestion](data-lineage/lineage-ingestion.md) +* [Edit Data Lineage Manually](data-lineage/edit-data-lineage-manually.md) +* [DBT Integration](data-lineage/dbt-integration.md) + +## Controlled Vocabulary + +* [Glossaries](controlled-vocabulary/glossaries.md) ## Data Quality @@ -183,6 +204,11 @@ * 
[Metrics](../data-quality/data-quality-overview/metrics.md) * [Tests](../data-quality/data-quality-overview/tests.md) +## Metadata Ingestion + +* [Metadata Ingestion Overview](../metadata-ingestion/metadata-ingestion.md) +* [Ingest Sample Data](../metadata-ingestion/ingest-sample-data.md) + ## Open Source Community * [Community](open-source-community/community.md) diff --git a/docs/controlled-vocabulary/glossaries.md b/docs/controlled-vocabulary/glossaries.md new file mode 100644 index 00000000000..4573ef2e35e --- /dev/null +++ b/docs/controlled-vocabulary/glossaries.md @@ -0,0 +1,19 @@ +# Glossaries + +A glossary is a controlled vocabulary to describe important concepts within your organization. A glossary helps establish consistent meanings for terms, create a common understanding, and build a knowledge base. + +Glossary terms can also help to organize or discover data entities. OpenMetadata models a Glossary as a Thesaurus, a Controlled Vocabulary that organizes terms with hierarchical, equivalent, and associative relationships. + +A glossary is a hierarchical collection of Glossary Terms that belong to a domain. + +* A glossary term is specified with a preferred term for a concept or terminology, for example, **Customer**. +* A glossary term must have a unique and clear definition to establish consistent usage and understanding of the term. +* A term can include **Synonyms**, other terms used for the same concept, for example, **Client**, **Shopper**, **Purchaser**, etc. +* A term can have child terms that further specialize it. For example, the glossary term **Customer** can have the child terms **Loyal Customer**, **New Customer**, **Online Customer**, etc. +* A term can also have **Related Terms** to capture related concepts. For **Customer**, related terms could be **Customer LTV (LifeTime Value)**, **Customer Acquisition Cost**, etc. + +A glossary term lists **Assets** through which you can discover all the data assets related to the term. 
Each term has a **life cycle status** (e.g., Draft, Active, Deprecated, and Deleted). A term also has a set of **Reviewers** who review and accept the changes to the Glossary for Governance. + +The terms from the glossary can be used for labeling or tagging as additional metadata of data assets for describing and categorizing things. Glossaries are important for data discovery, retrieval, and exploration through conceptual terms and help in data governance. + +![](../.gitbook/assets/glossary.gif) diff --git a/docs/data-collaboration/activity-feeds.md b/docs/data-collaboration/activity-feeds.md new file mode 100644 index 00000000000..384063b83a0 --- /dev/null +++ b/docs/data-collaboration/activity-feeds.md @@ -0,0 +1,9 @@ +# Activity Feeds + +The OpenMetadata home screen features a change activity feed that enables you to view a summary of data change events. This feed shows all changes to data, sorted with the most recent changes at the top. The entities in the activity feed are clickable, including tables, dashboards, team names, etc. There are activity feeds for: + +* All data +* Data for which you are an owner +* Data you are following + +![](../.gitbook/assets/activity-feed.gif) diff --git a/docs/data-collaboration/conversation-threads.md b/docs/data-collaboration/conversation-threads.md new file mode 100644 index 00000000000..705b049e52a --- /dev/null +++ b/docs/data-collaboration/conversation-threads.md @@ -0,0 +1,13 @@ +# Conversation Threads + +Conversation threads enable you to comment on descriptions, tags, and change events across asset types. Conversations enable you to collaborate in real time or asynchronously. Conversation threads facilitate questions, feedback, and suggestions from data users to help maintain and improve data. + +You can @mention users and teams and #mention data assets. Conveniences enable you to make requests of teams that own an asset. + +Conversation threads are associated with assets. 
Only people with some relationship to the asset are involved in the conversation. You may comment on conversation threads from the description or tag on which the discussion was started, from an asset-level view, and from your global activity view for all assets you own and follow. + +API support enables a variety of integrations, for example with webhooks and Slack. + +Conversations can be used to capture knowledge and enhance understanding. In future releases, users or owners will be able to flag certain threads and pin them on the entity page. We’ll work on providing the ability to convert the conversation threads into Tasks, so that we can nurture collaboration towards an action. + +![](../.gitbook/assets/conversation-threads.gif) diff --git a/docs/data-collaboration/data-importance.md b/docs/data-collaboration/data-importance.md new file mode 100644 index 00000000000..4b793dfa280 --- /dev/null +++ b/docs/data-collaboration/data-importance.md @@ -0,0 +1,9 @@ +# Data Importance + +Tier tags enable you to annotate assets with their importance relative to other assets. The Explore UI enables you to filter assets based on importance. + +Use Tier tags and usage data to identify the relative importance of data assets. + +![](<../.gitbook/assets/asset-importance (1).gif>) + +#### diff --git a/docs/data-collaboration/data-ownership.md b/docs/data-collaboration/data-ownership.md new file mode 100644 index 00000000000..151ff4c5438 --- /dev/null +++ b/docs/data-collaboration/data-ownership.md @@ -0,0 +1,5 @@ +# Data Ownership + +Use ownership metadata to identify the primary points of contact for any asset of interest, so that you know who can help with questions about it. 
+ +![](../.gitbook/assets/asset-owners.gif) diff --git a/docs/data-collaboration/descriptions-and-tags.md b/docs/data-collaboration/descriptions-and-tags.md new file mode 100644 index 00000000000..4c1985528d8 --- /dev/null +++ b/docs/data-collaboration/descriptions-and-tags.md @@ -0,0 +1,7 @@ +# Descriptions and Tags + +Add descriptions and tags to tables, columns, and other assets. OpenMetadata indexes assets based on descriptions, tags, names, and other metadata to support keyword search, advanced search, and filtering, so that you and others in your organization can discover your data. + +![](../.gitbook/assets/descriptions-tags.gif) + +### diff --git a/docs/data-discovery/advanced-search.md b/docs/data-discovery/advanced-search.md new file mode 100644 index 00000000000..b5e77afa53e --- /dev/null +++ b/docs/data-discovery/advanced-search.md @@ -0,0 +1,7 @@ +# Advanced Search + +Discover assets through frequently joined tables and columns as measured by the data profiler. You can also discover assets through relationships based on data lineage. + +![](../.gitbook/assets/discover-association.gif) + +#### diff --git a/docs/data-discovery/complex-data-types.md b/docs/data-discovery/complex-data-types.md new file mode 100644 index 00000000000..7a3682ea11e --- /dev/null +++ b/docs/data-discovery/complex-data-types.md @@ -0,0 +1,7 @@ +# Complex Data Types + +Add descriptions and tags to nested fields in complex data types like arrays and structs. Locate these assets using keyword search or advanced search. + +![](../.gitbook/assets/complex-data-types.gif) + +### diff --git a/docs/data-discovery/deleted-entity-metadata.md b/docs/data-discovery/deleted-entity-metadata.md new file mode 100644 index 00000000000..4c6c40d0e4e --- /dev/null +++ b/docs/data-discovery/deleted-entity-metadata.md @@ -0,0 +1,7 @@ +# Deleted Entity Metadata + +Entities have a lot of user-generated metadata, such as descriptions, tags, ownership, and tiering. 
There’s also rich metadata generated by OpenMetadata through the data profiler, usage data, lineage, test results, and other graph relationships with other entities. When an entity is deleted, all of this rich information is lost, and it’s not easy to recreate it. OpenMetadata supports soft deletion in the UI and soft and permanent deletion in the API, enabling you to choose whether to maintain metadata for deleted entities. + +![](../.gitbook/assets/deleted-entities.gif) + +### diff --git a/docs/data-discovery/discovery-through-association.md b/docs/data-discovery/discovery-through-association.md new file mode 100644 index 00000000000..ec401b50225 --- /dev/null +++ b/docs/data-discovery/discovery-through-association.md @@ -0,0 +1,7 @@ +# Discovery through Association + +Discover assets through frequently joined tables and columns as measured by the data profiler. You can also discover assets through relationships based on data lineage. + +![](../.gitbook/assets/discover-association.gif) + +#### diff --git a/docs/data-discovery/keyword-search.md b/docs/data-discovery/keyword-search.md new file mode 100644 index 00000000000..dfbb9fe9815 --- /dev/null +++ b/docs/data-discovery/keyword-search.md @@ -0,0 +1,7 @@ +# Keyword Search + +Find assets based on name, description, component metadata (e.g., for columns, charts), and the containing service. + +![](../.gitbook/assets/asset-discovery-features.gif) + +#### diff --git a/docs/data-lineage/dbt-integration.md b/docs/data-lineage/dbt-integration.md new file mode 100644 index 00000000000..b7abdf95d35 --- /dev/null +++ b/docs/data-lineage/dbt-integration.md @@ -0,0 +1,7 @@ +# DBT Integration + +A DBT model provides transformation logic that creates a table from raw data. While lineage tells us broadly what data a table was generated from, a DBT model provides specifics. OpenMetadata includes an integration for DBT that enables you to see what models are being used to generate tables. 
+ +![](../.gitbook/assets/dbt.gif) + +### diff --git a/docs/data-lineage/edit-data-lineage-manually.md b/docs/data-lineage/edit-data-lineage-manually.md new file mode 100644 index 00000000000..057d834f6eb --- /dev/null +++ b/docs/data-lineage/edit-data-lineage-manually.md @@ -0,0 +1,5 @@ +# Edit Data Lineage Manually + +Edit lineage to provide a richer understanding of the provenance of data. The OpenMetadata no-code editor provides a drag-and-drop interface. Drop tables, pipelines, and dashboards onto the lineage graph. You may add new edges or delete existing edges to better represent data lineage. + +![](../.gitbook/assets/manual-lineage.gif) diff --git a/docs/data-lineage/lineage-ingestion.md b/docs/data-lineage/lineage-ingestion.md new file mode 100644 index 00000000000..62b28339f96 --- /dev/null +++ b/docs/data-lineage/lineage-ingestion.md @@ -0,0 +1,9 @@ +# Lineage Ingestion + +A large subset of the connectors distributed with OpenMetadata includes support for lineage ingestion. Lineage ingestion processes queries to determine upstream and downstream entities for data assets. Lineage is published to the OpenMetadata catalog when metadata is ingested. + +Using the OpenMetadata user interface and API, you may trace the path of data across tables, pipelines, and dashboards. + +![](../.gitbook/assets/lineage.gif) + +###