datahub/docs/api/graphql/querying-entities.md

5.5 KiB

Querying Metadata Entities

Learn how to fetch & update Entities in your Metadata Graph programmatically

Queries: Reading an Entity

DataHub provides the following GraphQL queries for retrieving entities in your Metadata Graph.

Getting a Metadata Entity

To retrieve a Metadata Entity by primary key (urn), simply use the <entityName>(urn: String!) GraphQL Query.

For example, to retrieve a User entity, you can issue the following GraphQL Query:

As GraphQL

{
  corpUser(urn: "urn:li:corpuser:datahub") {
    username
    urn
  }
}

As CURL

curl --location --request POST 'http://localhost:8080/api/graphql' \
--header 'X-DataHub-Actor: urn:li:corpuser:datahub' \
--header 'Content-Type: application/json' \
--data-raw '{ "query":"{ corpUser(urn: \"urn:li:corpuser:datahub\") { username urn } }", "variables":{}}'

Searching for a Metadata Entity

To perform full-text search against an Entity of a particular type, use the search(input: SearchInput!) GraphQL Query.

As GraphQL:

{
  search(input: { type: "DATASET", query: "my sql dataset", start: 0, count: 10 }) {
    start
    count
    total
    searchResults {
      entity {
         urn
         type
         ...on Dataset {
            name
         }
      }
    }
  }
}

As CURL:

curl --location --request POST 'http://localhost:8080/api/graphql' \
--header 'X-DataHub-Actor: urn:li:corpuser:datahub' \
--header 'Content-Type: application/json' \
--data-raw '{ "query":"{ search(input: { type: DATASET, query: \"my sql dataset\", start: 0, count: 10 }) { start count total searchResults { entity { urn type ...on Dataset { name } } } } }", "variables":{}}'

Note that per-field filtering criteria may additionally be provided.

Coming soon

List Metadata Entities! listDatasets, listDashboards, listCharts, listDataFlows, listDataJobs, listTags

Mutations: Modifying an Entity

DataHub provides the following GraphQL mutations for updating entities in your Metadata Graph.

Authorization

Mutations which change Entity metadata are subject to DataHub Access Policies. This means that DataHub's server will check whether the requesting actor is authorized to perform the action. If you're querying the GraphQL endpoint via the DataHub Proxy Server, which is discussed more in Getting Started, then the Session Cookie provided will carry the actor information. If you're querying the Metadata Service API directly, then you'll have to provide this via a special X-DataHub-Actor HTTP header, which should contain the URN (primary key) of the actor making the request. For example, X-DataHub-Actor: urn:li:corpuser:datahub.

Updating a Metadata Entity

To update an existing Metadata Entity, simply use the update<entityName>(urn: String!, input: EntityUpdateInput!) GraphQL Query.

For example, to update a Dashboard entity, you can issue the following GraphQL mutation:

As GraphQL

mutation updateDashboard {
    updateDashboard(
        urn: "urn:li:dashboard:(looker,baz)",
        input: {
            editableProperties: {
                description: "My new desription"
            }
        }
    ) {
        urn
    }
}

As CURL

curl --location --request POST 'http://localhost:8080/api/graphql' \
--header 'X-DataHub-Actor: urn:li:corpuser:datahub' \
--header 'Content-Type: application/json' \
--data-raw '{ "query": "mutation updateDashboard { updateDashboard(urn:\"urn:li:dashboard:(looker,baz)\", input: { editableProperties: { description: \"My new desription\" } } ) { urn } }", "variables":{}}'

Be careful: these APIs allow you to make significant changes to a Metadata Entity, often including updating the entire set of Owners & Tags.

Adding & Removing Tags / Glossary Terms

To attach a Tag or Glossary Term to a Metadata Entity, you can use the addTag and addTerm mutations. To remove them, you can use the removeTag and removeTerm mutations.

For example, to add a Tag a DataFlow entity, you can issue the following GraphQL mutation:

As GraphQL

mutation addTag {
    addTag(input: { tagUrn: "urn:li:tag:NewTag", resourceUrn: "urn:li:dataFlow:(airflow,dag_abc,PROD)" })
}

As CURL

curl --location --request POST 'http://localhost:8080/api/graphql' \
--header 'X-DataHub-Actor: urn:li:corpuser:datahub' \
--header 'Content-Type: application/json' \
--data-raw '{ "query": "mutation addTag { addTag(input: { tagUrn: \"urn:li:tag:NewTag\", resourceUrn: \"urn:li:dataFlow:(airflow,dag_abc,PROD)\" }) }", "variables":{}}'

Coming soon

Entity Creation: createDataset, createDashboard, createChart, etc. Entity Removal: removeDataset, removeDashboard, removeChart, etc.

Handling Errors

In GraphQL, requests that have errors do not always result in a non-200 HTTP response body. Instead, errors will be present in the response body inside a top-level errors field.

This enables situations in which the client is able to deal gracefully will partial data returned by the application server. To verify that no error has returned after making a GraphQL request, make sure you check both the data and errors fields that are returned.

Feedback, Feature Requests, & Support

Visit our Slack channel to ask questions, tell us what we can do better, & make requests for what you'd like to see in the future. Or just stop by to say 'Hi'.