DataHub's Python SDK makes it easy to search and discover metadata across your data ecosystem. Whether you're exploring unknown datasets, filtering by environment, or building advanced search tools, this guide walks you through how to do it all programmatically.
**With the Search SDK, you can:**
- Search for data assets by keyword or using structured filters
- Filter by environment, platform, type, custom properties, or other metadata fields
- Use `AND` / `OR` / `NOT` logic for advanced queries
## Getting Started
To use the DataHub SDK, install [`acryl-datahub`](https://pypi.org/project/acryl-datahub/) and set up a connection to your DataHub instance. Follow the [installation guide](https://docs.datahub.com/docs/metadata-ingestion/cli-ingestion#installing-datahub-cli) to get started.
Queries and filters can be combined for more precise searches. See [this example](#find-all-snowflake-datasets-related-to-forecast) for details.
Query-based search matches simple keywords against common fields such as name, description, and column names. It is useful for exploration when you are unsure of the exact asset you are looking for.
#### Find All Entities Related to Sales
For example, the script below searches for any assets that have `sales` in their metadata.
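
A minimal sketch of such a keyword search, assuming the `DataHubClient` class and the `client.search.get_urns(query=...)` call from `acryl-datahub`. The helper names `connect` and `find_urns_by_keyword` are hypothetical, and the server URL and token are placeholders you would supply yourself. The SDK import is deferred into `connect` so the search helper works with any object that exposes `search.get_urns`.

```python
from typing import List


def connect(server: str, token: str):
    """Create a client. `server` and `token` are placeholders you must supply."""
    # Imported here so the module still loads where the SDK is not installed.
    from datahub.sdk import DataHubClient

    return DataHubClient(server=server, token=token)


def find_urns_by_keyword(client, keyword: str) -> List[str]:
    """Return the URNs of assets whose searchable metadata matches `keyword`."""
    # get_urns returns an iterable of URN objects; convert each to a string.
    return [str(urn) for urn in client.search.get_urns(query=keyword)]
```

With a reachable instance, `find_urns_by_keyword(connect("<your_server>", "<your_token>"), "sales")` would return the URNs of assets matching `sales`.
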
You can combine filters using logical operations like `and_`, `or_`, and `not_` to build advanced queries. Check the [Logical Operator Options](#logical-operator-options) for more details.
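
As a rough illustration of nesting these operators, the sketch below builds one filter for datasets on Snowflake or BigQuery that are not in the `DEV` environment. It assumes the `FilterDsl` helpers `F.entity_type`, `F.platform`, and `F.env` from `datahub.sdk.search_filters`. The function name and the chosen platforms and environment are illustrative only.

```python
def build_non_dev_warehouse_filter():
    """Datasets on Snowflake or BigQuery, excluding the DEV environment."""
    # Deferred import: calling this function requires `acryl-datahub`.
    from datahub.sdk.search_filters import FilterDsl as F

    return F.and_(
        F.entity_type("dataset"),
        # Either platform is acceptable.
        F.or_(F.platform("snowflake"), F.platform("bigquery")),
        # Exclude anything tagged with the DEV environment.
        F.not_(F.env("DEV")),
    )
```

The resulting filter can be passed to a search call such as `client.search.get_urns(filter=build_non_dev_warehouse_filter())`.
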
#### Advanced: Find Entities by Other Searchable Fields
Use `F.custom_filter()` to target specific fields such as `urn`, `name`, or `description`. See [Supported Conditions for Custom Filter](#supported-conditions-for-custom-filter) for the full list of allowed `condition` values.
Any field annotated with `@Searchable` in an entity's PDL model can be used with `F.custom_filter()`. For example, you can filter data job entities by fields like `name`, `description`, or `env`, since they are annotated with `@Searchable` in [DataJobInfo.pdl](https://github.com/datahub-project/datahub/blob/master/metadata-models/src/main/pegasus/com/linkedin/datajob/DataJobInfo.pdl#L21).
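
A hedged sketch of a custom filter on the data job `name` field follows. It assumes `F.custom_filter` takes a field name, a condition, and a list of values in that order, and that `DataJobUrn.ENTITY_TYPE` from `datahub.metadata.urns` is accepted by `F.entity_type`. The function name and the idea of matching on a name fragment are illustrative.

```python
def datajob_name_filter(fragment: str):
    """Build a filter for data jobs whose `name` contains `fragment`."""
    # Deferred imports: calling this function requires `acryl-datahub`.
    from datahub.metadata.urns import DataJobUrn
    from datahub.sdk.search_filters import FilterDsl as F

    return F.and_(
        # Restrict results to data job entities.
        F.entity_type(DataJobUrn.ENTITY_TYPE),
        # Arguments: field, condition, values.
        F.custom_filter("name", "CONTAIN", [fragment]),
    )
```

Passing the result as `filter=` to `client.search.get_urns` would then return only the matching data jobs.
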
| Condition | Description |
| --- | --- |
| `CONTAIN` | Contains substring in string fields. |
| `START_WITH` | Begins with a specific substring. |
| `END_WITH` | Ends with a specific substring. |
| `GREATER_THAN` | For numeric or timestamp fields, checks if the value is greater than the specified value. |
| `LESS_THAN` | For numeric or timestamp fields, checks if the value is less than the specified value. |
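
To make the semantics of the conditions listed above concrete, here is a small local sketch in plain Python. It only illustrates how each condition compares a field value with a filter value. It is not DataHub's server-side implementation, `matches` is a hypothetical helper, and details such as case sensitivity or tokenization are not modeled.

```python
from typing import Callable, Dict, Union

Value = Union[str, int, float]

# Local stand-ins for the conditions in the table; each takes (field, value).
_CONDITIONS: Dict[str, Callable[[Value, Value], bool]] = {
    "CONTAIN": lambda field, value: str(value) in str(field),
    "START_WITH": lambda field, value: str(field).startswith(str(value)),
    "END_WITH": lambda field, value: str(field).endswith(str(value)),
    # Numeric comparisons; non-numeric input raises ValueError in float().
    "GREATER_THAN": lambda field, value: float(field) > float(value),
    "LESS_THAN": lambda field, value: float(field) < float(value),
}


def matches(condition: str, field_value: Value, filter_value: Value) -> bool:
    """Return True if `field_value` satisfies `condition` against `filter_value`."""
    try:
        check = _CONDITIONS[condition]
    except KeyError:
        raise ValueError(f"unsupported condition: {condition}") from None
    return check(field_value, filter_value)
```

For instance, `matches("START_WITH", "sales_forecast", "sales")` evaluates to `True` under these assumptions.
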
## FAQ
**How do I handle authentication?**
Generate a Personal Access Token in your DataHub instance settings and pass it to `DataHubClient`. See the [Personal Access Token Guide](../../../authentication/personal-access-tokens.md) for details.
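
One way to keep the token out of source code is to read it from the environment, as sketched below. This assumes `DataHubClient` accepts `server` and `token` keyword arguments. The environment variable names `DATAHUB_GMS_URL` and `DATAHUB_GMS_TOKEN` and both helper names are choices made for this sketch.

```python
import os
from typing import Dict, Mapping, Optional


def load_connection_settings(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Read the server URL and access token from environment variables."""
    env = os.environ if env is None else env
    server = env.get("DATAHUB_GMS_URL")
    token = env.get("DATAHUB_GMS_TOKEN")
    if not server or not token:
        raise RuntimeError("Set DATAHUB_GMS_URL and DATAHUB_GMS_TOKEN first")
    return {"server": server, "token": token}


def make_client():
    """Build a client from the environment; requires `acryl-datahub`."""
    # Deferred import so the settings helper works without the SDK installed.
    from datahub.sdk import DataHubClient

    return DataHubClient(**load_connection_settings())
```
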
**Can I combine query and filters?**
Yes. Pass `query` together with `filter` for more precise searches.
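
A short sketch of a combined search, assuming `client.search.get_urns` accepts both `query` and `filter` in one call and that `F.platform` and `F.entity_type` are available on `FilterDsl`. The helper names are hypothetical, and the Snowflake and `forecast` values are only examples.

```python
from typing import List


def snowflake_dataset_filter():
    """Build a filter for datasets on Snowflake; requires `acryl-datahub`."""
    from datahub.sdk.search_filters import FilterDsl as F

    return F.and_(F.platform("snowflake"), F.entity_type("dataset"))


def search_urns(client, query: str, search_filter) -> List[str]:
    """Run a keyword query narrowed by a structured filter."""
    # Both arguments are sent in the same call, so results must satisfy both.
    results = client.search.get_urns(query=query, filter=search_filter)
    return [str(urn) for urn in results]
```

With a connected client, `search_urns(client, "forecast", snowflake_dataset_filter())` would return Snowflake datasets related to `forecast`.
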