# DataHub Quickstart Guide
## Deploying DataHub
To deploy a new instance of DataHub, perform the following steps.
1. Install [docker](https://docs.docker.com/install/) and, if using Linux, [docker-compose](https://docs.docker.com/compose/install/). Make sure to allocate enough hardware resources to the Docker engine. Tested and confirmed configuration: 2 CPUs, 8 GB RAM, 2 GB swap area.
2. Launch the Docker engine from the command line or the desktop app.
3. Install the DataHub CLI
a. Ensure you have Python 3.6+ installed and configured. (Check using `python3 --version`.)
b. Run the following commands in your terminal
```
python3 -m pip install --upgrade pip wheel setuptools
python3 -m pip uninstall datahub acryl-datahub || true # sanity check - ok if it fails
python3 -m pip install --upgrade acryl-datahub
datahub version
```
If you see "command not found", try running CLI commands with the prefix `python3 -m` instead: `python3 -m datahub version`
4. To deploy DataHub, run the following CLI command from your terminal
```
datahub docker quickstart
```
Upon completion of this step, you should be able to navigate to the DataHub UI at [http://localhost:9002](http://localhost:9002) in your browser. You can sign in using `datahub` as both the username and password.
5. To ingest the sample metadata, run the following CLI command from your terminal
```
datahub docker ingest-sample-data
```
That's it! To start pushing your company's metadata into DataHub, take a look at the [Metadata Ingestion Framework](../metadata-ingestion/README.md).
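
Before launching the quickstart, the prerequisites from steps 1 and 3 can be checked from a script. The following is a minimal sketch, not part of the DataHub CLI: the function names `version_at_least`, `check_python`, and `show_docker_resources` are hypothetical, and the 3.6 threshold simply mirrors the Python requirement stated above. It assumes `python3` and `docker` are on your `PATH`.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if a "major.minor[.patch]" version string
# is at least the given major and minor numbers.
version_at_least() {
    version=$1
    want_major=$2
    want_minor=$3
    major=${version%%.*}   # text before the first dot
    rest=${version#*.}     # text after the first dot
    minor=${rest%%.*}      # text before the next dot
    [ "$major" -gt "$want_major" ] ||
        { [ "$major" -eq "$want_major" ] && [ "$minor" -ge "$want_minor" ]; }
}

# Check the version reported by `python3 --version` (e.g. "Python 3.9.7").
check_python() {
    reported=$(python3 --version 2>&1) || return 1
    version_at_least "${reported#Python }" 3 6
}

# Print the CPU count and memory (in bytes) visible to the Docker engine,
# for comparison against the tested configuration.
show_docker_resources() {
    docker info --format 'CPUs: {{.NCPU}}  Memory (bytes): {{.MemTotal}}'
}
```

Running `check_python && show_docker_resources` before `datahub docker quickstart` gives a quick indication of whether the environment matches the tested configuration. The memory figure is only printed, not enforced, because the value Docker reports is usually slightly below the nominal allocation.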
## Resetting DataHub
To clear all of DataHub's state (e.g. before ingesting your own metadata), you can use the CLI `nuke` command.
```
datahub docker nuke
```
## Troubleshooting
### Command not found: datahub
If running the `datahub` CLI produces "command not found" errors in your terminal, your system may be defaulting to an older version of Python. Try prefixing your `datahub` commands with `python3 -m`:
```
python3 -m datahub docker quickstart
```
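
If you switch between environments where the `datahub` executable may or may not be on your `PATH`, the fallback can be wrapped in a small shell function. This is a sketch under that assumption; the name `run_datahub` is hypothetical and not something the DataHub CLI provides.

```shell
#!/bin/sh
# Hypothetical wrapper: use the `datahub` executable when it is on PATH,
# otherwise fall back to invoking the module through python3.
run_datahub() {
    if command -v datahub >/dev/null 2>&1; then
        datahub "$@"
    else
        python3 -m datahub "$@"
    fi
}
```

With this defined in your shell, `run_datahub docker quickstart` behaves like whichever of the two invocations is available.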
### Miscellaneous Docker issues
Docker issues such as conflicting containers and dangling volumes can often be resolved by pruning your Docker state with the following command. Note that by default this command removes stopped containers, unused networks, and dangling images; the `--all` flag extends it to unreferenced images, and the `--volumes` flag to unused volumes.
```
docker system prune
```