These tests verify that, given an index's settings and mappings, data can be written to the index and read back from it with a match_all query. These are very simple sanity tests.
We can, and should, write more complex tests that are specific to each index in the future.
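As an illustration of the shape of such a sanity test, here is a minimal sketch. It does not reproduce the actual tests: `FakeSearchClient` is a hypothetical in-memory stand-in (not a real library), and the index name, settings, and mappings in the usage note are made up for the example.

```python
class FakeSearchClient:
    """Hypothetical in-memory stand-in for a search client (assumption, not a real API)."""

    def __init__(self):
        self._indices = {}

    def create_index(self, name, settings, mappings):
        # Store the configuration alongside an empty document store.
        self._indices[name] = {"settings": settings, "mappings": mappings, "docs": {}}

    def index(self, name, doc_id, document):
        self._indices[name]["docs"][doc_id] = document

    def search(self, name, query):
        # This stand-in only understands a match-all query.
        if query != {"match_all": {}}:
            raise ValueError("unsupported query")
        return list(self._indices[name]["docs"].values())


def check_index_round_trip(client, name, settings, mappings, document):
    """Create the index, write one document, and read it back with match-all."""
    client.create_index(name, settings, mappings)
    client.index(name, "1", document)
    hits = client.search(name, {"match_all": {}})
    return hits == [document]
```

A call such as `check_index_round_trip(FakeSearchClient(), "exampleindex", {"number_of_shards": 1}, {"properties": {"name": {"type": "keyword"}}}, {"name": "example"})` returns `True` when the document written is the only one read back.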
The environment was not set correctly, so mce-cli could not fire Kafka events when run inside docker. It always worked when run outside of docker.
I also added a dev ingestion docker image / script which may be much faster if you've already built locally.
Tested:
1. Cleaned docker volumes and started datahub. Verified it is empty.
2. Built with gradle.
3. Ran ./docker/ingestion/ingestion-dev.sh. Verified data shows in DataHub.
4. Ran step 1 again.
5. Ran ./docker/ingestion/ingestion.sh. Verified data shows in DataHub.
* fix (docker): Fix install of Chrome in frontend Docker image
Retry installing Chrome after dependencies have been installed
* fix (docker): Install Chrome with apt-get
Install Chrome and dependencies at the same time, using apt-get
* Make docker files easier to use during development.
During development it is quite nice to have docker work with locally built code. This allows you to launch all services very quickly, with your changes, and optionally with debugging support.
Changes made to docker files:
- Remove all redundant docker-compose files. We now have 1 giant file, and smaller files to use as overrides.
- Remove redundant README files that provided little information.
- Rename docker/<dir> to match the service name in the docker-compose file for clarity.
- Move environment variables to .env files. We only provide dev / the default environment for quickstart.
- Add debug options to docker files, using multistage builds to produce minimal images, with the idea that locally built files will be mounted instead.
- Add a docker/dev.sh script + compose file to easily use the dev override images (separate tag; images never published; uses debug docker files; mounts binaries to image).
- Add docs/docker documentation for this.
* implement search feature
* add test for dataprocessIndexBuilder; refactor code based on feedback
* update based on PR feedback
* Update DataProcessDocument.pdl
Fixed a typo in the wording.
* add not null check for data process info
* add job info as aspect of a dataset
* add job urn def., aspect and entity
* job entity with upstream and downstream lineage
* use job urn in upstream & downstream
* add Job entity rest APIs
* rest.li api, impl and factory for job entity
* code cleanup
* use pdl; onboard data process entity
* add es index json
* fix gradlew build ignored tasks
* add a comment about data process info field
* fix style warning issues
* update content based on PR
* checked in generated snapshot json
* updated based on PR feedback
* update data process data format
* updated based on code review feedback
* revert back gms & mce-job docker image
* delete temp files
* update based on PR feedback
* fix file name and a typo
* format with linkedin style
Co-authored-by: Liangjun <liajiang@expediagroup.com>
* build(docker): refactor docker build scripts
- add "build" option to docker-compose files to simplify rebuilding of images
- create "start.sh" script so it's easier to override "command" in the quickstart's docker-compose file
- use dockerize to wait for requisite services to start up
- add a dedicated Dockerfile for kafka-setup
This fixes https://github.com/linkedin/datahub/issues/1549 & https://github.com/linkedin/datahub/issues/1550
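For readers unfamiliar with the wait-for-dependencies idea, the sketch below shows roughly what waiting on a TCP endpoint involves. It is not dockerize's implementation, only an illustration of the concept; the function name and the default timeout and interval are assumptions.

```python
import socket
import time


def wait_for_tcp(host, port, timeout_s=30.0, interval_s=0.5):
    """Return True once host:port accepts a TCP connection, False on timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # A successful connect means the service is at least listening.
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            # Refused or timed out: give up after the deadline, else retry.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval_s)
```

A start script could call something like `wait_for_tcp("broker", 9092)` before launching a service that depends on Kafka; the host name and port here are placeholders. Note that a listening port does not guarantee the service is fully ready.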
- add "build" option to docker-compose file to simplify rebuilding of images
- move command from docker-compose.yml to Dockerfile
- add ingestion.sh script to simplify quickstart instruction and to reduce confusion
We now use GitHub Actions to build & publish docker images to Docker Hub under the linkedin org
Also allow overriding image tags via DATAHUB_VERSION environment variable
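The override amounts to "use DATAHUB_VERSION if it is set, otherwise fall back to a default tag". A minimal sketch of that lookup follows; the `latest` fallback, the function name, and the service name in the usage note are assumptions for illustration, not taken from the compose files.

```python
import os


def image_reference(service, default_version="latest", env=None):
    """Build an image reference, letting DATAHUB_VERSION override the tag."""
    env = os.environ if env is None else env
    # An unset or empty variable falls back to the default tag (assumed "latest").
    version = env.get("DATAHUB_VERSION") or default_version
    return f"linkedin/{service}:{version}"
```

For example, with `DATAHUB_VERSION=v1` in the environment, `image_reference("example-service")` yields `linkedin/example-service:v1`. In a compose file the same effect is typically achieved with variable substitution such as `${DATAHUB_VERSION:-latest}`.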
* Convert MAE to Spring boot
* Fix after testing
* Changes after testing
* Add file appender for gms and update doc type for ESv5.6
* Review comments
* Fix Review Comments
1. Use "source" when executing quickstart.sh to bring its env vars into the context of the calling shell
2. Use sudo while running chown in quickstart.sh
3. Update main readme
4. Add missing container names
* feature: 🐳 - Allow storing Quickstart dockers data in a folder for persistence
* bump: Update Kafka dockers to 5.4.0
* feature: 🐳 - Add kafka-topics-ui docker to the kafka docker folder
* refactor: Provide a quickstart.sh script to start all dockers