DataHub uses Gradle's [dependency locking](https://docs.gradle.org/current/userguide/dependency_locking.html) to ensure reproducible builds across all environments. Dependency locks guarantee that the exact same dependency versions are used everywhere - in development, CI, and production - preventing unexpected behavior from transitive dependency updates.
#### Why Dependency Locking?
- **Reproducible Builds**: Same dependency versions across all environments
- **Security**: Prevents unexpected transitive dependency updates that could introduce vulnerabilities
- **Stability**: Explicit control over when dependencies change
- **Audit Trail**: Lock files in git provide clear history of dependency changes
#### How It Works
Each sub-project has a `gradle.lockfile` that records the exact version of every direct and transitive dependency used across all configurations (compile, runtime, test, etc.). When you build the project, Gradle uses these locked versions instead of resolving the latest versions.
Note: This only locks configurations that get resolved by the `dependencies` task. For complete coverage, use `resolveAndLockAll` instead.
#### Common Workflows
**Adding a new dependency:**
1. Add the dependency to the appropriate `build.gradle` file
2. Run `./gradlew resolveAndLockAll --write-locks`
3. Verify the build works: `./gradlew :project-name:build`
4. Commit both the `build.gradle` change and updated lock files
**Updating an existing dependency:**
1. Change the version in `build.gradle`
2. Run `./gradlew resolveAndLockAll --write-locks`
3. Review the lock file changes to see what transitive dependencies changed
4. Test thoroughly, especially if major version upgrades
5. Commit the changes
**After pulling changes:**
If someone else updated dependencies and you pull their changes, just build normally:
```bash
./gradlew build
```
Gradle will automatically use the locked versions from the updated lock files. You don't need to regenerate locks unless you're making your own dependency changes.
#### Troubleshooting
**Build fails with dependency resolution error:**
- Ensure you've run `./gradlew resolveAndLockAll --write-locks` after dependency changes
- Check that lock files are committed and up to date
- Try `./gradlew --refresh-dependencies build` to refresh Gradle's cache
**Lock file is out of sync:**
If you see warnings about lock state, regenerate the locks:
A full restart using `./gradlew quickstartDebug` is recommended if there are significant changes and the setup/system update containers need to be run again.
Please note that we do not actively support Windows in a non-virtualized environment.
If you must use Windows, one workaround is to build within a virtualized environment, such as a VM(Virtual Machine) or [WSL(Windows Subsystem for Linux)](https://learn.microsoft.com/en-us/windows/wsl).
This approach can help ensure that your build environment remains isolated and stable, and that your code is compiled correctly.
While it may be possible to build and run DataHub using newer versions of Java, we currently only support [Java 17](https://openjdk.org/projects/jdk/17/) (aka Java 17).
You can install multiple version of Java on a single machine and switch between them using the `JAVA_HOME` environment variable. See [this document](https://docs.oracle.com/cd/E21454_01/html/821-2531/inst_jdk_javahome_t.html) for more details.
#### `:metadata-models:generateDataTemplate` task fails with `java.nio.file.InvalidPathException: Illegal char <:> at index XX` or `Caused by: java.lang.IllegalArgumentException: 'other' has different root` error
This is a [known issue](https://github.com/linkedin/rest.li/issues/287) when building the project on Windows due a bug in the Pegasus plugin. Please refer to [Windows Compatibility](/docs/developers.md#windows-compatibility).
As we generate quite a few files from the models, it is possible that old generated files may conflict with new model changes. When this happens, a simple `./gradlew clean` should resolve the issue.
This generally means that an [incompatible change](https://linkedin.github.io/rest.li/modeling/compatibility_check) was introduced to the rest.li API in GMS. You'll need to rebuild the snapshots/IDL by running the following command once
#### `:buildSrc:compileJava` task fails with `package com.linkedin.metadata.models.registry.config does not exist` and `cannot find symbol` error for `Entity`
There are currently two symbolic links within the [buildSrc](https://github.com/datahub-project/datahub/tree/master/buildSrc) directory for the [com.linkedin.metadata.aspect.plugins.config](https://github.com/datahub-project/datahub/blob/master/buildSrc/src/main/java/com/linkedin/metadata/aspect/plugins/config) and [com.linkedin.metadata.models.registry.config](https://github.com/datahub-project/datahub/blob/master/buildSrc/src/main/java/com/linkedin/metadata/models/registry/config) packages, which points to the corresponding packages in the [entity-registry](https://github.com/datahub-project/datahub/tree/master/entity-registry) subproject.
When the repository is checked out using Windows 10/11 - even if WSL is later used for building using the mounted Windows filesystem in `/mnt/` - the symbolic links might have not been created correctly, instead the symbolic links were checked out as plain files. Although it is technically possible to use the mounted Windows filesystem in `/mnt/` for building in WSL, it is **strongly recommended** to checkout the repository within the Linux filesystem (e.g., in `/home/`) and building it from there, because accessing the Windows filesystem from Linux is relatively slow compared to the Linux filesystem and slows down the whole building process.
To be able to create symbolic links in Windows 10/11 the [Developer Mode](https://learn.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development) has to be enabled first. Then the following commands can be used to enable [symbolic links in Git](https://git-scm.com/docs/git-config#Documentation/git-config.txt-coresymlinks) and recreating the symbolic links:
# reset the current branch to recreate the missing symbolic links (alternatively it is also possibly to switch branches away and back)
git reset --hard
```
See also [here](https://stackoverflow.com/questions/5917249/git-symbolic-links-in-windows/59761201#59761201) for more information on how to enable symbolic links on Windows 10/11 and Git.
This test ensures all configuration properties are explicitly classified as either sensitive (redacted) or non-sensitive (visible in system info). It prevents accidental exposure of secrets through DataHub's system information endpoints.
**When you add new configuration properties:**
1. The test will fail if your property is unclassified
2. Follow the test failure message to add your property to the appropriate classification list
3. When in doubt, classify as sensitive - it's safer to over-redact than expose secrets
Refer to the test file itself for comprehensive documentation on classification lists, template syntax, and examples. This is a mandatory security guardrail that protects against credential leaks.