Mirror of https://github.com/datahub-project/datahub.git, synced 2025-10-31 10:49:00 +00:00

# Policies Guide

## Introduction

DataHub provides the ability to declare fine-grained access control Policies via the UI & GraphQL API.
Access policies in DataHub define *who* can *do what* to *which resources*. A few policies in plain English include:

- Dataset Owners should be allowed to edit documentation, but not Tags.
- Jenny, our Data Steward, should be allowed to edit Tags for any Dashboard, but no other metadata.
- James, a Data Analyst, should be allowed to edit the Links for a specific Data Pipeline he is a downstream consumer of.
- The Data Platform team should be allowed to manage users & groups, view platform analytics, & manage policies themselves.

In this document, we'll take a deeper look at DataHub Policies & how to use them effectively.

## What is a Policy?

There are 2 types of Policy within DataHub:

1. Platform Policies
2. Metadata Policies

We'll briefly describe each.

### Platform Policies

**Platform** policies determine who has platform-level privileges on DataHub. These privileges include:

- Managing Users & Groups
- Viewing the DataHub Analytics Page
- Managing Policies themselves

Platform policies can be broken down into 2 parts:

1. **Actors**: Who the policy applies to (Users or Groups)
2. **Privileges**: Which privileges should be assigned to the Actors (e.g. "View Analytics")

Note that platform policies do not include a specific "target resource" against which the Policies apply. Instead,
they simply serve to assign specific privileges to DataHub users and groups.

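To make the two-part structure concrete, here is a minimal Python sketch of a platform policy as a set of Actors plus a set of Privileges, with no target resource. It is an illustration only, not DataHub's implementation or API: the `PlatformPolicy` class, the `has_platform_privilege` helper, the group name, and the privilege identifiers other than `MANAGE_POLICIES` are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class PlatformPolicy:
    """Hypothetical model: Actors and Privileges, with no target resource."""
    name: str
    users: frozenset = field(default_factory=frozenset)
    groups: frozenset = field(default_factory=frozenset)
    privileges: frozenset = field(default_factory=frozenset)


def has_platform_privilege(policies, user, user_groups, privilege):
    """Return True if any policy grants `privilege` to the user or one of their groups."""
    for policy in policies:
        is_actor = user in policy.users or bool(policy.groups & set(user_groups))
        if is_actor and privilege in policy.privileges:
            return True
    return False


# Illustrative policy for the "Data Platform team" example from the introduction.
# The group name and the first two privilege names are assumed for this sketch.
platform_team = PlatformPolicy(
    name="Data Platform team administration",
    groups=frozenset({"data-platform"}),
    privileges=frozenset({"MANAGE_USERS_AND_GROUPS", "VIEW_ANALYTICS", "MANAGE_POLICIES"}),
)

print(has_platform_privilege([platform_team], "alex", ["data-platform"], "VIEW_ANALYTICS"))
print(has_platform_privilege([platform_team], "jenny", ["data-stewards"], "VIEW_ANALYTICS"))
```

In this sketch, a member of the `data-platform` group is granted the privilege and a user outside it is not, so the two calls print `True` and `False` respectively.
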
### Metadata Policies

**Metadata** policies determine who can do what to which Metadata Entities. For example:

- Who can edit Dataset Documentation & Links?
- Who can add Owners to a Chart?
- Who can add Tags to a Dashboard?

and so on.

A Metadata Policy can be broken down into 3 parts:

1. **Actors**: The 'who'. The specific users or groups that the policy applies to.
2. **Privileges**: The 'what'. The actions permitted by the policy, e.g. "Add Tags".
3. **Resources**: The 'which'. The resources that the policy applies to, e.g. "All Datasets".

> Today, the set of supported privileges includes only *write* privileges. That is, there are no read restrictions implemented yet.

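The who / what / which breakdown can be sketched as a small data structure plus a matching function. The following Python is a hedged illustration, not DataHub's evaluation logic: the `MetadataPolicy` class, the `is_allowed` helper, the privilege names (`EDIT_TAGS`, `EDIT_OWNERS`), the example URNs, and the choice to deny an action when no policy matches are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class MetadataPolicy:
    """Hypothetical model of the three parts: Actors, Privileges, Resources."""
    users: FrozenSet[str]
    groups: FrozenSet[str]
    privileges: FrozenSet[str]
    resource_type: str  # e.g. "dataset" or "dashboard" (assumed labels)
    resource_urns: Optional[FrozenSet[str]] = None  # None means all resources of this type


def is_allowed(policies, user, user_groups, privilege, resource_type, resource_urn):
    """Grant the action if at least one policy matches actor, privilege, and resource."""
    for p in policies:
        actor_ok = user in p.users or any(g in p.groups for g in user_groups)
        privilege_ok = privilege in p.privileges
        resource_ok = p.resource_type == resource_type and (
            p.resource_urns is None or resource_urn in p.resource_urns
        )
        if actor_ok and privilege_ok and resource_ok:
            return True
    return False  # assumption for this sketch: no matching policy means the action is denied


# "Jenny, our Data Steward, should be allowed to edit Tags for any Dashboard,
# but no other metadata."
jenny_tags = MetadataPolicy(
    users=frozenset({"jenny"}),
    groups=frozenset(),
    privileges=frozenset({"EDIT_TAGS"}),
    resource_type="dashboard",
)

dashboard = "urn:li:dashboard:(looker,sales_overview)"  # made-up URN
print(is_allowed([jenny_tags], "jenny", [], "EDIT_TAGS", "dashboard", dashboard))
print(is_allowed([jenny_tags], "jenny", [], "EDIT_OWNERS", "dashboard", dashboard))
print(is_allowed([jenny_tags], "jenny", [], "EDIT_TAGS", "dataset", "urn:li:dataset:example"))
```

Only the first call matches on all three parts, so the sketch prints `True`, then `False` twice: the second call fails on the privilege and the third on the resource type.
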
## Managing Policies

Policies can be managed on the `/policies` page, which is also accessible from the Control Center, a slide-out menu
on the left side of the DataHub UI. The `Policies` tab is visible only to users who have the `MANAGE_POLICIES` privilege.

Out of the box, DataHub is deployed with a set of pre-baked Policies. The set of default policies is created at deploy
time and can be found inside the `policies.json` file within `metadata-service/war/src/main/resources/boot`. This set of policies serves the
following purposes:

1. Assigns immutable super-user privileges for the root `datahub` user account (Immutable)
2. Assigns all Platform privileges for all Users by default (Editable)

The reason for #1 is to prevent people from accidentally deleting all policies and getting locked out (the `datahub` super user account can serve as a backup).
The reason for #2 is to permit administrators to log in via OIDC or another means outside of the `datahub` root account
when they are bootstrapping with DataHub. This way, those setting up DataHub can start managing policies without friction.
Note that these privileges *can* and likely *should* be altered inside the **Policies** page of the UI.

> Pro-Tip: To log in using the `datahub` account, simply navigate to `<your-datahub-domain>/login` and enter `datahub`, `datahub` as the username and password. Note that the password can be customized for your
deployment by changing the `user.props` file within the `datahub-frontend` module. Notice that JAAS authentication must be enabled.

## Configuration

By default, the Policies feature is *enabled*. This means that the deployment will support creating, editing, removing, and
most importantly enforcing fine-grained access policies.

In some cases, these capabilities are not desirable. For example, if your company's users are already used to having free rein, you
may want to keep it that way. Or perhaps it is only your Data Platform team who actively uses DataHub, in which case Policies may be overkill.

For these scenarios, we've provided a way to disable Policies in your deployment of DataHub. This will completely hide
the policies management UI and by default will allow all actions on the platform. It will be as though
each user has *all* privileges, both of the **Platform** & **Metadata** flavor.

To disable Policies, you can simply set the `AUTH_POLICIES_ENABLED` environment variable for the `datahub-gms` service container
to `false`. For example, in your `docker/datahub-gms/docker.env`, you'd place:

```
AUTH_POLICIES_ENABLED=false
```

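The effect of the flag described above can be pictured with a short Python sketch: when Policies are disabled every action is allowed, otherwise the decision is deferred to the policy check. This is not how `datahub-gms` is implemented. The `policies_enabled` and `authorize` helpers are hypothetical, and the parsing rule (anything other than `false` counts as enabled) is an assumption based only on the documented default.

```python
import os


def policies_enabled(env=None):
    """Read AUTH_POLICIES_ENABLED; anything other than "false" counts as enabled (assumed rule)."""
    env = os.environ if env is None else env
    return env.get("AUTH_POLICIES_ENABLED", "true").strip().lower() != "false"


def authorize(check_policies, env=None):
    """Allow everything when Policies are disabled; otherwise defer to the policy check."""
    if not policies_enabled(env):
        return True
    return check_policies()


def deny_all():
    """Stand-in policy check that rejects every action."""
    return False


print(authorize(deny_all, env={"AUTH_POLICIES_ENABLED": "false"}))
print(authorize(deny_all, env={}))
```

With the variable set to `false`, even a check that would deny everything is bypassed, so the first call prints `True`. With the variable unset, the feature stays enabled and the second call prints `False`.
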
## Coming Soon

The DataHub team is hard at work trying to improve the Policies feature. We are planning on building out the following:

- Hide edit action buttons on Entity pages to reflect user privileges

Under consideration:

- Ability to define Metadata Policies against multiple resources scoped to a particular "Domain"
- Ability to define Metadata Policies against multiple resources scoped to particular "Containers" (e.g. a "schema", "database", or "collection")

## Feedback / Questions / Concerns

We want to hear from you! For any inquiries, including Feedback, Questions, or Concerns, reach out on Slack!
