8.7 KiB
OpenMetadata Threat Model
This document outlines the security threat model for OpenMetadata, a unified metadata platform for data discovery, observability, and governance.
Important Scope Notice
OpenMetadata is a metadata-only platform. It does not:
- Store, process, or transmit actual data
- Execute queries against production data sources
- Modify data in connected systems
- Provide direct data access capabilities
OpenMetadata exclusively manages metadata - information about data, such as table schemas, column descriptions, data lineage, ownership, and quality metrics. This metadata-only architecture significantly reduces the attack surface and potential impact of security incidents.
Overview
While OpenMetadata does not handle actual data, it only manages metadata that provides insights into an organization's data architecture, which requires appropriate security controls. This threat model identifies potential risks specific to a metadata management platform.
Asset Inventory
Primary Assets (Metadata Only)
-
Business Metadata
- Table and column descriptions
- Business glossary terms
- Data ownership information
- Team structures and responsibilities
- Tags and classifications
-
Technical Metadata
- Database and table schemas (structure only)
- Column names and data types
- Data lineage relationships
- Data quality metrics and test results
- Service topology information
-
Operational Metadata
- Ingestion pipeline configurations
- Connection configurations (credentials for metadata extraction only)
- User activity logs
- Search queries and access patterns
-
Platform Components
- OpenMetadata Server (REST APIs)
- Web UI (React application)
- Ingestion Framework (metadata collectors)
- Backend Database (MySQL/PostgreSQL)
- Search Index (Elasticsearch/OpenSearch)
Threat Actors
External Threats
- Unauthorized Users: Attempting to discover information about data architecture
- Competitors: Seeking insights into data organization and business processes
- Reconnaissance Actors: Gathering information for potential future attacks on actual data systems
Internal Threats
- Curious Employees: Accessing metadata beyond their authorized scope
- Departing Employees: Exporting organizational knowledge
- Compromised Accounts: Valid accounts used inappropriately
Threat Categories
1. Information Disclosure Threats
T1.1: Metadata Reconnaissance
- Risk: Exposure of data architecture and organization structure
- Impact: Provides blueprint for potential attacks on actual data systems
- Attack Vectors:
- Unauthorized API access
- Search interface exploitation
- Excessive permissions
T1.2: Connection Configuration Exposure
- Risk: Leakage of service endpoints and authentication methods
- Impact: Reveals connection patterns but not data access
- Attack Vectors:
- Configuration file access
- API endpoint enumeration
- Log file exposure
T1.3: Business Process Disclosure
- Risk: Exposure of data flows and business logic through lineage
- Impact: Competitive disadvantage or process manipulation
- Attack Vectors:
- Lineage graph traversal
- Glossary term extraction
- Ownership mapping
2. Authentication & Authorization Threats
T2.1: Unauthorized Metadata Access
- Risk: Users accessing metadata outside their domain
- Impact: Information leakage across business units
- Attack Vectors:
- Privilege escalation
- Role misconfiguration
- Token manipulation
T2.2: Authentication Bypass
- Risk: Anonymous access to metadata
- Impact: Full metadata exposure
- Attack Vectors:
- JWT vulnerabilities
- SSO misconfiguration
- Default credentials
3. Data Integrity Threats
T3.1: Metadata Tampering
- Risk: Modification of metadata leading to confusion or misdirection
- Impact: Incorrect business decisions based on false metadata
- Attack Vectors:
- API manipulation
- Direct database access
- Ingestion pipeline compromise
T3.2: Lineage Manipulation
- Risk: False data lineage information
- Impact: Compliance violations or incorrect impact analysis
- Attack Vectors:
- Ingestion tampering
- API exploitation
4. Availability Threats
T4.1: Service Disruption
- Risk: OpenMetadata platform unavailability
- Impact: Inability to discover or govern data assets
- Attack Vectors:
- DoS attacks
- Resource exhaustion
- Database overload
T4.2: Search Index Corruption
- Risk: Search functionality failure
- Impact: Degraded data discovery capabilities
- Attack Vectors:
- Index poisoning
- Bulk operation abuse
5. Supply Chain Threats
T5.1: Dependency Vulnerabilities
- Risk: Vulnerabilities in third-party libraries
- Impact: Platform compromise
- Attack Vectors:
- Known CVEs in dependencies
- Dependency confusion attacks
T5.2: Connector Compromise
- Risk: Malicious metadata ingestion connectors
- Impact: False metadata or reconnaissance
- Attack Vectors:
- Unofficial connector installation
- Connector impersonation
Mitigations
M1: Access Control
- Make sure OpenMetadata hosted as your organization's internal tooling. You don't need to expose to public internet. Lock it behind your company's VPN
- Implement Role-Based Access Control (RBAC) for metadata domains
- Use team-based metadata visibility
- Regular access review and certification
- Principle of least privilege for metadata access
- Audit logging of all metadata access
M2: Authentication Security
- Enforce SSO/SAML for enterprise deployments
- Multi-factor authentication for administrative access
- Regular token rotation
- Session timeout policies
- Strong password requirements
M3: Connection Security
- Encrypt stored connection configurations
- Use secret management systems for credentials
- Implement credential rotation
- Limit ingestion permissions to read-only metadata operations
- Network isolation for ingestion pipelines
M4: Platform Hardening
- Regular security updates
- Dependency vulnerability scanning
- Container security scanning
- Secure default configurations
- Disable unnecessary features
M5: Monitoring & Auditing
- Comprehensive audit logs for metadata changes
- Anomaly detection for unusual access patterns
- Real-time alerting for security events
- Regular security assessments
- Compliance reporting
M6: Data Protection
- TLS for all network communications
- Encryption at rest for sensitive configurations
- Secure backup procedures
- Data retention policies
- Secure deletion practices
Residual Risks
Even with mitigations, these risks remain:
- Zero-day vulnerabilities in platform components
- Social engineering targeting metadata administrators
- Advanced persistent threats conducting long-term reconnaissance
- Insider threats with legitimate access
- Supply chain compromises in dependencies
Security Best Practices
For Deployment
- Deploy in isolated network segments
- Use read-only service accounts for metadata ingestion
- Implement network policies restricting egress
- Regular security patching schedule
- Monitoring and alerting setup
For Configuration
- Disable default accounts
- Configure appropriate session timeouts
- Enable audit logging
- Set up automated backups
- Implement rate limiting
For Operations
- Regular access reviews
- Security awareness training
- Incident response procedures
- Change management processes
- Documentation of security configurations
Compliance Considerations
While OpenMetadata doesn't store actual data, consider:
- Metadata Privacy: Some metadata might be considered sensitive
- Access Logging: Required for compliance audits
- Change Tracking: Metadata modification history
- Data Governance: Using OpenMetadata to support compliance programs
Incident Response
For security incidents involving OpenMetadata:
- Contain: Isolate affected systems
- Assess: Determine scope of metadata exposure
- Notify: Inform stakeholders of potential information disclosure
- Remediate: Patch vulnerabilities and rotate credentials
- Review: Update security controls based on lessons learned
Conclusion
OpenMetadata's metadata-only architecture inherently limits security risks compared to data platforms. The primary concern is unauthorized information disclosure about data architecture rather than data breach. By implementing appropriate access controls and monitoring, organizations can safely leverage OpenMetadata for data discovery and governance while maintaining security.
References
Last Updated: 2025-01-28 Version: 1.0