datahub/docs/developers/java-sdk-v2-design.md

# DataHub Java SDK V2 Design Document

## Executive Summary

This document describes the design of DataHub Java SDK V2, a modern, user-friendly Java client library that provides feature parity with the Python SDK V2. The new SDK addresses feedback from enterprise Java customers who require a first-class SDK experience comparable to Python developers.

This document is organized into two main sections:

- **Part 1 - User-Facing API Design**: The public API, patterns, and behaviors visible to SDK users
- **Part 2 - Developer-Facing Implementation**: Internal architecture and implementation details for contributors

> **Why Hand-Crafted?** For a deep dive into why we chose to hand-craft this SDK instead of using OpenAPI code generation, see [Java SDK V2 Philosophy](java-sdk-v2-philosophy.md).

## Background

### Problem Statement

Currently, DataHub's Java SDK (`datahub-client`) provides only low-level emission capabilities:

- Manual MCP (Metadata Change Proposal) construction required
- No high-level entity builders for Dataset, Chart, Dashboard, etc.
- No client for CRUD operations (read, update, delete)
- No patch capabilities for granular updates
- Significantly inferior developer experience compared to Python SDK V2

This gap has created issues with enterprise customers, particularly Java shops who feel like "second-class citizens" when compared to Python developers.

### Goals

1. **Feature Parity**: Match Python SDK V2 capabilities for entity management
2. **Backward Compatibility**: Maintain 100% compatibility with existing Java SDK
3. **Namespace Separation**: Use `datahub.client.v2.*` namespace for new APIs
4. **Builder Pattern**: Fluent, type-safe API for entity construction
5. **Patch Support**: Granular updates without full entity replacement
6. **CRUD Operations**: Support create, read, update, upsert operations (delete/exists deferred)
7. **Comprehensive Testing**: Unit and integration tests validating all functionality

### Non-Goals

- Rewriting existing emitter infrastructure (leverage existing)
- 100% feature parity with Python SDK (focus on core entities first)
- GraphQL client implementation (focus on REST/OpenAPI)
- Search client (future enhancement)
- Lineage client (future enhancement)

---

# Part 1: User-Facing API Design

This section describes the public API that SDK users interact with - the patterns, behaviors, and interfaces that define the developer experience.

## Design Principles

### 1. Fluent Builder Pattern

Intuitive entity construction through method chaining:

```java
Dataset dataset = Dataset.builder()
    .platform("snowflake")
    .name("my_table")
    .env("PROD")
    .description("My dataset")
    .build();

// Fluent metadata operations with type-safe method chaining
dataset.addTag("pii")
       .addOwner("urn:li:corpuser:jdoe", OwnershipType.TECHNICAL_OWNER)
       .setDomain("urn:li:domain:Analytics")
       .setStructuredProperty("io.acryl.dataQuality.qualityScore", 95.5);

client.entities().upsert(dataset);
```

### 2. Type Safety and Compile-Time Checking

Leverage Java's strong typing:

- Strongly-typed URNs (`DatasetUrn`, `ChartUrn`, etc.)
- Generic types for entity operations
- CRTP (Curiously Recurring Template Pattern) for type-safe mixin interfaces
- Builder validation at construction time

### 3. Mode-Aware Behavior

**SDK Mode vs INGESTION Mode** for proper separation of concerns:

- **SDK Mode (default)**: User edits → `editableDatasetProperties`
- **INGESTION Mode**: Pipeline writes → `datasetProperties`
- Getters intelligently prefer editable aspects over system aspects

```java
// SDK mode - user edits go to editable aspects
DataHubClientV2 client = DataHubClientV2.builder()
    .server("http://localhost:8080")
    .mode(OperationMode.SDK)  // Default
    .build();

// INGESTION mode - pipeline writes go to system aspects
DataHubClientV2 ingestionClient = DataHubClientV2.builder()
    .server("http://localhost:8080")
    .mode(OperationMode.INGESTION)
    .build();
```

### 4. Patch-First Philosophy

**Design Decision: Prioritize patches over full aspect replacement**

The SDK V2 is designed around patch-based operations because they represent the most common and intuitive way to make metadata changes:

```java
Dataset dataset = client.entities().get(datasetUrn);
Dataset mutable = dataset.mutable();  // Get mutable copy

// These create patches internally - no server calls yet
mutable.addTag("pii")
       .addTag("sensitive")
       .addOwner(ownerUrn, OwnershipType.TECHNICAL_OWNER);

// Single call emits all accumulated patches atomically
client.entities().update(mutable);
```

**Why patches?**

- **Simplicity**: Users think "add a tag" not "fetch all tags, add one, PUT entire tag aspect back"
- **Safety**: Patches don't overwrite concurrent changes from other users
- **Efficiency**: Only changed fields are transmitted and processed
- **Common use case**: Most metadata operations are incremental additions/removals

**When to use low-level SDK:**
If you need to completely replace an aspect (full PUT/upsert semantics), use the V1 SDK's `RestEmitter` directly with `MetadataChangeProposalWrapper`. The V2 SDK focuses on making common operations simple, not exposing every low-level primitive.

### 5. Composition Through Mixin Interfaces

Shared metadata operations via type-safe mixin interfaces:

- `HasTags<T>` - Add, remove, set tags
- `HasOwners<T>` - Manage ownership
- `HasGlossaryTerms<T>` - Associate glossary terms
- `DomainOperations<T>` - Domain assignment
- `HasContainer<T>` - Parent-child hierarchies

All mixins use CRTP pattern for type-safe method chaining that returns the concrete entity type.

## Architecture

### Package Structure (Actual Implementation)

```
datahub-client/
├── src/main/java/
│   ├── datahub/client/                    # Existing v1 (unchanged)
│   │   ├── Emitter.java
│   │   ├── rest/RestEmitter.java
│   │   └── ...
│   │
│   └── datahub/client/v2/                 # New v2 namespace
│       ├── DataHubClientV2.java           # Main client entry point
│       │
│       ├── entity/                        # Entity classes
│       │   ├── Entity.java                # Base entity class (490 lines)
│       │   ├── AspectCache.java           # Unified cache with dirty tracking (184 lines)
│       │   ├── CachedAspect.java          # Aspect wrapper with metadata (68 lines)
│       │   ├── AspectSource.java          # SERVER vs LOCAL enum (23 lines)
│       │   ├── ReadMode.java              # ALLOW_DIRTY vs SERVER_ONLY (28 lines)
│       │   ├── Dataset.java               # Dataset entity (564 lines)
│       │   ├── Chart.java                 # Chart entity (587 lines)
│       │   ├── Dashboard.java             # Dashboard entity (671 lines)
│       │   ├── DataJob.java               # DataJob entity (597 lines)
│       │   ├── DataFlow.java              # DataFlow entity (467 lines)
│       │   ├── Container.java             # Container entity (500 lines)
│       │   ├── MLModel.java               # ML Model entity NEW
│       │   ├── MLModelGroup.java          # ML Model Group entity NEW
│       │   ├── HasTags.java               # Tag operations mixin
│       │   ├── HasOwners.java             # Ownership operations mixin
│       │   ├── HasGlossaryTerms.java      # Terms operations mixin
│       │   ├── HasDomains.java            # Domain operations mixin
│       │   ├── HasContainer.java          # Container hierarchy mixin
│       │   └── HasStructuredProperties.java # Structured properties mixin
│       │
│       ├── operations/                    # CRUD operation clients
│       │   └── EntityClient.java          # Entity CRUD operations (570 lines)
│       │
│       └── config/                        # Configuration
│           └── DataHubClientConfigV2.java # Config with mode support
│
└── src/test/java/                         # Tests mirror structure
    └── datahub/client/v2/
        ├── DataHubClientV2Test.java       # Client tests
        ├── entity/                        # 378 unit tests
        │   ├── AspectCacheTest.java       # 30 tests (cache infrastructure)
        │   ├── CachedAspectTest.java      # 13 tests (cache infrastructure)
        │   ├── DatasetTest.java           # 37 tests
        │   ├── ChartTest.java             # 43 tests
        │   ├── DashboardTest.java         # 52 tests
        │   ├── DataJobTest.java           # 45 tests
        │   ├── DataFlowTest.java          # 40 tests
        │   ├── ContainerTest.java         # 40 tests
        │   ├── MLModelTest.java           # 44 tests
        │   └── MLModelGroupTest.java      # 38 tests
        └── integration/                   # 79 integration tests
            ├── DatasetIntegrationTest.java
            ├── ChartIntegrationTest.java
            ├── DashboardIntegrationTest.java
            ├── DataJobIntegrationTest.java
            ├── DataFlowIntegrationTest.java
            ├── ContainerIntegrationTest.java
            ├── MLModelIntegrationTest.java
            └── MLModelGroupIntegrationTest.java
```

**Key Design Decisions:**

- No separate `patch/` package - patches accumulate internally within entities
- Mixin interfaces in `entity/` package using CRTP pattern for type safety
- Support for 8 entity types including ML entities (MLModel, MLModelGroup)
- Mode-aware configuration for SDK vs INGESTION behavior

### Core Classes

#### 1. DataHubClientV2 (Main Entry Point)

**File**: `metadata-integration/java/datahub-client/src/main/java/datahub/client/v2/DataHubClientV2.java` (266 lines)

```java
package datahub.client.v2;

/**
 * Main entry point for DataHub Java SDK V2.
 * Provides high-level operations for entity management with mode-aware behavior.
 *
 * <p>Example usage:
 * <pre>
 * DataHubClientV2 client = DataHubClientV2.builder()
 *     .server("http://localhost:8080")
 *     .token("my-token")
 *     .mode(OperationMode.SDK)  // SDK or INGESTION mode
 *     .build();
 *
 * Dataset dataset = Dataset.builder()
 *     .platform("snowflake")
 *     .name("my_table")
 *     .env("PROD")
 *     .description("My dataset")
 *     .build();
 *
 * client.entities().upsert(dataset);
 * </pre>
 */
public class DataHubClientV2 implements AutoCloseable {
    private final RestEmitter emitter;
    private final DataHubClientConfigV2 config;
    private final EntityClient entityClient;

    // Builder for client configuration
    public static Builder builder() { ... }

    // Entity operations
    public EntityClient entities() { return entityClient; }

    // Low-level emitter access (for advanced users)
    public RestEmitter emitter() { return emitter; }

    // Configuration access
    public DataHubClientConfigV2 config() { return config; }

    @Override
    public void close() throws IOException { ... }

    public static class Builder {
        public Builder server(String serverUrl) { ... }
        public Builder token(String token) { ... }
        public Builder timeout(int timeoutMs) { ... }
        public Builder mode(OperationMode mode) { ... }  // NEW
        public Builder config(DataHubClientConfigV2 config) { ... }
        public DataHubClientV2 build() { ... }
    }
}
```

**Design Features:**

- Mode-aware behavior (SDK vs INGESTION) for proper aspect routing
- Environment variable support for configuration
- Builder pattern with sensible defaults
- AutoCloseable interface for resource management

#### 2. Entity (Base Class) - User-Facing API

**File**: `metadata-integration/java/datahub-client/src/main/java/datahub/client/v2/entity/Entity.java` (490 lines)

The Entity base class provides a unified interface for all DataHub entities. From a user perspective, all entities support:

**Public API Methods:**

```java
// URN access
public Urn getUrn()
public abstract String getEntityType()

// Convert to MCPs for emission (primarily internal)
public List<MetadataChangeProposalWrapper> toMCPs()
```

**Entity Construction:**

Entities are constructed via fluent builders:

```java
Dataset dataset = Dataset.builder()
    .platform("snowflake")
    .name("my_table")
    .env("PROD")
    .description("My dataset")
    .build();
```

**Fluent Metadata Operations:**

All entities support method chaining for metadata operations (via mixin interfaces):

```java
dataset.addTag("pii")
       .addOwner(ownerUrn, OwnershipType.TECHNICAL_OWNER)
       .setDomain(domainUrn)
       .addTerm(termUrn);
```

**Lazy Loading:**

Entities loaded from the server fetch aspects on-demand:

```java
Dataset dataset = client.entities().get(datasetUrn);  // Only URN loaded
String description = dataset.getDescription();         // Aspect fetched now
List<String> tags = dataset.getTags();                // Another aspect fetch
```

**Patch Accumulation:**

Metadata operations create patches that accumulate until save:

```java
Dataset dataset = client.entities().get(datasetUrn);
Dataset mutable = dataset.mutable();  // Get mutable copy
mutable.addTag("pii");           // Creates patch (not sent yet)
mutable.addTag("sensitive");     // Another patch (not sent yet)
client.entities().update(mutable); // Emits all patches atomically
```

**Immutability-by-Default:**

Entities fetched from the server are read-only to prevent accidental mutations:

```java
Dataset dataset = client.entities().get(datasetUrn);
dataset.isReadOnly();  // true
dataset.isMutable();   // false

// Attempting mutation throws ReadOnlyEntityException
// dataset.addTag("pii");  // ERROR!

// Get mutable copy for updates
Dataset mutable = dataset.mutable();
mutable.isMutable();  // true
mutable.addTag("pii");  // Works
client.entities().upsert(mutable);
```

**Entity Lifecycle:**

1. **Builder-created entities** - Mutable from creation

   ```java
   Dataset dataset = Dataset.builder()
       .platform("snowflake")
       .name("my_table")
       .build();
   dataset.isMutable();  // true - can mutate immediately
   ```

2. **Server-fetched entities** - Immutable by default

   ```java
   Dataset dataset = client.entities().get(urn);
   dataset.isReadOnly();  // true - must call .mutable()
   ```

3. **Mutable copies** - Created via `.mutable()`
   ```java
   Dataset mutable = dataset.mutable();
   mutable.isMutable();  // true - can mutate
   ```

**The .mutable() method:**

- Creates a shallow copy with independent mutability flags
- Shares aspect cache with original (read-your-own-writes semantics)
- Idempotent - returns self if already mutable
- Original entity remains read-only after creating mutable copy

**Why immutability-by-default?**

- Makes mutations explicit and intentional
- Prevents accidental modification when passing entities between functions
- Clear separation between read and write workflows
- Enables safe entity sharing across threads
- Common pattern in modern APIs (Rust, Python, Java immutable collections)

See "Developer-Facing Implementation Design" section below for internal architecture details.

#### 3. Supported Entities

The SDK V2 implements 8 entity types with full metadata support:

**Data Entities:**

- **Dataset** - Tables, views, files with schema support
- **Container** - Databases, schemas, folders (hierarchical structures)

**Pipeline Entities:**

- **DataFlow** - Pipelines, workflows (Airflow DAGs, Spark jobs, dbt projects)
- **DataJob** - Individual tasks with inlet/outlet lineage

**Visualization Entities:**

- **Chart** - Visualizations with input dataset lineage
- **Dashboard** - Dashboards with chart relationships and input datasets

**ML Entities:**

- **MLModel** - Machine learning models with metrics, hyperparameters, training jobs
- **MLModelGroup** - Model families with version management

**Common Entity Operations:**

All entities support these fluent operations (via mixin interfaces):

```java
// Tags
entity.addTag("pii")
      .removeTag("deprecated")
      .setTags(Arrays.asList("tag1", "tag2"))
      .clearTags()

// Owners
entity.addOwner(ownerUrn, OwnershipType.TECHNICAL_OWNER)
      .removeOwner(ownerUrn)
      .setOwners(ownerList)
      .clearOwners()

// Glossary Terms
entity.addTerm(termUrn)
      .removeTerm(termUrn)
      .setTerms(termList)
      .clearTerms()

// Domains
entity.setDomain(domainUrn)
      .removeDomain(domainUrn)
      .clearDomains()

// Container (for hierarchical entities)
entity.setContainer(containerUrn)
      .clearContainer()

// Structured Properties (custom typed metadata)
entity.setStructuredProperty("io.acryl.dataManagement.replicationSLA", "24h")
      .setStructuredProperty("io.acryl.dataQuality.qualityScore", 95.5)
      .setStructuredProperty("io.acryl.dataManagement.certifications",
                             Arrays.asList("SOC2", "HIPAA", "GDPR"))
      .setStructuredProperty("io.acryl.privacy.retentionDays", 90, 180, 365)
      .removeStructuredProperty("io.acryl.dataManagement.deprecated")
```

**Entity-Specific Documentation:**

See comprehensive guides in `metadata-integration/java/docs/sdk-v2/`:

- `dataset-entity.md` - Dataset with schema support
- `chart-entity.md` - Chart with lineage
- `dashboard-entity.md` - Dashboard with chart relationships
- `container-entity.md` - Container hierarchies
- `dataflow-entity.md` - DataFlow pipelines
- `datajob-entity.md` - DataJob with inlet/outlet lineage
- `mlmodel-entity.md` - MLModel with metrics
- `mlmodelgroup-entity.md` - MLModelGroup with versions

#### 4. EntityClient (CRUD Operations)

**File**: `metadata-integration/java/datahub-client/src/main/java/datahub/client/v2/operations/EntityClient.java` (570 lines)

```java
package datahub.client.v2.operations;

/**
 * Client for entity CRUD operations.
 * Provides create, read, update, and upsert operations.
 */
public class EntityClient {
    private final RestEmitter emitter;
    private final DataHubClientConfigV2 config;

    /**
     * Create a new entity (convenience method - same as upsert).
     */
    public <T extends Entity> void create(T entity) throws IOException, ExecutionException, InterruptedException {
        upsert(entity);
    }

    /**
     * Upsert an entity (create or update).
     * Emits all aspects and accumulated patches.
     */
    public <T extends Entity> void upsert(T entity) throws IOException, ExecutionException, InterruptedException {
        List<MetadataChangeProposalWrapper> mcps = entity.toMCPs();
        // Emit all MCPs asynchronously and wait for completion
        // ...
    }

    /**
     * Update an existing entity.
     * Emits only accumulated patches (not full aspects).
     */
    public <T extends Entity> void update(T entity) throws IOException, ExecutionException, InterruptedException {
        // Emit only pending patches
        // ...
    }

    /**
     * Get an entity by URN.
     * Returns entity with lazy-loaded aspects.
     */
    public <T extends Entity> T get(Urn urn, Class<T> entityClass) throws IOException {
        // Fetch entity aspects from server
        // Construct entity with lazy loading support
        // ...
    }

    // Note: delete(Urn) and exists(Urn) operations deferred to future releases
}
```

**Supported Operations:**

- `create()` - Create new entities (wrapper for upsert)
- `upsert()` - Create or update entities (emits all aspects + patches)
- `update()` - Update existing entities (emits only patches)
- `get()` - Retrieve entities with lazy loading
- `delete()` and `exists()` - Deferred to future releases

**Patch Behavior:**

Patches are accumulated **inside entities** during metadata operations and emitted automatically during `upsert()`/`update()`:

```java
Dataset dataset = client.entities().get(datasetUrn);
Dataset mutable = dataset.mutable();  // Get mutable copy
mutable.addTag("pii");           // Creates internal patch
mutable.addTag("sensitive");     // Creates another internal patch
client.entities().update(mutable); // Emits both patches atomically
```

There is **no separate `patch()` method** - patches are managed internally by entities.

#### 5. Mixin Interfaces (CRTP Pattern)

**Files**: `metadata-integration/java/datahub-client/src/main/java/datahub/client/v2/entity/Has*.java`

Mixin interfaces provide reusable metadata operations across entities using the **Curiously Recurring Template Pattern (CRTP)** for type-safe method chaining:

```java
/**
 * Interface for entities that support tags.
 * Uses CRTP for type-safe method chaining.
 */
public interface HasTags<T extends Entity & HasTags<T>> {

    /**
     * Add a tag to this entity.
     * Creates a patch that will be emitted on save.
     */
    default T addTag(@Nonnull String tagUrn) {
        // Implementation creates patch internally
        return (T) this;
    }

    default T removeTag(@Nonnull String tagUrn) { ... }
    default T setTags(@Nonnull List<String> tagUrns) { ... }
    default T clearTags() { ... }

    // Getter methods
    default List<String> getTags() { ... }
}
```

**Available Mixin Interfaces:**

1. **`HasTags<T>`** - Tag operations (`addTag`, `removeTag`, `setTags`, `clearTags`)
2. **`HasOwners<T>`** - Ownership operations (`addOwner`, `removeOwner`, `setOwners`, `clearOwners`)
3. **`HasGlossaryTerms<T>`** - Glossary term operations (`addTerm`, `removeTerm`, `setTerms`, `clearTerms`)
4. **`DomainOperations<T>`** - Domain operations (`setDomain`, `removeDomain`, `clearDomains`)
5. **`HasContainer<T>`** - Container hierarchy (`setContainer`, `clearContainer`)
6. **`HasStructuredProperties<T>`** - Structured properties operations (`setStructuredProperty`, `removeStructuredProperty`)

**Why CRTP?**

The CRTP pattern enables type-safe method chaining that returns the concrete entity type:

```java
// Without CRTP: returns Entity
Entity entity = dataset.addTag("pii");  // Loses Dataset type!

// With CRTP: returns Dataset
Dataset dataset = dataset.addTag("pii")
                         .addOwner(ownerUrn, type)  // Still Dataset type!
                         .setDomain(domainUrn);     // Still Dataset type!
```

**Entity Implementations:**

Entities implement mixin interfaces by declaring them in the class signature:

```java
public class Dataset extends Entity
    implements HasTags<Dataset>,
               HasOwners<Dataset>,
               HasGlossaryTerms<Dataset>,
               DomainOperations<Dataset>,
               HasContainer<Dataset>,
               HasStructuredProperties<Dataset> {
    // Mixin methods provided by default implementations
}
```

---

# Part 2: Developer-Facing Implementation Design

This section describes the internal architecture and implementation details for developers contributing to the SDK.

## Internal Architecture

### Entity Base Class - Internal Implementation

**File**: `metadata-integration/java/datahub-client/src/main/java/datahub/client/v2/entity/Entity.java` (490 lines)

The Entity base class implements three core subsystems:

#### 1. AspectCache System with Read-Your-Own-Writes

**Unified Cache Architecture**: The SDK uses a unified `AspectCache` that provides read-your-own-writes semantics with proper dirty tracking. This architecture fixes bugs where fetched aspects would override patches.

**Core Implementation Files:**

- `AspectCache.java` (184 lines) - Main cache with dirty tracking
- `CachedAspect.java` (68 lines) - Aspect wrapper with metadata
- `AspectSource.java` (23 lines) - Enum for SERVER vs LOCAL aspects
- `ReadMode.java` (28 lines) - Enum for ALLOW_DIRTY vs SERVER_ONLY reads

**Key Architectural Features:**

1. **AspectSource Tracking**: Distinguishes between SERVER-fetched aspects (subject to TTL) and LOCAL-created aspects (no expiration)

2. **Dirty Tracking**: Explicit marking of aspects that need write-back to server via `markDirty()` method

3. **Read-Your-Own-Writes**: Default `ReadMode.ALLOW_DIRTY` returns local modifications immediately, `SERVER_ONLY` mode skips dirty aspects

4. **TTL Management**: 60-second TTL enforced only for SERVER-sourced aspects, LOCAL aspects never expire

5. **Thread Safety**: Uses `ConcurrentHashMap` for safe concurrent access

**Internal State (Entity.java):**

```
protected final AspectCache cache;  // Unified cache with dirty tracking
protected final Map<String, List<MetadataChangeProposal>> pendingPatches;
private DataHubClientV2 boundClient = null;
```

**Cache Operations:**

- `getAspectLazy()` - Lazy loads from server, stores as clean SERVER-sourced aspect
- `getOrCreateAspect()` - Gets from cache or creates new LOCAL-sourced aspect (marked dirty)
- `markAspectDirty()` - Marks aspect dirty after in-place modification (used by domain operations)
- `toMCPs()` - Returns **only dirty aspects** for emission (excludes clean fetched aspects)

**Why This Architecture?**

The unified cache solves a critical bug: when entities are fetched from the server and then patch operations are applied (e.g., `removeTerm()`), the cached aspect would be included in `toMCPs()` and override the patches. With dirty tracking, `toMCPs()` only returns modified aspects, allowing patches to work correctly.

#### 2. Patch Accumulation and MCP Generation

Metadata operations create patches that accumulate until emission. The system supports two types of operations:

**Patch-Based Operations** (incremental updates):

- Tags, owners, glossary terms use `PatchBuilder` classes
- Patches accumulate in `pendingPatches` map (aspect name → list of patches)
- Multiple operations on same aspect create multiple patches

**Cache-Based Operations** (full aspect replacement):

- Domains, custom properties modify aspects in cache
- Aspects marked dirty via `markAspectDirty()` after modification
- Dirty aspects included in `toMCPs()` output

**MCP Generation:**

The `toMCPs()` method returns **only dirty aspects** and accumulated patches:

```
public List<MetadataChangeProposalWrapper> toMCPs() {
    // 1. Add dirty aspects from cache (excludes clean fetched aspects)
    for (Map.Entry<String, RecordTemplate> entry : cache.getDirtyAspects().entrySet()) {
        mcps.add(createMCP(entry.getKey(), entry.getValue()));
    }

    // 2. Add accumulated patches
    for (PatchBuilder builder : patchBuilders.values()) {
        mcps.add(builder.build());
    }

    // 3. Add pending MCPs
    mcps.addAll(pendingMCPs);

    return mcps;
}
```

**Critical Design Point**: `toMCPs()` uses `cache.getDirtyAspects()` instead of all cached aspects. This ensures that fetched aspects don't override patches - only locally modified aspects are emitted.

#### 3. Mode-Aware Aspect Routing

SDK mode vs INGESTION mode for proper aspect selection:

````java
/**
 * Get aspect name based on operation mode.
 * SDK mode: prefer editable aspects
 * INGESTION mode: use system aspects
 */
protected String getAspectName(Class<? extends RecordTemplate> aspectClass, OperationMode mode) {
    if (mode == OperationMode.SDK) {
        // Check if editable variant exists
        String editableAspectName = getEditableAspectName(aspectClass);
        if (editableAspectName != null) {
            return editableAspectName;
        }
    }
    return aspectClass.getSimpleName();
}

/**
 * Get getter preference order: editable aspects first, then system aspects.
 */
protected <T extends RecordTemplate> T getAspectWithPreference(
    Class<T> editableClass,
    Class<T> systemClass
) {
    // Try editable aspect first
    T editable = getAspectLazy(editableClass);
    if (editable != null) {
        return editable;
    }

    // Fall back to system aspect
    return getAspectLazy(systemClass);
}

## Implementation Phases

### Phase 1: Core Framework

Base functionality for all entities:

- Base `Entity` class with aspect management, lazy loading, and patch accumulation
- `DataHubClientV2` main client class with mode-aware behavior
- `EntityClient` with create, read, update, upsert operations
- Configuration classes with environment variable support
- Mixin interfaces using CRTP pattern for type safety

### Phase 2: Dataset Entity

Reference implementation demonstrating all patterns:

- `Dataset` entity with fluent builder
- Dataset-specific aspects (properties, schema, lineage)
- Mixin interface implementations
- Comprehensive unit tests

### Phase 3: Additional Entities

Seven additional entity types:

- `Chart` - Visualizations with lineage
- `Dashboard` - Dashboards with chart relationships
- `Container` - Hierarchical data structures
- `DataJob` - Pipeline tasks with inlet/outlet lineage
- `DataFlow` - Pipeline workflows
- `MLModel` - Machine learning models
- `MLModelGroup` - ML model families

### Phase 4: Patch Capabilities

Patch-based updates for efficient metadata changes:

- Internal patch accumulation within entities (not separate patch builders)
- Automatic patch emission on `update()` and `upsert()`
- Leverages existing `PatchBuilder` classes from entity-registry module
- Patches tested via entity unit tests

### Phase 5: Testing & Documentation

Comprehensive validation and user guides:

- Integration tests with live DataHub server
- API documentation (Javadoc) and 13 comprehensive Markdown guides
- 19 working example files demonstrating real-world usage
- Migration guide from V1
- Design principles document
- Patch operations deep-dive
- Entity-specific guides for all 8 entities

## Testing Strategy

### Unit Tests

Each entity and component has comprehensive unit tests:

- Builder validation (required fields, optional fields, validation logic)
- Aspect management (getters, setters, mode-aware routing)
- MCP generation (full aspects + patches)
- Patch operations (accumulation, emission)
- Fluent API chaining (type safety via CRTP)
- Mixin operations (tags, owners, terms, domains)

**Test Coverage by Entity:**
- Dataset: 37 tests
- Chart: 43 tests
- Dashboard: 52 tests
- DataJob: 45 tests
- DataFlow: 40 tests
- Container: 40 tests
- MLModel: 44 tests
- MLModelGroup: 38 tests

### Integration Tests

Full end-to-end tests against a real DataHub instance:

```java
@Test
public void testDatasetCreateAndRead() throws Exception {
    // Create client
    DataHubClientV2 client = DataHubClientV2.builder()
        .server(TEST_SERVER)
        .token(TEST_TOKEN)
        .build();

    // Create dataset
    Dataset dataset = Dataset.builder()
        .platform("snowflake")
        .name("db.schema.test_table_" + System.currentTimeMillis())
        .env("PROD")
        .description("Test dataset created by Java SDK V2")
        .build();

    dataset.addTag("test-tag")
           .addOwner("urn:li:corpuser:datahub", OwnershipType.TECHNICAL_OWNER);

    // Upsert
    client.entities().upsert(dataset);

    // Read back
    Dataset retrieved = client.entities().get(dataset.getUrn(), Dataset.class);
    assertNotNull(retrieved);
    assertEquals("Test dataset created by Java SDK V2", retrieved.getDescription());
}

@Test
public void testDatasetPatchOperations() throws Exception {
    DataHubClientV2 client = DataHubClientV2.builder()
        .server(TEST_SERVER)
        .token(TEST_TOKEN)
        .build();

    // Create dataset first
    Dataset dataset = Dataset.builder()
        .platform("snowflake")
        .name("db.schema.test_table_patch_" + System.currentTimeMillis())
        .env("PROD")
        .build();
    client.entities().upsert(dataset);

    // Retrieve and apply patches
    Dataset retrieved = client.entities().get(dataset.getUrn(), Dataset.class);
    Dataset mutable = retrieved.mutable();  // Get mutable copy
    mutable.addTag("pii")                    // Creates patch
           .addTag("sensitive")              // Another patch
           .addTerm("urn:li:glossaryTerm:CustomerData");  // Another patch

    // All patches emitted atomically
    client.entities().update(mutable);

    // Verify patches were applied
    Dataset verified = client.entities().get(dataset.getUrn(), Dataset.class);
    assertTrue(verified.getTags().contains("urn:li:tag:pii"));
}
````

**Integration Test Coverage:**

- Entity creation and retrieval
- Tag, owner, term, domain operations
- Lineage relationships (charts → datasets, jobs → datasets)
- Custom properties
- Full metadata workflows
- Batch operations
- Patch accumulation and emission

**Running Integration Tests:**

```bash
export DATAHUB_SERVER=http://localhost:8080
export DATAHUB_TOKEN=your_token

./gradlew :metadata-integration:java:datahub-client:test --tests "*Integration*"
```

### Test Coverage Results

- Unit test coverage: **>80%** for new code (378 unit tests + 79 integration tests = 457 total)
- All public APIs covered
- Edge cases tested (null values, invalid inputs, mode switching)
- Async operations tested with proper synchronization
- Cache infrastructure thoroughly tested (43 tests for AspectCache + CachedAspect)
- Full end-to-end integration tests (79 tests)

## API Documentation

All public classes and methods have comprehensive Javadoc plus extensive Markdown documentation:

**Javadoc Coverage:**

- Class-level documentation explaining purpose and usage
- Method-level documentation with parameters, returns, exceptions
- Code examples for common use cases
- Links to related classes and methods

**Markdown Documentation (13 files):**

Located in `metadata-integration/java/docs/sdk-v2/`:

1. **getting-started.md** - Quick start guide for new users
2. **design-principles.md** - Architecture and design decisions
3. **dataset-entity.md** - Dataset operations and schema support
4. **chart-entity.md** - Chart operations and lineage
5. **dashboard-entity.md** - Dashboard operations and relationships
6. **container-entity.md** - Container hierarchies
7. **dataflow-entity.md** - DataFlow pipeline operations
8. **datajob-entity.md** - DataJob inlet/outlet lineage
9. **mlmodel-entity.md** - MLModel metrics and hyperparameters
10. **mlmodelgroup-entity.md** - MLModelGroup version management
11. **patch-operations.md** - Deep dive into patch-based updates
12. **migration-from-v1.md** - Migration guide from V1 SDK
13. **java-sdk-v2-design.md** - This comprehensive design document

**Working Examples (19 files):**

Located in `metadata-integration/java/examples/src/main/java/io/datahubproject/examples/v2/`:

- Dataset examples: DatasetCreateExample, DatasetFullExample, DatasetPatchExample
- Chart examples: ChartCreateExample, ChartFullExample, ChartLineageExample
- Dashboard examples: DashboardCreateExample, DashboardFullExample, DashboardLineageExample
- DataFlow examples: DataFlowCreateExample, DataFlowFullExample
- DataJob examples: DataJobCreateExample, DataJobFullExample, DataJobLineageExample
- Container examples: ContainerCreateExample, ContainerFullExample, ContainerHierarchyExample
- MLModel examples: MLModelCreateExample, MLModelFullExample
- MLModelGroup examples: MLModelGroupCreateExample, MLModelGroupFullExample

## Migration Guide

For users of the existing Java SDK:

### Before (V1):

```java
RestEmitter emitter = RestEmitter.create(b -> b.server("http://localhost:8080"));

DatasetUrn urn = new DatasetUrn(
    new DataPlatformUrn("postgres"),
    "my_table",
    FabricType.PROD
);

DatasetProperties props = new DatasetProperties();
props.setDescription("My dataset");

MetadataChangeProposalWrapper mcpw = MetadataChangeProposalWrapper.builder()
    .entityType("dataset")
    .entityUrn(urn)
    .upsert()
    .aspect(props)
    .build();

emitter.emit(mcpw).get();
```

### After (V2):

```java
DataHubClientV2 client = DataHubClientV2.builder()
    .server("http://localhost:8080")
    .build();

Dataset dataset = Dataset.builder()
    .platform("postgres")
    .name("my_table")
    .description("My dataset")
    .build();

client.entities().upsert(dataset);
```

## Decision Log

### 1. Use Pegasus Models vs OpenAPI Models

**Decision**: Use Pegasus models (`com.linkedin.*`) for aspect classes.

**Rationale**:

- Pegasus models are the canonical representation in DataHub
- Already used by v1 SDK, maintains consistency
- Generated from PDL schemas, always in sync with backend
- OpenAPI models are less mature and have fewer utilities

**Result**: Proven correct - seamless integration with existing infrastructure.

### 2. Namespace Separation

**Decision**: Use `datahub.client.v2.*` namespace.

**Rationale**:

- Clear separation from v1 API
- Allows side-by-side usage
- Follows semantic versioning principles
- Easy to deprecate v1 in future

**Result**: 100% backward compatibility achieved - v1 code unchanged.

### 3. Builder Pattern

**Decision**: Use nested static Builder classes.

**Rationale**:

- Idiomatic Java pattern
- Type-safe construction
- Optional parameters handled cleanly
- Better than telescoping constructors

**Result**: Excellent developer experience with fluent API.

### 4. Synchronous vs Async

**Decision**: Provide synchronous API that wraps async operations.

**Rationale**:

- Simpler for most users
- Matches Python SDK V2 API
- Can expose async API later for advanced users
- RestEmitter already provides async primitives

**Result**: Simplified API widely adopted in examples and tests.

### 5. Error Handling

**Decision**: Throw checked exceptions for I/O operations.

**Rationale**:

- Forces callers to handle errors
- Consistent with Java conventions
- Clear distinction between programmer errors and runtime failures

**Result**: Clear error handling patterns in all code.

**Exception Hierarchy:**

The SDK introduces custom exceptions for common error conditions:

**ReadOnlyEntityException** - Thrown when attempting to mutate a read-only entity:

```java
try {
  Dataset dataset = client.entities().get(urn);
  dataset.addTag("pii");  // Throws ReadOnlyEntityException
} catch (ReadOnlyEntityException e) {
  // Exception message explains the issue and provides fix
  System.err.println(e.getMessage());

  // Fix: Get mutable copy first
  Dataset mutable = dataset.mutable();
  mutable.addTag("pii");
  client.entities().upsert(mutable);
}
```

**PendingMutationsException** - Thrown when reading from entity with pending mutations:

```java
Dataset dataset = Dataset.builder()
    .platform("snowflake")
    .name("my_table")
    .build();

dataset.setDescription("New description");
// dataset.getDescription();  // Throws PendingMutationsException!

// Fix: Save first, then read
client.entities().upsert(dataset);  // Clears dirty flag
String desc = dataset.getDescription();  // Now works
```

**Why these restrictions?**

- **ReadOnlyEntityException**: Makes mutations explicit, prevents accidental changes when passing entities between functions
- **PendingMutationsException**: Prevents reading stale cached data, enforces explicit save-then-fetch workflow

Both restrictions enforce clear separation between read and write workflows. These may be relaxed in future versions as the API matures and usage patterns emerge.

### 6. Patch-First over Full Aspect Replacement

**Decision**: Prioritize patch-based operations as the primary API, defer full aspect replacement to V1 SDK.

**Rationale**:

- **User mental model**: "Add a tag" is more natural than "fetch all tags, modify list, PUT entire aspect"
- **Safety**: Patches don't clobber concurrent changes from other users/systems
- **Simplicity**: Most metadata operations are incremental (add owner, remove tag, etc.)
- **Efficiency**: Only changed fields transmitted and processed by server
- **Escape hatch exists**: Users needing full PUT semantics can use V1 SDK's `RestEmitter` directly

**Why not both?**
V2 SDK focuses on making common operations simple, not exposing every low-level primitive. This keeps the API focused and prevents confusion about when to use patches vs full replacement.

**Result**: Clean, intuitive API for 95% of use cases. Power users can drop to V1 SDK for remaining 5%.

### 7. Internal Patch Accumulation vs External Patch Builders

**Decision**: Accumulate patches **inside entities** rather than separate patch builder classes.

**Rationale**:

- More intuitive API - metadata operations just work
- Patches automatically emitted on save
- Reduces API surface area
- Simplifies user code

**Original Design**: Separate `DatasetPatch`, `ChartPatch` builder classes

**Actual Implementation**: Patches accumulate in `Entity.pendingPatches` and emit via `toMCPs()`

**Result**: Superior developer experience - no need to learn separate patch API.

### 8. CRTP Pattern for Mixin Interfaces

**Decision**: Use Curiously Recurring Template Pattern for type-safe mixin interfaces.

**Rationale**:

- Type-safe method chaining returns concrete entity type
- Compile-time type checking
- No casting required in user code
- Idiomatic Java generics pattern

**Original Design**: Simple interfaces returning `Entity`

**Actual Implementation**:

```java
public interface HasTags<T extends Entity & HasTags<T>> {
    default T addTag(String tagUrn) { return (T) this; }
}
```

**Result**: Excellent type safety and developer experience.

### 9. Mode-Aware Behavior (SDK vs INGESTION)

**Decision**: Support SDK mode and INGESTION mode for aspect routing.

**Rationale**:

- Proper separation of user edits vs pipeline writes
- SDK mode → editable aspects (user overrides)
- INGESTION mode → system aspects (pipeline data)
- Getters prefer editable over system

**Original Design**: Not specified

**Actual Implementation**: `OperationMode` enum with aspect routing logic

**Result**: Clear separation of concerns, aligns with DataHub's aspect model.

### 10. Lazy Loading for GET Operations

**Decision**: Implement lazy loading for aspects when entities are retrieved.

**Rationale**:

- Performance - only fetch aspects when accessed
- Client binding enables on-demand fetching
- Cache management with timestamps

**Original Design**: Not specified (GET deferred)

**Actual Implementation**: Full lazy loading with `getAspectLazy()` and client binding

**Result**: Efficient entity retrieval with on-demand aspect fetching.

## Design Questions and Resolutions

1. **GET operation implementation**: Should we implement REST client for reading entities, or defer to future?

   - **Resolution**: Implemented with lazy loading support

2. **Search client**: Should we include search functionality in V2?

   - **Resolution**: Deferred to future (out of scope for V2)

3. **Lineage client**: Should we include lineage management?

   - **Resolution**: Basic lineage on Dataset, Chart, Dashboard, DataJob entities

4. **Schema field builders**: Should we provide fluent builders for schema fields?
   - **Resolution**: Yes, schema field support in Dataset entity

## References

- [Python SDK V2 Implementation](https://github.com/datahub-project/datahub/tree/master/metadata-ingestion/src/datahub/sdk)
- [Existing Java SDK](https://github.com/datahub-project/datahub/tree/master/metadata-integration/java/datahub-client)
- [DataHub Metadata Model](https://github.com/datahub-project/datahub/tree/master/metadata-models)

## Quick Links for Reviewers

**Start Here:**

1. `metadata-integration/java/docs/sdk-v2/getting-started.md` - Quick start guide
2. `metadata-integration/java/docs/sdk-v2/design-principles.md` - Architecture overview

**Core Implementation:** 3. `metadata-integration/java/datahub-client/src/main/java/datahub/client/v2/entity/Entity.java` (490 lines) - Base entity class 4. `metadata-integration/java/datahub-client/src/main/java/datahub/client/v2/operations/EntityClient.java` (570 lines) - CRUD operations 5. `metadata-integration/java/datahub-client/src/main/java/datahub/client/v2/DataHubClientV2.java` (266 lines) - Main client

**Sample Entities:** 6. `metadata-integration/java/datahub-client/src/main/java/datahub/client/v2/entity/Dataset.java` (564 lines) - Reference implementation 7. `metadata-integration/java/datahub-client/src/main/java/datahub/client/v2/entity/HasTags.java` (145 lines) - CRTP mixin example

**Examples:** 8. `metadata-integration/java/examples/src/main/java/io/datahubproject/examples/v2/DatasetFullExample.java` - Complete workflow 9. `metadata-integration/java/examples/src/main/java/io/datahubproject/examples/v2/ChartLineageExample.java` - Lineage relationships

**Tests:** 10. `metadata-integration/java/datahub-client/src/test/java/datahub/client/v2/entity/DatasetTest.java` (37 unit tests) 11. `metadata-integration/java/datahub-client/src/test/java/datahub/client/v2/integration/DatasetIntegrationTest.java` - End-to-end validation

---

**Document Status**: Design document reflecting implemented architecture (includes AspectCache refactoring)
**Author**: DataHub OSS Team
**Last Updated**: 2025-01-06