Update documentation

This commit is contained in:
Kerem Sahin 2019-09-09 01:45:49 -07:00
parent 8bfb086e09
commit 6716e38279
7 changed files with 369 additions and 55 deletions

View File

@ -24,6 +24,6 @@ as username and password.
* [Metadata Ingestion](metadata-ingestion)
## Roadmap
1. [Neo4J](http://neo4j.com) graph query support
2. User profile page
1. Add [Neo4J](http://neo4j.com) graph query support
2. Add user profile page
3. Deploy Data Hub to [Azure Cloud](https://azure.microsoft.com/en-us/)

View File

@ -6,7 +6,7 @@ responsibility of this service for the Data Hub.
## Build
```
docker image build -t keremsahin/datahub-frontend -f docker/datahub-frontend/Dockerfile .
docker image build -t keremsahin/datahub-frontend -f docker/frontend/Dockerfile .
```
This command will build the image and load it into your local Docker image store.
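As a quick sanity check of the freshly built image, you can run it locally. This is a minimal sketch, assuming the frontend listens on port 9001; check the Dockerfile for the actual exposed port:
```
# Port 9001 is an assumption; see the Dockerfile's EXPOSE directive.
docker run -d -p 9001:9001 --name datahub-frontend keremsahin/datahub-frontend
docker logs -f datahub-frontend
```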

View File

@ -1,9 +1,31 @@
# Quickstart
# Data Hub Quickstart
To start all Docker containers at once, please run the command below:
```
cd docker/quickstart && docker-compose up
```
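If you prefer to get your terminal back, Compose's standard detached mode works too, and `docker-compose ps` then shows each container's status:
```
cd docker/quickstart && docker-compose up -d && docker-compose ps
```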
After `elasticsearch` container is initialized, run below to create the search indices:
After the containers are initialized, we need to create the `dataset` and `users` search indices by running the command below:
```
cd docker/elasticsearch && bash init.sh
```
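To verify that the indices were created, you can query Elasticsearch's cat API directly (assuming Elasticsearch is exposed on the default port 9200):
```
curl 'localhost:9200/_cat/indices?v'
```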
At this point, all containers are ready and Data Hub can be considered up and running. Check the guide for each
specific container for details:
* [Elasticsearch & Kibana](../elasticsearch)
* [Data Hub Frontend](../frontend)
* [Data Hub GMS](../gms)
* [Kafka, Schema Registry & Zookeeper](../kafka)
* [Data Hub MAE Consumer](../mae-consumer)
* [Data Hub MCE Consumer](../mce-consumer)
* [MySQL](../mysql)
From this point on, if you want to be able to sign in to Data Hub and see some sample data, please see the
[Metadata Ingestion Guide](../../metadata-ingestion) for bootstrapping Data Hub.
## Debugging Containers
If you want to debug a container, you can check its logs:
```
docker logs <<container_name>>
```
You can also connect to the container's shell for further debugging:
```
docker exec -it <<container_name>> bash
```
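The container names used in the commands above can be listed with `docker ps`:
```
docker ps --format 'table {{.Names}}\t{{.Status}}'
```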

View File

@ -1,57 +1,304 @@
# Data Hub Generalized Metadata Store (GMS)
Data Hub GMS is a [Rest.li](https://linkedin.github.io/rest.li/) service written in Java. It follows common
Rest.li server development practices, and all data models are Pegasus (.pdsc) models.
## Starting GMS
## Pre-requisites
* You need to have [JDK8](https://www.oracle.com/java/technologies/jdk8-downloads.html)
installed on your machine to be able to build `Data Hub GMS`.
## Build
`Data Hub GMS` is already built as part of the top-level build:
```
./gradlew build && ./gradlew :gms:war:JettyRunWar
./gradlew build
```
However, if you want to build only `Data Hub GMS`:
```
./gradlew :gms:war:build
```
### Example GMS Curl Calls
## Dependencies
Before starting `Data Hub GMS`, you need to make sure that [Kafka, Schema Registry & Zookeeper](../docker/kafka),
[Elasticsearch](../docker/elasticsearch) and [MySQL](../docker/mysql) Docker containers are up and running.
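As a sketch, assuming each of these `docker/` directories ships its own `docker-compose.yml` the way the quickstart directory does, you could bring the dependencies up individually (otherwise, the [quickstart](../docker/quickstart) compose file starts everything at once):
```
# Assumes per-service docker-compose.yml files; fall back to docker/quickstart if absent.
cd docker/kafka && docker-compose up -d
cd ../elasticsearch && docker-compose up -d
cd ../mysql && docker-compose up -d
```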
#### Create
## Start via Docker image
The quickest way to try out `Data Hub GMS` is to run the [Docker image](../docker/gms).
## Start via command line
If you modify things and want to try them out quickly without building the Docker image, you can also run
the application directly from the command line after a successful [build](#build):
```
curl 'http://localhost:8080/corpUsers/($params:(),name:fbar)/snapshot' -X POST -H 'X-RestLi-Method: create' -H 'X-RestLi-Protocol-Version:2.0.0' --data '{"aspects": [{"com.linkedin.identity.CorpUserInfo":{"active": true, "fullName": "Foo Bar", "email": "fbar@linkedin.com"}}, {"com.linkedin.identity.CorpUserEditableInfo":{}}], "urn": "urn:li:corpuser:fbar"}' -v
curl 'http://localhost:8080/datasets/($params:(),name:x.y,origin:PROD,platform:urn%3Ali%3AdataPlatform%3Afoo)/snapshot' -X POST -H 'X-RestLi-Method: create' -H 'X-RestLi-Protocol-Version:2.0.0' --data '{"aspects":[{"com.linkedin.common.Ownership":{"owners":[{"owner":"urn:li:corpuser:ksahin","type":"DATAOWNER"}],"lastModified":{"time":0,"actor":"urn:li:corpuser:ksahin"}}},{"com.linkedin.dataset.UpstreamLineage":{"upstreams":[{"auditStamp":{"time":0,"actor":"urn:li:corpuser:ksahin"},"dataset":"urn:li:dataset:(urn:li:dataPlatform:foo,barUp,PROD)","type":"TRANSFORMED"}]}},{"com.linkedin.common.InstitutionalMemory":{"elements":[{"url":"https://www.linkedin.com","description":"Sample doc","createStamp":{"time":0,"actor":"urn:li:corpuser:ksahin"}}]}},{"com.linkedin.schema.SchemaMetadata":{"schemaName":"FooEvent","platform":"urn:li:dataPlatform:foo","version":0,"created":{"time":0,"actor":"urn:li:corpuser:ksahin"},"lastModified":{"time":0,"actor":"urn:li:corpuser:ksahin"},"hash":"","platformSchema":{"com.linkedin.schema.KafkaSchema":{"documentSchema":"{\"type\":\"record\",\"name\":\"MetadataChangeEvent\",\"namespace\":\"com.linkedin.mxe\",\"doc\":\"Kafka event for proposing a metadata change for an entity.\",\"fields\":[{\"name\":\"auditHeader\",\"type\":{\"type\":\"record\",\"name\":\"KafkaAuditHeader\",\"namespace\":\"com.linkedin.avro2pegasus.events\",\"doc\":\"Header\"}}]}"}},"fields":[{"fieldPath":"foo","description":"Bar","nativeDataType":"string","type":{"type":{"com.linkedin.schema.StringType":{}}}}]}}],"urn":"urn:li:dataset:(urn:li:dataPlatform:foo,bar,PROD)"}' -v
./gradlew :gms:war:JettyRunWar
```
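Once Jetty reports that the server is up, a quick smoke test is to list users with the `get_all` endpoint shown below (assuming the default port 8080; a fresh database returns an empty `elements` list):
```
curl -H 'X-RestLi-Protocol-Version:2.0.0' -H 'X-RestLi-Method: get_all' 'http://localhost:8080/corpUsers' | jq
```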
#### Get
## Sample API Calls
### Create user
```
curl -H 'X-RestLi-Protocol-Version:2.0.0' -H 'X-RestLi-Method: get' 'http://localhost:8080/corpUsers/($params:(),name:fbar)/snapshot/($params:(),aspectVersions:List((aspect:com.linkedin.identity.CorpUserInfo,version:0)))' | jq
curl -H 'X-RestLi-Protocol-Version:2.0.0' -H 'X-RestLi-Method: get' 'http://localhost:8080/datasets/($params:(),name:x.y,origin:PROD,platform:urn%3Ali%3AdataPlatform%3Afoo)/snapshot/($params:(),aspectVersions:List((aspect:com.linkedin.common.Ownership,version:0)))' | jq
➜ curl 'http://localhost:8080/corpUsers/($params:(),name:fbar)/snapshot' -X POST -H 'X-RestLi-Method: create' -H 'X-RestLi-Protocol-Version:2.0.0' --data '{"aspects": [{"com.linkedin.identity.CorpUserInfo":{"active": true, "displayName": "Foo Bar", "fullName": "Foo Bar", "email": "fbar@linkedin.com"}}, {"com.linkedin.identity.CorpUserEditableInfo":{}}], "urn": "urn:li:corpuser:fbar"}' -v
```
### Get all
### Create dataset
```
curl -H 'X-RestLi-Protocol-Version:2.0.0' -H 'X-RestLi-Method: get_all' 'http://localhost:8080/corpUsers' | jq
➜ curl 'http://localhost:8080/datasets/($params:(),name:bar,origin:PROD,platform:urn%3Ali%3AdataPlatform%3Afoo)/snapshot' -X POST -H 'X-RestLi-Method: create' -H 'X-RestLi-Protocol-Version:2.0.0' --data '{"aspects":[{"com.linkedin.common.Ownership":{"owners":[{"owner":"urn:li:corpuser:fbar","type":"DATAOWNER"}],"lastModified":{"time":0,"actor":"urn:li:corpuser:fbar"}}},{"com.linkedin.dataset.UpstreamLineage":{"upstreams":[{"auditStamp":{"time":0,"actor":"urn:li:corpuser:fbar"},"dataset":"urn:li:dataset:(urn:li:dataPlatform:foo,barUp,PROD)","type":"TRANSFORMED"}]}},{"com.linkedin.common.InstitutionalMemory":{"elements":[{"url":"https://www.linkedin.com","description":"Sample doc","createStamp":{"time":0,"actor":"urn:li:corpuser:fbar"}}]}},{"com.linkedin.schema.SchemaMetadata":{"schemaName":"FooEvent","platform":"urn:li:dataPlatform:foo","version":0,"created":{"time":0,"actor":"urn:li:corpuser:fbar"},"lastModified":{"time":0,"actor":"urn:li:corpuser:fbar"},"hash":"","platformSchema":{"com.linkedin.schema.KafkaSchema":{"documentSchema":"{\"type\":\"record\",\"name\":\"MetadataChangeEvent\",\"namespace\":\"com.linkedin.mxe\",\"doc\":\"Kafka event for proposing a metadata change for an entity.\",\"fields\":[{\"name\":\"auditHeader\",\"type\":{\"type\":\"record\",\"name\":\"KafkaAuditHeader\",\"namespace\":\"com.linkedin.avro2pegasus.events\",\"doc\":\"Header\"}}]}"}},"fields":[{"fieldPath":"foo","description":"Bar","nativeDataType":"string","type":{"type":{"com.linkedin.schema.StringType":{}}}}]}}],"urn":"urn:li:dataset:(urn:li:dataPlatform:foo,bar,PROD)"}' -v
```
### Browse
### Get user
```
curl "http://localhost:8080/datasets?action=browse" -d '{"path": "", "start": 0, "limit": 10}' -X POST -H 'X-RestLi-Protocol-Version: 2.0.0' | jq
➜ curl -H 'X-RestLi-Protocol-Version:2.0.0' -H 'X-RestLi-Method: get' 'http://localhost:8080/corpUsers/($params:(),name:fbar)/snapshot/($params:(),aspectVersions:List((aspect:com.linkedin.identity.CorpUserInfo,version:0)))' | jq
{
"urn": "urn:li:corpuser:fbar",
"aspects": [
{
"com.linkedin.identity.CorpUserInfo": {
"displayName": "Foo Bar",
"active": true,
"fullName": "Foo Bar",
"email": "fbar@linkedin.com"
}
}
]
}
```
### Search
### Get dataset
```
curl "http://localhost:8080/corpUsers?q=search&input=foo&" -X GET -H 'X-RestLi-Protocol-Version: 2.0.0' -H 'X-RestLi-Method: finder' | jq
curl "http://localhost:8080/datasets?q=search&input=foo&" -X GET -H 'X-RestLi-Protocol-Version: 2.0.0' -H 'X-RestLi-Method: finder' | jq
➜ curl -H 'X-RestLi-Protocol-Version:2.0.0' -H 'X-RestLi-Method: get' 'http://localhost:8080/datasets/($params:(),name:bar,origin:PROD,platform:urn%3Ali%3AdataPlatform%3Afoo)/snapshot/($params:(),aspectVersions:List((aspect:com.linkedin.common.Ownership,version:0)))' | jq
{
"urn": "urn:li:dataset:(urn:li:dataPlatform:foo,bar,PROD)",
"aspects": [
{
"com.linkedin.common.Ownership": {
"owners": [
{
"owner": "urn:li:corpuser:fbar",
"type": "DATAOWNER"
},
{
"owner": "urn:li:corpuser:ksahin",
"type": "DATAOWNER"
}
],
"lastModified": {
"actor": "urn:li:corpuser:ksahin",
"time": 1568015476480
}
}
}
]
}
```
### Autocomplete
### Get all users
```
curl "http://localhost:8080/datasets?action=autocomplete" -d '{"query": "foo", "field": "name", "limit": 10}' -X POST -H 'X-RestLi-Protocol-Version: 2.0.0' | jq
➜ curl -H 'X-RestLi-Protocol-Version:2.0.0' -H 'X-RestLi-Method: get_all' 'http://localhost:8080/corpUsers' | jq
{
"elements": [
{
"editableInfo": {},
"username": "fbar",
"info": {
"displayName": "Foo Bar",
"active": true,
"fullName": "Foo Bar",
"email": "fbar@linkedin.com"
}
},
{
"editableInfo": {
"skills": [],
"teams": [],
"pictureLink": "https://content.linkedin.com/content/dam/me/business/en-us/amp/brand-site/v2/bg/LI-Bug.svg.original.svg"
},
"username": "ksahin",
"info": {
"displayName": "Kerem Sahin",
"active": true,
"fullName": "Kerem Sahin",
"email": "ksahin@linkedin.com"
}
},
{
"editableInfo": {
"skills": [],
"teams": [],
"pictureLink": "https://content.linkedin.com/content/dam/me/business/en-us/amp/brand-site/v2/bg/LI-Bug.svg.original.svg"
},
"username": "datahub",
"info": {
"displayName": "Data Hub",
"active": true,
"fullName": "Data Hub",
"email": "datahub@linkedin.com"
}
}
],
"paging": {
"count": 10,
"start": 0,
"links": []
}
}
```
### Ownership
### Browse datasets
```
curl -H 'X-RestLi-Protocol-Version:2.0.0' -H 'X-RestLi-Method: get' 'http://localhost:8080/datasets/($params:(),name:x.y,origin:PROD,platform:urn%3Ali%3AdataPlatform%3Afoo)/rawOwnership/0' | jq
➜ curl "http://localhost:8080/datasets?action=browse" -d '{"path": "", "start": 0, "limit": 10}' -X POST -H 'X-RestLi-Protocol-Version: 2.0.0' | jq
{
"value": {
"numEntities": 0,
"metadata": {
"totalNumEntities": 2,
"path": "",
"groups": [
{
"name": "prod",
"count": 2
}
]
},
"entities": [],
"pageSize": 10,
"from": 0
}
}
```
### Schema
### Search users
```
curl -H 'X-RestLi-Protocol-Version:2.0.0' -H 'X-RestLi-Method: get' 'http://localhost:8080/datasets/($params:(),name:x.y,origin:PROD,platform:urn%3Ali%3AdataPlatform%3Afoo)/schema/0' | jq
➜ curl "http://localhost:8080/corpUsers?q=search&input=foo&" -X GET -H 'X-RestLi-Protocol-Version: 2.0.0' -H 'X-RestLi-Method: finder' | jq
{
"metadata": {
"searchResultMetadatas": [
{
"name": "title",
"aggregations": {}
}
]
},
"elements": [
{
"editableInfo": {},
"username": "fbar",
"info": {
"displayName": "Foo Bar",
"active": true,
"fullName": "Foo Bar",
"email": "fbar@linkedin.com"
}
}
],
"paging": {
"total": 1,
"count": 10,
"start": 0,
"links": []
}
}
```
### Search datasets
```
➜ curl "http://localhost:8080/datasets?q=search&input=bar" -X GET -H 'X-RestLi-Protocol-Version: 2.0.0' -H 'X-RestLi-Method: finder' | jq
{
"metadata": {
"searchResultMetadatas": [
{
"name": "platform",
"aggregations": {
"foo": 1
}
},
{
"name": "origin",
"aggregations": {
"prod": 1
}
}
]
},
"elements": [
{
"urn": "urn:li:dataset:(urn:li:dataPlatform:foo,bar,PROD)",
"origin": "PROD",
"name": "bar",
"platform": "urn:li:dataPlatform:foo"
}
],
"paging": {
"total": 1,
"count": 10,
"start": 0,
"links": []
}
}
```
### Typeahead for datasets
```
➜ curl "http://localhost:8080/datasets?action=autocomplete" -d '{"query": "bar", "field": "name", "limit": 10}' -X POST -H 'X-RestLi-Protocol-Version: 2.0.0' | jq
{
"value": {
"query": "bar",
"suggestions": [
"bar"
]
}
}
```
### Get dataset ownership
```
➜ curl -H 'X-RestLi-Protocol-Version:2.0.0' -H 'X-RestLi-Method: get' 'http://localhost:8080/datasets/($params:(),name:bar,origin:PROD,platform:urn%3Ali%3AdataPlatform%3Afoo)/rawOwnership/0' | jq
{
"owners": [
{
"owner": "urn:li:corpuser:fbar",
"type": "DATAOWNER"
},
{
"owner": "urn:li:corpuser:ksahin",
"type": "DATAOWNER"
}
],
"lastModified": {
"actor": "urn:li:corpuser:ksahin",
"time": 1568015476480
}
}
```
### Get dataset schema
```
➜ curl -H 'X-RestLi-Protocol-Version:2.0.0' -H 'X-RestLi-Method: get' 'http://localhost:8080/datasets/($params:(),name:bar,origin:PROD,platform:urn%3Ali%3AdataPlatform%3Afoo)/schema/0' | jq
{
"created": {
"actor": "urn:li:corpuser:fbar",
"time": 0
},
"platformSchema": {
"com.linkedin.schema.KafkaSchema": {
"documentSchema": "{\"type\":\"record\",\"name\":\"MetadataChangeEvent\",\"namespace\":\"com.linkedin.mxe\",\"doc\":\"Kafka event for proposing a metadata change for an entity.\",\"fields\":[{\"name\":\"auditHeader\",\"type\":{\"type\":\"record\",\"name\":\"KafkaAuditHeader\",\"namespace\":\"com.linkedin.avro2pegasus.events\",\"doc\":\"Header\"}}]}"
}
},
"lastModified": {
"actor": "urn:li:corpuser:fbar",
"time": 0
},
"schemaName": "FooEvent",
"fields": [
{
"fieldPath": "foo",
"description": "Bar",
"type": {
"type": {
"com.linkedin.schema.StringType": {}
}
},
"nativeDataType": "string"
}
],
"version": 0,
"platform": "urn:li:dataPlatform:foo",
"hash": ""
}
```

View File

@ -0,0 +1,10 @@
# MXE Consumer Jobs
Data Hub uses Kafka as the pub-sub message queue in the backend. There are two Kafka topics used by Data Hub:
`MetadataChangeEvent` and `MetadataAuditEvent`.
* `MetadataChangeEvent`: This message is emitted by any data platform or crawler whenever there is a change in its metadata.
* `MetadataAuditEvent`: This message is emitted by [Data Hub GMS](../gms) to notify that a metadata change has been registered.
To consume from these two topics, Data Hub uses two [Kafka Streams](https://kafka.apache.org/documentation/streams/)
jobs:
* [MCE Consumer Job](mce-consumer-job): Writes to [Data Hub GMS](../gms)
* [MAE Consumer Job](elasticsearch-index-job): Writes to [Elasticsearch](../docker/elasticsearch)
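If you want to peek at the raw events flowing through these topics, you can attach a console consumer. This is a minimal sketch, assuming the quickstart Kafka container is named `kafka` and its broker is reachable at `localhost:9092` inside the container; note that the events are Avro-encoded, so the payloads will not be fully human-readable without a Schema Registry-aware deserializer:
```
# Container name and broker address are assumptions; adjust to your setup.
docker exec -it kafka kafka-console-consumer \
  --bootstrap-server localhost:9092 \
  --topic MetadataAuditEvent --from-beginning
```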

View File

@ -1,17 +1,33 @@
# MetadataAuditEvent (MAE) Consumer Job
MAE Consumer is a [Kafka Streams](https://kafka.apache.org/documentation/streams/) job. Its main function is to listen to the
`MetadataAuditEvent` Kafka topic and process those messages using [index builders](../../metadata-builders).
Index builders create a search document model from each MAE, and these documents are then indexed into Elasticsearch.
This job thus provides near-real-time search index updates.
## Starting job
Run below to start Elasticsearch indexing job.
## Pre-requisites
* You need to have [JDK8](https://www.oracle.com/java/technologies/jdk8-downloads.html)
installed on your machine to be able to build `MAE Consumer Job`.
## Build
`MAE Consumer Job` is already built as part of the top-level build:
```
./gradlew build
```
However, if you want to build only `MAE Consumer Job`:
```
./gradlew :metadata-jobs:elasticsearch-index-job:build
```
## Dependencies
Before starting `MAE Consumer Job`, you need to make sure that [Kafka, Schema Registry & Zookeeper](../../docker/kafka) and
[Elasticsearch](../../docker/elasticsearch) Docker containers are up and running.
## Start via Docker image
The quickest way to try out `MAE Consumer Job` is to run the [Docker image](../../docker/mae-consumer).
## Start via command line
If you modify things and want to try them out quickly without building the Docker image, you can also run
the application directly from the command line after a successful [build](#build):
```
./gradlew :metadata-jobs:elasticsearch-index-job:run
```
To test the job, you should have already started Kafka, GMS, MySQL and Elasticsearch/Kibana.
After starting all the services, you can create a record in GMS via the snapshot endpoint, as below.
```
curl 'http://localhost:8080/metrics/($params:(),name:a.b.c01,type:UMP)/snapshot' -X POST -H 'X-RestLi-Method: create' -H 'X-RestLi-Protocol-Version:2.0.0' --data '{"aspects": [{"com.linkedin.common.Ownership":{"owners":[{"owner":"urn:li:corpuser:ksahin","type":"DATAOWNER"}]}}], "urn": "urn:li:metric:(UMP,a.b.c01)"}' -v
```
This will fire an MAE, and the search index will be updated by the indexing job once it reads the MAE from Kafka.
Then, you can check whether the document was populated in the ES index with the command below.
```
curl localhost:9200/metricdocument/_search -d '{"query":{"match":{"urn":"urn:li:metric:(UMP,a.b.c01)"}}}' | jq
```

View File

@ -1,14 +1,33 @@
# MetadataChangeEvent (MCE) Consumer Job
MCE Consumer is a [Kafka Streams](https://kafka.apache.org/documentation/streams/) job. Its main function is to listen to the
`MetadataChangeEvent` Kafka topic, process those messages, and write new metadata to `Data Hub GMS`.
After every successful metadata update, GMS fires a `MetadataAuditEvent`, which is consumed by the
[MAE Consumer Job](../elasticsearch-index-job).
## Starting job
Run below to start MCE consuming job.
## Pre-requisites
* You need to have [JDK8](https://www.oracle.com/java/technologies/jdk8-downloads.html)
installed on your machine to be able to build `MCE Consumer Job`.
## Build
`MCE Consumer Job` is already built as part of the top-level build:
```
./gradlew build
```
However, if you want to build only `MCE Consumer Job`:
```
./gradlew :metadata-jobs:mce-consumer-job:build
```
## Dependencies
Before starting `MCE Consumer Job`, you need to make sure that [Kafka, Schema Registry & Zookeeper](../../docker/kafka) and
[Data Hub GMS](../../docker/gms) Docker containers are up and running.
## Start via Docker image
The quickest way to try out `MCE Consumer Job` is to run the [Docker image](../../docker/mce-consumer).
## Start via command line
If you modify things and want to try them out quickly without building the Docker image, you can also run
the application directly from the command line after a successful [build](#build):
```
./gradlew :metadata-jobs:mce-consumer-job:run
```
Create your own MCEs in bootstrap_mce.dat, aligned with the models (tip: one MCE per line, in Python syntax).
Then you can produce the MCEs to feed your GMS:
```
cd metadata-ingestion && python mce_cli.py produce
```
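Once the consumer job has processed your MCEs, you can verify that the metadata landed in GMS, for example by listing users via the `get_all` endpoint from the GMS guide (assuming GMS on the default port 8080; what you see depends on the MCEs you produced):
```
curl -H 'X-RestLi-Protocol-Version:2.0.0' -H 'X-RestLi-Method: get_all' 'http://localhost:8080/corpUsers' | jq
```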