docs(ingest): spark-lineage update artifact name and version (#3760)

MugdhaHardikar-GSLab 2021-12-17 03:25:25 +05:30 committed by GitHub
parent 21fbfbba08
commit e6f8c1c17c
GPG Key ID: 4AEE18F83AFDEB23

@@ -11,7 +11,7 @@ When running jobs using spark-submit, the listener is to be configured in the co
 spark.master                                 spark://spark-master:7077
 #Configuring datahub spark listener jar
-spark.jars.packages                          io.acryl:spark-lineage:0.0.1
+spark.jars.packages                          io.acryl:datahub-spark-lineage:0.0.2
 spark.extraListeners                         com.linkedin.datahub.lineage.spark.interceptor.DatahubLineageEmitter
 spark.datahub.lineage.mcpEmitter.gmsUrl      http://localhost:8080
 ```
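As a rough illustration of the configuration this hunk changes, the sketch below builds the three listener properties from the updated side of the diff and renders them in `spark-defaults.conf` style. The helper names `datahub_listener_conf` and `to_spark_defaults` are hypothetical and not part of the commit; the property keys, artifact coordinates, listener class, and GMS URL are copied from the diff as shown.

```python
# Hypothetical helpers for illustration only; they are not part of this commit.
# Property names and values are taken from the updated side of the diff.

def datahub_listener_conf(version="0.0.2", gms_url="http://localhost:8080"):
    """Return the Spark properties that enable the DataHub lineage listener."""
    return {
        # Renamed artifact: io.acryl:spark-lineage -> io.acryl:datahub-spark-lineage
        "spark.jars.packages": f"io.acryl:datahub-spark-lineage:{version}",
        "spark.extraListeners": (
            "com.linkedin.datahub.lineage.spark.interceptor.DatahubLineageEmitter"
        ),
        "spark.datahub.lineage.mcpEmitter.gmsUrl": gms_url,
    }


def to_spark_defaults(conf):
    """Render a property dict as aligned `key  value` lines."""
    width = max(len(key) for key in conf)
    return "\n".join(f"{key.ljust(width)}  {value}" for key, value in conf.items())


print(to_spark_defaults(datahub_listener_conf()))
```

The same dictionary could presumably be passed key by key to `SparkSession.builder.config(...)` in a notebook, which keeps the artifact version in one place when it changes again.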
@@ -23,7 +23,7 @@ When running interactive jobs from a notebook, the listener can be configured wh
 spark = SparkSession.builder \
     .master("spark://spark-master:7077") \
     .appName("test-application") \
-    .config("spark.jars.packages","io.acryl:spark-lineage:0.0.1") \
+    .config("spark.jars.packages","io.acryl:datahub-spark-lineage:0.0.2") \
     .config("spark.extraListeners","com.linkedin.datahub.lineage.interceptor.spark.DatahubLineageEmitter") \
     .config("spark.datahub.lineage.mcpEmitter.gmsUrl", "http://localhost:8080") \
     .enableHiveSupport() \
@@ -42,7 +42,7 @@ The following custom properties in pipelines and tasks relate to the Spark UI:
 Other custom properties of pipelines and tasks capture the start and end times of execution etc.
 The query plan is captured in the *queryPlan* property of a task.
-## Release notes for v0.0.1
+## Release notes for v0.0.2
 In this version, basic dataset-level lineage is captured using the model mapping as mentioned earlier.
 ### Spark versions supported