Mirror of https://github.com/datahub-project/datahub.git (synced 2025-06-27 05:03:31 +00:00)
Update README.md (#13668)
parent faae9a6b1f
commit 5404ee9b39
@@ -1,7 +1,7 @@
 # Spark

 To integrate Spark with DataHub, we provide a lightweight Java agent that listens for Spark application and job events
-and pushes metadata out to DataHub in real-time. The agent listens to events such application start/end, and
+and pushes metadata out to DataHub in real-time. The agent listens to events such as application start/end, and
 SQLExecution start/end to create pipelines (i.e. DataJob) and tasks (i.e. DataFlow) in Datahub along with lineage to
 datasets that are being read from and written to. Read on to learn how to configure this for different Spark scenarios.