Update README.md (#13668)

leaderofrogue 2025-06-14 02:20:01 +10:00 committed by GitHub
parent faae9a6b1f
commit 5404ee9b39


@@ -1,7 +1,7 @@
# Spark
To integrate Spark with DataHub, we provide a lightweight Java agent that listens for Spark application and job events
-and pushes metadata out to DataHub in real-time. The agent listens to events such application start/end, and
+and pushes metadata out to DataHub in real-time. The agent listens to events such as application start/end, and
SQLExecution start/end to create pipelines (i.e. DataJob) and tasks (i.e. DataFlow) in Datahub along with lineage to
datasets that are being read from and written to. Read on to learn how to configure this for different Spark scenarios.