The Kafka Event Source is the default Event Source used within the DataHub Actions Framework.
Under the hood, the Kafka Event Source uses a Kafka Consumer to subscribe to the topics streaming
out of DataHub (MetadataChangeLog_v1, PlatformEvent_v1). Each Action is automatically placed into a unique
[consumer group](https://docs.confluent.io/platform/current/clients/consumer.html#consumer-groups) based on
the unique `name` provided inside the Action configuration file.
This means that you can easily scale-out Actions processing by sharing the same Action configuration file across
multiple nodes or processes. As long as the `name` of the Action is the same, each instance of the Actions framework will subscribe as a member in the same Kafka Consumer Group, which allows for load balancing the
topic traffic across consumers which each consume independent [partitions](https://developer.confluent.io/learn-kafka/apache-kafka/partitions/#kafka-partitioning).
Because the Kafka Event Source uses consumer groups by default, actions using this source will be **stateful**.
This means that Actions will keep track of their processing offsets of the upstream Kafka topics. If you
stop an Action and restart it sometime later, it will first "catch up" by processing the messages that the topic
has received since the Action last ran. Be mindful of this - if your Action is computationally expensive, it may be preferable to start consuming from the end of the log, instead of playing catch up. The easiest way to achieve this is to simply rename the Action inside the Action configuration file - this will create a new Kafka Consumer Group which will begin processing new messages at the end of the log (latest policy).
### Processing Guarantees
This event source implements an "ack" function which is invoked if and only if an event is successfully processed
by the Actions framework, meaning that the event made it through the Transformers and into the Action without
any errors. Under the hood, the "ack" method synchronously commits Kafka Consumer Offsets on behalf of the Action. This means that by default, the framework provides *at-least once* processing semantics. That is, in the unusual case that a failure occurs when attempting to commit offsets back to Kafka, that event may be replayed on restart of the Action.
If you've configured your Action pipeline `failure_mode` to be `CONTINUE` (the default), then events which
fail to be processed will simply be logged to a `failed_events.log` file for further investigation (dead letter queue). The Kafka Event Source will continue to make progress against the underlying topics and continue to commit offsets even in the case of failed messages.
If you've configured your Action pipeline `failure_mode` to be `THROW`, then events which fail to be processed result in an Action Pipeline error. This in turn terminates the pipeline before committing offsets back to Kafka. Thus the message will not be marked as "processed" by the Action consumer.