
Streams

Description

This DynamoDB feature enables an event-driven approach to processing changes to the items of a DynamoDB table. It is suitable for post-processing tasks such as replicating data to secondary datastores, updating analytics, archiving data, or triggering notifications.
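As a rough illustration of what a consumer receives, the sketch below routes a single stream record to a post-processing task based on its change type. The sample record, the `TASK_BY_EVENT` mapping, and the `route_record` helper are hypothetical names introduced for this example; only the `eventName` values and the `dynamodb` envelope follow the stream record format.

```python
from typing import Any

# Hypothetical record, shaped like the entries DynamoDB Streams delivers.
# OldImage/NewImage are only present when the stream view type includes them.
SAMPLE_RECORD: dict[str, Any] = {
    "eventName": "MODIFY",
    "dynamodb": {
        "Keys": {"pk": {"S": "USER#1"}},
        "OldImage": {"pk": {"S": "USER#1"}, "state": {"S": "pending"}},
        "NewImage": {"pk": {"S": "USER#1"}, "state": {"S": "active"}},
        "SequenceNumber": "111",
    },
}

# Hypothetical mapping from change type to a post-processing task name.
TASK_BY_EVENT = {
    "INSERT": "replicate",
    "MODIFY": "replicate",
    "REMOVE": "archive",
}


def route_record(record: dict[str, Any]) -> tuple[str, dict[str, Any]]:
    """Return the task name and the item keys for one stream record."""
    task = TASK_BY_EVENT[record["eventName"]]
    return task, record["dynamodb"]["Keys"]
```

Calling `route_record(SAMPLE_RECORD)` yields the task name together with the primary key of the changed item, which is usually all a downstream job needs to locate the item.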

Implementation details

There are two approaches to consuming a DynamoDB stream:

  1. Using Lambda functions: This is the most common approach, in which the developer associates a stream with a Lambda function.

    A big advantage here is that AWS consumes the stream on its side and triggers the Lambda function automatically, according to the event source mapping declared by the developer.

    On the other hand, this approach couples the stack to AWS Lambda. Lambda functions are notoriously tricky to deploy as code together with their dependencies using Terraform, and they cannot run Makes, which makes local reproducibility harder to achieve.

  2. Using the Kinesis Client Library (KCL): DynamoDB Streams was designed with an API similar to that of Kinesis, another AWS data streaming service, so it is possible to use the KCL with an adapter to develop a custom consumer application.

    KCL is a Java library that supports a language-agnostic interface known as the MultiLangDaemon, which works by spawning the consumer process and communicating with it via stdin/stdout.

    While it is possible to consume the DynamoDB Streams API directly using a language-specific SDK like boto3, reliable processing requires handling many details, such as shard discovery, checkpointing, and coordination between workers, which is why AWS provides this purpose-built library that abstracts those behaviors.
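To give a sense of what direct consumption with an SDK involves, here is a minimal sketch that reads a single shard from its oldest available record. It assumes a client exposing the `get_shard_iterator` and `get_records` operations of the DynamoDB Streams API (for example, `boto3.client("dynamodbstreams")`); the `drain_shard` name, the `handle` callback, and the `max_empty_polls` cutoff are hypothetical. It deliberately omits shard discovery, parent/child shard ordering, checkpointing, retries, and coordination between workers, which are the parts the KCL takes care of.

```python
from typing import Any, Callable


def drain_shard(
    client: Any,
    stream_arn: str,
    shard_id: str,
    handle: Callable[[dict[str, Any]], None],
    max_empty_polls: int = 3,
) -> int:
    """Read one shard from its oldest record, calling `handle` on each record.

    Returns the number of records processed. Stops when the shard is closed
    (no next iterator) or after `max_empty_polls` consecutive empty pages.
    """
    iterator = client.get_shard_iterator(
        StreamArn=stream_arn,
        ShardId=shard_id,
        ShardIteratorType="TRIM_HORIZON",  # start at the oldest record
    )["ShardIterator"]

    processed = 0
    empty_polls = 0
    while iterator and empty_polls < max_empty_polls:
        page = client.get_records(ShardIterator=iterator, Limit=100)
        records = page.get("Records", [])
        empty_polls = 0 if records else empty_polls + 1
        for record in records:
            handle(record)
            processed += 1
        # A closed shard returns no NextShardIterator once fully read.
        iterator = page.get("NextShardIterator")
    return processed
```

Note that nothing here remembers how far the consumer got: if the process restarts, it reads the shard from the beginning again, which is one reason a bare SDK loop is not enough for reliable processing.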

We decided to use the second approach (KCL), aiming to circumvent the complexities and environment disparities inherent to using Lambda functions.
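The stdin/stdout exchange with the MultiLangDaemon can be pictured with the simplified sketch below. It assumes the daemon sends one JSON message per line with an `action` field, that records carry base64-encoded `data`, and that the consumer acknowledges each message with a `status` reply. The `run_consumer` function and the `process` callback are hypothetical, and checkpointing and the shutdown actions are left out, so this is an illustration of the idea and not a complete consumer. In practice a package such as `amazon_kclpy` implements this protocol.

```python
import base64
import json
import sys
from typing import Callable, TextIO


def run_consumer(
    stdin: TextIO,
    stdout: TextIO,
    process: Callable[[bytes], None],
) -> None:
    """Read daemon messages line by line and acknowledge each one."""
    for line in stdin:
        line = line.strip()
        if not line:
            continue
        message = json.loads(line)
        action = message["action"]

        if action == "processRecords":
            for record in message.get("records", []):
                # Assumption: the record payload is base64-encoded.
                process(base64.b64decode(record["data"]))

        # Tell the daemon that this message has been handled.
        reply = {"action": "status", "responseFor": action}
        stdout.write("\n" + json.dumps(reply) + "\n")
        stdout.flush()


if __name__ == "__main__":
    # Log to stderr, since stdout is reserved for the protocol.
    run_consumer(sys.stdin, sys.stdout, lambda data: print(data, file=sys.stderr))
```

Because stdout is the communication channel, the consumer must never print application logs to it; the usage example writes them to stderr instead.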