DynamoDB

Rationale

DynamoDB is the database we use for storing all the business-related data in our ASM.

The main reasons why we chose it over other alternatives are:

  1. It is a NoSQL database, which fits our ASM well: the ASM has clear access patterns and needs to be performant and scalable.
  2. It is a SaaS-oriented (fully managed) database, so it does not require managing infrastructure such as networking or servers.
  3. It complies with several certifications from ISO and CSA. Many of these certifications focus on ensuring that the entity follows best practices for secure cloud-based environments and information security.
  4. It allows us to retrieve data with single-digit millisecond performance without having to worry about scalability or availability.
  5. It is accessed using a public API, considerably simplifying the process of connecting applications to it.
  6. It has a partition-based architecture, allowing it to handle hundreds of TiB of data and peaks of up to 20 million requests per second.
  7. Database designs can be versioned as code using NoSQL Workbench for DynamoDB.
  8. It supports pagination, which is essential for keeping applications performant when queries return too much data.
  9. It supports Global secondary indexes, allowing us to easily add new access patterns as applications evolve.
  10. It supports classic On-demand backups, allowing us to have backups of all our data stored in the cloud.
  11. It supports Point-in-Time Recovery, which helps to restore tables to previous states in time by using incremental backups.
  12. It integrates with Redshift, partially allowing us to move data to our data warehouse.
  13. It supports local deployments, allowing us to run DynamoDB on local machines. This is especially useful for ephemeral environments.
  14. All its settings can be written as code using Terraform.
  15. It can be used by Terraform for state locking, allowing us to avoid race conditions when applying infrastructure changes.
  16. DynamoDB performance can be monitored via CloudWatch.
  17. It is supported by many programming languages, including Python.
  18. It supports Encryption at rest, allowing us to easily keep stored data secure.
  19. It fully integrates with IAM, allowing us to keep a least privilege approach regarding authentication and authorization.
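
As an illustration of the pagination point above, here is a minimal Python sketch of how a paginated Query can be consumed page by page. DynamoDB returns a `LastEvaluatedKey` when more results remain, and that value is passed back as `ExclusiveStartKey` on the next request. The helper names `iter_query_items` and `make_fake_query` are hypothetical and not part of our codebase; the fake query function only exists so the example runs without AWS access.

```python
from typing import Any, Callable, Dict, Iterator, List


def iter_query_items(
    query_fn: Callable[..., Dict[str, Any]], **query_kwargs: Any
) -> Iterator[Dict[str, Any]]:
    """Yield every item of a paginated Query, fetching one page at a time."""
    kwargs = dict(query_kwargs)
    while True:
        page = query_fn(**kwargs)
        yield from page.get("Items", [])
        # DynamoDB omits LastEvaluatedKey on the final page.
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            return
        kwargs["ExclusiveStartKey"] = last_key


def make_fake_query(
    items: List[Dict[str, Any]], page_size: int
) -> Callable[..., Dict[str, Any]]:
    """Hypothetical in-memory stand-in that mimics the paginated response shape."""

    def fake_query(**kwargs: Any) -> Dict[str, Any]:
        start = kwargs.get("ExclusiveStartKey", {}).get("index", 0)
        end = start + page_size
        page: Dict[str, Any] = {"Items": items[start:end]}
        if end < len(items):
            page["LastEvaluatedKey"] = {"index": end}
        return page

    return fake_query


if __name__ == "__main__":
    fake = make_fake_query([{"id": i} for i in range(5)], page_size=2)
    for item in iter_query_items(fake, TableName="example_table"):
        print(item)
```

Against a real table, `query_fn` could be the `query` method of a boto3 DynamoDB client; boto3 also ships its own paginators, which may be preferable in production code.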

Alternatives

  1. Google Cloud Spanner: It is an RDBMS, meaning that it is not as scalable and performant for web-scale applications. It requires managing infrastructure like clusters, nodes and networks. Connecting it to AWS services increased complexity. It had an unpredictable pricing model at the time.
  2. AWS RDS: It is an RDBMS, meaning that it is not as scalable and performant for web-scale applications. It requires managing infrastructure like clusters, nodes and networks.
  3. Azure Cosmos DB: Pending to review.

Usage

We use DynamoDB for:

  1. Storing and retrieving all the business-related data in our ASM.
  2. Storing Point-in-Time Recovery backups of all our data.
  3. Storing On-demand backups of all our data.
  4. Keeping a versioned design of our database.
  5. Managing Terraform state locks for all our infrastructure modules.

We are currently migrating to the new database design; you can track progress here.

Guidelines

  1. You can access the DynamoDB console after authenticating on AWS.
  2. Any changes to DynamoDB infrastructure must be done via Merge Requests.
  3. To learn how to test and apply infrastructure via Terraform, visit the Terraform Guidelines.
  4. To maximize performance and keep a simple architecture, we use a single-table approach for our database. Please make sure you keep all data within that table.
  5. Please adhere to our current design when modifying the DynamoDB logic, so that we can keep a consistent architecture.
  6. You can open the design with NoSQL Workbench for DynamoDB.
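
To illustrate the single-table approach from the guidelines above, the following Python sketch shows one common way to keep several entity types in the same table by prefixing the partition and sort keys with the entity type. Everything here is an assumption made for illustration: the `pk`/`sk` attribute names, the `GROUP` and `FINDING` entities, and the helper functions are hypothetical and do not describe our actual design, which should be read from the versioned design file.

```python
from typing import Dict, List

SEPARATOR = "#"  # Assumed delimiter between entity type and identifier.


def build_key(entity: str, identifier: str) -> str:
    """Compose a key such as 'GROUP#acme' from an entity type and an id."""
    return f"{entity.upper()}{SEPARATOR}{identifier}"


def group_item(group_name: str, description: str) -> Dict[str, str]:
    """Hypothetical metadata item: pk and sk both point at the group itself."""
    key = build_key("group", group_name)
    return {"pk": key, "sk": key, "description": description}


def finding_item(group_name: str, finding_id: str, title: str) -> Dict[str, str]:
    """Hypothetical child item stored in the same partition as its group."""
    return {
        "pk": build_key("group", group_name),
        "sk": build_key("finding", finding_id),
        "title": title,
    }


def select(items: List[Dict[str, str]], pk: str, sk_prefix: str) -> List[Dict[str, str]]:
    """In-memory imitation of a Query on pk with a begins_with condition on sk."""
    return [i for i in items if i["pk"] == pk and i["sk"].startswith(sk_prefix)]


if __name__ == "__main__":
    table = [
        group_item("acme", "Example group"),
        finding_item("acme", "001", "Example finding A"),
        finding_item("acme", "002", "Example finding B"),
    ]
    # All findings of one group live in one partition and share a key prefix.
    print(select(table, build_key("group", "acme"), "FINDING#"))
```

The design choice sketched here is that related items share a partition key, so one access pattern ("all findings of a group") maps to a single Query instead of a join across tables.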