CloudWatch
Last updated: Mar 2, 2026
Rationale
We use CloudWatch for monitoring our entire AWS infrastructure. We can monitor our applications, react to performance changes within them, optimize resource utilization, and get a unified view of operational health. The main reasons why we chose it over other alternatives are the following:
- It is a core AWS service. Once we start creating infrastructure, CloudWatch begins to monitor it.
- It integrates seamlessly with most AWS services. Some examples are EC2, S3, and DynamoDB.
- It complies with several certifications from ISO and CSA. Many of them verify that the provider follows best practices for secure cloud environments and information security.
- It supports custom dashboards for visualizing metrics with widgets such as bar charts, pie charts, and single-number displays, among others. Other customizations, such as custom time ranges and choosing which resource metrics go on each axis, are also available.
- It supports alarms that publish to Amazon SNS, so email notifications can be sent when a resource metric breaches its threshold or when an anomaly is detected.
- Its resources, such as alarms and dashboards, can be written as code using Terraform.
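To make the alarm-to-SNS flow concrete, here is a minimal sketch in Python using boto3's `put_metric_alarm` call. It is an illustration, not our production setup: the function names, the alarm name pattern, the 80% CPU threshold, the instance ID, and the topic ARN are all assumptions. It also assumes the SNS topic already has an email subscription.

```python
from typing import Any, Dict


def build_cpu_alarm(instance_id: str, topic_arn: str, threshold: float = 80.0) -> Dict[str, Any]:
    """Build keyword arguments for CloudWatch's PutMetricAlarm API.

    The alarm fires when the average CPU utilization of one EC2 instance
    stays above `threshold` percent for two consecutive 5-minute periods.
    All names and values here are hypothetical examples.
    """
    return {
        "AlarmName": f"high-cpu-{instance_id}",
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 300,              # seconds per datapoint
        "EvaluationPeriods": 2,     # consecutive periods that must breach
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanThreshold",
        # When the alarm enters the ALARM state, CloudWatch publishes to this
        # SNS topic; email subscribers of the topic receive the notification.
        "AlarmActions": [topic_arn],
    }


def create_alarm(params: Dict[str, Any]) -> None:
    """Send the alarm definition to CloudWatch (requires AWS credentials)."""
    import boto3  # imported lazily so the builder above works without boto3

    boto3.client("cloudwatch").put_metric_alarm(**params)
```

Calling `create_alarm(build_cpu_alarm("i-0123456789abcdef0", "arn:aws:sns:us-east-1:123456789012:ops-alerts"))` would create or update the alarm. Keeping the parameter builder separate from the API call makes the definition easy to review and test without touching AWS.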
Alternatives
GCP Cloud Monitoring and Azure Monitor are alternatives that did not exist when we migrated to the cloud. A review of each is pending.
Usage
We use CloudWatch for monitoring:
- EC2 instance performance;
- EBS disk usage and performance;
- S3 bucket size and object count;
- ELB load balancer performance;
- Redshift database usage and performance;
- DynamoDB table usage and performance;
- SQS sent, delayed, received and deleted messages;
- ECS cluster resource reservation and utilization, and
- Lambda invocations, errors, and duration, among others.
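The metrics listed above can also be read programmatically. The following sketch, which assumes boto3 and valid AWS credentials, builds a request for the `get_metric_statistics` call. The helper names, the queue name, and the one-hour window are hypothetical. The namespace, metric, and dimension names (`AWS/SQS`, `NumberOfMessagesSent`, `QueueName`) are the ones AWS publishes for SQS.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, List


def build_metric_query(
    namespace: str,
    metric_name: str,
    dimension_name: str,
    dimension_value: str,
    end: datetime,
    hours: int = 1,
    period: int = 300,
) -> Dict[str, Any]:
    """Build keyword arguments for CloudWatch's GetMetricStatistics API.

    Covers the `hours` before `end`, with one datapoint per `period` seconds.
    """
    return {
        "Namespace": namespace,
        "MetricName": metric_name,
        "Dimensions": [{"Name": dimension_name, "Value": dimension_value}],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": period,
        "Statistics": ["Average", "Maximum"],
    }


def fetch_datapoints(query: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Run the query and return datapoints ordered by timestamp."""
    import boto3  # imported lazily so the builder above works without boto3

    response = boto3.client("cloudwatch").get_metric_statistics(**query)
    # CloudWatch does not guarantee datapoint order, so sort explicitly.
    return sorted(response["Datapoints"], key=lambda point: point["Timestamp"])
```

For example, `fetch_datapoints(build_metric_query("AWS/SQS", "NumberOfMessagesSent", "QueueName", "example-queue", datetime.now(timezone.utc)))` would return the last hour of sent-message statistics for a queue named `example-queue`. The same builder should work for the other services in the list by swapping the namespace, metric, and dimension, for instance `AWS/Lambda`, `Errors`, and `FunctionName`.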
We do not use CloudWatch for:
- Synthetic monitoring (we use Checkly instead);
- ServiceLens (it only supports Lambda functions, API Gateway, and Java-based applications);
- Contributor Insights (we use Cloudflare instead);
- Container Insights (we use New Relic; pending review);
- Lambda Insights (we currently use Lambda for a few non-critical tasks);
- CloudWatch agent (it could increase visibility for EC2 machines; pending review);
- CloudWatch Application Insights (it only supports Java-based applications), or
- writing our alarms as code using Terraform (pending).