
Cloud Object Storage (S3)

Rationale

AWS S3 is the service we use for storing files in the cloud.

The main reasons why we chose it over other alternatives are:

  1. It is a fully managed, SaaS-oriented service, meaning that in order to start storing data, we only need to create a bucket. We do not have to worry about storage space, infrastructure scalability, data availability, data persistence, or many other infrastructure-related concerns.
  2. It complies with several certifications from ISO and CSA. Many of these certifications focus on ensuring that the entity follows best practices regarding secure cloud-based environments and information security.
  3. Resources can be written as code using Terraform.
  4. It supports static website hosting, allowing us to easily host sites like our website and our documentation.
  5. Its static website hosting provides direct endpoints, meaning that dealing with load balancers and static IP addresses is not required in order to expose a site to the Internet.
  6. It can be easily integrated with Cloudflare, allowing us to implement DNS, edge caching, redirections, and security headers, among many other Cloudflare features.
  7. It supports presigned URLs, which can be used for creating signed download links that only work for whoever holds the generated signed URL. Such links can have an expiration time. This feature greatly reduces the chance of data leaks.
  8. It supports versioning, allowing us to keep a complete history of all stored objects.
  9. It supports storage lifecycle management, allowing us to declare policies for expiring files and moving them to different storage classes.
  10. It can be programmatically accessed using the AWS CLI and other language-specific libraries like Python's Boto3, allowing us to connect our applications to it.
  11. It can be used by Terraform as a backend to store its state.
  12. It supports AES256 server-side encryption.
  13. It supports access control lists with object-level granularity, allowing us to have full control over object access privileges.
  14. It supports bucket policies, which are especially useful for making a bucket accessible only from a CDN in order to avoid CDN bypassing.
  15. It supports Storage Lens, an analytics module for visualizing insights and trends and optimizing usage.
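
As an illustration of the bucket policy item above, the following is a minimal sketch of a policy document that allows object reads only from a CDN's IP ranges. The function name, the bucket name, and the CIDR ranges are hypothetical placeholders (the ranges are reserved documentation addresses, not real CDN ranges), and this is not necessarily the policy we deploy.

```python
import json
from typing import Sequence


def cdn_only_bucket_policy(bucket: str, cdn_cidrs: Sequence[str]) -> dict:
    """Build a bucket policy that allows s3:GetObject only from the given IP ranges.

    Hypothetical helper for illustration; `bucket` and `cdn_cidrs` are
    supplied by the caller.
    """
    if not cdn_cidrs:
        raise ValueError("at least one CDN CIDR range is required")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowReadFromCdnOnly",
                "Effect": "Allow",
                "Principal": "*",
                "Action": "s3:GetObject",
                # Applies to every object inside the bucket.
                "Resource": f"arn:aws:s3:::{bucket}/*",
                # Requests from any other source IP are not matched by this
                # statement, so they fall back to the default deny.
                "Condition": {"IpAddress": {"aws:SourceIp": list(cdn_cidrs)}},
            }
        ],
    }


# Placeholder values: replace with the real bucket and the CDN's published ranges.
policy = cdn_only_bucket_policy(
    "example-docs-bucket", ["203.0.113.0/24", "198.51.100.0/24"]
)
print(json.dumps(policy, indent=2))
```

The resulting JSON has the shape S3 expects for a bucket policy. In practice a policy like this would be declared in Terraform alongside the rest of the bucket's infrastructure rather than applied by hand.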

Alternatives

  1. Google Cloud Storage: It did not exist at the time we migrated to the cloud. It does not provide direct endpoints, meaning that load balancers and static IP addresses are needed in order to expose a site to the Internet.
  2. Azure Blob Storage: It did not exist at the time we migrated to the cloud. Pending review.

Usage

We use AWS S3 for:

  1. Serving Docs environments.
  2. Serving Airs environments.
  3. Serving ARM front environments.
  4. Creating ARM signed URLs.
  5. Storing ARM resources, evidence, reports and analytics.
  6. Storing Sorts trainings.
  7. Storing Skims data.
  8. Storing GitLab CI cache.
  9. Storing Terraform states.
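
For the signed URLs mentioned in the list above, the following is a minimal sketch of how a time-limited download link could be generated with Boto3. The function name, the default expiration, and the bucket and key in the usage comment are assumptions made for illustration; this is not the actual ARM implementation. The client is passed in as a parameter so the function can be exercised without AWS credentials.

```python
# Signature Version 4 presigned URLs can be valid for at most 7 days.
MAX_EXPIRATION_SECONDS = 7 * 24 * 60 * 60


def create_download_link(s3_client, bucket: str, key: str, expires_in: int = 900) -> str:
    """Return a presigned GET URL for one object, valid for `expires_in` seconds.

    `s3_client` is expected to be a Boto3 S3 client, e.g. boto3.client("s3").
    """
    if not 0 < expires_in <= MAX_EXPIRATION_SECONDS:
        raise ValueError("expires_in must be between 1 second and 7 days")
    return s3_client.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in,
    )


# Usage (requires boto3 and valid AWS credentials; names are placeholders):
#   import boto3
#   url = create_download_link(boto3.client("s3"), "example-bucket", "reports/example.pdf")
```

Anyone holding the returned URL can download the object until it expires, so a short `expires_in` keeps the exposure window small.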

We do not use AWS S3 for:

  1. Storing multimedia for our sites like images and videos. We use Cloudinary instead.

Guidelines

  1. You can access the AWS S3 console after authenticating on AWS.
  2. Any changes to S3's infrastructure must be done via Merge Requests.
  3. To learn how to test and apply infrastructure via Terraform, visit the Terraform Guidelines.