Cost Optimization in the Cloud
This guide outlines general strategies for reducing the cost of cloud resources. Examples are drawn from the most popular cloud providers: [[Amazon Web Services|AWS]], [[Google Cloud Platform|GCP]], and [[Microsoft Azure|Azure]].
General compute refers to servers that can handle a wide variety of general-purpose work in the cloud. Typically, this kind of compute is used for transforming data or hosting a service. General compute services range from fully customizable machines to managed services where you have less control over the environment and settings.
Examples of general compute services:
- AWS: EC2, Fargate, Batch
- Azure: Virtual Machine, Container Instances, Batch
- GCP: Compute Engine, Cloud Run, Batch on GKE
Before you can optimize anything, you need to turn on metrics to monitor the performance of your service. Monitoring is usually an additional expense, but a reasonable one. If you don't expect to need it long term, you can turn it on while you optimize and turn it off afterwards.
Examples of metrics monitoring services:
- AWS: CloudWatch
- Azure: Azure Monitor
- GCP: Cloud Monitoring
- Third-party: Datadog
Once monitoring is turned on, focus on understanding your workload patterns and assessing whether your current usage is over-provisioned or under-provisioned. If you realize at this point that your workload is unpredictable, you may want to consider switching to a serverless service.
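As a rough illustration of that assessment, the sketch below labels a workload from a list of CPU utilization samples. The function name and the 20% and 80% thresholds are assumptions chosen for the example, not values recommended by any provider; in practice you would export the samples from your monitoring service and pick thresholds that suit your workload.

```python
from statistics import mean


def classify_provisioning(cpu_samples, low=20.0, high=80.0):
    """Label a workload from CPU utilization samples (percent, 0-100).

    The thresholds are illustrative assumptions:
    - if even the busiest sample stays below `low`, capacity is mostly idle;
    - if the average sits above `high`, there is little room for spikes.
    """
    if not cpu_samples:
        raise ValueError("need at least one utilization sample")

    average = mean(cpu_samples)
    peak = max(cpu_samples)

    if peak < low:
        return "over-provisioned"
    if average > high:
        return "under-provisioned"
    return "reasonable"


if __name__ == "__main__":
    print(classify_provisioning([5, 10, 12]))
    print(classify_provisioning([85, 90, 95]))
    print(classify_provisioning([30, 60, 50]))
```

The same check can be repeated for memory, since a service can be idle on CPU while being short on memory.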
Rightsizing means identifying and adjusting specific resources to increase their utilization and, in turn, reduce costs. It is most often needed when a resource is over-provisioned. Once you have activated metric monitoring and gathered data on your resource usage, check that your instance size is suitable, then fine-tune it to match the CPU and memory requirements of your workload.
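A minimal sketch of that selection step follows: given the observed peak CPU and memory, pick the cheapest instance type that still fits with some headroom. The catalog entries, their prices, and the 20% headroom are hypothetical values made up for the example; real instance names and prices come from your provider.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceType:
    name: str
    vcpus: int
    memory_gib: float
    hourly_usd: float


# Hypothetical catalog; names and prices are placeholders, not real offerings.
CATALOG = [
    InstanceType("small", 2, 4, 0.05),
    InstanceType("medium", 4, 8, 0.10),
    InstanceType("large", 8, 16, 0.20),
]


def rightsize(peak_vcpus, peak_memory_gib, catalog=CATALOG, headroom=0.2):
    """Return the cheapest instance that covers peak usage plus headroom.

    Returns None when nothing in the catalog is large enough.
    """
    needed_vcpus = peak_vcpus * (1 + headroom)
    needed_memory = peak_memory_gib * (1 + headroom)

    candidates = [
        instance
        for instance in catalog
        if instance.vcpus >= needed_vcpus and instance.memory_gib >= needed_memory
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda instance: instance.hourly_usd)


if __name__ == "__main__":
    choice = rightsize(peak_vcpus=1.5, peak_memory_gib=3.0)
    print(choice.name if choice else "no suitable instance")
```

Sizing against the peak rather than the average is a deliberate choice here: it trades some savings for a lower risk of running out of capacity.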
After rightsizing your compute service, you can typically enable autoscaling to adjust resources up and down as demand on your workload changes. When demand is low, autoscaling reduces the amount of provisioned resources, which saves money. Autoscaling is usually configured with high and low thresholds, which should be based on your typical workload.
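To make the role of the two thresholds concrete, here is a simplified sketch of a threshold-based scaling decision. It is not how any particular provider implements autoscaling; the function name, the 30% and 70% thresholds, and the instance limits are assumptions for illustration.

```python
def desired_instances(
    current,
    cpu_percent,
    low=30.0,
    high=70.0,
    min_instances=1,
    max_instances=10,
):
    """Decide the next instance count from average CPU utilization.

    Above `high` add one instance, below `low` remove one,
    otherwise keep the current count. The result is clamped to
    the [min_instances, max_instances] range.
    """
    if cpu_percent > high:
        target = current + 1
    elif cpu_percent < low:
        target = current - 1
    else:
        target = current
    return max(min_instances, min(max_instances, target))


if __name__ == "__main__":
    print(desired_instances(current=3, cpu_percent=85))  # scale out
    print(desired_instances(current=3, cpu_percent=10))  # scale in
    print(desired_instances(current=3, cpu_percent=50))  # hold
```

Keeping a gap between the low and high thresholds avoids flapping, where the service scales out and back in repeatedly around a single cutoff.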
Finally, after exploring the options above, you can usually get significant savings by purchasing savings plans, which are typically longer-term commitments to use a predetermined amount of a resource in exchange for a discounted rate. They work best when your workload is relatively steady and predictable. In that case, savings plans are a high-impact, low-effort way to save money.
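The arithmetic behind that trade-off can be sketched as follows. The model is deliberately simplified: it assumes a commitment is billed for every hour of the month at a discounted rate, while on-demand is billed only for the hours actually used. The hourly rate, the 30% discount, and the 730-hour month are hypothetical figures, not any provider's pricing.

```python
HOURS_IN_MONTH = 730  # common approximation, assumed for this example


def on_demand_cost(hourly_rate, hours_used):
    """Pay the full rate, but only for the hours actually used."""
    return hourly_rate * hours_used


def committed_cost(hourly_rate, discount, hours_in_month=HOURS_IN_MONTH):
    """Pay a discounted rate for every hour, used or not."""
    return hourly_rate * (1 - discount) * hours_in_month


def break_even_hours(discount, hours_in_month=HOURS_IN_MONTH):
    """Usage above this many hours per month makes the commitment cheaper."""
    return (1 - discount) * hours_in_month


def commitment_pays_off(hourly_rate, discount, hours_used):
    return committed_cost(hourly_rate, discount) < on_demand_cost(hourly_rate, hours_used)


if __name__ == "__main__":
    rate, discount = 0.10, 0.30  # hypothetical values
    print(f"break-even: {break_even_hours(discount):.0f} hours per month")
    print("always-on workload:", commitment_pays_off(rate, discount, hours_used=730))
    print("part-time workload:", commitment_pays_off(rate, discount, hours_used=400))
```

Under these assumptions, a commitment only pays off when the resource runs for more than `1 - discount` of the time, which is why steady workloads benefit and bursty ones may not.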