Azure FinOps Essentials

Optimizing Azure Costs with Autoscaling

Welcome to this week's edition of Azure FinOps Essentials! 🎉

I'm thrilled to have you on board as we continue our journey to master cost efficiency on Azure. Each week, you'll receive actionable tips to help you optimize your Azure environment and keep your cloud costs under control.

In this edition, we delve into the power of autoscaling. Discover how to use autoscaling to dynamically adjust resources based on demand, ensuring optimal performance and significant cost savings.

Dive in and start scaling smartly!

Cheers,
Michiel

Introduction

Autoscaling is one of the most powerful features of using a cloud platform like Azure. It allows you to dynamically adjust the number of compute resources based on the current demand, ensuring optimal performance and cost efficiency. In the context of FinOps, autoscaling is a prime example of how to effectively leverage the value of the cloud.

The Concept of Autoscaling

In a traditional data center, you need to purchase and provision resources in advance to handle peak loads, often leading to overprovisioning. This means you have to invest in and maintain hardware that may sit idle for most of the time, resulting in wasted resources and higher costs. The cloud, however, eliminates this need. With autoscaling, you can automatically scale your resources up or down based on the workload demand, ensuring you only pay for what you actually use.

Workload Optimization in FinOps

Within FinOps, this practice is referred to as Workload Optimization. Workload Optimization involves a set of practices that ensure cloud resources are:

- Properly Selected: Choosing the right type of resources for your workload.

- Correctly Sized: Ensuring resources are not over- or under-provisioned.

- Only Run When Needed: Shutting down resources when they are not in use.

- Appropriately Configured: Setting up resources to meet performance requirements efficiently.

- Highly Utilized: Maximizing resource utilization to avoid waste.

These practices aim to meet all functional and non-functional requirements at the lowest possible cost.

Why Use Autoscaling?

Your workload is rarely static. Demand can vary significantly based on factors such as:

- Time of Day or Month: Higher workloads at the start or end of the day/month.

- Campaigns or Events: Sudden bursts in demand due to marketing campaigns or special events.

Maintaining maximum resource levels to accommodate these variable workloads would be inefficient and costly. By applying autoscaling, you can automatically adjust resources to meet demand, scaling up during peak times and scaling down when demand decreases. This ensures that you are not paying for idle resources and can save a substantial amount of money.
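To make the savings concrete, here is a small back-of-the-envelope sketch. The hourly rate and demand curve are made-up illustrative numbers, not real Azure prices:

```python
# Illustrative cost comparison: static peak provisioning vs. autoscaling.
# HOURLY_RATE and the demand curve are made up for the example.

HOURLY_RATE = 0.50  # assumed cost per instance-hour

# Hourly demand over one day, in instances needed (peak during office hours).
demand = [2] * 8 + [10] * 9 + [4] * 7  # 24 hourly values

static_cost = max(demand) * HOURLY_RATE * len(demand)  # always provisioned for peak
autoscaled_cost = sum(demand) * HOURLY_RATE            # capacity follows demand

print(f"Static (peak) provisioning: ${static_cost:.2f}/day")
print(f"Autoscaled:                 ${autoscaled_cost:.2f}/day")
print(f"Savings: {1 - autoscaled_cost / static_cost:.0%}")
```

Even with this simple daily pattern, following demand instead of provisioning for peak cuts the bill almost in half.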

Horizontal and Vertical Scaling

There are two major ways to scale: horizontal scaling and vertical scaling. With horizontal scaling, you add or remove instances. This is highly effective when there is no state on the machines, although it might introduce the overhead of managing multiple machines.

Vertical scaling involves adding or removing capacity on a single instance or a group of instances, essentially throwing more hardware at existing systems. Unfortunately, this usually means a redeployment or restart, which can make the system temporarily unavailable.

Combining both methods can be very effective; for example, you can scale down your development environment outside business hours by reducing the number of instances (horizontal scaling) and selecting lower SKUs to further reduce costs (vertical scaling).
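A minimal sketch of that combined approach for a development environment, choosing both an instance count (horizontal) and a SKU (vertical) from the time of day. The SKU names and counts are hypothetical:

```python
from datetime import time

# Hypothetical schedule for a development environment: fewer, smaller
# instances outside business hours. SKU names are illustrative only.

def dev_scale_profile(now: time) -> tuple[int, str]:
    """Return (instance_count, sku) for the given time of day."""
    business_hours = time(8) <= now < time(18)
    if business_hours:
        return 3, "P2v3"   # horizontal: more instances; vertical: bigger SKU
    return 1, "P1v3"       # scale in and down overnight

print(dev_scale_profile(time(10)))  # during business hours
print(dev_scale_profile(time(22)))  # overnight
```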

Scaling in Azure

So let's see what is needed to implement this in Azure. You need some instrumentation, such as CPU usage, queue length, memory, or in its simplest form, a schedule. Next, you need the logic that decides when to invoke the scaler. The scale action should then adjust the component, and you should validate the effect.
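That feedback loop can be sketched in a few lines. The thresholds and the metric reading are stand-ins for whatever your platform actually provides:

```python
# Sketch of the autoscale feedback loop: read a metric, decide, act.
# The 70%/30% thresholds and the simulated CPU reading are assumptions.

def decide(cpu_percent: float, current: int, min_n: int = 1, max_n: int = 10) -> int:
    """Simple threshold rule: scale out above 70% CPU, in below 30%."""
    if cpu_percent > 70 and current < max_n:
        return current + 1
    if cpu_percent < 30 and current > min_n:
        return current - 1
    return current

# One iteration of the loop with a simulated metric reading:
current_instances = 2
target = decide(cpu_percent=85.0, current=current_instances)
print(f"CPU at 85%: scale from {current_instances} to {target}")
```

Real services like Virtual Machine Scale Sets run this loop for you; configuring it well is mostly a matter of picking the right metric and thresholds.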

There are a number of services in Azure that handle this scaling for you, including Azure Virtual Machines using Virtual Machine Scale Sets, Service Fabric, Azure App Service, and Azure Functions. You can set the scaling rules in the UI, using ARM/Bicep, or through the API.

Azure App Service also offers an automatic scaling option, in addition to manual and rule-based scaling.

Configuring the right rules is a complicated process. You need to consider which metrics make the most sense and monitor the effect closely so you can adjust as needed.

Carefully design scale-down rules as well. For example, Event Hubs has an auto-inflate option but lacks an auto-deflate, so you can end up running at the maximum number of throughput units.
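Since auto-inflate only ever goes up, a scheduled job can bring throughput units back down. Here is a sketch of just the decision logic; the 20% headroom is an assumption, and the actual update call depends on your management SDK, so it is left out:

```python
import math

def deflate_target(observed_peak_tu: float, current_tu: int, floor_tu: int = 1) -> int:
    """Return the throughput units to set, given the observed peak usage.
    Keeps ~20% headroom above the peak (an assumed margin), never below the floor,
    and never scales up -- inflate is already handled by the service."""
    needed = math.ceil(observed_peak_tu * 1.2)
    return max(floor_tu, min(current_tu, needed))

# After auto-inflate pushed us to 10 TUs, but last week's peak only needed 3:
print(deflate_target(observed_peak_tu=3, current_tu=10))  # suggests 4
```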

Also, consider different scaling rules for each environment. Your non-prod environment can use more time-based scaling and might even scale down to zero, while production might need a timely start at the beginning of the day and then react to queue length to scale along with demand.
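One way to keep those per-environment differences explicit is to capture them as data. The profile values below (limits, triggers, schedules) are hypothetical:

```python
# Hypothetical per-environment scaling profiles, expressed as data.
PROFILES = {
    "dev":  {"min": 0, "max": 2,  "trigger": "schedule",      # scale to zero off-hours
             "schedule": "Mon-Fri 08:00-18:00"},
    "prod": {"min": 2, "max": 20, "trigger": "queue_length",  # warm start, then reactive
             "prewarm": "daily 07:30"},
}

def limits(env: str) -> tuple[int, int]:
    """Return the (min, max) instance counts for an environment."""
    profile = PROFILES[env]
    return profile["min"], profile["max"]

print(limits("dev"))   # (0, 2)
print(limits("prod"))  # (2, 20)
```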

Conclusion

Scaling, both horizontally and vertically, is an excellent way to optimize costs in the cloud while delivering value. By analyzing your workload and leveraging the automation offered by various Azure services, you can scale up and down according to demand. We haven't touched on KEDA, the framework used by Azure Container Apps to control scaling based on a wide range of metrics, but it's worth noting that it can even scale on signals like datacenter emissions.

Embrace scaling to enhance cost efficiency, but always implement it with automation and continuously monitor the effects to ensure optimal performance and cost savings.

Thanks for reading this week's edition. Share it with your colleagues and make sure to subscribe to receive more weekly tips. See you next time!

P.S. I have another newsletter about GitHub, Azure and dotnet news. Subscribe as well to keep informed:

MindByte Weekly Pulse: Quick GitHub, Azure, & .NET Updates
