The Magic Of Autoscaling With Serverless

One of the advantages of Serverless touted by cloud providers is seamless autoscaling. Ever since cloud computing gained traction, autoscaling has been seen as the biggest advantage organizations can leverage to handle unexpected traffic. Autoscaling was a major leap from traditional IT, where capacity planning was critical and resource inefficiencies (and hence increased costs) were the norm. With the success of web applications and the unpredictability of traffic patterns, traditional approaches to scaling servers were not working. In this post, we will talk about how autoscaling works in the cloud and the advantage of Serverless in scaling up and down to meet demand.

Autoscaling in Virtual Machines, Kubernetes and Serverless

With cloud computing and programmatic access to infrastructure, autoscaling became the norm. Users need not worry about whether their applications can meet a sudden spike in demand: the underlying infrastructure can be scaled up to meet the demand without any manual intervention. Once the demand cools down, the resources can be automatically brought down. As a result, resources are used efficiently, with the associated cost savings compared to traditional approaches to scaling.

Autoscaling in the cloud comes in different flavors depending on what type of cloud service you are using:

  • Unlike traditional IT, where servers must be procured in advance to meet scaling needs, the cloud gives programmatic access to scaling up just as demand spikes and scaling down immediately after it subsides. With virtual machines in the cloud, users can use an autoscaling service that automatically adds virtual machines to meet demand, taking into account CPU, memory, and network usage. While these autoscaling services can seamlessly bring up virtual machines and route traffic, they add significant operational overhead for developers, who must ensure that applications and their dependencies scale well in a scale-out architecture. Virtual machines also take a few minutes to boot, so scaling is not instantaneous, and some minimal capacity planning remains critical for high performance. Finally, autoscaling with virtual machines might lead to resource inefficiencies if it is not managed well
  • With containers gaining traction and Kubernetes becoming the de facto container orchestration tool, scaling became much more seamless: Kubernetes uses a declarative model, and developers can simply define their desired end state in a YAML file. Kubernetes takes care of autoscaling, drastically reducing the operational overhead on developers. However, if Kubernetes is deployed on virtual machines in the cloud (say, on top of Amazon EC2), there is still operational overhead in scaling the underlying nodes. Services like AWS Fargate take away the operational complexities of managing the virtual machines, but developers are still faced with YAML complexity to ensure that the Kubernetes environment meets the scaling needs of the application
  • Serverless computing, especially the hosted offerings, takes the pain out of autoscaling and makes it seamless for developers to scale their applications to meet demand. Serverless offerings like AWS Lambda, Azure Functions, Catalyst, and others scale the infrastructure to meet demand without any operational overhead or YAML complexity. Developers can focus on the business logic and code, and the Serverless compute offering will seamlessly handle the scaling. Since compute costs are calculated per invocation and functions shut down automatically after execution, autoscaling is easy enough that someone with zero operational knowledge can handle it. Moreover, resource usage in Serverless is more fine-grained, so the cost savings are much better, without resource waste
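The VM-based autoscaling described above typically boils down to a threshold policy on observed metrics. The sketch below illustrates the idea with a toy scale-out/scale-in rule; the thresholds, cooldown-free behavior, and instance limits are illustrative assumptions, not any provider's defaults.

```python
# Minimal sketch of a threshold-based autoscaling policy, the kind a cloud
# autoscaler applies to a group of virtual machines. All thresholds and
# limits here are made-up illustrative values.

def desired_capacity(current_instances, avg_cpu_percent,
                     scale_out_at=70, scale_in_at=30,
                     min_instances=1, max_instances=10):
    """Return the instance count the group should move toward."""
    if avg_cpu_percent > scale_out_at:
        # Load is high: add a machine, up to the configured ceiling.
        return min(current_instances + 1, max_instances)
    if avg_cpu_percent < scale_in_at:
        # Load is low: remove a machine, down to the configured floor.
        return max(current_instances - 1, min_instances)
    return current_instances

# A traffic spike pushes CPU up and the policy adds instances; when load
# drops, it scales back in instead of leaving idle capacity running.
print(desired_capacity(2, 85))  # scale out -> 3
print(desired_capacity(3, 20))  # scale in  -> 2
print(desired_capacity(2, 50))  # steady    -> 2
```

Note that a real autoscaler layers cooldown periods and health checks on top of a rule like this, which is part of the operational overhead the bullet points describe.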

Autoscaling Patterns in Serverless

With Serverless compute, also known as Functions as a Service, there are some distinct autoscaling patterns that can be used to meet demand. One option is to invoke the function for every request and scale based on request volume. With certain Serverless offerings like AWS Lambda, this pattern runs into the cold start problem, and you need to keep a warm pool to avoid the delay cold starts introduce. But keeping a warm pool costs more money on hyperscale providers like AWS. Next-gen Serverless platforms like Catalyst, Nimbella, IBM Functions, and others optimize resources in the backend to avoid cold start delays.
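The cold start trade-off can be illustrated with a toy model: a request served by a warmed worker skips initialization, while a fresh worker pays the full startup cost. The latency numbers and pool mechanics below are invented for illustration and do not reflect any real platform's measurements.

```python
# Toy model of cold starts vs. a warm pool. The latencies (in ms) are
# illustrative assumptions, not measurements of any real platform.
COLD_START_MS = 400   # runtime + dependency initialization
EXEC_MS = 20          # the handler's own work

class WarmPool:
    def __init__(self):
        self.warm_workers = 0

    def invoke(self):
        """Return the latency of one invocation, reusing warm workers."""
        if self.warm_workers > 0:
            self.warm_workers -= 1
            latency = EXEC_MS                   # warm: skip initialization
        else:
            latency = COLD_START_MS + EXEC_MS   # cold: pay the startup cost
        self.warm_workers += 1                  # worker stays warm for reuse
        return latency

pool = WarmPool()
print(pool.invoke())  # first request is cold -> 420
print(pool.invoke())  # next request reuses the warm worker -> 20
```

The catch, as noted above, is that keeping workers warm means paying for capacity even when no requests arrive.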

Another approach to scaling with AWS Lambda and other Serverless offerings that have the cold start problem is to allow a single function invocation to handle multiple requests. While this mitigates the cold start problem and even saves some money on invocation costs, it doesn’t completely eliminate cold starts. Plus, it adds overhead in how applications must be architected to follow this more complex invocation pattern.
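The multiple-requests-per-invocation pattern can be sketched as a handler that receives a batch of queued requests rather than a single event, amortizing one cold start across many items. The event shape and function names below are hypothetical stand-ins, loosely modeled on queue-triggered batch payloads.

```python
# Sketch of one function invocation handling a batch of requests. The
# event shape is a hypothetical stand-in for a queue-triggered payload.

def process(request):
    # Placeholder business logic: echo an uppercased payload.
    return request["body"].upper()

def batch_handler(event, context=None):
    """One invocation, many requests: loop over the batch."""
    results = []
    for record in event["records"]:
        results.append(process(record))
    return results

event = {"records": [{"body": "order-1"}, {"body": "order-2"}]}
print(batch_handler(event))  # ['ORDER-1', 'ORDER-2']
```

This is where the architectural overhead shows up: upstream producers must batch requests, and the handler must report partial failures sensibly when one item in a batch fails.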

Serverless offers the most efficient way, in terms of both resources and cost, to autoscale to meet demand. When using services like AWS, it is important for developers to understand some of the constraints, such as cold starts and the complexity associated with concurrency. Other platforms are available that eliminate the cold start problem and give you a more straightforward way to autoscale your applications.

Why Should Enterprises Go Serverless?

Most of the discussion in the enterprise user community is focused more on containers than on Serverless computing. Part of the reason is the legacy enterprise mindset, which is still focused on the idea of managing servers. Containers are seen as a natural evolution of the legacy idea of servers. But many enterprises risk missing out on the advantages of Serverless because of this mindset of “clinging to servers”.

Relative to on-premises and private cloud solutions, the public cloud makes it significantly simpler to build, deploy, and manage fleets of virtual servers and to run applications on them. However, companies today have additional options beyond classic server or VM-based architectures when they take advantage of the public cloud. Although the cloud eliminates the need for companies to purchase and maintain their own hardware, any server-based architecture still requires them to architect their applications for scalability and reliability. This adds high development costs to organizations. Plus, companies need to own the challenges of patching and deploying to those (virtual) server fleets as their applications evolve over time. Moreover, they must also handle the scaling of their server fleets to account for peak load and then scale them down after the load returns to normal to lower their costs—all while protecting the experience of end users and the integrity of internal systems. Idle, underutilized servers prove to be costly and wasteful. Analysts estimate that as much as 85 percent of servers in practice have underutilized capacity. By “clinging to servers”, organizations are wasting their resources and not maximizing their move to public cloud.

Serverless compute services like AWS Lambda, Google Cloud Functions, Zoho Catalyst, and Azure Functions are designed to address these challenges by offering companies a different way of handling application design, development, and deployment, one with inherently lower costs and faster time to market. These Functions as a Service offerings eliminate the complexity of dealing with servers or server-based architectures at all levels of the technology stack, and introduce a more fluid pay-per-request billing model with no costs from idle compute capacity. Additionally, these FaaS offerings enable organizations to easily adopt microservice architectures that are more modular and resilient. Eliminating the need to manage infrastructure and moving to a Serverless model offers enterprises dual cost advantages:

  • Problems like idle servers simply cease to exist, along with their economic consequences. A serverless compute service like AWS Lambda is never “cold” in the billing sense, because charges accrue only when useful work is being performed, with millisecond-level billing granularity. Enterprises can offload idle compute costs to the cloud provider, and these savings add up significantly over time
  • Infrastructure management and server operations (including the security patching, deployments, and monitoring of servers) are no longer necessary. This means that it isn’t necessary to maintain the associated tools, processes, and on-call rotations required to support 24×7 server management to ensure uptime. Without the high costs of operations, organizations can direct their scarce IT resources to foster innovation, thereby benefiting the bottom line
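The idle-capacity argument above can be made concrete with rough arithmetic. All the prices and rates below are invented round numbers chosen only to illustrate the comparison; they are not actual cloud list prices.

```python
# Back-of-the-envelope comparison of an always-on server vs. per-request
# billing. All prices are invented round numbers for illustration only.

HOURS_PER_MONTH = 730
SERVER_RATE_PER_HOUR = 0.10          # assumed VM price
UTILIZATION = 0.15                   # ~85% idle, per the analyst estimate above

REQUESTS_PER_MONTH = 1_000_000
AVG_DURATION_MS = 100
RATE_PER_MS = 0.000000002            # assumed per-millisecond compute price

server_cost = HOURS_PER_MONTH * SERVER_RATE_PER_HOUR
useful_server_cost = server_cost * UTILIZATION   # what the work actually needed
serverless_cost = REQUESTS_PER_MONTH * AVG_DURATION_MS * RATE_PER_MS

print(f"always-on server: ${server_cost:.2f}/month "
      f"(only ${useful_server_cost:.2f} doing useful work)")
print(f"serverless, pay-per-request: ${serverless_cost:.2f}/month")
```

Under these assumed numbers, the always-on server bills for every hour regardless of load, while the pay-per-request model bills only for the milliseconds of actual execution; real savings depend entirely on the workload's traffic shape.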

Clearly, moving to Serverless computing can help enterprises save costs, stay agile, and innovate continuously. The biggest roadblock to achieving this is the legacy “clinging to servers” mindset. If enterprises can overcome it and embrace Serverless, they can meet the needs of the modern economy with more confidence.