Machine learning (ML) is becoming popular across organizations of all sizes, driven by the desire to gain insights from the data they collect from their customers and across the organization. Whether you are a large organization collecting data using sensors and other IoT devices or a small business storing customer and employee information in SaaS applications, machine learning can help you leverage the insights hidden in that data. Even though many open-source frameworks help developers build ML models, deploying those models and making them available to users is not an easy task. It usually involves running virtual machines or containers, which are expensive and come with high operational overhead. Even fully managed solutions like Amazon SageMaker are expensive for many ML/AI use cases.
Serverless Computing services can help developers and organizations deploy machine learning models more cost-effectively. Instead of running a virtual machine or a cluster of containers 24/7 and handling the operational overhead, you can run the model only when it is invoked. With Functions as a Service (FaaS), deploying machine learning models or calling AI services is straightforward and free of most operational overhead, and users pay only for function invocations. While Serverless Computing may appear to be a no-brainer for ML and deep learning workloads, it is not as simple as it appears, and the nuances matter.
Machine Learning with FaaS
Functions as a Service is useful for deploying machine learning models that produce a straightforward prediction for each input. Some examples of machine learning with FaaS are:
- Image recognition or Object recognition
- Face detection
Similar lightweight use cases are also a good fit. In addition, FaaS can be used with an API gateway to provide API access to your machine learning models, which gives you an easy on-ramp to offering ML models as a service to end users.
The basic architecture for hosting a machine learning model is pretty simple. A CDN or API gateway sits in front of the function. Object storage or some other datastore (depending on the purpose of your machine learning model) holds the input for the function. The FaaS offering (AWS Lambda with Lambda layers to package the necessary libraries, Azure Functions, Zoho Catalyst, etc.) executes the function hosting the model. Object storage or another datastore hosts the model itself and the frontend. Even though this architecture is too simplistic for deploying complex ML and deep learning models, it gives an idea of how easy it is to deploy ML models on FaaS.
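To make the architecture above more concrete, here is a minimal sketch of the function piece in Python. It assumes an AWS Lambda style entry point behind an API gateway that passes the request as a JSON string in `event["body"]`. The model is a hypothetical three-feature linear scorer standing in for a real artifact that would normally be fetched from object storage, and the names `load_model`, `handler`, and `features` are illustrative choices, not part of any provider's API.

```python
import json

# Cached at module level so that warm invocations of the same container
# reuse the model instead of loading it again.
_MODEL = None

# Assumption: a tiny linear scorer stands in for a trained model whose
# weights would normally be downloaded from object storage.
_WEIGHTS = [0.4, -0.2, 0.1]
_BIAS = 0.05


def load_model():
    """Return the cached model, building it on the first call only."""
    global _MODEL
    if _MODEL is None:
        def predict(features):
            return sum(w * x for w, x in zip(_WEIGHTS, features)) + _BIAS
        _MODEL = predict
    return _MODEL


def _response(status, payload):
    """Build a gateway-style HTTP response with a JSON body."""
    return {
        "statusCode": status,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(payload),
    }


def handler(event, context):
    """Parse the request body, run the model, and return the prediction."""
    try:
        payload = json.loads(event.get("body") or "{}")
        features = [float(x) for x in payload["features"]]
    except (KeyError, TypeError, ValueError):
        return _response(400, {"error": "expected JSON with a 'features' list"})

    if len(features) != len(_WEIGHTS):
        return _response(400, {"error": f"expected {len(_WEIGHTS)} features"})

    return _response(200, {"score": load_model()(features)})
```

Keeping the model in a module-level variable is a common way to avoid reloading it on every invocation. Only the first request served by a new container pays the loading cost.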
Using FaaS for ML/AI applications makes sense when the model is lightweight or the function simply wraps a call to an AI service like Amazon Rekognition, Zoho Object Detection, or Google AutoML. If your ML application has sparse or limited traffic, it makes complete sense to host it on FaaS, as the cost of serving the model will be pennies. If your application requires quick scaling, FaaS can come in handy, as functions can be invoked concurrently and the model scales without any operational overhead. Clearly, ML/AI with FaaS is a viable option for most small businesses wanting to tap machine learning for their business needs.
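The cost argument can be checked with some quick arithmetic. The sketch below assumes a billing model in which the provider charges per request and per GB-second of execution, and compares that against a flat monthly price for an always-on virtual machine. All prices in the example are illustrative placeholders, not quotes from any provider, and the function names are hypothetical.

```python
def faas_monthly_cost(invocations, avg_duration_s, memory_gb,
                      price_per_gb_second, price_per_million_requests):
    """Estimate a monthly FaaS bill: compute time plus per-request fees."""
    compute = invocations * avg_duration_s * memory_gb * price_per_gb_second
    requests = invocations / 1_000_000 * price_per_million_requests
    return compute + requests


def break_even_invocations(vm_monthly_cost, avg_duration_s, memory_gb,
                           price_per_gb_second, price_per_million_requests):
    """Monthly invocation count at which FaaS costs as much as the VM."""
    per_invocation = (avg_duration_s * memory_gb * price_per_gb_second
                      + price_per_million_requests / 1_000_000)
    return vm_monthly_cost / per_invocation


if __name__ == "__main__":
    # Placeholder prices, chosen only to illustrate the calculation.
    pricing = dict(price_per_gb_second=0.0000167,
                   price_per_million_requests=0.20)
    workload = dict(avg_duration_s=0.2, memory_gb=0.5)

    sparse = faas_monthly_cost(invocations=10_000, **workload, **pricing)
    threshold = break_even_invocations(vm_monthly_cost=30.0,
                                       **workload, **pricing)
    print(f"sparse traffic: ${sparse:.4f} per month")
    print(f"break-even: {threshold:,.0f} invocations per month")
```

With these placeholder numbers, 10,000 short invocations a month cost about two cents, and the always-on VM only becomes cheaper once traffic reaches millions of invocations a month. Real results depend entirely on the actual prices, memory size, and latency of your model.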
However, FaaS is not the right solution for all machine learning/deep learning models:
- FaaS platforms cap the compute and memory available to a function, and most providers also limit execution time, so FaaS is not suitable for some heavy-duty ML and deep learning jobs. While one could use nested functions to handle some of the more complex models, it is fair to say that FaaS is mostly suitable for lightweight machine learning models. Most deep learning models are not a good fit for FaaS.
- Deep learning models that require GPUs are not supported by FaaS providers right now
- FaaS characteristics such as cold starts and concurrency limits could also restrict the use of FaaS for certain ML workloads
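The limitations above can be turned into a simple pre-deployment checklist. The following sketch compares a model's resource profile against a platform's limits and lists every reason the model would not fit. The `PlatformLimits` and `ModelProfile` types, their fields, and the numbers in the usage example are all hypothetical; check your provider's documented quotas for real values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlatformLimits:
    """Limits of a FaaS platform (values are supplied by the caller)."""
    max_memory_mb: int
    max_timeout_s: float
    max_package_mb: int
    gpu_available: bool


@dataclass(frozen=True)
class ModelProfile:
    """Measured resource needs of a model."""
    peak_memory_mb: int
    worst_case_latency_s: float  # should include cold-start model loading
    artifact_size_mb: int
    needs_gpu: bool


def faas_blockers(model, limits):
    """Return the reasons the model does not fit; empty list means it fits."""
    blockers = []
    if model.peak_memory_mb > limits.max_memory_mb:
        blockers.append("peak memory exceeds the function memory limit")
    if model.worst_case_latency_s > limits.max_timeout_s:
        blockers.append("worst-case latency exceeds the execution time limit")
    if model.artifact_size_mb > limits.max_package_mb:
        blockers.append("model artifact exceeds the deployment package limit")
    if model.needs_gpu and not limits.gpu_available:
        blockers.append("model needs a GPU but the platform offers none")
    return blockers


if __name__ == "__main__":
    # Hypothetical limits and profile, for illustration only.
    limits = PlatformLimits(max_memory_mb=3072, max_timeout_s=30,
                            max_package_mb=250, gpu_available=False)
    classifier = ModelProfile(peak_memory_mb=512, worst_case_latency_s=2.0,
                              artifact_size_mb=100, needs_gpu=False)
    problems = faas_blockers(classifier, limits)
    print("fits on FaaS" if not problems else "\n".join(problems))
```

A check like this does not cover concurrency limits, which depend on traffic patterns rather than on the model itself, so those still need to be evaluated separately.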
In conclusion, even though Serverless Computing is a good fit for many machine learning and AI workloads, it is important to understand its limitations and use the service with the right workloads. When done right, Serverless Computing offers dramatic cost savings due to the efficient use of resources and the lack of operational overhead.