This article contains everything you need to know about serverless. I've updated it with the latest trends and information for 2021.

There has been some disagreement in tech circles about what serverless actually is, so let's start there.

What is serverless?

Serverless computing (or serverless for short) is a way for developers to execute code without having to deal with the complexities of a server. The developer simply provides code to be run, and the serverless platform manages deployment, resource allocation, routing and execution.
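
To make "the developer simply provides code" concrete, here is a minimal sketch of a serverless function in the AWS Lambda handler style (the `event`/`context` signature is that platform's convention; other providers differ). Everything outside this function — servers, routing, scaling — is the platform's job:

```python
import json

def handler(event, context):
    # The platform invokes this once per request; we never see a server.
    name = (event.get("queryStringParameters") or {}).get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }
```

You upload just this function, and the provider wires incoming requests to it.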

Of course, a server still exists within the serverless provider's infrastructure, but from the perspective of the developer, it is serverless, as they do not see or interact with any server.

What is a serverless service?

A service describes a co-located group of code that is compiled and run as one unit. Usually a service will expose an HTTPS endpoint, which can be used to trigger different parts of the code based on the URL path, parameters and headers. Serverless services are the same as normal services, except that they run in a serverless environment.
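
A hypothetical sketch of that idea: one deployable unit exposing several pieces of functionality, dispatched by URL path (the route names here are illustrative, not any provider's API):

```python
def get_user(params):
    return {"status": 200, "body": f"user {params.get('id')}"}

def health(params):
    return {"status": 200, "body": "OK"}

# One service, one deployment: multiple handlers selected by URL path.
ROUTES = {
    "/users": get_user,
    "/health": health,
}

def service(path, params=None):
    route = ROUTES.get(path)
    if route is None:
        return {"status": 404, "body": "not found"}
    return route(params or {})
```

The whole map is compiled and deployed as a single unit, whether it runs on your own server or a serverless platform.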

How is it triggered?

Serverless code is usually triggered in response to an HTTPS request, but some providers offer other options, such as via a message queue or another event system. Most serverless platforms cannot be triggered at a regular interval or with a delay, but this can be achieved using external systems like Ralley.
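
For contrast with the HTTP case, a queue-triggered function receives messages rather than requests. This is a generic sketch (the message fields are made up for illustration); many providers deliver messages in batches:

```python
def handle_message(message):
    # Called once per queue message instead of once per HTTP request.
    return f"processed order {message['order_id']}"

def on_queue_batch(messages):
    # Providers often hand the function a batch of messages at once.
    return [handle_message(m) for m in messages]
```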

How does it scale?

Serverless providers scale your service based on the number of requests it receives. They can do this because they control deployment and resource allocation, and can therefore dynamically scale your service up (by deploying the code to more of their servers) or down (by removing it from their servers) in response to the number of execution requests.
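
The core of that scaling decision can be sketched in a few lines: given the current request rate and how many requests one instance can absorb, how many instances should be running? (The capacity figure is an assumption; real platforms also factor in concurrency, memory and startup time.)

```python
import math

def instances_needed(requests_per_second, capacity_per_instance):
    # No traffic means no instances - this is what allows scale-to-zero.
    if requests_per_second <= 0:
        return 0
    return math.ceil(requests_per_second / capacity_per_instance)
```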

Is serverless scaling infinite?

Yes, most providers can scale services effectively without limit, but there are sometimes restrictions on how quickly new accounts can scale up. This is to prevent abuse, or to prevent customers from accidentally over-scaling and receiving unexpectedly large bills. If you're expecting a very sudden increase in traffic, you may want to reach out to your provider to make sure any limits are adjusted or removed.

What is a cold start?

A cold start occurs when a serverless provider has scaled your service down to 0 (because there have been no execution requests recently). Since the code is not deployed on any of their servers, there will be a significant delay before your code executes while they provision and deploy it. This can take anywhere from 5 to 30 seconds, depending on the provider.

How can I prevent a cold start?

More recently, some providers, such as GCP Cloud Run, let you set a minimum number of instances, so your service is never scaled down to 0. Another solution is to use a service like Ralley to send a dummy request (e.g. to an endpoint that simply returns OK) to your service every minute, ensuring the code is not scaled down to 0.
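
The keep-warm approach boils down to a loop like the following sketch. The `ping` callable stands in for whatever actually hits your endpoint (e.g. an HTTP GET against a health route), and the sleep function is injected so the loop is easy to test; this is not any particular service's API:

```python
import time

def keep_warm(ping, interval_seconds=60, iterations=None, sleep=time.sleep):
    # Call ping() forever (or a fixed number of times) so the provider
    # always sees recent traffic and never scales the service to 0.
    sent = 0
    while iterations is None or sent < iterations:
        ping()
        sent += 1
        if iterations is None or sent < iterations:
            sleep(interval_seconds)
    return sent
```

In practice you would run this from a scheduler rather than a long-lived process, which is exactly the gap services like Ralley fill.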

How are you charged for serverless?

This will vary by provider, but in most cases you are charged only for the CPU and memory time used to execute your code. That means if your code is not executed, you will not be charged.
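
A common billing unit for this model is the GB-second (memory allocated times execution time). A rough cost estimate looks like the sketch below; the default rate is illustrative, in the ballpark of AWS Lambda's published per-GB-second price, so check your own provider's pricing page:

```python
def invocation_cost(memory_gb, duration_seconds, invocations,
                    price_per_gb_second=0.0000166667):
    # Pay-per-use: zero invocations means a zero bill.
    return memory_gb * duration_seconds * invocations * price_per_gb_second
```

For example, a million invocations of a 0.5 GB function that runs for 0.2 seconds consumes 100,000 GB-seconds, which at this rate comes to well under two dollars.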