Usage-based billing is a popular payment model in cloud-based SaaS platforms: you pay only for the resources your project actually consumes, which is great for companies that can measure and predict their usage.
It’s like getting a glass of water from your kitchen sink. Every time you open the faucet, you know your water bill will go up a few cents (in reality far less, but bear with the example). In this analogy, the faucet is the cloud platform, and the glass represents your API requests.
One of the most common ways people end up with exorbitant usage-based bills is through critical mistakes in their API request code. For example, your neighbor asks for a glass of water, and you fill 10 glasses. Or you forget to turn the faucet off after filling the glass. This is very much what can happen with usage-based billing on SaaS platforms, especially with cloud-based API requests.
An expensive example of data requests gone wrong
A crowdfunding app in Colombia known as Vaki was shocked to receive a $30,000 USD bill from Google Firebase, due to some pretty simple mistakes. The company had rushed their product to launch, then noticed the app was running incredibly slowly after a version merge.
Vaki had two choices: take down the website, or try to debug the app in a race against the money clock. They chose the latter, and discovered that their API was written so that, for every visitor, it read every payment document in the collection just to display the number of supporters of a campaign and the total collected, on every page of the app.
In other words, they’d sent over 40 billion requests to Firestore in less than 48 hours. It was an immensely costly mistake, although they managed to work it out with Google in the end. Part of the problem is that they were working on the code while the site was already live, when API testing tools might have helped them find the problem before launch.
It is very important that tech teams audit every request to their servers before release. Check whether the number of requests and the volume of data transferred make sense, and whether your company can support the hosting costs under a heavy load of traffic. Otherwise, you’ll only discover loops and unoptimized requests when a huge bill arrives, or when your site goes down.
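One lightweight way to audit requests before release is to wrap your database client in a counting test double, then assert in a unit test that rendering a page stays within a read budget. The sketch below uses a plain dict instead of a real database, and all names (`CountingDB`, `render`) are illustrative, not Vaki’s or Firebase’s code:

```python
class CountingDB:
    """Test double that counts reads, so a unit test can assert
    a page render stays within its read budget."""

    def __init__(self, docs):
        self.docs = docs
        self.reads = 0

    def get(self, key):
        # Every document fetch increments the counter.
        self.reads += 1
        return self.docs[key]


def render_page(db):
    # A well-behaved page view should need a single document read.
    return db.get("campaign:1")


db = CountingDB({"campaign:1": {"total": 40, "supporters": 3}})
render_page(db)
# Fail the test suite loudly if the read budget is blown.
assert db.reads <= 1, f"read budget blown: {db.reads} reads"
```

A test like this would have flagged Vaki’s bug the moment a page render started issuing hundreds of reads instead of one.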
What exactly can you learn from this?
While cloud providers like Amazon Web Services, Google Cloud, and Microsoft Azure offer pay-per-hour models, Firebase bills based on the number of read/write/delete requests to the database, in tiers of roughly 100k – 250k requests. So as long as you stay within that range, your bill shouldn’t amount to more than $25.
But this also means your code needs to be free of mistakes and runaway loops. What happened with Vaki is that, every time a user opened a particular view, the app read every document in a collection in order to calculate the total collected and the number of supporters.
In other words, if a crowdfunding campaign had 100 donations, a single page view would trigger one call to the database, but that call would read all 100 documents once for each of the two aggregates, for a total of 200 read requests.
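The arithmetic behind that bill is worth making explicit. Here is a back-of-the-envelope sketch (the function names are mine, not Vaki’s code) of how per-view reads multiply when every displayed aggregate scans the whole payments collection:

```python
def reads_per_view(num_donations, aggregates=2):
    """Each page view scans the whole payments collection once per
    displayed aggregate (total collected, supporter count)."""
    return num_donations * aggregates


def total_reads(num_donations, page_views, aggregates=2):
    """Billed reads across all views of one campaign."""
    return reads_per_view(num_donations, aggregates) * page_views


# A campaign with 100 donations costs 200 reads per single view...
print(reads_per_view(100))            # 200
# ...and 2 million views of it cost 400 million reads.
print(total_reads(100, 2_000_000))    # 400000000
```

Multiply that across every campaign on a busy site, and the 40-billion-request figure stops looking mysterious.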
Obviously, when a campaign hits something like 2 million views, the read requests to the database balloon into the hundreds of millions, and across an entire site, into the billions. And to Google’s credit, this isn’t a get-money scheme cooked up by them, because it’s very easy to stay within the target metrics.
In Vaki’s case, it was a simple human error: a data structure that wasn’t optimized for efficiency. Left unfixed, they would eventually have run into other problems as well, such as maxed-out database connections.
How to keep usage-billing to a minimum
Hosting an app today is fairly cheap, but database costs can quickly grow out of control if requests aren’t optimized. Of all the traffic flowing between the different parts of your application, read and write requests are by far the most frequent.
So even if the front end is fast to boot, it’s your backend that needs scrutiny, especially since serverless architectures typically bill per operation rather than giving you a fixed-cost backend session to absorb the load. As the backend requires more storage and processing power, your overall bill increases.
There are a few key takeaways from all this. One is that you need to rethink how your data model affects the total number of reads and writes, and how much processing power is required, which basically means optimizing your code for efficiency and long-term stability.
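In practice, the standard fix for Vaki’s pattern in Firestore-style databases is denormalization: keep a running total and supporter count on the campaign document itself, updated at donation time, so a page view costs one read instead of one per donation. A minimal sketch, using a plain dict as a stand-in for the campaign document (field names are illustrative; in real Firestore the update would run inside a transaction or use a distributed counter):

```python
# Stand-in for a single campaign document in the database.
campaign = {"total_collected": 0, "supporters": 0}


def record_donation(doc, amount):
    # Update the aggregates once, at write time...
    doc["total_collected"] += amount
    doc["supporters"] += 1


def render_campaign(doc):
    # ...so every page view needs exactly one document read.
    return doc["total_collected"], doc["supporters"]


for amount in (10, 25, 5):
    record_donation(campaign, amount)

print(render_campaign(campaign))    # (40, 3)
```

The trade-off is a slightly more expensive write per donation in exchange for reads that no longer scale with the number of donations, which is exactly the right trade when views vastly outnumber donations.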
The other takeaway is the importance of testing your API before launch, so that you can catch the sort of flaws that Vaki experienced before the product actually goes live.