Why We Chose AWS Over Building a Data Center

SparkPost, better known as Message Systems and Port25, has been in the email infrastructure game for 15 years. Over that time, our software (the MTA that powers our cloud service) has run in the data centers of the world’s top Email Service Providers (ESPs), as well as those of other high-volume senders such as LinkedIn and Twitter.

Why We Chose AWS

When planning the launch of our email API and SMTP service in 2014, we decided not to build out our own data center and instead to build on top of Amazon Web Services (AWS). This choice has given us a lot of flexibility and enabled us to grow our service rapidly. Because we’re not splitting our focus between our core business and the details of operating a data center, we can keep our technical staff focused on building new features and improving the overall customer experience. In early January 2014 we were in beta and handling a few million messages a month. Eighteen months later we are handling tens of billions of messages a month, not just for the tens of thousands of active users on our SparkPost cloud offering, but also for very large SparkPost customers including Pinterest, Zillow, and CareerBuilder.

Our #1 goal is to be the leader in cloud email infrastructure. We have no desire to be experts in data centers, in much the same way that our customers do not want to be experts in running email infrastructure at scale. When you build out a data center, you can by definition only solve the problems you’re facing at that point in time. It can be very difficult to make the adjustments you need as your business and technical needs change rapidly over the following years.

In Good Company

In recent years, companies have increasingly chosen to leverage public cloud infrastructure instead of building out and maintaining data centers. As reported in the Wall Street Journal, Zynga spent $100M in 2011 to move off AWS and build its own data center. By mid-2015, however, the company had reversed course and decided to move back to AWS as part of a $100M cost-cutting effort. “There’s a lot of places that are not strategic for us to have scale and we think not appropriate, like running our own data centers,” Zynga CEO Mark Pincus told investors on a conference call. “We’re going to let Amazon do that.”

When comparing the costs of using public cloud infrastructure with those of building out your own, it is hard to get a true apples-to-apples comparison. The hard costs of each are easy to calculate, but the real challenge is calculating all of the soft costs. Running your own data center requires a large number of highly skilled staff responsible for hardware and networking, redundancy and disaster recovery, data center security, database administration, and hardware lifecycle management. With public cloud infrastructure, these things are taken care of behind the scenes, allowing your engineering team to focus on building out your service.

We let the thousands of experts at AWS manage the infrastructure in a much more reliable, secure, and cost-effective way than we could. For example, we use Amazon’s Availability Zones to easily distribute our servers across their fiber-linked data centers. This provides large high-availability gains that would be hard and very expensive to achieve on our own. And the level of proactive security measures employed by Amazon’s team is a tremendous value that helps us sleep better at night.

The Real Advantages

However, a server-to-server cost comparison misses the point of using the public cloud. The real advantages are not in immediate hardware cost savings. In the cloud, you pay only for what you use, and you can get what you need as soon as you need it. This has made our capacity planning much simpler: as we need more capacity, we add it just in time. No purchase orders are required, and there is no waiting for servers to be shipped and then racked. We can use the AWS console or APIs to spin up new servers or databases in minutes, or let Amazon’s autoscaling capabilities add them automatically.
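
To make the just-in-time idea concrete, here is a minimal Python sketch of the sizing arithmetic behind a decision to add capacity. It is an illustration only, not a description of SparkPost’s actual tooling or of how Amazon’s autoscaling works internally. The function names, the per-instance throughput, and the 25% headroom are all hypothetical values chosen for the example.

```python
import math


def instances_needed(peak_msgs_per_hour: int,
                     msgs_per_instance_hour: int,
                     headroom: float = 0.25) -> int:
    """Return how many instances cover the peak load plus a safety margin.

    All inputs are hypothetical planning numbers, not measured values.
    """
    if msgs_per_instance_hour <= 0:
        raise ValueError("msgs_per_instance_hour must be positive")
    required = peak_msgs_per_hour * (1 + headroom) / msgs_per_instance_hour
    # Always keep at least one instance running.
    return max(1, math.ceil(required))


def scaling_delta(current_instances: int,
                  peak_msgs_per_hour: int,
                  msgs_per_instance_hour: int,
                  headroom: float = 0.25) -> int:
    """Positive result: instances to add now. Negative: instances to release."""
    target = instances_needed(peak_msgs_per_hour, msgs_per_instance_hour, headroom)
    return target - current_instances


if __name__ == "__main__":
    # Hypothetical fleet: 20 instances, each assumed to handle 500,000 messages/hour.
    delta = scaling_delta(current_instances=20,
                          peak_msgs_per_hour=10_000_000,
                          msgs_per_instance_hour=500_000)
    print(f"Adjust fleet by {delta} instance(s)")
```

In a data center, a positive result here would start a purchasing cycle. In the cloud, it becomes a request that can be fulfilled in minutes, and a negative result means we can stop paying for capacity we no longer need.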

This increased speed in provisioning more computing resources was critical to us when Mandrill announced that they were moving away from the transactional email space and recommended SparkPost as an alternative.  We had to add capacity to our platform several times over the course of 2016 to keep up with growing demand.

Netflix provides some good insights into its multi-year transition to AWS, which it completed earlier this year. As Yury Izrailevsky, VP Cloud and Platform Engineering for Netflix, notes, the company realized significant improvements in service availability along with more efficiency due to the economies of scale and improved utilization rates available in the on-demand public cloud. We’ve seen similar service availability benefits from the underlying stability and high availability features of AWS, as well as improved utilization efficiencies.

Focus on Strengths

We can reduce our costs even further by purchasing Reserved Instances (RIs) on one- and three-year terms. For capacity we know we will need on an ongoing basis, this significantly reduces our baseline costs. We still keep the flexibility of on-demand pricing for when we need to spin up a new virtual server to handle peak workloads.
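
The trade-off between reserved and on-demand capacity can be sketched with some simple arithmetic. The Python example below is a rough illustration under stated assumptions: the hourly prices (in cents) and the demand profile are made up for the example and are not actual AWS prices or SparkPost workloads. It also simplifies by treating a reservation as a flat hourly charge that is paid whether or not the instance is used.

```python
def fleet_cost_cents(hourly_demand: list[int],
                     reserved_count: int,
                     reserved_rate_cents: int,
                     on_demand_rate_cents: int) -> int:
    """Total cost of serving a demand profile with a mix of pricing models.

    hourly_demand lists the number of instances needed in each hour.
    Reserved instances are billed every hour; any demand above the
    reserved count is served by on-demand instances for that hour only.
    """
    hours = len(hourly_demand)
    reserved_cost = reserved_count * reserved_rate_cents * hours
    overflow_instance_hours = sum(max(0, need - reserved_count)
                                  for need in hourly_demand)
    return reserved_cost + overflow_instance_hours * on_demand_rate_cents


if __name__ == "__main__":
    # Hypothetical day: a baseline of 10 instances for 20 hours,
    # then a 4-hour peak that needs 16 instances.
    demand = [10] * 20 + [16] * 4
    # Hypothetical prices: 6 cents/hour reserved, 10 cents/hour on demand.
    for reserved in (0, 10, 16):
        cost = fleet_cost_cents(demand, reserved, 6, 10)
        print(f"{reserved:>2} reserved: {cost} cents/day")
```

With these made-up numbers, reserving only the baseline and covering the peak on demand costs less than either running everything on demand or reserving enough for the peak. That is the pattern described above: reserve what you know you will need, and pay on-demand rates only for the spikes.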

Shifting focus from data centers to building the leading cloud email service is one of the best decisions we’ve made. It greatly reduced our time to market and continues to help fuel our ongoing agility in meeting customer needs. We have found AWS to be a great partner with fantastic technology and services. I encourage you to read more about how we use AWS in our blog post about our Technical Operations stack. If you have any questions about our experience with AWS, please reach out on Twitter or our community Slack channel.

-Chris McFadden
VP Engineering, SparkPost