From On-Prem to Continuous Deployment

These days, we have our heads in the clouds. Or the cloud, rather, as we engineer a fast-moving cloud service and deploy software dozens of times each week. SparkPost helps customers send billions of emails each month using our cloud APIs.

But that has not always been the case.

SparkPost began as Message Systems, which for over seven years specialized in an on-premises commercial software product called Momentum. If you’re unfamiliar with this legendary messaging platform, Momentum powers the email infrastructure of many large senders, including Twitter, LinkedIn, Comcast, and our own cloud email delivery service, SparkPost. Our journey to the cloud has taken us from quarterly release cycles of Momentum to continuous delivery of SparkPost. We safely and quickly deploy new features and fixes to production as soon as they are ready. And now we are ready to gradually remove that final manual push-button deployment step for most services and transition more fully to continuous deployment.

We decided to provide a cloud service to take advantage of the email market's growth and its transition to cloud computing. It was a fantastic opportunity for us to bring the power of Momentum to a broader developer audience. Our cloud service has brought the company meteoric growth in 2016. We have now completed our transformation to a cloud-first engineering team and company, which required a significant evolution of both our infrastructure and our organization.

Here’s what worked and what didn’t on our DevOps journey towards continuous deployment.

Agile Adoption

In early 2014 we were a little less than a year into the adoption of agile engineering practices. These included user stories, sprint planning and reviews, daily stand-ups, retrospectives, peer reviews, continuous integration, and unit and acceptance test automation. We also started using Atlassian’s tool suite of JIRA, Bamboo, and Confluence. Not long before that we had split up into smaller development teams, one of which focused on building RESTful microservices, advanced analytics capabilities, and a new JavaScript client-side Web UI. This new team, made up of mostly new staff members, incorporated automated tests and continuous integration into their daily practices from the beginning. Over the next two years they became the primary adopters and evangelists of agile development and DevOps for the rest of the organization.

Meanwhile, the core Momentum development team was also building RESTful APIs to support templates and message generation. All of this new functionality was in support of Momentum v4, the next major release of our on-premises product, and would prove to be an excellent API-first foundation on which to build our cloud service when the time came. The core Momentum team gradually adopted a more agile workflow, which was no small feat considering the size and maturity of the code base. What a huge improvement over the prior days of development throwing code over the wall to QA. However, build and test cycles were still measured in days and weeks.

To the Cloud!

Our Managed Cloud service, essentially Momentum hosted in AWS, launched in mid-2014. We did this under the assumption that our Tech Ops team could build out a customer environment in AWS and then install Momentum, just as any on-premises customer would. We targeted this offering at our traditional enterprise customers who did not want to operate the Momentum email infrastructure themselves. The newly formed Tech Ops team consisted of former Support and Remote Management team members and was separate from Engineering at the time. With little initial AWS experience, they did a great job building what was our first generation of AWS infrastructure.

We chose AWS because it allowed us to get going quickly and provided flexibility that would not have been available had we built out our own data center. Nevertheless, we borrowed heavily from a normal data center approach, especially when it came to networking, since that is what our team had the most experience with. Additionally, we treated our managed cloud business as just another customer, albeit an important one, without making any fundamental changes to the underlying product or how we built, shipped, and deployed it. As we rapidly added more features, it wasn’t long before we realized that this approach was problematic. Disconnects between the dev teams and the operations team resulted in inefficiencies. The traditional on-premises installation and upgrade methods were not compatible with a rapidly changing cloud service.

A Startup Within a Startup

Meanwhile, that same summer we formed a small team focused on delivering the beta release of our as-yet-unnamed public cloud service targeted at developers. This team included a handful of application developers, along with a few engineers from the Momentum and Tech Ops teams. We took the approach of “a startup within a startup” to ensure focus on the mission and avoid distractions or blockers from the core on-prem enterprise business. This team built out our second-generation AWS environment based on lessons learned from Managed Cloud. Collaboration between development and operations improved. Developers now deployed code on their own (manually) and provided more guidance on the infrastructure.

To bring this new service to life, the app dev team designed and built a new Web UI with the help of the awesome UX service provider Intridea. The team followed a very lightweight Kanban process with little overhead. By September we settled on the name “SparkPost” and began to sign up beta users while we readied things for our official beta launch at our user conference later that year.

Part 2 of this series will focus on how we adopted continuous delivery and deployment automation to become more nimble. If you have any questions or comments about our DevOps journey please don’t hesitate to connect with us on Twitter – and we are always hiring.

Chris McFadden

VP Engineering and Cloud Operations
