Tech Tips: Momentum Performance Tuning Tips

by Oleksiy Kovyrin, Senior Technical Operations Engineer, LivingSocial

We’re pleased to present Oleksiy Kovyrin as our first guest columnist for Tech Tips. This article originally appeared on Oleksiy’s blog, and he shared it in the Message Systems LinkedIn group, which is how we first learned about it. If you’re digging into Momentum or any of our products, we urge you to join this smart and highly vocal group. Take it away, Oleksiy:

One of my first tasks as part of the technical operations team at LivingSocial was to figure out a way to make our messaging software perform better and deliver faster. We use Momentum, and it is really fast, but I’m always looking for ways to squeeze as much speed out of our system as possible.

While working on this, I created a set of scripts to integrate Momentum with Graphite for all kinds of crazy stats graphing. Those scripts will be open-sourced soon, but for now I’ve decided to share a few tips about the performance-related changes we made that improved our throughput by at least 2x:

  • Use the EXT2 filesystem for spool storage — After a lot of benchmarking, we noticed that we were doing far too much I/O relative to our throughput. Investigation showed that the EXT3 filesystem we were using for the spool partition incurred very high metadata-update overhead, because the spool consists of a lot of really small files. Switching to EXT2 gained us at least 50-75% additional performance. We gained further performance by turning on the noatime mount option for the spool. Some sources claim that XFS is a better option for spool directories, but we’ve decided to stick with EXT2 for now.
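
    As an illustration, the spool mount might look like the following in /etc/fstab. The device and mount point here are placeholders (the spool path is an assumed default), but the idea is the same: ext2 avoids journal writes, and noatime skips access-time updates on every read.

    # Hypothetical /etc/fstab entry for the Momentum spool partition;
    # adjust the device and mount point for your environment.
    # ext2 avoids journaling overhead; noatime skips access-time updates.
    /dev/sdb1  /var/spool/ecelerity  ext2  noatime  0  2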
  • Do not use the %h{X} macro in your custom logs — Custom logging is an awesome feature of Momentum, and we use it to log our bounces along with some information from the mail headers. Unfortunately, the most straightforward approach (using the %h{X} macro) was not the best option for I/O-loaded servers, because every time Momentum needs to log a bounce, it has to swap the message body in from disk and parse it to get the header value. To solve this, we created a Sieve+ policy script that extracts the headers we need during the initial spooling phase (while the message is still in memory) and puts those values into the message metadata. This way, when we need to log those values, we don’t have to swap the message body in from disk. Here is the Sieve+ script that extracts the header value:

    require [ "ec_header_get", "vctx_mess_set", "ec_log" ];

    # Extract the x-ls-send-id header into the LsSendId context variable
    # (later used in the delivery log)

    ($send_id) = ec_header_get "x-ls-send-id";

    vctx_mess_set "LsSendId" $send_id;


    After this we could use it in a custom logger like this:

    custom_logger "custom_logger1" {
      delivery_logfile = "cluster:///var/log/ecelerity/ls-delivery_log.cluster=>master"
      delivery_format = "%t@%BI@%i@%CI@D@%r@%R@%m@%M@%H@%p@%g@%b@%vctx_mess{LsSendId}"
      delivery_log_mode = 0664
    }


    Editor’s Note: For those who are not familiar with Momentum, it was formerly named Ecelerity, and the product is still designated that way in code and configuration, as in the delivery_logfile path above.

  • Give more RAM to Momentum — When Momentum receives a message, it stores it to disk (as required by the SMTP standard) and then tries to deliver the copy it has in memory; if delivery succeeds, the on-disk copy is unlinked. The problem under a really heavy outbound traffic load is that Momentum needs to keep tons of emails in memory, but by default it can only hold 250. At a load of 250-500 messages per second, this is just too small. To change this limit, we increased the Max_Resident_Active_Queue parameter to 1000000 (after making sure we had enough RAM to hold that many messages if needed) and set Max_Resident_Messages to 0 (which means unlimited). This allows Momentum to keep as many messages resident as possible and reduces the load caused by the swap-in operations required for re-delivery attempts, etc.

    Editor’s Note (IMPORTANT): Max_Resident_Active_Queue and Max_Resident_Messages are advanced settings; changing them requires careful planning and a thorough understanding of how those changes will impact the memory in your system. The values listed above have worked for LivingSocial, but they won’t work for everyone. In particular, the amount of memory available to you, which features you are making use of in the system, and your average message size can all affect the sizing of these two parameters and their impact on the system. We recommend you carefully review the documentation on these settings, and follow up with support on any further questions or concerns you may have before implementing changes to these settings.
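
    As a sketch, the change described above might appear in ecelerity.conf roughly as follows. The values are the ones from this article, not a general recommendation, and you should verify the option names and placement against the documentation for your Momentum version.

    # Hypothetical ecelerity.conf fragment (verify against your version's docs):
    Max_Resident_Active_Queue = 1000000  # default is 250; size your RAM accordingly
    Max_Resident_Messages = 0            # 0 means unlimited resident messages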
  • Choose a proper size for your I/O-related thread pools — In the default Momentum configuration, the SwapIn and SwapOut thread pool sizes are set to 20. Under a really high load, even on our 4x SAS 15k RAID10 array this proved to be too high a value. We reduced those pools to 8 threads each, which lowered I/O contention and improved overall throughput.

In summary, as with any optimization, before tuning your system it really helps to set up as much monitoring for your Momentum servers as possible: Cacti graphs, Graphite (mentioned above), Ganglia or something else — it doesn’t matter. Just make sure you can observe all aspects of your system’s performance and understand what is going on before changing any performance-related settings.
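
For reference, a thread-pool change along these lines might be expressed in ecelerity.conf roughly as shown below. The stanza form and the pool and option names here are assumptions for illustration; consult the thread-pool documentation for your Momentum version for the exact identifiers.

    # Hypothetical ecelerity.conf fragment: shrink the swap-in/swap-out
    # thread pools from the default of 20 to 8 threads each.
    # Pool and option names are illustrative; check your version's docs.
    threadpool "swapin" {
      concurrency = 8
    }
    threadpool "swapout" {
      concurrency = 8
    }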