Performance tuning DRBD setups

These days, we seem to be getting a lot of inquiries from new or would-be DRBD adopters, especially MySQL and PostgreSQL DBAs wanting to add high availability to their master database servers. And these, unsurprisingly, turn out to be two of their most popular questions:

How will using DRBD affect my write performance?

… and …

What are DRBD’s most important tunables with regard to write performance?

Let’s take a look at both of these issues. Ready? Let’s go.

Basically, there’s usually one important potential bottleneck in any DRBD “Protocol C” (synchronous replication) setup, and it’s not the network connection. With ubiquitous Gigabit Ethernet links available to be dedicated to DRBD replication, network latency and throughput become negligible variables. The potential bottleneck is, normally, write I/O on the Secondary.

So don’t buy a crappy box for use as the Secondary. The idea of “oh, I can take a performance hit in case of failover” just doesn’t fly. By using a slow Secondary, you take a major performance hit during normal operation as well, because every synchronous write has to wait for the Secondary to finish it. Don’t. You want your Primary and Secondary to be on equal terms and interchangeable.
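
To put rough numbers on that, here is a deliberately simplified sketch. It assumes the local write and the replicated write proceed in parallel, and that a Protocol C write is acknowledged only once the slower of the two paths has finished. The function name and all latency figures are made up for illustration; they are not measurements.

```python
def protocol_c_write_latency_ms(primary_disk_ms, secondary_disk_ms, network_rtt_ms):
    """Rough estimate of the latency of one synchronous (Protocol C) write.

    Simplifying assumption: the local write and the replicated write run
    in parallel, and the write completes once both of them are done.
    """
    local_path = primary_disk_ms
    remote_path = network_rtt_ms + secondary_disk_ms
    return max(local_path, remote_path)


# Hypothetical figures, for illustration only.
matched = protocol_c_write_latency_ms(5.0, 5.0, 0.2)
slow_secondary = protocol_c_write_latency_ms(5.0, 20.0, 0.2)

print(f"matched nodes:  {matched:.1f} ms per write")
print(f"slow Secondary: {slow_secondary:.1f} ms per write")
```

In this toy model the network round trip barely registers, while the slow Secondary dictates the latency of every single write on the Primary.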

By the way, if your CIO hates the idea of idle hardware, run two cluster services on two servers, so that each server is Primary for one service and Secondary for the other. Just stick an additional NIC in for another dedicated DRBD link, and use separate physical disks for the data used by your services. In that case you still take a performance hit in case of failover, but only in case of failover.

A few basic rules apply for any effort aimed at getting maximum write performance out of DRBD.

First of all, any time is a good time to optimize your I/O subsystem’s write performance. But when you’re thinking about adopting DRBD and HA, it’s usually a particularly good time. You’re probably already in the middle of setting up a testing environment, or have one available already. Why not kill two birds with one stone? Start your optimization efforts without DRBD, and when you’ve found your best setup, add DRBD to the mix.

Secondly, ensure controlled experimentation. Performance tuning is boring and tedious work when done right, so don’t expect to be having a lot of fun. Do record your experimental setups in a reproducible fashion. Do document your findings, including benchmark results. When you find out your perfect setup was experiment number 6 out of the twenty you ran, you want to be able to actually go back to that number 6. Reliably.
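
One low-tech way to keep such a record is an append-only log with one entry per experiment. The following is a minimal sketch of that idea, not a recommendation of any particular tool. The function names, the field names and the JSON Lines format are all assumptions made for this example.

```python
import json
import time
from pathlib import Path


def record_experiment(log_path, number, settings, results, notes=""):
    """Append one experiment as a single JSON line and return the entry."""
    entry = {
        "experiment": number,
        "recorded_at": time.strftime("%Y-%m-%d %H:%M:%S"),
        "settings": settings,  # everything needed to rebuild the setup
        "results": results,    # the benchmark figures you measured
        "notes": notes,
    }
    with Path(log_path).open("a", encoding="utf-8") as log:
        log.write(json.dumps(entry, sort_keys=True) + "\n")
    return entry


def load_experiment(log_path, number):
    """Return the most recent entry recorded under `number`, or None."""
    found = None
    with Path(log_path).open(encoding="utf-8") as log:
        for line in log:
            entry = json.loads(line)
            if entry["experiment"] == number:
                found = entry
    return found
```

A call such as `record_experiment("tuning-log.jsonl", 6, {"al-extents": 257}, {"write_mb_per_s": 80.0})` (file name and values hypothetical) stores the setup, and `load_experiment("tuning-log.jsonl", 6)` brings back exactly what experiment number 6 looked like.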

Thirdly, understand that there’s no silver bullet. There exists no “perfect” set of parameters that makes everyone’s storage system break the speed of light. I/O performance depends on multiple factors that will vary greatly based on hardware characteristics and application requirements.

Finally (and if you’re reading this post, I suppose this is what you’re reading it for), once you’ve made your I/O subsystem lightning fast, and you’ve added DRBD to your setup, here are the most important DRBD options you might like to tweak:

Activity Log size (config option: al-extents): If fast writes are a priority, try setting this higher. You can climb from the default of 127 up to around 4000; prime numbers work best.
Note that the downside of using a higher AL size is longer resync times when a failed node rejoins the cluster. During resync, your service will be fully available, but the resync itself will inevitably eat some of your I/O bandwidth.
Unplug watermark (unplug-watermark): Think of pending I/O requests on the Secondary as water flowing out of a tap, and the request queue as a plugged sink. When the water (I/O requests) hits the watermark, DRBD pulls the plug (kicks the I/O subsystem for request processing) and thus drains the sink (flushes the queue). We’ve seen controllers that like their buffers filled and would rather never be bothered about them, and others that deliver better performance when being “kicked” frequently. Which kind you have is something to find out by experiment.
Maximum I/O request buffers allocated on the Secondary (max-buffers): This is specified in units of memory pages. The default is 2048 (8 MB with 4 KB pages); if your I/O subsystem is fast, you can increase that number.
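
Two bits of arithmetic from the list above can be sketched in a few lines: picking prime candidates for al-extents, and working out how much memory a given max-buffers value stands for. The helper names, the list of rough targets and the 4096-byte page size are assumptions for this example. The highest al-extents value your DRBD version accepts is something to check in its documentation.

```python
def is_prime(n):
    """Trial-division primality test, plenty for numbers this small."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    divisor = 3
    while divisor * divisor <= n:
        if n % divisor == 0:
            return False
        divisor += 2
    return True


def al_extents_candidates(targets):
    """For each rough target, return the nearest prime at or above it."""
    candidates = []
    for target in targets:
        value = target
        while not is_prime(value):
            value += 1
        candidates.append(value)
    return candidates


def max_buffers_memory_mib(max_buffers, page_size_bytes=4096):
    """Memory covered by max-buffers, assuming pages of page_size_bytes."""
    return max_buffers * page_size_bytes / (1024 * 1024)


# Rough targets are arbitrary; each one becomes a separate experiment.
print(al_extents_candidates([127, 250, 500, 1000, 2000, 3000]))
print(max_buffers_memory_mib(2048), "MiB for the default max-buffers")
```

Stepping through a handful of widely spaced candidates, one experiment each, tends to be more informative than nudging a value up by a few extents at a time.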

As a closing note, I need to emphasize that all your performance tuning efforts will be reduced to esoteric witchcraft if you can’t quantify your results. You must be capable of reliably measuring your I/O throughput and latency at all times. If you’re even a little uncomfortable with that (please don’t take this personally), don’t attempt to tune on your own. There are plenty of experts out there who will be happy to help out.
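
As a very small illustration of what quantifying means here, the sketch below times a series of small writes, each forced to disk with fsync, and summarizes latency and throughput. It is a toy under stated assumptions (4 KB blocks, 200 writes, a scratch file in a directory you choose) and no substitute for a proper benchmark suite. The function names are made up for this example.

```python
import os
import statistics
import tempfile
import time


def measure_sync_writes(directory, count=200, block_size=4096):
    """Time `count` writes of `block_size` bytes, each followed by fsync.

    Returns the per-write latencies in seconds. The scratch file is
    created inside `directory` and removed afterwards.
    """
    payload = b"\0" * block_size
    latencies = []
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        for _ in range(count):
            start = time.perf_counter()
            os.write(fd, payload)
            os.fsync(fd)  # force the data to stable storage
            latencies.append(time.perf_counter() - start)
    finally:
        os.close(fd)
        os.unlink(path)
    return latencies


def summarize(latencies, block_size=4096):
    """Condense raw latencies into a few headline figures."""
    total = sum(latencies)
    return {
        "writes": len(latencies),
        "median_ms": statistics.median(latencies) * 1000,
        "max_ms": max(latencies) * 1000,
        "throughput_mib_s": len(latencies) * block_size / total / (1024 * 1024),
    }
```

Pointing `measure_sync_writes` at a directory on the filesystem under test, once without DRBD and once with it, and comparing the two `summarize` results gives a first impression of the synchronous write overhead.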

This entry was posted on Friday, June 22nd, 2007 at 14:38 and is filed under MySQL, PostgreSQL, Technical.

3 Responses to Performance tuning DRBD setups

Maxim says:

June 22, 2007 at 17:20

Such an interesting article and your counsel is really informative.

Talking about benchmarks, Bonnie++ is used in the German article (http://www.pro-linux.de/work/virtual-ha/virtual-ha5.html#ToC15) you talked about in a previous article (https://fghaas.wordpress.com/2007/06/05/high-availability-virtualization-with-xen-and-drbd/).

Have you had any experience with this tool? Is it reliable, from your point of view?

Florian Haas says:

June 22, 2007 at 21:26

Maxim, there are quite a few benchmark suites available. Which one suits you best depends largely on your application. As for “pure” filesystem-centered I/O throughput benchmarks, Bonnie++ is one of your options, as is IOzone. However, other benchmarks may suit you better if you’re tuning a database.
Please don’t hesitate to drop me an email giving some more details of your setup if you’d like to get into this further.


Florian's blog