One of the things that repeatedly seems to puzzle DRBD users is the question of whether to use internal or external metadata. Remember, DRBD sets aside a small area on a local disk (on every cluster node) where it keeps the Activity Log, the quick-sync bitmap, data generation UUIDs, and a few other bits and pieces for local housekeeping.
The aspect that matters here is the Activity Log (AL). Without going into too much detail, let’s be satisfied with the fact that DRBD “occasionally” (it’s a little more involved in reality) writes to the AL, and has to wait for that write to complete before it can handle user data again. This wait is the crucial point. It’s usually on the order of just a few milliseconds, but on busy systems it can add up to the point where it noticeably throttles throughput.
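To get a feel for how those few milliseconds add up, here is a rough back-of-the-envelope sketch. It assumes, simplistically, that user I/O stalls completely for the duration of every AL write; the function name, the update rates, and the 5 ms wait are made-up illustration values, not measurements and not anything DRBD reports.

```python
def stalled_fraction(al_updates_per_sec: float, wait_ms: float) -> float:
    """Fraction of each second spent waiting on AL writes.

    Simplifying assumption: user I/O is fully blocked while an AL
    write is in flight, and AL writes never overlap. Capped at 1.0.
    """
    return min(1.0, al_updates_per_sec * wait_ms / 1000.0)


if __name__ == "__main__":
    # Hypothetical AL update rates, each with a hypothetical 5 ms wait.
    for rate in (1, 10, 50):
        share = stalled_fraction(rate, wait_ms=5.0)
        print(f"{rate:>3} AL updates/s -> {share:.1%} of the time stalled")
```

At a handful of AL updates per second the stall is lost in the noise; it only becomes visible once the update rate climbs, which is why the effect shows up on busy systems.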
Now, what makes I/O fast or slow (on rotational hard drives; solid state is a different matter)? That’s right, it’s disk seeks. So when we use internal metadata, so the theory goes, the read-write head has to do something in the data area, then move to the AL and do something there, then move back to the data area, and so forth. Intuitively, this can be sped up by putting user data and metadata on different spindles. Different “logical” disks won’t do; the metadata has to be on a separate spindle, so that the read-write heads can move independently. Again, this is how the naïve theory goes. Use external metadata, devise a clever scheme to keep your metadata apart from your user data, and you’ll be fine. And you can call yourself a great wizard in storage subsystem tuning. Well, not quite, unfortunately.
The problem is, you’ve made a crucial mistake in performance tuning. You are completely ignoring the effects of a battery-backed write cache. If, as we always recommend, you use a reasonably useful storage controller, one that comes with a decent write cache and a battery backup unit (BBU), then the whole issue is moot, because you are no longer waiting for actual disk seeks to complete. What you think you are writing into disk sectors actually goes into a piece of controller RAM, and completes pretty much instantaneously. It’s the controller’s job to get this stuff onto stable storage later, and to guarantee that it does so even in the face of a power failure. That’s what the BBU is for. But the whole idea of avoiding disk seeks for metadata writes is pretty much irrelevant now.
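The argument can be put into a toy model. The sketch below is a deliberately crude approximation, not a description of how DRBD or any particular controller behaves: the `Disk` class, the `al_write_penalty_ms` function, and all timing figures are hypothetical values chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Disk:
    seek_ms: float      # assumed average head seek time
    rotation_ms: float  # assumed average rotational latency
    cache_ms: float     # assumed time to land a write in controller RAM


def al_write_penalty_ms(disk: Disk, layout: str, write_cache: bool) -> float:
    """Rough extra latency caused by one AL write (toy model)."""
    if layout not in ("internal", "external"):
        raise ValueError(f"unknown layout: {layout!r}")
    if write_cache:
        # The write completes in controller RAM; head position is irrelevant.
        return disk.cache_ms
    if layout == "internal":
        # Seek to the metadata area, wait for the platter, seek back to data.
        return 2 * disk.seek_ms + disk.rotation_ms
    # External metadata on its own spindle: no seek away from the data area.
    return disk.rotation_ms


if __name__ == "__main__":
    disk = Disk(seek_ms=8.0, rotation_ms=4.0, cache_ms=0.1)  # made-up numbers
    for cache in (False, True):
        for layout in ("internal", "external"):
            penalty = al_write_penalty_ms(disk, layout, cache)
            print(f"cache={cache!s:<5} layout={layout:<8} -> {penalty} ms")
```

Under these assumptions, the two layouts differ only when there is no write cache; as soon as the cache absorbs the write, internal and external metadata cost the same.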
Which means you can scrap your grand user data/meta data distribution scheme and focus on important issues.
Bottom line: if using external metadata actually improves your performance versus internal metadata, you have underlying performance problems to fix. And you should fix those rather than patch them up at the DRBD level.