Why DRBD won’t let you mount the Secondary

As I’m sure you’re aware, DRBD disallows any access, including read-only access, to a DRBD device in Secondary mode. This always raises questions like the one I’ve taken the liberty of quoting here. It came up in a MySQL webinar on replication and HA:

Because of the asynchronous nature of [MySQL] replication we end up with a dilemma when looking at using slaves as read nodes in that the only time we go to the database for information is to build a local cache file, and that local cache file is ONLY removed when information related to that cache file changes, it is NOT based on time. If we had a synchronous method of replication we would then know the cache files were always getting the right information, but because of the asynchronous nature we are prone to old data that never gets invalidated.

One thought I had was to see if using the “backup” DRBD node as a read only type filesystem might accomplish this. I haven’t looked into it much but I know we have DRBD running on a shared storage system now and it only seems that we can mount one of the volumes at a time.

The short version of this and related questions is: “I have a really good reason to mount the Secondary read-only, but DRBD won’t let me. Why is that so?”

Let’s get into this briefly.

A filesystem is typically ignorant of what sort of block device lives underneath it: it may be an ordinary partition, a software RAID volume, an LVM LV or EVMS volume, DRBD, any combination of these, or anything else that implements the kernel block device interface. However, any conventional (non-cluster) filesystem will assume that it is the only thing that accesses a particular block device, that all I/O to the block device occurs through it, and that nothing else will modify the on-disk data while it’s mounted.

Now, think of what would happen if you mounted, say, an ext3 filesystem off DRBD in Secondary mode, even just read-only, and a write occurred to that filesystem on the Primary:

  1. The application issues a write to the filesystem on the Primary.
  2. The filesystem translates that write to block I/O on the DRBD device.
  3. DRBD pushes that write to its underlying backing device, and replicates the written blocks over to the Secondary.
  4. The Secondary now has a modified block device underneath a mounted filesystem.
  5. The filesystem is unaware that blocks may be modified underneath it by anything other than itself, so whatever metadata and data it has already cached no longer match what is on disk, and it has no way to deal with this modification.
  6. The filesystem is now grossly inconsistent until it is remounted.

So even if DRBD allowed read-only mounts on the Secondary, all that the Secondary would serve you is garbage. That’s why it’s disallowed.
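
To make the sequence above concrete, here is a small toy simulation in Python. It is only a sketch under heavy assumptions: BlockDevice, ReplicatedDevice and ToyFS are hypothetical classes invented for illustration, the "filesystem" keeps a single directory block that it caches at mount time, and none of this resembles how ext3 or DRBD are actually implemented. It does show the core problem, though: replicated block writes arrive underneath a mounted filesystem whose cached view is never refreshed.

```python
class BlockDevice:
    """A trivial in-memory block device (hypothetical, for illustration)."""
    def __init__(self, nblocks):
        self.blocks = [b""] * nblocks

    def write(self, n, data):
        self.blocks[n] = data

    def read(self, n):
        return self.blocks[n]


class ReplicatedDevice(BlockDevice):
    """Stands in for the Primary: write locally, then copy to the peer."""
    def __init__(self, nblocks, peer):
        super().__init__(nblocks)
        self.peer = peer

    def write(self, n, data):
        super().write(n, data)
        self.peer.write(n, data)   # "replication" to the Secondary


class ToyFS:
    """Toy filesystem: block 0 is the directory, cached once at mount."""
    def __init__(self, dev):
        self.dev = dev
        self.directory = self._load()   # cached view, never refreshed

    def _load(self):
        raw = self.dev.read(0).decode()
        entries = (e.split(":") for e in raw.split(",") if e)
        return {name: int(blk) for name, blk in entries}

    def _flush(self):
        raw = ",".join(f"{n}:{b}" for n, b in self.directory.items())
        self.dev.write(0, raw.encode())

    def create(self, name, data):
        used = set(self.directory.values())
        # Allocate the lowest-numbered free data block.
        blk = next(n for n in range(1, len(self.dev.blocks)) if n not in used)
        self.dev.write(blk, data)
        self.directory[name] = blk
        self._flush()

    def delete(self, name):
        del self.directory[name]
        self._flush()

    def read(self, name):
        return self.dev.read(self.directory[name])


secondary_dev = BlockDevice(4)
primary_dev = ReplicatedDevice(4, peer=secondary_dev)

primary_fs = ToyFS(primary_dev)
primary_fs.create("a", b"alpha")

# Pretend DRBD allowed this: "mount" the Secondary read-only.
secondary_fs = ToyFS(secondary_dev)

# The Primary keeps working; block 1 is freed and then reused for "b".
primary_fs.delete("a")
primary_fs.create("b", b"bravo")

stale = secondary_fs.read("a")        # file "a" no longer exists on the Primary
print(stale)                          # b'bravo'
print("b" in secondary_fs.directory)  # False

remounted = ToyFS(secondary_dev)      # a fresh mount sees the real state
print(remounted.directory)            # {'b': 1}
```

The Secondary's stale mount happily returns the contents of a different file under the old name and cannot see the new file at all. A real filesystem caches far more than one directory block, so the possible failure modes are correspondingly worse.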

If you do want access to the device from both nodes, use DRBD 8 in allow-two-primaries mode. No one forces you to build a real dual-master setup from the application’s point of view: one node can always be the “real” master, with the other acting like a slave. However, you must make sure your view of the data is consistent on both nodes, either via a shared cluster filesystem and distributed lock manager (if your application supports that), or perhaps via some other facility your application may offer.
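
For reference, turning on dual-primary mode is a small change to the resource configuration. The fragment below is a sketch in DRBD 8.3-style syntax. The resource name r0 is a placeholder, and the rest of the resource definition (hosts, devices, disks) is left out. Check the drbd.conf man page for your version, since other releases may spell the option slightly differently.

```
resource r0 {
  net {
    # Permit both nodes to be in the Primary role at the same time.
    allow-two-primaries;
  }
  # The per-host "on <hostname> { ... }" sections stay as they are.
}
```

This only lifts DRBD's own restriction. Keeping the two nodes' view of the data consistent remains the job of the cluster filesystem or the application, as described above.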
