An underrated cluster admin’s companion: dopd

Split brain, with DRBD, is much less of a disaster than in conventional cluster setups employing shared storage. But, you ask, how can I protect my DRBD cluster against split brain in the first place? Here’s how.

Let’s briefly reiterate what split brain, in the DRBD sense, really means. DRBD split brain occurs when your nodes have lost their replication link due to network failure, and you make both nodes Primary after that.

When just the replication link dies, Heartbeat as the cluster manager will still be able to “see” the peer node via an alternate communication path (which you have hopefully configured; see this post). Thus, nothing keeps Heartbeat from migrating resources to that DRBD-wise disconnected node if it so decides, or is instructed to. That would cause precisely the DRBD split brain situation described above.
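For reference, here is a minimal sketch of such a redundant communication setup in ha.cf; the interface names and the peer address are made-up examples, so adapt them to your environment:

# Two independent cluster communication paths in /etc/ha.d/ha.cf
# (eth1, eth0 and 192.168.0.2 are examples for illustration)
bcast eth1                # dedicated back-to-back link to the peer
ucast eth0 192.168.0.2    # second path across the regular LAN (peer's address)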

Should that split brain occur, your cluster manager will have created two diverging sets of data that are no longer identical. At that point, manual intervention is, for all practical purposes, inevitable. Not a desirable situation.
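For the record, manual recovery from a DRBD 8 split brain usually boils down to sacrificing the changes on one node; a sketch, using the my-resource example from further down:

# On the split brain victim, whose local changes get discarded:
drbdadm secondary my-resource
drbdadm -- --discard-my-data connect my-resource

# On the split brain survivor, if it sits in StandAlone connection state:
drbdadm connect my-resource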

Enter dopd, the DRBD outdate-peer daemon. The moment dopd detects a connection failure between DRBD peer nodes, it asks Heartbeat to use whatever communication paths it still has available to make contact with the remote node. The dopd instance on the peer node will then outdate the DRBD resource there (set the Outdated flag in the DRBD metadata). DRBD will subsequently stubbornly refuse to become Primary on that node under any circumstances, until the network connection is re-established and DRBD is confident that the local copy of the data is UpToDate again. This effectively prevents DRBD split brain from happening, and makes sure that your cluster service never runs on a cluster node that has a bad (outdated) set of data.
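You can reproduce the effect of the Outdated flag by hand on a disconnected Secondary; this sketch uses the my-resource example from the configuration below:

drbdadm dstate my-resource     # query local/peer disk states
drbdadm outdate my-resource    # mark the local copy Outdated, as dopd's handler would
drbdadm primary my-resource    # any promotion attempt is now refused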

To enable dopd, just add these lines to your ha.cf on both nodes:

respawn hacluster /usr/lib/heartbeat/dopd
apiauth dopd gid=haclient uid=hacluster

You may have to adjust dopd's path depending on your distribution.

Afterwards, run /etc/init.d/heartbeat reload or the equivalent command for your distribution. You should now see dopd as a running process in your process table (hint: ps ax | grep dopd).

Then, add these items to your DRBD resource configuration (again, on both nodes):

common {
  handlers {
    outdate-peer "/usr/lib/heartbeat/drbd-peer-outdater";
  }
  # other common settings go here
}
resource my-resource {
  disk {
    fencing    resource-only;
  }
  # other resource-specific settings go here
}

Finally, issue drbdadm adjust all on both nodes to reconfigure your resources and reflect your drbd.conf changes.
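If you want a quick sanity check that the handler actually made it into the running configuration, dump what drbdadm parsed:

drbdadm dump all | grep outdate-peer
# expected: outdate-peer "/usr/lib/heartbeat/drbd-peer-outdater";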

Now, unplug your DRBD replication link. Observe /proc/drbd on your Secondary:

version: 8.0.5 (api:86/proto:86)
SVN Revision: 3011 build by buildsystem@barschlampe, 2007-08-03 07:44:08
 0: cs:WFConnection st:Secondary/Unknown ds:Outdated/DUnknown C r---
    ns:0 nr:14 dw:14 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
      resync: used:0/31 hits:0 misses:0 starving:0 dirty:0 changed:0
      act_log: used:0/257 hits:0 misses:0 starving:0 dirty:0 changed:0

The Secondary is now considered Outdated. If you feel like it, you may now attempt to manually switch over one of your DRBD-backed resources. It won’t come up on the remote node because it now potentially has outdated data.
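To see the refusal first-hand, a manual switch-over attempt might look like this (the hb_standby path differs between distributions):

# On the current Primary, ask Heartbeat to hand over its resources:
/usr/share/heartbeat/hb_standby

# Or try to promote DRBD directly on the outdated node; DRBD refuses:
drbdadm primary my-resource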

Re-plug your DRBD replication link. Your Secondary will briefly re-sync and then be in UpToDate state again. A manual Heartbeat resource switch-over should now succeed.
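A couple of handy commands to watch the resync and confirm the state transition:

watch -n1 cat /proc/drbd       # live view of connection and disk states
drbdadm cstate my-resource     # Connected, once the handshake completes
drbdadm dstate my-resource     # UpToDate/UpToDate after the resync finishes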

14 Responses to An underrated cluster admin’s companion: dopd

  1. When our primary host fails, the secondary host becomes the primary, as expected. However, when the troubled host comes back online, it becomes the primary again.
    How can we change it so that the troubled host becomes the secondary when it returns from its down state, and the primary stays the primary?

  2. Florian,
We are on our second attempt now and are building up slowly, so we are still at a simple stage. I set the resource stickiness to 200 like you suggested, but the troubled host still returns to primary as soon as it recovers. We have also tried setting it to INFINITY, but we cannot get the changes to stick (no pun intended).
    What are we doing wrong or simply failing to do?

    Thanks in advance!
    Ray

    [root@iscsi1 ~]# crm configure show
    node $id="2a2e2547-45ea-41db-b9c3-2148d60668a7" iscsi1.elite.net.uk
    node $id="cdb7688f-f502-4062-b07e-be9b0378c124" iscsi2.elite.net.uk
    primitive FAILOVER-IP ocf:heartbeat:IPaddr2 \
        params ip="217.68.243.114" \
        op monitor interval="5s" \
        meta target-role="Started"
    property $id="cib-bootstrap-options" \
        dc-version="1.0.10-da7075976b5ff0bee71074385f8fd02f296ec8a3" \
        cluster-infrastructure="Heartbeat" \
        stonith-enabled="false" \
        no-quorum-policy="ignore" \
        default-resource-stickiness="1"
    rsc_defaults $id="rsc-options" \
        resource-stickiness="INFINITY"

  3. We have managed to sort our issue but thank you for your time anyway.
