LINBIT mount their bikes to support Butterfly Children

September 17, 2009

A completely non-technical post for a change.

Last weekend, seven employees of LINBIT’s European division participated in the World Games of Mountain Biking in Saalbach-Hinterglemm, Austria. We took part as Marathon race participants and co-sponsors of Biking for Butterfly Children, a charity dedicated to the fight against epidermolysis bullosa (EB).

Currently, no cure exists for any of the more than 30 subtypes of EB. Dermatologists and caregivers can, however, greatly improve patients’ quality of life. Still, EB can be an excruciatingly painful, disfiguring, and debilitating disease. It affects one in 20,000 live births, which makes it an orphan disease. The health care industry has little incentive to research the condition (as there is little money to be made from it), and those who dedicate their careers to EB patient care and research rely on charitable donations for funding. Biking for Butterfly Children acts as a reliable fundraiser, rounding up much-needed donations in the course of amateur cycling events (such as the World Games).

During this year’s World Games, B4BC raised a total of about 7,000 euros in donations. That is a respectable sum for an all-amateur event, but a lot more money is needed to improve the quality of life of EB patients, and potentially to discover a cure for the disease. If you are considering joining the fight against EB, please contact your local DebRA chapter.


“Alternatives” to DRBD

September 16, 2009

Every once in a while, people ask us something along the lines of “why do I need DRBD? Can’t I accomplish what it does by other means?” You mean build high availability clusters with block-level synchronous replication? Well, sure you can. But all of the available alternatives have serious drawbacks.

  • Use a SAN. Well, SANs are great in terms of management, even though they’re sometimes prohibitively expensive. But your regular SAN box does not offer physically distributed redundancy at the data level. In other words, if your SAN box goes down, all your beautiful high availability infrastructure turns to shreds. And even if you naïvely believe (and you shouldn’t) that a storage box can never crash, just think about air conditioning going down in just that part of your data center where your storage shelf is. Repair time of several hours means down time of several hours, even though your servers in a different cabinet may be up and running. You’re dealing with a single point of failure. Not high availability in my book.
  • Use a SAN with native replication. This means using two separate storage boxes with synchronous block-level replication between them. This eliminates the above-mentioned SPOF and is available from just about any SAN vendor (using proprietary implementations under various product names). The downside is that it costs you serious bucks, and I am not referring to just the additional piece of hardware. Those firmware licenses can hit six figures. Plus, switchover times (changing the direction of replication) can be extremely long, up to 4 minutes in some cases. And there is little to no support for replication management from open source cluster management software.
  • Use a SAN with host-based mirroring. This means that you have two separate SAN boxes, hosts import LUNs from both, and mirror those pairs using software RAID (such as md). Eliminates the SPOF and saves you dollars on firmware licensing. Downsides: you still need a SAN (and the associated infrastructure — fibre channel, for example, isn’t exactly cheap), and as such your clusters are still not shared-nothing. And the integration with open source cluster management is also lacking.
  • Use host-based mirroring between a local device and a network block device. In this case, you have one disk that is local, and another that is exported from a remote host using NBD or iSCSI. Those two disks are then mirrored with software RAID. Now this one is really terrible in terms of management. Role reversal always requires some custom glue, no support from cluster managers is available whatsoever, and split brain detection is poor or nonexistent. So if you really want to go down that alley then do, but please don’t call it a high availability cluster.

Compare this to DRBD: no need for a SAN, so you can use it to operate a fully shared-nothing cluster. No firmware licensing cost. No need for expensive infrastructure as everything can replicate over regular IP networks. Role switch in a matter of seconds. Tight integration with both Pacemaker and Red Hat Cluster Suite. Excellent split brain detection to make sure you don’t wreck your data accidentally. So if you are considering alternatives then that’s perfectly fine, but our soaring usage numbers are there for a reason.
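
To make the shared-nothing setup a little more concrete, here is a minimal sketch of a two-node DRBD resource, assuming DRBD 8.x configuration syntax. The resource name r0, the host names alice and bob, the backing partition, and the IP addresses are all hypothetical placeholders, not values from any real deployment. Protocol C selects synchronous replication.

```
resource r0 {
  protocol C;                    # synchronous replication
  on alice {                     # hypothetical host name (must match uname -n)
    device    /dev/drbd0;        # replicated block device seen by applications
    disk      /dev/sda7;         # hypothetical local backing partition
    address   10.1.1.31:7789;    # hypothetical replication address and port
    meta-disk internal;
  }
  on bob {                       # hypothetical peer host name
    device    /dev/drbd0;
    disk      /dev/sda7;
    address   10.1.1.32:7789;
    meta-disk internal;
  }
}
```

Each node contributes only a local disk and an IP address, so no shared storage box is involved.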


On MySQL Replication, cluster managers, and DRBD (again)

August 31, 2009

With all of the discussions over the past years about MySQL Replication vs. DRBD (where the “vs.” part is in fact grossly misleading, of course, since the two technologies complement each other quite well), here’s one with a slightly different angle: does it make sense to roll your own cluster manager around MySQL Replication, or is it smarter to plug into an existing, proven cluster architecture?

You’ll expect my own view to be fairly well defined, and it is. But make up your own mind!


Full drbd-user list functionality restored

August 27, 2009

Unfortunately, we’ve had a small issue with the drbd-user mailing list with some posts from new members not properly coming through. If you’ve been affected by this, our sincere apologies. Full functionality of the mailing list has now been restored.


On DRBD connection timeouts

August 26, 2009

Here is a question recently seen on drbd-user:

I cannot get the timeout parameter in [drbd.conf] to work (I set it up as in all the examples I saw). I set it low (say 1 second), kill the remote box IO [re]commences after 10 seconds (as the other parameters state).

Anything I’m doing wrong?

Well, sort of.
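
As background for the question, the timeout setting lives in the net section of a resource in drbd.conf. The fragment below is a sketch only, assuming DRBD 8.x syntax and a hypothetical resource named r0; the values shown are the documented defaults, not a recommendation. Note the units: timeout is given in tenths of a second, while connect-int and ping-int are given in seconds, so a one-second timeout would be written as 10.

```
resource r0 {
  net {
    timeout     60;   # tenths of a second: 60 = 6 seconds
    connect-int 10;   # seconds between connection attempts
    ping-int    10;   # seconds of idle time before a keep-alive packet
  }
  # "on <host> { ... }" sections omitted in this sketch
}
```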