On MySQL Replication, cluster managers, and DRBD (again)

With all of the discussions over the past years about MySQL Replication vs. DRBD (where the “vs.” part is in fact grossly misleading, of course, since they are two technologies that complement each other quite well), here’s one with a slightly different angle: does it make sense to roll your own cluster manager around MySQL Replication, or is it smarter to plug into an existing, proven cluster architecture?

You’ll expect my own view to be fairly well defined, and it is. But make up your own mind!

14 Responses to On MySQL Replication, cluster managers, and DRBD (again)

  1. I think HA solutions should be made after the “KISS” principle. Looking at the diagram at http://www.clusterlabs.org/wiki/Main_Page – and trying to understand it and the text below – repeatedly overtaxed my mind. So maybe I should start playing “Dr. Kawashima’s Brain Training” or just keep using MMM. 😉

    • Florian Haas says:

      What’s on the Clusterlabs wiki main page is an overview of the internals of the Pacemaker cluster stack. Compare that to an overview of the MySQL server architecture found here. Does that have you running away screaming from MySQL? I think it doesn’t. And that’s not even getting into MMM at all.

  2. The overview of the MySQL server architecture is not on the main page of mysql.com 😉

    • Florian Haas says:

      Point taken, and indeed I agree. But the underlying assumption that Pacemaker “is more complicated to use” just because the project’s creator chooses to let you, the potential user, know a lot about its internals from the outset — that assumption hardly holds up.

      • It is more complicated to get started with – I think you should change this. Maybe just put a simpler “non-internal” graph on top of it with a short explanation without abbreviations below. 🙂

  3. Hi Florian,

    Your question about integrated cluster vs. roll-your-own hits the nail on the head. However, I think that cluster requirements are changing and I don’t believe that Pacemaker is a complete solution at this point. There’s a fully developed argument on my blog (http://scale-out-blog.blogspot.com/2009/09/future-of-database-clustering.html). I would be very interested in your counter-argument on behalf of PaceMaker. DRBD is in there by the way.

    p.s., I absolutely adore Heartbeat v.1 but have not worked with later versions. V.2 was pretty complex.

    • Florian Haas says:

      Robert, one significant thing where Pacemaker is still lacking, undoubtedly, is simplicity. There have been huge improvements on that end, specifically with the addition of the crm CLI, but there’s a lot more work to do. So you are definitely hitting the mark in that regard. Other than that, quite frankly I fail to see how Pacemaker does not fit in with your requirements.

      Control failover using configurable policies based on business rules — Pacemaker does this.

      Schedule recurring tasks using job management queues — Pacemaker has provisions for recurring tasks. Currently these are used for resource monitoring only (with short intervals, on the order of seconds or minutes), but conceivably they could be extended for recurring management tasks (with much longer intervals) as well.

      Parallel database replication or disk-level approaches — no limitations in Pacemaker either.

      Cross-site replication is increasingly common as well — split-site/cross-site clustering is something Pacemaker is still lacking in, indeed, but that is changing. Cross-site replication can be done with DRBD already.

      Replication methods will need to be pluggable — you can plug pretty much anything into Pacemaker. As far as database replication is concerned, I guess the pluggability is best handled by the database engine like the drizzle guys are apparently planning.

      Database clusters must be software only, and make very minimal assumptions about resources — check, check.

      The base clustering components have to be open source — Pacemaker is.

      Maybe Andrew wants to pitch in a bit more.

      • Beekhof says:

        Well, Robert is understandably framing the criteria to benefit his pet project.
        I don’t think he’s really looking to be convinced 🙂

        “Fast, flexible replication” and “Transparent application access” for example, are not issues for a cluster resource manager. Those are resource and setup specific requirements and Pacemaker is resource agnostic by design.

        Which pays off nicely since, as you said, Pacemaker can cluster pretty much anything and makes no additional requirements (with or without STONITH, shared storage, drbd, rsync, internal replication – use whatever you like).

        “Cloud and virtualized operation”, “Partition management”, “Open source”, they’re all slam dunks for Pacemaker. Nothing particularly interesting to add there.

        “Simple management and monitoring”
        Pacemaker did have some issues here in the past, I think we’ve mostly got it under control now, although we are constantly making further improvements.

        The crm shell provides a single coherent, and scriptable, interface for configuring the cluster.
        Resource monitoring and other scheduled jobs are supported and we also have some features planned to ensure that resources are in a consistent state when those jobs run – to facilitate backups.

        In combination with STONITH, scheduled jobs also make “Top-to-bottom data protection” a non-issue. There’s nothing that says the jobs can only check if the service is up. Simply add a new one that performs whatever validation or backup you want.

        The hardest part is writing the resource agents (scripts), but happily there is a large pool of existing scripts that people can use and/or improve upon.

      • Hi Florian!

        Thanks for the detailed reply. I see both you and Andrew read the article quite closely. The main difficulty I have with PaceMaker is a very simple one–it’s not specifically adapted for databases. Let me give you a very simple example: testing a database backup to make sure it’s good.

        With Tungsten I can pick any database, including a new one, load a backup with a couple of trivial commands, and then run consistency checks from the master once it loads to ensure the data are good. If there are problems the slave will halt in a failed state, which is then immediately visible on any manager node in the cluster. This all works with little extra configuration beyond installing the software. There are some tweaks for specialized backups like LVM but they are very simple. We will have this procedure down to a single command within a couple of releases.

        There are many other examples like intelligent failover without breaking application connections, specialized replication tricks, or broadcast management operations on databases but I think this illustrates the point. PaceMaker seems very powerful but users still have to do a lot of work to address the requirements I described in the blog post. Where PaceMaker does seem quite excellent is in the failure detection and STONITH features. We looked closely at these before deciding to implement an alternative based on group communications and business rules.

      • Hi Andrew,

        Tungsten is framed to a real set of problems and not the other way around. I talked to between one and two hundred customers while selling our older product(s) and found that existing database clustering is too complicated and solves too few of the real problems database users have. That situation is worsening as hardware becomes more capable.

        I would be delighted to be proven wrong on any of these points–it saves time. 🙂

      • Beekhof says:

        No doubt Tungsten solves real problems, much like the ones the filesystem guys solved back when they too built special purpose clusters.

        The problem with focusing on one technology is that
        a) you’re hosed when the focus shifts (eg. Xen -> KVM) and,
        b) the optimizations you make mean that everything else ends up a second-class cluster citizen

        I’m not saying there’s no room for targeted solutions, I’m sure you’ve done market research that says there is, but I would claim that (almost?) all the smarts you’re talking about can be encapsulated in a resource agent and therefore managed by a generic cluster manager like Pacemaker.

        Oh, and wrt. Pacemaker’s “complex” configuration, http://theclusterguy.clusterlabs.org/post/178680309/configuring-heartbeat-v1-was-so-simple might be interesting reading.

      • Thanks for the link Andrew. Just to be clear I think your work on failover and STONITH has defined the state of the art in open source. We have been looking at the Linux-HA work for years.

        One further idea: this discussion about clustering approaches would make a really excellent presentation at a conference. Would you and Florian be interested? If we could get an MMM defender like Baron or perhaps Arjen there would be a very complete set of choices for listeners.

  4. Yves Trudeau says:

    Hi Florian,
    I agree that all the HA solutions around MySQL are pretty confusing. There is a real need for an impartial knowledge base to help people understand their options.

    For that purpose, I am currently writing a “HA Whitepaper” where I am trying to be as objective as possible and present many solutions with the pros/cons. For now, Tungsten is not in there since I know very little of it. I’ll blog about this whitepaper soon.


  5. […] quickly veered into an argument about clustering in general. As Florian Haas put it on his blog, this is not just an issue of DRBD vs. MySQL Replication. Is a database cluster something you cobble together through bits and pieces like MMM? Or is it […]
