Today (Saturday 3/14) and tomorrow (Sunday 3/15), Martin Loschwitz and I are presenting on open source high availability at Chemnitzer Linuxtage in Chemnitz, Germany. As I am blogging this, Martin is speaking to a packed auditorium of 200-or-so attendees (I’m terrible at estimating crowds).
After an interesting start with a malfunctioning mike, Martin kicks off the talk with a definition of high availability and why it’s important.
Mike malfunction again. Martin handles this in stride and with a quick joke, scores a laugh from the crowd.
Martin goes on to stress the importance of redundant components such as redundant power supplies (and power grids), and redundant network switches. Also talks about cross-site links in metro area clusters. (Hint: watch our website next week for an important announcement about this issue). Finally, he emphasizes the importance of good hardware support. If one of your nodes dies, you don’t just want the other node to take over. You want your failed node to be replaced/repaired quickly too.
Now Martin dives head-first into DRBD and goes on to explain DRBD essentials. We’ll do a workshop at this conference tomorrow, plus another 45-minute talk which will go into this in more detail. He mentions the 3- and 4-node replication setups which the DRBD 8.3 release made possible.
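For those following along at home, a basic two-node DRBD resource definition looks roughly like this. This is a minimal sketch, not a production config; the hostnames, backing devices, and addresses are invented for illustration (the DRBD User's Guide has the authoritative syntax):

```
# /etc/drbd.conf -- minimal two-node resource sketch
# hostnames "alice"/"bob", disks, and IPs are examples only
resource r0 {
  protocol C;                  # fully synchronous replication
  on alice {
    device    /dev/drbd0;
    disk      /dev/sda7;       # backing block device
    address   10.0.0.1:7788;
    meta-disk internal;
  }
  on bob {
    device    /dev/drbd0;
    disk      /dev/sda7;
    address   10.0.0.2:7788;
    meta-disk internal;
  }
}
```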
He showcases a customer case study with one of our longest-running clients in Austria, talking about a highly available PostgreSQL database feeding a PHP web application and a Postfix recipient database. This customer wanted
- low entry barriers (low setup cost)
- maximum availability
- low maintenance cost
- excellent performance
and had, at the time we started working with them, no high availability or redundancy whatsoever. What we introduced were
- multiple redundant web servers,
- redundant load balancers,
- redundant mail spools,
- a redundant central database.
Martin falls victim to another mike screw-up. Asks for “highly available mike”, crowd is relaxing and opening up. More laughter.
Martin explains three-node replication in this case study. The customer operates a backup data center connected to the primary data center across a 100 Mbit MAN link. He goes on to explain site fail-over with dynamic routing using BGP (pretty nifty setup).
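A three-node setup like this is built by stacking a second DRBD resource on top of the first, with the stacked resource replicating to the remote site. A rough sketch of the DRBD 8.3 stacked syntax, with resource names, hostnames, and addresses made up for illustration (protocol A, i.e. asynchronous, is the typical choice for the slower MAN link):

```
# illustrative only -- names, devices, and addresses are placeholders
resource r0 {                  # local, synchronous two-node pair
  protocol C;
  on alice { device /dev/drbd0; disk /dev/sda7; address 10.0.0.1:7788; meta-disk internal; }
  on bob   { device /dev/drbd0; disk /dev/sda7; address 10.0.0.2:7788; meta-disk internal; }
}
resource r0-U {                # stacked resource replicating off-site
  protocol A;                  # asynchronous over the MAN link
  stacked-on-top-of r0 {
    device   /dev/drbd10;
    address  192.168.42.1:7789;
  }
  on charlie {                 # third node at the backup data center
    device    /dev/drbd10;
    disk      /dev/sda7;
    address   192.168.42.2:7789;
    meta-disk internal;
  }
}
```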
Cut to a different case study. Highly available virtualization with Xen, Linux-HA and DRBD across two datacenters 12 kilometers apart. Maximum cost efficiency with full high availability.
Wrap up. Full high availability with open source. Manageability. Low cost. On to questions from the audience:
- Is csync2 free software? Yes of course.
- Can we do active-active clustering? Yes we can.
- How quick is BGP fail-over in practice? Depends on multiple factors, however our experience is usually around 2 minutes or less.
- Can I replace SAN based synchronous replication with DRBD? Heck yeah, that’s what it’s for! It’s just orders of magnitude cheaper!
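On the BGP fail-over question: the roughly-two-minute figure is consistent with default BGP hold timers (180 seconds in many implementations). Where faster convergence is needed, the timers can be tightened per neighbor; a sketch in Quagga's bgpd syntax, with AS numbers and the neighbor address as placeholders:

```
! bgpd.conf fragment -- illustrative only; ASNs and addresses are made up
router bgp 64512
 neighbor 192.0.2.1 remote-as 64513
 ! keepalive 10s, holdtime 30s: a dead peer is detected in ~30s
 ! rather than the 180s default holdtime
 neighbor 192.0.2.1 timers 10 30
```

Aggressive timers trade faster fail-over for more control traffic and a higher risk of flapping on lossy links, so they should match what the link can actually sustain.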