
From the HA Trenches

This is a place to collect some example success stories and failure post-mortems.

CGGVeritas (2009)

CGGVeritas, a global provider of geophysical services and equipment, set up several clusters to provide seismic data to users. They attached 37 TB JBODs to each node of the cluster, for a total of 72 TB of XFS filesystem on each node. Ten of these clusters run Linux-HA version 2.1.3 (the equivalent of Heartbeat 2.99 + Pacemaker 0.6), exporting the data over NFS in an active/active setup.

Each node of the clusters has 16 GB of RAM, a 10 Gbit network interface toward the clients and a 4 Gbit HBA for direct-attached storage. Each cluster serves more than 500 clients. The systems went into production in 2006.

Minor hiccups caused by file system corruption were resolved after a failover and reboot of the node. Special hint: the admins set up a unique fsid for the NFS exports. Otherwise the clients might get confused.
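
The fsid hint can be checked mechanically. The following is a minimal sketch, assuming export definitions in the usual exports(5) style with an explicit fsid= option. The paths, network and fsid numbers are hypothetical and are not taken from the CGGVeritas setup; the helper name find_duplicate_fsids is also made up for illustration.

```python
import re
from collections import defaultdict

# Matches an explicit fsid option inside an export's option list.
FSID_RE = re.compile(r"fsid=([^,)\s]+)")


def find_duplicate_fsids(exports_text):
    """Return {fsid: [paths]} for fsid values used by more than one export path."""
    paths_by_fsid = defaultdict(set)
    for raw in exports_text.splitlines():
        # Drop comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        path = line.split()[0]  # first field is the exported directory
        for fsid in FSID_RE.findall(line):
            paths_by_fsid[fsid].add(path)
    return {
        fsid: sorted(paths)
        for fsid, paths in paths_by_fsid.items()
        if len(paths) > 1
    }


# Hypothetical export definitions; fsid 101 is reused by mistake.
sample = """
/export/seismic1 10.0.0.0/24(rw,fsid=101)
/export/seismic2 10.0.0.0/24(rw,fsid=101)
/export/scratch  10.0.0.0/24(rw,fsid=102)
"""

duplicates = find_duplicate_fsids(sample)
```

An empty result means every fsid in the text belongs to a single export path. In a failover pair one would presumably run the same check against the export definitions used on both nodes.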

Thanks to Sachin Patel for this story.

Heilig-Geist-Hospital, Bingen (2009)

The Heilig-Geist-Hospital in Bingen on the Rhine uses a highly available clustered firewall with state synchronization to separate several internal networks from each other. One of their applications is PACS (Picture Archiving and Communication System) for their central radiography laboratories. All departments use a terminal session to access the data. If an error occurs, the firewall fails over. Since the connection tables of the firewalls are synced, the user experiences a short delay on the line but can go on working after about 3 seconds.

System: two ordinary PCs running Debian lenny, with Pacemaker and fwbuilder to manage the setup. They use about 20 different VLANs and also some routing controlled by the cluster. A HOWTO for setting up the HA firewall is available here.

Thanks to Matthias Thiele for this story.

GupShup, Free Group SMS (2009)

GupShup is India’s largest social messaging platform. Based in Mumbai, it is a mobile group SMS service that allows users to create mobile communities and broadcast messages to them. GupShup is growing rapidly, with thousands of groups on topics such as finance, entertainment, lifestyle, health, sports and technology.

The cluster, two Ubuntu 8.04 servers configured with Linux-HA version 2.1.3-2 (the equivalent of Heartbeat 2.99 + Pacemaker 0.6), runs a Shorewall firewall in an active/active configuration. Each node of the cluster has 4 GB of RAM and a 250 GB hard drive and serves more than 12 million outgoing SMS daily at a rate of 150 SMS/sec.

Thanks to Kaushal Shriyan for this story.

GitHub (2012 incidents)

GoCardless (2017)

Press

  • An article in Linux Technical Review offers an overview all the way from Heartbeat to Pacemaker with OpenAIS or Corosync. The article is in German and requires a subscription.
  • The German book "Clusterbau", published by O'Reilly, describes Pacemaker, OpenAIS, Corosync and LVS. It explains how to set up clusters starting from the basics and includes many useful examples.

Miscellaneous links

These are all old, but some may still be of value.

Last Author
kgaillot
Last Edited
Jan 22 2024, 1:23 PM