Wednesday, January 03, 2007

Xen shared storage

disk = [ 'phy:/dev/vg/xen1,hda,w', 'phy:/dev/vg/xen1-swap,hdb,w', 'phy:/dev/vg/xen1-drbd,hdc,w', 'phy:/dev/vg/san,hdd,w!' ]

For some work that I am doing I am trying to simulate a cluster that uses Fibre Channel SAN storage (among other things). The above is the disk line I'm using for one of my cluster nodes: hda and hdb are the root and swap disks for the node, hdc is a DRBD store (DRBD allows a RAID-1 to be run across the cluster nodes via TCP), and hdd is a SAN volume. The important thing to note is the "w!" mode for the device. This means write access is granted even in situations where Xen thinks it's unwise (i.e. the device is being used by another Xen node or is mounted on the dom0). I've briefly tested this by making a filesystem on /dev/hdd on one node, copying data to it, then unmounting it and mounting it on another node to read the data.
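Since a Xen domain config file is Python syntax, the disk line can be sanity-checked with a few lines of Python. The following is only a sketch: the names DiskSpec, parse_disk and forced_shared are made up for illustration and are not part of Xen. It splits each entry into its three comma-separated fields and lists the guest devices that use the "w!" mode, which is a handy check that only the SAN volume bypasses Xen's sharing protection.

```python
from typing import List, NamedTuple


class DiskSpec(NamedTuple):
    """One entry of the disk list (hypothetical helper type, not a Xen API)."""
    backend: str   # device on the dom0, e.g. "phy:/dev/vg/san"
    frontend: str  # device name seen by the guest, e.g. "hdd"
    mode: str      # "r", "w", or "w!" (write even if Xen thinks it's unwise)


def parse_disk(entry: str) -> DiskSpec:
    """Split a 'backend,frontend,mode' string into its three fields."""
    backend, frontend, mode = (part.strip() for part in entry.split(","))
    return DiskSpec(backend, frontend, mode)


def forced_shared(entries: List[str]) -> List[str]:
    """Return the guest device names that are attached with mode 'w!'."""
    return [d.frontend for d in map(parse_disk, entries) if d.mode == "w!"]


# The disk line from the top of this post.
disk = ['phy:/dev/vg/xen1,hda,w', 'phy:/dev/vg/xen1-swap,hdb,w',
        'phy:/dev/vg/xen1-drbd,hdc,w', 'phy:/dev/vg/san,hdd,w!']

print(forced_shared(disk))  # ['hdd']
```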

There are some filesystems that support having multiple nodes mount the same device at the same time; these include CXFS, GFS, and probably some others. It would be possible to run one of those filesystems across nodes of a Xen cluster. However, that isn't my aim at this time. I merely want to have one active node mount the filesystem while the others are on standby.

One thing that needs to be solved for Xen clusters is fencing. When a node of a cluster is misbehaving it needs to be denied access to the hardware, in case it recovers some hours later and starts writing to a device that is now being used by another node. AFAIK the only way of doing this is via the xm destroy command. Probably the only way for the cluster to trigger that is to have a cluster node ssh to the dom0 and then run a setuid program that calls xm destroy.
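The cluster-node side of that idea might look something like the sketch below. Everything specific in it is an assumption for illustration: the "fence" account on the dom0, the helper path /usr/local/sbin/xen-fence (standing in for the setuid program that would call xm destroy), and the function names. The domain name is validated before use because ssh passes the remote command through a shell on the dom0.

```python
import re
import subprocess
from typing import List

# Hypothetical path of the setuid helper on the dom0 that runs "xm destroy".
FENCE_HELPER = "/usr/local/sbin/xen-fence"

# Accept only conservative domain names, since the remote side is a shell.
_SAFE_NAME = re.compile(r"[A-Za-z0-9_.-]+")


def fence_command(dom0_host: str, domain: str, user: str = "fence") -> List[str]:
    """Build the ssh command that asks the dom0 to destroy a domain."""
    if not _SAFE_NAME.fullmatch(domain):
        raise ValueError("unsafe domain name: %r" % domain)
    # BatchMode=yes makes ssh fail instead of prompting for a password,
    # which matters when this runs unattended from cluster software.
    return ["ssh", "-o", "BatchMode=yes",
            "%s@%s" % (user, dom0_host), FENCE_HELPER, domain]


def fence(dom0_host: str, domain: str) -> bool:
    """Run the fencing command; True means the helper reported success."""
    result = subprocess.run(fence_command(dom0_host, domain), check=False)
    return result.returncode == 0
```

A surviving node would call something like fence("dom0.example", "xen2") and only take over the shared volume if it returns True. If the fencing request fails, the safe choice is not to mount the device.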

1 comment:

Anonymous said...

we provide such a script: