Author Archive

Open Source Business Models

June 29, 2009

The vast majority of the systems we run are Open Source, for a simple reason: we are a small company and we want to produce enterprise-grade solutions at a fraction of the price that a serious enterprise would pay. We simply can’t afford to pay what the enterprises pay, at least not until we are making considerably more money than we are currently.

Having said that, we are responsible citizens and would like to contribute back where we can. Generally, that means either contributing code or paying for the software that we use. Code contributions are difficult: we don’t have a lot of time to spend on making changes to the software that we use, and we use far more software than we could practically stay across. So we pick one or two systems that we really need more from and contribute to those.

For the rest, we would love to be able to pay reasonable amounts of money. The issue is that the pricing is invariably either free or enterprise grade (i.e. >= $10k per CPU per year), with the latter including support, escalations etc. This leaves us with no realistic option to pay anything. Personally, I would have thought there would be thousands more companies like us than major enterprises willing to pay. I would like to see products priced at $1k per year for access to the product (via RPM) and security updates. I don’t need the support (not at that price anyway).

I pay for the products that do offer that level of pricing; the rest I use for free. While I feel guilty about that, there is not much else I can do until the business models of the companies providing these systems change.




Backups & SAN Storage

January 27, 2009

Enterprises typically spend huge dollars on their storage solutions (which admittedly are excellent). But if you don’t have several hundred thousand dollars, what do you do? Run CentOS!

We are putting together our backup solution. We use Bacula to manage our backups; it is an excellent network-aware solution that has been rock solid for us. We needed to compile our own RPMs from source, but this was a simple matter of running:

# rpmbuild --rebuild --define "build_centos5 1" --define "build_postgresql 1" \
                     --define 'build_gconsole 1' --define 'build_bat 1' \
                     <bacula source RPM>

which we then imported into our yum repository. The configuration was pretty complicated the first time we set it up, but with the excellent HOWTOs on the site it was not too painful. We then had to decide where to store all the data. Tape drives are expensive and slow, and were not something we really wanted to use if we didn’t have to, so we built our own SAN. A SAN (Storage Area Network) presents storage that looks like a normal disk device on the computer, but in fact sends all the data across the network to a storage device.

Building a SAN is as simple as installing the correct package and configuring it:

# yum install scsi-target-utils
# chkconfig tgtd on
# service tgtd start
# /usr/sbin/tgtadm --lld iscsi --op new  --mode target --tid 1 \
                   -T <target name>
# /usr/sbin/tgtadm --lld iscsi --op new  --mode logicalunit --tid 1 --lun 1 \
                   -b /dev/VolGroup00/Backup
# /usr/sbin/tgtadm --lld iscsi --op bind --mode target --tid 1 -I ALL
# cat >> /etc/rc.local <<'EOF'
/usr/sbin/tgtadm --lld iscsi --op new  --mode target --tid 1 \
                 -T <target name>
/usr/sbin/tgtadm --lld iscsi --op new  --mode logicalunit --tid 1 --lun 1 \
                 -b /dev/VolGroup00/Backup
/usr/sbin/tgtadm --lld iscsi --op bind --mode target --tid 1 -I ALL
EOF

and you are done. Then you need to bind to that storage on the server that requires access to it. This is as simple as:

# yum install iscsi-initiator-utils
# echo "InitiatorName=<iqn prefix>:`hostname`" \
  > /etc/iscsi/initiatorname.iscsi
# iscsiadm -m discovery -t st -p <san address>
# iscsiadm --mode node --targetname <target name> \
           --portal <san address>:3260 --login
# mount /dev/<new device> /mnt
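The `<new device>` name is not always obvious after the login. A minimal sketch for finding it, assuming the open-iscsi `iscsiadm -m session -P 3` listing, which includes an "Attached scsi disk" line for each disk; the helper name `iscsi_disks` is made up for illustration:

```shell
#!/bin/sh
# Print the /dev path of every disk that open-iscsi reports as attached.
# Reads the output of `iscsiadm -m session -P 3` on stdin.
iscsi_disks() {
    # Matching lines look like: "Attached scsi disk sdb   State: running"
    awk '/Attached scsi disk/ { print "/dev/" $4 }'
}

# Typical use on the initiator, as root:
#   iscsiadm -m session -P 3 | iscsi_disks
```

A brand new volume will also need a filesystem created on it once (for example with mkfs.ext3) before the mount can succeed.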

You now have a working SAN hooked up to the machine that requires it. Couple this with DRBD and you have a highly available storage setup which will run as fast as your network (Gig-E is obviously recommended for this). Using a cluster filesystem (such as GFS2) should allow many machines to access the same data simultaneously. However, we have not done this yet.

High Availability with DRBD and Heartbeat (CRM)

December 16, 2008

One of the huge benefits of going with Open Source solutions is the ability to get Enterprise grade solutions for decidedly non-Enterprise costs. We have a fairly typical Web site set up with two load balancers, two Apache/Tomcat servers and two Postgres database boxes. We wanted to ensure that in the event of a failure of any of those machines, we would be able to automatically recover and continue providing services.


We have two machines, both with Postgres 8.1 installed on them (the latest version provided as part of CentOS 5.2). While apparently 8.3 can work in active/active mode, we decided to stick with 8.1 to reduce dependency hell with everything else on the machines and work with DRBD. Setup is incredibly simple – we created an /etc/drbd.conf file which had:

global {
  usage-count no;
}

common {
  protocol C;
}

resource r0 {
  device    /dev/drbd1;
  disk      /dev/LVMGroup/LVMVolume;
  meta-disk internal;

  on <node1> {
    address   <ip address1>:7789;
  }
  on <node2> {
    address   <ip address2>:7789;
  }
}

on both nodes and ran:

# drbdadm create-md r0                              # on both nodes
# service drbd start                                # on both nodes
# drbdadm -- --overwrite-data-of-peer primary r0    # on the primary node only
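The initial sync can take a while on a large volume. A small sketch for checking progress, assuming the usual /proc/drbd status lines, where a fully synced device shows `ds:UpToDate/UpToDate`; the function name `drbd_in_sync` is hypothetical:

```shell
#!/bin/sh
# Exit 0 once every DRBD device in the status file is UpToDate on both sides.
# $1 is the status file to read; it defaults to /proc/drbd.
drbd_in_sync() {
    status_file=${1:-/proc/drbd}
    # Device lines look like: " 1: cs:Connected st:Primary/Secondary ds:UpToDate/UpToDate C r---"
    awk '
        / cs:/                   { devices++ }
        /ds:UpToDate\/UpToDate/  { synced++ }
        END                      { exit !(devices > 0 && devices == synced) }
    ' "$status_file"
}

# Typical use, as root on either node:
#   until drbd_in_sync; do sleep 30; done
```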

This started DRBD and allowed the primary node to sync to the secondary. For more details about this (and heartbeat configuration below), have a look at this excellent CentOS HOWTO. Then we needed to configure heartbeat to manage the automatic failover for us. Create /etc/ha.d/ on both nodes to contain:

keepalive 2
deadtime 30
warntime 10
initdead 120
bcast   eth0
node    <node1 name as per uname -n>
node    <node2 name as per uname -n>
crm yes

The /etc/ha.d/authkeys on both nodes should contain:

auth 1
1 sha1 <passwd>
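Heartbeat refuses to start unless authkeys is readable by its owner only (mode 600). A minimal sketch that writes the file with a random secret in place of `<passwd>`; the function name and the default path are illustrative assumptions, and the same file must still be copied to the second node:

```shell
#!/bin/sh
# Write a Heartbeat authkeys file with a random sha1 secret, owner-only access.
# $1 is the destination path; ./authkeys is only a placeholder default.
write_authkeys() {
    dest=${1:-./authkeys}
    # Hash 512 random bytes down to a 40-character hex string
    secret=$(dd if=/dev/urandom bs=512 count=1 2>/dev/null | sha1sum | cut -d' ' -f1)
    # The restrictive umask keeps the file private from the moment it is created
    ( umask 077 && printf 'auth 1\n1 sha1 %s\n' "$secret" > "$dest" )
    chmod 600 "$dest"
}

# Typical use, as root:
#   write_authkeys /etc/ha.d/authkeys
```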

This will then result in a working heartbeat. Start the heartbeat service on both nodes, wait a few minutes, and the command crm_mon will show you a running cluster:

[root@<node1> ha.d]# crm_mon
Defaulting to one-shot mode
You need to have curses available at compile time to enable console mode

Last updated: Tue Dec 16 14:29:03 2008
Current DC: <node2> (33b76ea8-7368-442f-aef3-26916c567166)
2 Nodes configured.
0 Resources configured.

Node: <node2> (33b76ea8-7368-442f-aef3-26916c567166): online
Node: <node1> (bbccba14-0f40-4b1c-bc5d-8c03d9435a37): online
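Rather than re-running crm_mon by hand, the "Node: ... online" lines in the output above can be counted from a script. A minimal sketch; the helper name `online_nodes` is made up, and it assumes the one-shot output format shown above:

```shell
#!/bin/sh
# Count the nodes that crm_mon reports as online.
# Reads one-shot crm_mon output on stdin and prints the count.
online_nodes() {
    grep -c '^Node: .*: online[[:space:]]*$'
}

# Typical use: wait until both cluster nodes have joined.
#   until [ "$(crm_mon -1 | online_nodes)" -eq 2 ]; do sleep 5; done
```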

Then run hb_gui to configure the resources for heartbeat. The file that this gui configures is in /var/lib/heartbeat/crm and is defined via XML. While I would prefer to configure it manually, I haven’t worked out how to do that yet and the hb_gui tool is very easy to use.

Using that tool, you can create a resource group for the clustered services (click on Resources and Plus and select Group). Then within that group you need to configure 4 resources: a virtual IP address that can be used to communicate with the primary node, a drbddisk resource, a filesystem resource for the DRBD filesystem, and a resource for postgres. To take each in turn:

  1. IP Address – click on plus and select Native, change the resource name to ip_<groupname>, select the group it should belong to, then select IPaddr from the list and click on Add Parameter. Then enter ip and a virtual IP address for the cluster. Add another parameter, nic, and select the interface for this to be configured against (e.g. eth0). Then click on OK.
  2. drbddisk resource – Same procedure, but this time select drbddisk instead of IPaddr and click on Add Parameter. Then enter 1 and the name of the drbd resource created (r0 in our case).
  3. filesystem – Same again, but select Filesystem and add the following parameters:
    1. device, /dev/drbd1 (in this example)
    2. directory, /var/lib/pgsql (for postgres)
    3. fstype, ext3 (or the filesystem you have created on it)
  4. postgres – Lastly add a postgres resource with no parameters.
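For anyone who would rather not use the GUI, the first three resources above could probably also be written as CIB XML and loaded with cibadmin. This is an untested sketch in the Heartbeat 2.x XML style; every id, the group name and the IP address are made-up placeholders, and the postgres resource is left out because the exact agent depends on what your install offers:

```shell
#!/bin/sh
# Write a resource group definition (IP, drbddisk, Filesystem) to the file in $1.
# All ids and the 192.0.2.10 address are placeholders, not values from our setup.
write_group_xml() {
    cat > "$1" <<'EOF'
<group id="group_pg">
  <primitive id="ip_pg" class="ocf" provider="heartbeat" type="IPaddr">
    <instance_attributes id="ip_pg_ia">
      <attributes>
        <nvpair id="ip_pg_ip"  name="ip"  value="192.0.2.10"/>
        <nvpair id="ip_pg_nic" name="nic" value="eth0"/>
      </attributes>
    </instance_attributes>
  </primitive>
  <primitive id="drbd_pg" class="heartbeat" type="drbddisk">
    <instance_attributes id="drbd_pg_ia">
      <attributes>
        <nvpair id="drbd_pg_1" name="1" value="r0"/>
      </attributes>
    </instance_attributes>
  </primitive>
  <primitive id="fs_pg" class="ocf" provider="heartbeat" type="Filesystem">
    <instance_attributes id="fs_pg_ia">
      <attributes>
        <nvpair id="fs_pg_dev" name="device"    value="/dev/drbd1"/>
        <nvpair id="fs_pg_dir" name="directory" value="/var/lib/pgsql"/>
        <nvpair id="fs_pg_fs"  name="fstype"    value="ext3"/>
      </attributes>
    </instance_attributes>
  </primitive>
</group>
EOF
}

# Typical use, as root on one node of a running cluster:
#   write_group_xml /tmp/group_pg.xml
#   cibadmin -C -o resources -x /tmp/group_pg.xml
```

Resources in a group are started in the order listed, which is why drbddisk comes before the filesystem that sits on it.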

Start the resource group and in a few minutes everything should be started on one of the nodes. To switch over to the other node, run service heartbeat stop on the active one and everything will migrate to the other. Good luck, you should now have an active/passive cluster for your database. This worked for us. Your mileage may vary, but if you hit any issues feel free to leave a comment and we’ll update this HOWTO.


Creating the clustering for the Web was similarly easy. We kept the two web machines as they were, with Apache and Tomcat running on both, and instead clustered the load balancers in much the same way, initially in active/passive (until we can work out the active/active settings). The key difference was that for these machines we ran the load balancing software (HAProxy) on both all the time and the cluster just looked after the IP address. That way, if the primary load balancer failed, nothing was delayed waiting for services to start.