Monday, April 30, 2007

comment spam

The war on comment-spam has now begun. It appears that Blogger might have some anti-spam measures of which I was unaware. Otherwise it’s a strange coincidence that I get a huge number of comment spams for extremely hard-core porn from the Ukraine so soon after starting a Wordpress blog.

About 24 hours before the spam attack there was a strange blog comment that linked to Google (with no offensive or spammy content). It appears that leaving it online was my mistake: when I left it online for a day the spammer decided that I might also leave porn spam online. I arrived home this evening to find almost 100 spams in the form of comments and track-backs, with more arriving by the minute. So I used iptables to block a /20 related to the spam and things are quiet now.
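As a rough sketch of how such a block can be worked out, the function below prints the /20 network that contains a given IPv4 address. The address used is from a range reserved for documentation and is purely illustrative, and net20 is a made-up helper name.

```sh
# Print the /20 network that contains an IPv4 address.
net20() {
    # Split the dotted quad into its octets (the last one is not needed).
    IFS=. read -r a b c rest <<EOF
$1
EOF
    # A /20 keeps the top 4 bits of the third octet and zeroes everything after.
    printf '%s.%s.%s.0/20\n' "$a" "$b" "$(( $c & 240 ))"
}

net20 203.0.113.57   # prints 203.0.112.0/20

# The result could then be handed to iptables (as root), for example:
#   iptables -I INPUT -s "$(net20 203.0.113.57)" -j DROP
```

A /20 covers 4096 addresses, so it is worth checking that nothing legitimate lives in the range before dropping it.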

The moral of the story is to delete anything unusual ASAP in case it encourages the idiots.

I’ve also tightened the anti-spam measures on my blog.

Saturday, April 28, 2007

new blog

I am starting to move my blog to my own Wordpress server. Here is the new URL for my main blog (feed), and here is the new URL for my Source-Dump blog (feed) which is now named just "dump".

Wordpress gives me the power to change all aspects of my blog's operation (including adding plug-ins). It also allows me to correctly display greater-than and less-than characters (the Perl script I use for converting them is at this post - it's short now but will probably grow).
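For illustration only (this is not the Perl script mentioned above), the same kind of conversion can be sketched in shell with sed. The function name escape_html is an assumption made for the example.

```sh
# Convert &, < and > to HTML entities so they display correctly in a post.
escape_html() {
    # Escape & first, otherwise the & in &lt; and &gt; would be escaped again.
    sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
}

echo 'if (a < b && c > d)' | escape_html
# prints: if (a &lt; b &amp;&amp; c &gt; d)
```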

Hopefully the new blog will also solve the date problems that some Planet readers have been complaining about.

I will briefly put the same content on both the old and new blogs. When I'm fully confident in the new blog I'll stop updating the old one and try to get all Planet installations changed. Anyone who wants to convert their Planet installation to my new blog now is welcome to do so.

more on presentations

Here's an amusing video about how not to do presentations.

Thursday, April 26, 2007

paper about ZCAV

This paper by Rodney Van Meter about ZCAV (Zoned Constant Angular Velocity) in hard drives is very interesting. It predates my work by about four years and includes some interesting methods of collecting data that I never considered.

One interesting thing is that apparently on some SCSI drives you can get the drive to tell you where the zones are. If I get enough spare time I would like to repeat such tests and see how the data returned by disks compares to benchmark results.

It's also interesting to note that Rodney's paper shows a fairly linear drop of performance on higher sector numbers (while he notes that it would be expected to fall off more quickly at higher sector numbers). One of my recent tests with a 300G disk showed the greater than linear performance drop (see my ZCAV results page for more details). It might require modern large disks to show this performance characteristic.

I also found it very interesting to see that a modified version of Bonnie was used for some of the tests and that it gave consistent results! I assumed that any filesystem based tests of ZCAV performance would introduce unreasonable amounts of variance into my tests and instead wrote my ZCAV test program to directly read the disk and measure performance.

It's times like this that I wish for a "groundhog day" so that I could spend a year doing nothing but reading technical papers.

Wednesday, April 25, 2007

MySQL security in Debian

Currently there is a problem with the MySQL default install in Debian/Etch (and probably other distributions too). It sets up "root" with dba access and no password by default. The following mysql command gives a list of all MySQL accounts with Grant_priv access (one of the capabilities that gives great access to the database server) and shows their hashed passwords (as a matter of procedure I truncated the hash for my debian-sys-maint account). As you can see the "root" and "debian-sys-maint" accounts have such access. The debian-sys-maint account is used by the Debian package management tools and its password is stored in the /etc/mysql/debian.cnf file.

$ echo "select Host,User,Password from user where Grant_priv='y'" | mysql -u root mysql
Host User Password
localhost root
aeon root
localhost debian-sys-maint *882F90515FCEE65506CBFCD7

It seems likely that most people who have installed MySQL won't realise that this problem exists and will continue to run their machines in that manner, which is a serious issue for multi-user machines. Debian bug #418672 currently tracks this issue. In my tests it affects Etch machines as well as machines running Unstable.
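A minimal sketch of automating the check: the function below reads the tab-separated output that mysql prints when its input is piped (a header row followed by Host, User and Password columns) and lists the accounts whose password field is empty. The name empty_passwords is hypothetical, and the sample rows are the ones shown above.

```sh
# Read "Host<TAB>User<TAB>Password" rows on stdin and print user@host
# for every account that has an empty password. The header row is skipped.
empty_passwords() {
    awk -F '\t' 'NR > 1 && $3 == "" { print $2 "@" $1 }'
}

# Sample data matching the output above; prints root@localhost and root@aeon.
printf 'Host\tUser\tPassword\nlocalhost\troot\t\naeon\troot\t\nlocalhost\tdebian-sys-maint\t*882F90515FCEE65506CBFCD7\n' | empty_passwords

# On a real server the query from this post could be piped in instead:
#   echo "select Host,User,Password from user where Grant_priv='y'" | mysql -u root mysql | empty_passwords
```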

Tuesday, April 24, 2007

booting from USB for security

Sune Vuorela asks about how to secure important data such as GPG keys on laptops.

I believe that the ideal solution involves booting from a USB device with an encrypted root filesystem to make subversion of the machine more difficult (note that physically subverting the machine is still possible - EG through monitoring the keyboard hardware).

The idea is that you boot from the USB device which contains the kernel, initrd, and the decryption key for the root filesystem. The advantage of having the key on a USB device is that it can be longer and more random than anything you might memorise.
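As a sketch of what "longer and more random" can mean in practice, the function below prints 256 bits from the kernel's random pool as 64 hexadecimal characters. The name gen_key is an assumption for the example, and how the key is then fed to the disk encryption setup is deliberately left out.

```sh
# Print a random 256-bit key as 64 hexadecimal characters.
gen_key() {
    # Read 32 random bytes, dump them as hex, then strip spaces and newlines.
    head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n'
    echo
}

gen_key
```

A key like this is far beyond what anyone could memorise, which is why it has to live on the USB device.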

In my previous posts about a good security design for an office, more about securing an office, and biometrics and passwords I covered some of the details of this.

My latest idea however is to have the root filesystem encrypted with both a password that is entered and a password stored on the USB device. This means that someone who steals both my laptop and my USB key will still have some difficulty in getting at my data, while someone who steals just the laptop will find that it is encrypted with a key that can not be brute-forced with any hardware that doesn't involve quantum-computing.

Coincidentally, on Planet Debian on the same day, Michael Prokop documents how to solve some of the problems relating to booting from a USB flash device.

Monday, April 23, 2007

free laptop

Jesus Climent writes about donating laptops.

Free Thinkpad

I have a Thinkpad 385xd laptop to give away for free. It has a PentiumMMX-233 CPU, 96M of RAM, a 3.2G IDE disk, and an 800x600 display. As of my last tests it works well and is currently running an old version of Debian.

The power connector on the laptop is a little broken (it takes a bit of work to plug the cable in) and the cable is also broken (I think that some of the wires are broken and it gets hot when used for a while). Probably the best thing to do would be to solder the cable from the PSU onto the motherboard.

If anyone has a good use for such a machine that benefits a free software project and can arrange to collect it from Melbourne Australia then let me know.

Also I can bring it to any conference that I attend.

Note that I will delete this post once the laptop is taken. So if the Planet link doesn't resolve then someone else got in first.

first look at CentOS 5 Xen

I have just installed a machine running CentOS 5 as a Xen server. I installed a full GUI environment on the dom0 so that GUI tools can be used for managing the virtual servers.

The first problem I had was selecting the "Installation source". When you get it wrong the error message describes it as an "Invalid PV media address", which caused me a little confusion when installing at 10PM. Then I had a few problems getting the syntax of an nfs:// URL correct. But these were trivial annoyances. It was a little annoying that my attempts to use a "file://" URL were rejected; I had hoped that it would just run exportfs to make the NFS export from the local machine (much faster than using an NFS server over the network, which is what the current setup will lead people to do).

The first true deficiency I found with the tools is that they provide no way of creating filesystems on block devices. The process of allocating a block device or file from the Xen configuration tool is merely assigning a virtual block device to the Xen image - and only one such virtual block device is permitted. Then the CentOS 5 installation instance that runs under Xen will have to partition the disk (it doesn't support installing directly to an unpartitioned disk) which will make things painful when it comes time to resize the filesystems.

When running Debian Xen servers I do everything manually. A typical Debian Xen instance that I run will have a virtual block device /dev/hda for the root FS, /dev/hdb for swap, and /dev/hdc for /home. Then if I want to resize them I merely stop the Xen instance, run "e2fsck -f" on the filesystem followed by "resize2fs" and the LVM command "lvresize" (in the appropriate order depending on whether I am extending or reducing the filesystem).
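The ordering can be sketched as follows. The function only prints the commands rather than running them, the name resize_plan is hypothetical, and the device and size in the usage line are placeholders. It assumes the Xen instance has already been stopped.

```sh
# Print (not run) the commands for resizing an ext2/ext3 filesystem on LVM.
# $1 = grow|shrink, $2 = logical volume device, $3 = new size (e.g. 8G)
resize_plan() {
    case "$1" in
        grow)
            # Extend the block device first, then grow the filesystem into it.
            echo "lvresize -L $3 $2"
            echo "e2fsck -f $2"
            echo "resize2fs $2"
            ;;
        shrink)
            # Shrink the filesystem first so the volume is never smaller than it.
            echo "e2fsck -f $2"
            echo "resize2fs $2 $3"
            echo "lvresize -L $3 $2"
            ;;
        *)
            echo "usage: resize_plan grow|shrink DEVICE SIZE" >&2
            return 1
            ;;
    esac
}

resize_plan shrink /dev/lvm/xenhome 8G
```

Getting the order wrong when reducing is the dangerous case, as cutting the volume before the filesystem has been shrunk destroys data.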

Xen also supports creating a virtual partitioned disk. This means I could have /dev/lvm/xenroot, and /dev/lvm/xenswap, and /dev/lvm/xenhome appear in the domU as /dev/hda1, /dev/hda2, and /dev/hda3. This means that I could have a single virtual disk that allows the partitions to be independently resized when the domU in question is not running. I have not tried using this feature as it doesn't suit my usage patterns. But it's interesting and unfortunate that the GUI tools which are part of CentOS don't support it.

When I finally got to run the install process it had a virtual graphics environment (which is good) but unfortunately it suffered badly from the two-mouse-cursor problem, with different accelerations used for the two cursors so that the difference in their positions varied in different parts of the screen. This was rather surprising as the dom0 had a default GNOME install.

lemonup and blog license

I have just updated my previous post about licenses and also explicitly licensed my blog. Previously I had used a Creative-Commons share-alike license for lecture notes to allow commercial use and had not specified what the license is for my blog apart from it being free for feeds (you may add it to a planet without seeking permission first).

Unfortunately the operators of a site named Lemonup decided to mirror many of my blog posts with Google AdWords. The site provides no benefit to users that I can discover and merely takes away AdWords revenue from my site. It has no listed method of contacting the site owner so it seems that blogging about this and letting them read it on their own site is the only way of doing so. :-#

I'm happy for Technorati to mirror my site as they provide significant benefits to users and to me personally. I am also happy for planet installations that include my blog among others to have a Google advert on the page (in which case it's a Google advert for the entire planet not for my blog post).

Also at this time I permit sites to mirror extracts of my articles. So for example the porn blogs that post paragraphs of my posts about topics such as "meeting people" with links to my posts don't bother me. I'm sure that someone who is searching for porn will not be happy to get links to posts about Debian release parties etc - but that's their QA issue not a license issue. I am aware that in some jurisdictions I can not prevent people from using extracts of my posts - but I permit this even in jurisdictions where such use is not mandated by law.

Lemonup: you may post short extracts (10% or one paragraph) of my posts with links to the original posts, or you may mirror my posts with no advertising at all. If those options are not of interest to you then please remove all content I wrote from your site.

Sunday, April 22, 2007

the Right to Fork

Leon Brooks blogged about the Right to Fork (an essential right for free software development) but notes that governments of countries don't permit such a right.

One of the criteria for the existence of a state is the ability to control its own territory. Lose control of the territory and you lose the state, lose some of the territory and the state is diminished. Therefore preventing a division of the territory (a split after a civil war) is the primary purpose of a state. The other criteria of a state are the ability to tax the population, impose civil order, and to administer all other aspects of government. All of these operations are essential to the government and lead to the destruction of the state if they are lost.

It's not that governments want to prevent forking, it's the fact that the existence of the state (on which the existence of the government depends) demands that it be prevented in all but the most extreme situations.

With free software forking is not a problem as multiple groups can work on similar software without interference. If someone else works on a slightly different version of your program then the worst that they can do is to get the interest of more developers than you get. This competition for developers leads to better code!

With proprietary software the desire to prevent forking is due to the tiny marginal cost of software. Most of the costs of running a software company are in the development. The amount of work involved in development does not vary much as the user-base is increased. So doubling the number of sales can always be expected to significantly more than double the company's profit.

One thing that would benefit the computer industry would be to have all the source to proprietary programs put in escrow and then released freely after some amount of time or some number of versions have been released. If Windows NT 4.0 was released freely today it would not take many sales from the more recent versions of Windows. But it would provide significant benefits for people who want to emulate older systems and preserve data. I expect that current versions of MS-Office wouldn't properly read files created on NT 4.0, I'm sure that this is a problem for some people and will become more of a problem as new machines that are currently being designed are not capable of booting such old versions of Windows.

praying for rain

Paul Dwerryhouse posted a comment about the Prime Minister asking people to pray for rain. I don't think that Johnny is suggesting this because he's overly religious (compare his actions with the New Testament of the Bible). The fact is that the Australian government has no plans to deal with global warming, the inefficient distribution of water, and the large commercial farms that produce water inefficient crops such as rice and cotton in areas that have limited amounts of water. This means that small farmers should pray, no-one else will help them!

I wonder if the farmers will ever work out that the National party is doing absolutely nothing for them by its alliance with the Liberal party. Maybe if farmers could actually get a political party that represents their interests then things would change.

Thursday, April 19, 2007

a Heartbeat developer comments on my blog post

Alan Robertson (a major contributor to the Heartbeat project) commented on my post failure probability and clusters. His comment deserves wider readership than a comment generally gets so I'm making a post out of it. Here it is:

One of my favorite phrases is "complexity is the enemy of reliability". This is absolutely true, but not a complete picture, because you don't actually care much about reliability, you care about availability.

Complexity (which reduces MTBF) is only worth it if you can use it to drastically cut MTTR - which in turn raises availability significantly. If your MTTR was 0, then you wouldn't care if you ever had a failure. Of course, it's never zero.

But, with normal clustering software, you can significantly improve your availability, AND your maintainability.

Your post makes some assumptions which are more than a little simplistic. To be fair, the real mathematics of this are pretty darn complicated.

First I agree that there are FAR more 2-node clusters than larger clusters. But, I think for a different reason. People understand 2-node clusters. I'm not saying this isn't important, it is important. But, it's not related to reliability.

Second, you assume a particular model of quorum, and there are many. It is true that your model is the most common, but it's hardly the only one - not even for heartbeat (and there are others we want to implement).

Third, if you have redundant networking, and multiple power sources, as it should, then system failures become much less correlated. The normal model which is used is completely uncorrelated failures.

This is obviously an oversimplification as well, but if you have redundant power supplies supplied from redundant power feeds, and redundant networking etc. it's not a bad approximation.

So, if you have an MTTR of 4 hours to repair broken hardware, what you care about is the probability of having additional failures during those four hours.

If your HA software can recover from an error in 60 seconds, then that's your effective MTTR as seen by (a subset) of users. Some won't see it at all. And, of course, that should also go into your computation. This depends on knowing a lot about what kind of protocol is involved, and what the probability of various lengths of failures is to be visible to various kinds of users. And, of course, no one really knows that either in practice.

If you have a hardware failure every 5 years approximately, and a hardware repair MTTR of 4 hours, then the probability of a second failure during that time is about .009%. The probability of two failures occurring during that time is about 8x10^-7% - which is a pretty small number.

Probabilities for higher order failures are proportionately smaller.

But, of course, like any calculation, the probabilities of this are calculated using a number of simplifying assumptions.

It assumes, for example, that the probabilities of correlated failures are small. For example, the probability of a flood taking out all the servers, or some other disaster is ignored.

You can add complexity to solve those problems too ;-), but at some point the managerial difficulties (complexity) overwhelms you and you say (regardless of the numbers) that you don't want to go there.

Managerial complexity is minimized by uniformity in the configuration. So, if all your nodes can run any service, that's good. If they're asymmetric, and very wildly so, that's bad.

I have to go now, I had a family emergency come up while I was writing this. Later...

End quote.

It's interesting to note that there are other models of quorum; I'll have to investigate that. Most places I have worked have had an MTTR that is significantly greater than four hours. But if you have hot-swap hard drives (so drive failure isn't a serious problem) then having machines average one failure per five years should be possible.
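The figures in Alan's comment can be reproduced with a simple approximation: the chance of a given node failing during a repair window is taken to be the repair time divided by the mean time between failures, and failures are assumed to be uncorrelated. This is a sketch under those assumptions, and second_failure is a name made up for the example.

```sh
# $1 = mean years between failures, $2 = repair time (MTTR) in hours
second_failure() {
    awk -v years="$1" -v mttr="$2" 'BEGIN {
        # Chance that a given node fails inside the repair window.
        p = mttr / (years * 365 * 24)
        printf "a second failure: %.4f%%\n", p * 100
        # Two independent failures inside the same window.
        printf "two more failures: %.2e%%\n", p * p * 100
    }'
}

second_failure 5 4
# prints:
#   a second failure: 0.0091%
#   two more failures: 8.34e-07%
```

With an MTTR of 48 hours instead of 4 the first figure rises by a factor of 12, which is why the repair time matters so much.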

Wednesday, April 18, 2007

2 node vs 3+ node clusters

A comment on my post about the failure probability of clusters suggested that a six node cluster that has one node fail should become a five node cluster.

The problem with this is what to do when nodes recover from a failure. For example if a six node cluster had a node fail and became a five node cluster, then became a three node cluster after another two nodes had failed, then you would have half the cluster that was disconnected. If the three nodes that appeared to have failed became active again but unable to see the other three nodes then you would have a split-brain situation.

As noted in the comment the special case of a two node cluster does have different failure situations. If the connection between nodes goes down and the router can still be pinged then you can have a split brain situation. To avoid this you will generally have a direct connection between the two nodes (either a null-modem cable or a crossover Ethernet cable), as such cables are more reliable than networking which involves a switch or hub. Also the network interface which involves the router in question will ideally also be used as a method of maintaining cluster status - it seems unlikely that two nodes will both be able to ping the router but be unable to send data to each other.

For best reliability you need to use multiple network interfaces between cluster nodes. One way of doing this is to have a pair of Ethernet ports bonded for providing the service (connected to two switches and pinging a router to determine which switch is best to use). The Heartbeat software supports encrypted data so it should be safe to run it on the same interface as used for providing the service (of course if you provide a service to the public Internet then you want a firewall to prevent machines on the net from trying to attack it).

Heartbeat also supports using multiple interfaces for maintaining the cluster data, so you can have one network dedicated to cluster operations and the network that is used for providing the service can be a backup network for cluster data. The pingd service allows Heartbeat to place services on nodes that have good connectivity to the net. So you could have multiple nodes that each have one Ethernet port for providing the service and one port as a backup for Heartbeat operations. If pingd indicated that the service port was not functioning correctly then the services would be moved to other nodes.

If you want to avoid having private Heartbeat data going over the service interface then in the two-node case you need a minimum of two Ethernet ports for Heartbeat and one port for providing the service if you use pingd. If you don't use pingd then you need two bonded ports for providing the service and two ports (either bonded or independently configured in Heartbeat) for Heartbeat, giving a total of four ports.

When there are more than two nodes in the cluster the criterion for cluster membership is that a majority of nodes are connected. This makes split-brain impossible and reduces the need to have reliable Ethernet interfaces. A cluster with three or more nodes could have a single service port and a single private port for Heartbeat, or if you trust the service interface you could do it all on one Ethernet port.
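The majority rule itself is simple enough to sketch in a line of shell. This only illustrates the rule and is not how Heartbeat implements it, and has_quorum is a hypothetical name.

```sh
# Succeeds (exit status 0) when the visible nodes form a strict majority.
# $1 = nodes this partition can see (including itself), $2 = cluster size
has_quorum() {
    [ $(( $1 * 2 )) -gt "$2" ]
}

# A six node cluster split into two halves: neither half may run services.
if has_quorum 3 6; then echo "quorum"; else echo "no quorum"; fi   # prints no quorum

# A five node cluster split 3/2: only the larger side continues.
if has_quorum 3 5; then echo "quorum"; else echo "no quorum"; fi   # prints quorum
```

Because two disjoint partitions can never both hold more than half of the nodes, at most one side ever continues running.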

In summary, three nodes is better than two, but requires more hardware. Five nodes is better than three, but as I wrote in my previous post four nodes is not much good. I recommend against any even number of nodes other than two for the same reason.

Sunday, April 15, 2007

failure probability and clusters

When running a high-availability cluster of two nodes it will generally be configured such that if one node fails then the other runs. Some common operation (such as accessing a shared storage device or pinging a router) will be used by the surviving node to determine that the other node is dead and that it's not merely a networking problem. Therefore if you lose one node then the system keeps operating until you lose another.

When you run a three-node cluster the general configuration is that a majority of nodes is required. So if the cluster is partitioned then one node on its own will shut down all services while two nodes that can talk to each other will continue operating as normal. This means that to lose the cluster you need to lose all inter-node communication or have two nodes fail.

If the probability of a node surviving for the time interval required to repair a node that's already died is N (where N is a number between 0 and 1 - 1 means 100% chance of success and 0 means it is certain to fail) then for a two node cluster the probability of the second node surviving long enough for a dead node to be fixed is N. For a three node cluster the probability that both the surviving two nodes will survive is N^2. This is significantly less, therefore a three node cluster is more likely to experience a critical second failure than a two node cluster.

For a four node cluster you need three active nodes to have quorum. Therefore the probability that a second node won't fail is N^3 - even worse again!

For a five node cluster you can lose two nodes without losing the cluster. If you have already lost a node the probability that you won't lose another two is N^4+(1-N)*N^3*4. As long as N is greater than 0.8 the probability of keeping three nodes out of four is greater than the probability of a single node not failing.

To see the probabilities of four and five node clusters experiencing a catastrophic failure after one node has died run the following shell script for different values of N (0.9 and 0.99 are reasonable values to try). You might hope that the probability of a second node remaining online while the first node is being repaired is significantly higher than 0.9, however when you consider that the first node's failure might have been partially caused by the ambient temperature, power supply problems, vibration, or other factors that affect multiple nodes I don't think it's impossible for the probability to be as low as 0.9.

N=0.9 ; echo "$N^4+(1-$N)*$N^3*4" | bc -l ; echo "$N^3" | bc -l

So it seems that if reliability is your aim in having a cluster then your options are two nodes (if you can be certain of avoiding split-brain) or five nodes. Six nodes is not a good option as the probability of losing three nodes out of six is greater than the probability of losing three nodes out of five. Seven and nine node clusters would also be reasonable options.
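The same reasoning can be generalised to any majority-quorum cluster of three or more nodes. The sketch below assumes one node has already failed and that each remaining node independently survives the repair window with probability N, then sums the binomial terms for every outcome that still leaves a majority. The function name survive is made up for the example, and the two-node special case is not covered.

```sh
# Probability that an n-node majority-quorum cluster keeps quorum after one
# node has already failed. $1 = n (3 or more), $2 = per-node survival chance N
survive() {
    awk -v n="$1" -v p="$2" 'BEGIN {
        m = n - 1              # nodes still running after the first failure
        q = int(n / 2) + 1     # nodes needed for a majority
        f = m - q              # further failures that can be tolerated
        total = 0
        c = 1                  # binomial coefficient C(m, k), starting at k = 0
        for (k = 0; k <= f; k++) {
            total += c * (1 - p)^k * p^(m - k)
            c = c * (m - k) / (k + 1)
        }
        printf "%.4f\n", total
    }'
}

survive 3 0.9   # prints 0.8100
survive 4 0.9   # prints 0.7290
survive 5 0.9   # prints 0.9477
```

For five nodes this is N^4+(1-N)*N^3*4 and for four nodes it is N^3, so it agrees with the bc expressions in this post.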

But it's not surprising that a google search for "five node" cluster high-availability gives about 1/10 the number of results as a search for "four node" cluster high-availability. Most people in the computer industry like powers of two more than they like maths.

Friday, April 13, 2007

Debian/Etch release party in Melbourne - Australia

We are having a release party on Saturday the 14th of April. We meet at mid-day under the clocks at Flinders Street Station and then go somewhere convenient and not too expensive for lunch.

All welcome.


The event was moderately successful. There were only six people including me - that was quite a bit smaller than the Debian 10th birthday party we had in Melbourne, but it was still enough to have fun.

Everyone there had a good knowledge of Linux and Debian and many interesting things were discussed. We had lunch at a Japanese stone-grill restaurant - their specialty is serving raw ingredients along with a stone that's at 400C (or so they claim - I would expect a 400C stone to radiate more heat than I experienced on my previous visit). As it was a warm day we skipped the stone grill and ordered from the lunch menu (which was also a lot cheaper). Some of the guys had never tried Sake or Plum Wine before; they seemed to like it. Strangely the waitress always wanted to deliver alcohol to a 15yo in preference to almost anyone else.

One of the topics of discussion was Linux meetings and the ability to attend them. A point was made that if you are <18yo and rely on your parents' permission to do things then a meeting that finishes at 9PM isn't a viable option. It has previously been noted that for people from regional areas an evening meeting is also inconvenient.

Maybe we should have occasional LUG meetings on a Saturday afternoon to cater for the needs of such people?

Wednesday, April 11, 2007

Spooks and GConf

Jeff Waugh wrote an amusing post about SE Linux and GConf support. It's good to see SE Linux being promoted to the GNOME community.

presentations about SE Linux

I have just read the Presentation Zen blog post about PowerPoint.

One of the interesting suggestions was that it's not effective to present the same information twice, so you don't have notes covering what you say. Having a diagram that gives the same information is effective though because it gives a different way of analyzing the data. I looked at a couple of sets of slides that I have written and noticed that the ratio of text slides to diagram slides was 6:1 and 3:1 in favor of text, and that wasn't counting the first and last slides that have the title of the talk and a set of URLs respectively.

So it seems that I need more and better diagrams. I'll include most of the diagrams I use in my current SE Linux talks in this post with some ideas on how to improve them. I would appreciate any suggestions that may be offered (either through blog comments or email).

The above diagram shows how the SE Linux identity limits the roles that may be selected, and how the role limits the domains that may be entered. Therefore the identity controls what the user may do and in this example the identity "root" means that the user has little access to the machine (a Play Machine configuration). I think that the above is reasonably effective and have been using it for a few years. I have considered a more complex diagram with the "staff_r" role included as well and possibly including the way that "newrole" can be used to change between roles. So I could have the above as slide #1 about identities and roles with a more detailed diagram following to replace a page of text about role transition.

The above diagram shows the domain transitions used in a typical system boot and login process. It includes the names of the types and a summary of the relevant policy rules used to implement the transitions. I also have another diagram that I have used which is the same but without the file types and policy. In the past I have never used both in the one talk - just used one of the two and had text to describe the information content of the other. To make greater use of diagrams I could start with the simple diagram and then have the following slide have all the detail.

The above diagram simply displays the MCS security model with ellipses representing processes and rectangles representing files.

The above diagram shows a simplified version of the MMCS policy. With MMCS each process has a range with the low level representing the minimum category set of files to which it is permitted to write and the high level representing the files that it may read and write. So to write to a file with the "HR" category the process must have a low level that's no higher than "HR" and a high level that is equal or greater than "HR". The full set of combinations of two categories with low and high levels means 10 different levels of access for processes which makes for a complex diagram. I need something other than plain text for this but the above diagram is overly complex and a full set is even more so. Maybe a table with process contexts on one axis, file contexts on another and access granted being one of "R", "RW" or nothing?

I also have an MLS diagram in the same manner, but I now think it's too awful to put on my blog. Any suggestions on how to effectively design a diagram for MLS? For those of you who don't know how MLS works the basic concept is that every process has an "Effective Clearance" (AKA low level) which determines what it can write. It can't write to anything below that level because it might have read data from a file at its own level, and it can't read from a level higher than its own level. MLS also uses a high level for ranged processes and filesystem objects (but that's when it gets really complex).

This last one is what I consider my most effective diagram. It shows the benefits of SE Linux in confining daemons in a clear and effective manner. Any suggestions for improvement (apart from fixing the varying text size which is due to a bug in Dia) would be appreciated.

The above diagrams are all on my SE Linux talks page, along with the Dia files that were used to create them. They may be used freely for non-commercial purposes.

If anyone has some SE Linux diagrams that they would like to share then please let me know, either through a blog comment, email, or a blog post syndicated on Planet SE Linux.

Xen and SE Linux - EWeek review of RHEL5

The online magazine EWeek has done a review of RHEL5. It's quite a positive review which can be summarised as "good support for Xen as service (not an appliance), better value than previous versions with the licenses for multiple guests included, and SE Linux briefly got in the way but the Troubleshooting tool fixed it quickly and easily".

The problem they had is that the SE Linux policy expects Xen images to be in /var/lib/xen/images, but the Xen configuration tools apparently didn't adequately encourage them to use that directory. They stored the images somewhere else and SE Linux stopped it from working. The Troubleshooting tool did something that they didn't describe and then it all worked.

Generally a very positive review of RHEL5 and a moderately positive review of SE Linux in RHEL5.

PS: You might have to turn off JavaScript to view the link, as the page has broken JavaScript code that takes an unreasonable amount of CPU time.

Tuesday, April 10, 2007

what is a BOF?

BOF stands for Birds Of a Feather. It's an informal session run at a conference, usually without any formal approval by the people who run the conference.

Often conferences have a white-board, wiki, or other place where conference delegates can leave notes for any reason. It is used for many purposes including arranging BOFs. To arrange a BOF you will usually write the title for the BOF and the name of the convener (usually yourself if it's your idea) and leave a space for interested people to sign their names. Even though there is usually no formal involvement of the conference organizers they will generally reserve some time for BOFs. Depending on the expected interest they will usually offer one or two slots of either 45 minutes or one hour. They will also often assist in allocating BOFs to rooms. But none of this is needed. All that you need to do is find a notice-board, state your intention to have a BOF at a time when not much else is happening and play it by ear!

My observation is that about half the ideas for BOFs actually happen, the rest don't get enough interest. This is OK, one of the reasons for a BOF is to have a discussion about an area of technology that has an unknown level of interest. If no-one is interested then you offer the same thing the next year. If only a few people are interested then you discuss it over dinner. But sometimes you get 30+ people, you never know what to expect as many people don't sign up - or have their first choice canceled and attend the next on the list!

To run a BOF you firstly need some level of expert knowledge in the field. I believe that the best plan is for a BOF to be a panel discussion where you have a significant portion of the people in the audience (between 5 and 15 people) speaking their opinions on the topic and the convener moderating the discussion. If things work in an ideal manner then the convener will merely be one member of the panel. However it's generally expected that the person running the BOF can give an improvised lecture on the topic in case things don't happen in an ideal manner. It's also expected that the convener will have an agenda for a discussion drawn up so that if the panel method occurs they can ask a series of questions for members of the BOF to answer. My experience is that 8 simple questions will cover most of an hour.

One requirement for convening a BOF is that you be confident in speaking to an audience of unknown size, knowledge, and temperament. Although I haven't seen it done, it would be possible to have two people acting as joint conveners of a BOF: one person with the confidence to handle the audience and manage the agenda and another with the technical skills needed to speak authoritatively on the topic.

Some of the BOFs I have attended have had casual discussions, some have had heated arguments, and some ended up as lectures with the convener just talking about the topic. Each of these outcomes can work in terms of entertaining and educating the delegates.

But don't feel afraid. One of the advantages of a BOF is that it's a very casual affair, not only because of the nature of the event but also because it usually happens at the end of a long conference day. People will want to relax, not have a high-intensity lecture. One problem that you can have when giving a formal lecture to an audience is a nervous reaction such as hyper-ventilating. This has happened to me before and it was really difficult to recover while continuing the lecture. If that happens during a BOF then you can just throw a question to the audience such as "could everyone in the room please give their opinion on X". That will give you time for your nerves to recover while also allowing the audience to get to know each other a bit - it's probably best to have at least one such question on your agenda in case it's needed.

Note that the above is my personal opinion based on my own experience. I'm sure that lots of other people will disagree with me and write blog posts saying so. ;)

The facts which I expect no-one to dispute are:

  • BOFs are informal
  • Anyone can run one
  • You need an agenda
  • You need some level of expert knowledge of the topic

Monday, April 09, 2007

a strange interpretation of the US constitution about copyright

In a blog on InfoWorld the following strange statement appeared:

The US Constitution is clear that the reason for copyright/patent/etc. is to benefit creators of property, not users of property. I appreciate the reason: give creators a reasonable return on their investment.

Actually the US constitution seems to clearly say the opposite. Here is a link to Article I, section 8 of the US constitution. The important phrase is "To promote the progress of science and useful arts, by securing for limited times to authors and inventors the exclusive right to their respective writings and discoveries".

There is no mention of providing benefits to creators of written works and inventions. The aim is clearly stated: to promote the progress of science and useful arts. The exclusive right, which operates for limited times, is merely a method of achieving that aim.

BSD vs GPL licences

James Dumay writes about Theo's latest flame-war.

One interesting part of the debate was Theo's response to this comment:
> We can dual license our code though and that is an
> acceptable license for Linux, the kernel.

We? Sure, you can. But Reyk will not dual license his code, and most of the other people in the BSD community won't either, because then they receive the occasional patch from a GPL-believer which is ONLY under the GPL license, and then they are no less screwed than they would be from the code granted totally freely to companies.

The difference of course is that when you give code to companies under the BSD license you will never know what is done to it, but GPL-only patches can still be used as inspiration for new code development. Sure GPL-only code can't be copied into BSD-only code, but once you know where the bugs are they are easy to fix.

Towards the end of the debate Theo asks the following question:
David, if you found a piece of your code in some other tree, under a different license, would your first point of engatement be a public or private mail?

I can't speak for David, but after reading the discussion I would probably start by blogging about such an issue.

New Debian release and new DPL

Ingo Juergensmann has blogged in detail about the new release and the new DPL.

Sam Hocevar ran for DPL on a platform based on some significant new changes. It will be interesting to see what happens over the next year.

The release of Etch is an exciting milestone in Debian development. Among other things it has SE Linux working!

I'm going to try and arrange a party in Melbourne, Australia to celebrate. We also have a mailing list for Debian people in Melbourne, to subscribe send a message to with the subject subscribe. I'll use that list for arranging the party, send me private email if you are not subscribed but want to attend.

blogger sucks!

If I enter "a < b" in blogger then it works, but if I want the < symbol to be next to some other text (EG for a #include line in C source) then it treats it as an HTML tag. The HTML code for a < symbol also doesn't work. This doesn't work regardless of whether I try entering HTML in the HTML editor or entering text in the "Compose" editor. I could deal with this problem if it forced me to strictly use one of the two editors available, but it fails in both!

Do other blog server programs have problems like this? I think that I need to change my blog server just to allow posting C source!

Also do any of the common blog servers allow a file upload? For most of my posts I do all the editing offline, so I'd rather just upload an HTML file instead of pasting the text into the blog editor (and then manually fixing the situations where its idea of formatting differs from mine). If such an upload mode also supported getting a file via HTTP then that would be convenient too. The option of editing a file on a remote server with vi, exporting it via Apache, and then getting it to the blog server via HTTP would work well for me in some important situations.

Update: It sucks more than I thought. The feed misses the "less than" character between "a" and "b" in the above paragraph. My list of requirements in this regard now includes the ability to use such characters in a feed. Actually I want to go the whole hog and be able to include samples of HTML in my blog entries and have them display correctly in the feed.
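For anyone who does their editing offline, the usual workaround is to convert the special characters to HTML entities before pasting the text into the blog editor. Here is a minimal sketch in Python using the standard library; the function name is made up, and whether a given blog server then leaves the entities alone is a separate question that this doesn't solve.

```python
import html


def escape_for_post(source: str) -> str:
    """Replace &, < and > with HTML entities so the text is not parsed as tags."""
    return html.escape(source, quote=False)


if __name__ == "__main__":
    print(escape_for_post("#include <stdio.h>"))  # prints: #include &lt;stdio.h&gt;
    print(escape_for_post("a<b"))                 # prints: a&lt;b
```

The ampersand is escaped first, so text that already mentions an entity such as "&lt;" is displayed literally rather than being turned back into a < symbol.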

heartbeat - what defines a cluster?

In Debian bug 418210 there is discussion of what constitutes a cluster.

I believe that the node configuration lines in the config file /etc/ha.d/ should authoritatively define what is in the cluster and any broadcast packets from other nodes should be ignored.

Currently if you have two clusters sharing the same VLAN and they both use the same auth code then they will get confused about which node belongs to each cluster.

I set up a couple of clusters for testing (one Debian/Etch and the other Debian/unstable) under Xen using the same bridge device - naturally I could set up separate bridges - but why should I have to?

I gave each of them the same auth code (one was created by copying the block devices from the other - they have the same root password so there shouldn't be a need for changing any other passwords). Then things all fell apart. They would correctly determine that they should each have two nodes in the cluster (mapping to the two node lines), but cluster 1 would get nodes ha1 and ha2-unstable even though it had node lines for ha1 and ha2.
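The behaviour I would expect can be sketched in a few lines of Python. This is not how Heartbeat is implemented and it uses none of its code or configuration syntax; the packet class and function names are hypothetical, and the only point is that membership is decided by the configured node names as well as the auth check.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HeartbeatPacket:
    sender: str    # node name claimed by the packet
    auth_ok: bool  # whether the packet passed the shared auth check


def accept(packet, configured_nodes):
    """Accept a packet only if it is authenticated AND from a configured node."""
    return packet.auth_ok and packet.sender in configured_nodes


def membership(packets, configured_nodes):
    """Return the set of nodes seen, ignoring senders not in the config."""
    return {p.sender for p in packets if accept(p, configured_nodes)}


if __name__ == "__main__":
    cluster1 = {"ha1", "ha2"}  # the two node lines of cluster 1
    seen = [
        HeartbeatPacket("ha1", True),
        HeartbeatPacket("ha2-unstable", True),  # same auth code, other cluster
        HeartbeatPacket("ha2", True),
    ]
    print(sorted(membership(seen, cluster1)))
```

With a filter like this, two clusters on one bridge with the same auth code would each see only their own two nodes.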

I have been told that this is the way it's supposed to be and I should just use different ports or different physical media.

I wonder how many companies have multiple Heartbeat installations on different VLANs such that a single mis-connected cable will make all hell break loose on their network...

Sunday, April 08, 2007

SE Linux - not too difficult for new users

Jan-Frode Myklebust has documented his work in creating new SE Linux policy to run Google Earth on Red Hat Enterprise Linux 5. He discussed this with us on #selinux (the main SE Linux IRC channel).

One of his later IRC comments was:
btw erich, the reason for creating this googleearth module was mostly triggered by your apparmor blog entry.. I wanted to see how much more difficult it would be to create policies in selinux compared to apparmor, and must say it wasn't too hard.

"erich" refers to Erich Schubert. His AppArmor blog entry is currently the top entry in his SE Linux list.

Saturday, April 07, 2007


This morning after having had my car parked in the sun for a couple of hours I poured water on the rear window to cool it (as described in this post). When I did so the mist that rose up from the window spiraled up in a way that was similar to a cyclone (but on a much smaller scale).

Would this have been caused by the steam rising off my car after I poured the water on? Or might the air have been moving in that pattern before and merely have been revealed by the water vapor?

It's a pity I didn't have my camera. My phone-camera doesn't have anything close to the quality needed to make a movie of this (it would be doubtful even with my regular camera).

wikisky is like google maps but looking up! It's cool, check it out!

Friday, April 06, 2007

Linux Tour Bus

I have seen buses used for tours that contain bunk beds. If one or more such buses were hired then a group of Linux people could go on a moving Linux conference. This would have to take place in an area with many reasonable size cities in a close area and where there is a good number of Linux people in such cities. Probably the EU is the only area.

A bus (or several buses depending on demand) would then take a group of Linux people through the major cities and have a conference in each one.

Currently there are conferences such as the Debian conference DebConf which receive sufficient sponsorship money to pay for many speakers to attend. Having a similar conference traveling around Europe shouldn't cost any more money and will give plenty of time for the people in the bus to do some coding along the way.

We already have the Geek Cruises, my idea is to do a similar thing but on the road. Also it isn't practical to transport an entire conference, so it would probably just be speakers on buses and the audiences would vary from city to city.

The Inevitability of Failure

I posted the NSA paper titled The Inevitability of Failure on my source-dump blog at this URL. If you want to get the background to SE Linux then this is the paper to read.

Submitting big documents (62K) to Blogger turns out to be surprisingly painful. It's a pity that Blogger doesn't support editing sections (as MediaWiki does) so every edit requires that the entire document be transferred back and forth. Do other blog servers make this easier?

Thursday, April 05, 2007

Mac vs PC

For a few months I have been using a Mac running OS X for 40 hours a week. Recently a discussion started at a client site as to whether Macs or PCs should be used for future desktop machines.

The Apple hardware is very slick, everything fits together nicely and works well. For example my Apple monitor connects to my Mac via a single cable that supplies power, USB, and the video signal. My USB keyboard and mouse connect to my monitor. So my Mac has only three cables connected to it, power, Ethernet, and the monitor cable. The same thing on a PC would require an additional USB cable going to the PC and an additional power cable going to the monitor.

That is just one trivial example of how Macs are slick.

The advantage of a PC is that installing Linux is much easier and better supported (the vast majority of Linux users have PCs). The benefits of slick hardware are greatly diminished if it's only partly supported in terms of drivers and/or getting answers to technical questions on the net.

Another issue is that of software compatibility. When doing Linux work having a desktop machine that runs X is a significant benefit - even if all your work is text based. The Apple X server that I have installed never worked properly and it's a hack like the various Windows X servers. To make things worse the current versions of OpenOffice for Mac use X and therefore don't work for me - I've been told that the next version of OpenOffice will fix this.

Then there's the issue of price. For a Mac you would want to have a machine with an Intel CPU which means a recent and expensive machine. For a PC the benefits of a 64-bit machine over an old P3 or P4 are very small. For all the work I do a refurbished IBM or Compaq P3 machine would provide all the performance I require, have better software compatibility than a Mac, and cost only $120 for a machine.

Also I imagine that some common and cheap PC expansions (such as KVM switches) would cost a lot more for a Mac.

You can run Linux on a Mac, but what's the point? You lose some of the hardware features so the Mac becomes an expensive PC that's not entirely compatible. I'd rather have a P3 PC on my desktop than the latest Mac.

But I wouldn't rule out recommending a Mac to other people. People who have no-one to help them run Linux are best advised to use a Mac. Of course Windows isn't viable due to security problems.

Wednesday, April 04, 2007

reviewing blog comments and links

It seems that the site is mirroring all my blog posts. The site seems to be doing some good things in terms of spreading information about free software and has a good presentation that makes such information easy to read. Having a backup of my blog posts could also be handy if Blogger ever does the wrong thing.

However it is a little annoying that when I write a blog post that refers to one of my older posts it will get a link back to that site. This is an annoyance for readers who want to see posts that link to mine from outside my blog. So I've been deleting those links when I notice them.

Also someone from Brazil has been linking to my posts, which is a good thing. Their blog also causes my blog to list theirs as a link which is also fine. However the problem is that their blog seems to detect me as being from an English speaking country and gives me an English version of the blog (rather than the presumably Portuguese version that has the link to my article). Assuming that someone speaks English because they reside in Australia is a bad idea, and breaking links is a worse one. So I've been deleting those links from my blog as they are of no use to people who are detected as English speakers (which comprises the vast majority of my blog readers). When someone blogs about one of my posts I want to see what they wrote, even if all that I can read are the parts that are quoted from me!

Finally I've been deleting some comments containing URLs. It seems that there are quite a few people trying to advertise their businesses by posting comments that bear some vague relation to a blog post with their company's URL included. You have to try harder than that if you want to promote yourself on my blog.

Tuesday, April 03, 2007

cheap big TFT monitor

I just received the latest Dell advert. They are offering a 22 inch monitor with 1680x1050 resolution for $499 including delivery! This is a great deal. I've got the same model of monitor at home (I paid $750 for it almost six months ago) and have been totally satisfied. The same monitor with a $499 price is amazing value.

In the past I blogged about the benefits of larger monitors for software development. Now these benefits are available to most computer users in first-world countries.

Now that 1680x1050 is commonly available I expect to see higher resolution monitors dropping in price, at the $800 and $1200 price-points there will need to be something better than that.

The next development will be new software to take advantage of this. One thing that I have heard of is a window manager that splits the display into two halves (in this case they would be 840x1050 resolution). The benefit of having this configuration (according to the people who use it) is that maximising a window will make it take half the physical screen. This means that you could have a debugger in one half of the screen and your application in the other, and to "maximise" the application would not occlude the debugger. Or you could have a web browser and a MUA each using half a screen.

Of course the same result could be achieved by getting two physical displays, but this requires a graphics card that supports "twin-head" operation, and the purchase price of two displays (which will add up to more than $500).

Splitting a screen into two virtual displays is not something that would suit my working patterns. For a lot of my work I just have a screen filled with as many Xterms as will fit. For the GUI stuff I am happy to manually resize things. Maybe a KDE addition that would allow one "Desktop" to be split while another isn't would work.

A final impediment to splitting the screen is that 840 pixels is not enough to correctly display all web sites (many of which are designed for 1024x768). Maybe if I had a split desktop with an icon on the title-bar of the window to unsplit it for one particular window then it would work.
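The arithmetic behind the split is simple enough to check in a few lines of Python. This is only an illustration with made-up function names, not code from any window manager.

```python
def split_halves(width, height):
    """Return two (x, y, width, height) rectangles covering the screen side by side."""
    half = width // 2
    return [(0, 0, half, height), (half, 0, width - half, height)]


def fits(site_width, pane_width):
    """True if a page designed for site_width pixels fits in the pane unscrolled."""
    return site_width <= pane_width


if __name__ == "__main__":
    left, right = split_halves(1680, 1050)
    print(left, right)          # prints: (0, 0, 840, 1050) (840, 0, 840, 1050)
    print(fits(1024, left[2]))  # prints: False
```

A site designed for 1024 pixels of width is 184 pixels too wide for an 840 pixel half, which is why an unsplit option for individual windows would be needed.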

Another use for a large display is virtualisation. I previously blogged about how to use Xephyr to run multiple X sessions on one display. As Xen is now supported in all Linux distributions and KVM and other virtualisation technologies are also being developed, there should be a lot of demand to have multiple virtual machine GUI displays on one desktop (although you could probably do this by manually sizing the windows).

These are just some wild ideas, I have no plans to write the code for any of them, so it'll be a matter of whatever is desired by the people who write the code or pay them. But one thing is certain, the low prices of such monitors will drive new research into how to use them effectively. New technology to effectively use large displays will then drive demand for even larger displays (as will the people who just want to get something better and more expansive than their neighbours). I wonder when we will get to the stage when people are satisfied. For basic office applications commodity PC hardware has far surpassed what is needed for people to do their work.

Geek Social Fallacies

At my Source Dump blog I posted a copy of the Geek Social Fallacies. It's difficult to find a copy when you want it, so I think it needed to be mirrored.

Monday, April 02, 2007

malware?

When looking through my Webalizer stats recently I noticed that * is transferring about four times as much data from my domain than * This wouldn't concern me if I saw some people being referred to my site from, however I see almost none, while is responsible for referring about half the traffic to my site!

Then I looked through the aggregate stats for all web sites hosted on my ISP and noticed that has three times the bandwidth use of google while not showing up in referrals.
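The same comparison can be made directly from the web server logs instead of from the Webalizer pages. The following is a rough sketch, assuming that the logs are in the Apache "combined" format and that client addresses have already been resolved to host names; the function name and the sample host names are made up.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Apache "combined" format: host ident user [time] "request" status bytes "referer" "agent"
LINE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[[^\]]*\] "[^"]*" \d{3} '
    r'(?P<bytes>\d+|-) "(?P<referer>[^"]*)" "[^"]*"'
)


def tally(lines):
    """Return (bytes sent per client host, referral count per referring host)."""
    bytes_by_host = Counter()
    referrals = Counter()
    for line in lines:
        match = LINE.match(line)
        if not match:
            continue  # skip lines that are not in the expected format
        size = match.group("bytes")
        bytes_by_host[match.group("host")] += 0 if size == "-" else int(size)
        ref_host = urlsplit(match.group("referer")).hostname
        if ref_host:  # a referer of "-" has no host and is not counted
            referrals[ref_host] += 1
    return bytes_by_host, referrals


if __name__ == "__main__":
    sample = [
        'crawler.example - - [02/Apr/2007:10:00:00 +1000] "GET /a HTTP/1.1" 200 4000 "-" "bot"',
        '10.0.0.1 - - [02/Apr/2007:10:00:01 +1000] "GET /a HTTP/1.1" 200 1000 '
        '"http://search.example/?q=bonnie" "browser"',
    ]
    sent, referred = tally(sample)
    print(sent.most_common(), referred.most_common())
```

Comparing the two counters for a given search engine shows how much bandwidth its crawlers use relative to the visitors it sends.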

I did a couple of test searches with and it seems that one reason why I'm not getting hits is because the search engine just isn't much good. The search string "bonnie++" does not return any links to my program on the first page (maybe can't handle a '+' character).

So I'm now wondering whether there is any reason to permit the servers to use my bandwidth. It's costing my ISP money for no apparent good cause.

In the past there was a previous MS search engine that I had to block because its attacks (which can not be described in any other way) were using half the web bandwidth of the entire ISP. This case is not so obviously an attack and I'm wondering whether I should permit it to continue for a while just in case they end up giving me some useful referrals.

Of course the other possibility is that if we all block their servers then the results will become even more useless than they currently are and they'll give up on the idea.

I look forward to comments on this issue.

tour of the Gigabyte motherboard factory

This article has some really interesting pictures of the Gigabyte motherboard and video card factory. Check it out!

Sunday, April 01, 2007

buy free software developers dinner

In response to my post about buying dinner for developers (as an alternative to "professional networking sessions") Kris notes that his company has been doing it for years. He goes a little further than I did in my post and advocates buying dinner for developers as a way of thanking them for their work.

I agree that buying dinner for people is a good way of thanking them for their work. I didn't suggest it in my post though because I didn't expect that there would be much interest in such things. I'm glad that Kris has proved me wrong. I'm not sure whether Kris was talking about personally buying dinner or getting his company to do so. In either case it's a really good thing, and I encourage others to do the same!

right-side visual migraine

This afternoon I had another visual migraine. It was a little different from the previous ones in that it had more significant visual effects and in that it affected the right side of my vision. My central vision was OK, the left side was quite good, but the right side was mostly occluded by bright flashes. Closing my right eye seemed to make it a little better - apparently my right eye was more affected than my left. Previous visual migraines had only affected my central vision.

It happened shortly after going outside and it was a sunny afternoon, so maybe the bright light helped trigger it. The Australian optometrist chain OPSM advertise Transitions - lenses that darken when exposed to UV light so they act as sun-glasses when outdoors. This sounds interesting (I don't want to have prescription sun-glasses as well as regular glasses). However there is one concerning item in the advert - "protect your eyes from dazzling sunlight, harsh artificial lighting and the glare from computer screens". I don't want my glasses to go dark when I'm looking at a computer screen (a large portion of my waking hours)!