And there is more than one type of clustering. The kind I mentioned (PVM) is used when you need a lot of computational power, such as in supercomputing. However, you can't just run any old program on a supercomputer cluster. It's a little like running a single-threaded application (like Firefox or OpenOffice.org, for instance) on a multiprocessor machine: it's only going to use one processor on that machine. On supercomputers, applications have to be specially written so that large tasks can be broken down into many smaller tasks and distributed evenly across all processors. In the ideal case, a cluster with 1,000 processors can perform a large task close to 1,000 times faster than a single-processor machine, though any part of the job that can't be split up cuts into that gain. But again, the app has to be written to take advantage of the technology.
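To make the "break a large task into smaller tasks" idea concrete, here is a rough sketch in Python. It is not PVM and not how any real supercomputer code is written; it just uses the standard multiprocessing module on one machine to show the same pattern. The function names (split_range, sum_of_squares, parallel_sum_of_squares) are made up for illustration.

```python
from multiprocessing import Pool


def split_range(n, parts):
    """Split the numbers 0..n-1 into `parts` contiguous (start, stop) chunks
    whose sizes differ by at most one."""
    step, extra = divmod(n, parts)
    chunks, start = [], 0
    for i in range(parts):
        # The first `extra` chunks each take one leftover item.
        stop = start + step + (1 if i < extra else 0)
        chunks.append((start, stop))
        start = stop
    return chunks


def sum_of_squares(chunk):
    """The small task: each worker handles one chunk on its own."""
    start, stop = chunk
    return sum(i * i for i in range(start, stop))


def parallel_sum_of_squares(n, workers=4):
    """The large task: hand one chunk to each worker process, then combine."""
    with Pool(workers) as pool:
        return sum(pool.map(sum_of_squares, split_range(n, workers)))


if __name__ == "__main__":
    print(parallel_sum_of_squares(1_000_000))
```

The program only benefits from extra processors because it was written to split the work up front. A plain loop over the whole range would still run on a single processor no matter how many were available.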
The other type of cluster is very common in businesses. We actually run many computer clusters where I work. For instance, we have web server clusters, mail clusters, proxy clusters, database clusters, etc. These kinds of clusters are designed to spread the load from clients evenly across the machines and to provide high availability. That is, one machine can fail and the others will pick up the load.
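As a toy illustration of load distribution plus failover, here is a minimal round-robin sketch in Python. Real clusters use dedicated load balancers and health checks; the class name RoundRobinCluster, its methods, and the server names are all hypothetical and only meant to show the idea.

```python
import itertools


class RoundRobinCluster:
    """Hands out servers in turn and skips any that are marked as down."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(self.servers)
        self._cycle = itertools.cycle(self.servers)

    def mark_down(self, server):
        """Record that a machine has failed."""
        self.healthy.discard(server)

    def mark_up(self, server):
        """Record that a known machine is back in service."""
        if server in self.servers:
            self.healthy.add(server)

    def pick(self):
        """Return the next healthy server, trying each one at most once."""
        for _ in range(len(self.servers)):
            server = next(self._cycle)
            if server in self.healthy:
                return server
        raise RuntimeError("no healthy servers left in the cluster")


if __name__ == "__main__":
    cluster = RoundRobinCluster(["web1", "web2", "web3"])  # hypothetical names
    cluster.mark_down("web2")
    print([cluster.pick() for _ in range(4)])
```

With web2 marked down, requests simply alternate between web1 and web3, which is the "others pick up the load" behavior described above.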
Wikipedia has a nice writeup, as usual:
http://en.wikipedia.org/wiki/Computer_cluster
Here you can find the top 500 supercomputers:
http://www.top500.org/
Of which Linux is by far the OS of choice:
http://www.top500.org/stats/27/osfam/
I should also have mentioned Beowulf previously:
http://www.beowulf.org/
Nice cooling system on this Linux cluster:
