Fri, 17 Aug 2007 @ 15:35
[/tech]
Flurry rice

I don’t usually take benchmarks very seriously. It’s worthwhile running them on new hardware as a quick check that everything’s working as expected - but if the results are within an order of magnitude of the hoped-for numbers after a single run, then I’m usually happy to move on to more productive tasks. Leave the endless tweaking and measurebation to the inhabitants of gentoo-land.

With flurry, though, I thought I should take a little more care. It has a 10 disk array, so the standard “ach, sure, raid 5 will do” instinct can be very dangerous. A single disk failure will leave the machine vulnerable for up to 72 hours - a couple of days to replace the disk, and another to rebuild the array. That’s a bit too long for comfort, especially if environmental factors have been the root cause of the initial failure.

So; I really, really wanted to go for RAID 6 - but I was unsure as to how much of a performance penalty that would incur. My vague, handwave-y guess was that it’d be about a third slower in use, when compared to RAID 5. I’d consider 50% slower to be unacceptable, and anything less than 25% slower to be surprisingly good.

It turns out that bonnie++ was the best tool for the job. I was able to mimic the sort of operations that our current mail server does most often by using it with the following command line:
bonnie++ -b -d /home/rory/ -u rory -n 128:25000:500:16

i.e. to write 128*1024 files to each of 16 directories, with a random variation of sizes between 500 and 25000 bytes (the average filesize on our current mailserver is 12.14 KB - so that’s about right) - 25 Gigabytes of data in total. The -b option causes it to issue fsync() calls after each file has been written - again, this is the same setup that we’ll be using when the server goes live.
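As a back-of-envelope check on that 25 Gigabyte figure, here is a small sketch of the arithmetic. It follows the reading given above (128*1024 files in each of 16 directories) and assumes file sizes are spread uniformly between the two limits; both are assumptions. The bonnie++ man page describes the files as being distributed evenly amongst the directories, so the per-directory count is worth checking against an actual run.

```python
# Rough size of the small-file workload, following the description above.
# Assumptions (not verified against bonnie++ itself):
#   - 128*1024 files are created in EACH of the 16 directories
#   - file sizes are uniformly distributed between 500 and 25000 bytes
FILES_PER_DIR = 128 * 1024
DIRECTORIES = 16
MIN_SIZE, MAX_SIZE = 500, 25000

total_files = FILES_PER_DIR * DIRECTORIES
mean_size = (MIN_SIZE + MAX_SIZE) / 2          # bytes
total_bytes = total_files * mean_size

print(f"{total_files:,} files, mean size {mean_size / 1024:.2f} KiB")
print(f"about {total_bytes / 2**30:.1f} GiB in total")
```

The mean of the two limits lands close to the 12.14 KB average quoted for the live mail server, which is why those limits make a reasonable stand-in.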

I ran that five times on a 1 TB “vanilla” ext3 partition (mounted noatime, like every other disk partition I’ve touched in the past five years!) sitting on top of an LVM volume, which in turn was mounted on the various types of RAID array (5, 6, 1+0, 0 [the latter for comedy value only, of course]) supported by our HP P400 card. I didn’t bother trying any form of software raid.

For comparison purposes, I also ran bonnie++ on a machine that is identical to ashes - which had served as a webserver from September 2000 to August 2005, and hasn’t been touched since then. It has a 30GB partition mounted at the start of the array (ashes has a 40GB one), which is formatted as reiserfs (as it is on ashes). It’s therefore going to give us a nice indication of how much (if at all) faster the new system is compared to what it’s replacing.

The results are as follows:

RAID test results

Well, unsurprisingly, all of the SATA RAID levels are faster than the old SCSI RAID 5 array for each of the four operations - between 13 and 14 times as fast for random reads (due mostly to having 10 spindles rather than just 4, I’m sure).

However, RAID 6 is “only” 3.01 times faster for random creates, compared to 5.87x for RAID5. That’s very close to being unacceptable to me, especially since this sort of operation accounts for almost a third of all those performed on our current mailserver.
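To line those two speed-ups up against the thresholds I set earlier (50% slower than RAID 5 being unacceptable), here is a quick sketch of the sum. The only inputs are the two random-create figures quoted above; the variable names are mine.

```python
# Random-create speed-up over the old SCSI array, as quoted above.
speedup_vs_old = {"RAID 5": 5.87, "RAID 6": 3.01}

# How fast is RAID 6 relative to RAID 5, and how big is the penalty?
relative = speedup_vs_old["RAID 6"] / speedup_vs_old["RAID 5"]
penalty = 1 - relative

print(f"RAID 6 runs at {relative:.0%} of RAID 5 speed for random creates")
print(f"i.e. a penalty of about {penalty:.0%}")
```

That comes out a whisker under the 50% line, which is why it only counts as "very close" to unacceptable.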

Another option may be to go for RAID 5 + a hot spare. I’d end up with almost the same speed as the 10-disk RAID5 array, whilst being able to automatically begin the array rebuild after the first failure - reducing the “danger time” by two thirds.

On the other hand, a minimum of three times faster than the current system is still perfectly decent. I think I’m going to need to do another round of measurebation, aren’t I? Oh god, it’ll be -funroll-loops and buying a bass tube for my Vauxhall Nova next…


Tue, 14 Aug 2007 @ 12:22
[/tech]
Ashes to, er Flurry…

Time for a new mailserver, then.

In an ideal world, I’d be putting the following on my shopping list:

HP DL380 G5, dual 2.67 GHz Xeon 5150 CPUs, 8 GB RAM, 8x 36 GB 2.5” 15k rpm SAS disks

- for the main mail server, anti-spam&virus, the mail queue (a large number of low-capacity 2.5” disks is the best route to achieving ultra-low seek times, which is important for randomly-accessed data like email), and sending our weekly mailshots.

HP DL320s, single 2.67 GHz Xeon 3070 CPU, 4 GB RAM, 12x 300 GB 3.5” 15k rpm SAS disks

- nfs server for users’ Maildirs, and the customer care mail database.

DL140 G3, dual 2.67 GHz Xeon 5150 CPUs, 8 GB RAM, 2x 100 GB 7.2 krpm SATA disks (for booting from only) x2

- user-facing servers - the first hosting a number of Xen instances for people to read mail using mutt or adjust their procmail setups, as well as pop3 and imap servers, and the second running webmail and the web frontend for the customer care mail system.

mm, tasty.

Unfortunately, I don’t have a budget of fifteen grand to spend on the above, so I’m going to make do with just the one box. In fact, it’s worse than that, as I’m also going to have to use this machine as a replacement for our ftp server, our “friends’n’family” webserver, and as a backup server (connected to a nice lto2 tape array).

As a result, I plumped for the following:

HP DL320s, single 2.67 GHz Xeon 3070 CPU, 4 GB RAM

- with an upgrade to a 512 MB battery-backed write cache

2x 72 GB 15k rpm 3.5” SAS disks

- a RAID 1 array for the mail queue and system partitions

10x 250 GB 7.2k rpm 3.5” SATA disks

- for users’ Maildirs

Total price was about £3,000 - a fifth of the cost of doing it right.


Meet flurry, our new mailserver

Things won’t be so awful for our technical staff - I’ll export their Maildirs to their own Xen instances on our big development server, laganside - so they can read their mail nice and quickly there. And I’ll probably inject the half-million message mailshots from infuse, a Xen instance elsewhere on our network. Even so - the new setup will merely provide a noticeable improvement to our users, rather than being “zomg ultra-turbo-plus-plus!”. bah!

My task for the rest of the week is to thoroughly benchmark the new machine, dubbed flurry. In particular, I’m interested to see the difference in speed between the various disk array setups that are open to me - JBOD (2.5 TB available), RAID 1+0 (1.25 TB available), RAID 5 (2.25 TB available), and RAID 6 (2.0 TB available).
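Those capacity figures fall straight out of the usual rules of thumb for each level. A minimal sketch, assuming ten identical 250 GB disks and ignoring controller and filesystem overhead (the function name is just for illustration):

```python
def usable_tb(level, disks=10, disk_tb=0.25):
    """Usable capacity in TB for a simple array of identical disks."""
    if level == "JBOD":
        data_disks = disks          # every disk holds data
    elif level == "RAID 1+0":
        data_disks = disks // 2     # half the disks are mirror copies
    elif level == "RAID 5":
        data_disks = disks - 1      # one disk's worth of parity
    elif level == "RAID 6":
        data_disks = disks - 2      # two disks' worth of parity
    else:
        raise ValueError(f"unknown level: {level}")
    return data_disks * disk_tb

for level in ("JBOD", "RAID 1+0", "RAID 5", "RAID 6"):
    print(f"{level:9s} {usable_tb(level):.2f} TB")
```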

My instinct is to go for either JBOD or RAID 6 - two disk failures will kill a RAID 5 array, and have a 50% chance of killing a RAID 1+0. With that number of disks, from the same manufacturer (a number of different batches, though), and subject to the same physical environment, the chances of experiencing multiple disk failures are higher than I’d like. I’m willing to be persuaded otherwise if the performance penalty for RAID 6 turns out to be huge, though.
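A quick sanity check on the RAID 1+0 figure. This sketch assumes the ten disks are arranged as five two-disk mirrors and that any pair of disks is equally likely to be the two that fail; both are assumptions on my part, not details of how the P400 actually lays things out. Under that simple model the odds of a double failure being fatal come out nearer one in nine than one in two, though failures that share a cause (heat, a bad batch) are not independent and would push the real-world risk up.

```python
from fractions import Fraction
from itertools import combinations

DISKS = 10  # assumed layout: disks 0+1, 2+3, ... form the five mirrors

# Every possible pair of failed disks.
pairs = list(combinations(range(DISKS), 2))

# A double failure is fatal only when both disks belong to the same mirror.
fatal = [p for p in pairs if p[0] // 2 == p[1] // 2]

p_fatal = Fraction(len(fatal), len(pairs))
print(f"{len(fatal)} of {len(pairs)} double failures are fatal: {float(p_fatal):.0%}")
```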

Anyway, flurry has now been running memtest86+ for just over 24 hours, so it’s time for me to go start the benchmarking. hurrah!


Tue, 14 Aug 2007 @ 10:43
[/tech]
Dust to dust, etc.

I arrived at Sendit two-and-a-half years ago, and it quickly became obvious that every server needed to be replaced, and every system needed to be overhauled. That job is pretty much complete now - and, thanks to the magic of Xen, we’ve gone from having 60 or so servers to 11 (in fact, the savings in electricity alone will pay for the cost of the machines within the first half of their expected service life).

One of the last tasks on my list is to replace our mail server - something I’ve been looking forward to, as email is probably the closest I have to being a specialist subject. Our current server is ashes - a Dell Poweredge 2450 that entered service on 5th September, 2000. I’m aiming to do the switchover on its seventh birthday :)

Ashes has two 666 MHz Pentium III Xeon processors, a gigabyte of RAM, and 54 GB of disk (4x 18 GB 10krpm SCSI disks in a hardware RAID 5 array). It runs a fairly vanilla install of Qmail, qmail-popup for pop3, and courier-imap for imap4 (both of which are wrapped with stunnel for the ssl-ised variants). Some semblance of anti-spam measures is provided by rblsmtpd (pointed at the sbl-xbl.spamhaus.org blacklist), and most users also run spamassassin from their procmailrc. Anti-virus is provided by McAfee’s uvscan.

Our mail system has a couple of quirks - first, mail coming in to our customer care team is forwarded on to a separate server, angel (another PE2450, though slightly older) for entry into a mysql-backed perl behemoth. Secondly, we send weekly special offers mails to half a million or so customers that have opted in to that service - this is done by a third server, y02, which is a shoddy *8* year old Dell Dimension XPS desktop box. eep!

Mail volume is pretty substantial - we have 91 “real” users, and 568 aliases (not counting all the various username-blah@ “dot-qmail”-style aliases). On a typical day, we would see around 350,000 delivery attempts, of which maybe 140,000 will be accepted into the system. Both of those figures can rise by 100,000 in the 24 hours after we send a mailshot, thanks to the staggering number of inventively-broken “out of office” autoresponders that our customers use.

This brings us to one of qmail’s major weaknesses. Since it doesn’t check if a user actually exists before accepting mail into the system, we can’t bounce backscatter / spam / improperly-addressed mail at SMTP time. Instead, we create around 40,000 new bounce messages every day, which might have been acceptable a decade ago, but is terribly anti-social these days.

Of the 100,000 emails that are delivered to users every day, perhaps three quarters are spam or viruses that either get sent to /dev/null or (all too often) end up in people’s inboxes. In short, more modern SMTP-time checks would save our server from doing an awful lot of work.
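Putting the daily figures from the last few paragraphs side by side makes the point rather starkly. A small sketch using only the round numbers quoted above (they are all estimates, and the variable names are mine):

```python
# Typical daily volumes, as estimated above.
attempts = 350_000   # SMTP delivery attempts
accepted = 140_000   # messages accepted into the system
bounces = 40_000     # accepted, then bounced because the user doesn't exist

delivered = accepted - bounces     # what actually reaches a mailbox
spam = delivered * 3 // 4          # "perhaps three quarters"
wanted = delivered - spam

print(f"delivered to users: {delivered:,}")
print(f"wanted mail:        {wanted:,}")
print(f"wanted mail as a share of all attempts: {wanted / attempts:.1%}")
```

In other words, only a small fraction of the traffic hitting the server is mail that anyone actually wants to read.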

Finally, we have the issue of disk space. Ashes has 36 GB devoted to the /home partition. This has currently got less than a gigabyte free, and has never been less than 90% full since I’ve been here. Users have become adept at downloading mail, and storing it in whatever nasty, fragile binary format outlook uses. Even so, I have to harass them every month or two to clean their inboxes out - a huge waste of everyone’s time. Since we use the dreaded RicerFS (in notail mode, too!) most Maildir/ files are unimaginably fragmented - to the point that opening a maildir containing 1,000 messages can take over two minutes.

Something needs to be done…


Tue, 01 Aug 2006 @ 08:54
[/tech]
New shiney(sic)

I’ve bought myself a Canon 5D digital SLR. I’ve only been out to use it once so far, but photos are at my new gallery-thing, at http://gallery.nothovel.net/ (and an RSS feed of my favourite images is available at http://gallery.nothovel.net/rss/favourites ).

I’m incredibly impressed with the camera. It’s a bit smaller than my film SLR, but a bit heavier (well, of course it is - a film camera is an empty box, whereas a digital camera is stuffed full of electronics). Other than that, they operate in an almost identical manner - other than that the DSLR can change ISO on the fly! hurrah!

Now, I’ve been scanning (mostly self-developed b&w) film for a couple of years; my Nikon film scanner turns out images that are almost 9 megapixels in size, while the DSLR has almost 13 megapixels. The difference in quality is /much/ huger than a mere 20% jump in horizontal resolution would suggest.
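For anyone wondering where the 20% comes from: pixel count scales with the square of the linear resolution, so the gain along one axis is the square root of the megapixel ratio. A tiny sketch, using the rounded 9 and 13 megapixel figures and assuming both images have the same aspect ratio:

```python
import math

scanner_mpix = 9.0   # "almost 9 megapixels" from the film scanner
dslr_mpix = 13.0     # "almost 13 megapixels" from the DSLR

# Area goes up by the ratio; each linear dimension by its square root.
area_gain = dslr_mpix / scanner_mpix
linear_gain = math.sqrt(area_gain)

print(f"{area_gain - 1:.0%} more pixels, but only {linear_gain - 1:.0%} more per axis")
```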

9 mpix is high-enough resolution to see individual grain clusters at iso 400 or above, and even at iso 100, non-grainy, well-focused bits didn’t look /sharp/ when viewed at 1:1. This weekend, with the DSLR, and using my two crappiest lenses, I was able to achieve sharpness at iso 400 that almost cut your eyeballs to ribbons when compared with the output from the film scanner.

Not only that, but there’s /colour/. Lots and lots and /lots/ of it. In fact, perhaps a bit too much - I’d read that Canon’s higher-end DSLRs tended to produce rather flat colour, because that provides a better base for later post-processing work. I wasn’t really wanting to spend hours on each image in the Gimp, so I set contrast and saturation to one level above their default values (and therefore matched the setup in the 350D and 30D cameras). On a bright sunny morning, this made everything look like it had come from a crappy holiday brochure, which I’m not really sure I like.

Ah well, at least I’ve got plenty of new things to learn about :)


Thu, 27 Jul 2006 @ 21:54
[/tech]
I have a new baby

Well, my home network does, at least. Marrow is my 1 week old media storage server, and has almost a terabyte of storage space - which is exported via nfs to the other machines in my house.

The machine itself is an old Dell Dimension XPS. The “XPS” is Dell-speak for “not quite as shit as a standard Dimension”. I’ve always been a fan of Dell Optiplex and Precision desktops - but the Dimension and XPS series make the Slashdot crowd’s sneery attitude towards Dell desktops seem justified. ah well.

Anyway, when bought, marrow had a 700 MHz Pentium 3 cpu, 256 MB RAM, 20 GB of hard disks, and a quad speed cd-rom drive. Now, it has er, a 700 MHz cpu, 756 MB RAM, 5x 250 GB disks, 2x five disc cd changers, and a dvd-rw drive. Anyone would think that it’d been pilfering from a cyclist’s medicine cabinet.

Each disk has two partitions on it, of 5 GB and 245 GB:

+------------++------------++- - - - - - +   +------------------------+
|            ||            |             |   |                        |
| /dev/hda1  || /dev/hdb1  || /dev/hdc1      | /dev/hdd1    /dev/hde1 |
| raid 1     || raid 1     |  raid 1     |   | raid 0                 |
| / (mirror) || / (mirror) || / (spare)      | swap                   |
|            ||            |             |   |                        |
+------------++------------++ - - - - - -+   +------------------------+
+---------------------------------------------------------------------+
|                                                                     |
| /dev/hda2     /dev/hdb2      /dev/hdc2       /dev/hdd2    /dev/hde2 |
| raid 5                                                              |
| mounted on /srv                                                     |
|                                                                     |
+---------------------------------------------------------------------+

So, we have a redundant, mirrored 5 GB root partition, with a hot spare. 10 GB of swap is available on a striped set - this is great for a toy setup, but a silly idea for a real machine, as failure of either disk will cause the machine to die spectacularly if/when it tries to use more than half of the available swap space.

I’ve been fiddling around to see the best parameters to use with nfs. After lots of geeky testing, I’ve come up with the following mount line:

marrow:/srv/music /home/rory/music nfs rw,noatime,rsize=32768,wsize=32768,hard,intr

noatime: Don’t update the file’s last access timestamp - I’ve never used the atime for anything, so I’m happy not to have to go to the remote server every time I touch a file.

rsize, wsize: These values surprised me. The default is 2048, and the standard recommendation is to increase it to 8192 if your server supports large values. To be honest, I’d expected that 64k - the server’s raid 5 stripe size - would provide the best performance; but, in my tests, 32k was significantly faster than any other value (with 16k being marginally faster than either 8k or 64k). I have a feeling that this is because the mode of the file sizes in my trial set (my homedir, essentially) was 31.5k. In other words, the optimal rsize and wsize values are dependent on the size of the files being used.

hard: This allows file operations to be retried even if they initially time out - which means that the nfs server can go away for a reboot without causing any client-side problems.

intr: hard mounting has a drawback, however - if the server crashes, the client will keep on trying to access it. You won’t be able to close any program that has a lock on a remote file, or unmount any remote partitions. intr fixes this by letting the client interrupt any such operation. hurrah!

Finally, a note about using nfs on a desktop system. Desktop environments, like GNOME, need to know what’s happening to files in a user’s homedir, so that they can show those changes in the file manager, etc.

The obvious way to do this is to periodically poll each file for changes. This is, of course, slow, system intensive, and icky. A much better way is for the kernel to notify the desktop each time that some other subsystem changes a file - something it does using the older dnotify, or the shiny new inotify interface.
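To make it obvious why polling is the icky option, here is a rough sketch of what a poller has to do: stat every file under a directory, remember the result, and compare it with the previous pass. This is a generic illustration in Python, not how gamin or fam are implemented, and the function names are made up for the example.

```python
import os

def snapshot(root):
    """Map every file under root to its (mtime_ns, size)."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # file vanished between listing and stat
            state[path] = (st.st_mtime_ns, st.st_size)
    return state

def diff(old, new):
    """Return (added, removed, changed) paths between two snapshots."""
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    changed = sorted(p for p in old.keys() & new.keys() if old[p] != new[p])
    return added, removed, changed
```

Every pass costs one stat() per file, whether or not anything changed, and over nfs each of those may be a round trip to the server. A notification interface only does work when something actually happens.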

In the olden days, we used “fam” to listen for dnotify events. You could run it on a remote nfs server, and get it to broadcast update notifications to nfs client machines, which was all very handy. However, fam has now been replaced with “gamin”, which is more lightweight (and so less system-intensive), and can listen for inotify events rather than dnotify (again, less system-intensive).

Unfortunately, gamin can’t do the funky listen-to-a-remote-fam-server thing. This is supposedly a feature-not-a-bug. “security, donchaknow”. BUT, JESUS FUCK, NFS IS WILDLY INSECURE /ANYWAY/. cretins.

So, gamin falls back to polling all your nfs mounts every 10 seconds. Braindead, yes? So, ease the pain by making it every 90s instead, by editing your /etc/gamin/gaminrc to add the following line:

fsset nfs poll 90

With that added, marrow is able to saturate the network to both humble (full duplex eth100) and sanguine (802.11g) - which is exactly what I wanted to achieve. hurrah!


Wed, 22 Feb 2006 @ 13:42
[/tech]
Working with graphics from the command-line

Sarah’s been busy in the studio, taking photos of various gorgeous people for her project on body modification. Yesterday’s session produced around a gigabyte of images, which were uploaded to wintermute for safekeeping.

Unfortunately, some of them didn’t turn out properly. In this one, for example, the studio flash didn’t fire - so the image was taken with the modelling lights only. Therefore, it’s underexposed and the colour balance is badly off.

underexposed peet

What could I do to help?

Well, the images are on a remote server, and they’re almost 4 MB each. I didn’t fancy having to download, edit, then re-upload them - so that limited me to the command-line tools available in the Imagemagick suite. In particular, I’ve used the “mogrify” and “convert” commands - “mogrify” changes the original image, whereas “convert” saves the changes to a new file.

Step one: auto-rotation

Sarah’s Canon 350D camera has a sensor that detects if a photo is taken in portrait orientation, rather than landscape. This information is encoded in the EXIF data within the jpeg headers, and can be used to rotate any images that need it without losing any quality:

jhead -autorot *.jpg

Step two: normalization

Okay, the image is now the right way up, but it’s still very dark. Imagemagick’s “normalize” option will spread out the colour values over the full range, increasing contrast and (usually) helping to correct colour balances (note that this is called “Auto Levels” in the GIMP and Photoshop):

mogrify -normalize *.jpg
normalized peet
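The idea behind normalizing is easy to show on a handful of numbers. The sketch below does a plain linear stretch of one channel so that its darkest value becomes 0 and its brightest becomes 255. It is a simplified illustration only: it works on a short list rather than an image, and it makes no attempt to copy whatever clipping or per-channel handling Imagemagick itself applies.

```python
def stretch(values, out_max=255):
    """Linearly rescale values so they span 0..out_max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return list(values)  # flat input: nothing to stretch
    return [round((v - lo) * out_max / (hi - lo)) for v in values]

# An underexposed channel that only uses a sliver of the range.
dark = [10, 20, 40, 60]
print(stretch(dark))  # [0, 51, 153, 255]
```

Because each channel gets stretched to the full range, a colour cast caused by one channel being squashed tends to be reduced at the same time.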

Step three: converting to b&w

Hm, those colours are awful - he looks jaundiced and spotty and hungover, nothing like his appearance in the properly-lit pictures - and the damage is probably beyond repair without lots of tedious pixel-level editing. The range of tones is decent enough, though, and there’s plenty of contrast - so let’s try converting that one to black and white:

mogrify -monochrome 1213.jpg
monochrome peet

Ooops. That’s a dithered two-colour image - we want a grayscale. The “grayscale” or “desaturate” option in GUI graphics programs will almost always average the red, green, and blue values of each pixel - thus creating a b&w image that contains information from all three colour channels. To replicate this in Imagemagick, we use the “modulate” option, which lets us specify a percentage change for brightness, saturation, and hue:

mogrify -modulate 100,0,100 1213.jpg
desaturated peet
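For reference, the channel-averaging approach described above looks like this for a single pixel. It is a generic sketch of the technique, written in Python for illustration; I am not claiming that it matches the exact sum that -modulate performs internally.

```python
def desaturate_average(pixel):
    """Grey value = plain mean of the red, green and blue channels."""
    r, g, b = pixel
    grey = round((r + g + b) / 3)
    return (grey, grey, grey)

print(desaturate_average((90, 120, 30)))  # (80, 80, 80)
```

Since all three channels contribute equally, any noise that lives in just one of them is carried straight into the grey result.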

Not much better, is it? Unfortunately, the colour image was rather noisy - but that noise was largely confined to the colour of each pixel, rather than the brightness of it. So, rather than just extracting the r, g, and b values from each pixel, let’s try extracting the “intensity” and “luminosity” values from each pixel.

convert 1213.jpg -fx 'luminosity' 1213.luminosity.jpg
luminous peet

convert 1213.jpg -fx 'intensity' 1213.intensity.jpg
intense peet

For a portrait, the luminosity version is probably the better - but either is a huge improvement over a simple desaturation. Hurrah!

Step four: noise reduction

There’s still a little bit of noise in that image - so, let’s try to smooth that out using imagemagick’s “enhance” option:

convert 1213.luminosity.jpg -enhance 1213.enhanced.jpg
enhanced peet

That’s an improvement - but at the cost of a little loss of detail in his hair, and a general softening of the image. Personally, I think the original was fine - but Imagemagick’s noise reduction can work wonders on images that would otherwise have been ruined by grain.

Step five: sharpening

All the cool kids apply an unsharp mask to their images, so let’s do that, too. Imagemagick can auto-select appropriate values, which is really handy for doing a reasonable-enough job on huge batches of images:

convert 1213.enhanced.jpg -unsharp 0 1213.unsharp.jpg
unsharp peet
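If you have ever wondered why blurring something is called "sharpening": an unsharp mask blurs a copy of the image, takes the difference between the original and the blur, and adds that difference back on top. A one-dimensional sketch of the idea follows. It uses a crude 3-tap box blur for brevity, where real tools normally use a Gaussian, and the function names and the amount parameter are just for illustration.

```python
def box_blur(row):
    """3-tap box blur, repeating the edge values at both ends."""
    padded = [row[0]] + list(row) + [row[-1]]
    return [(padded[i] + padded[i + 1] + padded[i + 2]) / 3
            for i in range(len(row))]

def unsharp(row, amount=1.0):
    """original + amount * (original - blurred), clamped to 0..255."""
    blurred = box_blur(row)
    return [min(255, max(0, round(v + amount * (v - b))))
            for v, b in zip(row, blurred)]

# A soft-ish edge from dark (50) to light (200).
edge = [50, 50, 50, 200, 200, 200]
print(unsharp(edge))  # [50, 50, 0, 250, 200, 200]
```

The flat areas are untouched, while the pixels either side of the edge are pushed further apart, which is what makes the edge look crisper.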

And finally…

Because jpegs are lossy, the manipulations we’ve made in steps two to five should really be done at the same time, in order to preserve maximum image quality (and also to keep the file size down):

convert 1213.jpg -normalize -fx 'luminosity' -enhance -unsharp 0 1213.processed.jpg
final peet

easy, eh?


Thu, 29 Sep 2005 @ 09:45
[/tech]
Getting a Dell Poweredge to show all available memory

The four six-year-old Dell Poweredge 2300 machines that we’d been running our main site on have slowly been dying off over the last few months.

In July, I bought two brand new Poweredge 1850s as replacements. With dual Xeon64 processors and a couple of gigs of ram, they should be more than adequate, alongside a couple of the older boxes. I cloned an image of the old server setup onto one of them, and put it into service immediately. I was intending to install Debian Sarge from scratch onto the other, but various perl module incompatibilities need more care and attention than I’ve had time to give of late.

Yesterday, though, another of the old servers failed. We’re also running a very successful promotion at the moment, which has increased pressure on our machines. Last week, the PE1850, carrying 50% of traffic, was experiencing a load avg of around 0.4. Yesterday evening, carrying 75% of traffic, it was seeing a load of around 9.5. With dual hyper-threaded processors, a load average of more than 4 will mean that customers may notice slowdowns in accessing the site.
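The "more than 4" threshold is just the number of logical CPUs: two processors, each presenting two hyper-threads. Dividing the load average by that count gives a rough feel for how oversubscribed the box is. A small sketch, treating each hyper-thread as a full CPU, which is generous:

```python
SOCKETS = 2
THREADS_PER_SOCKET = 2   # hyper-threading: two logical CPUs per processor
logical_cpus = SOCKETS * THREADS_PER_SOCKET

def load_per_cpu(load_average):
    """Runnable processes per logical CPU; above 1.0 means queueing."""
    return load_average / logical_cpus

for label, load in (("last week", 0.4), ("yesterday evening", 9.5)):
    print(f"{label}: {load_per_cpu(load):.2f} per logical CPU")
```

At 9.5 there are more than two runnable processes competing for every logical CPU, so requests are spending a good deal of their time waiting in the queue.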

So; I decided to give up on the install-from-scratch, and simply clone the server system image onto the second PE1850. That took about six hours in all - there’s a 100 Mbps LES line between the datacentre and my office, but only a 10 Mbps hub at my desk (beh!).

Strangely, though, I couldn’t get the second new Poweredge to show all available memory in Linux. I tried Linux 2.4 and 2.6 kernels, and both the lilo and grub bootloaders, but saw only 256 MB ram with each. agh!

Both have the same kernel config - with CONFIG_MPENTIUM4, CONFIG_HIGHMEM4G and CONFIG_HIGHMEM all set (and CONFIG_NOHIGHMEM unset, obviously). I ran a diff between the dmesg output on both machines, and the only difference was an “allocated 32 pages and 32 bhs reserved for the highmem bounces” line on the “working” machine that wasn’t present on the broken one. That’s diagnostic, though - not anything that would /cause/ problems. And, yes, I’d tried passing “mem=2048M” on the kernel’s command line.

Strangely, all two gigs of RAM could be seen in the bios’ memory test, and when running memtest86, both from the boot loader and from the commandline.

So, eventually, I went to have a hoke through the BIOS settings. I hadn’t changed either of them from the factory defaults, but it seems that the “broken” machine had “OS Install mode” selected, whereas the other didn’t. What does “OS Install mode” do? Why, it limits the amount of memory reported to 256 MB. agh!

I turned that off, and lo, all 2048MB was visible in linux! hurrah! But, why had that been set in the first place? Who knows? And what possible use can it have? I’m presuming it’s something to do with OS/2 or Windows NT4 compatibility, but even so… grrrrrr!


Mon, 06 Jun 2005 @ 14:14
[/tech]
My vacuum cleaner died last month…

…but even when it was working, it had never done a very good job on my living room carpet, which is exceptionally cheap and nasty. It’s woven from artificial fibres, which moult and then get stuck in the remaining pile, forming hairballs. The hoover wasn’t powerful enough to do more than push those around; it functioned more as an old-fashioned carpet sweeper, and left the place looking almost as filthy as if I didn’t bother to hoover at all.

So; I needed a new hoover. Immediately discounting the overpriced dyson-type tat, I went to look for one of those “henry” devices that office cleaners use. I reasoned that they cost about £25, so were essentially disposable. Better still - though they may not have nilfisk-grade filtration, or any other “features” to speak of, they are powerful and reliable enough for industrial use, and are therefore easily good enough for use in the not.nothovel.

Unfortunately, that idea seems to have occurred to more than a few other people, and Henry prices have quadrupled in the last five years. bah.

So, thought I, if I’m going to spend that much, why not spring for a Roomba? Well, they turned out to be £200-ish (or £150 on ebay) - but Paddy mentioned that B&Q had had some cheaper ones in last time he was there…

One trip to B&Q later, and I was now in possession of their own brand Roomba-clone for £30 (it’s not listed on their website, unfortunately - the closest match is some £900[!] Karcher thing).

I’m really rather impressed with it; it doesn’t do the return-to-base-after-a-set-time thing like the roomba - it just keeps on wandering about until the battery runs out (generally in about an hour). That would probably be a problem in an office environment, but is fine in my living room. The dust compartment is small - but after a month without hoovering, it was only just about full after the first hour-long run, so it’ll be more than enough for normal use.

Best of all, it’s really amusing to watch. I sort-of wish I had cats, to see how they’d react to it…

Anyway; pictures of it are boring - it just looks like a roomba. What you really want is this video (6.8 MB)