rryBlog
Tue, 14 Aug 2007
Dust to dust, etc.

I arrived at Sendit two-and-a-half years ago, and it quickly became obvious that every server needed to be replaced, and every system needed to be overhauled. That job is pretty much complete now - and, thanks to the magic of Xen, we’ve gone from having 60 or so servers to 11 (in fact, the savings in electricity alone will pay for the cost of the machines within the first half of their expected service life).

One of the last tasks on my list is to replace our mail server - something I’ve been looking forward to, as email is probably the closest thing I have to a specialist subject. Our current server is ashes - a Dell PowerEdge 2450 that entered service on 5th September, 2000. I’m aiming to do the switchover on its seventh birthday :)

Ashes has two 666 MHz Pentium III Xeon processors, a gigabyte of RAM, and 54 GB of disk (4x 18 GB 10krpm SCSI disks in a hardware RAID 5 array). It runs a fairly vanilla install of qmail, with qmail-popup for POP3 and courier-imap for IMAP4 (both of which are wrapped with stunnel for the SSL-ised variants). Some semblance of anti-spam protection is provided by rblsmtpd (pointed at the sbl-xbl.spamhaus.org blacklist), and most users also run SpamAssassin from their procmailrc. Anti-virus is provided by McAfee’s uvscan.

Our mail system has a couple of quirks. First, mail coming in to our customer care team is forwarded on to a separate server, angel (another PE2450, though slightly older), for entry into a MySQL-backed Perl behemoth. Secondly, we send weekly special-offer mails to half a million or so customers who have opted in to that service - this is done by a third server, y02, which is a shoddy *8* year old Dell Dimension XPS desktop box. eep!

Mail volume is pretty substantial - we have 91 “real” users, and 568 aliases (not counting all the various username-blah@ “dot-qmail”-style aliases). On a typical day, we would see around 350,000 delivery attempts, of which maybe 140,000 will be accepted into the system. Both of those figures can rise by 100,000 in the 24 hours after we send a mailshot, thanks to the staggering number of inventively-broken “out of office” autoresponders that our customers use.

This brings us to one of qmail’s major weaknesses. Since it doesn’t check whether a user actually exists before accepting mail into the system, we can’t reject backscatter / spam / improperly-addressed mail at SMTP time. Instead, we create around 40,000 new bounce messages every day, which might have been acceptable a decade ago, but is terribly anti-social these days.
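To make the idea of an SMTP-time recipient check concrete, here is a minimal sketch in Python. It is not how qmail or any particular replacement does it: the function name `rcpt_response`, the sets passed in, and the reply texts are all made up for illustration. It also assumes that any `username-blah` extension of a real user is deliverable, which is more generous than qmail's actual dot-qmail lookup.

```python
def rcpt_response(address, local_domains, users, aliases):
    """Decide the reply to an SMTP RCPT TO command.

    Hypothetical helper: `local_domains`, `users` and `aliases` are plain
    sets of lower-case strings supplied by the caller.
    Returns a (code, text) tuple; 250 accepts, 550 rejects.
    """
    local, _, domain = address.rpartition("@")
    local, domain = local.lower(), domain.lower()

    # No "@" at all, or a domain we do not host: refuse outright.
    if not local or domain not in local_domains:
        return 550, "not a local address"

    # Exact match on a real mailbox or a configured alias.
    if local in users or local in aliases:
        return 250, "ok"

    # Assumption: "user-anything" is accepted whenever "user" exists.
    base = local.split("-", 1)[0]
    if base in users:
        return 250, "ok"

    # Unknown recipient: reject now, so no bounce message is ever created.
    return 550, "no such user here"


if __name__ == "__main__":
    domains = {"example.com"}
    users = {"alice"}
    aliases = {"customer-care"}
    for rcpt in ("alice@example.com", "alice-lists@example.com",
                 "customer-care@example.com", "nobody@example.com"):
        print(rcpt, rcpt_response(rcpt, domains, users, aliases))
```

The point of rejecting during the SMTP conversation is that the sending server stays responsible for telling its own user about the failure, so a forged sender address never receives a bounce from us.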

Of the 100,000 emails that are delivered to users every day, perhaps three quarters are spam or viruses that either get sent to /dev/null or (all too often) end up in people’s inboxes. In short, more modern SMTP-time checks would save our server from doing an awful lot of work.
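Putting the round numbers above together gives a feel for how little of the daily traffic is mail anyone wants. The sketch below is just arithmetic on the figures quoted in this post; the function name `daily_breakdown` is made up, and it assumes that delivered mail is simply accepted mail minus the mail that turns into bounces.

```python
# Approximate daily figures quoted in the post.
ATTEMPTS = 350_000    # delivery attempts per day
ACCEPTED = 140_000    # messages accepted into the system
BOUNCES = 40_000      # accepted messages that end up as bounces
SPAM_SHARE = 0.75     # "perhaps three quarters" of delivered mail


def daily_breakdown(attempts=ATTEMPTS, accepted=ACCEPTED,
                    bounces=BOUNCES, spam_share=SPAM_SHARE):
    """Split a day's traffic into rough categories (illustrative only)."""
    # Assumption: everything accepted is either bounced or delivered.
    delivered = accepted - bounces
    spam = round(delivered * spam_share)
    wanted = delivered - spam
    return {
        "turned_away": attempts - accepted,
        "bounced": bounces,
        "delivered": delivered,
        "spam_or_virus": spam,
        "wanted": wanted,
        "wanted_share_of_attempts": wanted / attempts,
    }


if __name__ == "__main__":
    for name, value in daily_breakdown().items():
        print(f"{name}: {value}")
```

On these figures, only about 25,000 of the 350,000 daily attempts end up as mail somebody wanted, which is roughly one in fourteen.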

Finally, we have the issue of disk space. Ashes has 36 GB devoted to the /home partition. This currently has less than a gigabyte free, and has never been less than 90% full since I’ve been here. Users have become adept at downloading mail, and storing it in whatever nasty, fragile binary format Outlook uses. Even so, I have to harass them every month or two to clean their inboxes out - a huge waste of everyone’s time. Since we use the dreaded RicerFS (in notail mode, too!) most Maildir/ files are unimaginably fragmented - to the point that opening a maildir containing 1,000 messages can take over two minutes.

Something needs to be done…
