Getting a Dell Poweredge to show all available memory
The four six-year-old Dell Poweredge 2300 machines that we’d been running our main site on have slowly been dying off over the last few months.
In July, I bought two brand new Poweredge 1850s as replacements. With dual Xeon64 processors and a couple of gigs of RAM, they should be more than adequate, alongside a couple of the older boxes. I cloned an image of the old server setup onto one of them, and put it into service immediately. I was intending to install Debian Sarge from scratch onto the other, but various Perl module incompatibilities needed more care and attention than I’ve had time to give of late.
Yesterday, though, another of the old servers failed. We’re also running a very successful promotion at the moment, which has increased pressure on our machines. Last week, the PE1850, carrying 50% of traffic, was experiencing a load avg of around 0.4. Yesterday evening, carrying 75% of traffic, it was seeing a load of around 9.5. With dual hyper-threaded processors, a load average of more than 4 will mean that customers may notice slowdowns in accessing the site.
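That rule of thumb (load average compared against the number of logical CPUs) can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this post, and the threshold of 4 assumes two hyper-threaded processors showing up as four logical CPUs.

```python
import os


def overloaded(load_avg, logical_cpus):
    """True when more processes want CPU time than there are logical CPUs."""
    return load_avg > logical_cpus


def current_load_per_cpu():
    """One-minute load average divided by the logical CPU count (Unix only)."""
    one_minute, _five, _fifteen = os.getloadavg()
    cpus = os.cpu_count() or 1  # cpu_count() can return None
    return one_minute / cpus


# Figures from this post, assuming 4 logical CPUs:
#   overloaded(0.4, 4) is False  (last week, 50% of traffic)
#   overloaded(9.5, 4) is True   (yesterday evening, 75% of traffic)
```

A per-CPU figure above 1.0 from current_load_per_cpu() says the same thing as a raw load average above 4 on this hardware.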
So, I decided to give up on the install-from-scratch, and simply clone the server system image onto the second PE1850. That took about six hours in all - there’s a 100 Mbps LES line between the datacentre and my office, but only a 10 Mbps hub at my desk (beh!).
Strangely, though, I couldn’t get the second new Poweredge to show all available memory in Linux. I tried Linux 2.4 and 2.6 kernels, and both the lilo and grub bootloaders, but saw only 256 MB of RAM with each. agh!
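A quick way to spot this sort of thing is to read the MemTotal line from /proc/meminfo and compare it with what is physically installed. The sketch below is a rough illustration rather than the check I actually ran: the function names and the 10% slack are assumptions, and MemTotal is always a little below the installed figure because the kernel reserves some memory for itself.

```python
import re


def mem_total_mb(meminfo_text):
    """Extract MemTotal from the text of /proc/meminfo, in whole megabytes."""
    match = re.search(r"^MemTotal:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("no MemTotal line found")
    return int(match.group(1)) // 1024


def looks_capped(total_mb, installed_mb, slack=0.10):
    """True when the kernel sees much less memory than is installed.

    The slack (an arbitrary 10% here) allows for memory the kernel
    reserves for itself before reporting MemTotal.
    """
    return total_mb < installed_mb * (1 - slack)


def check_this_machine(installed_mb):
    """Read the live /proc/meminfo (Linux only) and apply the check."""
    with open("/proc/meminfo") as handle:
        return looks_capped(mem_total_mb(handle.read()), installed_mb)
```

On a box with 2048 MB fitted, check_this_machine(2048) returning True would flag the problem described here.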
Both have the same kernel config - with CONFIG_MPENTIUM4, CONFIG_HIGHMEM4G and CONFIG_HIGHMEM all set (and CONFIG_NOHIGHMEM unset, obviously). I ran a diff between the dmesg output on both machines, and the only difference was an “allocated 32 pages and 32 bhs reserved for the highmem bounces” line on the “working” machine that wasn’t present on the broken one. That line is a symptom, though - not anything that would /cause/ problems. And, yes, I’d tried passing “mem=2048M” on the kernel’s command line.
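The dmesg comparison can also be done in Python if diff isn't to hand. This is a minimal sketch with made-up helper names, not what I ran at the time: it assumes the two dmesg outputs have been saved as text, strips the "[ 12.345678]" timestamps that some kernels prepend so they don't show up as differences, and ignores line order.

```python
import re

# Optional "[    0.000000] " prefix that some kernels add to each line.
_TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s*")


def normalise(dmesg_text):
    """Split dmesg output into non-empty lines with any timestamp removed."""
    lines = []
    for line in dmesg_text.splitlines():
        cleaned = _TIMESTAMP.sub("", line).strip()
        if cleaned:
            lines.append(cleaned)
    return lines


def only_in_first(first_text, second_text):
    """Lines that appear in the first dmesg output but not in the second."""
    second = set(normalise(second_text))
    return [line for line in normalise(first_text) if line not in second]
```

Calling only_in_first(working, broken) on the two saved outputs lists whatever the working machine logged that the broken one didn't.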
Strangely, all two gigs of RAM showed up in the BIOS memory test, and when running memtest86, both from the boot loader and from the command line.
So, eventually, I went to have a hoke through the BIOS settings. I hadn’t changed either of them from the factory defaults, but it seems that the “broken” machine had “OS Install mode” selected, whereas the other didn’t. What does “OS Install mode” do? Why, it limits the amount of memory reported to 256 MB. agh!
I turned that off, and lo, all 2048 MB was visible in Linux! hurrah! But why had that been set in the first place? Who knows? And what possible use can it have? I’m presuming it’s something to do with OS/2 or Windows NT4 compatibility, but even so… grrrrrr!