Leap second causing ksoftirqd and java to use lots of cpu time

Today there was a leap second at 23:59:60 UTC. On one of my systems, this caused a high CPU load starting around 02:00 GMT+2 (which corresponds to the time of the leap second). ksoftirqd and a Java (GlassFish) process were using lots of CPU time. This system was running Debian Squeeze with kernel 2.6.32-45. The problem is very easy to fix: just run

# date -s "`date`"

and everything will be fine again. I found this solution on the Linux Kernel Mailing List: http://marc.info/?l=linux-kernel&m=134113389621450&w=2. Apparently a similar problem can happen with Firefox, Thunderbird, Chrome/Chromium, Java, MySQL, VirtualBox and probably other processes.

I was a bit surprised that this problem only happened on this particular machine, because I have several other servers running similar kernel versions.
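
To find out which machines actually went through the leap second insertion, one could search the kernel log before resetting the clock. This is only a sketch: it assumes the kernel logged its usual "inserting leap second" notice, and the helper name leap_second_logged is hypothetical.

```shell
# Hypothetical helper: succeeds if the text on stdin contains the kernel's
# leap second notice ("Clock: inserting leap second 23:59:60 UTC").
leap_second_logged() {
    grep -q 'inserting leap second'
}

# Usage sketch (as root): reset the clock only if the notice was logged.
#   dmesg | leap_second_logged && date -s "$(date)"
```

The absence of the notice does not prove a machine is unaffected, since the kernel log buffer may already have rotated.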

MegaCLI: useful commands

Recently I installed a server with a Supermicro SMC2108 RAID adapter, which is actually an LSI MegaRAID SAS 9260. LSI created a command line utility called MegaCLI for Linux to manage this adapter. You can download it from their support pages. The downloaded archive contains an RPM file. I installed mc and rpm on Debian with apt-get, and then extracted the MegaCli64 binary (for x86_64) to /usr/local/sbin, and the libsysfs.so.2.0.2 from the Lib_utils RPM to /opt/lsi/3rdpartylibs/x86_64/ (that’s the location where MegaCli64 looks for this library).

Here are some useful commands:

View information about the RAID adapter

For checking the firmware version, battery back-up unit presence, installed cache memory and the capabilities of the adapter:

# MegaCli64 -AdpAllInfo -aAll

View information about the battery backup unit state

# MegaCli64 -AdpBbuCmd -aAll

View information about virtual disks

Useful for checking RAID level, stripe size, cache policy and RAID state:

# MegaCli64 -LDInfo -Lall -aALL

View information about physical drives

# MegaCli64 -PDList -aALL

Patrol read

Patrol read is a feature which tries to discover disk errors before it is too late and data is lost. By default it runs automatically (with a delay of 168 hours between patrol reads) and takes up to 30% of IO resources.

To see information about the patrol read state and the delay between patrol read runs:
# MegaCli64 -AdpPR -Info -aALL

To find out the current patrol read rate, execute
# MegaCli64 -AdpGetProp PatrolReadRate -aALL

To reduce patrol read resource usage to 2% in order to minimize the performance impact:
# MegaCli64 -AdpSetProp PatrolReadRate 2 -aALL

To disable automatic patrol read:
# MegaCli64 -AdpPR -Dsbl -aALL

To start a manual patrol read scan:
# MegaCli64 -AdpPR -Start -aALL

To stop a patrol read scan:
# MegaCli64 -AdpPR -Stop -aALL

You could use the above commands to run patrol read in off-peak times.
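
As a rough sketch of that idea, a small wrapper around the start and stop commands could be called from cron. The function name patrol_read, the script path and the cron times are assumptions, and this presumably only makes sense after automatic patrol read has been disabled.

```shell
# Path to the MegaCLI binary; override MEGACLI if yours lives elsewhere.
MEGACLI="${MEGACLI:-/usr/local/sbin/MegaCli64}"

# Hypothetical wrapper: start or stop patrol read on all adapters.
patrol_read() {
    case "$1" in
        start) "$MEGACLI" -AdpPR -Start -aALL ;;
        stop)  "$MEGACLI" -AdpPR -Stop -aALL ;;
        *)     echo "usage: patrol_read start|stop" >&2; return 2 ;;
    esac
}

# In a script, dispatch on the first argument:  patrol_read "$1"
# Example /etc/cron.d entries (times and script path are assumptions):
#   0 1 * * 6  root  /usr/local/sbin/patrol-read.sh start
#   0 6 * * 6  root  /usr/local/sbin/patrol-read.sh stop
```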

Migrate from one RAID level to another

In this example, I migrate the virtual disk 0 from RAID level 6 to RAID 5, so that the disk space of one additional disk becomes available. The second command is used to make Linux detect the new size of the RAID disk.

# /usr/local/sbin/MegaCli64 -LDRecon -Start -r5 -L0 -a0
# echo 1 > /sys/block/sda/device/rescan

Create a new RAID 5 virtual disk from a set of new hard drives

First we need to know the enclosure and slot number of the hard drives we want to use for the new RAID disk. You can find them with the first command. Then I add a virtual disk using RAID level 5, followed by the list of drives I want to use, specified in enclosure:slot syntax.

# MegaCli64 -PDList -aALL | egrep 'Adapter|Enclosure|Slot|Inquiry'
# MegaCli64 -CfgLdAdd -r5'[252:5,252:6,252:7]' -a0
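
To avoid copying the enclosure and slot numbers by hand, the PDList output can be reduced to enclosure:slot pairs. This is a minimal sketch that assumes the output contains "Enclosure Device ID: N" and "Slot Number: N" lines for each drive; the helper name pd_slots is hypothetical.

```shell
# Hypothetical helper: read "MegaCli64 -PDList -aALL" output on stdin and
# print one enclosure:slot pair per physical drive.
pd_slots() {
    awk -F': *' '
        /^Enclosure Device ID/ { enc = $2 }          # remember the enclosure
        /^Slot Number/         { print enc ":" $2 }  # emit enclosure:slot
    '
}

# Usage: MegaCli64 -PDList -aALL | pd_slots
```

The list includes drives that are already part of an array, so check the full PDList output before passing any pair to -CfgLdAdd.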

Extending an existing RAID array with a new disk

First check the enclosure device ID and the slot number of the newly added disk with the command above. Then we reconstruct the logical drive, adding the new drive. For a RAID 5 array this command is used:

# MegaCli64 -LDRecon -Start -r5 -Add -PhysDrv[32:3] -L0 -a0

View reconstruction progress

When reconstructing a RAID array, you can check its progress with this command.
# MegaCli64 -LDRecon -ShowProg -L0 -a0

(replace L0 by L1 for the second virtual disk, and so on)

Configure write-cache to be disabled when battery is broken

# MegaCli64 -LDSetProp NoCachedBadBBU -LALL -aALL

Change physical disk cache policy

If your system is not connected to a UPS, you should disable the physical disk cache in order to prevent data loss. To check the current setting:

# MegaCli64 -LDGetProp -DskCache -LAll -aALL

To disable it:

# MegaCli64 -LDSetProp -DisDskCache -LAll -aALL

To enable it (only do this if you have a UPS and redundant power supplies):

# MegaCli64 -LDSetProp -EnDskCache -LAll -aALL


Fixing grub-probe error: Couldn’t find PV, check your device.map.

Today I was getting this error when installing a new kernel on a server running Debian:

/usr/sbin/grub-probe: error: Couldn't find PV pv2. Check your device.map.

The error can be reproduced by running the update-grub command.

The day before, a new RAID disk was added to this server, so I suspected this could be the cause. The file /boot/grub/device.map contained a reference to the first RAID disk as (hd0) but did not contain a reference to the new RAID disk. I ran

# ls -l /dev/disk/by-id/

to find out which SCSI ID referred to sdb (the new RAID disk), and then added the following line to device.map:

(hd1) /dev/disk/by-id/scsi-3600304800087c4f015fb4f2e4cc7a8e5

Now installing the new kernel works fine!
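
Instead of reading the ls output by eye, the matching by-id name can be looked up with a small loop. This is a sketch only; the helper name by_id_for is hypothetical, and it simply prints every symlink that resolves to the given device.

```shell
# Hypothetical helper: print every symlink in a by-id directory that
# resolves to the given device node.
by_id_for() {
    dev=$(readlink -f "$1")
    dir="${2:-/dev/disk/by-id}"
    for link in "$dir"/*; do
        [ -L "$link" ] || continue
        [ "$(readlink -f "$link")" = "$dev" ] && echo "$link"
    done
    return 0
}

# Usage: by_id_for /dev/sdb
```

A disk usually has more than one by-id name, so pick the scsi- entry if you want to follow the device.map line above.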

HP support: the end

So, how did the almost two-month struggle with HP’s support end (see part 1, part 2)?

On Wednesday evening, I received a mail that a 160 GB SSD was sent and we received it on Thursday morning. Also during the same week, we received a 500 GB 7200 RPM hard drive, which was meant to be a temporary replacement until the 160 GB SSD was available again.

So things are finally solved for good. I am just surprised that the 160 GB SSD suddenly became available so quickly now (it was pretty useless to send a 500 GB disk if the SSD would arrive only a few days later). Is this just coincidence or did the complaining convince HP to finally make a real effort to find a replacement quickly? We will probably never know.

HP's customer support: part 2

Two weeks ago I wrote about my struggle with HP’s customer service. To summarize: HP was unable to replace a failed 160 GB SSD because it was not in stock and was unable to provide me any other alternative even after one month. In the end, a 250 GB SSD was promised, but it also was not delivered.

  • Friday morning, 1 July, I call HP’s support service back. It seems that they still need approval from the Customer Relations Team (CRT) Belgium to send me a 250 GB SSD instead of the 160 GB one, but they were unable to get an answer from CRT. The guy on the phone tries several times to call CRT while I am waiting, but the call is always dropped after one minute. In the end, he cannot do anything more than contact CRT by e-mail.
  • Friday afternoon, I receive an e-mail from CRT. CRT Belgium & Luxembourg seems to be located in Sofia (Bulgaria), but they are answering me in Dutch. They approve the replacement of the 160 GB SSD by a 250 GB one and apologize for the long delay. Finally I start having some hope that things will be fixed now.
  • Friday evening, I take a look at the case log on HP’s support site. To my consternation, I read that a few hours after CRT approved the replacement by a 250 GB SSD, the 250 GB SSD turned out to be unavailable too! The case log mentions that I was informed about the delay, but I have not had any contact with HP since Friday morning, so the only way I discover the new delay is by logging into HP’s site and reading the case log.
  • Monday morning, 4 July, I reply to CRT and to the support case that I do not accept the new delay and I demand an immediate solution. I receive a message in which they apologize for the delays and inform me that people of superior departments are looking for a solution “with appropriate priority”. I also receive a message from our HP distributor asking whether this problem is still pending. I confirm to them on Wednesday 6 July that this is still the case, and they will transfer my complaint to HP Belgium.
  • I finally get a reaction from HP on Friday 8 July. They inform me that they will send me a 500 GB 7200 RPM disk instead of the 160 GB SSD, which is not deliverable. The disk will arrive on Monday 11 July. I answer that I do not accept this as a final solution to the problem, because a 7200 RPM disk is much slower and much cheaper than the 160 GB SSD this machine was bought with. In the afternoon I get the answer that the 500 GB 7200 RPM disk will be sent as a temporary replacement then, and that a 160 GB SSD will be ordered too and sent as soon as it is available.
  • As of Sunday 10 July, I have no indication that the 500 GB disk has been sent, so I am quite skeptical that the disk will be there on 11 July. I also have my doubts if and when I will finally receive an SSD.

To be continued…

HP: worthless customer support

One month ago, on 25 May, I contacted HP support because an HP EliteBook 8540p (NU486AV) notebook had a broken 160 GB SSD (which is actually an Intel X25-M disk). The drive was not recognized anymore: neither the BIOS nor a Linux rescue CD could find any connected hard drive. This machine was only a few months old and was bought with a 3-year HP eCare Pack for Next Business Day warranty support. Today, 30 June, HP still has not provided me with any solution, not even a temporary one.

Here is a summary of what happened:

  • When calling HP’s customer service on the phone on 25 May, I was promised to receive a replacement SSD the next day. The helpdesk guy explicitly checked whether the disk was not out of stock, and apparently it was not. In the case log this is written as: “Part is NOT on CRT TOP shortage list , Part can be ordered”.
  • The next day I get a mail stating: “Your ordered part is delayed, the delivery date is not yet known.”
  • I do not hear anything from customer support for more than a week. On 6 June, I ask for a status update via HP’s support website. The same day someone calls me back to inform me that the SSD is out of stock. He only offers me an 80 or 120 GB SSD as an alternative, which I obviously do not agree with: I want a disk of at least 160 GB.
  • I hear nothing from customer support for another two weeks. On 20 June I contact them again via the support site, demanding an immediate solution. In the case log this triggers this cryptic message: “PSL Status requested by email to PS”. I do not get any reply.
  • Later that same week I ask my local HP distributor whether they can do something to trigger a solution. I do not get any reply.
  • On 28 June I let HP customer support know via the support site that I am unhappy with their lack of initiative to provide a solution. I do not get any reply.
  • On 29 June I try the HP support chat function. Before entering the chat I have to select my country from a list and provide the serial number of the machine. The chat support guy first asks for some details about my identity and about the machine. He apologizes for the fact that I had to wait more than a month for a solution and starts to look at the case log. After looking at the case log he suddenly says that HP chat support is only available for the UK and Ireland. So why do they even let me enter the chat after I chose Belgium as my country, and why did he ask for all the details about the case?
  • I call customer support again by phone. While I am waiting on the phone, a recorded message recommends trying HP’s chat support!
  • The guy on the phone proposes to send me a 250 GB SSD disk. He still needs confirmation from the technical service whether this model is compatible with that laptop. If I did not get any message the same day that would mean that all was OK and then I would have the new SSD the next day.
  • The case log shows these entries after I called:
    Sub-case comment added: Jun 29, 2011 1:15:27 PM
    again no answer at CRT
    consult PS
    Sub-case comment added: Jun 29, 2011 11:19:31 AM
    tried to call CRT belgium >> no answer
    try again in 1 hour after lunch
    Sub-case comment added: Jun 29, 2011 10:50:52 AM
    Cu called back
    595756-001 250GB solid-state drive (SSD) – SATA interface, 2.5-inch form factor
    This part IS supported for this notebook
  • On 30 June, I still do not have a new SSD and nobody contacted me. It seems the whole case is stuck and forgotten again and I will have to call back once again to get the whole process unstuck.

So for the second time, things seem to be stuck at “waiting for PS”. I do not know who or what this “PS” is, but it is clear to me that it is not doing its job.

I can only conclude that HP’s customer support is just worthless. Cases are not followed up and the customer is never informed about the status. HP’s customer service takes no initiative to propose an alternative solution. Instead the customer repeatedly has to take the initiative to make any progress. And even then when customer support is reminded of the problem, they do not do anything to prevent it from getting stuck again. Chat support is totally useless even though it is recommended by HP.

In the last few months, I have had contact with HP support 3 times for other small problems. Things were all fixed in a reasonable manner, although it always took more than 1 business day to get a replacement, which is a pity. However, what is happening now is simply unacceptable.

Almost all systems I bought the last few years were HP systems. I will definitely re-evaluate this, because reasonable customer support is simply essential with systems used in production in business. I am very unhappy and dissatisfied with HP support so I will consider alternatives in the future.

What are other people’s experiences with HP customer support in Belgium/The Netherlands?

Linux performance improvements

Two years ago I wrote an article presenting some Linux performance improvements. These performance improvements are still valid, but it is time to talk about some new improvements available. As I am using Debian now, I will focus on that distribution, but you should be able to easily implement these things on other distributions too. Some of these improvements are best suited for desktop systems, others for server systems and some are useful for both.

Improving Mediawiki performance

Now that I am on the subject of improving performance, I configured some performance improvements for a Mediawiki installation here:

  • Make sure you run the latest Mediawiki version. Mediawiki 1.16 introduced a new localisation caching system which is supposed to improve performance, so you definitely want this to get the best performance.
  • Create a directory where Mediawiki can store the localisation cache (make sure it is writable by your web server). By preference store it on a tmpfs (at least if you are sure it will be big enough to store the cache), and configure it in LocalSettings.php:
    $wgCacheDirectory = "/tmp/mediawiki";
    If /tmp is on a tmpfs, you might add creation of this directory with the right permissions to /etc/rc.local, so that it still exists after a reboot.
  • Enable file caching in Mediawiki’s LocalSettings.php:
    $wgFileCacheDirectory = "{$wgCacheDirectory}/html";
    $wgUseFileCache = true;
    $wgShowIPinHeader = false;
    $wgUseGzip = true;
  • Make sure you have installed some PHP accelerator for caching. I have APC installed and configured it in Mediawiki’s LocalSettings.php:
    $wgMainCacheType = CACHE_ACCEL;
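
The rc.local step mentioned above could look roughly like this. It is a sketch under assumptions: the helper name make_cache_dir is hypothetical, and the owner (www-data, the usual web server user on Debian) and the mode 0750 should be adapted to your setup.

```shell
# Hypothetical helper for /etc/rc.local: recreate the MediaWiki cache
# directory (and its html subdirectory) after a reboot.
# The default owner www-data and mode 0750 are assumptions.
make_cache_dir() {
    dir="$1"
    owner="${2:-www-data}"
    mkdir -p "$dir/html" &&
    chown -R "$owner" "$dir" &&
    chmod -R 0750 "$dir"
}

# Usage in /etc/rc.local:
#   make_cache_dir /tmp/mediawiki www-data
```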

Here is a benchmark before implementing the above configuration (with CACHE_NONE, but APC still installed):

$ ab -kt 30 http://site/wiki/index.php/Page
This is ApacheBench, Version 2.3 < $Revision: 655654 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking site (be patient)
Finished 255 requests

Server Software: Apache/2.2.16
Server Hostname: site
Server Port: 80

Document Path: /wiki/index.php/Page
Document Length: 12750 bytes

Concurrency Level: 1
Time taken for tests: 30.084 seconds
Complete requests: 255
Failed requests: 0
Write errors: 0
Keep-Alive requests: 0
Total transferred: 3344070 bytes
HTML transferred: 3251250 bytes
Requests per second: 8.48 [#/sec] (mean)
Time per request: 117.978 [ms] (mean)
Time per request: 117.978 [ms] (mean, across all concurrent requests)
Transfer rate: 108.55 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        3    6   2.8      7      21
Processing:    88  112  11.1    112     163
Waiting:       66   90   9.1     89     125
Total:         95  118  11.9    118     170

Percentage of the requests served within a certain time (ms)
50% 118
66% 122
75% 125
80% 127
90% 132
95% 138
98% 145
99% 156
100% 170 (longest request)

And here a benchmark after implementing the changes:

$ ab -kt 30 http://site/wiki/index.php/Page
This is ApacheBench, Version 2.3 < $Revision: 655654 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking site (be patient)
Finished 649 requests

Server Software: Apache/2.2.16
Server Hostname: site
Server Port: 80

Document Path: /wiki/index.php/Page
Document Length: 12792 bytes

Concurrency Level: 1
Time taken for tests: 30.015 seconds
Complete requests: 649
Failed requests: 0
Write errors: 0
Keep-Alive requests: 0
Total transferred: 8538244 bytes
HTML transferred: 8302008 bytes
Requests per second: 21.62 [#/sec] (mean)
Time per request: 46.248 [ms] (mean)
Time per request: 46.248 [ms] (mean, across all concurrent requests)
Transfer rate: 277.80 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        3    9   3.7      8      29
Processing:    23   37   6.0     37      62
Waiting:       13   23   4.9     24      41
Total:         28   46   7.8     45      82

Percentage of the requests served within a certain time (ms)
50% 45
66% 47
75% 49
80% 50
90% 56
95% 62
98% 68
99% 73
100% 82 (longest request)

So Mediawiki can now handle more than 2.5 times as many requests.
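
The factor follows from the two "Requests per second" lines in the benchmarks above. A quick check with awk (the function name speedup is just an illustrative helper):

```shell
# Ratio of requests per second after and before the change.
speedup() {
    awk -v before="$1" -v after="$2" 'BEGIN { printf "%.2f\n", after / before }'
}

speedup 8.48 21.62    # prints 2.55
```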

Some people use Apache’s mod_disk_cache to cache Mediawiki pages, but I prefer Mediawiki’s own caching system because it is more standard and does not require patching Mediawiki, even if it might not get as much benefit as a real proxy or mod_disk_cache.

Improving performance by using tmpfs

Today I analyzed disk reads and writes on a server with iotop and strace and found some interesting possible optimizations.

With iotop you can check which processes are reading and writing from the disks. I always press the o, p and a keys in iotop so that it only shows me processes doing I/O and so that it will show accumulated I/O instead of the bandwidth. With the left and right arrows I select on which columns to sort the list.

Once you have identified the processes which are doing a lot of I/O, you can check what they are reading or writing with strace, for example
# strace -f -p $PID -e trace=open,read,write

(you can leave out read and/or write if this gives too much noise)

This way I identified some locations where processes do lots of read and write operations on temporary files.
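
When the strace output is long, counting how often each file is opened makes the hot spots easier to see. This sketch assumes strace prints lines of the form open("path", ...); on systems where programs use openat instead, the pattern would need adjusting. The helper name top_opened_files is hypothetical.

```shell
# Hypothetical helper: read strace output on stdin and list the files
# passed to open(), most frequently opened first.
top_opened_files() {
    sed -n 's/.*open("\([^"]*\)".*/\1/p' | sort | uniq -c | sort -rn
}

# strace writes to stderr, hence the 2>&1:
#   strace -f -p "$PID" -e trace=open 2>&1 | top_opened_files
```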

For nagios I placed /var/lib/nagios3/spool and /var/cache/nagios3 on a tmpfs, for Amavis /var/lib/amavis/tmp and for PostgreSQL /var/lib/postgresql/8.4/main/pg_stat_tm.

Other candidates you might want to consider: /tmp, /var/tmp and /var/lib/php5. There are probably many others, depending on which services you use.
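
To make such a tmpfs mount permanent, it needs an entry in /etc/fstab. The sketch below only prints a suitable line; the helper name tmpfs_fstab_line is hypothetical, and the size and mode values are assumptions that depend on how much data the service writes.

```shell
# Hypothetical helper: print an /etc/fstab line that mounts a tmpfs on the
# given directory. The default size (64m) and mode (0755) are assumptions.
tmpfs_fstab_line() {
    printf 'tmpfs %s tmpfs size=%s,mode=%s 0 0\n' "$1" "${2:-64m}" "${3:-0755}"
}

tmpfs_fstab_line /var/cache/nagios3 32m 0750

# One-off mount without editing fstab (as root):
#   mount -t tmpfs -o size=32m,mode=0750 tmpfs /var/cache/nagios3
```

A fresh tmpfs is owned by root, so for services running as another user you will also need the uid= and gid= mount options or a chown after mounting. Keep in mind that everything on a tmpfs is lost at reboot.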