Mandriva – The Next Generation

Mandriva 2007 Spring was already a very nice distribution, with a lot of improvements and changes in style compared to previous versions: much improved graphical themes, the public availability of non-free packages, update notifications on the desktop without a Mandriva subscription, and a generally more polished system.

For the next version of Mandriva, the bar is again set a lot higher than before. Last week, two interesting announcements were made:

  • David Barth announced that some internal re-organization was done at Mandriva, and that the community will be more actively involved in the development of the next Mandriva version. What this means in practice, and how this will work out, remains to be seen, but I have the feeling that this is a confirmation of something we have seen already in 2007.1: Mandriva really cares about the community now, and makes use of the community’s remarks to improve the distribution.
  • Anne Nicolas, who will follow up the development of the next Mandriva versions from now on, announced that an effort will be made to clean up Mandriva’s Bugzilla and to improve the workflow of bug reports, which should result in a distribution with fewer bugs. All new bug reports will first be checked by a team, which will verify the validity of the report and collect all the information the developers need to analyze and fix the bug. If developers receive more high-quality reports, this will hopefully result in quicker and more bug fixing. Statistics about the bug fixing progress will be collected in order to pinpoint problematic parts of the distribution, so that these can be acted upon.

This is all great news. Mandriva shows that it does not want to be just another nice Linux distribution, but that it has the ambition to be a real leader. Interesting times are coming!

Linux kernel development thoughts

One week ago, kernel hacker Ingo Molnar reviewed Con Kolivas’ swap prefetch patches and approved them for inclusion in the official Linux kernel. Swap prefetching is a technique which loads swapped-out memory pages back into memory when the system is idle and memory has become available. This is useful for people starting temporary jobs which use a lot of memory and push other processes into swap. Once the memory-hungry process has finished, swap prefetching kicks in and slowly reloads the swapped-out pages into memory, so that the system potentially does not need to do this anymore when the user returns to one of the swapped-out processes. This functionality can be enabled or disabled when compiling the kernel.

Swap prefetching has been available for a long time in Con Kolivas’ patch set, and was also added to the mm development kernels some time ago. In that period, no bugs have been reported, and it seems people are happy with this feature. So it was hoped that the push by Ingo Molnar would finally make swap prefetching available to all Linux users in version 2.6.22.

Developer Nick Piggin seemed rather critical of the swap prefetching feature. If I have understood the thread correctly, he discovered a problem (actually swap prefetching did not seem to work anymore because of some unrelated changes in Linux), and there was some serious disagreement about how it should be dealt with. Con Kolivas got fed up with the unreasonable objections to the patch and proposed to dump the patch completely. So this is yet another performance-improving feature we won’t be seeing in the Linux kernel for some time…

Update 12 May 2007: Con Kolivas posted an updated version of swap prefetch, addressing some problems, and the patch has not been dropped from the mm kernel. Maybe things do not look so bad after all?

This whole story reminds me of the recent RSDL/SD scheduler. Con Kolivas implements a feature which is clearly working very well for most people, and then someone comes along who vetoes it for whatever reason, and in the case of SD other implementations are started (which still do not have the same maturity as SD). In the end a lot of work seems to be duplicated and interesting features are delayed or even cancelled completely. Is it me, or is the behaviour of some kernel developers really hurting Linux development?

Also in related news, kernel hacker Adrian Bunk decided to stop tracking regressions. He was tired of Linux 2.6.21 being released with lots of known regressions. It seems Linux kernel development really has some issues these days…

Mandriva work this week

This week I contacted the mailing lists of various upstream projects to find fixes for some problems.

First there was the problem that F-Spot does not automatically handle the upgrade from sqlite2 to sqlite3. In the past, F-Spot used sqlite2 as its database; today it uses sqlite3 by default, but you can still continue to use your sqlite2 database. That is, if the sqlite2 libraries are installed on your system. And one can imagine that in the future, distributions will stop shipping sqlite2, because it is not maintained anymore and because sqlite3 is far superior. It is possible to upgrade from sqlite2 to sqlite3 by converting the database at the command line, but we really cannot expect normal users to do this, can we?
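For the curious, the command-line conversion can be sketched roughly as follows. This is a minimal sketch and not F-Spot's own upgrade procedure: it assumes the sqlite 2 shell is installed as sqlite and the sqlite 3 shell as sqlite3, it recognises the two formats by the text at the very start of the file, and the database path in the example comment is an assumption to check on your own system. The function names db_format and convert_db are made up for this illustration.

```sh
#!/bin/sh
# Report which SQLite file format a database uses, judging by the
# magic string at the start of the file.
db_format() {
    # NUL bytes are stripped so the shell can store the header text.
    head15=$(head -c 15 "$1" | tr -d '\000')
    head33=$(head -c 33 "$1" | tr -d '\000')
    if [ "$head15" = "SQLite format 3" ]; then
        echo sqlite3
    elif [ "$head33" = "** This file contains an SQLite 2" ]; then
        echo sqlite2
    else
        echo unknown
    fi
}

# Dump an sqlite 2 database as SQL and load it into a new sqlite 3 file.
# The original file is left untouched.
convert_db() {
    old=$1
    new=$2
    if [ "$(db_format "$old")" != sqlite2 ]; then
        echo "$old is not an sqlite 2 database" >&2
        return 1
    fi
    sqlite "$old" .dump | sqlite3 "$new"
}

# Example (the path is an assumption):
#   convert_db ~/.gnome2/f-spot/photos.db ~/.gnome2/f-spot/photos-v3.db
```

Writing to a new file instead of converting in place means the old database is still there if something goes wrong.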

In Pan, I had the problem that new headers were not appearing in a newsgroup, until I killed the “get new headers”-task by hand. I found out it was caused by a news server I had configured in Pan, which was not reachable. I filed a bug.

I experienced a crash in ogginfo when some wrongly encoded characters are found in the Vorbis tag, for which I found a patch with Google. I filed a bug but I’ll need to investigate if the tag is really invalid: Easytag shows it correctly without any problem.

Another problem I ran into some time ago, but never really investigated, was that Amarok did not add files on a read-only Samba share to its library. Thanks to the Debian changelog mailing list, I discovered today that there is a patch for libtag fixing this. Real serendipity!

I did some more testing with Kaffeine. I contacted its mailing list about the problem that it copies everything to ~/tmp when using a system:/ URI in Konqueror. I have found a fix now, but I’m still not impressed by Kaffeine’s stability. For now, you’re better off continuing to use kmplayer.

I contacted Mandriva’s web discuss mailing list (which unfortunately does not seem to be archived) about a new structure for the end user documentation on the Mandriva wiki, and posted a proposal for some documentation about the tmb kernel. I’m a bit disappointed by the reactions: this way we won’t succeed in creating a very active wiki community with lots of useful user documentation. I’ll try to implement a proposal in my sandbox this weekend: practical work is often better than theoretical discussion.

Linux kernel: The battle of the CPU schedulers

For some time now, different patches have been written for the Linux kernel which improve the CPU scheduler. The CPU scheduler is the part of the kernel that is responsible for assigning CPU time to the different tasks running on your system. If you sometimes experience sound stuttering or your mouse becoming jerky while running other CPU-intensive tasks, then this is most likely a problem caused by the task scheduler.

Con Kolivas has been maintaining an alternative scheduler for some time. His Staircase scheduler was designed with interactivity in mind, especially for desktop systems, where people want their system to be quickly responsive under all kinds of workloads. This scheduler has been optimized a lot through the years, and as such is very stable. Still there are some rare cases where “starvation” is possible.

At the start of March, Con Kolivas published a new scheduler, which was called RSDL (Rotating Staircase DeadLine scheduler) at first, and was renamed to SD (Staircase Deadline) afterwards. Based on the experience Con Kolivas gathered with his Staircase scheduler, SD is a more general-purpose scheduler, trying to give absolute fairness to the different running tasks, without favouring any process (for example, lots of other schedulers favour X). This way, no starvation issues should be possible with this scheduler. A lot of discussion followed his announcement, and it quickly became clear that a lot of people were not happy with the current scheduler in Linux. Important kernel developers like Mike Galbraith, Nick Piggin, Ingo Molnar, Willy Tarreau and Andrew Morton joined the discussion and also posted other scheduler patches, not without some of the trolling and flaming that sometimes happens on such mailing lists. Con Kolivas’ scheduler was added to Andrew Morton’s mm kernel tree to get some more testing. The development of RSDL/SD had its ups and downs because of Con Kolivas’ health problems.

Ingo Molnar, who was rather critical of some of the ideas in the new scheduler at first, also recently began the development of CFS (Completely Fair Scheduler), which is actually based on the same basic concept of fairness. Con Kolivas announced that he would stop development of the SD scheduler, because of his health problems, and because his ideas would now continue to be used in the CFS scheduler. But things turned out differently, and Con Kolivas continued development in the end. The result is that the SD scheduler is now at its 46th version (v. 0.46), and it seems most problems have been fixed. Based on all the testing done on the kernel mailing list, it seems SD 0.46 is more mature than CFS 6. Even Willy Tarreau, maintainer of the 2.4 Linux kernel tree, said that thanks to SD, he made Linux 2.6 the default kernel on his laptop, as he found the scheduler in mainline 2.6 too bad compared to 2.4. It’s unclear however which of these schedulers will finally be integrated in Linux, and when this will happen.

Personally I think SD 0.46 should be integrated now in the Linux 2.6.22 pre-releases. There has been a lot of testing and bug fixing, and it seems there are no serious bugs open anymore. I also hope that Mandriva 2008 will come out with one of these new schedulers. The tmb kernel in Mandriva Cooker already uses the SD scheduler. People interested in this discussion can subscribe to the ck mailing list, where a lot of the discussion is happening. Sites like LWN.net and Kerneltrap also often post about progress on this subject.

Testing KDE video players, setting up backups and a cluster, cleaning up poppler and sqlite stuff

Today I tried Kaffeine 0.8.4 with XCB support. It is supposed to fix the stability issues when playing embedded videos in Konqueror, so I wanted to verify whether it is a real alternative to KMPlayer. Kaffeine indeed does not seem to cause instabilities in Konqueror anymore. That’s good. I tried viewing videos on vrtnieuws.net and this worked fine. Unfortunately, trying to view an Apple movie trailer failed, while KMPlayer (with the mplayer back-end) worked fine. When double clicking on local video files while browsing system:/home in Konqueror, Kaffeine still copies the whole file to a temporary directory. Actually the default use of system:/home instead of /home/login by Konqueror is completely braindead, but I do not understand why Kaffeine needs to copy the whole file, while KMPlayer does not need to do this. So for now, I still recommend KMPlayer.

At work, I started to install a cluster system, consisting of about 6 (IIRC) computers. I will try Kerrighed on Debian Etch. It seems an interesting project: more actively developed than OpenMosix with a clear roadmap for the future. And it does not seem too difficult to install. I’ll post my experiences soon.

Also at work, I recently discovered the rdiff-backup and backupninja projects. These programs really rule for making backups on free hard drive space. For tape backups I like Bacula, but it is not that easy to configure. For simple backups on hard drives, nothing beats the simplicity of rdiff-backup and backupninja. I should definitely check whether these good backup tools are also included in the Mandriva repositories.
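To give an idea of that simplicity, here is a minimal sketch of a disk-to-disk backup with rdiff-backup. It assumes the classic command-line form (source directory first, then destination directory) together with the --remove-older-than and -r options. The paths, the four-week retention and the RDIFF variable are made up for this example; setting RDIFF=echo prints the arguments instead of running anything.

```sh
#!/bin/sh
# Command to run; override with RDIFF=echo for a dry run.
RDIFF=${RDIFF:-rdiff-backup}

# Mirror a directory and keep reverse diffs of older versions.
backup_dir() {
    "$RDIFF" "$1" "$2"
}

# Delete increments older than the given age (for example 4W = four weeks).
prune_backup() {
    "$RDIFF" --remove-older-than "$1" "$2"
}

# Restore a file as it was some time ago (for example 3D = three days ago).
restore_file() {
    "$RDIFF" -r "$1" "$2" "$3"
}

# Example (hypothetical paths):
#   backup_dir /home/me /mnt/spare/backup/home
#   prune_backup 4W /mnt/spare/backup/home
#   restore_file 3D /mnt/spare/backup/home/notes.txt /tmp/notes.txt
```

Run from cron, the first two calls would already give a rolling backup with a few weeks of history.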

Some things I have on my personal TODO list: I want to find out which programs still depend on xpdf, or include their own copy of the xpdf code in their source. We should maximise the use of poppler instead, because it is a shared library. If all PDF programs use poppler, all of them will easily benefit from improvements to poppler. It will also make security updates for all these programs much easier: right now, a lot of packages have to be updated separately, which is a PITA. And we will gain some space, which is always interesting for the One live CDs. I would like to do the same for sqlite. It seems a lot of packages still depend on the old sqlite 2 instead of the more modern version 3, and some programs still use their own private copy. Let’s hunt these packages down, and see if we can compile them with the sqlite 3 shared library. If not, bug reports should be opened upstream so this gets fixed.
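A rough way to start this hunt on an installed system is sketched below. It is only a starting point under some assumptions: the sqlite 2 and sqlite 3 shared libraries are assumed to be named libsqlite.so.* and libsqlite3.so.*, and programs with a private bundled copy will not show up this way at all (those need a look at the source). The function name sqlite_abi and the binary path in the example are made up for the illustration.

```sh
#!/bin/sh
# Classify a program by the sqlite library named in its ldd output,
# which is read from standard input.
sqlite_abi() {
    awk '
        /libsqlite3\.so/ { v3 = 1 }
        /libsqlite\.so/  { v2 = 1 }
        END {
            if (v2 && v3) print "both"
            else if (v2)  print "sqlite2"
            else if (v3)  print "sqlite3"
            else          print "none"
        }'
}

# Examples:
#   ldd /usr/bin/some-program | sqlite_abi    (hypothetical binary)
#   rpm -q --whatrequires xpdf                (installed packages requiring xpdf)
```

The rpm query only covers packages installed locally; checking the whole repository would need the distribution's own tools.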

Mandriva 2007 Spring released, looking forward to 2008

Now that Mandriva 2007 Spring has been released and the corrupted Mandriva Cooker subversion repository has been fixed, development for Mandriva 2008 has started.

Some things which I have on my wishlist for Mandriva 2008:

  • New kernel, Linux 2.6.22 or later, which I hope will give us these things: the CFQ IO scheduler as default (since Linux 2.6.18), a better task scheduler (probably CFS), better wireless drivers (bcm43xx is stable in 2.6.20, but it’s not in Mandriva’s 2.6.17), and finally good Ralink drivers (complete documentation exists, but the drivers are still not production quality).
  • Mandriva still uses OSS drivers for some sound cards which are perfectly supported by Alsa (for example for some Intel ICH2 810 AC97 cards, which was not fixed in 2007.1 as far as I know). This is annoying as OSS does not have sound mixing, which can cause applications to complain about an occupied /dev/dsp and also causes missing sound in Flash. Fedora has not even shipped OSS for some time already. As many cards as possible should be switched to Alsa now, early in Cooker development, so that things get tested.
  • slocate is still being installed by default by rpmsrate instead of the much more performance-friendly mlocate (already in 2007.1 Main). We should make the switch now in rpmsrate so it’s the default in 2008.
  • Almost a year ago, I proposed a new program menu structure for Mandriva. According to Gnome and menu maintainer Frédéric Crozat, it would be implemented for 2008. Let’s hope this comes true, as it will make the menu friendlier for new users and at the same time more practical for power users. Unfortunately, I lost the complete original document (what a shame!). If somebody still has it somewhere, I would be grateful if you could send it to me!
  • Drop esound, and use Pulseaudio instead. Pulseaudio is a more modern and actively maintained sound server. There was some discussion about it some time ago, and I still hope Pulseaudio will not be forced on all users (personally I prefer not to use a sound server, and let all the mixing be done by Alsa). For this reason, I still hope more and more applications will use GStreamer, as GStreamer makes it easy to choose your preferred output method, such as Alsa or Pulseaudio, for all applications. But Pulseaudio should certainly replace esound in main.
  • Test new Kaffeine: In Mandriva 2007.0, Kaffeine was replaced by kmplayer as the default video player in KDE, because Kaffeine was unstable when playing embedded videos in Konqueror. Now that both Xine and Kaffeine support XCB, this problem should be fixed. We should test if it is indeed stable now, and if Kaffeine supports all kinds of embedded videos in web pages like kmplayer does. If this is the case, we should consider moving back to Kaffeine, as its interface is a bit better than kmplayer’s.
  • Mandriva’s net_applet and mdkonline system tray utilities still poll too much, which causes unnecessary CPU usage, and which also eats battery life on laptops. These applications should be scrutinized a bit more, so they are completely idle most of the time.
  • Up to date printing support: Unfortunately printing support did not really advance in Mandriva 2007 Spring. The drivers are mostly still the same as in 2007.0, and some newer printer models supported by newer versions of hpijs, as well as other models which appeared on linuxprinting.org, are not supported in 2007 Spring. Mandriva 2008 should really have up to date printing support, as was always the case in previous versions.
  • Cleaning up the repositories: the Main repository now contains a lot of programs which are not used by a lot of people (things like flphoto, uucp, uw-imapd, xpdf,…). On the other hand, some very interesting packages are in Contribs (Transmission, dovecot,…). The contents of both repositories should be looked at, in order to clean up, and put packages in the appropriate repository.
  • We need a simple application installation program: rpmdrake has improved a lot in 2007.0 and now in 2007.1 Spring. Still it is too complex for new users. Suppose a user wants to find a DTP application. He should easily find Scribus. Well, Scribus does not even appear in the Publishing category in rpmdrake. Instead, this category contains mostly technical console programs not suitable for beginners. We should have something like Ubuntu’s gnome-app-install, which only shows user-friendly graphical applications.
  • More and easier to find documentation: Mandriva already comes with a lot of documentation, but the included handbooks are difficult to find (being hidden in More Applications). They should be in a more obvious place. On the other hand, we should make sure more documentation becomes available in different languages on the wiki, and the wiki should be easy to find (include a link on the desktop?). This way, users will know where to look for help.
  • Related to the menu structure and the reorganization of the repositories, we should constantly be evaluating which applications should be in the default installation and which should not. We had such a discussion shortly before 2007.1 Spring went out, but this was a bit too little too late. We should make sure we start this process before the first beta is released, so we have more time to evaluate and stress test programs.

I have already started by taking a quick look at the contents of the main repository. Maybe other people are interested in participating in this effort, so that it can happen in a slightly more organized way?

Struggling with Linux’ OOM killer when building RPMs

Over the last two weeks, I have created a lot of updated packages for Mandriva 2007.1. I packaged Gnome 2.18.1, and also updated subversion snapshots of kdepim and kdegraphics: kdepim, because it has received a lot of bug fixing love over the last two months, and kdegraphics, because it contains kpdf using new xpdf code, which should be compatible with the PDF 1.6 and 1.7 specifications. In kdepim, they also removed the kitchensync tool, which offered synchronization with external devices. Apparently it was too buggy to be really useful. I have the impression that Kmail is indeed also more stable than in 3.5.6. I have not yet been able to reproduce the hangs I sometimes experienced with 3.5.6.

Compiling kdepim on an AMD64 system seems to require a huge amount of memory (much more than on 32-bit x86). 1 GB of RAM and about 250 MB of swap did not prevent g++ from eating up all of my memory (even when no other services were active!). Unfortunately, this made Linux completely unstable: the hard drive kept thrashing, and the system was completely unresponsive. I could only stop it by doing a hard reset. I am clearly not the only one hating this stupid Linux behaviour.

In the Mandriva Cooker channel (irc.freenode.org, #mandriva-cooker), couriousous suggested executing:

# echo 2 > /proc/sys/vm/overcommit_memory

The default value is 0: when an application asks for more memory than is available, Linux will still try to allocate it, even if there is a chance that it won’t be available. This seems to be done because some applications ask for more memory than they will really use. If, in the end, no more memory or swap is available, Linux’ OOM killer will kill some (seemingly random) processes to free up memory. By setting overcommit_memory to 2, the allocation of too much memory will fail immediately. The application can then react itself to the fact that not enough memory is available. The result was that instead of bringing my system to its knees, the g++ compiler just exited with the message that I was out of memory. Much nicer! I found a complete technical explanation about memory allocation in Linux on the web. To make this setting permanent, I put vm.overcommit_memory = 2 in my /etc/sysctl.conf. I do not understand why it is not the default value: making a system unresponsive for several (tens of) minutes does not seem very friendly…

Update 16 April 2007: It seems this setting has severe problems as well. On my system with 384 MB RAM, which I use as server and desktop, several applications randomly crashed because they did not get the memory they wanted (Evolution, Tilda,…). Changing the setting back to 0 made these applications work correctly again. I suppose playing with the overcommit_ratio value, as explained in the article I mentioned above, can improve this behaviour. But anyway, it sucks that such things are so difficult to get right. This should really be working nicely out of the box.
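To see why a small machine gets into trouble so quickly in mode 2, it helps to work out the limit the kernel enforces. The sketch below assumes the rule from the kernel documentation that, with overcommit_memory set to 2, the total commit limit is the swap size plus overcommit_ratio percent of RAM (50 by default), ignoring huge pages. The 512 MB swap size is a made-up figure, and the function name commit_limit_kb is only for illustration.

```sh
#!/bin/sh
# Commit limit in kB for overcommit_memory=2:
#   swap + RAM * overcommit_ratio / 100   (huge pages ignored)
commit_limit_kb() {
    ram_kb=$1
    swap_kb=$2
    ratio=$3
    echo $(( swap_kb + ram_kb * ratio / 100 ))
}

# 384 MB of RAM, a hypothetical 512 MB of swap, default ratio of 50:
commit_limit_kb 393216 524288 50    # prints 720896 (704 MB)

# On a running system, the kernel's own numbers can be compared with:
#   grep -E 'CommitLimit|Committed_AS' /proc/meminfo
#   cat /proc/sys/vm/overcommit_ratio
```

With the default ratio, only half of the RAM counts towards the limit, so on a machine with little memory a handful of desktop applications can reach it long before the memory is really in use.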

Anyway, in the end I got kdepim compiled on AMD64 by adding some more swap. From now on, I’ll be creating bigger swap partitions in Linux: it can really be useful, even when you think you have enough memory…

I also built a freetype 2.3.4 RPM for Mandriva. Font rendering on my flat panel is now much nicer compared with freetype 2.3.1, which is included in Mandriva 2007.1! Not that it was ugly before, but it’s a nice surprise to still see such big improvements, especially from a minor update.