The linux-kernel mailing list FAQ
Before you consider posting to the linux-kernel mailing list,
please read at least the start of section 3 of this
FAQ list.
These frequently asked questions are divided into various categories. Please
contribute any category and Q/A that you may find
relevant. You can also add your answer to any question that has already
been answered, if you have additional information
to contribute.
The official site is:
http://www.tux.org/lkml/ (this
is on the east coast of the U.S.A.). Many thanks to Sam Chessman and
David Niemi for hosting the FAQ on a high-bandwidth, professionally
managed Linux server. The following mirrors are available (and are
updated at the same time as the official site):
Hot off the Presses
vger.kernel.org has enabled ECN. You may need to switch ISP in
order to receive linux-kernel email. See the section on ECN for more details.
Two digest forms of linux-kernel (a normal digest every 100KB and a
once-daily digest) are available at http://lists.us.dell.com/.
Read this before complaining to linux-kernel about compile
problems. Chances are a thousand other people have noticed and the fix
is already published.
Index
Basic Linux kernel documentation
The following are Linux kernel related documents, which you
should take a look at before you post to the linux-kernel mailing
list:
-
The Linux Kernel Hackers' Guide,
compiled by Michael K. Johnson of Red Hat fame. Includes among other
documents selected Q/As from the linux-kernel mailing list.
-
The Linux Kernel
book, by David A. Rusling, available in various formats from the
Linux Documentation Project
and mirrors.
Still being worked on, but explains clearly the main structure of the
Linux kernel.
-
The Linux FAQ
by Robert Kiesling has many high quality Q/As.
-
The Linux
Kernel HOWTO by Brian Ward. Fundamental reading for anybody
wanting to post to the linux-kernel mailing list.
-
A completely new Kernelhacking-HOWTO at
http://www.kernelhacking.org/.
Currently work in progress, but already contains some useful information.
-
Various Linux
HOWTOs
on specific questions, such as the
BogoMips mini-HOWTO by Wim van Dorst. These are all by
definition LDP documents.
-
The Linux kernel source code for any particular kernel version
that you may be using. Note that there is a /Documentation directory
which holds some very useful text files about drivers, etc. Also check
the MAINTAINERS file in the kernel source root directory.
-
Some drivers even have Web pages, with additional up to date
information e.g. the network drivers
by Donald Becker, etc. Check the Hardware section in the
LDP site.
-
Similarly, Linux implementations for some CPU architectures have
dedicated Web pages, mailing lists, and sometimes even a HOWTO
e.g. the Linux Alpha
HOWTO by Neal Crook. Check the LDP site and its mirrors for
Web links to the various architecture specific sites.
-
Linux device drivers, a book written by Alessandro
Rubini. C. Scott Ananian reviewed
it for Amazon.com.
-
Linux kernel internals, a book by Michael Beck (Editor) et al. Also reviewed for Amazon.com.
-
Another useful site is:
http://www.kernelnewbies.org/
-
Here is a general guide on how to ask questions in a way that greatly
improves your chances of getting a reply:
http://www.catb.org/~esr/faqs/smart-questions.html. If you have
a bug to report, you should also read
http://www.chiark.greenend.org.uk/~sgtatham/bugs.html.
Extra instructions, specific to the Linux kernel are available
here.
Contributors and some special expressions
This is the list of contributors to this FAQ. They are listed in
alphabetical order of their abbreviations, which are used in the Answers
sections below to identify the author(s) of each answer.
Some English expressions for non-native English readers. Many of these
(and far more) may be obtained from the
Jargon File:
-
AFAIK = As Far As I Know
-
AKA = Also Known As
-
ASAP = As Soon As Possible
-
BTW = By The Way (used to introduce some piece of information or
question that is on a different topic but may be of interest)
-
COLA = comp.os.linux.announce (newsgroup)
-
ETA = Estimated Time of Arrival
-
FAQ = Frequently Asked Question
-
FUD = Fear, Uncertainty and Doubt
-
FWIW = For What It's Worth
-
FYI = For Your Information
-
IANAL = I Am Not A Lawyer
-
IIRC = If I Recall Correctly
-
IMHO = In My Humble Opinion
-
IMNSHO = In My Not-So-Humble Opinion
-
IOW = In Other Words
-
LART = Luser Attitude Readjustment Tool (quoting Al Viro: "Anything you
use to forcibly implant the clue into the place where luser's head
is")
-
LUSER = pronounced "loser", a user who is considered to indeed be a
loser (idiot, drongo, wanker, dim-wit, fool, etc.)
-
OTOH = On The Other Hand
-
PEBKAC = Problem Exists Between Keyboard And Chair
-
ROTFL = Rolling On The Floor Laughing
- RSN = Real Soon Now
-
RTFM = Read The Fucking Manual (original definition) or Read The Fine
Manual (if you want to pretend to be polite)
-
TANSTAAFL = There Ain't No Such Thing As A Free Lunch (contributed by
David Niemi, quoting Robert Heinlein in his science fiction novel 'The
Moon is a Harsh Mistress')
-
THX = Thanks (thank you)
-
TIA = Thanks In Advance
-
WIP = Work In Progress
-
WRT = With Respect To
Related mailing lists
Some questions are better posted to related mailing lists on specific
subjects. Posting to these mailing lists helps reduce the volume on the
linux-kernel mailing list and also increases your chances of having
your message read by an expert on the subject. Some people do not have
the time to subscribe to the linux-kernel mailing list, as it is too
general for them. Some related lists are:
Question Index
-
Why do you use "GNU/Linux" sometimes and just "Linux" in
other parts of the FAQ?
-
What is an experimental kernel version?
-
What is a production kernel?
-
What is a feature freeze?
-
What is a code freeze?
-
What is a f.g.hhprei kernel?
-
Where do I get the latest kernel source?
-
Where do I get extra kernel patches?
-
What is a patch?
- How do I make a patch suitable for the linux
kernel list?
-
How do I apply a patch?
-
What's vger?
-
What is a CVS tree? Where can I find more information
about CVS?
-
Is there a CVS tutorial?
-
How do I get my patch into the kernel?
-
Why does the kernel tarball contain a directory
called linux/ instead of linux-x.y.z/ ?
-
What's the difference between the official kernels
and Alan Cox's -ac series of patches?
-
What does it mean for a module to be tainted?
-
What is this about GPLONLY symbols?
-
Do I have to use BitKeeper to send patches?
-
Why do some developers use the non-free BitKeeper?
Isn't this against the spirit of Free Software?
-
Who maintains the kernel?
-
The kernel doesn't compile cleanly. What shall I
do?
-
Driver such and such is broken!
-
Here is a new driver for hardware XYZ.
-
Is there support for my card TW-345 model C in kernel version
f.g.hh?
-
Who maintains driver such and such?
-
I want to write a driver for card TW-345 model C, how do
I get started?
-
I want to get the docs, but they want me to sign an NDA
(Non-Disclosure Agreement).
-
I want/need/must have a driver for card TW-345 model C!
Won't anybody write one for me?
-
What's this major/minor device number thing?
-
Why aren't WinModems supported?
-
Modern CPUs are very fast, so why can't I write a
user mode interrupt handler?
-
Do I need to test my driver against all
distributions?
-
How do I subscribe to the linux-kernel mailing list?
-
How do I unsubscribe from the linux-kernel mailing list?
-
Do I have to be subscribed to post to the list?
-
Is there an archive for the list?
-
How can I search the archive for a specific question?
-
Are there other ways to search the Web for information
on a particular Linux kernel issue?
-
How heavy is the traffic on the list?
-
What kind of question can I ask on the list?
-
What posting style should I use for the list?
-
Is the list moderated?
-
Can I be ejected from the list?
-
Are there any implicit rules on this list that I should
be aware of?
-
How do I post to the list?
-
Does the list get spammed?
-
I am not getting any mail anymore from the list! Is it
down or what?
-
Is there an NNTP gateway somewhere for the mailing
list?
-
I want to post a Great Idea (tm) to the list. What should
I do?
-
There is a long thread going on about something completely
offtopic, unrelated to the kernel, and even some people who are in the
"Who's who" section of this FAQ are mingling in it. What should I do to
fight this "noise"?
-
Can we have the Subject: line modified to help
mail filters?
-
Can we have a Reply-To: header automatically
added to the list traffic?
-
Can I post job offers/requests to the list?
-
Why do I get bounces when I send private email to
some people?
-
Why don't you split the list, such as having one each
for the development and stable series?
-
How do I post a patch?
-
How do I capture an Oops?
-
How do I post an Oops?
-
I think I found a bug, how do I report it?
-
What information should go into a bug report?
-
I found a bug in an "old" version of the kernel, should
I report it?
-
How do I compile the kernel?
Names are in alphabetical order (last name) to
avoid stepping on toes.
If someone doesn't appear here, check /usr/src/linux/CREDITS.
-
Who is in charge here?
-
Why don't we have a Linux Kernel Team page, same as there
are for other projects?
-
Why doesn't <any of the below> answer my mails?
Isn't that rude?
-
Why do I get bounces when I send private email to
some of these people?
-
Who is Matti Aarnio?
-
Who is H. Peter Anvin?
-
Who is Donald Becker?
-
Who is Alan Cox?
-
Who is Richard E. Gooch?
-
Who is Paul Gortmaker?
-
Who is Bill Hawes?
-
Who is Mark Lord?
-
Who is Larry McVoy?
-
Who is David S. Miller?
-
Who is Linus Torvalds?
-
Who is Theodore Y. Ts'o?
-
Who is Stephen Tweedie?
-
Who is Roger Wolff?
Some people have not yet contributed
a few lines about themselves, and the policy of this FAQ dictates
that nobody is going to write about anybody else without authorization.
Hence the missing links. For example, if you are not Linus, don't insist:
we are not going to add your information about Linus.
Other OS developers:
Is this a matter of taste or what?
-
What is the "best" CPU for GNU/Linux?
-
What is the fastest CPU for GNU/Linux?
-
I want to implement the Linux kernel for CPU Hyper123,
how do I get started?
-
Why is my Cyrix 6x86/L/MX detected by the kernel as a Cx486?
-
What about those x86 CPU bugs I read about?
-
I grabbed the standard kernel tarball from
ftp.kernel.org or some mirror of it, and it doesn't compile on the
Sparc, what gives?
-
Does the Linux kernel execute the Halt instruction to power
down the CPU?
-
I have a non-Intel x86 CPU. What is the [best|correct]
kernel config option for my CPU?
-
What CPU types does Linux run on?
OS theory and practical issues mix.
-
OS $toomuch has this Nice feature, so it must be better
than GNU/Linux.
-
Why doesn't the Linux kernel have a graphical boot screen
like $toomuch OS?
-
The kernel in OS CTE-variant has this Nice-very-nice
feature, can I port it to the Linux kernel?
-
How about adding feature Nice-also-very-nice to the Linux
kernel?
-
Are there more bugs in later versions of the Linux kernel,
compared to earlier versions?
-
Why does the Linux kernel source code keep getting larger
and larger?
-
The kernel source is HUUUUGE and takes too long to download.
Couldn't it be split in various tarballs?
-
What are the licensing/copying terms on the Linux kernel?
-
What are those references to "bazaar" and "cathedral"?
-
What is this "World Domination" thing?
-
What are the plans for future versions of the Linux kernel?
-
Why does it show BogoMips instead of MHz in the kernel
boot message?
-
I installed kernel x.y.z and package foo doesn't work
anymore, what should I do?
-
People talk about user space vs. kernel space. What's
the advantage of each?
-
What are threads?
-
Can I use threads with GNU/Linux?
-
You mean threads are implemented in user space? Why not
in kernel space? Wouldn't that be more efficient?
-
Can GNU/Linux machines be clustered?
-
How well does Linux scale for SMP?
-
Can I lock a process/thread to a CPU?
-
How efficient are threads under Linux?
-
How does the Linux networking/TCP stack work?
-
Can we put the networking/TCP stack into user-space?
Kernel compilation problems.
-
I downloaded the newest kernel and it doesn't even
compile! What's wrong?
-
What are the recommended compiler/binutils for
building kernels?
-
Why the recommended compiler? I like xyz-compiler
better.
-
Can I compile the kernel with gcc 2.8.x, egcs, (add
your xyz compiler here)? What about optimizations? How do I get to use
-O99, etc.?
-
I compiled the kernel with xyz compiler and get the
following warnings/errors/strange behavior, should I post a bug report
to the list? Should I post a patch?
-
Why does my kernel compilation stop at random
locations with: "Internal compiler error: program cc1 caught fatal
signal 11."?
-
What compiler flags should I use to compile
modules?
-
Why do I get unresolved symbols like foo__ver_foo in
modules?
-
Why do I get unresolved symbols with __bad_ in the name?
Miscellaneous kernel features questions.
-
GNU/Linux Y2K compliance?
-
What is the maximum file size supported under ext2fs? 2
GB?
-
GGI/KGI or the Graphics Interface in Kernel Space debate?
-
How do I get more than 16 SCSI disks?
-
What's devfs and why is it a Good Idea (tm)?
-
Linux memory management? Zone allocation?
-
How many open files can I have?
-
When will the Linux accept(2) bug be fixed?
-
What about STREAMS? I noticed Caldera has a STREAMS package,
when will that go in the kernel source proper?
-
I need encryption and steganography. Why isn't it
in the kernel?
-
How about an undelete facility in the kernel?
-
How about tmpfs for Linux?
-
What is the maximum file size/filesystem size?
-
Linux uses lots of swap while I still have stuff in
cache. Isn't this wrong?
-
Why don't we add resource forks/streams to Linux
filesystems like NT has?
-
Why don't we internationalise kernel messages?
-
Size (source and executable)?
-
Can I use a 2.2.x kernel with a distribution based on
a 2.0.x kernel?
-
New filesystems supported?
-
Performance?
-
New drivers not available under 2.0.x?
-
What are those __initxxx macros?
-
I have seen many posts on a "Memory Rusting Effect". Under
what circumstances/why does it occur?
-
Why does ifconfig show incorrect statistics with 2.2.x
kernels?
-
My pseudo-tty devices don't work any more. What
happened?
-
Can I use Unix 98 ptys?
-
Capabilities?
-
Kernel API changes
Please, if you wish to contribute a Q/A in this section, provide a very
short answer defining the topic and then a URL
to a longer text/Web page. That way we can have various URLs for a single
Q, each with a different point of view. Another advantage of this approach
is that each contributor has to sit down and write a coherent HTML page
or text file. Having to structure a written answer gives ample time to
think about the issues and the topic as a whole. It also allows frequent
independent revisions, which would be impossible on the FAQ itself.
Note that writing the longer text/Web page on some relevant Linux kernel
topic and providing a Q/A in this section confers on you instant Guru
status. Some people would *kill* for this. Now go and write
your stuff. ;)
-
What's a primer document and why should I read it first?
-
How about having I/O completion ports?
-
What is the VFS and how does it work?
-
What's the Linux kernel's notion of time?
-
Is there any magic in /proc/scsi that I can use to rescan
the SCSI bus?
Answers to common questions about kernel programming details. See also
Tigran Aivazian's page on
kernel
programming.
-
When is cli() needed?
-
Why do I sometimes see a cli()-sti() pair, and sometimes
a save_flags-cli()-restore_flags sequence?
-
Can I call printk() when interrupts are disabled?
-
What is the exact purpose of start_bh_atomic() and
end_bh_atomic()?
-
Is it safe to grab the global kernel lock multiple
times?
-
When do I need to initialise variables?
We sometimes get these messages in our system logs and wonder what they
mean...
-
What exactly does a "Socket destroy delayed"
mean?
-
What do I do about "inconsistent MTRRs"?
-
Why does my kernel report lots of "DriveStatusError
BadCRC" messages?
-
Why does my kernel report lots of "APIC error"
messages?
The kernel behaves in ways that seem odd...
-
Why is kapmd using so much CPU time?
-
Why does the 2.4 kernel report Connection
refused when connecting to sites which work fine with earlier
kernels?
-
Why does the kernel now report zero shared memory?
-
Why does lsmod report a use count of -1 for
some modules? Is this a bug?
-
Why doesn't the kernel see all of my RAM?
-
I've mounted a filesystem in two different places and
it worked. Why?
Responses to suggestions about programming techniques and languages.
-
Why is the Linux kernel written in C/assembly?
-
Why don't we rewrite it all in assembly language for
processor Mega666?
-
Why don't we rewrite the Linux kernel in C++?
-
Why is the Linux kernel monolithic? Why don't we rewrite
it as a microkernel?
-
Why don't we replace all the goto's with C
exceptions?
-
Why are the kernel developers so dismissive of new
techniques?
Answers to common questions about user-space programming details, as
it relates to the kernel/user-space interface (i.e. system calls).
This does not cover questions on the C library nor any other
library, as those questions are not related to the kernel.
-
Why does setsockopt() double SO_RCVBUF?
Answers
Section 1 - General questions
-
Why do you use "GNU/Linux" sometimes
and just "Linux" in other parts of the FAQ?
-
(ADB) In this FAQ, we have tried to use the
word "Linux" or the expression "Linux kernel" to designate the kernel,
and GNU/Linux to designate the entire body of GNU/GPL'ed OS software, as
found in the various distributions. We prefer to call a cat, a cat, and
a GNU, a GNU. ;-)
The purpose of the FAQ is to provide information on the Linux kernel
and avoid debates on e.g. semantics issues. Further discussion of the
relationship between GNU software and Linux can be found at
http://www.gnu.org/gnu/linux-and-gnu.html.
BTW, it seems many people forget that the linux kernel mailing
list is a forum for discussion of kernel-related matters, not GNU/Linux in
general; please do not bring up this
subject on the list.
-
What is an experimental kernel version?
-
(ADB) Linux kernel versions are divided
into two series: experimental (odd series e.g. 1.3.xx or 2.1.x) and
production (even series e.g. 1.2.xx, 2.0.xx, 2.2.x, 2.4.x and so
on). The experimental series are fast moving versions which are used
to test new features, algorithms, device drivers, etc. By their very
nature the experimental kernels may behave in unpredictable ways, so
one may experience data losses, random machine lockups, etc.
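The odd/even rule can be sketched in a few lines of Python. This is
only an illustration of the numbering scheme described above: the
function name kernel_series is hypothetical and is not part of any
kernel tool, and the sketch assumes a version string of the form
major.minor.something.

```python
def kernel_series(version: str) -> str:
    """Classify a version such as "2.1.105" under the odd/even scheme."""
    fields = version.split(".")
    if len(fields) < 2:
        raise ValueError(f"expected at least major.minor, got {version!r}")
    minor = int(fields[1])  # the second number decides the series
    return "experimental" if minor % 2 == 1 else "production"


if __name__ == "__main__":
    for example in ("1.2.13", "1.3.100", "2.0.34", "2.1.105", "2.4.18"):
        print(example, kernel_series(example))
```

Only the second number is inspected; the first and third play no part
in deciding the series.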
-
What is a production kernel?
-
(ADB) Production or stable kernels have a
well defined feature set, a low number of known bugs, and tried and proven
drivers. They are released less frequently than the experimental kernels,
but even so some "vintages" are considered better than others. GNU/Linux
distributions are usually based on chosen stable kernel versions, not necessarily
the latest production version.
-
What is a feature freeze?
-
(ADB) A feature freeze is when Linus announces
on the linux-kernel list that he will not consider any more features until
the release of a new stable kernel version. Usually the net effect of such
an announcement is that on the following days people on the list propose
a flurry of new features before Linus really enforces the feature freeze.
;-)
-
What is a code freeze?
-
(ADB) A code freeze is more restrictive than
a feature freeze; it means only severe bug fixes are accepted. This is
a short phase that usually precedes the creation of a new stable kernel
tree.
-
What is a f.g.hhprei
kernel?
-
(ADB) These are intermediate pre-release versions
of version f.g.hh. Note that usually i < 5, but e.g. 2.0.34prei
was available with i = 1 to 16. Sometimes "pre" is replaced by the
initials of the developer putting together the kernel revision,
e.g. 2.1.105ac4 means the 4th intermediate release of kernel
version 2.1.105 by Alan Cox.
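As a rough sketch, a name of this form can be split into its parts
with a regular expression. The names KernelVersion and
parse_kernel_version are made up for this example, and the pattern
assumes the tag is a run of lower-case letters followed by a number
(optionally preceded by a hyphen), which covers the examples above but
not necessarily every name ever used.

```python
import re
from typing import NamedTuple, Optional


class KernelVersion(NamedTuple):
    major: int
    minor: int
    patch: int
    tag: Optional[str]         # "pre", or a developer's initials such as "ac"
    tag_number: Optional[int]  # the "i" in f.g.hhprei


# f.g.hh, optionally followed by a tag and its number.
_VERSION = re.compile(r"^(\d+)\.(\d+)\.(\d+)(?:-?([a-z]+)(\d+))?$")


def parse_kernel_version(text: str) -> KernelVersion:
    match = _VERSION.match(text)
    if match is None:
        raise ValueError(f"unrecognised kernel version: {text!r}")
    major, minor, patch, tag, number = match.groups()
    return KernelVersion(
        int(major), int(minor), int(patch),
        tag, int(number) if number is not None else None,
    )


if __name__ == "__main__":
    print(parse_kernel_version("2.0.34pre16"))
    print(parse_kernel_version("2.1.105ac4"))
```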
-
Where do I get the latest kernel source?
-
(ADB) The primary site for the Linux
kernel (experimental and production) sources is hosted by Transmeta
(the company Linus Torvalds works for) on a dedicated Web server at http://www.kernel.org/. This site is
mirrored across the world, and has pointers to mirrors for each
country. You can go directly to a mirror for your country by going to
http://www.CODE.kernel.org/ where "CODE"
is the appropriate country code. For example, "au" is the country code
for Australia, so the principal mirror site for Australia is http://www.au.kernel.org/
-
(REG) You may also access tarballs and
patches directly via ftp from
ftp://ftp.CODE.kernel.org/pub/linux/kernel/
which is where Linus distributes his kernels from. Other notable
kernel hackers have directories under the people
directory, which is where they keep their kernel patches. The
testing directory is where Linus puts pre-release
patches. The pre-release patches are mainly intended for other
developers, so they can stay in sync with changes in Linus' source
tree. These are often highly experimental and may crash or cause
filesystem corruption. Use at your own risk.
Note that Linus and Marcelo are using
BitKeeper to manage their
kernel source trees, and it is more convenient for them to make
snapshots of their latest trees available via BitKeeper, rather than
make patches. If you want access to these snapshots (which are merely
a work in progress, and may be buggy), there are several access
methods available:
BitKeeper: bk://linux.bkbits.net/linux-2.[45]
CVS: :pserver:[email protected]:/home/cvs/linux-2.[45]
RSYNC: rsync.kernel.org::pub/scm/linux/kernel/bkcvs/linux-2.[45]/
Subversion: svn://svn.kernel.org/linux-2.[46]/trunk
These access methods are provided as a service to the community by
BitKeeper, as is the free
licence to use the BitKeeper software for Open Source projects. These
services cost BitKeeper real money to provide (bandwidth, computing
power and support costs for helping the Open Source community). Please
do not abuse these services.
-
Where do I get extra kernel patches?
-
(REG) There are many places which provide
various extra patches to the kernel for new features. One fairly good
archive is available at:
http://www.linuxhq.com/.
-
What is a patch?
-
(RRR) A patch file (as it refers to the Linux
kernel) is an ASCII text file that contains the differences
between the original code and the new code, plus some additional
information such as filenames and line numbers. The patch program (man
patch) can then apply the patch to an existing kernel source tree.
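To make the format concrete, here is a small sketch that builds a
unified diff in memory with Python's standard difflib module. It is
not how kernel patches are normally produced (the diff program is),
and the file names hello.c.orig and hello.c and their contents are
invented for the example.

```python
import difflib

# Two versions of a tiny, made-up source file, as lists of lines.
original = ["int answer(void)\n", "{\n", "\treturn 41;\n", "}\n"]
modified = ["int answer(void)\n", "{\n", "\treturn 42;\n", "}\n"]

# unified_diff yields the header lines (file names), a hunk header
# carrying the line numbers, and then the context and changed lines.
patch_lines = list(
    difflib.unified_diff(
        original,
        modified,
        fromfile="hello.c.orig",
        tofile="hello.c",
    )
)

if __name__ == "__main__":
    print("".join(patch_lines), end="")
```

In the output, the lines starting with "---" and "+++" name the old
and new files, the "@@" line gives the line numbers, a leading "-"
marks a line removed from the original and a leading "+" marks a line
added in the new version.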
-
How do I make a patch suitable for the linux kernel
list?
-
(REG) Here are some basic guidelines for
posting patches. For information on how to generate patches, see the
entry by RRR below.
-
Ensure the patch does not have trailing control-M characters on each
line. A number of broken tools used to encode patches add control-M
for "DOS compatibility". This breaks many versions of patch, so
be sure to configure your tools properly, or use unbroken tools,
otherwise your patch will be silently deleted.
-
Include the patch inline in your email, in plain text. Do not post it
as a base64 MIME attachment. Many people will not be able to
read your patch, and thus your patch will be deleted without comment.
-
If you have a large patch, post a URL instead, otherwise you'll fill
the mailboxes of thousands of people, and you will get complaints.
Posting a new, large patch once might be OK, but updates should not be
posted in full (post a URL).
-
If you want Linus or one of the primary maintainers (i.e. Marcelo,
David) to apply your patch, you must Cc: them explicitly,
otherwise your patch will be ignored.
-
When sending patches to Linus or one of the primary maintainers, you
must include the patch inline, in plain text, no matter how large the
patch.
-
If you want to send a patch to the list for comment, and also send it
to Linus/primary maintainer for inclusion, and the patch is large, you
may wonder how to reconcile the conflicting requirements. The
solution is obvious: post the URL to the mailing list, wait for
comments, and later send the patch, inline, to Linus/primary
maintainer. Yes, this is more work for you. No, we don't care.
-
If you have a mailer that eats whitespace or causes similar
corruption, then FIX YOUR MAILER, don't expect to be able to
take the easy solution and MIME encode your patch.
Finally, I've seen one person question the veracity of these
guidelines, stating that the rules are rather more relaxed, and that this
FAQ is being overzealous. Fortunately, the King Penguin himself
responded to this, so I include his words here so that there can
be no doubt:
If I get a patch in an attachment (other than a "Text/PLAIN" type
attachment with no mangling and that pretty much all mail readers and
all tools will see as a normal body), I simply WILL NOT apply it unless
I have strong reason to. I usually wont even bother looking at it,
unless I expected something special from the sender.
Really. Don't send patches as attachments.
Linus
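The control-M guideline above is easy to check mechanically before
posting. The following is a minimal sketch, not an official tool: the
function name lines_with_carriage_return is hypothetical, and it only
looks for a carriage return at the end of a line, nothing else.

```python
from typing import List


def lines_with_carriage_return(patch_bytes: bytes) -> List[int]:
    """Return the 1-based numbers of lines that end in control-M (CR)."""
    return [
        number
        for number, line in enumerate(patch_bytes.split(b"\n"), start=1)
        if line.endswith(b"\r")
    ]


if __name__ == "__main__":
    sample = b"--- a/file.c\r\n+++ b/file.c\n+new line\r\n"
    bad = lines_with_carriage_return(sample)
    if bad:
        print("control-M found on lines:", bad)
    else:
        print("no trailing control-M characters")
```

The patch is read as bytes rather than text so that no newline
translation happens behind your back while checking.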
-
(RRR) To make a patch you use the diff program
(read the info file for diff). The easiest way to do this is to set up
two source trees under /usr/src, set a symlink "/usr/src/linux" to point
to the modified tree, and diff one tree against the other. The file /usr/src/linux/Documentation/CodingStyle
has more specific information, read it. Things
to remember:
-
Always specify unified (-u) diff format.
-
Avoid making formatting changes to the source that make the diff needlessly
larger. Watch out for editors that convert tabs to spaces or vice versa.
-
Unless you have specific reasons, diff against the latest official source
tree. Otherwise, your patch is likely to be ignored. Either way, specify
in your post against what you've diff'ed.
-
Make sure your diff includes only the intended changes in your patch, not
every other patch you have made to your source tree. Usually patches are
limited to a few files, or directories. It is best to only diff the relevant
files. For example, if you only made changes to the file driver_xyz.c under
drivers/net, then you would use the following commands (assuming you have
the original source tree named "linux-2.1.105", and the modified tree
pointed at by the symlink "linux"):
cd /usr/src
diff -u linux-2.1.105/drivers/net/driver_xyz.c \
linux/drivers/net/driver_xyz.c > my_patch
-
The following two should go without saying: the arguments to diff are first
source (the original, unmodified file(s)),
and then destination (your modified version
of the file(s)), otherwise you get a reversed patch (and lots of
people wondering what you're smoking). Also, make sure your patch
applies and compiles cleanly.
-
Of course you need to set up two identical source directories to
be able to diff the tree later. A nice trick -- requiring a
little bit of consideration, though -- is to create the
modified source tree from hard links to the original source tree:
tar xzvf linux-2.1.anything.tar.gz
mv linux linux-2.1.anything.orig
cp -al linux-2.1.anything.orig linux-2.1.anything
This will hardlink every source file from the original tree to a new
location; it is very fast, since it does not need to create some 80+
megabytes of files. You can now apply patches to the
linux-2.1.anything source tree, since patch does not change the
original files but moves them to filename.orig, so the
contents of the hard-linked file will not be changed.
Assuming that your editor does the same thing, too (moving original
files to backup files before writing out changed ones) you can also
freely edit within the hardlinked tree. If your editor does not
handle files this way, you need to make a copy of each file before
editing it, like this:
cp driver_xyz.c temporary; mv temporary driver_xyz.c
You can use file permissions to remind you to do this. Just remove
write permissions from all the files in the directory you are working
in:
chmod -w *.c
The changed tree can be diffed at high speed, since most files don't
just have identical contents, they are identical files in both
trees. Naturally removing that tree is quite fast, too. Thanks to
Janos Farkas <[email protected]>
for this trick.
-
Finally, review the patch file (the format is not that complicated) before
posting, and include all relevant information as to the nature of the patch.
In particular, specify: why is this patch needed/useful, and what exactly
does it fix/improve.
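The hard-link trick described above depends on edits replacing a file
rather than writing into it. The following sketch shows the difference
on a single file; the file names orig.c, work.c and temporary are
invented, and it assumes a filesystem that supports hard links.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    original = os.path.join(workdir, "orig.c")
    working = os.path.join(workdir, "work.c")

    with open(original, "w") as handle:
        handle.write("old contents\n")

    # What "cp -al" does for each file: a second name for the same data.
    os.link(original, working)
    shares_inode_before = os.path.samefile(original, working)

    # Safe edit: write the new version elsewhere, then rename it over
    # the working name. This is the "cp ...; mv ..." idea from above.
    scratch = os.path.join(workdir, "temporary")
    with open(scratch, "w") as handle:
        handle.write("new contents\n")
    os.replace(scratch, working)
    shares_inode_after = os.path.samefile(original, working)

    with open(original) as handle:
        original_text = handle.read()

if __name__ == "__main__":
    print("linked before edit:", shares_inode_before)
    print("linked after edit: ", shares_inode_after)
    print("original still reads:", repr(original_text))
```

Had the edit opened work.c and written into it directly, both names
would have shown the new contents and the diff against the original
tree would have come out empty.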
-
How do I apply a patch?
-
(TAC) (From /usr/src/linux/README)
You can upgrade between releases by patching. Patches are distributed
in the traditional gzip and the new bzip2 format. To install by
patching, get all the newer patch files, enter the top-level directory
of the unpacked kernel source tree and execute:
gzip -cd patchXX.gz | patch -p1
or:
bzip2 -dc patchXX.bz2 | patch -p1
(repeat xx for all versions bigger than the version of your current
source tree, in order) and you should be ok. You may want to
remove the backup files (xxx~ or xxx.orig), and make sure that there
are no failed patches (xxx# or xxx.rej). If there are, either you or
me has made a mistake.
Alternatively, the script patch-kernel can
be used to automate this process. It determines the current kernel
version and applies any patches found. Use it thus:
scripts/patch-kernel .
The first argument in the command is the location of the kernel
source. Patches are applied from the current directory, but an
alternative directory can be specified as the second argument.
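The "in order" requirement is worth a moment's thought, because a
plain alphabetical sort puts patch100 before patch99. Here is a small
sketch of picking the right files in the right order. It assumes file
names of the placeholder form patchXX.gz or patchXX.bz2 used above,
with XX a plain number; the function name patches_to_apply is
hypothetical, and this is not how scripts/patch-kernel is implemented.

```python
import re
from typing import List


def patches_to_apply(current: int, filenames: List[str]) -> List[str]:
    """Return patches numbered above `current`, sorted numerically."""
    numbered = []
    for name in filenames:
        match = re.fullmatch(r"patch(\d+)\.(?:gz|bz2)", name)
        if match and int(match.group(1)) > current:
            numbered.append((int(match.group(1)), name))
    # Sort on the number, not the string, so that 99 comes before 100.
    return [name for _, name in sorted(numbered)]


if __name__ == "__main__":
    available = ["patch100.gz", "patch99.bz2", "patch98.gz", "README"]
    for name in patches_to_apply(98, available):
        print(name)
```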
-
(RRR) To apply kernel patches please take
a look at the kernel README file (/usr/src/linux/README) under "Installing
the kernel". There is also a good
explanation on the Linux HQ Project site.
-
What's vger?
-
(REG) "vger" is the name of the machine
which hosts the LKML server. This server also hosts a number of other
linux-related mailing lists. More information about the server is
available at
http://vger.kernel.org/
-
What is a CVS tree? Where can I find more
information about CVS?
-
(REG) "CVS" is short for Concurrent
Versions System, a Source Code Management system. Check out the
CVS Bubbles page.
-
Is there a CVS tutorial somewhere?
-
(ADB) Here is a CVS tutorial which you
can find online:
Getting a general idea of how CVS works takes about 15 minutes (highly
recommended). Note that there are various graphical front ends to CVS,
so you don't have to learn the usual assortment of cryptic commands.
-
How do I get my patch into the kernel?
-
(RRR) Depending on your patch there are
several ways to get it into the kernel. The first thing is to
determine which maintainer your code falls under (look in the
MAINTAINERS file). If your patch is only a small bugfix and you're
sure that it is 'obviously correct', then by all means send it to the
appropriate maintainer and post it to the list. If there is urgency
to the bugfix (e.g. a major security hole) you can also send it to
Linus directly, but remember he's likely to ignore random patches
unless they are "obviously correct" to him, have the maintainer's
approval, or have been well tested and meet the first condition. In
case you're wondering what constitutes well tested, here's another
important bit: one purpose of the list is to get patches peer-reviewed
and well-tested. Now, if your patch is relatively big, e.g. a rewrite
of a large code section or a new device driver, then to conserve
bandwidth and disk-space just post an announcement to the list with a
link to the patch. Lastly, if you're not too sure about your patch
yet, want some feedback from the maintainer, or wish to avoid
open-season flaming on work-in-progress, then use private email.
-
(REG) If there is no specific maintainer
for the part of the kernel you want to patch, then you have three main
options:
- send it to [email protected] and hope someone
picks it up and feeds it to Linus, or maybe Linus himself will pick it
up (don't count on it)
- send it to linux-kernel and Cc:
Linus Torvalds <[email protected]> and hope Linus
will apply it. Note that Linus operates like a black box. Do not
expect a response from him. You will need to check patches he releases
to see if he applied your patch. If he doesn't apply your patch, you
will need to resend it (often many times). If after weeks or months
and many patch releases he still hasn't applied it, maybe you should
give up. He probably doesn't like it.
- send it to linux-kernel and Cc:
Alan Cox <[email protected]>. Alan is better at
responding to email, and will queue your patch and resend it to Linus
periodically, so you can forget about it. He also serves as a good
taste tester. If Alan accepts your patch, it's more likely that Linus
will too. If he doesn't like your patch, you will probably get an
email saying so. Expect it to be terse.
-
Why does the kernel tarball contain a directory
called linux/ instead of linux-x.y.z/ ?
-
(DW) Because that's the way Linus wants
it. It makes applying many consecutive patches simpler, because the
directory doesn't need to be renamed each time, and it also makes life
easier for Linus.
-
What's the difference between the official
kernels and Alan Cox's -ac series of patches?
-
(REG, contributed by Erik Mouw) Alan's
kernel can be seen as a test bed for Linus' kernels. While Linus is
very conservative and only applies obvious and well tested patches to
the 2.4 kernel, Alan maintains a set of kernel patches that contains
new concepts, more and/or newer drivers, and more intrusive
patches. If the patches prove themselves stable, Alan submits them to
Linus to include them into the official kernel.
-
What does it mean for a module to be tainted?
-
(REG, contributed by John Levon) Some
vendors distribute binary modules (i.e. modules without available
source code under a free software license). As the source is not
freely available, any bugs uncovered whilst such modules are loaded
cannot be investigated by the kernel hackers. All problems discovered
whilst such a module is loaded must be reported to the vendor of that
module, not the Linux kernel hackers and the linux-kernel
mailing list. The tainting scheme is used to identify bug reports from
kernels with binary modules loaded: such kernels are marked as
"tainted" by means of the MODULE_LICENSE tag. If a module is
loaded that does not specify an approved license, the kernel is marked
as tainted. The canonical list of approved license strings is in
linux/include/linux/module.h.
"oops" reports marked as tainted are of no use to the kernel
developers and will be ignored. A warning is output when such a module
is loaded. Note that you may come across module source that is under
a compatible license, but does not have a suitable
MODULE_LICENSE tag. If you see a warning from
modprobe or insmod for a module under a compatible
license, please report this bug to the maintainers of the module, so
that they can add the necessary tag.
-
(KO) If a symbol has been exported with
EXPORT_SYMBOL_GPL then it appears as unresolved for modules that do
not have a GPL compatible MODULE_LICENSE string, and a warning is
printed.
A module can also taint the kernel if you do a forced load. This
bypasses the kernel/module verification checks and the result is
undefined; when it breaks, you get to keep the pieces.
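The two rules above can be pictured with a toy model. This is a sketch in Python, not kernel code: the function names are hypothetical, and the set of license strings is an assumed subset for illustration only (the canonical list is in linux/include/linux/module.h).

```python
# Toy model of module tainting and GPL-only symbol resolution.
# NOT kernel code: the names and the license set are illustrative assumptions.

# Assumed subset of approved strings; the canonical list lives in
# linux/include/linux/module.h.
GPL_COMPATIBLE = {
    "GPL",
    "GPL v2",
    "GPL and additional rights",
    "Dual BSD/GPL",
    "Dual MPL/GPL",
}


def taints_kernel(module_license, forced=False):
    """Return True if loading the module would mark the kernel tainted."""
    # A missing or unapproved MODULE_LICENSE string taints the kernel,
    # and so does a forced load.
    return forced or module_license not in GPL_COMPATIBLE


def can_resolve(symbol_is_gpl_only, module_license):
    """Return True if a module with this license may use the symbol."""
    # EXPORT_SYMBOL symbols are open to all modules; EXPORT_SYMBOL_GPL
    # symbols stay unresolved without a GPL compatible license.
    return (not symbol_is_gpl_only) or module_license in GPL_COMPATIBLE
```

Under these assumptions, a module with no license tag (None) taints the kernel, and a "Proprietary" module can still resolve ordinary EXPORT_SYMBOL symbols but not GPL-only ones.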
-
(KO) According to Alan Cox, a license of
"BSD without advertisement clause" is not a suitable free
software license. This license type allows binary only modules without
source code. Any modules in the kernel tarball with this license
should really be "Dual BSD/GPL".
-
What is this about GPLONLY symbols?
-
(REG) By default, symbols are exported
using EXPORT_SYMBOL, so they can be used by loadable
modules. During the 2.4 series, a new export directive
EXPORT_SYMBOL_GPL was added. This is almost the same thing,
except that the symbol can only be accessed by modules which have a GPL
compatible licence (note that this includes dual-licenced BSD/GPL
code). This new directive was added for these reasons:
- To clarify the ambiguous legal ground on which non-GPL
(particularly proprietary) modules lie. A strict reading of the GPL
prohibits loading proprietary modules into the kernel. While Linus has
consistently stated that proprietary modules are allowed (i.e. he has
granted an explicit exemption), it is not clear that he is able to
speak for all developers who have contributed to the Linux kernel.
While many think Linus' edict means that all contributed code falls
under this exemption granted by Linus, not everyone agrees that this
is a legally sound argument. The new EXPORT_SYMBOL_GPL
directive makes the licence conditions explicit, and thus removes the
legal ambiguity.
- To allow choice for developers who wish, for their own reasons, to
contribute code which cannot be used by proprietary modules. Just as a
developer has the right to distribute code under a proprietary
licence, so too may a developer distribute code under an
anti-proprietary licence (i.e. strict GPL).
Note that Linus has stated that existing symbols will not be switched to GPL-only. Developers of
proprietary modules for Linux need not fear. Furthermore, it is quite
unlikely that Linus will look favourably upon the introduction of new
core driver APIs which are restricted to GPL-only modules. This would
not be in the best interests of Linux. Linus has forwarded me a
message he sent to someone else to clarify his views.
-
Do I have to use BitKeeper to send patches?
-
(REG) Absolutely not. Some kernel
developers, including Linus and Marcelo, have chosen to use
BitKeeper to manage their
kernel source trees, but this does not mean you need to use BitKeeper
yourself to maintain your trees or submit patches. Many notable kernel
developers continue to maintain their source trees using other tools
and techniques, and continue to send conventional patches.
If you want to use BitKeeper to manage and submit your code, the
Documentation/BK-usage directory has some information and
sample scripts which contain some useful suggestions. This
documentation and code is not an endorsement of BitKeeper.
-
Why do some developers use the non-free
BitKeeper? Isn't this against the spirit of Free Software?
-
(REG) This depends on whose definition of
"freedom" you are using, and what you think "Free Software"
means. Some definitions are available at
http://www.gnu.org/philosophy/free-sw.html,
http://www.debian.org/social_contract.html and
http://www.opensource.org/docs/definition_plain.html.
If you subscribe to the view that all software wants to be free (and
hence all non-free/proprietary software is evil), then yes, using
BitKeeper is against the spirit
of Free Software.
However, if you care more about freedom for software developers to
develop software and use whatever tools they wish, then using
BitKeeper is in the spirit of Free Software, since some
developers feel that BitKeeper saves them a lot of time and effort.
That is good for the development of Free Software.
This is an ideological debate that some people will never agree on,
and generates considerable flaming on both sides of the argument.
Talking about it on the kernel development mailing list will not
resolve the issue, no matter how loud or how many times you scream, so
it's better that those who feel strongly about it debate the finer
points of freedom elsewhere, and leave the development list for
matters of actual code development, which is what it is
intended for.
Note that BitKeeper is not free software, but it may be used for Free
Software projects at no charge, subject to the licensing rules (bk
help bkl will show the licence, or you can go to
http://www.bitkeeper.com/Sales.Licensing.Free.html).
If you are seriously concerned about the use of a non-free management
tool for a Free Software project, the most productive approach to
changing the situation is to write a free replacement. This is most
likely to take several years of work, particularly because Linus is
very demanding. It is worth remembering that it took years from when
BitKeeper was "nearly good enough" to Linus being satisfied with the
feature set. A free replacement will face the same technical hurdles.
Larry McVoy (founder of BitKeeper) stated in April 2002 what the
development effort was:
it took 4 years of at least 6 day/week efforts by a team that varied
in size from 3-8 engineers to get BitKeeper where it is today.
-
Who maintains the kernel?
-
(REG) Originally, Linus Torvalds
maintained the kernel. As the kernel has matured, he has delegated
maintenance for older stable versions to others, while he continues
development of the latest "bleeding edge" release. As of 27-MAY-2002,
the following kernel versions are maintained by these people:
-
The kernel doesn't compile cleanly. What shall I
do?
-
(REG) First make sure you have the latest
version of that kernel series. Perhaps a pre-patch already has a fix.
If not, search the list archives for a fix. Don't contribute to noise
on the list by asking a question that may already have been answered.
If the problem has not yet been fixed, try digging into the code
yourself and post a fix to the mailing list. You'll be famous! Beware
that making broken code compile just for the sake of a clean 'make
bzImage modules' doesn't count as a fix, and your fix will be
discarded, ignored or flamed.
Section 2 - Driver specific questions
-
Driver such and such is broken!
-
(RRR) Try to be more specific. Please
provide information on your particular setup (see the Q "How do I make a
bug report?"). Also see the Q "kernel x.y.z broken!" below.
-
(ADB) That's the worst possible way to start
a thread. Please try to reach the author of the driver first and report
the "broken" driver to him. Constructive criticism is welcome, usually.
-
Here is a new driver for hardware XYZ.
-
(REW) Good work! Please try to find a few
people that also have the XYZ hardware and have them test it on their configuration
(e.g. by posting a message on a newsgroup). No, it won't go in the standard
kernel before some people have tested it.
Testing will take a while. In the mean time, kernel development will
continue, and you will have to rewrite your patch for the most recent version
before Linus might consider it.
As a whole new driver is most likely more than a few pages long, we'd
prefer it if you would put the actual driver up for ftp instead of posting
it to the list. Post the URL and the description that tells us what your
driver does for which hardware.
-
Is there support for my card TW-345 model C in
kernel version f.g.hh?
-
(REW) First check if your card is detected
at boot time. It usually is. Second see if you might need to configure
something like modules.conf for your card. Third see if there is a file
with the card name in the kernel sources (e.g. if you have a Buslogic card
and there is a buslogic.c file in the kernel sources, you're in luck).
Next, grep for the manufacturer name through ALL the kernel sources. And
try the model number of your card. Also try to find the largest chip on
your card and grep for the chip number on that thing. Realize that 53C80
chips might be named 5380 in the kernel. Other chips don't have their middle
name removed.
Nothing yet? Now check DejaNews, using the same arguments you used
to grep the kernel source. There is a 99.99% chance that somebody has exactly
the same card TW-345 model C.
Ok. That's what you can do without bothering anyone. If all this doesn't
lead somewhere, you should really ask this question on a newsgroup like
comp.os.linux.hardware.
-
Who maintains driver such and such?
-
(RRR) Have a look at the /usr/src/linux/MAINTAINERS
file, this is the most authoritative source. Also check the source code
for the driver itself; in both cases, check the latest version of the kernel
that you have available. Some drivers have specific Web pages and sometimes
even a dedicated mailing list. Check those first. If you cannot contact
the maintainer then as a last resort post a short message to the
list. In any case, keep in mind that maintainers
are usually very busy people and most of them work on Linux for
free and in their spare time, so don't expect an
immediate response. Some maintainers get too many mails in
too short a time to be able to answer them all, so please be kind
to them.
-
I want to write a driver for card TW-345 model C,
how do I get started?
-
(REW) Good initiative! First a piece of advice:
are you up to this? Ten times as many projects like this get started as
get finished. Also, make sure that you're not doing double work. Make sure
that such a driver is not already available: read Q/A 2.3
above...
First prepare yourself. Get the docs, read them (OK, you're allowed
to start skipping stuff if you've gotten to the part "detailed register
descriptions"). Next, get the Linux kernel source, find a driver that drives
similar hardware to the one you're going to work on, and read THAT. (I
usually use the smallest one I can find: wc -l *.c | sort -n | head -4).
Ok. You've thought about it. Now the question is, do you have technical
documentation for your card? You can reverse engineer the driver for MS
operating systems, but having the documentation is MUCH easier.
In the dark old ages (70s to middle of the 80s), you got a complete
technical description with every card you could get. This is no longer
the case. Anyway, contact your vendor and politely ask them for the "device
driver kit" or the "technical manual" for the card.
Try the head office and your local office at the same time. Local offices
occasionally have bad photo copies that they give out before you get an
official rejection from the head office. In that case whom you got the
documentation from becomes confidential information. Don't put the guy's
name in the source.
If you can't get the technical documentation, consider giving up and
investing in a competitor's product (and tell the manufacturer about this).
Not given up yet? Ok. Next step is to find out what the DOS driver does.
Try to get the card to work while you run it in a Microsoft emulator (dosemu
or WINE). This will allow you to program these tools to log the I/O accesses
of the driver. This will give you a large list of I/O accesses that the
driver did. If you're good, you might be able to see patterns, and deduce
how the driver works. From there you might be able to write a working driver.
Good luck! You'll need it.
-
I want to get the docs, but they want me to sign
an NDA (Non-Disclosure Agreement).
-
(REW) Some people find this a tremendous problem.
Some companies just want to know who has the docs to their hardware, and
don't mind if you write a GPL-ed driver. In that case, there is really
no problem: just tell them what you intend to do and ask them to acknowledge
in writing that they've understood what you're saying. In that case, you
can get your driver into the standard kernel, but you cannot send out the
docs to anybody who wants to work on the driver. They will have to rely
on the comments in the source.
Other companies (just like Netscape) have themselves signed NDAs that forbid
them to disclose information to you.
Some really think that they have trade secrets in the interface towards
the software, and intend to keep them secret. Those won't allow you to
write a driver and then put the source on the net. Be careful with
these.
-
(ADB) The first and only NDA I ever received
instantly found its way to the wastebasket. I would advise anybody who
gets an NDA to refuse to sign it, if it refers to anything that may/will
be put under GNU/GPL. Of course, for contract work this doesn't apply.
-
I want/need/must have a driver for card TW-345 model
C! Won't anybody write one for me?
-
(REW) Some Linux developers will settle for
a beer, and develop the driver for you. Others want a "free sample" of
the hardware and will then go ahead and write the driver.
If you need more than a few of the cards or you manufacture the cards
yourself, you can consider paying one of the commercial Linux device driver
companies to get a commercially backed, officially maintained device driver.
-
What's this major/minor device number thing?
-
(REG) Device numbers are the traditional Unix
way to provide a mapping between the filesystem and device drivers. A device
number is a combination of a major number and a minor number. Currently
Linux has 8 bit majors and minors. When you open a device file (character
or block device) the kernel takes the major number from the inode and indexes
into a table of driver structure pointers. The specific driver structure
is then used to call the driver open() method, which in turn may interpret
the minor number. There are two tables: one for character devices and one
for block devices, each with at most 256 entries. Obviously, there must
be agreement between device numbers used in a driver and files in /dev.
The kernel source has the file Documentation/devices.tex which lists all
the official major and minor numbers. H. Peter Anvin (HPA) maintains this
list. If you write a new driver (for public consumption), you will need
to get a major number allocated by HPA. See the Q/A on
devfs for an improved (IMHO) mechanism for handling device drivers.
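As a rough illustration of the lookup described above, here is a toy Python model of 8 bit majors and minors and the two driver tables. It is a sketch, not kernel code: the packing of major and minor into a single number, the function names and the registered "driver" are assumptions made for the example.

```python
# Toy model of 8-bit major/minor device numbers and the per-type
# driver tables. Function and driver names are hypothetical.

MAX_MAJORS = 256  # an 8-bit major allows at most 256 table entries


def make_dev(major, minor):
    """Pack an 8-bit major and an 8-bit minor into one device number."""
    if not (0 <= major < 256 and 0 <= minor < 256):
        raise ValueError("major and minor must each fit in 8 bits")
    return (major << 8) | minor


def split_dev(dev):
    """Return (major, minor) for a packed device number."""
    return (dev >> 8) & 0xFF, dev & 0xFF


# Two separate tables, each indexed by major number.
chrdevs = [None] * MAX_MAJORS
blkdevs = [None] * MAX_MAJORS


def open_char_device(dev):
    """Find the driver by major number and hand it the minor number."""
    major, minor = split_dev(dev)
    driver_open = chrdevs[major]
    if driver_open is None:
        raise LookupError("no character driver registered for major %d" % major)
    return driver_open(minor)


# Hypothetical driver: its open method interprets the minor itself.
chrdevs[4] = lambda minor: "example driver opened unit %d" % minor
```

With this model, open_char_device(make_dev(4, 1)) reaches the driver registered under major 4 and passes it minor 1, while an unregistered major raises an error, much as opening a device file with no matching driver fails.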
-
Why aren't WinModems supported?
-
(REG, quoting Edward S. Marshall) The
problem is the lack of specifications for this hardware. Most
companies producing so-called "WinModems" refuse to provide
specifications which would allow non-Microsoft operating systems to
use them.
The basic issue is that they don't work like a traditional modem;
they don't have a DSP, instead making the CPU do all the work. Hence,
you can't talk to them like a traditional modem, and you need
to run the modem driver as a realtime task, or you'll have serious
data loss issues under any kind of load. They're simply a poor design.
-
(REG) Note that some people have been
putting effort into reverse engineering some WinModems, so you may be
lucky and find that yours is now supported. If not, it's time to get a
refund and buy a real modem.
Note that modems have to be approved by the appropriate statutory or
regulatory body for standards compliance (to make sure they don't send
crap down the line and blow up the exchange). With WinModems, the
driver software needs to be certified as well as the hardware. It's
harder to get approval for Open Source drivers, since it usually costs
money to obtain approval. Also, in theory, it's easier to modify an
Open Source driver, so it would no longer be compliant. In reality,
99.999% of users don't even know there is source code for the driver,
so "Standards Compliance" may well be a smoke-screen for manufacturers
who don't want to bother with non-WinTel systems. If certification was
the only problem, manufacturers could release binary-only drivers.
-
(DW) The good news is that a certain
amount of WinModem hardware is now supported. The bad news is that
that is just the tip of the iceberg. Although the WinModems can now be
used, they have functionality similar to that of a sound card - all the
modulation and demodulation has to be performed by the host CPU. Work
is progressing on this front too - see http://www.linmodems.org/
for more up-to-date information.
-
Modern CPUs are very fast, so why can't I write a user mode
interrupt handler?
-
(REG, quoting Pete Zaitcev) This is not a
question of having enough CPU cycles to waste them on mode
switches. Rather, the current Linux architecture does not allow
it. User processes run with interrupts enabled. Thus, any interrupt
handler must deactivate the particular interrupt source before a
process is scheduled to run, or an interrupt storm results. The
deactivation is done in a device specific manner, so at least a small
device driver must be present in kernel mode.
-
Do I need to test my driver against all distributions?
-
(REG, MEA) There are minor detail changes
in between each kernel version (even in stable series), and depending
on what configuration options are used (basically SMP or not), certain
things like spinlocks may or may not reserve space in structures, and
may or may not need to be called (are even optimized away in non-SMP
systems), meaning that a binary driver compiled for SMP might
not work with a non-SMP kernel. And vice versa.
Also different vendors tend to inject different things into their
kernel patch-sets, which again may subtly change data layouts, etc. In
stable kernel series, great pains are taken during maintenance so that
data layouts of in-kernel APIs (and the API calls themselves) are not
changed. Nevertheless, something may change, causing binary drivers to
fail in mysterious ways.
Subtle memory changes may appear with i386-PAE mode (large memory
machines which can't map all of RAM into the kernel at the same time).
Because of these differences, a driver compiled for one version of the
kernel, or one vendor's kernel, is not likely to work with another
kernel. Thus, if you are distributing a binary-only driver, you will
have a significant support load compiling drivers for different
kernels. If you are distributing a driver in source form, then,
provided the driver is well-written (i.e. does not make assumptions
about byte ordering or word sizes and uses standard kernel
interfaces), the driver should be portable across kernel versions and
architecture types. It will of course have to be compiled by end-users
for their particular kernel. Distribution maintainers are likely to
provide pre-compiled drivers, thus most end-users won't need to
compile the driver themselves.
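To see why a binary driver built for one configuration can misbehave on another, consider how a structure's layout shifts when a lock field reserves space in one build and not in the other. The sketch below uses Python's ctypes purely to measure layouts; the structure and field names are hypothetical, and a real kernel structure is of course far more complex.

```python
import ctypes


# Hypothetical driver structure as laid out on an SMP build,
# where the spinlock occupies real space.
class DevStateSMP(ctypes.Structure):
    _fields_ = [
        ("lock", ctypes.c_uint32),     # lock word present on SMP
        ("irq", ctypes.c_uint32),
        ("io_base", ctypes.c_uint32),
    ]


# The same structure on a non-SMP build, where the lock reserves no space.
class DevStateUP(ctypes.Structure):
    _fields_ = [
        ("irq", ctypes.c_uint32),
        ("io_base", ctypes.c_uint32),
    ]


# Byte offset of the same field under each configuration.
smp_irq_offset = DevStateSMP.irq.offset
up_irq_offset = DevStateUP.irq.offset
```

Here the irq field sits at offset 4 in the SMP layout and at offset 0 in the non-SMP layout, so code compiled against one layout would read the wrong bytes when handed the other. Compiling from source against the running kernel's headers avoids the mismatch.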
Section 3 - Mailing list questions
The linux-kernel mailing list is for discussion of the development of the
Linux kernel itself. Questions about administration of a Linux based system,
programming on a Linux system or questions about a Linux distribution
are not appropriate.
"Test" messages are very,
very inappropriate on the lkml or any other list, for that
matter. If you want to know whether the subscribe succeeded, wait for
a couple of hours after you get a reply from the mailing list software
saying it did. You'll undoubtedly get a number of list messages. If
you want to know whether you can post, you must have something
important to say, right? After you have read the following paragraphs,
compose a real letter, not a test message, in an editor, saving the
body of the letter in the off chance your post doesn't succeed. Then
post your letter to lkml. Please remember that there are quite a
number of subscribers, and it will take a while for your letter to be
reflected back to you. An hour is not too long to wait.
(REG)
The essential point to remember when posting to the linux-kernel
mailing list is that there are a lot of very busy people reading the
list. No matter how important you think you are, it is most likely
that there are many people on the list who are more important than
you. "Important" is not measured by the amount of money you have, how
much your question is worth to your company or how desperate you are
for an answer; rather, it is measured by how much you contribute to
the Linux kernel.
With that in mind, you should make sure that you are not wasting the
time of other people on the list. Write for maximum efficiency of
reading. It doesn't matter if it takes twice as long for you to
compose a more readable message, if it halves the time a hundred key
kernel developers spend trying to decode your message. Ignoring good
taste and consideration is most likely to result in you being ignored.
-
How do I subscribe to the linux-kernel mailing
list?
-
(ADB) Think again before you
subscribe. Do you really want to get that much traffic in your
mailbox? Are you so concerned about Linux kernel development that you
will patch your kernel once a week, suffer through the oopses, bugs
and the resulting time and energy losses? Are you ready to join the
Order of the Great Penguin, and be called a "Linux geek" for the rest
of your life? Maybe you're better off reading the weekly "Kernel
Traffic" summary at
http://kt.zork.net/.
OK, if you still want to read linux-kernel in its full glory,
send the line "subscribe linux-kernel your_email@your_ISP" in the body
of the message to [email protected] (don't include the "
characters, and of course replace the fake email address with your
true address). You have been
warned!
-
(MEA) Quite often I see failures like the
one in this summary report:
FAILED:
<smtp cedar-republic.com [email protected] 60000>: ...\
<<- RCPT To:<[email protected]>
->> 550 <[email protected]>... we do not relay
Feeding this address to a page at URL:
http://vger.kernel.org/mxverify.html
yields information that ONE of their backup MX servers refuses to
send email through to them. Thus, whenever all their other servers are
unreachable, that one ruins their email connectivity.
Do make sure YOU don't have this very problem!
See
http://vger.kernel.org/majordomo-info.html for information on
Majordomo.
-
How do I unsubscribe from the linux-kernel mailing
list?
-
(ADB) At the bottom of each and every message
sent by the linux-kernel mailing list server one can read:
-
To unsubscribe from this list: send the line
"unsubscribe linux-kernel" in
the body of a message to [email protected]
See
http://vger.kernel.org/majordomo-info.html for information on
Majordomo.
-
Do I have to be subscribed to post to the list?
-
(ADB) No, you don't have to be subscribed
to the list to post to it. The address of the list is
[email protected]. You should indicate in your
message that you wish to be personally CC'ed on the answers/comments
posted to the list in response to your posting.
-
(REG) It is, however, generally
considered good netiquette to be subscribed to a list (or a newsgroup
for that matter) and lurk for a while before posting. That way you can
learn what's considered an appropriate post and what isn't.
Don't treat the list as your personal helpdesk. Remember that the
list is a community.
-
Is there an archive for the list?
-
(REG) There are many. Here are some:
Here are some more resources:
- "Kernel Traffic" at
http://kt.zork.net/
provides a weekly summary of the discussions on the list, and archives
previous summaries.
-
How can I search the archive for a specific question?
-
(ADB) Use simple keywords which refer to the
issue that matters to you. For example, if you are investigating an oops
that happens whenever you plug in a network adapter NIC-007, use "NIC-007"
or "oops NIC-007". As soon as you have found a link to a message that interests
you, try to follow the thread. Remember that you will almost always get
more information by carefully searching the archive than by posting a question
to the list itself.
-
Are there other ways to search the Web for information
on a particular Linux kernel issue?
-
(ADB) Sure. Before you check the list archives,
you can search DejaNews and AltaVista (simultaneously, if your browser
allows you to open various windows). You can also follow some links on
the Linux Documentation Project
site.
-
How heavy is the traffic on the list?
-
(TAC) List traffic is very heavy; the average
number of messages per day is 290 [10/2002 - 04/2003]. That's over 8,700
messages a month!
-
(ADB) You really don't want to read each
and every posting to the list. If you are concerned with list traffic,
I suggest you temporarily try the digest lists, which will be much
easier on your mailbox (thanks to A. Wik for this suggestion).
-
(REG) There is a weekly summary called
"Kernel Traffic" at
http://kt.zork.net/,
which can save you a lot of time.
-
What kind of question can I ask on the list?
-
(ADB) The basic rule is to avoid asking
questions that have been asked before, or that are irrelevant to other
list users, or that are off topic. Please use your good sense.
-
(REG) Remember that this is a list for
the discussion of kernel
development. If you have some ideas or bug reports to
contribute, this is the place. User space issues are not appropriate
for this forum. If you find a bug in the C library or some
application, it doesn't belong on linux-kernel.
-
What posting style should I use for the
list?
-
(REG, contributed by [email protected])
When following up a post on the kernel mailing list, please think
before you quote. Since everybody else on the list also got the
original post, don't quote it entirely. Quote only the points that
readers really need in order to understand your arguments. Make sure the quoted
part is recognizable as such, by ensuring each quoted line starts with
a > (or more >>, in case of multi-level quoting). Don't quote
signatures, entire patches, entire config files or entire posts. Don't
quote the standard signature. The kernel-list is crowded enough
already, let's take care!
-
(REG)
Be aware that your message is far more likely to be deleted without
being read if you have too much quoted material before your reply.
-
(REG)
And please reply after the quoted
text, not before it (as per
RFC 1855).
It's very confusing to see a reply before the quoted context. And it's
embarrassing: it makes you look like a newbie. Change your mailer if
the one you have makes it hard to do
reply-after-quoting.
I know some people like to quote the entire message they are replying
to, so they put their reply right at the top so people won't give up
after the first page of quoted material. Don't do it. It's
annoying. Just learn to stop quoting everything. No-one wants to see
it all anyway (list archives allow people to see everything if they
missed it). You're not helping yourself anyway, as you're more likely
to be ignored if you reply-before-quoting.
-
(REG)
Please don't use tabs or multiple spaces to quote text. Use the
"> " sequence instead. Using whitespace to quote text
makes it difficult to differentiate between what's quoted and the
reply. And don't try to be cute or "different" and use some other
character like "}" or whatever. Again, it's confusing. It
wastes people's time. Write for maximum efficiency of
reading.
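The quoting conventions above (quote selectively, prefix each quoted line with "> ", add one more ">" for each extra level) can be sketched as a small helper. This is only an illustration in Python; the function name and its arguments are made up for the example, and any decent mailer does this for you.

```python
def quote_reply(original, keep_lines):
    """Quote only the selected lines of a message, using the "> " prefix.

    keep_lines holds the 0-based indices of the lines worth quoting.
    Lines that are already quoted gain one more level of ">".
    """
    lines = original.splitlines()
    quoted = []
    for index in sorted(set(keep_lines)):
        line = lines[index]
        # Nested quotes stack as ">> ", first-level quotes get "> ".
        prefix = ">" if line.startswith(">") else "> "
        quoted.append(prefix + line)
    return "\n".join(quoted)
```

Your own reply then goes after the quoted lines, not before them. Note that the signature and anything else not selected is simply left out.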
-
(REG)
Please try to have halfway reasonable spelling and grammar. When
reading text with really bad spelling or grammar, people stall while
trying to parse your post. Don't think you're being "artistic" by
stripping out all punctuation characters. Linux-kernel is not an
online gallery, it's a communications medium. Write for
maximum efficiency of reading.
-
(REG)
Please don't have long, inflammatory, controversial or offensive
signatures (see
RFC 1855). The rule
of thumb is no more than 4 lines of 80 characters each.
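If you want to check a signature against that rule of thumb mechanically, a few lines of Python will do. The function name is hypothetical; the limits are the ones stated above.

```python
def signature_ok(signature, max_lines=4, max_width=80):
    """Return True if the signature fits the 4 lines x 80 characters rule."""
    lines = signature.splitlines()
    return len(lines) <= max_lines and all(len(line) <= max_width for line in lines)
```

This only checks size; whether a signature is inflammatory or offensive still takes human judgement.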
-
(PG)
Don't attach huge files to your post. One major culprit is people
attaching their kernel .config file to their post. These can
be in excess of 1000 lines, and will grow more as kernel options are
continuously added. If the contents of your .config file are relevant
to your post then attach the output of
grep ^C .config
or
grep "=[ym]" .config
.
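For readers who want to see what that filtering keeps, here is a rough Python equivalent. It is a sketch with a hypothetical function name, and it is slightly stricter than the grep commands above: it keeps only lines of the form CONFIG_NAME=y or CONFIG_NAME=m, dropping comments and options with numeric or string values.

```python
import re


def enabled_options(config_text):
    """Return the lines that enable an option as built-in (y) or module (m)."""
    pattern = re.compile(r"^CONFIG_\w+=[ym]$")
    return [line for line in config_text.splitlines() if pattern.match(line)]
```

Joining the returned lines with newlines gives a short summary suitable for including in a post, instead of the full file.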
-
(MEA)
Some structures are forbidden as they appear to be used way too much
in SPAM mail. Specifically, messages with Content-Type:
text/html either as the only (primary) message, or as ANY of
component sub-messages are considered spam, and rejected outright
without any info to the sender.
Also, any message with header matching the regular expression:
X-Mailing-List:.*@vger.kernel.org
is considered to be LOOPING somewhere, and is thus diverted
to list-owner.
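The two checks described above can be approximated with Python's standard email module. This is a sketch of the idea only, not the filter the list server actually runs: the function name, the return values, the order of the checks and the case-insensitive header match are all assumptions.

```python
import email
import re

# Pattern quoted above; matching the header name case-insensitively
# is an assumption of this sketch.
LOOP_HEADER = re.compile(r"X-Mailing-List:.*@vger.kernel.org", re.IGNORECASE)


def classify(raw_message):
    """Return "reject", "divert" or "accept" for a raw RFC 822 message."""
    msg = email.message_from_string(raw_message)
    # Any text/html part, top-level or nested, causes rejection.
    for part in msg.walk():
        if part.get_content_type() == "text/html":
            return "reject"
    # A header matching the loop pattern means the message is looping.
    for name, value in msg.items():
        if LOOP_HEADER.match("%s: %s" % (name, value)):
            return "divert"
    return "accept"
```

Running your own outgoing message through something like this before posting is a cheap way to find out whether your mailer is quietly adding an HTML part.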
-
Is the list moderated?
-
(ADB) No, the linux-kernel list is not moderated.
-
Can I be ejected from the list?
-
(ADB) It is technically possible, but I
have never heard of anybody being ejected from the linux-kernel
list.
-
(REW) But
you will if you post questions or answers that are asked and
answered on this FAQ. ;-)
-
(MEA)
Oh definitely, all you need is a malfunctioning email system
which does not accept email to you -- e.g. check your domain backup MX
servers by using the tool at:
http://vger.kernel.org/mxverify.html
It is known that over the years the keepers of vger's lists have
removed some people after getting sufficiently annoyed with them,
but there you really must try to exceed yourself, and will likely
get lots of peer pressure before getting kicked off.
Another way to quickly get yourself removed is to use the program
called "fetchmail" -- which in itself is not all that bad, but
apparently it is far too easy to accidentally re-post email to
addresses which the visible RFC 822 headers contain -- that is, what
the original sender used, like:
To: [email protected]
The result is duplicate messages on the mailing list. If you let that
happen, you can be quite sure that your subscription will be removed
as soon as possible.
-
Are there any implicit rules on this list that I
should be aware of?
-
(ADB) Here are a few implicit rules which
you should be aware of:
-
Stick to the subject. This is a Linux kernel list, mainly for developers.
-
Use English only!
-
Don't post in HTML format! If you are using IE or Netscape, please turn
off HTML formatting for your posts to the kernel list.
-
If you use that other OS, make sure your mailer doesn't use
Charset="Windows*" as those posts will be blocked.
-
If you will be asking a question, before you post to the list, try to find
the answer in the available documentation or in the list archives. Just
remember that 99% of the questions on this list have already been answered
at least once. Usually the first answer is the most detailed, so the
archives contain far better information than you will get from somebody
who has answered the same question a dozen times or more.
- Be precise, clear and concise, whether asking a question or
making a comment or announcing a bug, posting a patch or
whatever. Post facts, avoid opinions.
- Be nice, there is no need to be rude. Avoid expressions that may
be interpreted as aggressive towards other list participants, even if
the subject being treated is particularly relevant to you and/or
controversial.
- Don't drag on with controversies. Don't try to have the last
word. You will eventually have the last word, but meanwhile you'll
have lost all your sympathy credit.
- A line of code is worth a thousand words. If you think of a new
feature, implement it first, then post to the list for comments.
- It's very easy to criticize someone else's code, but when you
write something for the first time, it's not that simple. If you find
a bug, a mistake, or something that could be perfected, don't
immediately post a comment such as "This piece of code is crap, how
did it get into the kernel?". Contact the author of the code, explain
the issue, and try to get the point across in a simple, humble way. Do
that a few times and you will get a lot of credit as a good code
debugger. Then when you write a piece of
code people will pay attention to you.
- Don't flame beginners that ask the wrong questions. This adds
noise to the list. Send them a private mail pointing them to a source
of information e.g. this FAQ.
-
(MEA) If you post HTML, your email won't
make it to the lists (see section 3.9).
-
(REG)
Don't post any religious or political material,
including in your signature. Doing it in the body of a message will
anger people, as it's always off-topic and is a waste of bandwidth
(remember that even in the 21st century, many people are still being
gouged by the second for bandwidth by their ISP or telco or both).
Including this unwanted material in your signature is less obnoxious,
but is pointless at best (preaching to the converted). Most people
will ignore it, and many will be prone to ignore the content of your
message, recognising you are a wanker. If you want to be taken
seriously, leave the soap-box at home. Limit your posts to technical
issues.
-
How do I post to the list?
-
Does the list get spammed?
-
(ADB) The linux-kernel list is no longer
spammed, you will rarely if ever find a commercial posting to the list
itself. OTOH once you post to the list, expect to get a few
undesirable mails in the following days. Unfortunately some people
watch the list and think it's a good idea to pick names from it. There
are many ways to avoid spam, check the dedicated anti spam sites on
the list. I learned many things this way.
-
(REW) Although the list maintainers do
their best to keep the list spam free, it is not possible to do this
100%. Some of the good kernel development people cannot keep up with
the volume on linux-kernel. But they do occasionally post. Therefore
we need to keep the submissions open for "everybody". Some of the
other important people have two or three Email addresses. They too
need to post from different addresses. Consequently something that
looks like a submission from a valid return address tends to go on the
list. There is nothing an automated filtering system can do about it.
The end result is about one spam a month. It happens. The maintainer
will get a flood of mail about it and he will block the domain it came
from. Please don't bother the list about it, don't add noise. Don't post "This guy is a jerk if he spams this
list". Don't post "I traced him, you can
mail bomb him at this address". Don't
post "I traced him, bother his postmaster at such and
such".
-
I am not getting any mail anymore from the list!
Is it down or what?
-
(ADB) Majordomo is an intelligent mail list
server. If for any reason your email address is unavailable, after some
retries you will be automatically unsubscribed.
-
(REW) On the other hand, accidents with
the mailing list server have happened. These have wiped out the whole
subscription list once or twice. Just resubscribe. If it turns out
that everybody had simply gone quiet, Majordomo will send you a nice
note saying you're still subscribed. Don't post "Just testing: Is
the list working? I didn't get any mail for a few days now".
-
(MEA) You may get unsubscribed because
MTAs relaying traffic to you get bounces for some reason. One thing
to verify is that your email routing data in the DNS is valid,
e.g. feed your address to the input box at:
http://vger.kernel.org/mxverify.html
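If you would rather look at the DNS data yourself, a minimal sketch using the standard host utility follows. The function names are hypothetical, and the lookup only shows which MX records are published; it does not test whether those servers actually accept mail for you.

```shell
# mail_domain ADDRESS: print the part of ADDRESS after the last "@".
mail_domain() {
    printf '%s\n' "${1##*@}"
}

# show_mx ADDRESS: list the MX records published for the address's domain.
# Needs network access and the host(1) utility.
show_mx() {
    host -t MX "$(mail_domain "$1")"
}
```

Make sure every MX listed, including the backup ones, will take mail for your address.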
-
(MEA) VGER and/or one of its fanout boxes
may be in overload. Usually system keepers notice the situation, and
it becomes fixed within 1-2 days without messages being lost, but we
don't track the entire world. Asking for help from
[email protected]
could expedite the issue. Asking for help on the lists WILL NOT help; doing
so just puts more load on the system!
-
Is there an NNTP gateway somewhere for the
mailing list?
-
I want to post a Great Idea (tm) to the list. What
should I do?
-
(REG) OK, that's great. Now:
-
First make sure that your idea is relevant to kernel development. Perhaps
your idea is better implemented in the C library, or perhaps in a new library?
Before posting to linux-kernel, be sure it really is
a kernel issue.
-
OK, so you have this great idea for the kernel. Are you sure someone hasn't
thought of it before? Reading all of this document is a good starting point.
Also search the mailing list archives to see if that
topic has been raised before.
-
Now you have verified that you have an idea nobody has suggested before.
For the best response, code up an implementation/kernel
patch and post that to the kernel list when you announce your idea.
If you provide code, you can be sure someone will try it out and give you
comments. If you don't know anything about kernel hacking, this is a good
time to start learning:-) By the time you've implemented your idea, you'll
be able to call yourself a Linux Guru.
-
If you really can't code something up, but still would like your idea implemented,
post a message to the kernel list. Be as clear and precise as possible,
so that people can understand your idea quickly. If you are lucky, someone
who likes your idea may find the time to implement it. If nobody steps
forward to implement it, you're out of luck: remember, we're all volunteers
and we all have too many things to do as it is.
-
If you get a negative response to your idea, don't get offended, after
all, we all have different notions on what is a Good Idea (tm) and a
Bad Idea (tm). If someone is rude to you, please resist the temptation
to carry on a war on the list. Instead, email them privately saying
that you don't like their rudeness. If everybody is polite, but just
strongly disagrees with your idea, be careful not to push it too
hard. If people haven't understood the point you are making, try
explaining it a different way. But if people understand your idea but
maintain it is flawed, it's time to stop pushing it. Pushing harder
will just get you ignored.
-
If you're convinced you're right, despite what everybody else says, stop
talking about it and implement it! If you're right, you'll have the last
laugh.
-
(ADB) Good code (i.e. documented, elegant,
efficient) and some benchmarking data showing your Great Idea performs
well will go a long way to show you're right.
-
There is a long thread going on about something
completely offtopic, unrelated to the kernel, and even some people who
are in the "Who's who" section of this FAQ are mingling in it. What should
I do to fight this "noise"?
-
(REW, ADB) Ignore it.
-
(REG) Don't send a response to the kernel
list under any circumstances. If you feel compelled to respond, do so
privately informing the person that the message was offtopic. Or set
up a procmail recipe to drop all messages for that thread: that way
you'll never see the thread again.
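One way to produce such a recipe is sketched below: a small helper that prints a procmail recipe delivering every message with a given string in its Subject: line to /dev/null. The helper name thread_drop_recipe is an assumption for this example. Matching on the subject is crude, since a thread whose subject gets changed will slip through.

```shell
# thread_drop_recipe SUBJECT: print a procmail recipe that discards every
# message whose Subject: line contains SUBJECT. Regular expression
# metacharacters in SUBJECT are escaped so they match literally.
thread_drop_recipe() {
    pattern=$(printf '%s' "$1" | sed 's/[][\.*^$()+?{}|]/\\&/g')
    printf ':0\n* ^Subject:.*%s\n/dev/null\n' "$pattern"
}
```

Review the output, then append it to your ~/.procmailrc ahead of the recipe that files linux-kernel mail.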
-
Can we have the Subject: line modified
to help mail filters?
-
(REG) The usual proposition is that a
string like [LINUX-KERNEL] is prepended to the subject line.
This question has been raised many times before, and the answer has
always been "no" or "there are better ways to filter email". The
majority of the developers, and all (?) of the list maintainers take
this position. Some of the reasons are:
- It would increase the size of the Subject: line. This is a problem,
as it limits the amount of useful information that can be seen in the
Subject: line, making it harder to scan through a list of subject
lines looking for interesting subjects.
- It doesn't work for cross-posted messages, as the subject line for
a single email will change depending on which list it was sent
via. Not only can this confuse simple-minded filtering recipes, it can
also break threaded mail readers (people may end up reading the same
message twice).
-
The correct way to filter is to base your recipe on the
X-Mailing-List: line, which should always have
"[email protected]".
An example procmail recipe would look like this:
# Linux-kernel list
:0: /var/lib/emacs/lock/!home!fred!mfilter!linux!kernel
* ^X-Mailing-List:.*linux-kernel@vger\.kernel\.org
/home/fred/mfilter/linux/kernel
People subscribed to [email protected], which uses
GNU Mailman, may want to use something like this:
# linux-kernel-digest
:0
* ^X-BeenThere: linux-kernel-digest@lists\.us\.dell\.com
/home/fred/mfilter/linux/kernel-digest
People using mailagent might try this in their .rules file (thanks to
Martin Smith <[email protected]>):
To CC: /[email protected]/
{ SPLIT -adi ~/Kernel }
Similarly to procmail you can omit the mail folder from the split
command. This causes the split messages to go back into the mailagent
queue for further processing.
Most mailers with filtering capabilities can be similarly
configured. If not, then you can simply install procmail. If perchance
you're running a damaged OS that can't filter properly, and there is
no procmail port for it, then you should either upgrade, or accept
that you won't be able to filter linux-kernel. Don't bother asking for
a subject line modification.
-
Can we have a Reply-To: header
automatically added to the list traffic?
-
(DW) Some mailing lists automatically add
a Reply-To: header to the mails which go through them, forcing
people to reply to the list, rather than replying personally to the
original poster. This is a bad idea for many reasons which won't be
listed here. See Chip Rosenthal's excellent summary
Reply-To: Munging Considered Harmful
for more explanation.
-
Can I post job offers/requests to the
list?
-
(REG) Of course not! This is a technical
development list, not a job exchange. If you want to join a job
exchange list, send subscribe linuxjobs in the body
of the message to
[email protected] by executing
the following command:
echo "subscribe linuxjobs" | mail [email protected]
To send messages, email to
[email protected]
-
Why do I get bounces when I send private email
to some people?
-
(REG) This could be for a variety of
reasons, such as temporary problems with mail delivery. Your email may
also be blocked (permanently rejected) by that individual or their
ISP. This often happens if you send email from a machine or domain
which is listed in the MAPS RBL, DUL and ORBS lists. These lists have
been set up to protect people against spam. See
http://www.mail-abuse.org/
and http://www.orbs.org/ for more
information on these lists.
NOTE that these lists aren't trying to
block you personally, they are trying to block known spammers or
spammer-friendly sites (RBL and ORBS), or uncontrolled dial-up users
(DUL). If you are being blocked, it probably means you have the
misfortune to be using an ISP that is not a good net citizen and thus
has been added to the RBL or ORBS lists. In some cases, you may be
blocked because your ISP has volunteered their dial-up IP address
ranges to the DUL, in which case you should be using their approved
mail relay rather than sending email out directly from your host.
You must NOT post a message to the kernel list about this, as
the people there cannot and will not help you. Nor should you use the
list as a means of getting a message through to the individual you are
trying to contact. This is not what the list is for.
If you are intent on making a fool of yourself in public, follow the
same path as too many others before you, and complain on the kernel
list about how unfair it is that you are being blocked because your
ISP is bad. Expect sympathy from some, flames from others and silence
from most. The net gain will be that your mail will still be blocked
by the anti-spam lists, many people will ignore you in future emails
(because you've made a fool of yourself), and you may find yourself in
the killfiles of some people (i.e. you personally are being
blocked because some people are fed up with you and don't want to hear
anything more from you).
If you actually want your mails to no longer be blocked, get your ISP
to clean up their act, or switch to a decent ISP. If you are required
to use your ISP's mail relay, but it is crippled somehow, complain to
your ISP or switch to one with competent staff.
If your ISP is unresponsive and you don't have an alternative ISP you
could switch to, you'll just have to accept that an increasing
fraction of people will block your email (as more and more people
subscribe to the anti-spam lists). There's no point in shouting at the
people who are defending themselves against spam (no-one is obliged to
receive any and all email), go pester the spammers instead.
-
Why don't you split the list, such as having one each
for the development and stable series?
-
(REG, by "hacksaw") It's true that the
lkml is a high traffic list and can be a lot to handle. However,
splitting the list wouldn't help, since most developers would just
subscribe to both lists. In fact, there would then be extra traffic,
because of the number of issues that hit both the development and
stable kernels, or even farther back!
Section 4 - "How do I" questions
-
How do I post a patch?
-
(ADB) I assume you made the patch following
the general instructions found above. Now write a
short post describing your patch, the version of the kernel it applies
to, your tests, the feedback you would like to get, etc. This should fit
in 10 lines. Attach your patch and a one line README file describing it
very succinctly, and mentioning your name and email (either as two ASCII
files or as a MIME encoded tarball). In the subject of your post, put:
[PATCH] <the driver name or piece of code patched>, kernel <kernel
version>. Send. Wait.
The small README file ensures that your patch will not start
circulating around the net without people noticing your name. If you
don't care about copyright and/or your patch is trivial, you can skip
tarring the files, just gzip the patch file and attach it to your
post.
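The subject line and the README-plus-patch tarball described above could be put together along these lines. This is only a sketch: the function names, the README wording and the output file name are all assumptions, not an established convention.

```shell
# patch_subject WHAT VERSION: build the subject line described above.
patch_subject() {
    printf '[PATCH] %s, kernel %s\n' "$1" "$2"
}

# bundle_patch PATCH AUTHOR SUMMARY OUT: write OUT, a gzipped tarball that
# holds the patch plus a one-line README naming the author.
bundle_patch() {
    work=$(mktemp -d) || return 1
    cp "$1" "$work/" || return 1
    printf '%s (%s)\n' "$3" "$2" > "$work/README"
    tar -czf "$4" -C "$work" . || return 1
    rm -rf "$work"
}
```

For instance, patch_subject "mydriver" "2.2.17" prints "[PATCH] mydriver, kernel 2.2.17", where mydriver stands in for whatever you patched.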
-
(PG) If your patch is on the large size
(say larger than 500 lines) consider posting a URL pointing to the
patch along with the patch description, instead of the whole patch. If
you don't have a WWW site handy to put the patch on, then at least
gzip it prior to attaching it to your post/patch description.
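The size rule can be applied mechanically. The sketch below treats 500 lines as the threshold mentioned above and gzips anything longer; the function name attachment_for is hypothetical, and posting a URL remains the better choice for large patches when you have somewhere to host them.

```shell
# attachment_for PATCH: print the name of the file to attach. Patches longer
# than 500 lines are gzipped first; shorter ones are attached as they are.
attachment_for() {
    lines=$(wc -l < "$1" | tr -d ' ')
    if [ "$lines" -gt 500 ]; then
        gzip -9 -c "$1" > "$1.gz" || return 1
        printf '%s\n' "$1.gz"
    else
        printf '%s\n' "$1"
    fi
}
```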
-
(REG) Note that Linus does not read
linux-kernel very much. So if you want him to see a patch, you will
need to send it to him directly (say by Cc:ing him if you post to the
list). Note that Linus likes to be able to read patches in plain
ASCII, so anything that is uuencoded or MIMEd is likely to go straight
to the bit-bucket. If because your patch is large you only send a URL,
send a plain-text copy to Linus privately.
Also note that Linus drops patches silently when he is too busy (which
is always:-), so if you don't see it in the next kernel patch, send it
again. Oh, and don't expect him to tell you he's applied the patch,
either.
-
How do I capture an Oops?
-
(REG, quoting Keith Owens)
If an Oops is recoverable then the text appears first in the kernel
message buffer (/proc/kmsg). You can use the dmesg command to print
the contents but most of the time klogd and syslogd will automatically
capture the Oops and write it to your log files.
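To pull just the Oops out of dmesg output or a log file, a filter along the following lines may help. It is a sketch: the start patterns ("Unable to handle kernel" and "Oops:") and the closing "Code:" line are assumptions based on what typical ix86 reports look like, and the name extract_oops is made up.

```shell
# extract_oops [FILE]: print the first Oops report found in FILE (or on
# standard input), from its opening line through the "Code:" line.
extract_oops() {
    awk '/Unable to handle kernel|Oops:/ { on = 1 }
         on { print }
         on && /Code:/ { exit }' "$@"
}
```

Typical use would be dmesg | extract_oops > oops.txt, or the same against the file your syslogd writes kernel messages to.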
Sometimes an Oops is so bad that the kernel is completely hung. When
this occurs, almost anything that requires kernel support is also
dead. In particular most interrupt driven subsystems are unusable,
especially after the dreaded "Aiee, killing interrupt
handler" message. Since most disk controllers use interrupts, no
disk I/O is possible so the Oops does not get written to the log
files. The same problem applies to logging over the network, most
network cards require interrupt handlers.
In a complete hang, you have the following options.
-
Write the Oops down by hand from the screen and type it in after you
have rebooted. This is the only option if you have not planned for a
kernel hang.
-
If you plan ahead and install a serial console linked to another
machine (read linux/Documentation/serial-console.txt) then
you can capture the Oops report on the other machine. By far the
easiest and most reliable option.
-
Since kernel 2.3.10 it has also been possible to use a parallel port
line printer as a console. You can either attach a real printer, or
another computer with EPP (Enhanced Parallel Port) support, which
pretends to be a printer.
-
There have been patches on linux-kernel to save the log somewhere in
hardware. Unfortunately these patches are very hardware
specific. Search the l-k archives for "Oops assist", "OOPS output over
reboot" and "KMSGDUMP". Most of these patches require that the
keyboard still works and even that can be useless when the kernel
hangs.
Other operating systems can save the log even when the machine hangs,
why doesn't Linux? Any OS that can save the log after a catastrophic
kernel failure must do so without kernel support, that typically means
using the underlying hardware. Alas, the ix86 hardware does not
provide enough support for this; in particular, most BIOSes clear
memory on reset, destroying any data in storage.
-
How do I post an Oops?
-
(ADB) Assuming you have found a genuine Oops
(those are rare nowadays, but they happen), you should post the relevant
portions of your system log, kernel configuration file and kernel
symbol map, plus a description of your hardware and the circumstances under
which the Oops occurred. Can the Oops be triggered by any particular method?
Did it happen after you changed any part of your hardware configuration?
Don't post your oops report before you have checked
linux/Documentation/oops-tracing.txt, the relevant paragraphs in
linux/README, the ksymoops C program in linux/scripts/ksymoops which
has another README, and the gdb man and info pages (thanks to Paul
Kimoto for this tip). These documents describe the basic procedure for
kernel oops tracing. Good trace info makes it much easier to
understand and solve apparently weird oopses.
-
(REG)
Don't even bother posting an Oops if you haven't run it through
ksymoops to decode the symbol addresses. The report will be
ignored because it contains too little useful information.
Make sure you copy the correct System.map file into
/boot or into the modules directory, otherwise you will get
incorrect results.
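A small wrapper can guard against the wrong-System.map mistake by refusing to run when the map file is missing. The sketch below assumes the ksymoops -m option for naming the map file (check the man page for your version) and a per-version file name under /boot, which is a distribution convention rather than a rule. The name decode_oops is hypothetical.

```shell
# decode_oops OOPS_FILE [MAP]: run ksymoops on a saved Oops report, but only
# if the System.map file can be read. MAP defaults to a per-version name
# under /boot (an assumption; adjust it to where your map really lives).
decode_oops() {
    map=${2:-/boot/System.map-$(uname -r)}
    if [ ! -r "$map" ]; then
        echo "decode_oops: cannot read $map" >&2
        return 1
    fi
    ksymoops -m "$map" < "$1"
}
```

The map must come from the same build as the kernel that produced the Oops, not merely from the same version number.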
-
(REG, quoting "The Doctor What")
There are some situations that make a kernel oops useless. The two
most obvious are if you are overclocking your CPU or running
VMWARE's vmmon. The reason is that overclocking can introduce
random bit errors, while VMWARE's vmmon has the ability to (and
does) change parts of the kernel. In both cases, data in the
kernel, as reported by the oops, won't be useful.
-
I think I found a bug, how do I report it?
-
(ADB) A bug differs very slightly from an
oops, actually. An oops is when the kernel detects that something has gone
wrong. A bug is when something (in the kernel, presumably) doesn't behave
the way it should, either with a driver or in some kernel algorithm. If
you can detect this misbehaviour, you may or may not be getting an oops.
Perhaps the most important step is to determine under which conditions
this misbehaviour can be triggered, and whether it is reproducible.
-
What information should go in a bug report?
-
(ADB) Does it affect system security? Is
it related to a specific driver/hardware configuration? Did you manage
to identify the piece(s) of kernel code concerned? It really depends
on the kind of bug you found.
-
(TYT) Please follow general good bug
reporting guidelines: remember, the developers don't have access to
your system, and they're not mind readers. Tell us which kernel
version, and what your hardware is (if you're not sure, more detail
is better than less). At the very least, tell us what processor and
motherboard you have, how much memory, how many and what kind of disks
(IDE, SCSI, etc.), what kind of disk controllers you have, what other
expansion boards (specify whether they're PCI or ISA or some other
bus). Also useful: what version of gcc and binutils were used to
compile the kernel.
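Part of this information can be collected automatically. The sketch below prints the kernel version, machine type, compiler and linker versions, and the first CPU and memory lines from /proc. The function name and the output layout are assumptions, and it reports the tools currently installed, which may differ from those the kernel was built with. Disks, controllers and expansion boards still have to be described by hand.

```shell
# bug_report_header: print the basic facts a bug report should start with.
bug_report_header() {
    echo "Kernel:   $(uname -r)"
    echo "Machine:  $(uname -m)"
    echo "gcc:      $(gcc --version 2>/dev/null | head -n 1)"
    echo "binutils: $(ld --version 2>/dev/null | head -n 1)"
    # First CPU description and total memory, where /proc provides them.
    if [ -r /proc/cpuinfo ]; then
        sed -n '/^model name/{p;q;}' /proc/cpuinfo
    fi
    if [ -r /proc/meminfo ]; then
        sed -n '/^MemTotal/{p;q;}' /proc/meminfo
    fi
    return 0
}
```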
Try to find a simple, reliable way to trigger the problem. Telling
the developer that they have to set up some complicated application
environment (especially if it involves some ghastly expensive
proprietary software like SAP or Oracle :-) may cause the developer to
hit the 'd' key and move on.
In general, raw data is better than jumping to conclusions. If you
want to give your guesses in your bug reports, they're of course
welcome, but this is not a substitute for raw data. Many
problems are not what they first seem. A hardware problem can
masquerade as a VM problem. A device driver or VM problem can cause
the filesystem code to notice a discrepancy, and flag a warning. Even
if you're sure that the problem isn't a hardware problem, or
some other theory that the developer advances, the scientific method
demands that you do a test to rule these sorts of things
out. Sometimes, you will get surprised.....
If you get a kernel oops message, it's useless unless you give us the
proper symbolic information. This used to mean sending relevant pieces
out of System.map. Fortunately, with the latest syslogd/klogd, this is
much simpler (check the man page of klogd to see if your version has
this feature; if it doesn't, you should upgrade to the latest version,
and probably to a modern distribution). Make sure that you have the
System.map file installed in the appropriate place so that klogd can find
it (the standard search path is in the /boot, /, and /usr/src/linux
directories).
If the system oopses and then dies without a chance for klogd to record
the information into a syslog file, copy down the oops message
exactly, and then use ksymoops (see the man page) to get the
symbolic information out. Remember, the raw numbers by themselves will
generally not be useful.
If you can, try to isolate the problem to a specific kernel version.
Knowledge that it worked in version 2.2.17, as well as 2.3.0-test6,
but it stopped working in 2.3.0-test7-pre1, is extremely helpful, and
will save developers a lot of time. (If you're comfortable dissecting
patches, feel free to take apart the individual file changes and try
to isolate the problem to a particular change.)
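Narrowing the problem down to a version is a binary search. The sketch below shows the bookkeeping only: it assumes you list the versions oldest first, that the first one works and the last one fails, and that you supply a command which succeeds when a given version works. In practice that "command" is you building and booting the kernel. The name first_bad is hypothetical.

```shell
# first_bad TEST VERSION...: print the earliest version for which TEST fails.
# Versions are given oldest first; the first is assumed to work and the last
# to fail. TEST is called with one version and must succeed if it works.
first_bad() {
    test_cmd=$1
    shift
    lo=1        # index of a version known to work
    hi=$#       # index of a version known to fail
    while [ $((hi - lo)) -gt 1 ]; do
        mid=$(( (lo + hi) / 2 ))
        eval "candidate=\${$mid}"
        if "$test_cmd" "$candidate"; then
            lo=$mid
        else
            hi=$mid
        fi
    done
    eval "candidate=\${$hi}"
    printf '%s\n' "$candidate"
}
```

With 16 candidate versions this needs at most four test boots instead of fourteen.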
-
(REG) You did of course read
REPORTING-BUGS from the kernel source tree first, didn't you?
-
I found a bug in an "old" version of the kernel,
should I report it?
-
(CP) Only if it hasn't been fixed yet.
The best thing to do is to try to repeat it with a new version of the kernel.
If you can't do that, you have to figure out if it's been fixed yet. The kernel
release announcements and patch descriptions from Jitterbug
are also useful. Failing that, look for discussion of the bug in
linux-kernel and check the patches between your kernel and the latest ones
for relevant changes.
If you can't find your bug mentioned, and you're not running a truly
ancient kernel, posting a bug report is worthwhile. You can probably
expect a request of the form "try it with the latest kernel" or "try it
with this patch" in response. If there's a reason why you can't run
the latest kernel (like it's your main dialin server and you don't want
to mess with it), saying it in your original report will save some explaining
later.
-
How do I compile the kernel?
Section 5 - "Who's who" questions
-
Who is in charge here?
-
(ADB) Do you mean "Who takes decisions relative
to the mailing list?" or do you mean "Who takes decisions relative to the
Linux kernel"? If the former: there is relatively little to decide when
it comes to the mailing list. Majordomo, once correctly set up, will manage
the list in an autonomous fashion. In any case, you can always reach the
Majordomo-owner for the list, if you have a very specific question about
the list mechanism itself. When it comes to kernel development management
and decision making, see the answer to Question 7.8
below.
-
Why don't we have a Linux Kernel Team page, same
as there are for other projects?
-
(ADB) Perhaps because there is no Linux
Kernel Team, per se. Also because so many people contributed to the
Linux kernel that it would be a tough task to set up and maintain such
a page. Finally, although this is not a rule, most Linux kernel
contributors prefer to keep a low profile, for various reasons.
-
Why doesn't <any of the below> answer my
mails? Isn't that rude?
-
(ADB) Probably because of sheer lack of
time to answer each email that gets sent to them. What would you do if
you got 1000 mails in your mailbox, from one day to the next? They
don't mean to be rude, however.
One hint: if you attach to your mail
a genuinely useful piece of good quality code that you wrote, there
are good chances that it will be answered (choose a good subject line,
too). If you ask a dozen beginner's questions, the truth is, there are
zero chances that you will get even the simplest reply pointing to
some source of information.
Aside from that, you may get "mail rejected" error messages if you
try to contact some major contributors of the list. It is due to the
spam filtering systems used by them. Please complain about it to your
ISP and don't post to the list about spam!
-
(REG) Some people also have very
aggressive mail filtering which rejects (non-list) messages from
people they don't know, asking for a re-send with a password (this
stops SPAM dead). If you mail to someone and receive such an automatic
response, don't get upset. Remember, a person's mailbox is their
personal property.
Also, some people maintain "guru lists" and only
read posts on linux-kernel by someone on their guru list, other
people's posts go to /dev/null. This is done because there
are too many questions asked on linux-kernel which shouldn't be (which
is why people should read this FAQ first!), and people can't cope with
the load. If you post to the list and want to make sure a specific
individual will see the message, Cc: that person.
-
Why do I get bounces when I send private email
to some of these people?
-
(REG) Some people, like Alan Cox, bounce
messages. Read this to find out why and what you
can do about it.
-
Who is Matti Aarnio?
-
(MEA) He is principally a ZMailer
hacker, and a co-postmaster of vger.kernel.org.
Sometimes he also finds cycles to hack on the kernel, and you
see some patches from him (e.g. the initial work on the Large
File Summit, files over 2G in size, was his).
-
Who is H. Peter Anvin?
-
Who is Donald Becker?
-
Who is Alan Cox?
-
(AC) Alan Cox supervises the 2.0.34/35/36
kernel releases, works on the Mac68K port, the SGI port, 2.0
networking, modular sound, video capture and helps collect up and sort
patches to the kernel. He gets to do all this and sleep because the
nice guys at Red Hat pay him to
hack Linux.
-
Who is Richard E. Gooch?
-
(REG himself) "I've written various
utilities and kernel patches which you can find here including the
MTRR, devfs and fastpoll patches. My PhD in Computer Science was on
the topic of
Astronomical Visualization
, which is my current research interest. This is what I work on when I
don't get distracted by kernel hacking. See my
home page to find out
more about me."
-
Who is Paul Gortmaker?
-
(ADB, OK'ed by Paul) Paul has contributed
various pieces of kernel code over the last few years, among other things
the Real Time Clock driver. He is also the maintainer of the 8390 based
network drivers (NE-2000, etc.), and wrote the Linux Ethernet HOWTO and
the Boot-Prompt HOWTO.
-
Who is Mark Lord?
-
Who is Larry McVoy?
-
Who is David S. Miller?
-
(DSM) David Miller is mainly known for the
porting work he has done, primarily for the 32-bit and 64-bit Sparc platforms
although he has made significant contributions to the MIPS effort as well.
He is also the current maintainer of the IP networking layer in the kernel
and likes to address general performance and scalability problems all over
as his time permits.
-
Who is Linus Torvalds?
-
Who is Theodore Y. Ts'o?
-
(TYTSO) Theodore
Ts'o has over the years written, rewritten, or supported Posix Job
Control, the high level tty driver, the serial driver, the ramdisk support,
e2fsck/e2fsprogs, and other bits and pieces of code in and near the kernel.
He is currently a member of the Technical Board of Linux International.
His day job at MIT is concerned with Kerberos
and other network security and I/T architecture issues. He is also a member
of the Internet Engineering Task Force,
where he serves as a member of the Security
Area Directorate.
-
Who is Rogier Wolff?
-
(REW himself) "I wrote the kmalloc that still
drives linux-2.0.x. I wrote the Specialix and Olicom device drivers. I
currently write Linux device drivers for a living. Contact
me if you need one."
Other OS developers
Rogier Wolff (REW) suggested we add a section
on OS developers who influenced/preceded the design of Linux.
-
Who is Prof. Douglas Comer?
-
(Prof. Comer) Dr. Douglas Comer is a full
professor of Computer Science at Purdue University, where he teaches courses
on operating systems and computer networks. He has written numerous research
papers and textbooks, and currently heads several networking research projects.
He has been involved in TCP/IP and internetworking since the late
1970s, and is an internationally recognized authority. He
designed and implemented X25NET and Cypress networks, and the Xinu
operating system. He is director of the Internetworking Research Group
at Purdue, editor of Software - Practice and Experience, and a former
member of the Internet Architecture Board.
Dr. Comer completed
the original version of Xinu (and wrote "The Xinu approach" book) in
1979. Since then, Xinu has been expanded and ported to a wide variety
of platforms, including: IBM PC, Macintosh, Digital Equipment
Corporation VAX and DECStation 3100, Sun Microsystems Sun 2, Sun 3 and
Sparcstations, and Intel Pentium. It has been used as the basis
for many research projects. Furthermore, Xinu has been used as
an embedded system in products by companies such as Motorola,
Mitsubishi, Hewlett-Packard, and Lexmark. There is a full TCP/IP
stack, and even the original version of Xinu (for the PDP-11)
supported arbitrary processes and network I/O.
-
Who is Richard M. Stallman?
-
(RMS) Richard
Stallman is the founder of the GNU project,
launched in 1984 to develop the free operating system GNU (an acronym for
"GNU's Not Unix"), and thereby give computer users the freedom that most
of them have lost. GNU is free software: everyone is free to copy
it and redistribute it, as well as to make changes either large or small.
Today, Linux-based variants of the GNU system, based on the kernel
Linux developed by Linus Torvalds, are in widespread use. There are
estimated to be over 10 million users of GNU/Linux systems today.
Richard Stallman is the principal author of the GNU C Compiler, a portable
optimizing compiler which was designed to support diverse architectures
and multiple languages. The compiler now supports over 30 different
architectures and 7 programming languages.
Stallman also wrote the GNU symbolic debugger (GDB), GNU Emacs, and
various other GNU programs.
Stallman received the Grace Hopper Award from the Association for Computing
Machinery for 1991 for his development of the first Emacs editor in the
1970s. In 1990 he was awarded a MacArthur Foundation fellowship,
and in 1996 an honorary doctorate from the Royal Institute of Technology
in Sweden. In 1998 he received the Electronic Frontier Foundation's
Pioneer award along with Linus Torvalds.
-
Who is Prof. Andrew Tanenbaum?
-
(Prof. Tanenbaum) Andrew S. Tanenbaum has
an S.B. degree from MIT and a Ph.D. from the University of California at
Berkeley. He is currently a Professor of Computer Science at the Vrije
Universiteit in Amsterdam, The Netherlands, where he heads the Computer
Systems Group.
His current research focuses primarily on the design of wide-area distributed
systems that scale to millions of users. These research projects have led
to over 70 refereed papers in journals and conference proceedings. He is
also the author of five books.
Prof. Tanenbaum has also produced a considerable volume of software.
He was the principal architect of the Amsterdam Compiler Kit, a widely-used
toolkit for writing portable compilers, and MINIX, a small UNIX-like operating
system for operating systems courses.
Prof. Tanenbaum is a Fellow of the ACM, a Senior Member of the IEEE,
a member of the Royal Netherlands Academy of Arts and Sciences, and winner
of the ACM Karl V. Karlstrom Outstanding Educator Award.
Section 6 - CPU questions
-
What is the "best" CPU for GNU/Linux?
-
(REW) There is no "best" CPU. The choice of
CPU always depends on your price/performance/technical requirements. On
the x86 side, we have Intel, AMD, Cyrix and IDT/Centaur, with various models
available. All of these work.
Besides the x86 processors, the Linux kernel runs on 68k processors,
MIPS R3000 and R4000, Power PC, ARM, Alpha and Sparc processors. There
are lots of different ways to build a computer around a processor. If you
have an x86, they built a PC around it. Don't go around buying second hand
R4000 computers because the Linux kernel runs on the R4000 processor. Check
the latest Linux kernel revision to see if the specific computer you're
buying is supported.
-
(ADB) OK, the Linux kernel is a good
start. Now, there is a huge difference between kernel support and a
ready-to-install distribution. Only four architectures have widely
available, reasonably homogeneous distributions: x86 (or i386), Alpha,
Sparc and Power-PC. And the Alpha and Sparc distributions that exist
still have some rough edges. IOW, if you don't want to spend a lot of
time installing and fine-tuning GNU/Linux, and you have a limited
budget, your "best" choice is an x86 machine. If you have very
specific needs (e.g. a hand-held computer running Linux, where the low
power ARM architecture would be the ideal choice, or a workstation
dedicated to scientific applications, where an Alpha or a Sparc would
provide superior performance), check the various architectures, list
your specific requirements, and make a choice. Nowadays Alpha 21164
machines are much more affordable than one or two years ago, but it's
certainly harder to put one together than your average PC clone.
-
What is the fastest CPU for GNU/Linux?
-
(REW, ADB) The CPU field is very active in
terms of technological developments. New CPU models, new architectures,
new manufacturing technologies keep pushing the state of the art. WRT GNU/Linux,
it is a general consensus that Alpha machines usually provide the best
floating point performance, when the actually shipping hardware available
at any given point in time is compared (June 1998: the 21164/600).
However for non floating point applications the issue is not as clear-cut.
Very high clock rate x86 machines (e.g. Pentium-II/400) provide impressive
integer performance, for use in e.g. databases or Web server applications.
For 3D rendering applications you may want to consider the GNU/GPL
Mesa OpenGL compatible library, which has support for some graphics accelerator
chips.
Also note that some applications are not CPU bound. Check the exact
bottleneck in your case.
-
I want to implement the Linux kernel for CPU Hyper123,
how do I get started?
-
(ADB) Is Hyper123 supported by gcc, or at
least is the Hyper architecture supported by gcc? Do you have a target
machine with a well defined architecture? If you have answered yes to
both questions proceed to REW's answer. If you have answered no to
either or both, don't even bother getting started. This is a major
project, not exactly the kind of thing you do over the
weekend. Quoting from a SparcLinux
paper by Miguel de Icaza:
"Thanks to having an international team of developers and support
people, when the first Linux/SPARC distribution on CD went out we had a
very strong port: a port that had taken only 22 months to engineer and
complete (starting from scratch up to releasing the operating system on
a bootable CD-ROM)."
-
(REW) Ouch. Difficult task. Besides having
to write support for the processor, you will also have to write the boot
sequence to get things going. And a few device drivers.
You're not running away screaming yet? Good. Make sure you get the
programmers manual for Hyper123, and data sheets for all the peripheral
IC's. Make sure you have the docs for the computer that you're working
on (addresses, registers for the stuff on the motherboard).
After that, start on learning the processor, by writing the boot program.
Try booting a simple program that says "hello world". That will also allow
you to write a console device driver.
Next, there is the hard part: get Linux to compile and run on the
processor. Make a new arch directory and start putting things in
there that implement whatever needs implementing on your
processor.
-
Why is my Cyrix 6x86/L/MX/MII detected by the kernel
as a Cx486?
-
(RRR, ADB) Cyrix 6x86 CPUs are different in
many ways from Pentium (tm) and AMD K5/K6 (tm) CPUs, so special code must
be included for adequate CPU detection, setup and reporting. Cyrix 6x86
support isn't perfect in kernels 2.0.x up to 2.0.34. From 2.0.35 on things
should get much better ('cause we're working on it ;) ). Similarly, late
2.1.1xx kernels should fully support the Cyrix CPUs. Please check the
Linux Cyrix
6x86 HOWTO site for details and patches.
-
What about those x86 CPU bugs I read about?
-
(ADB) There are basically three known bugs
that affect x86 processors, and each CPU design got its fair share it seems:
-
The Intel Pentium F00F "Death" bug, affects ALL Pentium and
Pentium MMX CPUs. Linus implemented the Intel recommended workaround for
this bug a few days after the bug was first reported in the newsgroups.
All recent kernels will report and work around the bug.
-
The AMD K6 "sig11" bug, affects only a few K6
revisions. Was diagnosed by Benoit Poulot-Cazajous. There is no
workaround, but you can get your processor exchanged by contacting
AMD. 2.2.x kernels will detect buggy K6 processors and report the
problem in the kernel boot message. Recently, a new K6 bug has been
reported on the linux-kernel list. Benoit is checking into it.
-
The Cyrix 6x86(Classic, L, MX) "Coma" bug, affects ALL Cyrix
6x86 CPUs. I proposed a simple workaround which is implemented as a user
space boot option, a few hours after the bug was reported on the linux-kernel
mailing list. See the Linux
Cyrix 6x86 HOWTO site for details. Cyrix was notified of the bug, and
their new MII CPUs are not affected by this problem anymore.
-
I grabbed the standard kernel tarball from ftp.kernel.org
or some mirror of it, and it doesn't compile on the Sparc, what gives?
-
(DSM) Often the Sparc port diverges due
to the sheer high rate of changes which occur to that port. Also
changes can happen to major interfaces in the kernel and the Sparc
port is not updated at the same time. Eventually the Sparc port
maintainers do try to merge all of their work into the standard tree,
at which time it will compile. In any event, trees which will
compile just fine are available via two mechanisms, the vger CVS tree
(accessible via read-only anonymous CVS) and pre made tarballs of
known working stable or test kernel trees. Check:
-
ftp://vger.kernel.org/pub/linux/README.CVS and
-
ftp://vger.kernel.org/pub/linux/Sparc/kernel/v2.{0,1}/
-
Does the Linux kernel execute the Halt instruction
to power down the CPU?
-
(REG, ADB) Yes. The Linux kernel will execute
the Halt instruction when the machine is idle (check the code for the idle_task
in sched.c). It has done so since the earliest i386 implementation, even
though on the i386 we didn't care about power saving; it's just that halting
the CPU is the Right Thing (tm) to do when there is no other task that
must be run.
On the Pentium, K6 and C6 CPUs, power consumption gets automatically
reduced from an average 12-24 Watts operating power down to 2-3 Watts when
the processor is Halted. On the Cyrix 6x86 CPUs, Halt state power consumption
can be further reduced down to 150 mW by enabling the Suspend-on-Halt feature.
Reduced power consumption means cooler, more reliable machine operation
and longer component life. And it saves trees too.
-
I have a non-Intel x86 CPU. What is the [best|correct]
kernel config option for my CPU?
-
(ADB) For 386 class machines, compile as a
386. For 486-class machines, compile as a 486.
For the Cyrix 6x86 family CPUs and the AMD K5 and K6, you should probably
compile the kernel as a Pentium or PPro. The only difference between the
Pentium (-M586) and PPro (-M686) compile options is in the string operations
(AFAIK). The Pentium option uses a header file that breaks down the complex
string opcodes into simpler operations (which are faster on the Intel Pentium
and Pentium MMX).
The PPro option uses the complex opcodes, but should be slightly faster
than a Pentium because the PPro has deeper, smarter pipelines.
The same rules apply to the 6x86 family and the K5/K6, but the difference
in speed is minimal between the Pentium and PPro kernel config options
on these CPUs (PPro should be slightly better).
The 486 kernel config option (-M486) should not
be used for anything above a 486-class CPU. This option sets code alignment
options that work well on the 486, but that cause excessive NOP padding
on 586 and above class machines. Usually, the 6x86 speculative execution
capabilities will just optimize this padding at run time, but the NOP opcodes
still take precious L1/L2 cache space (same applies to the K6; I am not
100% sure of what the K5 does).
The 386 config option (-M386) does not suffer from excessive padding,
but does not produce code optimized for recent x86 CPUs either, so it is
also deprecated, except for kernels included in GNU/Linux distributions
which must run on the widest possible range of machines.
-
What CPU types does Linux run on?
-
(REG) Quite a few. Below is the list for
kernel 2.4.18. Note that for some CPUs advanced development is kept
outside the mainline kernel, and changes are merged into the mainline
periodically. The WWW pages for these projects are listed as well.
Section 7 - OS questions
-
OS $toomuch has this Nice feature, so it must be
better than GNU/Linux.
-
(ADB) Sorry, but this simply means that OS
$toomuch was designed with a given set of objectives and priorities, and
GNU/Linux was designed with a different one. Neither is better than the
other and also note that I am not referring to the respective implementations.
But please, no OS comparisons on the linux-kernel list. Check the newsgroups
instead, particularly comp.os.linux.advocacy which is dedicated to that
kind of debate.
-
Why doesn't the Linux kernel have a graphical boot
screen like $toomuch OS?
-
(ADB) Because it doesn't need one. You can
add that feature to the boot loader code, if you want to. The Linux kernel
has no graphics primitives, just like any UNIX kernel.
-
The kernel in OS CTE-variant has this Nice-very-nice
feature, can I port it to the Linux kernel?
-
(ADB) Sure, you can do (almost) anything you
want with Free Software. Oh, OS CTE-variant is not Free Software?
-
How about adding feature Nice-also-very-nice to the
Linux kernel?
-
(ADB) You should probably read the
definition of creeping featurism first. Related concepts, in
increasing order of obfuscation: the KISS rule-of-thumb, the "Small is
Beautiful" concept, Occam's
Razor and Complexity Theory. A good book to read on these concepts
as they apply to OS design is "The
Mythical Man-Month" by Frederick P. Brooks, Jr.
-
Are there more bugs in later versions of the Linux
kernel, compared to earlier versions?
-
(ADB) There are no more known
bugs in later kernel versions than in earlier kernel versions. However,
the Linux kernel source code has been growing at a constant rate. As a
general rule, large pieces of code tend to have undetected bugs. OTOH,
the core code for the Linux kernel seems to have stabilized at around 16
thousand lines of C code, according to Larry McVoy.
-
(REW) I'd say more than 23 thousand lines
in 2.1.x. Add together the totals from kernel, mm,
arch/<somearch>/, subtract fpu-emulation.
-
Why does the Linux kernel source code keep getting
larger and larger?
-
(ADB) There are four causes for this unbounded
growth:
-
New architectures are implemented. This is usually OK, because the
code that is specific to each architecture is (in theory, at least) separate
from the others. Common code doesn't grow.
-
New drivers are implemented. Again, this is OK, because each driver
has different source files, and those are selectively compiled in the kernel
executable or built as modules according to the specified kernel configuration.
-
Old code gets adequately documented. Adding comments and documentation
increases the size of the source, but it's still a Good Idea (tm).
-
Creeping featurism. It's generally considered
a Bad Idea (tm) to keep adding more features to an already working piece
of code.
-
The kernel source is HUUUUGE and takes too long
to download. Couldn't it be split in various tarballs?
-
(REG) The kernel (as of 2.1.110) has
about 1.5 million lines of code in *.c, *.h and *.S files. Of those,
about 253 k lines (17%) are in the architecture-specific
subdirectories, and about 815 k lines (54%) are in
platform-independent drivers. If, like most people, you are only
interested in i386, you could save about 230 k lines by removing the
other architecture-specific trees. That is a 15% saving, which is not
that much, really. The "core" kernel and filesystems take up about 433
k lines, or around 29%.
If you want to start pruning drivers away, the problem becomes
much harder, since most of that code is architecture independent. Or
at least, is supposed to be/will be. There is some driver code which
probably should be moved to an i386-specific subdirectory, and perhaps
over time it will be (it will take a lot of work!), but you need to be
careful. PCI cards for example should be architecture
independent. Throwing out the non i386-specific drivers will save
around 97 k lines, a saving of about 6%.
But the most important argument for/against splitting the kernel
sources is not about how much space/download time you could save. It's
about the work involved for Linus or whoever will be putting together
the kernel releases. Building tarballs (compressed tarfiles) of the
whole kernel already represents a considerable amount of work;
splitting it into various architecture-dependent tarballs would
dramatically increase this effort and would probably pose serious
maintainability problems too.
If you are really desperate for a reduced kernel, set up some
automated procedure yourself, which takes the patches which are made
available, applies them to a base tree and then tars up the tree into
multiple components. Once you've done all this, make it available to
the world as a public service. There will be others who will
appreciate your efforts.
Under no circumstances should you
complain to the kernel list. I promise you that Linus and the core
developers will completely ignore such messages, so whinging about it
is a complete waste of bandwidth. The only message on this subject
that should be posted is an announcement of a new service providing
split kernel sources.
-
What are the licensing/copying terms on the Linux
kernel?
-
(RRR) In the root directory of the Linux
kernel source tree (e.g. /usr/src/linux/), you will find a file COPYING.
The file states that the Linux kernel is placed under the GNU General Public
License (version 2), a copy of which is provided. If still in doubt, post
to the appropriate forums (such as gnu.misc.discuss) or ask a lawyer, but
don't ask about it on the linux-kernel
list.
-
What are those references to "bazaar" and
"cathedral"?
-
(ADB) These terms are used to describe
two different development models adopted by the Free Software
community, and were first coined by Eric S. Raymond. You should check
his
original article.
Note that Eric's article describes two among an infinite range of
possible different development models. You could for example create
new "Versailles", "Great Wall of China" or "Pyramid of Kheops"
software development models. As long as the end result is under a
GNU/GPL license, it will still be Free Software.
-
What is this "World Domination" thing?
-
(ADB) Geek humor? Please don't take this
seriously!
This is just a way of saying that there are more and more people using
GNU/Linux all over the world i.e. that the Free
Software movement is gaining momentum. Note that the "Free" in Free
Software refers to freedom, just about
the opposite of what's implied by "World Domination".
-
(REW) This is a reference to an interview
of Linus some years ago. After being pretty modest about the success that
Linux was enjoying he concluded the interview with the remark: "Of course,
what I really want is total world domination."
I've been browsing the net for the reference for this.
http://www.ukuug.org/newsletter/63/[email protected]
and http://www.linuxgazette.com/issue15/lg_toc15.html
are close but not quite close enough.
Linus has referred back to this remark often enough.
-
What are the plans for future versions of the Linux
kernel?
-
(ADB) Linus would be the best person to ask,
but I don't know if he would have the time and patience to answer this
question. There are some development issues that can be mentioned, though:
-
PnP support in the kernel. Right now one can get PnP support using the
isapnptools user space package and manually tuning the I/O, IRQ and DMA
channel allocation, but future Linux kernels will do that for you.
-
Improved SMP support.
-
Improved 64 bit code support.
-
Improved POSIX support.
-
Improved APM support.
-
Why does it show BogoMips instead of MHz in the
kernel boot message?
-
(ADB) On some processor architectures it is
very difficult to find out the clock speed of the processor, and since
the kernel does not depend on determining the MHz rating of a processor
to operate correctly, MHz simply do not get calculated at boot time. OTOH,
BogoMips get calculated because the kernel bases itself on BogoMips data
to implement small time delays (busy loops) needed by various drivers in
different circumstances. Note that neither BogoMips nor MHz measure processor
performance in any way. See the BogoMips HOWTO by Wim van Dorst for an
accurate description of BogoMips. Also take a look at the Linux Benchmarking
HOWTO (shameless plug) if you want some basic information on Linux performance
measurements.
Sometimes your BogoMips reading will vary by as much as 30%, from one
kernel to another. This is due to changes in the alignment of the BogoMips
calibration loop, which interacts with cache behavior. Richard
B. Johnson has recently proposed a small patch that takes care of this
problem.
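The calibration idea described above can be imitated in user space.
The following is only a rough sketch, not the kernel's code: it times
a busy loop, doubles the iteration count until the loop runs for at
least a tenth of a second, and then scales loops-per-second by
500000, which is the scaling the kernel has traditionally applied.
The names delay_loop and estimate_bogomips are made up for this
illustration, and the figure it prints will not match the one in your
boot messages because the loop body differs.

```c
#include <limits.h>
#include <time.h>

/* Spin for `loops` iterations. The volatile counter stops the
 * compiler from deleting the otherwise useless loop. */
void delay_loop(unsigned long loops)
{
    for (volatile unsigned long i = 0; i < loops; i++)
        ;
}

/* Double the loop count until one run takes at least 0.1 s of CPU
 * time, then convert loops-per-second into a BogoMips-style number.
 * Returns 0.0 only if no measurable time ever elapsed. */
double estimate_bogomips(void)
{
    unsigned long loops = 1UL << 12;

    for (;;) {
        clock_t start = clock();
        delay_loop(loops);
        double secs = (double)(clock() - start) / CLOCKS_PER_SEC;

        if (secs >= 0.1 || loops > ULONG_MAX / 2)
            return secs > 0.0 ? ((double)loops / secs) / 500000.0 : 0.0;
        loops *= 2;
    }
}
```

Compiling this at different optimisation levels, or with the loop
placed at a different address, usually changes the result, which is
the same kind of sensitivity the answer above attributes to loop
alignment and cache behavior.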
-
I installed kernel x.y.z and package foo doesn't
work anymore, what should I do?
-
(RRR) Check out the /usr/src/linux/Documentation/Changes
and make sure you have the recommended versions (or newer) of the relevant
software. This is very important. A lot of
things are evolving on Linux and newer versions of the kernel may break
older packages (especially on the development kernels). If you are using
development kernels keep an eye out for reports on the kernel list. If all
else fails post a bug report (see Q/A on bug reports) to the list.
-
People talk about user space vs. kernel space. What's
the advantage of each?
-
(REG) User space is what all user (including
root) programs run in. It is fully virtual memory (i.e. normally
swappable). The X server is in user space, for example. So is your
shell. Kernel space is the domain of the kernel (wow!), device drivers
and hardware interrupt handlers. Kernel memory is non-swappable
(i.e. it's REAL RAM), and hence should be used sparingly. Also,
operations performed in kernel space are not pre-emptible: this means
other processes are prevented from running until the operation
completes.
Some people think that it's better to implement stuff in kernel
space ("so that everyone has it"). In general this is a Bad Idea (tm)
(see "creeping featurism" above), since kernel
space resources are more "heavy" than user space resources. For
example, coding a Mandelbrot fractal generator in kernel space is a
*really stupid* idea.
The job of the kernel is to provide a safe and simple interface to
hardware and give different processes a fair share of the resources,
and to arbitrate access to resources/hardware.
Many ideas are best implemented in user space, with perhaps the
absolute minimum of kernel support. The only exceptions to this
principle are where it is particularly complicated or inefficient to
implement the solution in user space only. This is why filesystems are
in the kernel (you *could* put them in user space implemented as
daemons), because a kernel implementation is *much* faster.
Note that you can make user space memory non-swappable by using
the mlock(2) system call. This is a privileged operation and
should not be used trivially.
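As an illustration of the mlock(2) remark above, here is a minimal
sketch of locking a user space buffer into RAM. The function name
fill_locked_buffer and the fill pattern are invented for the example.
Depending on your privileges and on the locked-memory resource limit,
the call may fail (typically with EPERM or ENOMEM), so the sketch
returns the errno value rather than assuming success.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>

/* Allocate `len` bytes, lock them into physical memory, use them,
 * then unlock and free. Returns 0 on success or an errno value. */
int fill_locked_buffer(size_t len)
{
    unsigned char *buf = malloc(len);
    if (buf == NULL)
        return ENOMEM;

    /* From here on the pages holding buf may not be swapped out. */
    if (mlock(buf, len) != 0) {
        int err = errno;
        free(buf);
        return err;
    }

    memset(buf, 0xA5, len);   /* stand-in for real work on the buffer */

    munlock(buf, len);        /* give the pages back to the pager */
    free(buf);
    return 0;
}
```

Locked pages are as "heavy" as kernel memory from the point of view
of the rest of the system, so the same advice applies: lock as little
as possible, for as short a time as possible.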
-
What are threads?
-
(ADB) Very briefly, threads are
"lightweight" processes (see the definition of a process in the Linux
Kernel book) that can run in parallel. This is a lot different from
standard UNIX processes that are scheduled and run in sequential
fashion. More threads information can be found here
or in the excellent Linux Parallel Processing HOWTO by Professor Hank
Dietz.
-
(REG) When people talk about threads, they
usually mean kernel threads i.e. threads
that can be scheduled by the kernel. On SMP hardware, threads allow you
to do things truly concurrently (this is particularly useful for large
computations). However, even without SMP hardware, using threads can be
good. It can allow you to divide your problems into more logical units,
and have each of those run separately. For example, you could have one
thread blocked reading from a socket while another reads something from disk.
Neither operation has to delay the other. Read "Threads Primer" by Bill
Lewis and Daniel J. Berg (Prentice Hall, ISBN 0-13-443698-9).
-
Can I use threads with GNU/Linux?
-
(REG) Yes! The Linux kernel has the clone(2)
system call, which provides the underlying mechanism for implementing a
threads library. And Xavier Leroy has provided us with LinuxThreads, a
POSIX 1003.1c implementation of threads for the Linux kernel.
If you have a libc 5 system, you'll need to install LinuxThreads if
it is not already installed. You can get the LinuxThreads library here.
If you have a libc 6 (aka glibc 2) system, you shouldn't need to do
anything. Glibc has LinuxThreads merged in.
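For a feel of what the POSIX threads interface looks like in use,
here is a small sketch that splits a summation between the calling
thread and one worker thread. The names slice, sum_slice and
parallel_sum are invented for this example; only pthread_create and
pthread_join come from the threads library. Depending on your
toolchain you may need to build with the -pthread option.

```c
#include <pthread.h>
#include <stddef.h>

/* One half of the array, plus room for that half's result. */
struct slice {
    const long *data;
    size_t len;
    long sum;
};

/* Thread entry point: add up one slice. */
static void *sum_slice(void *arg)
{
    struct slice *s = arg;

    s->sum = 0;
    for (size_t i = 0; i < s->len; i++)
        s->sum += s->data[i];
    return NULL;
}

/* Sum n values using two threads. Returns 0 and stores the total in
 * *out, or returns the error code from pthread_create. */
int parallel_sum(const long *data, size_t n, long *out)
{
    struct slice lo = { data, n / 2, 0 };
    struct slice hi = { data + n / 2, n - n / 2, 0 };
    pthread_t worker;

    int err = pthread_create(&worker, NULL, sum_slice, &hi);
    if (err != 0)
        return err;

    sum_slice(&lo);              /* the calling thread takes the first half */
    pthread_join(worker, NULL);  /* wait for the second half to finish */

    *out = lo.sum + hi.sum;
    return 0;
}
```

Each slice is written by exactly one thread and only read after
pthread_join returns, so no locking is needed here.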
-
You mean threads are implemented in kernel space
in GNU/Linux? Why not a hybrid kernel/user space implementation? Wouldn't
that be more efficient?
-
(REG) It is not clear that there is any significant
benefit for Linux to have a hybrid threading library. If we look at Solaris
Threads, they have a hybrid scheme, and claim that is an advantage. Well,
yes, I suppose so, given their environment (the Solaris 2 kernel). They
have a very heavy kernel, so a pure kernel space implementation would be
too slow (remember the time it takes to enter/exit the kernel). Linux,
on the other hand, has a very efficient kernel, so the difference between
a kernel context switch under Linux and a user space context switch under
Solaris 2 is pretty small. Also note that Solaris Threads took a long time
to get right, because of problems such as signal delivery to threads. With
a pure kernel threads implementation, signal delivery is much simpler.
Fixing the signal delivery problems with Solaris Threads increased the
complexity of their library, leading to bloat and performance loss. We
don't want to make the same mistakes.
Now, you may argue that a hybrid scheme under Linux would be even
better. Maybe. Prove it. Code it and benchmark it. In any case, this
is a discussion that is not relevant to the kernel, since a hybrid scheme
is built on top of kernel threads (Solaris 2 builds their threads on top
of LWPs (Light Weight Processes) too). It's a user space issue, so please,
keep it off the kernel list.
BTW: if you do manage to code something up and it is much faster than
pure kernel space threads, you may need some kind of extra kernel support
(depending on how you implement things). If that happens, then come
and talk about it on the kernel list.
The Linux philosophy is to optimize the kernel first, so that all possible
implementations can share the benefits.
-
Can GNU/Linux machines be clustered?
-
(REG) Different people mean different
things when they talk about clustering. Some people want transparent
fault tolerance and load balancing of general applications, others
want parallel processing of a single job. Most people who talk about
fault tolerance expect hardware and OS support of this (if a node goes
down, the OS will automatically migrate the application to another
node). This is not (yet) available for Linux.
You can write a fault tolerant application for a network of computers
without direct OS support: you just need to structure your application
appropriately. Note that a fault tolerant distributed application may
also be a parallel, multithreaded application.
The Beowulf project provides an
API and system management software to write parallel distributed
applications on a network of Linux machines. The main emphasis here is
on parallelism to get maximum processing power, although fault
tolerance is possible too. An example of a Beowulf clustered Linux
system is Avalon, which has
just been listed among the world's 500 most powerful supercomputers.
Beowulf clusters deliver GFLOPS using arrays of commodity computers.
It is an incredibly cheap and elegant way to get significant computing
power for e.g. scientific applications.
-
(ADB) Also check the Parallel Processing
HOWTO by Professor Hank Dietz.
-
(REG) In June 2000,
Mission Critical Linux
released
Kimberlite
which they describe as an "open source linux clustering
capability". Tim Burke, their Cluster Architect, describes it thus:
A Kimberlite Cluster provides support for two server nodes connected
to a shared SCSI or Fibre Channel storage subsystem, in an
active-active failover environment. The software provides the ability
to detect when either node leaves the cluster, and will automatically
trigger recovery scripts which perform the procedures necessary to
restart applications on the remaining node. When the node rejoins the
cluster, applications can be moved back to it, manually or
automatically, if required. Sample recovery scripts are
provided. Kimberlite is designed to deliver the highest levels of data
integrity and be extremely robust. It is suitable for deployment in
any environment that requires high availability for un-modified Linux
applications.
-
How well does Linux scale for SMP?
-
(REG) Reasonably well. Kernel version 2.2
has much better scalability than version 2.0. People are running 4
processor Intel Xeon machines and 14 processor UltraSparc
machines. Version 2.2 still has a global kernel lock, but this is
often released quite quickly (for example, when the process blocks
waiting for a resource and/or data), so the net result is that it is
quite unlikely for two processors to compete for the global
lock. Experiments with 14 processor UltraSparc machines show that
Linux scales well, indicating that the current locking strategy is not
hurting us for these machines.
Also consider that for parallel processing jobs, the kernel is not
involved, so even Kernel v2.0 scaled well for these applications. When
we talk about SMP scalability, we are referring to how many IO
operations the kernel can perform at the same time.
Unfortunately some hysterical NT supporters continue to spread FUD
that Linux does not scale well on SMP. Efforts to insert a bit of
truth have generally fallen on deaf ears. If someone tells you that NT
scales better than Linux, ignore them. They're operating in a
fact-free zone. Tests indicate that NT has trouble scaling to 4
processors. There really is no competition.
Note that kernel version 2.3 has replaced the remaining global kernel
lock with finer grained locking. This allows Linux to scale well to 64
processor machines and beyond.
-
Can I lock a process/thread to a CPU?
-
(RML) Yes, as of 2.5.8 the Linux kernel
supports binding a process to a particular CPU. Patches exist for the
2.4 kernel series but are not yet merged (as of 28-APR-2002). This is
called "task CPU affinity" and the interface was implemented via the
following syscalls:
int sched_setaffinity(pid_t pid, unsigned long len, unsigned long *mask)
int sched_getaffinity(pid_t pid, unsigned long len, unsigned long *mask)
which set and get a given task's CPU affinity, respectively.
Utilities for manipulating affinity and the patches for 2.4 are
available at
kernel.org. The interface allows any task's affinity to be
retrieved, although only the task's uid (or root) can change the
affinity. The calls ensure the task has been successfully scheduled
to a valid CPU before returning.
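The following sketch shows how the three-argument interface quoted
above might be driven from user space through syscall(2). It is an
illustration under assumptions, not the code from the affinity
utilities: the helper names (mask_set, mask_test,
pin_to_first_allowed_cpu) are invented, and the 1024-bit mask size is
simply assumed to be large enough for the machine. Current C
libraries also provide their own wrappers with a slightly different
prototype; check sched_setaffinity(2) on your system.

```c
#define _DEFAULT_SOURCE          /* for syscall() */
#include <stddef.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

#define MASK_BITS     1024       /* assumed upper bound on CPU count */
#define BITS_PER_WORD (8 * sizeof(unsigned long))
#define MASK_WORDS    (MASK_BITS / BITS_PER_WORD)

/* Mark CPU number `cpu` as allowed in the bitmask. */
void mask_set(unsigned long *mask, size_t cpu)
{
    mask[cpu / BITS_PER_WORD] |= 1UL << (cpu % BITS_PER_WORD);
}

/* Return 1 if CPU number `cpu` is set in the bitmask, else 0. */
int mask_test(const unsigned long *mask, size_t cpu)
{
    return (mask[cpu / BITS_PER_WORD] >> (cpu % BITS_PER_WORD)) & 1UL;
}

/* Bind the calling task to the lowest-numbered CPU it is currently
 * allowed to run on. Returns that CPU number, or -1 on failure. */
int pin_to_first_allowed_cpu(void)
{
    unsigned long mask[MASK_WORDS];

    memset(mask, 0, sizeof mask);
    /* A pid of 0 refers to the calling task. */
    if (syscall(SYS_sched_getaffinity, 0L, sizeof mask, mask) < 0)
        return -1;

    for (size_t cpu = 0; cpu < MASK_BITS; cpu++) {
        if (!mask_test(mask, cpu))
            continue;
        memset(mask, 0, sizeof mask);
        mask_set(mask, cpu);
        if (syscall(SYS_sched_setaffinity, 0L, sizeof mask, mask) < 0)
            return -1;
        return (int)cpu;
    }
    return -1;
}
```

Reading the current mask first, rather than blindly asking for CPU 0,
avoids requesting a CPU the task is not permitted to use.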
-
How efficient are threads under Linux?
-
(REG) Incredibly. Compared with all the
other kernel-based thread implementations, Linux is probably the
fastest. Each thread takes only 8 kiB of kernel memory for the stack
and thread creation and context switching is very fast. I have
measured less than 1 microsecond context switch times on an old
Pentium/MMX 200 (see
http://www.atnf.csiro.au/~rgooch/benchmarks/linux-scheduler.html
for more details). However, the Linux scheduler is designed to work
well with a small number of running threads. Best results are
obtained when the number of running threads equals the number of
processors.
Avoid the temptation to create large numbers of threads in your
application. Threads should only be used to take advantage of multiple
processors or for specialised applications (i.e. low-latency
real-time), not as a way of avoiding programmer effort (writing a
state machine or an event callback system is quite easy). A good rule
of thumb is to have up to 1.5 threads per processor and/or one thread
per RT input stream. On a single processor system, a normal
application would have at most two threads; over 10 threads is
seriously flawed, and hundreds or thousands of threads is progressively
more insane.
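The rule of thumb above is easy to encode. The following is only a sketch: the function names are made up for this example, and rounding 1.5 threads per processor up (so that a single processor gets two threads) is one reading of the rule, not a kernel policy.

```c
#include <unistd.h>

/*
 * Rule of thumb from the text: up to 1.5 threads per processor,
 * rounded up, so a single processor system gets at most two.
 */
long suggested_max_threads(long processors)
{
    if (processors < 1)
        processors = 1;
    return (3 * processors + 1) / 2;   /* ceil(1.5 * processors) */
}

/* Apply the rule to the number of processors currently online. */
long suggested_max_threads_here(void)
{
    return suggested_max_threads(sysconf(_SC_NPROCESSORS_ONLN));
}
```

An application could size its worker pool from suggested_max_threads_here() at startup and add one thread per real-time input stream if it has any.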
A common request is to modify the Linux scheduler to better handle
large numbers of running processes/threads. This is always rejected by
the kernel developer community because it is, frankly, stupid to have
large numbers of threads. Many noted and respected people will extol
the virtues of large numbers of threads. They are wrong. Some
languages and toolkits create a thread for each object, because it
fits into a particular ideology. A thread per object may be appealing
in the abstract, but is in fact inefficient in the real world. Linux
is not a good computer science project. It is, however, good
engineering. Understand the distinction, and you will understand why
many widely acclaimed ideas in computer science are held with contempt
in the Linux kernel developer community.
-
How does the Linux networking/TCP stack work?
-
(REG) The best guide may be found in the
Linux kernel sources. A popular reference is "TCP/IP Illustrated"
(volumes 1 to 3) by W. Richard Stevens, which explains much of the
theory and practice behind TCP/IP. This material is based on the BSD
implementation, which differs from Linux in fundamental ways.
Nevertheless, it is an excellent reference.
-
Can we put the networking/TCP stack into
user-space?
-
(REG) The short answer is no, because
this would slow it down (see the monolithic versus
microkernel debate for reasons why). The longer answer involves
the motivations behind the question. Some people want to inspect every
packet, and think it's easier to do in user-space. In fact, the kernel
has a network packet filtering API (Linux Socket Filter (LSF), which
is an easier-to-use implementation of the Berkeley Packet Filter
(BPF)). The LSF allows you to capture some or all packets and pass
them to user-space. This yields the advantages of a kernel-based
networking stack, but still allows you to inspect packets in
user-space if needed.
One reason people want to inspect packets is to perform
firewalling. In this case, a far superior solution is available, using
the Netfilter infrastructure.
This is a kernel-level firewalling/NAT solution which is fast and
reliable. You may create both stateful and stateless firewalling
configurations. This infrastructure was introduced during the 2.3.x
development cycle.
Section 8 - Compiler/binutils questions
-
I downloaded the newest kernel and it doesn't even
compile! What's wrong?
-
(REG) First check the kernel newsflash
page at
http://www.atnf.csiro.au/~rgooch/linux/docs/kernel-newsflash.html
where late-breaking patches may be posted.
-
(DW) Do not post any details of
the compile failure to the mailing list unless you have first checked
the archives to ensure that the question hasn't
been asked already.
Normally, if Linus allows a simple typo into a release kernel which
prevents it from compiling, a patch is posted to the list within
hours, yet still there are clueless idiots who continue to ask
about it for weeks thereafter.
Do not do this. We will find out where you live, and we
will come to your house and knock on your door at three
o'clock in the morning to ask you stupid questions. Repeatedly, if needs be.
REW's note below also says this; but evidently not explicitly
enough. Some people are just too stupid, I guess.
-
(RRR) Make sure you are compiling with
the recommended version of gcc with the default optimization flags (IOW,
leave the Makefiles alone) and a recent binutils. The binutils package
is the one that contains the assembler (gas) and linker (ld). See
Documentation/Changes for more info. If that works, then experiment
with different compilers/optimizations.
-
(REW) Linus cannot test every permutation
of drivers and options. He's a selfish little guy. He just compiles the
version that runs on his computers, and then releases it. Actually, he
sometimes doesn't even compile it before releasing. He's a busy man. Give
him a break. Wait for half a day. Someone will post a patch that will fix
it within that time. If that doesn't happen for more than a day, fix it
yourself, and post the patch to linux-kernel. If you don't have the expertise
to do this yourself, please wait for another day,
before reporting the problem.
Please check if it hasn't been reported before. Most companies have
a help desk that keeps the end users from bothering the developers. Linux
is different: You get to talk to the developers. But don't waste everybody's
time by posting stuff that has been reported already.
-
(DBE) Not all ports of the Linux kernel
to different hardware platforms are fully merged into the official
tree at kernel.org. If you have problems compiling a kernel for a
non-i386 architecture please check the related Web pages and
mailing-lists for that specific port.
-
What are the recommended compiler/binutils for
building kernels?
-
(REG) This depends on the kernel
version. Until 26-OCT-2000, gcc 2.7.2.3 was the recommended compiler
for all kernels. On this date, Linus announced that gcc 2.91.66 (aka
egcs 1.1.2) is the recommended compiler for 2.4.x kernels up to
version 2.4.9. Gcc 2.95.3 is the recommended compiler for kernel
2.4.10 and later.
The recommended binutils is 2.9.1.0.25. Avoid binutils versions from
2.8.1.0.25 to 2.9.1.0.2; these were beta releases and are known to be
buggy.
Always see the Documentation/Changes file for details.
-
Why the recommended compiler? I like
xyz-compiler better.
-
(RRR) Quick Answer: it's what Linus uses.
Real Answer: the recommended compiler has been extensively tested and
proven to be a very stable compiler. What is at issue is not whether
other compilers can optimize better, but whether they will compile the
kernel correctly. Current kernels and compilers are very complex
pieces of software. There are just too many ways that the two can
interact and cause trouble (a recent example: gcc 2.8.x and kernel
2.0.x). By keeping constant one of the variables - the compiler -
kernel developers can concentrate on the kernel. If both the compiler
and kernel are changing then it's anyone's guess what went wrong.
-
Can I compile the kernel with gcc 2.8.x, egcs, (add
your xyz compiler here)? What about optimizations? How do I get to use
-O99, etc.?
-
(RRR) Sure, it's your kernel. But if it
doesn't work, you get to fix it. Seriously now, there is really no
point in compiling a production kernel with an experimental
compiler. Production kernels should only be compiled with the
recommended compiler. Newer compilers are known to break the 2.0
series kernels; known symptoms of this breakage are hwclock and the X
server segfaulting.
Compiling a 2.0 kernel with egcs or gcc 2.8, even after applying the
workaround of copying the ioport.c file from a late 2.1 kernel to 2.0,
is not recommended and will inevitably lead to unpredictable kernel
behaviour.
Regarding 2.1 kernels, they usually compile fine with other compiler
versions, but do NOT complain to the list if you are not using the
recommended compiler. Linux developers have enough work tracking
kernel bugs without also being swamped with compiler-related bugs.
If you want to play with the optimization options, you need to hack
the Makefile in arch/i386/Makefile (assuming you have an x86 processor),
but if it breaks... well, you should know the answer by now. Also keep
in mind that some demented optimizations (such as -O99) may even produce
slower and bigger kernels, due to gratuitous loop unrolling and function
inlining.
-
(ADB) I think the standard Paul
Gortmaker disclaimer (?) is: "If it breaks, you get to keep the pieces."
;-)
-
I compiled the kernel with xyz compiler and get the
following warnings/errors/strange-behavior, should I post a bug report
to the list? Should I post a patch?
-
(RRR) In general, no, unless you get these
with the recommended compiler. A few exceptions:
Everyone welcomes code cleanup patches; for instance, newer compilers
may complain a lot more. Some of these warnings may even be warranted
(e.g. ambiguous use of else statements); fixing these is a good thing.
There could be some aging code around that makes too many assumptions
about the compiler (especially true of inline assembly), and some of
the newer compilers break these statements. Fixing these is also a
good thing, but be very sure you are really fixing a bug in the
kernel. Workarounds for other compilers will be ignored (if the
compiler is buggy, fix the compiler!).
-
Why does my kernel compilation stop at random
locations with: "Internal compiler error: program cc1 caught fatal
signal 11."?
-
(REW) Sometimes bad hardware causes this.
Read the Web page at
http://www.BitWizard.nl/sig11/
about this. The important word here is random. If it stops at the same place every
time, the kernel source might have a glitch or your compiler might be
bad. The Web page is mostly about the random
error source: hardware. There is a bunch of different error
messages that you can get if you have bad or marginal hardware.
-
(ADB) Overclocked processors very often fail
long compilations with a sig11, because a long gcc compilation puts more
strain on the processor. As the processor heats up, it may attain a point
where internal timings get out of spec. At this point, something gives
and you get a sig11.
Also, some old K6 revisions would sig11 when compiling large programs
if more than 32 MB of RAM were installed on the Linux box. AMD will exchange these
faulty processors for free. Benoit Poulot-Cazajous correctly diagnosed
the problem and devised an ingenious test for this bug that is run at boot
time in 2.2.x kernels.
-
What compiler flags should I use to compile
modules?
-
(REG) At the very least, you need these:
-O2 -DMODULE -D__KERNEL__ -DLINUX -Dlinux
-
(KO) I don't advise compiling modules by
hand if the directory is in the kernel source tree. The rest of the
Makefile system will not know about the extra modules so it will not
recompile them if the config changes nor will it install the modules.
The best method (until the 2.5 Makefile rewrite) is to add the
directory into the kernel Makefile system.
Create a kernel Makefile in your new directory. Example:
#
# Example Makefile for your own modules
#
SUB_DIRS :=
MOD_SUB_DIRS := $(SUB_DIRS)
ALL_SUB_DIRS := $(SUB_DIRS)
M_OBJS := example-module1.o example-module2.o
include $(TOPDIR)/Rules.make
Edit the Makefile in the parent directory to add your subdirectory to
the SUB_DIRS list. make dep, make modules
and make modules_install will automatically handle your
modules.
-
(VKh)
If you have a local makefile with which you wish to build a module
that is not properly linked into the kernel tree, you can still
"ride" on the master Makefile.
This way you avoid hardwiring your machine's kernel compilation
options into the local Makefile: once you reconfigure the kernel,
a local "make" will rebuild your driver with the correct set of new
flags.
This is what you can do on 2.2 (Makefile excerpt follows):
EXTRA_CFLAGS := -DDEBUG -DLINUX -I/usr/src/foo/include
MI_OBJS := your-module.o
O_TARGET := your-module.o
O_OBJS := your1.o your2.o
# Reuse Linux kernel master makefile on this directory
ifdef MAKING_MODULES
include $(TOPDIR)/Rules.make
else
all::
cd '/usr/src/linux' && make modules SUBDIRS=$(PWD)
endif
In 2.4 the syntax is different. Rename MI_OBJS to obj-m and O_OBJS
to obj-y to achieve the same goal there:
obj-m := your-module.o
O_TARGET := your-module.o
obj-y := your1.o your2.o
-
Why do I get unresolved symbols like
foo__ver_foo in modules?
-
Why do I get unresolved symbols with __bad_ in
the name?
-
(REG) This is an indication that a
function has been called with an invalid parameter. In some cases,
these invalid parameters can be detected at compile time (through
clever use of preprocessor tricks), so the preprocessor will modify
the called function name into an invalid one. This will prevent the
final link stage from completing (or will prevent the module from
loading).
OK, so now that you know why, go forth and pester the maintainer of
the section of code that is making the invalid function call. You
should check the CREDITS and MAINTAINERS files to
determine the maintainer.
Section 9 - Feature specific questions
-
GNU/Linux Y2K compliance?
-
(ADB) Y2K compliance under GNU/Linux is a
multi-level problem.
-
Applications. Check your application sources for routines
that only operate on/test the last two digits of the year field/variable(s).
Obviously the problem here is that 2000 > 1999, but 00 < 99. Unfortunately,
poor programming practices are just as common and unavoidable as death
and taxes...
-
Libraries. Libc5 and glibc are known to be Y2K compliant.
Alan Cox mentioned that libc4 had some problems.
-
Kernel. The Linux kernel is Y2K compliant. BTW the code snippet
in arch/i386/kernel/time.c will force those non-Y2K compliant RTC
implementations to the correct date on 00:00:00 Jan 1, 2000. It's been
there for quite some time now, nice and quiet; added by Alan Modra circa
1994!
-
BIOS. On x86 PC machines, upon boot some BIOSes will wrap
back to 1900; later versions will correctly wrap the RTC clock to 2000.
This is a rather critical problem in embedded systems if they are not running
Linux; if they are running Linux this is solved by Alan Modra's code snippet
mentioned above. :-)
-
Hardware. The standard PC RTC chip will not wrap the century.
Wrapping must be done in software/BIOS. The chip will store the century
data, but it just won't increment it on 00:00:00 Jan 1, 2000. Same issue
as BIOS WRT embedded systems.
Testing the kernel, the BIOS and the RTC hardware is relatively easy if
you are allowed to reboot the machine; just enter the CMOS setup routine
and set the time to Dec 31 1999, 23:58:00. Boot and check what happens.
Checking applications and libraries takes a lot more work... Especially
checking applications when you don't have access to the source code :(
The only way is simulation. But this is getting off topic: if you don't
have access to the source code, then it's not relevant to GNU/Linux. ;)
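The application-level problem described in this answer (routines that only look at the last two digits of the year) can be shown in a few lines. This is just an illustrative sketch: the function names and the "windowing" repair with a pivot year are assumptions for the example, and storing the full four-digit year is the better fix wherever that is possible.

```c
/*
 * Buggy comparison: only the last two digits are compared, so the
 * year 2000 (00) appears to come before 1999 (99).
 */
int is_later_two_digit(int yy_a, int yy_b)
{
    return yy_a > yy_b;
}

/*
 * Windowing repair: two-digit values below the pivot are taken to
 * mean 20xx, the rest 19xx.
 */
int expand_two_digit_year(int yy, int pivot)
{
    return (yy < pivot ? 2000 : 1900) + yy;
}

/* Compare two-digit years after expanding them around the pivot. */
int is_later_windowed(int yy_a, int yy_b, int pivot)
{
    return expand_two_digit_year(yy_a, pivot) >
           expand_two_digit_year(yy_b, pivot);
}
```

With a pivot of 70, is_later_windowed(0, 99, 70) correctly reports that 2000 is later than 1999, whereas is_later_two_digit(0, 99) does not.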
-
What is the maximum file size supported under ext2fs?
2 GB?
- (REW, AC) In the 2.0.x kernels,
maximum file size (not to be confused with partition sizes,
which can be much larger) under ext2fs is 2GB. Larger files are only
supported on 64-bit architectures (Alpha and UltraSPARC) in late
2.1.1xx kernels.
Files larger than 2GB are difficult to support on 32-bit architectures.
This will probably be implemented in the 2.3 kernel series.
-
GGI/KGI or the Graphics Interface in Kernel Space
debate?
-
(REG, ADB) GGI/KGI information can be
found here. The GGI/KGI
developers warn against useless debates on the kernel list.
-
How do I get more than 16 SCSI disks?
-
(REG) Get kernel version 2.2.0 or later.
-
What's devfs and why is it a Good Idea (tm)?
-
(REG) OK, pushing my own barrow
here. Devfs allows device drivers to have a direct link with device
special files (what you see in /dev). The current dependence on
major/minor numbers to provide this link poses scalability and
performance problems. Devfs also only has device nodes for devices
that you have available. Read the devfs
FAQ for more details. Note that devfs went into the official
2.3.46 kernel.
-
Linux memory management? Zone allocation?
-
(ADB) Rik van Riel has set up a nice page on Linux memory management.
It has a link to an excellent tutorial on virtual memory.
-
How many open files can I have?
-
(REG) With kernels 2.0.x you can have 256
open FDs (file descriptors). With 2.2.x you can have 1024. Various patches
exist which allow you to increase these limits. Note that this can break
select(2).
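The interaction with select(2) comes from the fixed size of fd_set. The sketch below, with hypothetical helper names, shows how a program might query its current descriptor limit and check whether a given descriptor is safe to put in an fd_set.

```c
#include <limits.h>
#include <sys/resource.h>
#include <sys/select.h>

/* Current soft limit on open file descriptors, or -1 on error. */
long open_file_limit(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return -1;
    /* Clamp "unlimited" and very large values so the result fits. */
    if (rl.rlim_cur == RLIM_INFINITY || rl.rlim_cur > (rlim_t)LONG_MAX)
        return LONG_MAX;
    return (long)rl.rlim_cur;
}

/*
 * An fd_set only has room for FD_SETSIZE descriptors. Using FD_SET()
 * on a larger descriptor writes outside the set, which is how a raised
 * limit can break select(2) users.
 */
int fd_fits_in_fd_set(int fd)
{
    return fd >= 0 && fd < FD_SETSIZE;
}
```

A program that may hold more descriptors than FD_SETSIZE should check fd_fits_in_fd_set() before calling FD_SET(), or use poll(2) instead.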
-
When will the Linux accept(2) bug be fixed?
-
(REG) Firstly, this is not a bug in the Linux
kernel, despite the fact that the "Sendmail 8.9.0 Known Bugs List" states
there is a bug with Linux accept(2). The Linux accept(2) call can return
the ETIMEDOUT error when there are system resource problems. This is not
wrong, just different from what Sendmail expects. Since accept(2) is not
part of the POSIX standard, it cannot be claimed that Linux is
violating it. I'm told that the Single UNIX Specification, Version 2
(SUSv2), which is much newer, implicitly prohibits
ETIMEDOUT. Nevertheless, the networking hackers are not inclined to
change this behaviour. They seem to prefer to follow POSIX in this,
perhaps following the maxim the great thing about standards is
that there are so many to choose from. Note also that BSD
documents slightly different behaviour from SUSv2. It is prudent for
an application to deal gracefully with unexpected error codes.
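One way for a server to deal gracefully with such errors is to sort accept(2) failures into "log it and keep listening" and "something is really wrong". The sketch below is only an assumption about a reasonable split; the function name and the exact list of error codes are choices made for this example and are not taken from Sendmail or from the kernel.

```c
#include <errno.h>

/*
 * Return 1 if an errno value from accept(2) is worth retrying after
 * (perhaps with a short pause), 0 if the listening socket itself is
 * likely unusable. The classification is a judgement call.
 */
int accept_error_is_transient(int err)
{
    switch (err) {
    case EINTR:         /* interrupted by a signal */
    case EAGAIN:        /* nothing to accept right now */
#if EWOULDBLOCK != EAGAIN
    case EWOULDBLOCK:
#endif
    case ECONNABORTED:  /* peer gave up before we accepted */
    case ETIMEDOUT:     /* the case discussed above */
    case ENOBUFS:       /* resource shortages: back off and retry */
    case ENOMEM:
    case EMFILE:
    case ENFILE:
        return 1;
    default:
        return 0;
    }
}
```

A server's accept loop would call this with errno after a failed accept(2), continue the loop when it returns 1, and only shut the listener down otherwise.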
-
What about STREAMS? I noticed Caldera has a STREAMS package, when
will that go in the kernel source proper?
-
(REG) STREAMS allow you to "push" filters
onto a network stack. The idea is that you can have a very primitive
network stream of data, and then "push" a filter ("module") that
implements TCP/IP or whatever on top of that. Conceptually, this is
very nice, as it allows clean separation of your protocol
layers. Unfortunately, implementing STREAMS poses many performance
problems. Some Unix STREAMS based server telnet implementations even
ran the data up to user space and back down again to a pseudo-tty
driver, which is very inefficient.
STREAMS will never be available
in the standard Linux kernel, it will remain a separate implementation
with some add-on kernel support (that comes with the STREAMS package).
Linus and his networking gurus are unanimous in their decision to keep
STREAMS out of the kernel. They have stated several times on the kernel
list when this topic comes up that even optional support will not be included.
-
(REW, quoting Larry McVoy) "It's too bad,
I can see why some people think they are cool, but the performance cost
- both on uniprocessors and even more so on SMP boxes - is way too high
for STREAMS to ever get added to the Linux kernel."
Please stop asking for them, we have agreement amongst the head guy,
the networking guys, and the fringe folks like myself that they aren't
going in.
-
(REG, quoting Dave Grothe, the STREAMS
guy) STREAMS is a good framework for implementing complex
and/or deep protocol stacks having nothing to do with TCP/IP, such as
SNA. It trades some efficiency for flexibility. You may find the
Linux STREAMS package (LiS) to be quite useful if you need to port
protocol drivers from Solaris or UnixWare, as Caldera did.
The Linux STREAMS (LiS) package is
available for download if you want to use STREAMS for Linux. The
following site also contains a
dissenting view, which supports STREAMS.
-
I need encryption and steganography*. Why
isn't it in the kernel?
-
(TJ) In France and Russia, strong
encryption is essentially illegal: according to
http://cwis.kub.nl/~frw/people/koops/jenc8bjk.htm, using it there
requires a license which is seldom granted. The United States has
cumbersome restrictions on exporting such software (it's considered a
"munition"--see
http://www.epic.org/crypto/export_controls/
). Having these features in the standard kernel would therefore cause
great inconvenience to people in those countries. However, separate
programs and patches to the kernel are available at:
(*) Steganography is disguising sensitive data as noise in a digitized
image, sound file, or the like.
-
How about an undelete facility in the
kernel?
-
(REG) This idea keeps being raised again
and again. There is no need for kernel support to do this. You can
easily do it in user space. There are replacement versions of the
rm utility which will move/copy files to a wastebasket area
instead of actually deleting. If you're really keen, you could
implement a wrapper for the unlink system call, and use
LD_PRELOAD to override the function for all applications. This has
been done by Manuel Arriaga and is called "libtrash". It is available
at:
http://m-arriaga.net/software/libtrash/
-
How about tmpfs for Linux?
-
(REG) The 2.4 series kernels have
introduced a tmpfs. The old SysV shared memory code has been replaced
with a new shm file-system, which is much simpler and cleaner, thanks
to the improved VFS. Since the shm code can be shared to create a
tmpfs, this was done. You may find tmpfs useful if you have an
embedded system which has the root file-system on a read-only medium
but needs a writable file-system.
-
(REG) Prior to the introduction of tmpfs,
many people asked for its development, on the grounds that it would
be faster than /tmp in a conventional file-system. This was never
considered a valid reason for tmpfs development, because the Linux
ext2 filesystem is so good that it outperforms tmpfs (memory-based
filesystems) in other operating systems. Jim Nance
has posted a comparison to linux-kernel. Here
is an extract of his message:
The original question is enough of an FAQ that I thought it would be
good to have real numbers rather than just my assurances that Linux
has a fast FS layer. Therefore I wrote a benchmarking program that
creates/writes/destroys files and ran it under several operating
systems and on several types of file systems. I have included that
program as an attachment to this mail. Here are the results:
OS                  Hardware  FS Type  Loops/Second
---------------------------------------------------
Linux 2.2.5-ac6     1         nfs             16.33
Linux 2.2.5-ac6     1         arla            73.67
Linux 2.2.5-ac6     1         ext2         15383.32
Solaris 2.6         2         afs             71.33
Solaris 2.6         2         nfs             10.00
Solaris 2.6         2         ufs             23.67
Solaris 2.6         2         tmpfs         9162.32
Digital Unix 4.0D   3         afs             49.33
Digital Unix 4.0D   3         nfs             14.67
Digital Unix 4.0D   3         ufs             28.67
Digital Unix 4.0D   3         memfs         3062.66
Linux 2.0.33        4         afs             69.33
Linux 2.0.33        4         nfs             15.00
Linux 2.0.33        4         ext2          2218.33
Hardware:
1 -> 333 MHz PII, 512M ram, Compaq WDE4360W disk
2 -> Ultra450 class Sun server (300MHz?)
3 -> Personal Workstation 600 AU. 600 MHz alpha. 1.5G ram
4 -> 75 MHz Pentium, 32M ram, Seagate ST31200N disk
Notice how Linux writing to an ext2 file system is significantly
faster than any other OS/FS combination. The next closest is Solaris
writing to tmpfs, and it's still far behind ext2. It's also good to
notice how slow both Solaris and Digital Unix are on their local file
systems. This is probably why both have a RAM-based file system.
Please note that this benchmark is intended to measure the time it
takes to create and delete files, which is expensive on most non-linux
systems. It does not indicate anything about the data I/O rate to an
existing file.
It would be interesting to see a comparison between Linux ext2fs and
tmpfs.
-
(REG, by Adam Sulmicki) If after reading
all the above you still feel you need tmpfs, and you're stuck in the
stone age with a 2.2 kernel, read on. However, keep in mind it is more
of a hack than true tmpfs.
The magic way to do it is:
-
Compile ramdisk support into the kernel. The option is:
CONFIG_BLK_DEV_RAM=y
-
Run the following command to create a 2 MB ext2 RAM disk:
/sbin/mke2fs -vm0 /dev/ram 2048
-
mount it:
mount /dev/ram /tmp
And you are done.
-
What is the maximum file size/filesystem
size?
-
(REG) Maximum file size depends on the
block size on your filesystem. For ext2 (and UFS, SysVFS and similar
filesystems), the limits are:
Block size    Maximum file size (GiBytes)
512 B         2
1 kiB         16
2 kiB         128
4 kiB         1024
8 kiB         8192 (PAGE_SIZE must be >= 8 kiB)
plus a small amount. The limitation is due to the classic
triply-indirect addressing scheme. In the future, ext2 will have
extent-based addressing, which will overcome this problem.
The limit for a single filesystem (partition) on a 32 bit CPU is 4 Gi
blocks. Each block is 512 Bytes, so that works out to 2 TiB. For 64
bit CPUs, the limits are bigger than you can imagine.
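The 2 TiB figure is just the number of addressable blocks times the block size. As a small worked example (the function name is hypothetical and the calculation assumes fewer than 64 block-number bits):

```c
#include <stdint.h>

/*
 * Largest filesystem, in bytes, that can be addressed with block
 * numbers of the given width and blocks of the given size.
 * Only meaningful for block_number_bits < 64.
 */
uint64_t max_filesystem_bytes(unsigned block_number_bits,
                              uint64_t block_bytes)
{
    return ((uint64_t)1 << block_number_bits) * block_bytes;
}
```

With 32-bit block numbers and 512-byte blocks this gives 2^32 * 2^9 = 2^41 bytes, which is the 2 TiB quoted above.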
-
Linux uses lots of swap while I still have
stuff in cache. Isn't this wrong?
-
(MRW) Not really. Linux will swap out
binaries which haven't been used for a long time (e.g. lprd on many
systems) in favour of retaining data from files which have been used
recently (e.g. header files while compiling a big program). This is
more efficient. Trust us, we're engineers.
-
Why don't we add resource forks/streams to Linux
filesystems like NT has?
-
(REG) Resource forks (aka "named
streams") are a way of storing multiple "streams" of data in a
file. Each stream may be read, written and seeked in just like in
files with only one stream of data. Resource forks are used to store
ancillary data with files (such as which icon to display for the file
when using a graphical filemanager). These extra streams of data may
be manipulated by any user who has write access to the file, just as
the "primary" stream can be manipulated.
Unix only supports one "stream" of data per file. Adding support for
multiple streams to the Linux kernel is not considered to be
especially difficult. However, files with multiple data streams would
break a large number of user-space programmes (which currently only
manipulate the "primary" stream) and protocols (such as ftp,
http, email, NFS and many more). A number of new utilities would need
to be written, and a large number of shell scripts would have to be
audited for correctness in a multiple-stream world. Because of this
massive breakage, many kernel developers consider resource forks to be
a bad idea.
Rather than add kernel support, a user-space library could be written
which provided easy management of multiple streams of data for
applications, while still storing the data in a single Unix file. If
someone wants to write such a library, please do so. Once it's
completed, send an email to the FAQ maintainer.
Note that the GNUstep/Foundation library has the NSBundle
class, which provides this functionality. A number of APIs to this
class for different languages are available:
Note that a separate problem is the storage of "extended attributes".
These are attributes like file permissions (such as ACLs and POSIX
capability sets), which have limited size, and tend to be read and
written atomically (i.e. you can't read or write part of the attribute
nor seek in it). These usually require special privileges to modify.
Also, you normally don't want to copy these attributes when copying
files around, thus these extended attributes don't present the
problems of massive breakage that resource forks would.
-
Why don't we internationalise kernel messages?
-
(REG) There are several reasons why this
should not be done:
- It would bloat the kernel sources
- It would drastically increase the cost of maintaining the
kernel message database
- Kernel message output would slow down
- English is the language in which the kernel sources are written,
and thus is the language in which kernel messages are written.
Developers cannot be expected to provide translations
- Bug reports should be submitted in English, and that includes
kernel messages. If kernel messages were to be output in some other
language, most developers could not help in fixing bugs
- Translation can be performed in user-space, there is no need to
change the kernel
Finally, it will not be done. No core
developer supports this. Neither does Linus. Don't even ask.
Section 10 - "What's changed between kernels 2.0.x and
2.2.x" questions
-
Size (source and executable)?
-
(REW) I use the following to quickly estimate
the size of a project:
cat `find . -name \*.c -o -name \*.h -o -name \*.S `| wc -l
I get 811985 (lines of code, including comments and blank lines) when
I run this on the 2.0.33 kernel source, and 1460508 when I run this on
a 2.1.106 kernel.
This means that the Linux kernel qualifies as an "extremely large"
software product, requiring the effort of 200 to 500 programmers for 5
to 10 years. [Richard Fairley: Software Engineering Concepts, McGraw-Hill,
1985, page 11].
Actually, the Linux kernel is now 7 years old, and has seriously involved
100 to 1000 programmers. (i.e. not counting those that have contributed
a "one line fix"). This is my personal guess, so feel free to disagree
or tell me otherwise.
-
(ADB) I can't compare actual kernel footprints
of 2.0.x vs. 2.1.x, but I think it's worth mentioning that 2.1.x kernels
have the ability to "jettison" kernel initialization code, freeing the
corresponding physical memory. So, even though the executable is
certainly larger for 2.1.x kernels, you may actually get a smaller memory
footprint.
-
Can I use a 2.2.x kernel with a distribution based
on a 2.0.x kernel?
-
(REW) Yes. However some applications may need
upgrading. Read /usr/src/linux/Documentation/Changes before you complain
about something not working. Also note that the 2.1.(x+1) kernel may need
a different set of upgrades than 2.1.x, so you should check the Changes
file every single time you upgrade
your Linux kernel.
-
New filesystems supported?
-
NTFS (read-only). Allows read-only access to Windows NT (tm)
partitions.
-
Coda. Coda is an advanced experimental distributed file system
with features such as server replication and disconnected operation for
laptops. Note that Coda is also available for 2.0.x kernels as an add-on
package. Check the Coda Web site
for more information.
-
Performance?
-
(REG) Here are some performance optimizations
which are only available on 2.2.x kernels:
-
MTRRs. MTRRs are registers in PPro and Pentium II CPUs
which define memory regions with distinct properties. The default mode
for PCI memory accesses is "uncacheable" which means memory and I/O
addresses on a PCI peripheral are not cached. For linear frame
buffers, a better mode is "write-combining" which allows the CPU to
re-order and slightly delay writes to memory so that they can be done
in blocks. If you are writing to the PCI bus, you then use PCI burst
mode transfers, which are a few times faster.
-
Finer grained locking. Most instances of the global SMP
spinlock have been replaced with finer grained locking. This gives
much better concurrency.
-
User buffer checks. Replaced the old, painful way of
checking if user buffers passed to syscalls were legal by a kernel
exception handler. The kernel now assumes a buffer is OK. If not, an
exception handler catches the fault and returns -EFAULT to user
space. The advantage is that legal buffers no longer need to be
carefully checked, which is much faster. The old scheme was also
suffering from race conditions under SMP.
-
New directory entry cache (dcache). This makes file lookups
much faster.
Example: time find /usr -name gcc -print
2.1.104: cold cache: 0.180u 0.460s 0:15.02 4.2% 0+0k 0+0io 85pf+0w
2.1.104: warm cache: 0.100u 0.150s 0:00.25 100.0% 0+0k 0+0io 72pf+0w
2.0.33: cold cache: 0.100u 0.660s 0:14.87 5.1% 0+0k 0+0io 85pf+0w
2.0.33: warm cache: 0.090u 0.600s 0:00.69 100.0% 0+0k 0+0io 72pf+0w
Note /usr had 17750 files/directories. We see how with a cold cache
(no disc blocks cached) there is very little difference. However, once
the cache is warm, we see a fourfold reduction in
system time. This is because inode lookups are not needed when a
dcache entry is available. Tests performed on a Pentium/MMX 200.
-
New drivers not available under 2.0.x?
-
(XXX) Please add your answer here...
-
What are those __initxxx macros?
-
(KGB) __initfunc() for example is a macro
used to put its first argument (a function) into a special ELF section
that is dropped from memory once the driver's initialization is over.
So if you write an initialization function, whose code will never
be used again after your driver is initialized, you can use
__initfunc() around its declaration in order to reduce your kernel
memory footprint by a few KB of memory. Similarly, __initdata() is
used for variables, arrays, strings, etc. For implementation details
and examples please consult the file include/linux/init.h from a 2.2.x
source tree.
The main idea here is that the kernel memory is not
swappable. Jettisoning useless code represents a nice way to save
RAM.
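The mechanism underneath is the compiler's section attribute. The following user-space sketch imitates it with a made-up macro, my_initfunc, and a made-up function, example_setup; it is modelled on the idea only, so consult include/linux/init.h for the real definitions. In an ordinary program nothing discards the section afterwards; only the kernel frees it once initialization is complete.

```c
/*
 * Hypothetical stand-in for the kernel macro: place the declared
 * function in a separate ELF section named ".text.init".
 */
#define my_initfunc(decl) decl __attribute__((__section__(".text.init")))

/* Declare the function with the section attribute attached... */
my_initfunc(int example_setup(void));

/* ...then define it as usual; the attribute carries over. */
int example_setup(void)
{
    return 42;  /* pretend this is one-time initialization work */
}
```

Running objdump -h on the resulting object file should list a .text.init section holding example_setup, separate from the regular .text section.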
-
I have seen many posts on a "Memory Rusting Effect".
Under what circumstances/why does it occur?
-
Why does ifconfig show incorrect statistics with
2.2.x kernels?
-
(TJ) This is in
linux/Documentation/Changes that comes with the kernel
sources:
"For support for new features like IPv6, upgrade to the latest
net-tools. This will also fix other problems. For example,
the format of /proc/net/dev changed; as a result, an older ifconfig
will incorrectly report errors."
-
My pseudo-tty devices don't work any more.
What happened?
-
(TJ) Support for ptys using a major number
of 4 was dropped in Linux 2.1.115. Replace your device files with
ones using the new major numbers, 2 and 3. They will work with later
1.3 versions of Linux, and any 2.x version.
-
(REG) If you use devfs, then this problem
magically goes away.
-
Can I use Unix 98 ptys?
-
(TJ, with much information provided by H. Peter Anvin)
Yes, but only if you have a kernel and libc which support them, and if
your applications are written and compiled to use them. They will
be supported by Linux 2.2 and glibc 2.1. This is in
Documentation/Changes that comes with the kernel sources.
There is also the new standalone libpt by Duncan Simpson which implements
the Unix98 PTY API independently of libc (check the Incoming directory
on metalab.unc.edu/Linux and mirrors). You still need to have your
apps compiled to use this API, of course.
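For reference, an application that is "written to use them" typically goes through the following sequence. This is a minimal sketch assuming a libc that provides the Unix98 calls posix_openpt(), grantpt(), unlockpt() and ptsname(); the helper name open_unix98_master() is made up for this example.

```c
#define _XOPEN_SOURCE 600       /* posix_openpt(), grantpt(), unlockpt(), ptsname() */
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: open a Unix98 pty master and copy the name of the
 * matching slave device into 'name'. Returns the master fd, or -1 on failure.
 */
int open_unix98_master(char *name, size_t len)
{
    int master = posix_openpt(O_RDWR | O_NOCTTY);
    const char *slave;

    if (master < 0)
        return -1;

    /* Set ownership and permissions of the slave, then unlock it. */
    if (grantpt(master) != 0 || unlockpt(master) != 0)
        goto fail;

    slave = ptsname(master);
    if (slave == NULL || strlen(slave) >= len)
        goto fail;

    strcpy(name, slave);
    return master;

fail:
    close(master);
    return -1;
}
```

The caller would then open the returned slave name (normally somewhere under /dev/pts) in the child process that is to run on the terminal.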
-
Capabilities?
-
Kernel API changes
-
(REG) Some parts of the kernel API
(programming interface) have changed from v2.0 to v2.2. This is
relevant to the authors of 3rd party device drivers, filesystems and
other code. So called "3rd party" code is any kernel code which is not
distributed with the official kernel tarball that Linus distributes. A
quick reference for programmers wishing to port their code to v2.2 is
available
here. Note that this document is not relevant for programs
running in user space.
If you want to port your drivers to the 2.4 series kernel, then read
this, which tells you how to port code from 2.2 to 2.4.
Section 11 - Primer documents
-
What's a primer document and why should I read it
first?
-
(REG) From time to time various technical
debates start on the linux kernel list. Some of these are about quite important
topics; however, these debates are often repeated every few months or so
and much of the same ground is covered each time around. Other times, questions
about how some part of the Linux kernel works are posted. Often we see
the same old questions time and time again. Don't get me wrong: these are
often reasonable questions, it's just that seeing them over and over is
something we'd rather avoid.
This section has some primer document links on various topics that
should be read before starting a debate or posing a question (which itself
can lead to a debate). This is not an attempt to censor debate, rather,
it's an attempt to get you familiar with the current arguments so that
you can contribute something new without going over old ground. If it's
just a question you have, hopefully we can explain it clearly once, in
a single document, and then point everybody to it.
-
How about having I/O completion ports?
-
(REG) The existing UNIX semantics - select(2)
and poll(2) - for polling for activity on FDs do not scale very well: the
overhead is too high with large numbers of FDs. Here is a primer
document which explains some of the problems and explores some solutions.
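To see where the overhead comes from, consider how poll(2) is used. The sketch below is only an illustration (the helper name count_readable() is made up): the complete set of descriptors is handed to the kernel on every call, and the caller then has to walk the whole array again to find the few that are active.

```c
#include <poll.h>
#include <stdlib.h>

/*
 * Hypothetical helper: return how many of the 'n' descriptors in 'fds'
 * are readable right now, or -1 on error.
 */
int count_readable(const int *fds, int n)
{
    struct pollfd *set;
    int i, ready = 0;

    if (n <= 0)
        return 0;
    set = calloc((size_t)n, sizeof(*set));
    if (set == NULL)
        return -1;

    /* The complete interest set is built and passed in on every call. */
    for (i = 0; i < n; i++) {
        set[i].fd = fds[i];
        set[i].events = POLLIN;
    }

    /* Timeout 0: report the current state without blocking. */
    if (poll(set, (nfds_t)n, 0) < 0) {
        free(set);
        return -1;
    }

    /* Every entry must be inspected, even if only one is active. */
    for (i = 0; i < n; i++)
        if (set[i].revents & POLLIN)
            ready++;

    free(set);
    return ready;
}
```

With a handful of descriptors this cost is negligible; with thousands of mostly idle descriptors, both loops and the kernel's own scan are repeated for every event.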
-
What is the VFS and how does it work?
-
(REG) The VFS (Virtual FileSystem or Virtual
Filesystem Switch, depending on who you talk to) is basically the Linux
filesystem layer. It incorporates the dentry cache and standard UNIX file
semantics. It also contains a "switch" to specific filesystem types (ext2,
vfat, iso9660 and so on), which is why Linux supports so many different
filesystems. Read this VFS
primer document if you want to know more.
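The "switch" idea can be sketched in a few lines of plain C. This is a toy illustration only: the structure, the function names and the description strings are invented here and are not the kernel's real VFS types. Each filesystem type supplies a table of function pointers, and generic code picks the table by name and calls through it.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical per-filesystem method table (not the kernel's real types). */
struct demo_fs_type {
    const char *name;
    const char *(*describe)(void);
};

static const char *demo_ext2_describe(void)    { return "native Linux filesystem"; }
static const char *demo_iso9660_describe(void) { return "CD-ROM filesystem"; }

static const struct demo_fs_type demo_fs_table[] = {
    { "ext2",    demo_ext2_describe },
    { "iso9660", demo_iso9660_describe },
};

/* The "switch": pick the implementation by type name, then call through it. */
const char *demo_fs_describe(const char *type)
{
    size_t i;

    for (i = 0; i < sizeof(demo_fs_table) / sizeof(demo_fs_table[0]); i++)
        if (strcmp(demo_fs_table[i].name, type) == 0)
            return demo_fs_table[i].describe();
    return NULL;    /* unknown filesystem type */
}
```

Supporting another filesystem in this toy means adding one more table entry; the generic code does not change.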
-
What's the Linux kernel's notion of time?
-
(ADB) I have tried to put together some
information on this topic, which you can find
here. Colin Plumb
is working on new code for the Linux kernel software clock.
-
Is there any magic in /proc/scsi that I can use
to rescan the SCSI bus?
-
(TJ) The text below is from drivers/scsi/scsi.c.
/*
 * Usage: echo "scsi add-single-device 0 1 2 3" >/proc/scsi/scsi
 * with "0 1 2 3" replaced by your "Host Channel Id Lun".
 * Consider this feature BETA.
 *     CAUTION: This is not for hotplugging your peripherals. As
 *     SCSI was not designed for this you could damage your
 *     hardware !
 * However perhaps it is legal to switch on an
 * already connected device. It is perhaps not
 * guaranteed this device doesn't corrupt an ongoing data transfer.
 */
For a typical discussion of this topic, see http://jpj.net/~trevor/linux/rescan_scsi.txt.gz.
Section 12 - Kernel Programming Questions
-
When is cli() needed?
-
(ADB) cli() is a kernel wide function
that disables maskable interrupts, whereas sti() is the equivalent
function that enables maskable interrupts. Some routines must be run
with interrupts disabled, because some peripherals need a guaranteed
access sequence, or because the routine is not reentrant and could be
reentered from an interrupt, etc. You should never use cli() in a user
space program/daemon.
-
(REW) The use of cli() is no longer
encouraged. On a single processor, this simply clears an internal CPU
flag, which is ANDed with the Maskable Interrupt Request pin. On SMP
systems it is quite troublesome to keep ALL processors from servicing
interrupts if one processor wants to do something
uninterrupted. Currently we try to do locking on a much finer
scale. For example, you should put a spinlock on the record that
describes THIS INSTANCE of the device that needs the handling without
accesses to other registers (e.g. from the interrupt routine). Besides
preventing the overhead of trying to keep the other CPUs from handling
interrupts, this allows the other CPUs to service interrupts from a
second card of the same type in the same machine.
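The per-instance locking idea can be illustrated outside the kernel. The following is a user-space analogue using C11 atomics, not kernel code: the structure and function names are made up, and a driver would use the kernel's own spinlock primitives instead. The point is that the lock lives inside the record for one card, so taking it says nothing about any other card.

```c
#include <stdatomic.h>

/* Hypothetical per-card state; each instance carries its own lock. */
struct demo_card {
    atomic_flag lock;
    unsigned long packets;
};

void demo_card_init(struct demo_card *card)
{
    atomic_flag_clear(&card->lock);
    card->packets = 0;
}

static void demo_card_lock(struct demo_card *card)
{
    /* Spin until the flag was previously clear, i.e. we now own the lock. */
    while (atomic_flag_test_and_set_explicit(&card->lock,
                                             memory_order_acquire))
        ;
}

static void demo_card_unlock(struct demo_card *card)
{
    atomic_flag_clear_explicit(&card->lock, memory_order_release);
}

/* Only this card is locked; work on a second card can proceed in parallel. */
void demo_card_rx(struct demo_card *card)
{
    demo_card_lock(card);
    card->packets++;
    demo_card_unlock(card);
}
```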
-
Why do I sometimes see a cli()-sti() pair, and
sometimes a save_flags-cli()-restore_flags sequence?
-
(RRR) The cli()-sti() pair assumes that
interrupts were enabled when execution of the code began, and thus
proceeds to reenable them at the end. The save_flags-cli-restore_flags
sequence doesn't make this assumption. Since the interrupt flag is one
of the flags saved by save_flags(), it will be correctly restored to
its previous state by restore_flags(). This is critical for code that may be called with
interrupts either on or off.
Using save_flags-cli-restore_flags does incur a very slight overhead
compared to the cli()-sti() pair, which may be significant for speed
critical code (apart from being superfluous if it's known a priori that
the code will never be called with interrupts off).
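The difference is easy to see in a toy model where a single variable stands in for the CPU's interrupt-enable flag. All names below are made up, and this runs in user space; in particular, the real save_flags() is a macro that stores into its argument rather than a function that returns a value.

```c
/* Toy model: one variable stands in for the CPU's interrupt-enable flag. */
static int demo_irq_enabled = 1;

static void demo_cli(void)                { demo_irq_enabled = 0; }
static void demo_sti(void)                { demo_irq_enabled = 1; }
static int  demo_save_flags(void)         { return demo_irq_enabled; }
static void demo_restore_flags(int flags) { demo_irq_enabled = flags; }

/* cli()-sti() style: always leaves interrupts enabled on exit. */
int demo_pair_leaves_enabled(int enabled_on_entry)
{
    demo_irq_enabled = enabled_on_entry;
    demo_cli();
    /* ... critical section ... */
    demo_sti();
    return demo_irq_enabled;
}

/* save_flags-cli-restore_flags style: puts back whatever was there. */
int demo_save_restore_leaves_enabled(int enabled_on_entry)
{
    int flags;

    demo_irq_enabled = enabled_on_entry;
    flags = demo_save_flags();
    demo_cli();
    /* ... critical section ... */
    demo_restore_flags(flags);
    return demo_irq_enabled;
}
```

If the caller had interrupts disabled on entry, the first variant hands control back with them enabled, which is exactly the surprise the second variant avoids.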
-
(REG) Note that on UP systems cli(),
sti() and restore_flags() operate immediately. However, on SMP
systems, these functions may have to wait for the global IRQ lock
(when another CPU has disabled interrupts). Other than this
difference, these functions are SMP safe. It is also safe to call
cli() multiple times on one CPU: the global IRQ lock is only grabbed
the first time.
-
Can I call printk() when interrupts are
disabled?
-
(REG) Yes, you can,
although you should be careful. Older kernels had the infamous
cli()-sti() pair in printk(), so you would get
enabled interrupts when returning from printk(), whether printk() was
called with interrupts disabled or enabled; whereas recent kernels
(e.g. 2.1.107) restore the flags when printk() is finished. You have
to know which version of the kernel you are coding for. Read the
Source, Luke. Also note that in 2.2.x kernels, printk() grabs a
spinlock for SMP machines to avoid any possible deadlocks.
-
What is the exact purpose of start_bh_atomic()
and end_bh_atomic()?
-
(REG, quoting Krzysztof G. Baranowski)
To protect your code from being
interrupted by a bottom half handler. It is mostly used in syscalls
and functions called from userspace and is better than a cli()/sti() pair,
because most of the time there is no need to mask interrupts at the
hardware level.
-
Is it safe to grab the global kernel lock
multiple times?
-
(REG) Yes. The global kernel lock is
recursive per process. That means a process can grab the lock multiple
times and not deadlock. The lock is released when
unlock_kernel() is called as many times as lock_kernel()
was called.
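The counting behaviour can be modelled in a few lines. This is a single-task toy with made-up names, not the kernel's implementation, and it does not model contention from other processes; it only shows that the lock is really taken on the outermost call and really dropped on the matching outermost unlock.

```c
/* Toy model of one process's view of a recursive lock. */
struct demo_task {
    int lock_depth;     /* how many times this task has taken the lock */
};

static int demo_lock_held;      /* 1 while the task owns the global lock */

void demo_lock_kernel(struct demo_task *t)
{
    /* Only the outermost call actually acquires the lock. */
    if (t->lock_depth++ == 0)
        demo_lock_held = 1;
}

void demo_unlock_kernel(struct demo_task *t)
{
    /* Only the matching outermost call actually releases it. */
    if (--t->lock_depth == 0)
        demo_lock_held = 0;
}

int demo_kernel_locked(void)
{
    return demo_lock_held;
}
```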
-
When do I need to initialise variables?
-
(REG) All variables should be initialised
(implicitly or explicitly) before they are read from. Automatic
variables are placed on the stack, and thus will have a random initial
value. This means that you need to manually initialise them.
Static variables are placed in the .bss section, which is initialised
to zero by the kernel (at the start of the boot sequence). If the
initial value of a static variable should be zero, you don't need to
do anything. If it should be a non-zero value, you will need to
initialise it. Note that you should not explicitly initialise a
static variable to zero, as this will increase the size of the kernel
image, which causes problems for embedded systems.
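A short sketch of the three cases, written as ordinary C with made-up names:

```c
static int demo_open_count;         /* implicitly zero: no initialiser needed */
static int demo_max_devices = 4;    /* non-zero: must be initialised explicitly */

int demo_sum_first(const int *values, int n)
{
    int total = 0;      /* automatic: would hold garbage if not initialised */
    int i;

    for (i = 0; i < n && i < demo_max_devices; i++)
        total += values[i];
    demo_open_count++;
    return total;
}

int demo_calls(void)
{
    return demo_open_count;
}
```

Here demo_open_count relies on the zeroed .bss, demo_max_devices carries its initial value in the image, and total is set by hand because it lives on the stack.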
Section 13 - Mysterious kernel messages
-
What exactly does a "Socket destroy delayed"
mean?
-
(TJ, from a post by Henner Eisen) Sometimes
you may get:
Jul 25 22:14:02 zero kernel: Socket destroy delayed (r=212
w=0)
in /var/log/messages.
It means that the kernel cannot free the internal data structures
associated with a released socket because there are still socket data
buffers (in the above case 212 bytes read memory) accounted to the
socket. For this reason, destroying is delayed and tried again
later. At some point, after the remaining sk_buffs accounted to
the socket are freed, destroying should succeed. Also:
It keeps spitting that out about every 5 seconds or so. the only
way to fix it is to reboot. It doesn't happen very often, but I'd like
to find out what's causing it.
This might indicate a problem where some kernel entity (e.g. a protocol
module or network device driver), which is responsible for freeing an sk_buff,
fails to do so. To help track down the problem, try to find out under
which circumstances the messages start to appear (in particular, which
program closed a socket right before the messages appear, which network
protocol it uses, and which network device drivers are involved).
-
What do I do about "inconsistent MTRRs"?
-
(REG) Sometimes you may get:
mtrr: your CPUs had inconsistent ... MTRR settings
mtrr: probably your BIOS does not setup all CPUs
In English, using "had" as past or past perfect tense commonly implies
that the condition no longer exists. While it isn't absolutely proper,
it is very common. The MTRRs were inconsistent, but they
aren't anymore. The kernel fixed them up. Everything is fine now.
-
Why does my kernel report lots of
"DriveStatusError BadCRC" messages?
-
Why does my kernel report lots of "APIC error"
messages?
-
(REG, contributed by Mark Hahn) You may
get messages like: APIC error on CPU1: 00(08).
APIC is the hardware that ia32 systems use to communicate between
CPUs to handle low-level events like interrupts and TLB flushes. APIC
messages are checksummed, and automatically retried when they fail.
This message indicates that a transaction failed; it's only a problem
when there are many of them. The APIC checksum is quite weak, so even
a few failures are a cause for concern, since they imply that some
corruption has likely gone undetected.
Assuming you're not forcing your motherboard to use an invalid system
clock (i.e. AGP other than 66 MHz), this is strictly a physical design
flaw in your motherboard. The Abit BP6 is notorious for this flaw,
but it's not unheard of on other boards (such as the Gigabyte BXD),
and it's possible on any board that uses APICs.
You can force the kernel not to use the APIC with the "noapic"
kernel option. This also forces CPU0 to handle all interrupts.
Section 14 - Odd kernel behaviour
-
Why is kapmd using so much CPU time?
-
(REG) Don't worry, it's not stealing
valuable CPU time from other processes. It's just consuming idle
cycles (normally charged to the idle task, which is displayed
differently in top).
Normally, when your system is idle, the system idle task is run, and
this is shown as idle time (i.e. the "unused" CPU time is not charged
to a specific process). With APM (Advanced Power Management), a
special idle task (kapmd) is required so that greater power saving
techniques can be enabled. So now, the "unused" CPU time is charged to
the kapmd task instead.
-
Why does the 2.4 kernel report Connection
refused when connecting to sites which work fine with earlier
kernels?
-
Why does the kernel now report zero shared
memory?
-
(REG, contributed by Erik Mouw) Yes, the
processes still share memory, but due to changes to the VM in 2.4 it
became too CPU intensive to calculate the total amount of shared
memory. In order not to break the userland tools, the "MemShared"
field in /proc/meminfo was set to 0.
-
Why does lsmod report a use count of -1
for some modules? Is this a bug?
-
(REW) There are several
possibilities. First:
-
(DW) No, this is not necessarily a bug.
A module may report a use count of -1 if it has a
can_unload function, which is called when
necessary by the system to determine if it is safe to unload the
module.
-
(REW) But then again, it could be a bug
anyway. In that case, you'd normally see the usage count at 0 (or more
when it's actually used), and when "something" happens, the usage may
drop below zero. If you can repeat this, please drop the driver
maintainer an email. Some modules lack the code to unload. They will
deliberately set their usage count to -1 to prevent unloading.
-
Why doesn't the kernel see all of my RAM?
-
(REG, based on contribution from Mark
Hahn) Some older distributions (such as RedHat 6.1) use a 2.2 kernel
which has not fundamentally changed since
mid-to-late 1998. Way back then, the safe thing for the kernel to do
was trust the standard bios memory detection mechanism. That bios call
returns memory size as a 16 bit count of 1 KiB chunks, leading to a 64
MiB limit. Modern kernels (2.4 is the current stable kernel) use more
modern bios calls that can detect all your memory, and even keep track
of which memory is used by the bios itself. So your best option is to
install a modern kernel. You can work around the 64 MiB limit with
obsolete kernels by telling the kernel how much memory you have, by
using the mem= boot argument. For example, if you have 128
MiB of RAM, you would type mem=128M at the lilo prompt, or
you can have lilo use the argument automatically
(add append="mem=128M" to your /etc/lilo.conf file).
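The 64 MiB figure follows directly from the width of the counter. As a worked check (function names made up for this example):

```c
#include <stdint.h>

/* Largest value a 16-bit field can hold, read as a count of 1 KiB chunks. */
uint32_t demo_max_reported_kib(void)
{
    return (uint32_t)UINT16_MAX;                    /* 65535 KiB */
}

/* Number of distinct 16-bit values times 1 KiB, expressed in MiB. */
uint32_t demo_limit_mib(void)
{
    return ((uint32_t)UINT16_MAX + 1u) / 1024u;     /* 65536 KiB / 1024 */
}
```

So the largest size the old call can report is 65535 KiB, just short of 64 MiB, no matter how much RAM is installed.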
-
I've mounted a filesystem in two different
places and it worked. Why?
-
(AV, paraphrased by William
Stearns) Because you've asked the kernel to do that. Yes, it
works. No, it's not a bug. To unmount it from either mountpoint,
simply run umount <mountpoint>. Repeat for each
mountpoint on which you do not wish the filesystem mounted.
Section 15 - Programming Religion
-
Why is the Linux kernel written in C/assembly?
-
(ADB) For many reasons, some practical, others
theoretical. The practical reasons first: when Linus began writing Linux,
what he had available was a 386, Minix (a minimal OS designed by Andrew
Tanenbaum for OS design teaching purposes) and gcc. The theoretical reasons:
some small parts of any OS kernel will always be written in assembly language,
because they are too dependent on the hardware to be coded in C; for example,
CPU and virtual memory setup. Or because we are dealing with very short
routines that must be implemented in the fastest possible code e.g. the
stubs for the "top half" interrupt handlers. WRT C, OS designers (since
Thompson and Ritchie first wrote UNIX) have traditionally used C to implement
as many OS kernel routines as possible. In this sense C can be considered
the "canonical" language for OS kernel implementation, and particularly
for UNIX variants.
-
Why don't we rewrite it all in assembly language
for processor Mega666?
-
(ADB) Basically because we wouldn't gain much
in terms of efficiency, but would lose a lot in terms of ease of maintenance
and readability of the source code. Gcc is actually quite efficient, when
we look at the assembler code generated. You are referred to Andrew Tanenbaum's
book "Structured Computer Organization", 3rd ed., pages 401-404, for a
more detailed comparison of the use of high level languages vs. assembly
language in the implementation of OS's. There are a number of references
on the subject at the end of the book, too.
-
Why don't we rewrite the Linux kernel in
C++?
-
(ADB) Again, this has to do with
practical and theoretical reasons. On the practical side, when Linux
got started gcc didn't have an efficient C++ implementation, and some
people would argue that even today it doesn't. Also there are many
more C programmers than C++ programmers around. On theoretical
grounds, examples of OS's implemented in Object Oriented languages are
rare (Java-OS and Oberon System 3 come to mind), and the advantages of
this approach are not quite clear cut (for OS design, that is; for GUI
implementation KDE is a good example
that C++ beats plain C any day).
-
(REW) In the dark old days, in the time
that most of you hadn't even heard of the word "Linux", the kernel was
once modified to be compiled under g++. That lasted for a few
revisions. People complained about the performance drop. It turned out
that compiling a piece of C code with g++ would give you worse
code. It shouldn't have made a difference, but it did. Been there,
done that.
-
(REG) Today (Nov-2000), people claim that
compiler technology has improved so that g++ is no longer a worse
compiler than gcc, and so feel this issue should be revisited. In
fact, there are five issues. These are:
-
Should the kernel use object-oriented programming techniques?
Actually, it already does. The VFS (Virtual Filesystem Switch) is a
prime example of object-oriented programming techniques. There are
objects with public and private data, methods and inheritance. This
just happens to be written in C. Another example of object-oriented
programming is Xt (the X Intrinsics Toolkit), also written in
C. What's important about object-oriented programming is the
techniques, not the languages used.
-
Should the kernel be rewritten in C++? This is likely to be a
very bad idea. It would require a very large amount of work to rewrite
the kernel (it's a large piece of code). There is no point
in just compiling the kernel with g++ and writing the odd function in
C++; this would just result in a confusing mix of C and C++
code. Either the kernel is left in C, or it's all moved to C++.
To justify the enormous effort in rewriting the kernel in C++,
significant gains would need to be demonstrated. The onus is clearly
on whoever wants to push the rewrite to C++ to show such gains.
-
Is it a good idea to write a new driver in C++?
The short answer is no, because there isn't any support for C++
drivers in the kernel.
-
Why not add a C++ interface layer to the kernel to support C++
drivers?
The short answer is why bother, since there aren't any C++ drivers for
Linux. However, if you are bold enough to consider writing a driver in
C++ and a support layer, be aware that this is unlikely to be well
received in the community. Most of the kernel developers are
unconvinced of the merits of C++ in general, and consider C++ to
generate bloated code. Also, it would result in a confusing mix of C
and C++ code in the kernel. Any C++ code in the kernel would be a
second-class citizen, as it would be ignored by most kernel developers
when changes to internal interfaces are made. A C++ support layer
would frequently be broken by such changes (as whoever is making
the changes would probably not bother fixing the C++ code to match),
and thus would require a strong commitment from someone to regularly
maintain it.
-
Can we make the kernel headers C++-friendly?
This is the first step required for supporting C++ drivers, and on the
face of it seems quite reasonable (it is not a C++ support layer). This has
the problem that C++ reserves keywords which are valid variable or
field names in C (such as private and new). Thus,
C++ is not 100% backwards compatible with C. In effect, the C++
standards bodies would be dictating what variable names we're allowed
to have. From past behaviour, the C++ standards people have not shown
a commitment to 100% backwards compatibility. The fear is that C++
will continue to expand its claim on the namespace. This would
generate an ongoing maintenance burden on the kernel developers.
Note that someone once submitted a patch which performed this
"cleaning up". It was ~250 kB in size, and was quite invasive. The
patch did not generate much enthusiasm.
Apparently, someone has had the temerity to label the above paragraph
as "a bit fuddy". So Erik Mouw did a short back-of-the-envelope
calculation to show that searching the kernel sources for possible C++
keywords is a nightmare. Here is his calculation and comments (dated
April 2002):
% find /usr/src/linux-2.4.19-pre3-rmap12h -name "*.[chS]" |\
xargs cat | wc -l
4078662
So there's over 4 million lines of kernel source. Let's assume 10% is
comments, so there's about 3.6 million lines left. Each of those lines
has to be checked for C++ keywords. Assume that you can do about 5
seconds per line (very optimistic), work 24 hours per day, and 7 days
a week:
                5 s      1 hour      1 day      1 week
3600000 lines * ---- * -------- * ---------- * -------- = 29.8 weeks
                line    3600 s     24 hours     7 days
Sounds like a nightmare to me. You can automate large parts of this,
but you'll need to write a *very* intelligent search-and-replace tool
for that. Better use that time in a more efficient way by learning C.
Note that this is the time required to do a proper manual audit of the
code. You could cheat and forgo the auditing process, and instead just
compile with C++ and fix all compiler errors, figuring that the
compiler can do most of the work. This would still be a major effort,
and has the problem that there may be uses of some C++ keyword which
don't generate a compiler error, but do generate unintended code. In
other words, introduced bugs. That is not a risk the kernel
development community is prepared to take.
My personal view is that C++ has its merits, and makes
object-oriented programming easier. However, it is a more complex
language and is less mature than C. The greatest danger with C++ is in
fact its power. It seduces the programmer, making it much easier to
write bloatware. The kernel is a critical piece of code, and must be
lean and fast. We cannot afford bloat. I think it is fair to say that
it takes more skill to write efficient C++ code than C code. Not every
contributor to the linux kernel is an uber-guru, and thus will not
know the various tricks and traps for producing efficient C++ code.
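The keyword clash mentioned above under the question about C++-friendly headers can be shown in a few lines. The structure below is invented for this example; it is perfectly valid C, but a C++ compiler rejects it because private and new are reserved words there.

```c
/* Valid C, but rejected by a C++ compiler: 'private' and 'new' are C++ keywords. */
struct demo_request {
    void *private;      /* driver-private pointer */
    int   new;          /* non-zero if the request has not been seen before */
};

int demo_request_is_new(const struct demo_request *req)
{
    return req->new != 0;
}
```

Making a header like this usable from C++ means renaming the fields, and with them every piece of code that touches the fields.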
-
Why is the Linux kernel monolithic? Why don't we
rewrite it as a microkernel?
-
(REG) The short answer is why should we?
The longer answer is that experience has shown that microkernels have
poor performance compared to monolithic kernels. Microkernels have a
fundamental design problem, where different components of the kernel
cannot interact without passing a privilege barrier (which is
expensive). Microkernel advocates claim this is a feature, as it
increases modularity and protects one part of the kernel from
another. Whether this is a feature or a mis-feature is in the eye of
the beholder, but it is clear that there is a performance cost
inherent in the microkernel design. This is a cost the Linux kernel
developers (and apparently, the users) are unwilling to bear.
There are projects which have ported the Linux kernel to generic
microkernels (such as
Mach3),
usually making Linux a "personality". There are also other projects to
create microkernel-based Unix-like implementations. Here is a short
list:
- MkLinux was funded by Apple, and runs Linux on PowerPC Macs. It is
available at:
http://www.mklinux.org/. An x86
version is also available. Note that there is now a native Linux
kernel for the PowerPC which is much faster, and is actively
maintained. MkLinux has become a historical footnote.
- The Hurd is a microkernel-based Unix, and is supposed to be the
promised GNU kernel. It sits on top
of Mach3. The Debian Project
provides a full
distribution for the
Hurd.
- FIASCO is another project for creating MicroKernel LINUX. See
http://os.inf.tu-dresden.de/fiasco/ for details.
There is a
historical Usenet thread related to this subject, dating back from
1992, with posts from Linus, Andrew Tanenbaum,
Roger Wolff, Theodore Y.
Ts'o, David Miller and others. Nice reading
on a rainy afternoon. It's fascinating to see how some predictions
(which seemed rather reasonable at the time) have proved wrong over
the years (for example, that we would all be using RISC chips by
1998).
-
Why don't we replace all the goto's with C
exceptions?
-
(REG) Admittedly, all those goto's do
look a bit ugly. However, they are usually limited to error paths, and
are used to reduce the amount of code required to perform cleanup
operations. Replacing these with Politically Correct if-then-else
blocks would require duplication of cleanup code. So switching to
if-then-else blocks might be good Computer Science theory, but using
goto's is good Engineering. Since the Linux kernel is designed to
be used, rather than to demonstrate theory, sound engineering
principles take priority.
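For readers who have not seen the idiom, here is a minimal sketch of goto-based cleanup written as ordinary user-space C. The structure, the function and the fail_at parameter (which lets a caller force a given step to fail) are all invented for this illustration. Each resource is released in exactly one place, and a failure at any step jumps to the label that undoes everything acquired so far.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical device state, used only for this illustration. */
struct demo_dev {
    char *rx_buf;
    char *tx_buf;
};

/* fail_at forces step 1, 2 or 3 to fail; 0 means no forced failure. */
int demo_dev_setup(struct demo_dev *dev, int fail_at)
{
    int err;

    dev->rx_buf = NULL;
    dev->tx_buf = NULL;

    dev->rx_buf = (fail_at == 1) ? NULL : malloc(256);
    if (dev->rx_buf == NULL) {
        err = -ENOMEM;
        goto out;
    }

    dev->tx_buf = (fail_at == 2) ? NULL : malloc(256);
    if (dev->tx_buf == NULL) {
        err = -ENOMEM;
        goto out_free_rx;
    }

    if (fail_at == 3) {         /* e.g. the hardware did not respond */
        err = -EIO;
        goto out_free_tx;
    }
    return 0;

    /* Cleanup runs in reverse order of acquisition and falls through. */
out_free_tx:
    free(dev->tx_buf);
    dev->tx_buf = NULL;
out_free_rx:
    free(dev->rx_buf);
    dev->rx_buf = NULL;
out:
    return err;
}
```

Written with nested if-then-else blocks, the free() calls would have to be repeated in every failing branch, and each new resource would add another copy.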
So now we come to the suggestion for replacing the goto's with C
exception handlers. There are two main problems with this. The first
is that C exceptions, like any other powerful abstraction, hide the
costs of what is being done. They may save lines of source code, but
can easily generate much more object code. Object code size is the
true measure of bloat. A second problem is the difficulty in
implementing C exceptions in kernel-space. This is covered in more
detail below.
-
(REG, quoting Keith Owens) The exceptions
patch has to use assembler to walk the stack frames. Exceptions are
being touted as a replacement for goto in new driver code but the
sample patch only works for i386. No arch independent code can use
exceptions until you have arch specific code that does the equivalent
of longjmp for _all_ architectures.
Doing longjmp in the kernel is _hard_, I know because I had to do it
for kdb on i386 and ia64. The kernel does things differently from
user space and sometimes the arch maintainers decide to change the
internal register usage. They are allowed to do this because it only
affects the kernel, but any change to kernel register usage will
probably require a corresponding change to setjmp/longjmp.
So you have arch dependent code which has to be done for all
architectures before any driver can use it and the code has to be kept
up to date by each arch maintainer. Tell me again why the existing
mechanisms are not working and why we need exceptions? IOW, what
existing problem justifies all the extra arch work and maintenance?
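To make the reference to longjmp concrete, this is roughly what exception-style control flow looks like in user space with the standard setjmp()/longjmp() pair. It is a sketch with made-up function names and says nothing about how the patch under discussion was written. In user space the C library supplies the architecture-specific register saving and restoring; that is the part Keith is saying would have to be written and maintained for every architecture inside the kernel.

```c
#include <setjmp.h>

static jmp_buf demo_handler;    /* where a "thrown" error lands */

/* "Throw": unwind straight back to the setjmp() in demo_try(). */
static void demo_parse(int input)
{
    if (input < 0)
        longjmp(demo_handler, 1);
    /* ... normal processing would continue here ... */
}

/* "Try/catch": returns 0 on success, -1 if demo_parse() bailed out. */
int demo_try(int input)
{
    if (setjmp(demo_handler) != 0)
        return -1;      /* reached via longjmp() */

    demo_parse(input);
    return 0;
}
```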
-
Why are the kernel developers so dismissive of
new techniques?
-
(REG) This is a complaint that is raised
periodically, usually shortly after some debate or flamewar following
on from a suggestion to use a "new" technique. Often one or more
noted kernel developers will shoot down the idea with a dismissive
"that's a dumb idea" or "all pain, no gain", without a detailed
explanation of why it's a bad idea. This does indeed look arrogant and
dismissive, and gives the impression that the kernel developers are a
pack of old dogs unwilling to learn new tricks. This perception is
compounded by proclamations made by various computer science teachers
about the positive value of the proposed new technique.
It should be noted, however, that kernel developers are exceptionally
busy people, and generally prefer to write code rather than engage in lengthy
discussions about why some idea is not good (at least for the kernel).
Further, it's fairly likely that the "new" technique that is being
proposed has already been evaluated, and found to be
inadequate/inappropriate for the kernel. Or perhaps the developer has
had prior experience with this technique and found it lacking.
If you are convinced that your favourite technique has value, you have
to prove it. You can't demand that other people spend the time
explaining to you why they think it's a bad idea. You have to do the
hard work yourself to show you're right. Code up a patch and benchmark
it compared to the standard kernel. Be prepared to defend your patch
in a broader context, and demonstrate that it doesn't have costly
side-effects. Remember that many micro-optimisations result in macro
slowdowns.
Finally, some personal advice. Coding up a controversial patch and
proving you're right is a time-consuming task. Because of this, avoid
pushing ideas which you read in a book or heard from some CS notable.
Stick to pushing ideas with which you have either had prior experience, or
which you have spent a lot of time thinking about. This will increase your
chances of picking a winner, and decrease your frustration levels.
Section 16 - User-space programming questions
-
Why does setsockopt() double SO_RCVBUF?
Contributing
Contributions are welcome on
this FAQ. These can be submitted, preferably in diff -u format,
(against this HTML document source)
by Email to Richard (see the Contributors
section above).
Sometimes, we may feel your contribution is controversial and/or
incomplete and/or could be improved somehow. Also, the turnaround
time has a wide range, from hours to months, depending on how busy
Richard is. Please do not email him to chase changes as it slows
him down. Suggestions and patches are queued, and will be
processed eventually. Acknowledgements are usually sent when the
change is made. Please be patient, FAQ updates are rarely urgent. Note
that small, "obviously correct" patches are more likely to be
processed faster, and often jump the queue ahead of larger patches.
Last updated on 19 Sep 2003 by Richard
Gooch. This document is GPL'ed by its various
contributors.