[Published in Open Source For You (OSFY) magazine, October 2013 edition.]
Sparse is a semantic parser written for static analysis of the Linux kernel. Here’s how you can use it to check kernel sources and your own C code.
Sparse implements a compiler front-end for the C programming language, and is released under the Open Software License (version 1.1). You can obtain the latest sources via git:
$ git clone git://git.kernel.org/pub/scm/devel/sparse/sparse.git
You can also install it on Fedora using the following command:
$ sudo yum install sparse
Adding ‘C=1’ to the make command when building the Linux kernel invokes Sparse on the C files that are about to be recompiled. Using ‘make C=2’ runs Sparse on all the source files, whether or not they need to be recompiled. Sparse supports a number of options that produce useful warning and error messages. To disable any warning, use the ’-Wno-option’ syntax. Consider the following example:
void
foo (void)
{
}
int
main (void)
{
foo();
return 0;
}
Running Sparse on the above decl.c file gives the following output:
$ sparse -I/usr/include/linux -I/usr/lib/gcc/x86_64-redhat-linux/4.7.2/include \
-I/usr/include decl.c
decl.c:2:1: warning: symbol 'foo' was not declared. Should it be static?
The ’-Wdecl’ option is enabled by default, and warns about any non-static variable or function that is defined without a prior declaration. You can disable it with the ’-Wno-decl’ option. To fix the warning, declare the function foo() static. Similar output was observed when Sparse was run on the Linux 3.10.9 kernel sources:
arch/x86/crypto/fpu.c:153:12: warning: symbol 'crypto_fpu_init' was not declared.
Should it be static?
While the C99 standard allows declarations after a statement, the C89 standard does not permit it. The following decl-after.c example includes a declaration after an assignment statement:
int
main (void)
{
int x;
x = 3;
int y;
return 0;
}
When the C89 standard is selected with the ’-ansi’ or ’-std=c89’ option, Sparse emits a warning, as shown below:
$ sparse -I/usr/include/linux -I/usr/lib/gcc/x86_64-redhat-linux/4.7.2/include \
-I/usr/include -ansi decl-after.c
decl-after.c:8:3: warning: mixing declarations and code
This Sparse command line step can be automated with a Makefile:
TARGET = decl-after
SPARSE_INCLUDE = -I/usr/include/linux -I/usr/lib/gcc/x86_64-redhat-linux/4.7.2/include \
	-I/usr/include
SPARSE_OPTIONS = -ansi
all:
	sparse $(SPARSE_INCLUDE) $(SPARSE_OPTIONS) $(TARGET).c
clean:
	rm -f $(TARGET) *~ a.out
Sparse can warn when a function whose return type is void returns a void-valued expression. This check is not enabled by default, and must be requested explicitly with the ’-Wreturn-void’ option. For example:
static void
foo (int y)
{
int x = 1;
x = x + y;
}
static void
fd (void)
{
return foo(3);
}
int
main (void)
{
fd();
return 0;
}
Running Sparse on the above void.c file results in the following output:
$ sparse -I/usr/include/linux -I/usr/lib/gcc/x86_64-redhat-linux/4.7.2/include \
-I/usr/include -Wreturn-void void.c
void.c:12:3: warning: returning void-valued expression
The ’-Wcast-truncate’ option warns when bits are lost while casting a constant to a narrower type. It is enabled by default. In the following example, an 8-bit char is assigned a constant that needs 16 bits:
int
main (void)
{
char i = 0xFFFF;
return 0;
}
Sparse warns of truncation for the above code:
$ sparse -I/usr/include/linux -I/usr/lib/gcc/x86_64-redhat-linux/4.7.2/include \
-I/usr/include trun.c
trun.c:4:12: warning: cast truncates bits from constant value (ffff becomes ff)
A truncation warning from Sparse for Linux 3.10.9 kernel is shown below:
arch/x86/kvm/svm.c:613:17: warning: cast truncates bits from
constant value (100000000 becomes 0)
Any incorrect assignment between enums is checked with the ’-Wenum-mismatch’ option. To disable this check, use ’-Wno-enum-mismatch’. Consider the following enum.c code:
enum e1 {a};
enum e2 {b};
int
main (void)
{
enum e1 x;
enum e2 y;
x = y;
return 0;
}
Testing with Sparse, you get the following output:
$ sparse -I/usr/include/linux -I/usr/lib/gcc/x86_64-redhat-linux/4.7.2/include \
-I/usr/include enum.c
enum.c:10:7: warning: mixing different enum types
enum.c:10:7: int enum e2 versus
enum.c:10:7: int enum e1
Similar Sparse warnings can also be seen for Linux 3.10.9:
drivers/leds/leds-lp3944.c:292:23: warning: mixing different enum types
drivers/leds/leds-lp3944.c:292:23: int enum led_brightness versus
drivers/leds/leds-lp3944.c:292:23: int enum lp3944_status
NULL is of pointer type, while the number 0 is of integer type. Any assignment of a plain 0 to a pointer is flagged by the ’-Wnon-pointer-null’ option. This warning is enabled by default. An integer pointer ‘p’ is set to zero in the following example:
int
main (void)
{
int *p = 0;
return 0;
}
Sparse reports the use of 0 as a NULL pointer:
$ sparse -I/usr/include/linux -I/usr/lib/gcc/x86_64-redhat-linux/4.7.2/include \
-I/usr/include nullp.c
nullp.c:4:12: warning: Using plain integer as NULL pointer
Given below is another example of this warning in Linux 3.10.9:
arch/x86/kvm/vmx.c:8057:48: warning: Using plain integer as NULL pointer
The corresponding source code on line number 8057 contains:
vmx->nested.apic_access_page = 0;
The GNU Compiler Collection (GCC) has an old, non-standard syntax for initialisation of fields in structures or unions:
static struct
{
int x;
} local = { x: 0 };
int
main (void)
{
return 0;
}
Sparse issues a warning when it encounters this syntax, and recommends the use of the C99 syntax:
$ sparse -I/usr/include/linux -I/usr/lib/gcc/x86_64-redhat-linux/4.7.2/include \
-I/usr/include old.c
old.c:4:13: warning: obsolete struct initializer, use C99 syntax
This warning, controlled by the ’-Wold-initializer’ option, is also enabled by default. The ’-Wdo-while’ option checks whether the body of a do-while loop is missing its braces:
int
main (void)
{
int x = 0;
do
x = 3;
while (0);
return 0;
}
On running while.c with Sparse, you get:
$ sparse -I/usr/include/linux -I/usr/lib/gcc/x86_64-redhat-linux/4.7.2/include \
-I/usr/include -Wdo-while while.c
while.c:7:5: warning: do-while statement is not a compound statement
This option is not enabled by default. The correct use of the do-while construct is as follows:
int
main (void)
{
int x = 0;
do {
x = 3;
} while (0);
return 0;
}
An undefined identifier used in a preprocessor conditional can be detected with the ’-Wundef’ option. This must be specified explicitly. The identifier FOO is not defined in the following undef.c code:
#if FOO
#endif
int
main (void)
{
return 0;
}
On running Sparse on undef.c, the following warning is shown:
$ sparse -I/usr/include/linux -I/usr/lib/gcc/x86_64-redhat-linux/4.7.2/include \
-I/usr/include -Wundef undef.c
undef.c:1:5: warning: undefined preprocessor identifier 'FOO'
The use of parenthesised strings in array initialisation is detected with the ’-Wparen-string’ option:
int
main (void)
{
char x1[] = { ("hello") };
return 0;
}
Sparse warns of parenthesised string initialisation for the above code:
$ sparse -I/usr/include/linux -I/usr/lib/gcc/x86_64-redhat-linux/4.7.2/include \
-I/usr/include -Wparen-string paren.c
paren.c:4:18: warning: array initialized from parenthesized string constant
paren.c:4:18: warning: too long initializer-string for array of char
The ’-Wsparse-all’ option enables all warnings, except those specified with ’-Wno-option’. The width of a tab can be specified with the ’-ftabstop=WIDTH’ option; it is set to 8 by default. Setting it to the tab width used in your editor makes the column numbers in errors and warnings match what you see on screen.
You can refer to the following manual page for more available options:
$ man sparse