C/C++ Code Checking Tools

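In the previous article, I briefly introduced the tools commonly used to check and debug C/C++ code. In this article, I test how the following four tools handle common errors in simple C programs that I wrote to be deliberately wrong:

  • clang-tidy
  • cppcheck
  • AddressSanitizer
  • valgrind memcheck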

The test environment is:

Arch Linux 5.13.6-arch1-1

clang --version

clang version 13.0.0 (/startdir/llvm-project 5cd63e9ec2a385de2682949c0bbe928afaf35c91)
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin

clang-tidy --version

LLVM (http://llvm.org/):
LLVM version 13.0.0
Optimized build.
Default target: x86_64-pc-linux-gnu
Host CPU: znver1

Cppcheck 2.5
flawfinder 2.0.17
valgrind-3.17.0

The tests were run with the following commands:

clang-tidy "--checks=*"  $filename >> /tmp/analyze
cppcheck --enable=all $filename 2>> /tmp/analyze
clang -g -fsanitize=address -fno-omit-frame-pointer $filename -o /tmp/123.out && /tmp/123.out 2>> /tmp/analyze
clang -g $filename -o /tmp/123.out && valgrind --tool=memcheck /tmp/123.out 2>> /tmp/analyze

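Warning: The code snippets in this article should not be used as reference material for learning the language or its coding style. They are written to reproduce common programming errors and to see how the tools analyze them. For how to correct them, consult learning materials for the language.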

Array Out-of-Bounds Access

A simple out-of-bounds access

int main() {
    char a[10];
    a[10] = 0; // out of bounds
}

This is a very common programming error. Here is what each tool reports:

clang-tidy

arr01.c:3:5: warning: array index 10 is past the end of the array (which contains 10 elements) [clang-diagnostic-array-bounds]
a[10] = 0;
^ ~~
arr01.c:2:5: note: array 'a' declared here
char a[10];
^

cppcheck
arr01.c:3:6: error: Array 'a[10]' accessed at index 10, which is out of bounds. [arrayIndexOutOfBounds]
a[10] = 0;
^
arr01.c:3:11: style: Variable 'a[10]' is assigned a value that is never used. [unreadVariable]
a[10] = 0;
^

AddressSanitizer
stack-buffer-overflow on address 0x7fffd170488a at pc 0x000000500c07 bp 0x7fffd1704850 sp 0x7fffd1704848
WRITE of size 1 at 0x7fffd170488a thread T0
#0 0x500c06 in main arr01.c:3:11
#1 0x7faeb9712b24 in __libc_start_main /usr/src/debug/glibc-2.33/csu/../csu/libc-start.c:332:16
#2 0x41f12d in _start /usr/src/debug/glibc-2.33/csu/../sysdeps/x86_64/start.S:120

Address 0x7fffd170488a is located in stack of thread T0 at offset 42 in frame
#0 0x500b1f in main arr01.c:1

This frame has 1 object(s):
[32, 42) 'a' (line 2) <== Memory access at offset 42 overflows this variable
HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork
(longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-buffer-overflow arr01.c:3:11 in main

valgrind memcheck did not report anything useful.

Out-of-bounds access through a pointer

int main() {
    char a[10];
    char *pr = a;
    pr[10] = 10; // out of bounds
    pr = a + 5;
    pr[5] = 10; // out of bounds
    pr[-1] = 2;
    pr[-6] = 7; // out of bounds
}

clang-tidy did not report anything related to the out-of-bounds accesses.
cppcheck reported no errors.

Neither static analysis tool produced anything useful.
AddressSanitizer

stack-buffer-overflow on address 0x7fffcf8cde6a at pc 0x000000500c1f bp 0x7fffcf8cde30 sp 0x7fffcf8cde28
WRITE of size 1 at 0x7fffcf8cde6a thread T0
#0 0x500c1e in main arr03.c:4:12
#1 0x7f6f26625b24 in __libc_start_main /usr/src/debug/glibc-2.33/csu/../csu/libc-start.c:332:16
#2 0x41f12d in _start /usr/src/debug/glibc-2.33/csu/../sysdeps/x86_64/start.S:120

Address 0x7fffcf8cde6a is located in stack of thread T0 at offset 42 in frame
#0 0x500b1f in main arr03.c:1

This frame has 1 object(s):
[32, 42) 'a' (line 2) <== Memory access at offset 42 overflows this variable
HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork
(longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-buffer-overflow arr03.c:4:12 in main

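valgrind memcheck did not report anything useful either.

In this experiment, merely accessing the array through a pointer instead of by name was enough to evade most of the analysis tools.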

Out-of-bounds access in a two-dimensional array

int main() {
    char arr[5][5];
    arr[4][6] = 0; // out of bounds
    arr[5][4] = 0; // out of bounds
    arr[-1][0] = 0; // out of bounds
    arr[0][-1] = 0; // out of bounds
}

clang-tidy

arr17.c:3:5: warning: array index 6 is past the end of the array (which contains 5 elements) [clang-diagnostic-array-bounds]
arr[4][6] = 0;
^ ~
arr17.c:2:5: note: array 'arr' declared here
char arr[5][5];
^
arr17.c:4:5: warning: array index 5 is past the end of the array (which contains 5 elements) [clang-diagnostic-array-bounds]
arr[5][4] = 0;
^ ~
arr17.c:2:5: note: array 'arr' declared here
char arr[5][5];
^
arr17.c:5:5: warning: array index -1 is before the beginning of the array [clang-diagnostic-array-bounds]
arr[-1][0] = 0;
^ ~~
arr17.c:2:5: note: array 'arr' declared here
char arr[5][5];
^
arr17.c:6:5: warning: array index -1 is before the beginning of the array [clang-diagnostic-array-bounds]
arr[0][-1] = 0;
^ ~~
arr17.c:2:5: note: array 'arr' declared here
char arr[5][5];
^

cppcheck
arr17.c:3:8: error: Array 'arr[5][5]' accessed at index arr[4][6], which is out of bounds. [arrayIndexOutOfBounds]
arr[4][6] = 0;
^
arr17.c:4:8: error: Array 'arr[5][5]' accessed at index arr[5][4], which is out of bounds. [arrayIndexOutOfBounds]
arr[5][4] = 0;
^
arr17.c:5:8: error: Array 'arr[5][5]' accessed at index arr[-1][*], which is out of bounds. [negativeIndex]
arr[-1][0] = 0;
^
arr17.c:6:8: error: Array 'arr[5][5]' accessed at index arr[*][-1], which is out of bounds. [negativeIndex]
arr[0][-1] = 0;
^
arr17.c:6:16: style: Variable 'arr[0][-1]' is assigned a value that is never used. [unreadVariable]
arr[0][-1] = 0;
^

AddressSanitizer
AddressSanitizer: stack-buffer-overflow on address 0x7ffef126b89a at pc 0x000000500c20 bp 0x7ffef126b850 sp 0x7ffef126b848
WRITE of size 1 at 0x7ffef126b89a thread T0
#0 0x500c1f in main arr17.c:3:15
#1 0x7f7522d69b24 in __libc_start_main /usr/src/debug/glibc-2.33/csu/../csu/libc-start.c:332:16
#2 0x41f12d in _start /usr/src/debug/glibc-2.33/csu/../sysdeps/x86_64/start.S:120

Address 0x7ffef126b89a is located in stack of thread T0 at offset 58 in frame
#0 0x500b1f in main arr17.c:1

This frame has 1 object(s):
[32, 57) 'arr' (line 2) <== Memory access at offset 58 overflows this variable
HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork
(longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-buffer-overflow arr17.c:3:15 in main

valgrind memcheck did not report anything useful.

Out-of-bounds access with strings

#include <limits.h>
#include <stdio.h>
#include <string.h>
int main() {
    unsigned int n = UINT_MAX;
    char buf[8];
    sprintf(buf, "%u", n); // overflow

    char *str = "hello world";
    strcpy(buf, str); // overflow
}

clang-tidy

arr04.c:7:5: warning: Call to function 'sprintf' is insecure as it does not provide security checks introduced in the C11 standard. Replace with analogous functions that support length arguments or provides boundary checks such as 'sprintf_s' in case of C11 [clang-analyzer-security.insecureAPI.DeprecatedOrUnsafeBufferHandling]
sprintf(buf, "%u", n);
^~~~~~~
arr04.c:7:5: note: Call to function 'sprintf' is insecure as it does not provide security checks introduced in the C11 standard. Replace with analogous functions that support length arguments or provides boundary checks such as 'sprintf_s' in case of C11
sprintf(buf, "%u", n);
^~~~~~~
arr04.c:10:5: warning: Call to function 'strcpy' is insecure as it does not provide bounding of the memory buffer. Replace unbounded copy functions with analogous functions that support length arguments such as 'strlcpy'. CWE-119 [clang-analyzer-security.insecureAPI.strcpy]
strcpy(buf, str);
^~~~~~
arr04.c:10:5: note: Call to function 'strcpy' is insecure as it does not provide bounding of the memory buffer. Replace unbounded copy functions with analogous functions that support length arguments such as 'strlcpy'. CWE-119
strcpy(buf, str);
^~~~~~

clang-tidy only warned that sprintf and strcpy are insecure APIs; it did not report that either call actually overflows buf.

cppcheck

arr04.c:10:12: error: Buffer is accessed out of bounds: buf [bufferAccessOutOfBounds]
strcpy(buf, str);
^

cppcheck found only the second error and missed the first.

AddressSanitizer

stack-buffer-overflow on address 0x7ffc1fa69e48 at pc 0x000000498768 bp 0x7ffc1fa69d20 sp 0x7ffc1fa694d0
WRITE of size 11 at 0x7ffc1fa69e48 thread T0
#0 0x498767 in __interceptor_vsprintf (/tmp/123.out+0x498767)
#1 0x498ad6 in sprintf (/tmp/123.out+0x498ad6)
#2 0x500bf1 in main arr04.c:7:5
#3 0x7fba9700eb24 in __libc_start_main /usr/src/debug/glibc-2.33/csu/../csu/libc-start.c:332:16
#4 0x41f12d in _start /usr/src/debug/glibc-2.33/csu/../sysdeps/x86_64/start.S:120

Address 0x7ffc1fa69e48 is located in stack of thread T0 at offset 40 in frame
#0 0x500b1f in main arr04.c:4

This frame has 1 object(s):
[32, 40) 'buf' (line 6) <== Memory access at offset 40 overflows this variable
HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork
(longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-buffer-overflow (/tmp/123.out+0x498767) in __interceptor_vsprintf

AddressSanitizer correctly located where the first error occurred:
#2 0x500bf1 in main arr04.c:7:5

valgrind memcheck did not report anything useful.

In the four tests above, neither cppcheck nor clang-tidy detected every out-of-bounds access, whereas AddressSanitizer located the first out-of-bounds access in each program.

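Memory Management

Memory management is one of the hard parts of C/C++. Let us see whether tools can efficiently locate errors related to it.

Double free and use after free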

#include <stdlib.h>
#include <string.h>
int main() {
    char *buf = malloc(10);
    strcpy(buf, "hello world"); // overflow: 12 bytes into a 10-byte block
    free(buf);
    free(buf); // double free
    strcpy(buf, "123456"); // use after free
}

clang-tidy

mem02.c:5:5: warning: Call to function 'strcpy' is insecure as it does not provide bounding of the memory buffer. Replace unbounded copy functions with analogous functions that support length arguments such as 'strlcpy'. CWE-119 [clang-analyzer-security.insecureAPI.strcpy]
strcpy(buf, "hello world");
^~~~~~
mem02.c:5:5: note: Call to function 'strcpy' is insecure as it does not provide bounding of the memory buffer. Replace unbounded copy functions with analogous functions that support length arguments such as 'strlcpy'. CWE-119
strcpy(buf, "hello world");
^~~~~~
mem02.c:7:5: warning: Attempt to free released memory [clang-analyzer-unix.Malloc]
free(buf);
^~~~~~~~~
mem02.c:4:17: note: Memory is allocated
char *buf = malloc(10);
^~~~~~~~~~
mem02.c:6:5: note: Memory is released
free(buf);
^~~~~~~~~
mem02.c:7:5: note: Attempt to free released memory
free(buf);
^~~~~~~~~
mem02.c:8:5: warning: Call to function 'strcpy' is insecure as it does not provide bounding of the memory buffer. Replace unbounded copy functions with analogous functions that support length arguments such as 'strlcpy'. CWE-119 [clang-analyzer-security.insecureAPI.strcpy]
strcpy(buf, "123456");
^~~~~~
mem02.c:8:5: note: Call to function 'strcpy' is insecure as it does not provide bounding of the memory buffer. Replace unbounded copy functions with analogous functions that support length arguments such as 'strlcpy'. CWE-119
strcpy(buf, "123456");
^~~~~~

cppcheck
mem02.c:5:12: error: Buffer is accessed out of bounds: buf [bufferAccessOutOfBounds]
strcpy(buf, "hello world");
^
mem02.c:4:15: note: Assign buf, buffer with size 10
char *buf = malloc(10);
^
mem02.c:5:12: note: Buffer overrun
strcpy(buf, "hello world");
^
mem02.c:7:5: error: Memory pointed to by 'buf' is freed twice. [doubleFree]
free(buf);
^
mem02.c:6:5: note: Memory pointed to by 'buf' is freed twice.
free(buf);
^
mem02.c:7:5: note: Memory pointed to by 'buf' is freed twice.
free(buf);
^

AddressSanitizer
AddressSanitizer: heap-buffer-overflow on address 0x60200000001a at pc 0x000000489aef bp 0x7ffef9e30a40 sp 0x7ffef9e301f0
WRITE of size 12 at 0x60200000001a thread T0
#0 0x489aee in __interceptor_strcpy.part.0 asan_interceptors.cpp.o
#1 0x500b33 in main mem02.c:5:5
#2 0x7f00689feb24 in __libc_start_main /usr/src/debug/glibc-2.33/csu/../csu/libc-start.c:332:16
#3 0x41f12d in _start /usr/src/debug/glibc-2.33/csu/../sysdeps/x86_64/start.S:120

0x60200000001a is located 0 bytes to the right of 10-byte region [0x602000000010,0x60200000001a)
allocated by thread T0 here:
#0 0x4c7d99 in __interceptor_malloc (/tmp/123.out+0x4c7d99)
#1 0x500b21 in main mem02.c:4:17
#2 0x7f00689feb24 in __libc_start_main /usr/src/debug/glibc-2.33/csu/../csu/libc-start.c:332:16

SUMMARY: AddressSanitizer: heap-buffer-overflow asan_interceptors.cpp.o in __interceptor_strcpy.part.0

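In this basic test, both clang-tidy and cppcheck detected the double free. clang-tidy also warned that strcpy is an insecure API, and cppcheck reported the overflow in the first strcpy, but neither flagged the use of freed memory. The AddressSanitizer report covers only the heap overflow at line 5, because by default it aborts at the first error.

valgrind memcheck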
Invalid write of size 1
at 0x4844914: strcpy (vg_replace_strmem.c:523)
by 0x401173: main (mem02.c:5)
Address 0x4a4404a is 0 bytes after a block of size 10 alloc'd
at 0x483E7C5: malloc (vg_replace_malloc.c:380)
by 0x401161: main (mem02.c:4)
Invalid write of size 1
at 0x4844926: strcpy (vg_replace_strmem.c:523)
by 0x401173: main (mem02.c:5)
Address 0x4a4404b is 1 bytes after a block of size 10 alloc'd
at 0x483E7C5: malloc (vg_replace_malloc.c:380)
by 0x401161: main (mem02.c:4)
Invalid free() / delete / delete[] / realloc()
at 0x484118B: free (vg_replace_malloc.c:755)
by 0x401185: main (mem02.c:7)
Address 0x4a44040 is 0 bytes inside a block of size 10 free'd
at 0x484118B: free (vg_replace_malloc.c:755)
by 0x40117C: main (mem02.c:6)
Block was alloc'd at
at 0x483E7C5: malloc (vg_replace_malloc.c:380)
by 0x401161: main (mem02.c:4)
Invalid write of size 1
at 0x4844914: strcpy (vg_replace_strmem.c:523)
by 0x401193: main (mem02.c:8)
Address 0x4a44040 is 0 bytes inside a block of size 10 free'd
at 0x484118B: free (vg_replace_malloc.c:755)
by 0x40117C: main (mem02.c:6)
Block was alloc'd at
at 0x483E7C5: malloc (vg_replace_malloc.c:380)
by 0x401161: main (mem02.c:4)
Invalid write of size 1
at 0x4844926: strcpy (vg_replace_strmem.c:523)
by 0x401193: main (mem02.c:8)
Address 0x4a44046 is 6 bytes inside a block of size 10 free'd
at 0x484118B: free (vg_replace_malloc.c:755)
by 0x40117C: main (mem02.c:6)
Block was alloc'd at
at 0x483E7C5: malloc (vg_replace_malloc.c:380)
by 0x401161: main (mem02.c:4)
HEAP SUMMARY:
in use at exit: 0 bytes in 0 blocks
total heap usage: 1 allocs, 2 frees, 10 bytes allocated
All heap blocks were freed -- no leaks are possible
For lists of detected and suppressed errors, rerun with: -s
ERROR SUMMARY: 10 errors from 5 contexts (suppressed: 0 from 0)

A more complex double free

#include <stdlib.h>
#include <string.h>
int main() {
    char *table[5];
    for (int i = 0; i < 5; i++) {
        table[i] = malloc(0x10);
        strncpy(table[i], "1234567890", 11);
    }
    for (int i = 0; i < 10; i++) {
        free(table[rand() % 5]); // 10 frees over 5 blocks: some are freed twice
    }
}

clang-tidy

mem07.c:7:9: warning: Call to function 'strncpy' is insecure as it does not provide security checks introduced in the C11 standard. Replace with analogous functions that support length arguments or provides boundary checks such as 'strncpy_s' in case of C11 [clang-analyzer-security.insecureAPI.DeprecatedOrUnsafeBufferHandling]
strncpy(table[i], "1234567890", 11);
^~~~~~~
mem07.c:7:9: note: Call to function 'strncpy' is insecure as it does not provide security checks introduced in the C11 standard. Replace with analogous functions that support length arguments or provides boundary checks such as 'strncpy_s' in case of C11
strncpy(table[i], "1234567890", 11);
^~~~~~~

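Unfortunately, clang-tidy only advises switching to a safer API to prevent overflow and says nothing about the repeated frees in this program.
cppcheck did not report anything useful either.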

AddressSanitizer

attempting double-free on 0x602000000070 in thread T0:
#0 0x4c7af9 in free (/tmp/123.out+0x4c7af9)
#1 0x500d39 in main mem07.c:10:9
#2 0x7f8e0d928b24 in __libc_start_main /usr/src/debug/glibc-2.33/csu/../csu/libc-start.c:332:16
#3 0x41f14d in _start /usr/src/debug/glibc-2.33/csu/../sysdeps/x86_64/start.S:120

0x602000000070 is located 0 bytes inside of 16-byte region [0x602000000070,0x602000000080)
freed by thread T0 here:
#0 0x4c7af9 in free (/tmp/123.out+0x4c7af9)
#1 0x500d39 in main mem07.c:10:9
#2 0x7f8e0d928b24 in __libc_start_main /usr/src/debug/glibc-2.33/csu/../csu/libc-start.c:332:16

previously allocated by thread T0 here:
#0 0x4c7db9 in __interceptor_malloc (/tmp/123.out+0x4c7db9)
#1 0x500c3d in main mem07.c:6:20
#2 0x7f8e0d928b24 in __libc_start_main /usr/src/debug/glibc-2.33/csu/../csu/libc-start.c:332:16

SUMMARY: AddressSanitizer: double-free (/tmp/123.out+0x4c7af9) in free

valgrind memcheck
Invalid free() / delete / delete[] / realloc()
at 0x484118B: free (vg_replace_malloc.c:755)
by 0x4011EB: main (mem07.c:10)
Address 0x4a44130 is 0 bytes inside a block of size 16 free'd
at 0x484118B: free (vg_replace_malloc.c:755)
by 0x4011EB: main (mem07.c:10)
Block was alloc'd at
at 0x483E7C5: malloc (vg_replace_malloc.c:380)
by 0x401189: main (mem07.c:6)
HEAP SUMMARY:
in use at exit: 0 bytes in 0 blocks
total heap usage: 5 allocs, 10 frees, 80 bytes allocated
All heap blocks were freed -- no leaks are possible
For lists of detected and suppressed errors, rerun with: -s
ERROR SUMMARY: 5 errors from 1 contexts (suppressed: 0 from 0)

Both AddressSanitizer and valgrind memcheck clearly reported the double free and where it happened.

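Summary

Many automated tools can help developers find bugs in their code, but all of them still have clear shortcomings. Static analysis tools are easily defeated by slightly more complex code, and dynamic analysis tools miss any bug on a branch that the tests never execute.
These tools make development and debugging more efficient, but relying solely on their automatic checks can leave many subtle bugs deeply rooted in the code.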

As developers, we should strengthen our own ability to hunt for bugs. Bugs are common; the ability to find them is valuable.


https://blog.y7n05h.dev/cpptest/

Author: Y7n05h
Published: 2021-09-01
Updated: 2021-09-01
License: CC BY-SA 4.0