Published 2021-09-01 | Dev | 9 min read (about 1333 characters)

C/C++ Debugging Tools

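As everyone knows, it is hard to develop a program without introducing bugs. As software grows more complex, writing "bug-free" code has become a valuable skill, one that takes long and persistent study and practice. Expecting to become "bug-free" first and only then start writing programs is an unrealistic fantasy, and bugs still have to be fixed. That is where debugging proves its worth.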

Lint

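Static analysis tools are a good mentor during development. They often catch many errors, or suspected errors, while the code is being written, and report them as an error or a warning.

Many free and powerful tools provide static analysis for C/C++ code (listed alphabetically by name):

The compiler, of course, also reports errors and warnings when it compiles the code.

Static analysis tools can usually be integrated into an IDE through plugins or extensions, so that code is analyzed automatically during development and problems can be fixed right away. Besides finding errors, these tools also make readability suggestions and push developers toward a consistent coding style.

One caveat: there is no need to "please" a static analysis tool at all costs. Once you have confirmed that a report is a false positive, stick to your approach and disable the corresponding checks in the tool.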

Dynamic Analysis

Sanitizers

Google and LLVM provide a set of dynamic analysis tools, built into clang, that detect many kinds of problems in code.

AddressSanitizer: detects addressability issues
LeakSanitizer: detects memory leaks
ThreadSanitizer: detects data races and deadlocks
MemorySanitizer: detects use of uninitialized memory
Hardware-assisted AddressSanitizer: a newer variant of AddressSanitizer that consumes much less memory
UndefinedBehaviorSanitizer[1]
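As a hedged illustration of what these tools look for, here is a minimal C sketch. The function name `sum_heap` is made up for this example. The code is correct as written; the comments mark where a typical slip would be reported. With clang or gcc, sanitizers are usually enabled at build time with flags such as `-fsanitize=address,undefined -g`.

```c
#include <assert.h>
#include <stdlib.h>

/* Fills a heap buffer with 0..n-1 and returns the sum of its elements.
 * n must be greater than zero. */
long sum_heap(size_t n)
{
    int *buf = malloc(n * sizeof *buf);
    assert(buf != NULL);

    for (size_t i = 0; i < n; i++)  /* `i <= n` would write past the end: */
        buf[i] = (int)i;            /* AddressSanitizer reports heap-buffer-overflow */

    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += buf[i];

    free(buf);                      /* forgetting this is what LeakSanitizer reports */
    return total;
}
```

Calling `sum_heap(4)` adds 0 + 1 + 2 + 3. The sanitizers only report problems on paths that actually execute, so they work best together with tests that exercise the code.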

Thread Safety Analysis

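Thread-safety problems are a frequent source of grief. Writing thread-safe code is a basic skill for C/C++ developers, yet thread-unsafe code and race conditions still show up often and are hard to guard against.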

ThreadSafetyAnalysis

Clang Thread Safety Analysis is a C++ language extension which warns about potential race conditions in code. The analysis is completely static (i.e. compile-time); there is no run-time overhead. The analysis is still under active development, but it is mature enough to be deployed in an industrial setting. It is being developed by Google, in collaboration with CERT/SEI, and is used extensively in Google’s internal code base.
Thread safety analysis works very much like a type system for multi-threaded programs. In addition to declaring the type of data (e.g. int, float, etc.), the programmer can (optionally) declare how access to that data is controlled in a multi-threaded environment. For example, if foo is guarded by the mutex mu, then the analysis will issue a warning whenever a piece of code reads or writes to foo without first locking mu. Similarly, if there are particular routines that should only be called by the GUI thread, then the analysis will warn if other threads call those routines.[3]

Valgrind

valgrind

Valgrind is an instrumentation framework for building dynamic analysis tools. There are Valgrind tools that can automatically detect many memory management and threading bugs, and profile your programs in detail. You can also use Valgrind to build new tools.
The Valgrind distribution currently includes seven production-quality tools: a memory error detector, two thread error detectors, a cache and branch-prediction profiler, a call-graph generating cache and branch-prediction profiler, and two different heap profilers. It also includes an experimental SimPoint basic block vector generator. It runs on the following platforms: X86/Linux, AMD64/Linux, ARM/Linux, ARM64/Linux, PPC32/Linux, PPC64/Linux, PPC64LE/Linux, S390X/Linux, MIPS32/Linux, MIPS64/Linux, X86/Solaris, AMD64/Solaris, ARM/Android (2.3.x and later), ARM64/Android, X86/Android (4.0 and later), MIPS32/Android, X86/Darwin and AMD64/Darwin (Mac OS X 10.12).

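memcheck: memory error detector
cachegrind: cache and branch-prediction profiler
callgrind: call-graph generating cache and branch-prediction profiler
dhat: heap profiler
drd: thread error detector
exp-bbv: experimental SimPoint basic block vector generator
getoff
helgrind: thread error detector
lackey: example tool that illustrates basic instrumentation
massif: heap profiler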

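Network Analysis

Network-related bugs can often be debugged with Wireshark. Wireshark shows the data a program sends or receives in a readable form, which makes debugging much easier.
Netcat is a common testing tool that makes it easy to work with sockets directly from the terminal.

Unit Testing

Unit tests are a good way to test an API: already while coding, they check whether the API behaves as expected.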

CUnit

C++:

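Logging

A complex software system often has to run silently in the background for a long time, which makes its log output very important. Logs are also frequently used to locate bugs in the system, so efficient logging is a great help when debugging.

spdlog: Very fast, header-only/compiled, C++ logging library.
Google Logging Library: Google Logging (glog) is a C++98 library that implements application-level logging. The library provides logging APIs based on C++-style streams and various helper macros.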

Debugger

Some problems are complex, and a debugger is also a powerful tool for solving them.

References

¹. google/sanitizers: AddressSanitizer, ThreadSanitizer, MemorySanitizer [G/OL]. https://github.com/google/sanitizers.

². Clang 13 documentation [G/OL]. https://clang.llvm.org/docs/.

³. Thread Safety Analysis [G/OL]. https://clang.llvm.org/docs/ThreadSafetyAnalysis.html.

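Published 2021-09-01 | Dev | 1 hour read (about 6992 characters)

Function Calls on x86_64

This article discusses how function calls work on the x86_64 platform and briefly introduces some common calling conventions. Readers are expected to have a basic understanding of x86_64 assembly language.

Only integer types no wider than 64 bits and pointer types are considered as function arguments and return values. Integer arguments wider than 64 bits, structures, floating-point numbers, and other types are not covered.

To show how calls and returns are implemented in assembly, the code in this article contains a lot of meaningless, redundant statements. It should not be taken as a recommendation or a reference for how to write code.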

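Prerequisites

The Stack

Many people have probably run into a stack overflow at run time caused by too many levels of recursion. This overflowing "stack" is the focus of this article; the function call process is inseparable from it.
The stack grows from high addresses toward low addresses.

INFO
Unless stated otherwise, "stack" in this article refers to the program's call stack (run-time stack), not the stack data structure.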

`PUSH`

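[Figure: the stack before PUSH]
The PUSH operation resembles a push on the stack data structure.
A PUSH instruction always:

decrements rsp (by 8 for a 64-bit operand)
stores the pushed value at the location the decremented rsp points to

The point of spelling out PUSH is to stress that rsp points to the most recently pushed item, not to the next free slot on the stack.
Put simply: rsp always points to valid data.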

`POP`

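POP is the counterpart of PUSH.
A POP instruction always:

reads the value at the top of the stack
increments rsp

[Figure: the stack before POP]

TIP
Top of the stack: the value rsp currently points to.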
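The two rules can be tried out in a small C model. This is only a sketch: `toy_stack`, `toy_push`, and `toy_pop` are names invented here, the stack is an array of 8-byte slots, and `rsp` is an array index rather than a real address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { STACK_WORDS = 16 };

/* mem[] stands for stack memory. rsp == STACK_WORDS means "empty";
 * a smaller rsp means a deeper stack, so the model grows downward. */
struct toy_stack {
    uint64_t mem[STACK_WORDS];
    size_t rsp;
};

void toy_init(struct toy_stack *s) { s->rsp = STACK_WORDS; }

/* PUSH: first decrement rsp, then store at the new rsp. */
void toy_push(struct toy_stack *s, uint64_t value)
{
    assert(s->rsp > 0);            /* otherwise the toy stack overflows */
    s->rsp -= 1;
    s->mem[s->rsp] = value;
}

/* POP: first read the value at rsp, then increment rsp. */
uint64_t toy_pop(struct toy_stack *s)
{
    assert(s->rsp < STACK_WORDS);  /* otherwise there is nothing to pop */
    uint64_t value = s->mem[s->rsp];
    s->rsp += 1;
    return value;
}
```

After two pushes, `rsp` has moved down by two slots and `mem[rsp]` is the value pushed last, which matches the statement that rsp always points to valid data.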

`LEAVE`

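LEAVE is equivalent to

mov %rbp, %rsp
pop %rbp

That is:

mov %rbp, %rsp (rsp = rbp) discards the local variables and moves rsp back to where it was right after the prologue's push %rbp, so rsp points at the saved rbp.
pop %rbp restores rbp to the caller's frame base, leaving rsp pointing at the return address that CALL pushed.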

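Calling and Returning from a Function Without Arguments or Return Value

This section walks through a call to a function that takes no arguments and returns nothing. Please concentrate on the overall flow of the call and do not worry too much about the details.

First, a simple example of a function call.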

/* call1.c */
void func1()
{
    int v = 0;
}
int main(void)
{
    func1();
}

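Here is the disassembly of this program:
The code below was generated by clang and disassembled with objdump. Less important parts have been removed (not every omission is marked). The same applies to the later listings.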

<func1>:
push   %rbp
mov    %rsp,%rbp
sub    $0x10,%rsp
; (omitted here)
leave
ret

<main>:
; (omitted here)
call   1147 <func1>
mov    $0x0,%eax
pop    %rbp
ret

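First, look at the call from main() to func1().

[Figure: the stack before the call]

In main(), executing call 1147 <func1> performs the call to func1. call pushes the address of the next instruction in main() (the address of "mov $0x0,%eax") onto the stack and sets rip to the address of func1().
In func1(), rbp is saved by pushing it onto the stack.

TIP
I deliberately cut the part that deals with the value in rbp.
For now, please do not worry about what rbp holds when call 1147 <func1> executes.
Just remember that rbp points somewhere in the stack and that rbp ≥ rsp: because the stack grows downward, the frame base lies at or above the stack top.

Executing mov %rsp,%rbp (rbp = rsp) turns the current stack top (where rsp points) into the new frame base (where rbp points).

This completes the call to func1(). Now look at how func1() returns to main().

On return, LEAVE restores the previously saved frame base.
ret pops the return address from the location rsp points to and sets rip to that address, which completes the return.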
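The whole round trip (call, prologue, leave, ret) can be replayed on a toy machine. This is a sketch under simplifying assumptions: the struct `toy_cpu` and all function names are invented here, memory is an array of 8-byte slots, and rsp/rbp are slot indices instead of addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { MEM_WORDS = 32 };

struct toy_cpu {
    uint64_t mem[MEM_WORDS]; /* stack memory, one slot per 8 bytes */
    uint64_t rip;            /* "address" of the next instruction */
    size_t rsp;              /* index of the top-of-stack slot */
    size_t rbp;              /* index of the current frame base */
};

static void cpu_push(struct toy_cpu *c, uint64_t v)
{
    assert(c->rsp > 0);
    c->rsp -= 1;
    c->mem[c->rsp] = v;
}

static uint64_t cpu_pop(struct toy_cpu *c)
{
    assert(c->rsp < MEM_WORDS);
    uint64_t v = c->mem[c->rsp];
    c->rsp += 1;
    return v;
}

/* call target: push the return address, then jump. */
void toy_call(struct toy_cpu *c, uint64_t target, uint64_t return_addr)
{
    cpu_push(c, return_addr);
    c->rip = target;
}

/* push %rbp; mov %rsp,%rbp; sub $locals,%rsp */
void toy_prologue(struct toy_cpu *c, size_t local_slots)
{
    cpu_push(c, c->rbp);
    c->rbp = c->rsp;
    assert(c->rsp >= local_slots);
    c->rsp -= local_slots;
}

/* leave == mov %rbp,%rsp; pop %rbp */
void toy_leave(struct toy_cpu *c)
{
    c->rsp = c->rbp;
    c->rbp = (size_t)cpu_pop(c);
}

/* ret: pop the return address into rip. */
void toy_ret(struct toy_cpu *c)
{
    c->rip = cpu_pop(c);
}
```

Running `toy_call`, `toy_prologue(c, 2)` (two slots correspond to `sub $0x10,%rsp`), `toy_leave`, and `toy_ret` in that order leaves rsp and rbp exactly where they started, with rip at the return address.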

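Calling and Returning from a Function With Arguments and a Return Value (Part 1)

The call without arguments or return value can be seen as a simplified case of what this section discusses.

As in the previous section, studying a simple example helps in understanding the process.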

/* call2.c */
void func1()
{
}
int func2(int a, long b, char *c)
{
    *c = a * b;
    func1();
    return a * b;
}
int main()
{
    char value;
    int rc = func2(1, 2, &value);
}

Disassembling gives:

<func1>:
push   %rbp
mov    %rsp,%rbp
pop    %rbp
ret

<func2>:
push   %rbp
mov    %rsp,%rbp
sub    $0x20,%rsp
mov    %edi,-0x4(%rbp)
mov    %rsi,-0x10(%rbp)
mov    %rdx,-0x18(%rbp)
movslq -0x4(%rbp),%rax
imul   -0x10(%rbp),%rax
mov    -0x18(%rbp),%rcx
mov    %al,(%rcx)
call   1140 <func1>
movslq -0x4(%rbp),%rcx
imul   -0x10(%rbp),%rcx
mov    %ecx,%eax
add    $0x20,%rsp
pop    %rbp
ret

<main>:
push   %rbp
mov    %rsp,%rbp
sub    $0x10,%rsp
mov    %fs:0x28,%rax
mov    %rax,-0x8(%rbp)
mov    $0x1,%edi
mov    $0x2,%esi
lea    -0x9(%rbp),%rdx
call   1150 <func2>
mov    %eax,-0x10(%rbp)
mov    %fs:0x28,%rcx
mov    -0x8(%rbp),%rdx
cmp    %rdx,%rcx
jne    11d9 <main+0x49>
xor    %eax,%eax
add    $0x10,%rsp
pop    %rbp
ret

It is easy to see that main() does the following before executing call 1150 <func2>:

mov    $0x1,%edi
mov    $0x2,%esi
lea    -0x9(%rbp),%rdx

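These three instructions pass the arguments. For a function call, the caller stores the arguments in registers and the callee uses them directly; this is how arguments are passed.

Looking at the actual arguments 1, 2, &value:

1 is passed in the low 32 bits of rdi, that is, edi.
2 is passed in the low 32 bits of rsi, that is, esi (writing esi also zeroes the upper half of rsi, so the long parameter b receives the full 64-bit value 2).
&value is passed in rdx.

Note that although &value is a pointer and looks different from an integer variable, it is passed in exactly the same way.

This is the tail of func2():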

imul   -0x10(%rbp),%rcx
mov    %ecx,%eax
add    $0x20,%rsp
pop    %rbp
ret

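As you can see, mov %ecx,%eax places the product in the low 32 bits of rax (eax). After the return, main() contains:

call 1150 <func2>
mov %eax,-0x10(%rbp)

Here eax still holds the value of a * b computed in func2(), and main() reads it. Isn't this exactly a value handed from the callee to the caller? Yes: the rax register is commonly used to pass the return value.

With return values and arguments covered, look at func2() as a whole once more: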

<func2>:
push   %rbp
mov    %rsp,%rbp
sub    $0x20,%rsp
mov    %edi,-0x4(%rbp)
mov    %rsi,-0x10(%rbp)
mov    %rdx,-0x18(%rbp)
movslq -0x4(%rbp),%rax
imul   -0x10(%rbp),%rax
mov    -0x18(%rbp),%rcx
mov    %al,(%rcx)
call   1140 <func1>
movslq -0x4(%rbp),%rcx
imul   -0x10(%rbp),%rcx
mov    %ecx,%eax
add    $0x20,%rsp
pop    %rbp
ret

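The instruction sub $0x20,%rsp stands out. As mentioned earlier, the stack grows from high addresses toward low addresses, so decreasing rsp by 0x20 grows the stack by 0x20 bytes. Why does the stack need to grow? Because space has to be reserved on the stack for func2()'s local variables and temporaries. The counterpart of sub $0x20,%rsp is add $0x20,%rsp: before returning, rsp is increased again to release that space. The remaining steps do not differ in substance from the call without arguments or return value and are not repeated here.

Calling and Returning from a Function With Arguments and a Return Value (Part 2)

The previous section described how arguments are passed when there are only a few of them. This section looks at what changes when a function has many arguments.
The example used here has 10 parameters.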

/* call3.c */
void func1()
{
}
int func3(int a, int b, int c, int d, int e, int f, int g, int h, int i, int j)
{
    func1();
    return a * 1 + b * 2 + c * 3 + d * 4 + e * 5 + f * 6 + g * 7 + h * 8 + i * 9 + j * 10;
}
int main()
{
    func3(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
}

Disassembling produces a fairly long listing.

<func1>:
push   %rbp
mov    %rsp,%rbp
pop    %rbp
ret

<func3>:
push   %rbp
mov    %rsp,%rbp
push   %rbx
sub    $0x28,%rsp
mov    0x28(%rbp),%eax
mov    0x20(%rbp),%r10d
mov    0x18(%rbp),%r11d
mov    0x10(%rbp),%ebx
mov    %edi,-0xc(%rbp)
mov    %esi,-0x10(%rbp)
mov    %edx,-0x14(%rbp)
mov    %ecx,-0x18(%rbp)
mov    %r8d,-0x1c(%rbp)
mov    %r9d,-0x20(%rbp)
mov    %eax,-0x24(%rbp)
mov    %r10d,-0x28(%rbp)
mov    %r11d,-0x2c(%rbp)
mov    %ebx,-0x30(%rbp)
call   1120 <func1>
mov    -0xc(%rbp),%eax
shl    $0x0,%eax
mov    -0x10(%rbp),%ecx
shl    $0x1,%ecx
add    %ecx,%eax
imul   $0x3,-0x14(%rbp),%ecx
add    %ecx,%eax
mov    -0x18(%rbp),%ecx
shl    $0x2,%ecx
add    %ecx,%eax
imul   $0x5,-0x1c(%rbp),%ecx
add    %ecx,%eax
imul   $0x6,-0x20(%rbp),%ecx
add    %ecx,%eax
imul   $0x7,0x10(%rbp),%ecx
add    %ecx,%eax
mov    0x18(%rbp),%ecx
shl    $0x3,%ecx
add    %ecx,%eax
imul   $0x9,0x20(%rbp),%ecx
add    %ecx,%eax
imul   $0xa,0x28(%rbp),%ecx
add    %ecx,%eax
add    $0x28,%rsp
pop    %rbx
pop    %rbp
ret
<main>:
push   %rbp
mov    %rsp,%rbp
sub    $0x30,%rsp
mov    $0x1,%edi
mov    $0x2,%esi
mov    $0x3,%edx
mov    $0x4,%ecx
mov    $0x5,%r8d
mov    $0x6,%r9d
movl   $0x7,(%rsp)
movl   $0x8,0x8(%rsp)
movl   $0x9,0x10(%rsp)
movl   $0xa,0x18(%rsp)
call   1130 <func3>
xor    %ecx,%ecx
mov    %eax,-0x4(%rbp)
mov    %ecx,%eax
add    $0x30,%rsp
pop    %rbp
ret

The first thing to look at is this part of main():

mov    $0x1,%edi
mov    $0x2,%esi
mov    $0x3,%edx
mov    $0x4,%ecx
mov    $0x5,%r8d
mov    $0x6,%r9d
movl   $0x7,(%rsp)
movl   $0x8,0x8(%rsp)
movl   $0x9,0x10(%rsp)
movl   $0xa,0x18(%rsp)
call   1130 <func3>

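Arguments 1 through 6 (counting from 1) use the low 32 bits of the six registers rdi, rsi, rdx, rcx, r8, and r9 respectively, while the remaining 4 arguments are passed on the stack.

x86-64 Calling Conventions

First, it must be said that a calling convention is part of a platform's ABI (application binary interface), and the ABI depends on both the hardware and the operating system. x86-64 has more than one calling convention.

Microsoft x64 calling convention

The table below shows part of the Microsoft x64 calling convention. It is not meant to teach the details of that convention, only to show that more than one calling convention exists. Do not be surprised when you meet a convention that differs from the System V AMD64 ABI described next.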

Parameter type	fifth and higher	fourth	third	second	leftmost
floating-point	stack	XMM3	XMM2	XMM1	XMM0
integer	stack	R9	R8	RDX	RCX
Aggregates (8, 16, 32, or 64 bits) and __m64	stack	R9	R8	RDX	RCX
Other aggregates, as pointers	stack	R9	R8	RDX	RCX
__m128, as a pointer	stack	R9	R8	RDX	RCX
[3]

System V AMD64 ABI

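This section introduces some features of the System V AMD64 ABI.

The first six arguments of a function (each at most 8 bytes and not floating-point) are placed, from left to right, in rdi, rsi, rdx, rcx, r8, and r9. Any further arguments are pushed onto the stack from right to left and passed through the stack. The return value is stored in rax.

Consider this code: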
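The rule above can be written down as a small lookup. This is a sketch for illustration only: the function names are invented here, it covers just integer and pointer arguments of at most 8 bytes, and the stack offsets assume the callee starts with the usual `push %rbp; mov %rsp,%rbp` prologue seen in the listings.

```c
#include <assert.h>
#include <stddef.h>

/* Register used for the integer/pointer argument with the given 0-based
 * index, or NULL when that argument is passed on the stack. */
const char *sysv_int_arg_register(size_t index)
{
    static const char *const regs[6] = { "rdi", "rsi", "rdx", "rcx", "r8", "r9" };
    return index < 6 ? regs[index] : NULL;
}

/* For stack-passed arguments: offset from rbp inside the callee.
 * 0x0(%rbp) holds the saved rbp and 0x8(%rbp) the return address,
 * so the 7th argument (index 6) sits at 0x10(%rbp). */
size_t sysv_stack_arg_offset(size_t index)
{
    assert(index >= 6);
    return 0x10 + 8 * (index - 6);
}
```

For the ten-argument func3() example, this gives 0x10, 0x18, 0x20, and 0x28 for arguments 7 to 10, the same offsets that func3's disassembly reads from.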

int func4(int a, unsigned b, long c, unsigned long d, long long e, unsigned long long f);
int main()
{
    func4(1, 2U, 3L, 4UL, 5LL, 6ULL);
}

Compiling and disassembling gives the assembly for this code.

<main>:
push   %rbp
mov    %rsp,%rbp
mov    $0x6,%r9d
mov    $0x5,%r8d
mov    $0x4,%ecx
mov    $0x3,%edx
mov    $0x2,%esi
mov    $0x1,%edi
call   29 <main+0x29>
mov    $0x0,%eax
pop    %rbp
ret

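The constants (more precisely, "immediates") end up as follows: 0x1 in edi, 0x2 in esi, 0x3 in edx, 0x4 in ecx, 0x5 in r8d, and 0x6 in r9d.
Some readers may wonder at this point: "Wasn't the first argument supposed to go into rdi? Why is it in edi?" (The same question applies to the other arguments.)
This is not a mistake. On x86_64, edi is the low 32 bits of rdi; likewise esi is the low 32 bits of rsi, edx of rdx, ecx of rcx, r8d of r8, and r9d of r9.

One more point is worth noting: on x86_64, an instruction that writes a 32-bit register, such as mov $0x1,%edi, sets the upper 32 bits of the full 64-bit register to 0. Zero extension and copying therefore happen in a single step.

INFO

This article does not go further into the following:

XMM0 through XMM7 are used for floating-point arguments
For system calls, R10 is used instead of RCX [4]

Looking back, every example above conforms to the System V AMD64 ABI.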
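The difference between zero extension and sign extension can be checked from C. A minimal sketch (function names invented here): converting an unsigned 32-bit value to 64 bits needs nothing beyond the implicit zeroing of the upper half, while a signed value needs a sign-extending instruction such as the `movslq` seen in func2().

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned 32 -> 64 bit: the upper half becomes zero. */
uint64_t widen_unsigned(uint32_t x) { return x; }

/* Signed 32 -> 64 bit: the sign bit is replicated into the upper half. */
int64_t widen_signed(int32_t x) { return x; }

/* Self-check of the two claims above. */
void widen_self_check(void)
{
    assert(widen_unsigned(0xFFFFFFFFu) == UINT64_C(0x00000000FFFFFFFF));
    assert((uint64_t)widen_signed(-1) == UINT64_C(0xFFFFFFFFFFFFFFFF));
}
```

Which instruction the compiler actually picks depends on the compiler and optimization level; the C-level result is the same either way.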

Passing Structures by Value

struct test_small
{
    int a;
    char ch[4];
};

struct test_small func1(struct test_small arg)
{
    arg.a = 0;
    arg.ch[0] = 3;
    return arg;
}
struct test_big
{
    long a, b, c;
};
struct test_big func2(struct test_big arg)
{
    arg.a = arg.b + arg.c;
    return arg;
}

int main()
{
    struct test_small s;
    func1(s);
    struct test_big b;
    b.a = 1;
    b.b = 2;
    b.c = 3;
    func2(b);
}

The source defines two structures: struct test_small is 8 bytes and struct test_big is 24 bytes.
The size has a decisive influence on how a structure is passed by value.

Disassembling gives:

<func1>:
push   %rbp
mov    %rsp,%rbp
mov    %rdi,-0x8(%rbp) # copy struct test_small arg from rdi to rbp - 0x8
movl   $0x0,-0x8(%rbp) # arg.a = 0;
movb   $0x3,-0x4(%rbp) # arg.ch[0] = 3;
mov    -0x8(%rbp),%rax # copy arg into rax as the return value
pop    %rbp
ret

<func2>:
push   %rbp
mov    %rsp,%rbp
mov    %rdi,-0x8(%rbp) # rdi holds the address of the caller-provided buffer for the return value; save it at rbp - 0x8
mov    0x18(%rbp),%rdx # rdx = arg.b (from the copy that main() pushed onto the stack)
mov    0x20(%rbp),%rax # rax = arg.c (from the copy that main() pushed onto the stack)
add    %rdx,%rax # arg.b + arg.c
mov    %rax,0x10(%rbp) # arg.a = arg.b + arg.c
mov    -0x8(%rbp),%rcx # rcx = address of the return buffer
mov    0x10(%rbp),%rax # rax = arg.a
mov    0x18(%rbp),%rdx # rdx = arg.b
mov    %rax,(%rcx) # from here on arg is copied into the return buffer: ret.a = arg.a
mov    %rdx,0x8(%rcx) # ret.b = arg.b
mov    0x20(%rbp),%rax # rax = arg.c
mov    %rax,0x10(%rcx) # ret.c = arg.c
mov    -0x8(%rbp),%rax # return the address of the return buffer in rax
pop    %rbp
ret

<main>:
push   %rbp
mov    %rsp,%rbp # rbp = rsp
sub    $0x50,%rsp # rsp -= 0x50, so rsp == rbp - 0x50
mov    %fs:0x28,%rax
mov    %rax,-0x8(%rbp)
xor    %eax,%eax
mov    -0x10(%rbp),%rax # load the quadword at rbp - 0x10, which holds s, into rax
mov    %rax,%rdi # copy s from rax to rdi; it is the argument struct test_small arg of func1()
call   1139 <func1> # call func1()
movq   $0x1,-0x30(%rbp) # b.a = 1;
movq   $0x2,-0x28(%rbp) # b.b = 2;
movq   $0x3,-0x20(%rbp) # b.c = 3;
lea    -0x50(%rbp),%rax # the stack top is currently rbp - 0x50; rax = that address, the buffer that will receive func2()'s return value
push   -0x20(%rbp) # push b.c
push   -0x28(%rbp) # push b.b
push   -0x30(%rbp) # push b.a; these three pushes place a copy of struct test_big b on the stack, and that copy is the by-value argument
mov    %rax,%rdi # rdi = address of the return buffer, passed as a hidden first argument
call   1152 <func2> # call func2()
add    $0x18,%rsp
mov    $0x0,%eax
mov    -0x8(%rbp),%rdx
sub    %fs:0x28,%rdx
je     11f7 <main+0x6d>
call   1030 <__stack_chk_fail@plt>
leave
ret

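The 8-byte struct test_small fits in rdi and is passed there. The 24-byte struct test_big cannot fit in an 8-byte register, and under the System V AMD64 ABI an aggregate larger than 16 bytes is not split across registers either, so it is passed by copying it onto the stack. (Aggregates of 9 to 16 bytes, and __int128, can still travel in a pair of registers.) The return value follows the same pattern: test_small comes back in rax, while for test_big the caller passes the address of a buffer in rdi, func2() writes the result there, and returns that address in rax.

The logic of func2 is fairly involved; you may need to read main() and func2() side by side to follow it.
When stuck, try drawing a diagram of the stack (as done at the beginning of this article).

Three figures illustrate how struct test_big is passed:

[Figure: the stack just before executing push -0x20(%rbp)]

[Figure: the stack after executing call 1152 <func2>]

[Figure: the stack after executing mov %rdi,-0x8(%rbp)]

A reminder: the size of a structure is not simply the sum of the sizes of its members. Alignment also has to be taken into account, and it cannot be ignored when working out how a structure is passed by value.

This analysis shows that passing a large structure (larger than 16 bytes) by value is relatively inefficient. When performance matters, prefer passing its address over passing its value.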
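The effect of alignment on structure size is easy to observe. The sketch below uses an invented `struct padded`; `may_pass_in_registers` encodes only a simplified rule of thumb for the System V AMD64 ABI (the real classification also looks at the member types), so treat it as an assumption of this example rather than the ABI's full algorithm.

```c
#include <assert.h>
#include <stddef.h>

struct padded {
    char tag;    /* 1 byte ... */
    long value;  /* ... but value must start at a multiple of its alignment */
};

/* The compiler inserts padding so that value is properly aligned. */
static_assert(offsetof(struct padded, value) % _Alignof(long) == 0,
              "value is aligned");

/* Bytes that exist purely for alignment. */
size_t padding_bytes(void)
{
    return sizeof(struct padded) - sizeof(char) - sizeof(long);
}

/* Simplified rule of thumb: aggregates of at most 16 bytes may travel
 * in registers, larger ones are passed through memory. */
int may_pass_in_registers(size_t size)
{
    return size <= 16;
}
```

On a typical x86_64 Linux system `sizeof(struct padded)` is 16 rather than 9, so it is the padded size, not the sum of the members, that decides how the structure is passed.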

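C++ and Argument Passing

On x86_64 Linux, calling an ordinary function in a C++ program works exactly as described above.

Compiling the earlier code with g++ and disassembling it again gives: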

int func4(int a, unsigned b, long c, unsigned long d, long long e, unsigned long long f);
int main()
{
    func4(1, 2U, 3L, 4UL, 5LL, 6ULL);
}

<main>:
push   %rbp
mov    %rsp,%rbp
mov    $0x6,%r9d
mov    $0x5,%r8d
mov    $0x4,%ecx
mov    $0x3,%edx
mov    $0x2,%esi
mov    $0x1,%edi
call   29 <main+0x29>
mov    $0x0,%eax
pop    %rbp
ret

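Calls to non-static member functions of a class, however, differ in several ways. They fall into two groups:

non-virtual member functions
virtual member functions

Non-virtual Member Functions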

#include <cstdio>
class test
{
    int a, b;

public:
    test() = default;
    int sum()
    {
        return a + b;
    }
};
int main()
{
    test t;
    int s = t.sum();
    printf("%d\n", s);
}

The disassembly of a C++ program built with g++ may not be as easy to read as that of a C program built with gcc.

<main>:
push   %rbp
mov    %rsp,%rbp
sub    $0x20,%rsp
mov    %fs:0x28,%rax
mov    %rax,-0x8(%rbp)
xor    %eax,%eax
lea    -0x10(%rbp),%rax # -0x10(%rbp) is a local variable; this instruction puts its address in rax
mov    %rax,%rdi # copy rax to rdi
call   11a0 <_ZN4test3sumEv> # call sum()
mov    %eax,-0x14(%rbp)
mov    -0x14(%rbp),%eax
mov    %eax,%esi
lea    0xe89(%rip),%rdi
mov    $0x0,%eax
call   1030 <printf@plt>
mov    $0x0,%eax
mov    -0x8(%rbp),%rdx
sub    %fs:0x28,%rdx
je     119e <main+0x55>
call   1040 <__stack_chk_fail@plt>
leave
ret

<_ZN4test3sumEv>:
push   %rbp
mov    %rsp,%rbp
mov    %rdi,-0x8(%rbp) # save the address held in rdi to a local slot
mov    -0x8(%rbp),%rax # rax = the saved address
mov    (%rax),%edx # edx = the double word rax points to
mov    -0x8(%rbp),%rax # rax = the saved address again
mov    0x4(%rax),%eax # eax = the double word at rax + 0x4
add    %edx,%eax # eax += edx
pop    %rbp
ret

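As is well known, a non-static C++ member function has an implicit parameter, this, a pointer to an object of the class the member function belongs to.
For example, when relating C++ to the assembly code, sum in this example can be read as: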

int sum(class test *this)
{
    return this->a + this->b;
}

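In short, a non-static, non-virtual C++ member function has an implicit this pointer parameter, which is passed as the first argument.

This agrees with what was said above:

"Under the System V AMD64 ABI, the first integer argument of at most 8 bytes is passed in rdi."

Now let's give a non-static, non-virtual member function more parameters.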

#include <cstdio>
class test
{
    int a, b;

public:
    test() = default;
    int sum2(int u, int v, int w, int x, int y, int z)
    {
        return a + b + u + v + w + x + y + z;
    }
};
int main()
{
    test t;
    int s = t.sum2(1, 2, 3, 4, 5, 6);
    printf("%d\n", s);
}

Disassembling gives:

<main>:
push   %rbp
mov    %rsp,%rbp
sub    $0x20,%rsp
mov    %fs:0x28,%rax
mov    %rax,-0x8(%rbp)
xor    %eax,%eax
lea    -0x10(%rbp),%rax # -0x10(%rbp) is a local variable; this instruction puts its address in rax
sub    $0x8,%rsp
push   $0x6 # note: this argument is passed on the stack
mov    $0x5,%r9d # argument
mov    $0x4,%r8d # argument
mov    $0x3,%ecx # argument
mov    $0x2,%edx # argument
mov    $0x1,%esi # argument
mov    %rax,%rdi # copy the pointer held in rax to rdi
call   11c6 <_ZN4test4sum2Eiiiiii> # call sum2()
add    $0x10,%rsp
mov    %eax,-0x14(%rbp)
mov    -0x14(%rbp),%eax
mov    %eax,%esi
lea    0xe64(%rip),%rdi
mov    $0x0,%eax
call   1030 <printf@plt>
mov    $0x0,%eax
mov    -0x8(%rbp),%rdx
sub    %fs:0x28,%rdx
je     11c3 <main+0x7a>
call   1040 <__stack_chk_fail@plt>
leave
ret

<_ZN4test4sum2Eiiiiii>:
push   %rbp
mov    %rsp,%rbp
mov    %rdi,-0x8(%rbp) # save rdi (the this pointer) on the stack
mov    %esi,-0xc(%rbp) # save esi on the stack
mov    %edx,-0x10(%rbp) # save edx on the stack
mov    %ecx,-0x14(%rbp) # save ecx on the stack
mov    %r8d,-0x18(%rbp) # save r8d on the stack
mov    %r9d,-0x1c(%rbp) # save r9d on the stack
mov    -0x8(%rbp),%rax # rax = the saved this pointer
mov    (%rax),%edx # edx = the double word this points to
mov    -0x8(%rbp),%rax # rax = the saved this pointer
mov    0x4(%rax),%eax # eax = the double word at this + 0x4
add    %eax,%edx # edx += eax
mov    -0xc(%rbp),%eax # -0xc(%rbp) holds the former esi, i.e. the parameter int u
add    %eax,%edx # edx += eax
mov    -0x10(%rbp),%eax # -0x10(%rbp) holds the former edx, i.e. the parameter int v
add    %eax,%edx # edx += eax
mov    -0x14(%rbp),%eax # -0x14(%rbp) holds the former ecx, i.e. the parameter int w
add    %eax,%edx # edx += eax
mov    -0x18(%rbp),%eax # -0x18(%rbp) holds the former r8d, i.e. the parameter int x
add    %eax,%edx # edx += eax
mov    -0x1c(%rbp),%eax # -0x1c(%rbp) holds the former r9d, i.e. the parameter int y
add    %eax,%edx # edx += eax
mov    0x10(%rbp),%eax # 0x10(%rbp) was pushed by the caller, i.e. the parameter int z
add    %edx,%eax # eax += edx
pop    %rbp
ret

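Counting the implicit this pointer, sum2() has 7 parameters. Parameters 1 to 6 are still passed in rdi, rsi, rdx, rcx, r8, and r9, in that order, and the 7th parameter, int z, is passed on the stack as usual.

To sum up, the only difference between calling a non-static, non-virtual C++ member function and calling a C function is that this has to be counted as a parameter.

Virtual Member Functions

Before giving the example for this section, it is worth stressing once more that the code below only demonstrates how virtual member functions are called. If anyone imitates these examples in real program design, allow me to borrow a sentence from Scott Meyers:

Isolate them until they promise never to do it again.

(I read this sentence in Effective C++ or More Effective C++ but can no longer find the exact place; it is quoted here from memory.)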

#include <cstdio>
class test1
{
protected:
    int a;
    int b;

public:
    test1(int A, int B) : a(A), b(B) {}
    virtual void info()
    {
        printf("max=%d\n", a > b ? a : b);
    }
};
class test2 : public test1
{
public:
    test2(int x, int y) : test1(x, y) {}
    void info() override
    {
        printf("min=%d\n", a < b ? a : b);
    }
};
int main()
{
    test1 t1(1, 2);
    test2 t2(3, 4);
    test1 *pr1 = &t1;
    pr1->info();
    test2 *pr2 = &t2;
    pr2->info();
}

Compiling and disassembling gives:

<main>:
push   %rbp
mov    %rsp,%rbp
sub    $0x40,%rsp
mov    %fs:0x28,%rax
mov    %rax,-0x8(%rbp)
xor    %eax,%eax
lea    -0x30(%rbp),%rax # rax = rbp - 0x30, the address of t1
mov    $0x2,%edx # argument
mov    $0x1,%esi # argument
mov    %rax,%rdi # argument: the this pointer
call   11de <_ZN5test1C1Eii> # construct t1
lea    -0x20(%rbp),%rax
mov    $0x4,%edx # argument
mov    $0x3,%esi # argument
mov    %rax,%rdi # argument: the this pointer
call   1256 <_ZN5test2C1Eii> # construct t2
lea    -0x30(%rbp),%rax # rax = address of t1
mov    %rax,-0x40(%rbp) # store the address of t1 at rbp - 0x40, which holds pr1
mov    -0x40(%rbp),%rax # rax = pr1
mov    (%rax),%rax # rax = the quadword pr1 points to (the vtable pointer at the start of the object)
mov    (%rax),%rdx # rdx = the quadword that pointer points to (the first vtable entry)
mov    -0x40(%rbp),%rax # rax = pr1
mov    %rax,%rdi # pass the argument: copy the this pointer from rax to rdi; the value of pr1 is the this argument
call   *%rdx # call through the function pointer in rdx
lea    -0x20(%rbp),%rax # rax = address of t2
mov    %rax,-0x38(%rbp) # store the address of t2 at rbp - 0x38, which holds pr2
mov    -0x38(%rbp),%rax # rax = pr2
mov    (%rax),%rax # rax = the quadword pr2 points to (the vtable pointer at the start of the object)
mov    (%rax),%rdx # rdx = the quadword that pointer points to (the first vtable entry)
mov    -0x38(%rbp),%rax # rax = pr2
mov    %rax,%rdi # pass the argument: copy the this pointer from rax to rdi; the value of pr2 is the this argument
call   *%rdx # call through the function pointer in rdx
mov    $0x0,%eax
mov    -0x8(%rbp),%rcx
sub    %fs:0x28,%rcx
je     11db <main+0x92>
call   1040 <__stack_chk_fail@plt>
leave
ret

<_ZN5test1C1Eii>: # test1 constructor
push   %rbp
mov    %rsp,%rbp
mov    %rdi,-0x8(%rbp)
mov    %esi,-0xc(%rbp)
mov    %edx,-0x10(%rbp)
lea    0x2ba5(%rip),%rdx
mov    -0x8(%rbp),%rax
mov    %rdx,(%rax)
mov    -0x8(%rbp),%rax
mov    -0xc(%rbp),%edx
mov    %edx,0x8(%rax)
mov    -0x8(%rbp),%rax
mov    -0x10(%rbp),%edx
mov    %edx,0xc(%rax)
pop    %rbp
ret

<_ZN5test14infoEv>: # test1::info()
push   %rbp
mov    %rsp,%rbp
sub    $0x10,%rsp
mov    %rdi,-0x8(%rbp)
mov    -0x8(%rbp),%rax
mov    0x8(%rax),%edx
mov    -0x8(%rbp),%rax
mov    0xc(%rax),%eax
cmp    %eax,%edx
jle    1239 <_ZN5test14infoEv+0x27>
mov    -0x8(%rbp),%rax
mov    0x8(%rax),%eax
jmp    1240 <_ZN5test14infoEv+0x2e>
mov    -0x8(%rbp),%rax
mov    0xc(%rax),%eax
mov    %eax,%esi
lea    0xdbb(%rip),%rdi
mov    $0x0,%eax
call   1030 <printf@plt>
leave
ret

<_ZN5test2C1Eii>: # test2 constructor
push   %rbp
mov    %rsp,%rbp
sub    $0x10,%rsp
mov    %rdi,-0x8(%rbp)
mov    %esi,-0xc(%rbp)
mov    %edx,-0x10(%rbp)
mov    -0x8(%rbp),%rax
mov    -0x10(%rbp),%edx
mov    -0xc(%rbp),%ecx
mov    %ecx,%esi
mov    %rax,%rdi
call   11de <_ZN5test1C1Eii>
lea    0x2afd(%rip),%rdx
mov    -0x8(%rbp),%rax
mov    %rdx,(%rax)
leave
ret

<_ZN5test24infoEv>: # test2::info()
push   %rbp
mov    %rsp,%rbp
sub    $0x10,%rsp
mov    %rdi,-0x8(%rbp)
mov    -0x8(%rbp),%rax
mov    0x8(%rax),%edx
mov    -0x8(%rbp),%rax
mov    0xc(%rax),%eax
cmp    %eax,%edx
jge    12b5 <_ZN5test24infoEv+0x27>
mov    -0x8(%rbp),%rax
mov    0x8(%rax),%eax
jmp    12bc <_ZN5test24infoEv+0x2e>
mov    -0x8(%rbp),%rax
mov    0xc(%rax),%eax
mov    %eax,%esi
lea    0xd47(%rip),%rdi
mov    $0x0,%eax
call   1030 <printf@plt>
leave
ret

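The assembly of main() above is commented in detail, so it should be possible to follow it from the comments alone.

Many C++ compilers implement virtual function calls with a virtual function table, and gcc is no exception in this example.

A virtual function table can be thought of as an array of function pointers. The compiler generates one table for every type that has virtual functions, and all instances of the same type share that table by storing the address of its first element.

/* The following is pseudocode that only illustrates how the virtual function table relates to the class. */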
class test1
{
    static const void *virtualFunctionTable[SIZE];
    int a;
    int b;
};

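This is only a rough description, which is why it is written as void *virtualFunctionTable[SIZE]; strictly speaking, that is quite imprecise.
Writing void (*virtualFunctionTable[SIZE])(); would not be any better. void* is more convenient in the first place, and it keeps readers from getting stuck on secondary questions such as whether a function pointer like "void (*func)(int *a,int b);" can be stored in void (*virtualFunctionTable[SIZE])(). Please keep in mind that this is merely pseudocode meant to make the virtual function table easier to understand.
A virtual call has to go through the virtual function table to locate the function, but apart from that, argument passing and value return follow the same rules as described earlier.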
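The mechanism can be imitated in plain C with an explicit table of function pointers. This is a sketch, not what g++ literally emits: all names are invented, the "classes" share one struct, and `info` returns the value instead of printing it so that the result is easy to check.

```c
#include <assert.h>
#include <stddef.h>

struct object;
typedef int (*info_fn)(const struct object *self);

/* One table per "class", shared by all of its instances. */
struct vtable {
    info_fn info;
};

struct object {
    const struct vtable *vptr;  /* stored first, like the hidden pointer at offset 0 */
    int a;
    int b;
};

static int base_info(const struct object *self)
{
    return self->a > self->b ? self->a : self->b;  /* test1::info(): max */
}

static int derived_info(const struct object *self)
{
    return self->a < self->b ? self->a : self->b;  /* test2::info(): min */
}

static const struct vtable base_vtable    = { base_info };
static const struct vtable derived_vtable = { derived_info };

/* The "constructors" store the table address into the object. */
struct object make_base(int a, int b)
{
    struct object o = { &base_vtable, a, b };
    return o;
}

struct object make_derived(int a, int b)
{
    struct object o = { &derived_vtable, a, b };
    return o;
}

/* Equivalent of p->info(): load vptr, load the slot, then call it with
 * the object pointer as the first argument. */
int call_info(const struct object *p)
{
    assert(p != NULL && p->vptr != NULL);
    return p->vptr->info(p);
}
```

`call_info` performs the same three steps as the `mov (%rax),%rax`, `mov (%rax),%rdx`, `call *%rdx` sequence in main(): two loads followed by an indirect call, with the object pointer in the first argument position.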

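References

¹. Duan Gang. 加密与解密 [M]. 4th ed. Beijing: Publishing House of Electronics Industry.

². Kip Irvine. Assembly Language for x86 Processors [M]. 7th ed. Chinese translation by He Lian. Beijing: China Machine Press.

³. Randal E. Bryant. Computer Systems: A Programmer's Perspective [M]. 3rd ed. Chinese translation by Gong Yili. Beijing: China Machine Press.

⁴. x64 calling convention [G/OL]. docs.microsoft.com. https://docs.microsoft.com/en-us/cpp/build/x64-calling-convention?view=msvc-160.

⁵. Wikipedia contributors. X86调用约定 [G/OL]. Wikipedia (Chinese). 2020(20200922)[2020-09-22]. https://zh.wikipedia.org/zh-hans/X86调用约定.

⁶. Wikipedia contributors. X86 calling conventions [G/OL]. Wikipedia. 2020(20200922)[2020-09-22]. https://en.wikipedia.org/wiki/X86_calling_conventions.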

Published 2021-08-08 | CyberspaceSecurity / Dev | 1 hour read (about 10249 characters)

ptmalloc2 Source Code Walkthrough

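WARNING

Everything in this article is the personal interpretation of Y7n05h and does not represent the views of the Glibc contributors.
Limited by Y7n05h's ability and experience, Y7n05h cannot guarantee that this article is correct.

INFO

All code in this article is licensed under LGPL 2.1 or later.
Everything described here is based on Glibc 2.33. Y7n05h reformatted malloc.c according to personal style preferences and added comments; the modified malloc.c is in the appendix of this article.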

malloc

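TIP

On a first read of the ptmalloc2 source, I suggest setting aside everything about special behavior in multithreaded environments, debugging information, and TCACHE. I will come back to the multithreading and TCACHE code once the overall flow of malloc is clear.
Debugging information is not covered in this article.
Multithreading and TCACHE are supplemented at the end of this article.

Leaving aside the details of compiling and linking, malloc() can be regarded as a call to void *__libc_malloc(size_t bytes).

In the source of __libc_malloc():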

void *(*hook)(size_t, const void *) = atomic_forced_read(__malloc_hook); // atomically read __malloc_hook and assign it to hook
if (__builtin_expect(hook != NULL, 0))
    return (*hook)(bytes, RETURN_ADDRESS(0));

__malloc_hook is a global variable, defined as:

void *weak_variable (*__malloc_hook)(size_t __size, const void *) = malloc_hook_ini;

static void *
malloc_hook_ini (size_t sz, const void *caller)
{
  __malloc_hook = NULL;
  ptmalloc_init ();
  return __libc_malloc (sz);
}

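When the program starts, __malloc_hook is initialized to malloc_hook_ini. On every call, __libc_malloc() atomically reads __malloc_hook into the local variable hook. On the first call hook is therefore malloc_hook_ini, which sets __malloc_hook to NULL, runs ptmalloc_init(), and calls __libc_malloc() again; after that, hook is NULL (unless the program installs its own hook) and the normal path is taken.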
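This self-removing hook is a general lazy-initialization pattern. A minimal sketch of it, with invented names (`toy_hook`, `toy_alloc`, `toy_init_calls`) and a placeholder standing in for the real allocation work and for ptmalloc_init():

```c
#include <assert.h>
#include <stddef.h>

int toy_init_calls;                 /* how often the one-time setup ran */

int toy_alloc(size_t n);
static int toy_hook_init(size_t n);

/* Starts out pointing at the initializer, like __malloc_hook. */
static int (*toy_hook)(size_t) = toy_hook_init;

static int toy_hook_init(size_t n)
{
    assert(toy_init_calls == 0);    /* the initializer must run only once */
    toy_hook = NULL;                /* unhook first, so the re-entry below takes the normal path */
    toy_init_calls += 1;            /* stands in for ptmalloc_init() */
    return toy_alloc(n);
}

int toy_alloc(size_t n)
{
    int (*hook)(size_t) = toy_hook; /* glibc uses an atomic read here */
    if (hook != NULL)
        return hook(n);
    return (int)n;                  /* placeholder for the real allocation work */
}
```

The first `toy_alloc` call goes through the hook and triggers the setup; every later call sees a NULL hook and goes straight to the normal path.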

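INFO

weak_variable is defined as:

#define weak_variable weak_function
# define weak_function __attribute__ ((weak))

__attribute__ ((weak)) is a GCC syntax extension that defines a symbol as a weak symbol.

This belongs to the topic of linking; interested readers can look up the relevant material.
If weak_variable is still unclear, simply ignore it; doing so does not affect the rest of the article.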

INFO

# define __glibc_unlikely(cond) __builtin_expect ((cond), 0)
# define __glibc_likely(cond) __builtin_expect ((cond), 1)

Both __builtin_expect ((cond), 0) and __builtin_expect ((cond), 1) are optimizations aimed at branch prediction.

__builtin_expect ((cond), 0) says that cond is false most of the time. It is usually written as __glibc_unlikely(cond).
__builtin_expect ((cond), 1) says that cond is true most of the time. It is usually written as __glibc_likely(cond).
Note: __builtin_expect ((cond), 0) == cond and __builtin_expect ((cond), 1) == cond. __builtin_expect does not change the value of cond!
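A small sketch of how such hints are typically used. `__builtin_expect` is a GCC/Clang builtin; the `toy_` macros and `digit_value` are invented for this example and merely mirror the shape of the glibc macros.

```c
#include <assert.h>

#define toy_likely(cond)   __builtin_expect((cond), 1)
#define toy_unlikely(cond) __builtin_expect((cond), 0)

/* Returns the numeric value of a decimal digit, or -1 for any other
 * character. The error path is marked as the rare one. */
int digit_value(char c)
{
    if (toy_unlikely(c < '0' || c > '9'))
        return -1;
    return c - '0';
}

/* The hint never changes the value of the expression. */
void expect_self_check(void)
{
    assert(__builtin_expect(5, 0) == 5);
    assert(__builtin_expect(0, 1) == 0);
}
```

The hint only influences how the compiler lays out the branches; a wrong hint costs performance, never correctness.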

if (SINGLE_THREAD_P) // single-threaded
{
    victim = TAG_NEW_USABLE(_int_malloc(&main_arena, bytes));
    assert(!victim || chunk_is_mmapped(mem2chunk(victim)) || &main_arena == arena_for_chunk(mem2chunk(victim)));
    return victim;
}

This code looks complicated but is actually simple. Its core is just:

if (SINGLE_THREAD_P) // single-threaded
{
    return _int_malloc(&main_arena, bytes);
}

Entering _int_malloc()

 /*
  Convert request size to internal form by adding SIZE_SZ bytes overhead plus possibly more to obtain necessary alignment and/or
  to obtain a size of at least MINSIZE, the smallest allocatable size.
  Also, checked_request2size returns false for request sizes that are so large that they wrap around zero when padded and aligned.
*/

 if (!checked_request2size(bytes, &nb)) // compute nb from bytes, the metadata size, and the alignment requirement
 {
     __set_errno(ENOMEM);
     return NULL;
 }

 /* There are no usable arenas.  Fall back to sysmalloc to get a chunk from mmap.  */
 if (__glibc_unlikely(av == NULL)) {
     void *p = sysmalloc(nb, av); // no arena available: request more memory from the system
     if (p != NULL)
         alloc_perturb(p, bytes);
     return p;
 }
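The size conversion described in the comment above can be sketched as follows. The constants are assumptions for a typical 64-bit build and are not read from glibc, and the function is a simplified stand-in for checked_request2size(), not a copy of it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values for a typical 64-bit build. */
enum {
    TOY_SIZE_SZ    = 8,   /* size of the per-chunk size field */
    TOY_ALIGN_MASK = 15,  /* chunks are aligned to 16 bytes */
    TOY_MINSIZE    = 32   /* smallest chunk that can be handed out */
};

/* Converts a request of `bytes` into the internal chunk size *nb.
 * Returns false when the request is too large to be padded safely. */
bool toy_checked_request2size(size_t bytes, size_t *nb)
{
    assert(nb != NULL);
    if (bytes > (size_t)PTRDIFF_MAX)
        return false;

    size_t padded = bytes + TOY_SIZE_SZ + TOY_ALIGN_MASK;
    if (padded < (size_t)TOY_MINSIZE)
        *nb = TOY_MINSIZE;
    else
        *nb = padded & ~(size_t)TOY_ALIGN_MASK;  /* round down after adding the mask */
    return true;
}
```

With these constants, requests of 0 to 24 bytes all map to a 32-byte chunk, and a 25-byte request is the first one that needs 48 bytes.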

`fastbin` Check

#define REMOVE_FB(fb, victim, pp)                                          \
    do {                                                                   \
        victim = pp;                                                       \
        if (victim == NULL)                                                \
            break;                                                         \
        pp = REVEAL_PTR(victim->fd);                                       \
        if (__glibc_unlikely(pp != NULL && misaligned_chunk(pp)))          \
            malloc_printerr("malloc(): unaligned fastbin chunk detected"); \
    } while ((pp = catomic_compare_and_exchange_val_acq(fb, pp, victim)) != victim);

    if ((unsigned long) (nb) <= (unsigned long) (get_max_fast())) // get_max_fast() defaults to DEFAULT_MXFAST
    {
        // nb is aligned, so nb <= get_max_fast() implies that nb is a valid fastbin size; there is no nb within get_max_fast() that lacks a matching fastbin_index.
        idx = fastbin_index(nb);
        mfastbinptr *fb = &fastbin(av, idx);
        mchunkptr pp;
        victim = *fb;

        if (victim != NULL) { // the matching fastbin is not empty
            if (__glibc_unlikely(misaligned_chunk(victim)))
                malloc_printerr("malloc(): unaligned fastbin chunk detected 2");

            if (SINGLE_THREAD_P)             // single-threaded
                *fb = REVEAL_PTR(victim->fd);// remove the first node. REVEAL_PTR exists for security, to resist malicious attacks.
            else                             // multithreaded
                REMOVE_FB(fb, pp, victim);
            if (__glibc_likely(victim != NULL)) {
                size_t victim_idx = fastbin_index(chunksize(victim)); // computed only for the check on the next line
                if (__builtin_expect(victim_idx != idx, 0))
                    malloc_printerr("malloc(): memory corruption (fast)");
                check_remalloced_chunk(av, victim, nb); // an assert that belongs to malloc_debug
#if USE_TCACHE
                /* While we're here, if we see other chunks of the same size, stash them in the tcache.  */
                size_t tc_idx = csize2tidx(nb);
                if (tcache && tc_idx < mp_.tcache_bins) {
                    mchunkptr tc_victim;

                    /* While bin not empty and tcache not full, copy chunks.  */
                    while (tcache->counts[tc_idx] < mp_.tcache_count && (tc_victim = *fb) != NULL) {
                        if (__glibc_unlikely(misaligned_chunk(tc_victim)))
                            malloc_printerr("malloc(): unaligned fastbin chunk detected 3");
                        if (SINGLE_THREAD_P)
                            *fb = REVEAL_PTR(tc_victim->fd);
                        else {
                            REMOVE_FB(fb, pp, tc_victim);
                            if (__glibc_unlikely(tc_victim == NULL))
                                break;
                        }
                        tcache_put(tc_victim, tc_idx);
                    }
                }
#endif
                void *p = chunk2mem(victim);
                alloc_perturb(p, bytes);
                return p;
            }
        } // the matching fastbin is empty
    }

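For now, only the following matters:

/* parts not of interest yet have been removed */
if ((unsigned long) (nb) <= (unsigned long) (get_max_fast())) // get_max_fast() defaults to DEFAULT_MXFAST
{
    // nb is aligned, so nb <= get_max_fast() implies that nb is a valid fastbin size; there is no nb within get_max_fast() that lacks a matching fastbin_index.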
    idx = fastbin_index(nb);
    mfastbinptr *fb = &fastbin(av, idx);
    mchunkptr pp;
    victim = *fb;

    if (victim != NULL) { // the matching fastbin is not empty
        if (__glibc_unlikely(misaligned_chunk(victim)))
            malloc_printerr("malloc(): unaligned fastbin chunk detected 2");

        if (SINGLE_THREAD_P)             // single-threaded
            *fb = REVEAL_PTR(victim->fd);// remove the first node. REVEAL_PTR exists for security, to resist malicious attacks.

        if (__glibc_likely(victim != NULL)) {
            size_t victim_idx = fastbin_index(chunksize(victim)); // computed only for the check on the next line
            if (__builtin_expect(victim_idx != idx, 0))
                malloc_printerr("malloc(): memory corruption (fast)");
            check_remalloced_chunk(av, victim, nb); // an assert that belongs to malloc_debug

            void *p = chunk2mem(victim);
            alloc_perturb(p, bytes);
            return p;
        }
    } // the matching fastbin is empty
}

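If nb is within the fastbin range, the fastbin index idx for nb is computed. idx is then used to fetch the corresponding list from av (fb is a pointer to the list's head pointer, and victim is its first node).

If victim == NULL, the list is empty and the fastbin check is done.
If victim != NULL, victim is removed from fb and the corresponding pointer is returned; _int_malloc() returns.

`smallbin` Check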
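The single-threaded path boils down to taking the first node of a singly linked list. A minimal sketch with invented names follows; it leaves out the REVEAL_PTR pointer protection, the alignment and index checks, and the atomic multithreaded path.

```c
#include <assert.h>
#include <stddef.h>

struct toy_chunk {
    struct toy_chunk *fd;  /* next free chunk in the same bin */
};

/* Put a freed chunk at the head of the bin. */
void toy_fastbin_push(struct toy_chunk **fb, struct toy_chunk *chunk)
{
    assert(fb != NULL && chunk != NULL);
    chunk->fd = *fb;
    *fb = chunk;
}

/* Take the first chunk, mirroring `victim = *fb; *fb = victim->fd;`.
 * Returns NULL when the bin is empty. */
struct toy_chunk *toy_fastbin_pop(struct toy_chunk **fb)
{
    assert(fb != NULL);
    struct toy_chunk *victim = *fb;
    if (victim != NULL)
        *fb = victim->fd;
    return victim;
}
```

Because chunks are added and removed at the same end, the bin behaves as a LIFO: the chunk freed most recently is the first one handed out again.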

    /*
     If a small request, check regular bin.
     Since these "smallbins" hold one size each, no searching within bins is necessary.
     (For a large request, we need to wait until unsorted chunks are processed to find best fit.
     But for small ones, fits are exact anyway, so we can check now, which is faster.)
   */

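    if (in_smallbin_range(nb)) { // more precisely: nb is below the smallest size served by a LargeBin

        // note: a request that could not be served by the fastbin ends up here.
        idx = smallbin_index(nb);
        bin = bin_at(av, idx);

        // remember that FastBin and SmallBin sizes overlap, so a request that failed in the FastBin may still be served from a SmallBin
        if ((victim = last(bin)) != bin) // a smallbin is a doubly linked list; last(bin) != bin means the bin is not empty.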
        {
            bck = victim->bk;
            if (__glibc_unlikely(bck->fd != victim))
                malloc_printerr("malloc(): smallbin double linked list corrupted");
            set_inuse_bit_at_offset(victim, nb);
            bin->bk = bck;
            bck->fd = bin;

            if (av != &main_arena)
                set_non_main_arena(victim);
            check_malloced_chunk(av, victim, nb);
#if USE_TCACHE
            /* While we're here, if we see other chunks of the same size, stash them in the tcache.  */
            size_t tc_idx = csize2tidx(nb);
            if (tcache && tc_idx < mp_.tcache_bins) {
                mchunkptr tc_victim;

                /* While bin not empty and tcache not full, copy chunks over.  */
                while (tcache->counts[tc_idx] < mp_.tcache_count && (tc_victim = last(bin)) != bin) {
                    if (tc_victim != 0) {
                        bck = tc_victim->bk;
                        set_inuse_bit_at_offset(tc_victim, nb);
                        if (av != &main_arena)
                            set_non_main_arena(tc_victim);
                        bin->bk = bck;
                        bck->fd = bin;

                        tcache_put(tc_victim, tc_idx);
                    }
                }
            }
#endif
            void *p = chunk2mem(victim);
            alloc_perturb(p, bytes);
            return p;
        }
    }

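If nb is within the SmallBin range (that is, below the minimum LargeBin size), SmallBin is tried:

idx is computed from nb, and idx locates the list's header node bin.
- If (victim = last(bin)) == bin, the list is empty, SmallBin cannot satisfy nb, and the SmallBin check ends.
- If (victim = last(bin)) != bin, the list is not empty: the tail node is removed from the list and its user pointer is returned.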
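
The tail removal can be sketched with a small circular doubly linked list that has a sentinel header. The names `toy_node`, `toy_bin_init`, `toy_bin_push_front` and `toy_bin_pop_back` are hypothetical; the sketch returns NULL where glibc would abort through malloc_printerr().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node type with the two links a smallbin uses. */
struct toy_node {
    struct toy_node *fd; /* forward link */
    struct toy_node *bk; /* backward link */
};

/* An empty circular list is a header that points at itself. */
void toy_bin_init(struct toy_node *bin) {
    bin->fd = bin->bk = bin;
}

/* Insert right after the header. */
void toy_bin_push_front(struct toy_node *bin, struct toy_node *n) {
    n->fd = bin->fd;
    n->bk = bin;
    bin->fd->bk = n;
    bin->fd = n;
}

/* Remove and return the tail (bin->bk), or NULL if the bin is empty.
   Also returns NULL when the back link is inconsistent. */
struct toy_node *toy_bin_pop_back(struct toy_node *bin) {
    assert(bin != NULL);
    struct toy_node *victim = bin->bk;
    if (victim == bin)
        return NULL; /* only the header is left */
    struct toy_node *bck = victim->bk;
    if (bck->fd != victim)
        return NULL; /* corrupted list */
    bin->bk = bck;
    bck->fd = bin;
    return victim;
}
```

Inserting at the front and removing from the back makes the list FIFO, in contrast to the LIFO behaviour of a fastbin.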

TIP

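Comparing FastBin and SmallBin

FastBin keeps free chunks in singly linked lists.
SmallBin keeps free chunks in circular doubly linked lists.

FastBin manages each list through a pointer to the pointer to the first node.

mfastbinptr *fb = &fastbin(av, idx);

When *fb == NULL, the list is empty.
When *fb != NULL, the list is not empty and *fb points to the first free chunk.

SmallBin manages each list through a pointer to the header node.

bin = bin_at(av, idx);// bin points to the header node

bin points to the list's header node. The header is not a usable free chunk; it is a sentinel added only to make the circular doubly linked list easier to manage.

When bin->bk == bin, the list is empty (there is no node other than the header).
When bin->bk != bin, the list is not empty.

Within FastBin and SmallBin, all chunks on the same list have the same size.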

`UnsortBin` check

Reaching this point means that neither FastBin nor SmallBin could satisfy the request.

        int iters = 0;// iteration counter for the while loop below

        while ((victim = unsorted_chunks(av)->bk) != unsorted_chunks(av)) {// loop while UnsortBin is not empty

            bck = victim->bk;// victim is the tail node of UnsortBin; bck is the second-to-last node
            size = chunksize(victim);
            mchunkptr next = chunk_at_offset(victim, size);

            /* sanity checks */
            if (__glibc_unlikely(size <= CHUNK_HDR_SZ) || __glibc_unlikely(size > av->system_mem))
                malloc_printerr("malloc(): invalid size (unsorted)");
            if (__glibc_unlikely(chunksize_nomask(next) < CHUNK_HDR_SZ) || __glibc_unlikely(chunksize_nomask(next) > av->system_mem))
                malloc_printerr("malloc(): invalid next size (unsorted)");
            if (__glibc_unlikely((prev_size(next) & ~(SIZE_BITS)) != size))
                malloc_printerr("malloc(): mismatching next->prev_size (unsorted)");
            if (__glibc_unlikely(bck->fd != victim) || __glibc_unlikely(victim->fd != unsorted_chunks(av)))
                malloc_printerr("malloc(): unsorted double linked list corrupted");
            if (__glibc_unlikely(prev_inuse(next)))
                malloc_printerr("malloc(): invalid next->prev_inuse (unsorted)");


            /* parts omitted */

#define MAX_ITERS 10000
            if (++iters >= MAX_ITERS)// iters counts iterations, so that an overly long UnsortBin cannot stall the allocation
                break;
        }

The sanity checks detect promptly when these data structures have been corrupted by accident or tampered with on purpose.

/*
 If a small request, try to use last remainder if it is the only chunk in unsorted bin.
 This helps promote locality for runs of consecutive small requests.
 This is the only exception to best-fit, and applies only when there is no exact fit for a small chunk.
*/


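if (in_smallbin_range(nb) &&                               // the request is in SmallBin range
    bck == unsorted_chunks(av) &&                          // the second-to-last node is the header, i.e. UnsortBin holds exactly one chunk (see the English comment above)
    victim == av->last_remainder &&                        // it is the remainder of the most recent split
    (unsigned long) (size) > (unsigned long) (nb + MINSIZE)// the chunk exceeds the request and the leftover can form a new chunk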
) {
    /* split and reattach remainder */
    remainder_size = size - nb;             // size of the remaining chunk
    remainder = chunk_at_offset(victim, nb);// pointer to the remaining chunk
    unsorted_chunks(av)->bk = unsorted_chunks(av)->fd = remainder;
    av->last_remainder = remainder;
    remainder->bk = remainder->fd = unsorted_chunks(av);
    if (!in_smallbin_range(remainder_size)) {// if the remainder is outside SmallBin range, clear fd_nextsize and bk_nextsize
        remainder->fd_nextsize = NULL;
        remainder->bk_nextsize = NULL;
    }

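    set_head(victim, nb | PREV_INUSE | (av != &main_arena ? NON_MAIN_ARENA : 0));// ??? why set PREV_INUSE here? Presumably because set_head() rewrites the whole size field, and the chunk before a free chunk is always in use (adjacent free chunks get coalesced)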
    set_head(remainder, remainder_size | PREV_INUSE);
    set_foot(remainder, remainder_size);

    check_malloced_chunk(av, victim, nb);// an assert, active only with MALLOC_DEBUG
    void *p = chunk2mem(victim);
    alloc_perturb(p, bytes);
    return p;
}

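When the only chunk left in UnsortBin is the remainder of the previous split (and that chunk is larger than nb + MINSIZE, so it covers nb and still leaves enough for a new chunk), that chunk is split: the front part is returned, and the leftover stays in UnsortBin as its only chunk and is recorded as the new av->last_remainder.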

/* remove from unsorted list */
if (__glibc_unlikely(bck->fd != victim))// bck is the node behind victim (victim->bk), so bck->fd must be victim
    malloc_printerr("malloc(): corrupted unsorted chunks 3");
unsorted_chunks(av)->bk = bck;// remove victim from UnsortBin
bck->fd = unsorted_chunks(av);

The current node victim is removed from UnsortBin.

            /* Take now instead of binning if exact fit */

            if (size == nb) {
                set_inuse_bit_at_offset(victim, size);
                if (av != &main_arena)
                    set_non_main_arena(victim);
#if USE_TCACHE
                /* Fill cache first, return to user only if cache fills. We may return one of these chunks later.  */
                if (tcache_nb && tcache->counts[tc_idx] < mp_.tcache_count) {
                    tcache_put(victim, tc_idx);
                    return_cached = 1;
                    continue;
                } else {
#endif
                    check_malloced_chunk(av, victim, nb);
                    void *p = chunk2mem(victim);
                    alloc_perturb(p, bytes);
                    return p;
#if USE_TCACHE
                }
#endif
            }

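If the size of the current node victim equals nb, the user pointer of that chunk is returned and the allocation ends (the tcache branch in this snippet is covered later).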

/* size != bin */
/* place chunk in bin */

if (in_smallbin_range(size)) {
    victim_index = smallbin_index(size);
    bck = bin_at(av, victim_index);
    fwd = bck->fd;
} else {
    victim_index = largebin_index(size);
    bck = bin_at(av, victim_index);
    fwd = bck->fd;

    /* maintain large bins in sorted order */
    if (fwd != bck) {// the LargeBin is not empty
        /* Or with inuse bit to speed comparisons */
        size |= PREV_INUSE;
        /* if smaller than smallest, bypass loop below */
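        assert(chunk_main_arena(bck->bk));// ??? presumably a sanity check: the raw size comparison below assumes the NON_MAIN_ARENA bit is clear
        if ((unsigned long) (size) < (unsigned long) chunksize_nomask(bck->bk))
        // bck->bk is the smallest chunk, so a LargeBin is kept in non-increasing size order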
        {
            fwd = bck;
            bck = bck->bk;
            // the fd_nextsize / bk_nextsize links do not include the header node
            victim->fd_nextsize = fwd->fd;
            victim->bk_nextsize = fwd->fd->bk_nextsize;
            fwd->fd->bk_nextsize = victim->bk_nextsize->fd_nextsize = victim;
        } else {
            assert(chunk_main_arena(fwd));
            while ((unsigned long) size < chunksize_nomask(fwd)) {
                fwd = fwd->fd_nextsize;
                assert(chunk_main_arena(fwd));
            }

            if ((unsigned long) size == (unsigned long) chunksize_nomask(fwd))
                /* Always insert in the second position.  */
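                fwd = fwd->fd;// ??? presumably so that the first chunk of this size keeps its fd_nextsize / bk_nextsize links untouched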
            else {
                victim->fd_nextsize = fwd;
                victim->bk_nextsize = fwd->bk_nextsize;
                if (__glibc_unlikely(fwd->bk_nextsize->fd_nextsize != fwd))
                    malloc_printerr("malloc(): largebin double linked list corrupted (nextsize)");
                fwd->bk_nextsize = victim;
                victim->bk_nextsize->fd_nextsize = victim;
            }
            bck = fwd->bk;
            if (bck->fd != fwd)
                malloc_printerr("malloc(): largebin double linked list corrupted (bk)");
        }
    } else
        // the LargeBin is empty
        victim->fd_nextsize = victim->bk_nextsize = victim;
}

mark_bin(av, victim_index);
victim->bk = bck;
victim->fd = fwd;
fwd->bk = victim;
bck->fd = victim;

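The way fwd and bck are set may make this code harder to read. Every branch has to insert victim into a list, so to share as much code as possible each case only sets fwd and bck appropriately, and the insertion itself is done once, at the end, using those two pointers. (When victim goes into a LargeBin, each branch must additionally set fd_nextsize and bk_nextsize.)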
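
The "choose the neighbours first, splice once" pattern can be shown on a simplified list. This is only a sketch under assumptions: `toy_chunk`, `toy_bin_init` and `toy_sorted_insert` are hypothetical names, the fd_nextsize / bk_nextsize skip list is left out, and a chunk of equal size is placed in front of the existing ones rather than in second position as glibc does.

```c
#include <assert.h>
#include <stddef.h>

struct toy_chunk {
    size_t size;
    struct toy_chunk *fd;
    struct toy_chunk *bk;
};

void toy_bin_init(struct toy_chunk *bin) {
    bin->size = 0;
    bin->fd = bin->bk = bin;
}

/* Insert victim so that sizes stay in non-increasing order starting at bin->fd.
   Each branch only decides which neighbours (bck, fwd) victim goes between;
   the splice itself is written once at the end. */
void toy_sorted_insert(struct toy_chunk *bin, struct toy_chunk *victim) {
    struct toy_chunk *bck, *fwd;
    assert(bin != NULL && victim != NULL);

    if (bin->fd == bin) {
        bck = bin;                 /* empty list */
        fwd = bin;
    } else if (victim->size < bin->bk->size) {
        fwd = bin;                 /* smaller than the smallest: goes to the tail */
        bck = bin->bk;
    } else {
        fwd = bin->fd;             /* walk from the largest towards smaller chunks */
        while (fwd != bin && victim->size < fwd->size)
            fwd = fwd->fd;
        bck = fwd->bk;
    }

    /* shared insertion between bck and fwd */
    victim->bk = bck;
    victim->fd = fwd;
    fwd->bk = victim;
    bck->fd = victim;
}
```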

`LargeBin` check

 /*
  If a large request, scan through the chunks of current bin in sorted order to find smallest that fits.  Use the skip list for this.
*/

 // a large request
 if (!in_smallbin_range(nb)) {
     bin = bin_at(av, idx);

     /* skip scan if empty or largest chunk is too small */
     if ((victim = first(bin)) != bin && (unsigned long) chunksize_nomask(victim) >= (unsigned long) (nb)) {
         victim = victim->bk_nextsize;// victim now points to the smallest chunk
         while (((unsigned long) (size = chunksize(victim)) < (unsigned long) (nb)))
             victim = victim->bk_nextsize;

         // two chunks of the same size: take the second one (the English comment below gives the reason)
         /* Avoid removing the first entry for a size so that the skip list does not have to be rerouted.  */
         if (victim != last(bin) && chunksize_nomask(victim) == chunksize_nomask(victim->fd))
             victim = victim->fd;

         remainder_size = size - nb;
         unlink_chunk(av, victim);

         /* Exhaust */
         if (remainder_size < MINSIZE) {// the leftover cannot form a new chunk, so do not split
             set_inuse_bit_at_offset(victim, size);
             if (av != &main_arena)
                 set_non_main_arena(victim);
         }
         /* Split */
         else {// the leftover can form a new chunk
             remainder = chunk_at_offset(victim, nb);

             // UnsortBin is scanned at most MAX_ITERS times, so it is not guaranteed to be empty
             /* We cannot assume the unsorted list is empty and therefore have to perform a complete insert here.  */
             bck = unsorted_chunks(av);
             fwd = bck->fd;
             if (__glibc_unlikely(fwd->bk != bck))
                 malloc_printerr("malloc(): corrupted unsorted chunks");
             remainder->bk = bck;// insert the remainder right after the UnsortBin header
             remainder->fd = fwd;
             bck->fd = remainder;
             fwd->bk = remainder;
             if (!in_smallbin_range(remainder_size)) {// if the remainder is not in SmallBin range, set fd_nextsize and bk_nextsize to NULL
                 remainder->fd_nextsize = NULL;
                 remainder->bk_nextsize = NULL;
             }
             set_head(victim, nb | PREV_INUSE | (av != &main_arena ? NON_MAIN_ARENA : 0));
             set_head(remainder, remainder_size | PREV_INUSE);
             set_foot(remainder, remainder_size);
         }
         check_malloced_chunk(av, victim, nb);
         void *p = chunk2mem(victim);
         alloc_perturb(p, bytes);
         return p;
     }
 }

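If nb is outside the SmallBin range, the best-fitting chunk (the smallest chunk whose size is at least nb) is searched for in the LargeBin.
If that chunk is at least nb + MINSIZE in size, it is split and the leftover goes into UnsortBin. The user pointer of the split-off chunk is then returned and the allocation ends.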
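
The best-fit search can be sketched on a plain array instead of the real skip list. `toy_best_fit` is a hypothetical helper; the array stands in for the distinct sizes of a LargeBin, read from the largest to the smallest.

```c
#include <assert.h>
#include <stddef.h>

/* sizes[] is sorted in non-increasing order, like a LargeBin read from bin->fd.
   Returns the index of the smallest entry that is >= nb, or -1 if none fits. */
long toy_best_fit(const size_t *sizes, size_t n, size_t nb) {
    assert(sizes != NULL || n == 0);
    if (n == 0 || sizes[0] < nb)
        return -1;    /* empty, or even the largest chunk is too small */
    size_t i = n - 1; /* start at the smallest chunk */
    while (sizes[i] < nb)
        --i;          /* walk towards larger chunks; stops at index 0 at the latest */
    return (long) i;
}
```

As in the glibc code, the early test against the largest chunk guarantees that the loop terminates.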

Other cases

 /*
  Search for a chunk by scanning bins, starting with next largest bin.
  This search is strictly by best-fit; i.e., the smallest (with ties going to approximately the least recently used) chunk that fits is selected.

  The bitmap avoids needing to check that most blocks are nonempty.
  The particular case of skipping all bins during warm-up phases when no chunks have been returned yet is faster than it might look.
*/

 ++idx;
 bin = bin_at(av, idx);
 block = idx2block(idx);
 map = av->binmap[block];
 bit = idx2bit(idx);

At this point I was puzzled: what is the value of idx? Let us look back at the earlier code.

 if (in_smallbin_range(nb)) {// more precisely: nb is below the minimum LargeBin size

     // a request that could not be served from a fastbin ends up here
     idx = smallbin_index(nb);
     bin = bin_at(av, idx);// bin points to the header node

     /* parts omitted */
 }

 /*
  If this is a large request, consolidate fastbins before continuing.
  While it might look excessive to kill all fastbins before
  even seeing if there is space available, this avoids
  fragmentation problems normally associated with fastbins.
  Also, in practice, programs tend to have runs of either small or
  large requests, but less often mixtures, so consolidation is not
  invoked all that often in most programs. And the programs that
  it is called frequently in otherwise tend to fragment.
*/

 else {
     // requests that failed in FastBin or SmallBin never reach this branch
     idx = largebin_index(nb);
     if (atomic_load_relaxed(&av->have_fastchunks))
         malloc_consolidate(av);
 }

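As we can see:

If nb is in the SmallBin range, idx is the SmallBin index for nb.
If nb is in the LargeBin range, idx is the LargeBin index for nb.

Since the bin at the current idx could not satisfy the request, the search moves on to the next idx.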

++idx;
bin = bin_at(av, idx);

So what does idx2block(idx) mean?

/* Conservatively use 32 bits per map word, even if on 64bit system */
#define BINMAPSHIFT 5
#define BITSPERMAP (1U << BINMAPSHIFT)
#define BINMAPSIZE (NBINS / BITSPERMAP)

#define idx2block(i) ((i) >> BINMAPSHIFT)
#define idx2bit(i) ((1U << ((i) & ((1U << BINMAPSHIFT) - 1))))

#define mark_bin(m, i) ((m)->binmap[idx2block(i)] |= idx2bit(i))
#define unmark_bin(m, i) ((m)->binmap[idx2block(i)] &= ~(idx2bit(i)))
#define get_binmap(m, i) ((m)->binmap[idx2block(i)] & idx2bit(i))

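idx2block() converts idx into the index of the corresponding word in binmap.
Why does idx2block(i) shift i right by 5 bits? Look at the English comment in this code, then at the definition of binmap: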

/* Bitmap of bins */
unsigned int binmap[BINMAPSIZE];

At this point it all became clear to me:

binmap is an array of unsigned int.
An unsigned int is 32 bits wide (on the platforms considered here).
32 is $2^5$.
i is unsigned, so i >> 5 equals $\lfloor i / 2^5 \rfloor$.
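
The same arithmetic can be tried out in isolation. The sketch below re-implements the macros as small functions with a `toy_` prefix (hypothetical names); the value 128 for the number of bins is an assumption taken from glibc's NBINS and does not appear in the excerpt above.

```c
#include <assert.h>

#define TOY_NBINS 128 /* assumed: glibc's NBINS */
#define TOY_BINMAPSHIFT 5
#define TOY_BITSPERMAP (1U << TOY_BINMAPSHIFT)
#define TOY_BINMAPSIZE (TOY_NBINS / TOY_BITSPERMAP)

/* Which 32-bit word of the bitmap holds the flag for bin i. */
unsigned int toy_idx2block(unsigned int i) {
    return i >> TOY_BINMAPSHIFT;
}

/* Which bit inside that word belongs to bin i. */
unsigned int toy_idx2bit(unsigned int i) {
    return 1U << (i & (TOY_BITSPERMAP - 1));
}

void toy_mark_bin(unsigned int *binmap, unsigned int i) {
    assert(i < TOY_NBINS);
    binmap[toy_idx2block(i)] |= toy_idx2bit(i);
}

void toy_unmark_bin(unsigned int *binmap, unsigned int i) {
    assert(i < TOY_NBINS);
    binmap[toy_idx2block(i)] &= ~toy_idx2bit(i);
}

/* Returns 1 if bin i is marked as possibly non-empty, 0 otherwise. */
int toy_get_binmap(const unsigned int *binmap, unsigned int i) {
    assert(i < TOY_NBINS);
    return (binmap[toy_idx2block(i)] & toy_idx2bit(i)) != 0;
}
```

For example, bin 70 lives in word 70 >> 5 = 2, at bit position 70 & 31 = 6.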

for (;;) {
    /* Skip rest of block if there are no more set bits in this block.  */
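    if (bit > map || bit == 0) {// bit > map: no bin at or above the current one is marked in this word; bit == 0: bit was shifted out of the word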
        do {
            if (++block >= BINMAPSIZE) /* out of bins */
                goto use_top;
        } while ((map = av->binmap[block]) == 0);

        bin = bin_at(av, (block << BINMAPSHIFT));
        bit = 1;
    }

    /* Advance to bin with set bit. There must be one. */
    while ((bit & map) == 0) {
        bin = next_bin(bin);
        bit <<= 1;
        assert(bit != 0);
    }

    /* Inspect the bin. It is likely to be non-empty */
    victim = last(bin);

    /*  If a false alarm (empty bin), clear the bit. */
    if (victim == bin) {
        av->binmap[block] = map &= ~bit; /* Write through */
        bin = next_bin(bin);
        bit <<= 1;
    }

    else {
        size = chunksize(victim);

        /*  We know the first chunk in this bin is big enough to use. */
        assert((unsigned long) (size) >= (unsigned long) (nb));

        remainder_size = size - nb;

        /* unlink */
        unlink_chunk(av, victim);

        /* Exhaust */
        if (remainder_size < MINSIZE) {
            set_inuse_bit_at_offset(victim, size);
            if (av != &main_arena)
                set_non_main_arena(victim);
        }

        /* Split */
        else {
            remainder = chunk_at_offset(victim, nb);

            /* We cannot assume the unsorted list is empty and therefore have to perform a complete insert here.  */
            bck = unsorted_chunks(av);
            fwd = bck->fd;
            if (__glibc_unlikely(fwd->bk != bck))
                malloc_printerr("malloc(): corrupted unsorted chunks 2");
            remainder->bk = bck;
            remainder->fd = fwd;
            bck->fd = remainder;
            fwd->bk = remainder;

            /* advertise as last remainder */
            if (in_smallbin_range(nb))
                av->last_remainder = remainder;
            if (!in_smallbin_range(remainder_size)) {
                remainder->fd_nextsize = NULL;
                remainder->bk_nextsize = NULL;
            }
            set_head(victim, nb | PREV_INUSE |
                                     (av != &main_arena ? NON_MAIN_ARENA : 0));
            set_head(remainder, remainder_size | PREV_INUSE);
            set_foot(remainder, remainder_size);
        }
        check_malloced_chunk(av, victim, nb);
        void *p = chunk2mem(victim);
        alloc_perturb(p, bytes);
        return p;
    }
}

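The same pattern again:
once a chunk has been chosen, its size is checked.

If the chunk is at least nb + MINSIZE, it is split; the leftover forms a new chunk and goes into UnsortBin. The user pointer of the split-off chunk is returned and the allocation ends.
If the chunk is smaller than nb + MINSIZE, the whole chunk is handed out; its user pointer is returned and the allocation ends.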
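
The split-or-exhaust decision comes down to one comparison. A minimal sketch, assuming MINSIZE is 32 bytes (its usual value on 64-bit builds); `toy_split` and `toy_split_or_exhaust` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MINSIZE 32 /* assumed: MINSIZE on a typical 64-bit build */

struct toy_split {
    size_t victim_size;    /* size of the chunk handed to the caller */
    size_t remainder_size; /* 0 when the chunk is used whole */
};

/* Decide how a free chunk of `size` bytes serves a request of `nb` bytes. */
struct toy_split toy_split_or_exhaust(size_t size, size_t nb) {
    struct toy_split r;
    assert(size >= nb);
    size_t remainder_size = size - nb;
    if (remainder_size < TOY_MINSIZE) {
        /* Exhaust: the leftover could not hold a chunk header, keep it attached. */
        r.victim_size = size;
        r.remainder_size = 0;
    } else {
        /* Split: the leftover becomes a new free chunk. */
        r.victim_size = nb;
        r.remainder_size = remainder_size;
    }
    return r;
}
```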

Using the top chunk

use_top:
    /*
     If large enough, split off the chunk bordering the end of memory (held in av->top). Note that this is in accord with the best-fit search rule.
     In effect, av->top is treated as larger (and thus less well fitting) than any other available chunk since it can be extended to be as large as necessary (up to system limitations).
     We require that av->top always exists (i.e., has size >= MINSIZE) after initialization, so if it would otherwise be exhausted by current request, it is replenished.
     (The main reason for ensuring it exists is that we may need MINSIZE space to put in fenceposts in sysmalloc.)
   */

    victim = av->top;
    size = chunksize(victim);

    if (__glibc_unlikely(size > av->system_mem))
        malloc_printerr("malloc(): corrupted top size");

    if ((unsigned long) (size) >= (unsigned long) (nb + MINSIZE)) {
        remainder_size = size - nb;
        remainder = chunk_at_offset(victim, nb);
        av->top = remainder;
        set_head(victim, nb | PREV_INUSE |
                                 (av != &main_arena ? NON_MAIN_ARENA : 0));
        set_head(remainder, remainder_size | PREV_INUSE);

        check_malloced_chunk(av, victim, nb);
        void *p = chunk2mem(victim);
        alloc_perturb(p, bytes);
        return p;
    }

    /* When we are using atomic ops to free fast chunks we can get here for all block sizes.  */
    else if (atomic_load_relaxed(&av->have_fastchunks)) {
        malloc_consolidate(av);
        /* restore original bin index */
        if (in_smallbin_range(nb))
            idx = smallbin_index(nb);
        else
            idx = largebin_index(nb);
    }

    /* Otherwise, relay to handle system-dependent cases */
    else {
        void *p = sysmalloc(nb, av);
        if (p != NULL)
            alloc_perturb(p, bytes);
        return p;
    }

free

At last we reach free().

void __libc_free(void *mem) {
    mstate ar_ptr;
    mchunkptr p; /* chunk corresponding to mem */

    void (*hook)(void *, const void *) = atomic_forced_read(__free_hook);
    if (__builtin_expect(hook != NULL, 0)) {
        (*hook)(mem, RETURN_ADDRESS(0));
        return;
    }

    if (mem == 0) /* free(0) has no effect */
        return;

#ifdef USE_MTAG
    /* Quickly check that the freed pointer matches the tag for the memory. This gives a useful double-free detection.  */
    *(volatile char *) mem;
#endif

    int err = errno;

    p = mem2chunk(mem);

    /* Mark the chunk as belonging to the library again.  */
    (void) TAG_REGION(chunk2rawmem(p), CHUNK_AVAILABLE_SIZE(p) - CHUNK_HDR_SZ);

    if (chunk_is_mmapped(p)) /* release mmapped memory. */
    {
        /* See if the dynamic brk/mmap threshold needs adjusting. Dumped fake mmapped chunks do not affect the threshold.  */
        if (!mp_.no_dyn_threshold && chunksize_nomask(p) > mp_.mmap_threshold && chunksize_nomask(p) <= DEFAULT_MMAP_THRESHOLD_MAX && !DUMPED_MAIN_ARENA_CHUNK(p)) {
            mp_.mmap_threshold = chunksize(p);
            mp_.trim_threshold = 2 * mp_.mmap_threshold;
            LIBC_PROBE(memory_mallopt_free_dyn_thresholds, 2, mp_.mmap_threshold, mp_.trim_threshold);
        }
        munmap_chunk(p);
    } else {
        MAYBE_INIT_TCACHE();

        ar_ptr = arena_for_chunk(p);
        _int_free(ar_ptr, p, 0);
    }

    __set_errno(err);
}

Consider this snippet:

void (*hook)(void *, const void *) = atomic_forced_read(__free_hook);
if (__builtin_expect(hook != NULL, 0)) {
    (*hook)(mem, RETURN_ADDRESS(0));
    return;
}

This flow looks very familiar. In the source of __libc_malloc():

void *(*hook)(size_t, const void *) = atomic_forced_read(__malloc_hook);// atomically read __malloc_hook and store it in hook
if (__builtin_expect(hook != NULL, 0))
    return (*hook)(bytes, RETURN_ADDRESS(0));

Quite similar, isn't it? I thought so too. Here is the definition of __free_hook:

void weak_variable (*__free_hook)(void *__ptr, const void *) = NULL;

This is where a difference shows up:

void *weak_variable (*__malloc_hook)(size_t __size, const void *) = malloc_hook_ini;

__malloc_hook is initialized to malloc_hook_ini, whereas __free_hook is initialized to NULL.

if (mem == 0) /* free(0) has no effect */
    return;

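In C, the constant 0 used in a pointer context is a null pointer constant, so mem == 0 holds exactly when mem is NULL. In other words, free(NULL) is allowed and does nothing.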

int err = errno;

p = mem2chunk(mem);

/* Mark the chunk as belonging to the library again.  */
(void) TAG_REGION(chunk2rawmem(p), CHUNK_AVAILABLE_SIZE(p) - CHUNK_HDR_SZ);

if (chunk_is_mmapped(p)) /* release mmapped memory. */
{
    /* See if the dynamic brk/mmap threshold needs adjusting. Dumped fake mmapped chunks do not affect the threshold.  */
    if (!mp_.no_dyn_threshold && chunksize_nomask(p) > mp_.mmap_threshold && chunksize_nomask(p) <= DEFAULT_MMAP_THRESHOLD_MAX && !DUMPED_MAIN_ARENA_CHUNK(p)) {
        mp_.mmap_threshold = chunksize(p);
        mp_.trim_threshold = 2 * mp_.mmap_threshold;
        LIBC_PROBE(memory_mallopt_free_dyn_thresholds, 2, mp_.mmap_threshold, mp_.trim_threshold);
    }
    munmap_chunk(p);
} else {
    MAYBE_INIT_TCACHE();

    ar_ptr = arena_for_chunk(p);
    _int_free(ar_ptr, p, 0);
}

__set_errno(err);

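This code first determines where the memory came from.

If the chunk was allocated directly with mmap, munmap is performed after a few checks.
Otherwise the chunk belongs to an arena, and _int_free() is called.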

size = chunksize(p);

/* Little security check which won't hurt performance: the allocator never wrapps around at the end of the address space.
 Therefore we can exclude some size values which might appear here by accident or by "design" from some intruder.  */
if (__builtin_expect((uintptr_t) p > (uintptr_t) -size, 0) || __builtin_expect(misaligned_chunk(p), 0))
    malloc_printerr("free(): invalid pointer");
/* We know that each chunk is at least MINSIZE bytes in size or a multiple of MALLOC_ALIGNMENT.  */
if (__glibc_unlikely(size < MINSIZE || !aligned_OK(size)))
    malloc_printerr("free(): invalid size");

check_inuse_chunk(av, p);

Routine sanity checks.

    if ((unsigned long) (size) <= (unsigned long) (get_max_fast())

#if TRIM_FASTBINS
        /*
        If TRIM_FASTBINS set, don't place chunks bordering top into fastbins
        */
        && (chunk_at_offset(p, size) != av->top)
#endif
    ) {

        if (__builtin_expect(chunksize_nomask(chunk_at_offset(p, size)) <= CHUNK_HDR_SZ, 0) || __builtin_expect(chunksize(chunk_at_offset(p, size)) >= av->system_mem, 0)) {
            bool fail = true;
            /* We might not have a lock at this point and concurrent modifications of system_mem might result in a false positive.
             Redo the test after getting the lock.
             */
            if (!have_lock) {
                __libc_lock_lock(av->mutex);
                fail = (chunksize_nomask(chunk_at_offset(p, size)) <= CHUNK_HDR_SZ || chunksize(chunk_at_offset(p, size)) >= av->system_mem);
                __libc_lock_unlock(av->mutex);
            }

            if (fail)
                malloc_printerr("free(): invalid next size (fast)");
        }

        free_perturb(chunk2mem(p), size - CHUNK_HDR_SZ);

        atomic_store_relaxed(&av->have_fastchunks, true);
        unsigned int idx = fastbin_index(size);// FastBin index for size
        fb = &fastbin(av, idx);                // fb points to the head pointer

        /* Atomically link P to its fastbin: P->FD = *FB; *FB = P;  */
        mchunkptr old = *fb, old2;

        if (SINGLE_THREAD_P) {// single-threaded
            /* Check that the top of the bin is not the record we are going to add (i.e., double free).  */
            if (__builtin_expect(old == p, 0))// sanity check
                malloc_printerr("double free or corruption (fasttop)");
            /* insert at the head of the FastBin list */
            p->fd = PROTECT_PTR(&p->fd, old);// to hinder attacks, old is masked with the address &p->fd before being stored in p->fd
            *fb = p;
        } else// multi-threaded
            do {
                /* Check that the top of the bin is not the record we are going to add (i.e., double free).  */
                if (__builtin_expect(old == p, 0))
                    malloc_printerr("double free or corruption (fasttop)");
                old2 = old;
                p->fd = PROTECT_PTR(&p->fd, old);
            } while ((old = catomic_compare_and_exchange_val_rel(fb, p, old2)) != old2);

        /* Check that size of fastbin chunk at the top is the same as size of the chunk that we are adding.
         We can dereference OLD only if we have the lock, otherwise it might have already been allocated again.
         */
        if (have_lock && old != NULL && __builtin_expect(fastbin_index(chunksize(old)) != idx, 0))
            malloc_printerr("invalid fastbin entry (free)");
    }

If the chunk is in the FastBin range, it is inserted, after the checks, at the head of the matching FastBin list.
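
The single-threaded branch can be sketched as follows. `toy_chunk` and `toy_fastbin_push` are hypothetical names; pointer masking is omitted and an error code stands in for malloc_printerr().

```c
#include <assert.h>
#include <stddef.h>

struct toy_chunk {
    struct toy_chunk *fd; /* next free chunk in the bin */
};

/* Push p onto the singly linked bin whose head pointer is *fb.
   Returns 0 on success, -1 if p is already the head of the bin
   (the case glibc reports as "double free or corruption (fasttop)"). */
int toy_fastbin_push(struct toy_chunk **fb, struct toy_chunk *p) {
    assert(fb != NULL && p != NULL);
    struct toy_chunk *old = *fb;
    if (old == p)
        return -1;
    p->fd = old; /* glibc stores PROTECT_PTR(&p->fd, old) instead */
    *fb = p;
    return 0;
}
```

Note that the check compares p only with the current head of the list, so it catches a chunk that is freed twice in a row but not one that is freed again after another chunk of the same bin.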

TCACHE and multi-threaded environments

__libc_malloc

#if USE_TCACHE
    /* int_free also calls request2size, be careful to not pad twice.  */
    size_t tbytes;
    if (!checked_request2size(bytes, &tbytes)) {
        __set_errno(ENOMEM);
        return NULL;
    }
    size_t tc_idx = csize2tidx(tbytes);// convert the size into a tcache index

    MAYBE_INIT_TCACHE();

    DIAG_PUSH_NEEDS_COMMENT;
    if (tc_idx < mp_.tcache_bins && tcache && tcache->counts[tc_idx] > 0) {
        victim = tcache_get(tc_idx);
        return TAG_NEW_USABLE(victim);
    }
    DIAG_POP_NEEDS_COMMENT;
#endif

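From bytes, the chunk size tbytes is computed: the request plus metadata, rounded up to meet the alignment requirement.
From tbytes, the matching tcache index tc_idx is computed.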
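
The two computations can be sketched as below. The constants are assumptions for a typical 64-bit build (8-byte size field, 16-byte alignment, 32-byte minimum chunk) and are not taken from the excerpt; the function names are hypothetical, and the overflow check done by checked_request2size() is reduced to an assert.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values for a typical 64-bit build. */
#define TOY_SIZE_SZ ((size_t) 8)
#define TOY_MALLOC_ALIGNMENT ((size_t) 16)
#define TOY_MALLOC_ALIGN_MASK (TOY_MALLOC_ALIGNMENT - 1)
#define TOY_MINSIZE ((size_t) 32)

/* Pad a user request to a chunk size: add the size field, round up to the
   alignment, and never go below the minimum chunk size. */
size_t toy_request2size(size_t req) {
    size_t padded = req + TOY_SIZE_SZ + TOY_MALLOC_ALIGN_MASK;
    assert(padded >= req); /* this sketch does not handle overflow */
    if (padded < TOY_MINSIZE)
        return TOY_MINSIZE;
    return padded & ~TOY_MALLOC_ALIGN_MASK;
}

/* Map a chunk size to a tcache bin index: one bin per alignment step. */
size_t toy_csize2tidx(size_t chunk_size) {
    assert(chunk_size >= TOY_MINSIZE);
    return (chunk_size - TOY_MINSIZE + TOY_MALLOC_ALIGNMENT - 1) / TOY_MALLOC_ALIGNMENT;
}
```

Under these assumptions a 24-byte request becomes a 32-byte chunk in bin 0, while a 25-byte request becomes a 48-byte chunk in bin 1.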

/* Caller must ensure that we know tc_idx is valid and there's available chunks to remove.  */
static __always_inline void *tcache_get(size_t tc_idx) {
    tcache_entry *e = tcache->entries[tc_idx];
    if (__glibc_unlikely(!aligned_OK(e)))
        malloc_printerr("malloc(): unaligned tcache chunk detected");
    tcache->entries[tc_idx] = REVEAL_PTR(e->next);
    --(tcache->counts[tc_idx]);
    e->key = NULL;
    return (void *) e;
}

tc_idx locates the list head pointer tcache->entries[tc_idx]; the first node is removed from the list and returned.
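
A toy version of the per-thread cache makes the bookkeeping visible. This sketch assumes 64 bins and a limit of 7 entries per bin (the usual defaults); all `toy_` names are hypothetical, and unlike glibc, where the caller checks the counts beforehand, the checks are done inside the functions.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_TCACHE_BINS 64 /* assumed default number of bins */
#define TOY_TCACHE_COUNT 7 /* assumed default per-bin limit */

struct toy_entry {
    struct toy_entry *next;
};

struct toy_tcache {
    unsigned short counts[TOY_TCACHE_BINS];
    struct toy_entry *entries[TOY_TCACHE_BINS];
};

/* Insert e at the head of bin idx. Returns 1 if cached, 0 if the bin is full. */
int toy_tcache_put(struct toy_tcache *tc, struct toy_entry *e, size_t idx) {
    assert(tc != NULL && e != NULL && idx < TOY_TCACHE_BINS);
    if (tc->counts[idx] >= TOY_TCACHE_COUNT)
        return 0;
    e->next = tc->entries[idx];
    tc->entries[idx] = e;
    ++tc->counts[idx];
    return 1;
}

/* Remove and return the head of bin idx, or NULL if the bin is empty. */
struct toy_entry *toy_tcache_get(struct toy_tcache *tc, size_t idx) {
    assert(tc != NULL && idx < TOY_TCACHE_BINS);
    if (tc->counts[idx] == 0)
        return NULL;
    struct toy_entry *e = tc->entries[idx];
    tc->entries[idx] = e->next;
    --tc->counts[idx];
    return e;
}
```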

With tcache covered, let us turn to the multi-threaded part:

// multi-threaded case
arena_get(ar_ptr, bytes);// obtain (and lock) an arena for this thread

victim = _int_malloc(ar_ptr, bytes);
/* Retry with another arena only if we were able to find a usable arena before.  */
if (!victim && ar_ptr != NULL) {
    LIBC_PROBE(memory_malloc_retry, 1, bytes);
    ar_ptr = arena_get_retry(ar_ptr, bytes);
    victim = _int_malloc(ar_ptr, bytes);
}

if (ar_ptr != NULL)
    __libc_lock_unlock(ar_ptr->mutex);

victim = TAG_NEW_USABLE(victim);

assert(!victim || chunk_is_mmapped(mem2chunk(victim)) || ar_ptr == arena_for_chunk(mem2chunk(victim)));
return victim;

The main logic of the multi-threaded part is:

Obtain an arena for the thread and try to allocate from it. If that fails, switch to another arena and try again.

Moving into _int_malloc():

#if USE_TCACHE
                /* While we're here, if we see other chunks of the same size, stash them in the tcache.  */
                size_t tc_idx = csize2tidx(nb);
                if (tcache && tc_idx < mp_.tcache_bins) {
                    mchunkptr tc_victim;

                    /* While bin not empty and tcache not full, copy chunks.  */
                    while (tcache->counts[tc_idx] < mp_.tcache_count && (tc_victim = *fb) != NULL) {
                        if (__glibc_unlikely(misaligned_chunk(tc_victim)))
                            malloc_printerr("malloc(): unaligned fastbin chunk detected 3");
                        if (SINGLE_THREAD_P)
                            *fb = REVEAL_PTR(tc_victim->fd);
                        else {
                            REMOVE_FB(fb, pp, tc_victim);
                            if (__glibc_unlikely(tc_victim == NULL))
                                break;
                        }
                        tcache_put(tc_victim, tc_idx);
                    }
                }
#endif

Note the English comment in this code: it provides a path for moving the chunks remaining in the FastBin list for nb into tcache. tcache is still bounded by tcache_count, of course, which is 7 by default.

/* Caller must ensure that we know tc_idx is valid and there's room for more chunks.  */
static __always_inline void tcache_put(mchunkptr chunk, size_t tc_idx) {
    tcache_entry *e = (tcache_entry *) chunk2mem(chunk);

    /* Mark this chunk as "in the tcache" so the test in _int_free will detect a double free.  */
    e->key = tcache;

    e->next = PROTECT_PTR(&e->next, tcache->entries[tc_idx]);
    tcache->entries[tc_idx] = e;
    ++(tcache->counts[tc_idx]);
}

As we can see, tcache_put() is a plain insertion at the head of a list.
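
The only unusual detail is that e->next does not hold a raw pointer. The sketch below shows the masking as I understand it from glibc's PROTECT_PTR and REVEAL_PTR macros: the stored value is the pointer XORed with the address of the slot shifted right by 12 bits. The function names are hypothetical, and integers stand in for pointers.

```c
#include <assert.h>
#include <stdint.h>

/* Mask ptr with the address of the slot (pos) that will hold it. */
uintptr_t toy_protect_ptr(uintptr_t pos, uintptr_t ptr) {
    return (pos >> 12) ^ ptr;
}

/* Unmasking is the same XOR: applying it again with the same slot
   address gives back the original pointer. */
uintptr_t toy_reveal_ptr(uintptr_t pos, uintptr_t stored) {
    return toy_protect_ptr(pos, stored);
}
```

Since XOR is its own inverse, one helper serves both directions; a NULL link is stored as pos >> 12 rather than as 0.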

#if USE_TCACHE
            /* While we're here, if we see other chunks of the same size, stash them in the tcache.  */
            size_t tc_idx = csize2tidx(nb);
            if (tcache && tc_idx < mp_.tcache_bins) {
                mchunkptr tc_victim;

                /* While bin not empty and tcache not full, copy chunks over.  */
                while (tcache->counts[tc_idx] < mp_.tcache_count && (tc_victim = last(bin)) != bin) {
                    if (tc_victim != 0) {
                        bck = tc_victim->bk;
                        set_inuse_bit_at_offset(tc_victim, nb);
                        if (av != &main_arena)
                            set_non_main_arena(tc_victim);
                        bin->bk = bck;
                        bck->fd = bin;

                        tcache_put(tc_victim, tc_idx);
                    }
                }
            }
#endif

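Just as FastBin has a path that moves chunks into tcache, SmallBin has a similar one.
This code closely resembles the previous snippet; the differences come mainly from FastBin using singly linked lists while SmallBin uses circular doubly linked lists.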

#if USE_TCACHE
    INTERNAL_SIZE_T tcache_nb = 0;
    size_t tc_idx = csize2tidx(nb);
    if (tcache && tc_idx < mp_.tcache_bins)
        tcache_nb = nb;
    int return_cached = 0;

    tcache_unsorted_count = 0;
#endif

After a simple check, a few variables are assigned.

            /* Take now instead of binning if exact fit */

            if (size == nb) {
                set_inuse_bit_at_offset(victim, size);
                if (av != &main_arena)
                    set_non_main_arena(victim);
#if USE_TCACHE
                /* Fill cache first, return to user only if cache fills. We may return one of these chunks later.  */
                if (tcache_nb && tcache->counts[tc_idx] < mp_.tcache_count) {
                    tcache_put(victim, tc_idx);
                    return_cached = 1;
                    continue;
                } else {
#endif
                    check_malloced_chunk(av, victim, nb);
                    void *p = chunk2mem(victim);
                    alloc_perturb(p, bytes);
                    return p;
#if USE_TCACHE
                }
#endif
            }

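Note the two English comments in the code: while scanning UnsortBin, a chunk of exactly the right size is used to fill tcache first rather than being returned to the user immediately.
This gives a path for moving chunks from UnsortBin into tcache.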

#if USE_TCACHE
            /* If we've processed as many chunks as we're allowed while filling the cache, return one of the cached ones.  */
            ++tcache_unsorted_count;
            if (return_cached && mp_.tcache_unsorted_limit > 0 && tcache_unsorted_count > mp_.tcache_unsorted_limit) {
                return tcache_get(tc_idx);
            }
#endif

#if USE_TCACHE
        /* If all the small chunks we found ended up cached, return one now.  */
        if (return_cached) {
            return tcache_get(tc_idx);
        }
#endif

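The two snippets are very similar: both check return_cached first (the first one checks further conditions too), and if the check passes, a chunk for nb is taken from tcache and returned.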

free

In _int_free():

#if USE_TCACHE
    {
        size_t tc_idx = csize2tidx(size);
        if (tcache != NULL && tc_idx < mp_.tcache_bins) {
            /* Check to see if it's already in the tcache.  */
            tcache_entry *e = (tcache_entry *) chunk2mem(p);

            /* This test succeeds on double free.
             However, we don't 100% trust it (it also matches random payload data at a 1 in 2^<size_t> chance), so verify it's not an unlikely coincidence before aborting.
             */
            if (__glibc_unlikely(e->key == tcache)) {
                tcache_entry *tmp;
                size_t cnt = 0;
                LIBC_PROBE(memory_tcache_double_free, 2, e, tc_idx);
                for (tmp = tcache->entries[tc_idx]; tmp; tmp = REVEAL_PTR(tmp->next), ++cnt) {
                    if (cnt >= mp_.tcache_count)
                        malloc_printerr("free(): too many chunks detected in tcache");
                    if (__glibc_unlikely(!aligned_OK(tmp)))
                        malloc_printerr("free(): unaligned chunk detected in tcache 2");
                    if (tmp == e)
                        malloc_printerr("free(): double free detected in tcache 2");
                    /* If we get here, it was a coincidence.  We've wasted a few cycles, but don't abort.  */
                }
            }

            if (tcache->counts[tc_idx] < mp_.tcache_count) {
                tcache_put(p, tc_idx);
                return;
            }
        }
    }
#endif
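
The two-step test above (compare the key first, then walk the list to rule out a coincidence) can be sketched outside glibc. What follows is a toy model, not glibc code: toy_cache, toy_entry, toy_free and toy_alloc are hypothetical names, and the limit of 7 entries is an assumed value.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: every name here is hypothetical, none is glibc's. */
#define TOY_MAX 7 /* assumed per-list limit */

struct toy_cache;

struct toy_entry {
    struct toy_entry *next;
    struct toy_cache *key; /* set while the entry sits in the cache */
};

struct toy_cache {
    struct toy_entry *head;
    size_t count;
};

/* Returns 0 on success, -1 on a detected double free, 1 if the cache is full. */
static int toy_free(struct toy_cache *c, struct toy_entry *e)
{
    assert(c != NULL && e != NULL);
    if (e->key == c) {
        /* The cheap test matched: confirm it by walking the list. */
        for (struct toy_entry *t = c->head; t != NULL; t = t->next)
            if (t == e)
                return -1;
        /* Not found: the match was a coincidence in user data. */
    }
    if (c->count >= TOY_MAX)
        return 1;
    e->key = c; /* mark the entry as cached */
    e->next = c->head;
    c->head = e;
    c->count++;
    return 0;
}

/* Pops the most recently cached entry and clears its key; NULL if empty. */
static struct toy_entry *toy_alloc(struct toy_cache *c)
{
    assert(c != NULL);
    struct toy_entry *e = c->head;
    if (e == NULL)
        return NULL;
    c->head = e->next;
    c->count--;
    e->key = NULL;
    return e;
}
```

Freeing the same entry twice in a row makes the second toy_free() return -1, while an entry that was handed out again by toy_alloc() can be freed normally because its key was cleared.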

Another look at the source

graph TD
    BEGIN-->__libc_malloc[\__libc_malloc\]
    __libc_malloc-->A
    A[Allocate x bytes] --> B[Compute required chunk size tbytes];
    B --> C[Compute tcache index tc_idx];
    C --> D[tcache list for tc_idx is non-empty];
    D -- True --> tcache_get[\tcache_get\]
    tcache_get --> E[Return pointer];
    E --> END;
    D -- False --> F{Single-threaded?};
    F -- True --> G1[\_int_malloc\]
    F -- False --> H[arena_get locks the arena]
    H --> G2[\_int_malloc\]
    G2 --> L{Allocation succeeded?}
    L -- True -->I[Unlock arena]
    L -- False -->M[Switch to another arena]
    M --> G3[\_int_malloc\]
    G3 -->I
    I --> E
    G1 -->E;

Overall flow

-->

_int_malloc

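Init: compute the chunk size to allocate
sysmalloc: if no usable arena is available, call sysmalloc to allocate via mmap
Fast Bin: chunks of identical size in LIFO order; take the head node
- tcache: if the allocation from the Fast Bin succeeds, head-insert the other chunks of that Fast Bin into tcache
Small Bin: take the tail node
- tcache: if the allocation from the Small Bin succeeds, head-insert the other chunks of that Small Bin into tcache
Unsorted Bin:
- Special rule: for a request in Small Bin range, if the Unsorted Bin holds a single chunk and that chunk is the last remainder, allocate from it (split it when it is large enough)
- Chunks of exactly the requested size are head-inserted into tcache until tcache is full
- Chunks of other sizes are moved into the Large Bin (sorted by size; a chunk whose size already exists is inserted in second position) or the Small Bin (head insertion)
- Once tcache is full, the next chunk of the requested size found in the Unsorted Bin is returned directly
If any chunk was cached while scanning the Unsorted Bin, return one from tcache
Large Bin: pick the smallest chunk that is large enough, scanning from the list head; if several chunks have that size, use the second one; split it when possible
Search the larger bins using the bitmap
Split the Top chunk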

_int_free

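tcache: if the bin is not full, put the chunk into tcache
Fast Bin: head insertion
If the chunk was not allocated by mmap, try to consolidate it
If the chunk was allocated by mmap, munmap it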

ptmalloc2 changelog

2.23

2.24

2.25

New

/* Same, except also perform an argument and result check.  First, we check
   that the padding done by request2size didn't result in an integer
   overflow.  Then we check (using REQUEST_OUT_OF_RANGE) that the resulting
   size isn't so large that a later alignment would lead to another integer
   overflow.  */
#define checked_request2size(req, sz) \
({				    \
  (sz) = request2size (req);	    \
  if (((sz) < (req))		    \
      || REQUEST_OUT_OF_RANGE (sz)) \
    {				    \
      __set_errno (ENOMEM);	    \
      return 0;			    \
    }				    \
})

Old

/*  Same, except also perform argument check */

#define checked_request2size(req, sz)                             \
  if (REQUEST_OUT_OF_RANGE (req)) {					      \
      __set_errno (ENOMEM);						      \
      return 0;								      \
    }									      \
  (sz) = request2size (req);
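
Why does the newer macro compare (sz) with (req)? A small sketch makes the wrap-around visible. The header size and alignment mask below are assumed values chosen for illustration (glibc derives the real ones from SIZE_SZ and MALLOC_ALIGNMENT), and toy_request2size / toy_checked_request2size are hypothetical stand-ins for the macros.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values for illustration only. */
#define TOY_HEADER 8u
#define TOY_ALIGN_MASK 15u

/* Pads a request: add the header, round up to the alignment.
   The unsigned arithmetic silently wraps for huge requests. */
static size_t toy_request2size(size_t req)
{
    return (req + TOY_HEADER + TOY_ALIGN_MASK) & ~(size_t)TOY_ALIGN_MASK;
}

/* Returns 1 and stores the padded size, or 0 if padding overflowed. */
static int toy_checked_request2size(size_t req, size_t *sz)
{
    assert(sz != NULL);
    size_t padded = toy_request2size(req);
    if (padded < req) /* a padded size smaller than the request means it wrapped */
        return 0;
    *sz = padded;
    return 1;
}
```

A request of 24 bytes pads to 32, while a request of SIZE_MAX - 4 wraps around to a tiny value and is rejected by the comparison.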

2.26

tcache introduced

2.27

tcache double-free detection
New

/* We overlay this structure on the user-data portion of a chunk when
   the chunk is stored in the per-thread cache.  */
typedef struct tcache_entry
{
  struct tcache_entry *next;
  /* This field exists to detect double frees.  */
  struct tcache_perthread_struct *key;
} tcache_entry;
/* abridged */
/* Caller must ensure that we know tc_idx is valid and there's room
   for more chunks.  */
static __always_inline void
tcache_put (mchunkptr chunk, size_t tc_idx)
{
  tcache_entry *e = (tcache_entry *) chunk2mem (chunk);
  assert (tc_idx < TCACHE_MAX_BINS);

  /* Mark this chunk as "in the tcache" so the test in _int_free will
     detect a double free.  */
  e->key = tcache;

  e->next = tcache->entries[tc_idx];
  tcache->entries[tc_idx] = e;
  ++(tcache->counts[tc_idx]);
}
/* abridged */
static void
_int_free (mstate av, mchunkptr p, int have_lock)
{
/* abridged */
  size = chunksize (p);

  /* Little security check which won't hurt performance: the
     allocator never wrapps around at the end of the address space.
     Therefore we can exclude some size values which might appear
     here by accident or by "design" from some intruder.  */
  if (__builtin_expect ((uintptr_t) p > (uintptr_t) -size, 0)
      || __builtin_expect (misaligned_chunk (p), 0))
    malloc_printerr ("free(): invalid pointer");
  /* We know that each chunk is at least MINSIZE bytes in size or a
     multiple of MALLOC_ALIGNMENT.  */
  if (__glibc_unlikely (size < MINSIZE || !aligned_OK (size)))
    malloc_printerr ("free(): invalid size");

  check_inuse_chunk(av, p);

#if USE_TCACHE
  {
    size_t tc_idx = csize2tidx (size);
    if (tcache != NULL && tc_idx < mp_.tcache_bins)
      {
	/* Check to see if it's already in the tcache.  */
	tcache_entry *e = (tcache_entry *) chunk2mem (p);

	/* This test succeeds on double free.  However, we don't 100%
	   trust it (it also matches random payload data at a 1 in
	   2^<size_t> chance), so verify it's not an unlikely
	   coincidence before aborting.  */
	if (__glibc_unlikely (e->key == tcache))
	  {
	    tcache_entry *tmp;
	    LIBC_PROBE (memory_tcache_double_free, 2, e, tc_idx);
	    for (tmp = tcache->entries[tc_idx];
		 tmp;
		 tmp = tmp->next)
	      if (tmp == e)
		malloc_printerr ("free(): double free detected in tcache 2");
	    /* If we get here, it was a coincidence.  We've wasted a
	       few cycles, but don't abort.  */
	  }

	if (tcache->counts[tc_idx] < mp_.tcache_count)
	  {
	    tcache_put (p, tc_idx);
	    return;
	  }
      }
  }
#endif
/* abridged */
}

Old

/* We overlay this structure on the user-data portion of a chunk when
   the chunk is stored in the per-thread cache.  */
typedef struct tcache_entry
{
  struct tcache_entry *next;
} tcache_entry;
/* abridged */
/* Caller must ensure that we know tc_idx is valid and there's room
   for more chunks.  */
static __always_inline void
tcache_put (mchunkptr chunk, size_t tc_idx)
{
  tcache_entry *e = (tcache_entry *) chunk2mem (chunk);
  assert (tc_idx < TCACHE_MAX_BINS);
  e->next = tcache->entries[tc_idx];
  tcache->entries[tc_idx] = e;
  ++(tcache->counts[tc_idx]);
}
/* abridged */
static void
_int_free (mstate av, mchunkptr p, int have_lock)
{
/* abridged */
  size = chunksize (p);

  /* Little security check which won't hurt performance: the
     allocator never wrapps around at the end of the address space.
     Therefore we can exclude some size values which might appear
     here by accident or by "design" from some intruder.  */
  if (__builtin_expect ((uintptr_t) p > (uintptr_t) -size, 0)
      || __builtin_expect (misaligned_chunk (p), 0))
    malloc_printerr ("free(): invalid pointer");
  /* We know that each chunk is at least MINSIZE bytes in size or a
     multiple of MALLOC_ALIGNMENT.  */
  if (__glibc_unlikely (size < MINSIZE || !aligned_OK (size)))
    malloc_printerr ("free(): invalid size");

  check_inuse_chunk(av, p);

#if USE_TCACHE
  {
    size_t tc_idx = csize2tidx (size);

    if (tcache
	&& tc_idx < mp_.tcache_bins
	&& tcache->counts[tc_idx] < mp_.tcache_count)
      {
	tcache_put (p, tc_idx);
	return;
      }
  }
#endif
/* abridged */
}

malloc_consolidate() gains a Fast Bin check

New

static void malloc_consolidate(mstate av)
{
  /* abridged */
  maxfb = &fastbin (av, NFASTBINS - 1);
  fb = &fastbin (av, 0);
  do {
    p = atomic_exchange_acq (fb, NULL);
    if (p != 0) {
      do {
	{
	  unsigned int idx = fastbin_index (chunksize (p));
	  if ((&fastbin (av, idx)) != fb)
	    malloc_printerr ("malloc_consolidate(): invalid chunk size");
	}

	check_inuse_chunk(av, p);
	nextp = p->fd;

    /* abridged */
      } while ( (p = nextp) != 0);

    }
  } while (fb++ != maxfb);
}

Old

static void malloc_consolidate(mstate av)
{
  /* abridged */
    maxfb = &fastbin (av, NFASTBINS - 1);
    fb = &fastbin (av, 0);
    do {
      p = atomic_exchange_acq (fb, NULL);
      if (p != 0) {
	do {
	  check_inuse_chunk(av, p);
	  nextp = p->fd;

    /* abridged */

	} while ( (p = nextp) != 0);

      }
    } while (fb++ != maxfb);
  /* abridged */
}

2.28

Fix for a tcache bug in __libc_malloc
New

void *
__libc_malloc (size_t bytes)
{
  mstate ar_ptr;
  void *victim;

  void *(*hook) (size_t, const void *)
    = atomic_forced_read (__malloc_hook);
  if (__builtin_expect (hook != NULL, 0))
    return (*hook)(bytes, RETURN_ADDRESS (0));
#if USE_TCACHE
  /* int_free also calls request2size, be careful to not pad twice.  */
  size_t tbytes;
  checked_request2size (bytes, tbytes);
  size_t tc_idx = csize2tidx (tbytes);

  MAYBE_INIT_TCACHE ();

  DIAG_PUSH_NEEDS_COMMENT;
  if (tc_idx < mp_.tcache_bins
      && tcache
      && tcache->counts[tc_idx] > 0) // changed here
    {
      return tcache_get (tc_idx);
    }
  DIAG_POP_NEEDS_COMMENT;
#endif
/* abridged */
}

Old

void *
__libc_malloc (size_t bytes)
{
  mstate ar_ptr;
  void *victim;

  void *(*hook) (size_t, const void *)
    = atomic_forced_read (__malloc_hook);
  if (__builtin_expect (hook != NULL, 0))
    return (*hook)(bytes, RETURN_ADDRESS (0));
#if USE_TCACHE
  /* int_free also calls request2size, be careful to not pad twice.  */
  size_t tbytes;
  checked_request2size (bytes, tbytes);
  size_t tc_idx = csize2tidx (tbytes);

  MAYBE_INIT_TCACHE ();

  DIAG_PUSH_NEEDS_COMMENT;
  if (tc_idx < mp_.tcache_bins
      /*&& tc_idx < TCACHE_MAX_BINS*/ /* to appease gcc */
      && tcache
      && tcache->entries[tc_idx] != NULL) // changed here
    {
      return tcache_get (tc_idx);
    }
  DIAG_POP_NEEDS_COMMENT;
#endif
/* abridged */
}

New checks in _int_malloc
New

      while ((victim = unsorted_chunks (av)->bk) != unsorted_chunks (av))
        {
          bck = victim->bk;
          size = chunksize (victim);
          mchunkptr next = chunk_at_offset (victim, size); // changed here

          if (__glibc_unlikely (size <= 2 * SIZE_SZ) // changed here
              || __glibc_unlikely (size > av->system_mem))
            malloc_printerr ("malloc(): invalid size (unsorted)");
          if (__glibc_unlikely (chunksize_nomask (next) < 2 * SIZE_SZ)
              || __glibc_unlikely (chunksize_nomask (next) > av->system_mem))
            malloc_printerr ("malloc(): invalid next size (unsorted)");
          if (__glibc_unlikely ((prev_size (next) & ~(SIZE_BITS)) != size))
            malloc_printerr ("malloc(): mismatching next->prev_size (unsorted)");
          if (__glibc_unlikely (bck->fd != victim)
              || __glibc_unlikely (victim->fd != unsorted_chunks (av)))
            malloc_printerr ("malloc(): unsorted double linked list corrupted");
          if (__glibc_unlikely (prev_inuse (next)))
            malloc_printerr ("malloc(): invalid next->prev_inuse (unsorted)");

          /*
             If a small request, try to use last remainder if it is the
             only chunk in unsorted bin.  This helps promote locality for
             runs of consecutive small requests. This is the only
             exception to best-fit, and applies only when there is
             no exact fit for a small chunk.
           */

          if (in_smallbin_range (nb) &&
              bck == unsorted_chunks (av) &&
              victim == av->last_remainder &&
              (unsigned long) (size) > (unsigned long) (nb + MINSIZE))
            {
              /* split and reattach remainder */
              remainder_size = size - nb;
              remainder = chunk_at_offset (victim, nb);
              unsorted_chunks (av)->bk = unsorted_chunks (av)->fd = remainder;
              av->last_remainder = remainder;
              remainder->bk = remainder->fd = unsorted_chunks (av);
              if (!in_smallbin_range (remainder_size))
                {
                  remainder->fd_nextsize = NULL;
                  remainder->bk_nextsize = NULL;
                }

              set_head (victim, nb | PREV_INUSE |
                        (av != &main_arena ? NON_MAIN_ARENA : 0));
              set_head (remainder, remainder_size | PREV_INUSE);
              set_foot (remainder, remainder_size);

              check_malloced_chunk (av, victim, nb);
              void *p = chunk2mem (victim);
              alloc_perturb (p, bytes);
              return p;
            }

          /* remove from unsorted list */
          if (__glibc_unlikely (bck->fd != victim)) // changed here
            malloc_printerr ("malloc(): corrupted unsorted chunks 3");
          unsorted_chunks (av)->bk = bck;
          bck->fd = unsorted_chunks (av);

          /* Take now instead of binning if exact fit */

          if (size == nb)
            {
              set_inuse_bit_at_offset (victim, size);
              if (av != &main_arena)
		set_non_main_arena (victim);
#if USE_TCACHE
	      /* Fill cache first, return to user only if cache fills.
		 We may return one of these chunks later.  */
	      if (tcache_nb
		  && tcache->counts[tc_idx] < mp_.tcache_count)
		{
		  tcache_put (victim, tc_idx);
		  return_cached = 1;
		  continue;
		}
	      else
		{
#endif
              check_malloced_chunk (av, victim, nb);
              void *p = chunk2mem (victim);
              alloc_perturb (p, bytes);
              return p;
#if USE_TCACHE
		}
#endif
            }
          /* place chunk in bin */

          if (in_smallbin_range (size))
            {
              victim_index = smallbin_index (size);
              bck = bin_at (av, victim_index);
              fwd = bck->fd;
            }
          else
            {
              victim_index = largebin_index (size);
              bck = bin_at (av, victim_index);
              fwd = bck->fd;

              /* maintain large bins in sorted order */
              if (fwd != bck)
                {
                  /* Or with inuse bit to speed comparisons */
                  size |= PREV_INUSE;
                  /* if smaller than smallest, bypass loop below */
                  assert (chunk_main_arena (bck->bk));
                  if ((unsigned long) (size)
		      < (unsigned long) chunksize_nomask (bck->bk))
                    {
                      fwd = bck;
                      bck = bck->bk;

                      victim->fd_nextsize = fwd->fd;
                      victim->bk_nextsize = fwd->fd->bk_nextsize;
                      fwd->fd->bk_nextsize = victim->bk_nextsize->fd_nextsize = victim;
                    }
                  else
                    {
                      assert (chunk_main_arena (fwd));
                      while ((unsigned long) size < chunksize_nomask (fwd))
                        {
                          fwd = fwd->fd_nextsize;
			  assert (chunk_main_arena (fwd));
                        }

                      if ((unsigned long) size
			  == (unsigned long) chunksize_nomask (fwd))
                        /* Always insert in the second position.  */
                        fwd = fwd->fd;
                      else
                        {
                          victim->fd_nextsize = fwd;
                          victim->bk_nextsize = fwd->bk_nextsize;
                          if (__glibc_unlikely (fwd->bk_nextsize->fd_nextsize != fwd))
                            malloc_printerr ("malloc(): largebin double linked list corrupted (nextsize)");
                          fwd->bk_nextsize = victim;
                          victim->bk_nextsize->fd_nextsize = victim;
                        }
                      bck = fwd->bk;
                      if (bck->fd != fwd)
                        malloc_printerr ("malloc(): largebin double linked list corrupted (bk)");
                    }
                }
              else
                victim->fd_nextsize = victim->bk_nextsize = victim;
            }

          mark_bin (av, victim_index);
          victim->bk = bck;
          victim->fd = fwd;
          fwd->bk = victim;
          bck->fd = victim;

#if USE_TCACHE
      /* If we've processed as many chunks as we're allowed while
	 filling the cache, return one of the cached ones.  */
      ++tcache_unsorted_count;
      if (return_cached
	  && mp_.tcache_unsorted_limit > 0
	  && tcache_unsorted_count > mp_.tcache_unsorted_limit)
	{
	  return tcache_get (tc_idx);
	}
#endif

#define MAX_ITERS       10000
          if (++iters >= MAX_ITERS)
            break;
        }

2.29

2.30

2.31

2.32

2.34

Old:

void *(*hook) (size_t, const void *)
  = atomic_forced_read (__malloc_hook);
if (__builtin_expect (hook != NULL, 0))
  return (*hook)(bytes, RETURN_ADDRESS (0));

New:

if (!__malloc_initialized)
  ptmalloc_init ();

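In 2.34 the call through __malloc_hook is gone from the allocation path, so attacks that rely on hijacking __malloc_hook no longer work.
In fact this applies not only to __malloc_hook but also to __free_hook and __realloc_hook. The __free_hook call was simply deleted.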
Old:

void (*hook) (void *, const void *)
  = atomic_forced_read (__free_hook);
if (__builtin_expect (hook != NULL, 0))
  {
    (*hook)(mem, RETURN_ADDRESS (0));
    return;
  }

New: (the block above is removed; nothing replaces it)

The tcache double-free check was fixed: e->key is now compared against tcache_key instead of the tcache pointer
Old:

if (__glibc_unlikely (e->key == tcache))
  {
    tcache_entry *tmp;
    size_t cnt = 0;
    LIBC_PROBE (memory_tcache_double_free, 2, e, tc_idx);
    for (tmp = tcache->entries[tc_idx];
	 tmp;
	 tmp = REVEAL_PTR (tmp->next), ++cnt)
      {
	if (cnt >= mp_.tcache_count)
	  malloc_printerr ("free(): too many chunks detected in tcache");
	if (__glibc_unlikely (!aligned_OK (tmp)))
	  malloc_printerr ("free(): unaligned chunk detected in tcache 2");
	if (tmp == e)
	  malloc_printerr ("free(): double free detected in tcache 2");
	/* If we get here, it was a coincidence.  We've wasted a
	   few cycles, but don't abort.  */
      }
  }

New:

if (__glibc_unlikely (e->key == tcache_key))
  {
    tcache_entry *tmp;
    size_t cnt = 0;
    LIBC_PROBE (memory_tcache_double_free, 2, e, tc_idx);
    for (tmp = tcache->entries[tc_idx];
	 tmp;
	 tmp = REVEAL_PTR (tmp->next), ++cnt)
      {
	if (cnt >= mp_.tcache_count)
	  malloc_printerr ("free(): too many chunks detected in tcache");
	if (__glibc_unlikely (!aligned_OK (tmp)))
	  malloc_printerr ("free(): unaligned chunk detected in tcache 2");
	if (tmp == e)
	  malloc_printerr ("free(): double free detected in tcache 2");
	/* If we get here, it was a coincidence.  We've wasted a
	   few cycles, but don't abort.  */
      }
  }

To be continued

References

¹. Yu Jiazi. 程序员的自我修养 (A Programmer's Self-Cultivation) [M]. Beijing: Publishing House of Electronics Industry.

Published 2020-12-23 · Dev · 39 min read (about 5878 words)

GNU/Linux C Development in Practice: myshell

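Time flies: it has been quite a while since I finished writing myshell. To consolidate what I learned from that hands-on project, I decided to finally write this long-postponed post.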

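Requirements

Support any number of pipes
Support invoking other programs by command
Support any number of input and output redirections
Built-in cd command
Built-in history command
Tab completion
Cursor movement
Ignore the relevant signals so that Ctrl+C does not kill the shell
A pleasant-looking interface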

Development process

Header files

#include <ctype.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>
#include <readline/history.h>
#include <readline/readline.h>

Macros and global variables

extern char **environ;
struct COMMAND
{
    int argc;           // number of arguments
    int Redirect_FD[3]; // redirection targets for standard input, standard output and standard error
    char **argv;
};
char *oldpath;

Error handling

void myerror(char *string, int line)
{
    fprintf(stderr, "\aLine:%d,error:\a\n", line);
    fprintf(stderr, "%s:%s\n", string, strerror(errno));
    exit(EXIT_FAILURE);
}

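Up-front analysis

Multiple pipes can be handled layer by layer with a divide-and-conquer approach, reducing the problem to the single-pipe case. A single pipe can in turn be treated as A >./tmpfile followed by B <./tmpfile, so the implementation of pipes is closely tied to that of redirection operators.
Redirection operators come in many forms, for example > , >> , 1> , 1>> , 2> , 2>> , < , << , 1>&2 , 1>>&2 , 2>&1 , 2>>&1 . String parsing is not the focus of this exercise, so I do not plan to implement the last five.
Tab completion, history storage and similar features are all provided by the readline library (thanks to the GNU Project for its contribution).
The "pleasant-looking interface" requirement is met by printing coloured and aligned text.
Invoking other programs involves process control.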

Reading and parsing user input

int main(void)
{
    read_history(NULL);

    while (1)
    {
        char *command = readline("MYSHELL$");
        if (command == NULL) /* readline() returns NULL on EOF (Ctrl+D) */
        {
            break;
        }
        add_history(command);
        write_history(NULL);
        launch(command);
        free(command);
    }

    if (oldpath != NULL)
    {
        free(oldpath);
    }
}

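With the readline() function provided by the readline library, printing the prompt and reading user input is easy.

Splitting the command into segments

Here I apply the well-known divide-and-conquer idea: a command such as A -b cde -f | g -hi | j -k lmn >123.txt is split at each | into segments that are handled separately.

As analysed above, a pipe can be implemented with two redirections.
Now for the concrete method. Take a command that | splits into $n$ segments ($n \in \mathbb N^+$, $n \geq 3$).

Consider segment $1$. A pipe requires the input of the second segment to be the output of the first. So the standard output of segment $1$ is redirected to a temporary file, and the standard input of segment $2$ is redirected to that same file.
Consider segment $i$ ($i \in \mathbb N$, $1 < i < n$). Its input is the output of segment $i-1$, so the standard output of segment $i-1$ is redirected to a temporary file and the standard input of segment $i$ is redirected to that file. Its output is the input of segment $i+1$, so the standard output of segment $i$ is redirected to a temporary file and the standard input of segment $i+1$ is redirected to that file.
Consider segment $n$. Its input is the output of segment $n-1$. So the standard output of segment $n-1$ is redirected to a temporary file, and the standard input of segment $n$ is redirected to that file.

That is the whole procedure for implementing pipes.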

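But here is a problem: how should a command such as A -b cde -f >./log.txt | g -hi | j -k lmn <123.txt be handled? The analysis above says the standard output of segment $1$ must be redirected to a temporary file, yet the command asks for it to be redirected to log.txt .
I considered copying the temporary file produced by the redirection to ./log.txt, or letting log.txt play the role of the temporary file, either of which would resolve the conflict in this example. But consider ls -al >/dev/null |wc -c .
What wc -c reads in this command depends on the shell. In my environment zsh prints a non-zero result, while GNU bash prints 0 . I regard commands like this as ambiguous, so myshell reports an error for commands of the form A -b cde -f >./log.txt | g -hi | j -k lmn >123.txt , ls -al >/dev/null |wc -c , or ls -alR / |grep test <./result.md . Readers are welcome to discuss this in the comments.

That covers most of the design and ideas behind the program, so we can now turn to the implementation of launch() .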

First scan the command once and count the pipes.

int pipe = 0; // pipe counter
for (char *pr = command; *pr != '\0'; pr++)
{
    if (*pr == '|')
    {
        pipe++;
    }
}

Knowing the number of pipes tells us how many segments the command has, so an array of COMMAND can be allocated accordingly.

struct COMMAND *cmd = (struct COMMAND *)calloc(pipe + 1, sizeof(struct COMMAND));
if (cmd == NULL)
{
    myerror("malloc", __LINE__);
}

Next comes the splitting itself.

char *remain = NULL;
char *part = strtok_r(command, "|", &remain);
for (int i = 0; i <= pipe; i++)
{
    /* initialise */
    cmd[i].Redirect_FD[STDIN_FILENO] = -1;
    cmd[i].Redirect_FD[STDOUT_FILENO] = -1;
    cmd[i].Redirect_FD[STDERR_FILENO] = -1;
}

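Remember that Redirect_FD holds, for each segment, the file descriptor each stream is redirected to. Since every valid file descriptor is non-negative, each element of Redirect_FD must be initialised to -1 to express "no redirection".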

TIP

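Some readers may wonder about the use of strtok_r(). It is used much like strtok(), with one extra parameter.

The prototypes of the two functions are:

char *strtok(char *str, const char *delim);
char *strtok_r(char *str, const char *delim, char **saveptr);

Put simply, strtok_r() is the reentrant version of strtok(): the state that strtok() keeps in a static variable is stored through an extra parameter instead, which is what makes the function reentrant.
Why strtok_r() rather than strtok()? Because the string has to be split in a second place later on. If both places used strtok(), the second call with a non-NULL first argument would overwrite the state that the first had saved in the static variable, which does not meet my needs. This issue comes up again below.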
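
The nesting problem can be shown with a small sketch. count_words() is a hypothetical helper written for this illustration, not part of myshell: the outer loop splits on | and the inner loop splits each segment on spaces, each with its own saved state.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Counts the words in every '|'-separated segment of command.
   Returns the total word count (or -1 on allocation failure) and
   stores the number of segments in *segments. */
static int count_words(const char *command, int *segments)
{
    assert(command != NULL && segments != NULL);
    char *copy = strdup(command); /* strtok_r modifies its input */
    if (copy == NULL)
        return -1;

    int words = 0;
    *segments = 0;
    char *outer_state = NULL; /* state of the '|' split */
    for (char *seg = strtok_r(copy, "|", &outer_state); seg != NULL;
         seg = strtok_r(NULL, "|", &outer_state)) {
        (*segments)++;
        char *inner_state = NULL; /* independent state of the ' ' split */
        for (char *w = strtok_r(seg, " ", &inner_state); w != NULL;
             w = strtok_r(NULL, " ", &inner_state))
            words++;
    }
    free(copy);
    return words;
}
```

With plain strtok(), the inner split would replace the outer split's hidden state and the outer loop would stop after the first segment; keeping two separate saveptr variables avoids that.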

After that, each segment is processed in turn.

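    for (int i = 0; i <= pipe; i++)
    {

        if (pipe && i < pipe)
        {
            /* create a temporary file */
            char TempFile[] = "/tmp/MyShell_XXXXXX";
            int TempFile_FD = mkstemp(TempFile);
            /* check whether the temporary file was created */
            if (TempFile_FD == -1)
            {
                myerror("mkstemp", __LINE__);
            }
            /* redirect this segment's standard output to the temporary file */
            cmd[i].Redirect_FD[STDOUT_FILENO] = TempFile_FD;
            /* redirect the next segment's standard input to the temporary file */
            cmd[i + 1].Redirect_FD[STDIN_FILENO] = TempFile_FD;
            unlink(TempFile);
            // remove the name now; the file stays usable until it is closed
        }
        analyze(part, &cmd[i]); // parse and validate the arguments and redirections of this segment
        if (BuiltInCommand(&cmd[i])) // non-zero means the segment is not a built-in command
        {
            execute(&cmd[i]);
        }
        if (pipe && i < pipe)
        {
            lseek(cmd[i].Redirect_FD[STDOUT_FILENO], 0, SEEK_SET);
            cmd[i].Redirect_FD[STDOUT_FILENO] = -1;
        }
        part = strtok_r(NULL, "|", &remain);
        for (int IO_Steam = 0; IO_Steam < 3; IO_Steam++)
        {
            if (cmd[i].Redirect_FD[IO_Steam] >= 0)
            {
                close(cmd[i].Redirect_FD[IO_Steam]);
                // close the file and release its resources
            }
        }

        for (int j = 0; j < cmd[i].argc; j++)
        {
            free(cmd[i].argv[j]);
        }
        free(cmd[i].argv);
    }
    free(cmd);
}

To keep the listing readable, it only calls BuiltInCommand() and execute() at this point; their implementations are described later.

Note the lseek() call: it resets the file offset to 0 so that the next segment reads from the beginning of the file.

On the line right after it, once the segment has finished, the output entry that was set up for the pipe is reset to -1. Why? To avoid closing the temporary file. The file was already unlinked right after mkstemp(), so closing its last descriptor would drop the reference count to 0 and the file would really be deleted, leaving the next segment with nothing to read. The temporary file can therefore only be closed after it has served as the next segment's input.

In the end, every file opened for redirection is closed.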
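
The behaviour this relies on (an unlinked file stays usable through its open descriptor, and lseek() rewinds it for the reader) can be checked in isolation. The sketch below is an assumption-laden illustration, not myshell code: roundtrip_tempfile() is a hypothetical helper and the /tmp/sketch_XXXXXX template is an arbitrary choice.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Writes msg to an unlinked temporary file, rewinds, and reads it back
   into out. Returns the number of bytes read, or -1 on failure. */
static ssize_t roundtrip_tempfile(const char *msg, char *out, size_t outsz)
{
    assert(msg != NULL && out != NULL && outsz > 0);
    char path[] = "/tmp/sketch_XXXXXX";
    int fd = mkstemp(path);
    if (fd == -1)
        return -1;
    unlink(path); /* the name is gone; the open descriptor keeps the file alive */

    size_t len = strlen(msg);
    ssize_t n = -1;
    if (write(fd, msg, len) == (ssize_t)len /* plays the producer segment */
        && lseek(fd, 0, SEEK_SET) == 0) {   /* rewind for the consumer segment */
        n = read(fd, out, outsz - 1);
        if (n >= 0)
            out[n] = '\0';
    }
    close(fd); /* last reference dropped: only now is the file really deleted */
    return n;
}
```

Without the lseek() call the read would start at the end of the data just written and return 0 bytes, which is the situation the rewind in launch() prevents.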

Parsing a segment

First make a copy of the segment being processed, because parsing modifies it.

char *string = strdup(OriginString);
if (string == NULL)
{
    myerror("malloc", __LINE__);
}

TIP

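Calling strdup() is equivalent to measuring the source string with strlen(), allocating memory for a new string, copying the contents, and returning the address of the copy.

The signature of strdup() is: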

char *strdup(const char *s);

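After that, define char *end = string + strlen(string); as a sentinel pointing at the terminating \0. It marks the end of string and guards against running past it.

Next, look for input and output redirections in the segment, check that they are valid, and parse the command-line arguments into the char **argv; form.

Handling standard output and standard error redirection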

char *result = NULL;
while ((result = strchr(string, '>')) != NULL)
{
    *result = ' ';
    int IO_Steam = 1;

    result--;
    if (result > string && isdigit(*result))
    {
        if (*result - '0' != STDOUT_FILENO && *result - '0' != STDERR_FILENO)
        {
            printf("Unknown command\n");
            exit(EXIT_FAILURE);
        }
        else
        {
            IO_Steam = *result - '0';
            *result = ' ';
        }
    }
    if (cmd->Redirect_FD[IO_Steam] >= 0)
    {
        printf("Unknown command\n");
        exit(EXIT_FAILURE);
    }
    result += 2;
    _Bool Append = 0;
    if (result < end && *result == '>')
    {
        Append = 1; // append mode
        *result = ' ';
    }
    while (result < end && isspace(*result))
    {
        result++;
    }

    if (result < end)
    {
        cmd->Redirect_FD[IO_Steam] = OpenFile(result, O_WRONLY | O_CREAT | (Append ? O_APPEND : O_TRUNC));
    }
    else
    {
        printf("Unknown command\n");
        exit(EXIT_FAILURE);
    }
}

Handling input redirection

With output and error redirection in place, standard input redirection is not hard.

while ((result = strchr(string, '<')) != NULL)
{
    *result = ' ';
    if (cmd->Redirect_FD[STDIN_FILENO] >= 0)
    {
        printf("Unknown command\n");
        exit(EXIT_FAILURE);
    }
    while (result < end && isspace(*result))
    {
        result++;
    }
    if (result < end)
    {
        cmd->Redirect_FD[STDIN_FILENO] = OpenFile(result, O_RDONLY);
    }
    else
    {
        printf("Unknown command\n");
        exit(EXIT_FAILURE);
    }
}

Parsing the command-line arguments

int arg_max = 16; // argument capacity; grown automatically when it runs out

cmd->argv = (char **)calloc(arg_max, sizeof(char *));
if (cmd->argv == NULL)
{
    myerror("malloc", __LINE__);
}
char *remain = NULL;
result = strtok_r(string, " ", &remain);
while (result != NULL)
{
    if (cmd->argc + 1 >= arg_max) // always keep one slot free for the terminating NULL
    {
        arg_max *= 2;                                                      // double the capacity
        cmd->argv = (char **)realloc(cmd->argv, arg_max * sizeof(char *)); // grow the pointer array
        if (cmd->argv == NULL)
        {
            myerror("realloc", __LINE__);
        }
    }
    cmd->argv[cmd->argc] = strdup(result);
    if (cmd->argv[cmd->argc] == NULL)
    {
        myerror("strdup", __LINE__);
    }
    cmd->argc++;
    result = strtok_r(NULL, " ", &remain);
}
cmd->argv[cmd->argc] = NULL; // argv[argc] is NULL

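That completes the parsing of this segment. One small task remains: free(string); releases the memory held by the copy of the segment.

Opening redirection files

Opening the temporary file was implemented above, but the redirection files that the user names on the command line still need their own handling.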

int OpenFile(char *string, int flags)
{

    size_t len = 0;
    char *pr = string;
    while (!isspace(*pr) && *pr++ != '\0')
    {
        len++;
    }

    char *dest = malloc((len + 1) * sizeof(char));
    if (dest == NULL)
    {
        myerror("malloc failed", __LINE__);
    }
    strncpy(dest, string, len);
    dest[len] = '\0';
    memset(string, ' ', sizeof(char) * len);
    if (*dest == '~') // only paths that start with ~ need expansion
    {
        PathAnalyze(&dest);
    }
    int fd = open(dest, flags, S_IRUSR | S_IWUSR);
    if (fd == -1)
    {
        printf("error:fd:%d  path:%s\n", fd, dest);
        myerror("open", __LINE__);
    }
    free(dest);
    return fd;
}

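The string passed in is the copy of the segment, so the path of the redirection file may be followed by further space-separated arguments. That means string cannot be passed to open() directly.

Here I count the characters before the first whitespace and copy them into a new string, so that the new string contains only the path of the redirection file.

Expanding paths that start with ~

Unfortunately, dest still cannot simply be passed to open() as it is. Let me explain.

At this point dest undoubtedly holds the path of the redirection file, but not every path can be used by open() as-is. See my earlier post (Misconceptions about command-line arguments), which explains that such functions only accept absolute paths or paths relative to the current working directory; they do not understand ~ . User input does not always meet this requirement, and converting the other forms into absolute paths is the shell's job.

Consider the two forms that need converting.

~/123.md : read the HOME environment variable and concatenate the strings.
~root/123.md : even simpler; the code just prepends /home/ to the user name (this assumes the user's home directory lives under /home).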

void PathAnalyze(char **path) // expands paths that start with ~
{
    char *RelativePath = *path;
    if (isalpha(*(RelativePath + 1)))
    {
        *path = malloc(strlen(RelativePath) + 1 + strlen("/home/") + 1);
        if (*path == NULL)
        {
            myerror("malloc", __LINE__);
        }
        strcpy(*path, "/home/");
    }
    else
    {
        char *home = getenv("HOME"); // value of the HOME environment variable
        *path = malloc(strlen(RelativePath) + 1 + strlen(home) + 1);
        if (*path == NULL)
        {
            myerror("malloc", __LINE__);
        }
        strcpy(*path, home);
    }
    strcat(*path, RelativePath + 1);
    free(RelativePath);
}

With that done, open() can be called directly on the resulting path.
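
Prepending /home/ does not work for every user (root's home directory, for instance, is usually elsewhere). One possible refinement is to look the user up with getpwnam(). The sketch below is a hedged alternative rather than the code used by myshell; expand_tilde() is a hypothetical name, and the function returns a new string instead of replacing the argument in place.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pwd.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated expansion of path, or NULL on failure.
   The caller frees the result. */
static char *expand_tilde(const char *path)
{
    assert(path != NULL);
    if (path[0] != '~')
        return strdup(path); /* nothing to expand */

    const char *rest = path + 1;
    size_t name_len = strcspn(rest, "/"); /* length of the optional user name */
    const char *home = NULL;

    if (name_len == 0) {
        home = getenv("HOME"); /* "~" or "~/..." */
    } else {
        char *name = strndup(rest, name_len);
        if (name == NULL)
            return NULL;
        struct passwd *pw = getpwnam(name); /* "~user" or "~user/..." */
        free(name);
        if (pw != NULL)
            home = pw->pw_dir;
    }
    if (home == NULL)
        return NULL; /* HOME unset or unknown user */

    const char *tail = rest + name_len; /* "" or "/..." */
    size_t size = strlen(home) + strlen(tail) + 1;
    char *result = malloc(size);
    if (result != NULL)
        snprintf(result, size, "%s%s", home, tail);
    return result;
}
```

Returning NULL for an unknown user lets the caller report the problem instead of opening a path that was never meant.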

Executing a segment

With the preparation done, all that remains is to actually run the command in the segment.

void execute(struct COMMAND *cmd)
{

    pid_t pid = fork();
    if (pid == -1) // fork failed: there is no child, so do not fall through to exec
    {
        myerror("fork", __LINE__);
    }
    if (pid > 0)
    {
        wait(NULL);
        return;
    }
    for (int i = 1; i < cmd->argc; i++)
    {
#ifndef NDEBUG
        printf("DEBUG,pid: %d LINE:%d\n", pid, __LINE__);
#endif
        if (*cmd->argv[i] == '~')
        {
            PathAnalyze(&cmd->argv[i]);
        }
    }
#ifndef NDEBUG
    printf("DEBUG:argv[0]:%s\n", cmd->argv[0]);
#endif
    for (int IO_Steam = 0; IO_Steam < 3; IO_Steam++)
    {
        if (cmd->Redirect_FD[IO_Steam] >= 0 && dup2(cmd->Redirect_FD[IO_Steam], IO_Steam) == -1)
        {
            myerror("dup2", __LINE__);
        }
    }
    execvp(cmd->argv[0], cmd->argv);
    myerror("exec", __LINE__);
}

fork() is called first to create a child process, and the parent simply waits for it. The child sets up the redirections described by struct COMMAND and takes argv from it as the arguments of the new program, which execvp() then starts. On success execvp() never returns, so reaching the final myerror("exec", ...) call means something went wrong.
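
The same fork / dup2 / exec pattern can be written so that the caller learns the child's exit status. This is a sketch under assumptions, not the myshell implementation: run_command() is a hypothetical helper, it only redirects standard output, and the exit codes 126 and 127 follow a common shell convention rather than anything required by the source.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs argv with standard output redirected to out_fd (pass -1 for no
   redirection). Returns the child's exit status, or -1 on failure. */
static int run_command(char *const argv[], int out_fd)
{
    assert(argv != NULL && argv[0] != NULL);
    pid_t pid = fork();
    if (pid == -1)
        return -1; /* fork failed: no child exists */
    if (pid == 0) {
        if (out_fd >= 0 && dup2(out_fd, STDOUT_FILENO) == -1)
            _exit(126);
        execvp(argv[0], argv);
        _exit(127); /* only reached if execvp failed */
    }
    int status = 0;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Using waitpid() with the child's pid, rather than wait(NULL), makes sure the status that comes back belongs to the command that was just started.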

Built-in commands

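_Bool BuiltInCommand(struct COMMAND *cmd)
{
    /* built-in: history */
    if (strcmp(cmd->argv[0], "history") == 0)
    {
        HIST_ENTRY **history = NULL;
        history = history_list(); // NULL when there is no history yet
        for (int i = 0; history != NULL && history[i] != NULL; i++)
        {
            printf("%s\n", history[i]->line);
        }
        return 0;
    }
    /* built-in: change working directory */
    if (strcmp(cmd->argv[0], "cd") == 0)
    {
        if (cmd->argv[1] == NULL)
        {
            fprintf(stderr, "cd: missing argument\n");
            return 0;
        }
        char *target = cmd->argv[1];
        if (strcmp(target, "-") == 0)
        {
            if (oldpath == NULL)
            {
                fprintf(stderr, "cd: no previous directory\n");
                return 0;
            }
            target = oldpath; // cd - goes back to the previous directory
        }
        else if (*target == '~')
        {
            PathAnalyze(&cmd->argv[1]);
            target = cmd->argv[1];
        }
        char *cwd = getcwd(NULL, 0); // remember the current directory before leaving it
        if (chdir(target) == 0)
        {
            free(oldpath);
            oldpath = cwd;
        }
        else
        {
            perror("cd");
            free(cwd);
        }
        return 0;
    }
    /* built-in: exit */
    if (strcmp(cmd->argv[0], "exit") == 0 || strcmp(cmd->argv[0], "q") == 0)
    {
        exit(EXIT_SUCCESS);
    }
    return 1;
}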

Finishing touches

Ignoring the relevant signals

signal(SIGHUP, SIG_IGN);
signal(SIGINT, SIG_IGN);
signal(SIGTTIN, SIG_IGN);
signal(SIGTTOU, SIG_IGN);
signal(SIGTSTP, SIG_IGN);
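
One consequence worth keeping in mind: a disposition of SIG_IGN is inherited by child processes and survives exec, so programs started by the shell would ignore Ctrl+C as well. A possible arrangement, sketched below with sigaction(), is to ignore the signals in the shell and restore the defaults in the child before calling execvp(). set_job_signals() and is_ignored() are hypothetical helpers, not part of myshell.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>

/* The signals the shell ignores in this post. */
static const int job_signals[] = {SIGHUP, SIGINT, SIGTTIN, SIGTTOU, SIGTSTP};
#define JOB_SIGNAL_COUNT (sizeof job_signals / sizeof job_signals[0])

/* Installs handler (SIG_IGN or SIG_DFL) for every signal in job_signals.
   Returns 0 on success, -1 on failure. */
static int set_job_signals(void (*handler)(int))
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = handler;
    sigemptyset(&sa.sa_mask);
    for (size_t i = 0; i < JOB_SIGNAL_COUNT; i++)
        if (sigaction(job_signals[i], &sa, NULL) == -1)
            return -1;
    return 0;
}

/* Reports whether signo is currently ignored by this process. */
static int is_ignored(int signo)
{
    struct sigaction cur;
    int rc = sigaction(signo, NULL, &cur);
    assert(rc == 0);
    (void)rc;
    return cur.sa_handler == SIG_IGN;
}
```

In this arrangement the shell would call set_job_signals(SIG_IGN) once at startup, and the child would call set_job_signals(SIG_DFL) between fork() and execvp().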

Coloured output

In main(), I want the prompt and the current working directory printed in red, so the code was changed as follows:

int main(void)
{
    /* ignore the relevant signals */
    signal(SIGHUP, SIG_IGN);
    signal(SIGINT, SIG_IGN);
    signal(SIGTTIN, SIG_IGN);
    signal(SIGTTOU, SIG_IGN);
    signal(SIGTSTP, SIG_IGN);

    read_history(NULL);  // readline library call: load the saved history
    char Prompt[P_SIZE]; // the prompt
    while (1)
    {
        strcpy(Prompt, RED);
        char *pwd = getcwd(NULL, 0); // with a NULL first argument, getcwd allocates the buffer itself
        strncat(Prompt, pwd, 100);
        free(pwd);
        strcat(Prompt, " MYSHELL$" CLOSE);

        char *command = readline(Prompt);
        if (command == NULL) // readline() returns NULL on EOF (Ctrl+D)
        {
            break;
        }
        add_history(command); // add the command to the history
        write_history(NULL);
        launch(command); // run the command
        free(command);   // readline allocates the line; free it to avoid a leak
    }

    if (oldpath != NULL)
    {
        free(oldpath); // avoid a memory leak
    }
}
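
A detail that matters once the prompt contains colour codes: readline counts every prompt character when it positions the cursor unless invisible sequences are bracketed with its RL_PROMPT_START_IGNORE (\001) and RL_PROMPT_END_IGNORE (\002) markers. The sketch below builds such a prompt with snprintf(); build_prompt() and the two PROMPT_ macros are hypothetical names, and the code deliberately does not call readline so that it stays self-contained.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* \001 and \002 are readline's RL_PROMPT_START_IGNORE and
   RL_PROMPT_END_IGNORE markers around the invisible escape codes. */
#define PROMPT_RED   "\001\033[31m\002"
#define PROMPT_RESET "\001\033[0m\002"

/* Writes "<red>cwd MYSHELL$<reset>" into buf.
   Returns 0 on success, -1 if buf is too small. */
static int build_prompt(char *buf, size_t size, const char *cwd)
{
    assert(buf != NULL && cwd != NULL);
    int n = snprintf(buf, size, PROMPT_RED "%s MYSHELL$" PROMPT_RESET, cwd);
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}
```

The resulting buffer would be passed to readline() in place of Prompt; checking the snprintf() return value also reports a working directory that does not fit, instead of silently truncating it.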

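Reflections

Necessary notes

The implementation of launch() in this article is inefficient: it is entirely possible to run the pipeline without counting the pipes first.
It is also entirely possible for analyze() to work without copying the string.
In addition, dropping strtok_r() and implementing the search and split by hand would be considerably more efficient than the code shown here.
I considered refactoring the code before publishing so that readers would see a better version.
In the end I did not, mainly to push myself to think more deeply in future programming work. I am sure such small changes are well within the reach of readers, who are welcome to modify the code in this article and build a more efficient program.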
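
As an illustration of the hand-written split mentioned above, here is a minimal sketch that cuts a command line into pipeline segments in place, without strtok_r() and without copying. It is not the author's code; split_pipeline is a hypothetical helper, and a fixed-size output array is assumed instead of the counted allocation used in launch().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Split `line` in place on '|' and store a pointer to each segment in
 * `parts`. Returns the number of segments, or 0 when there are more than
 * `max` segments (in that case `line` has already been partly modified).
 * Empty segments, as in "a||b", are kept so the caller can reject them. */
static size_t split_pipeline(char *line, char **parts, size_t max)
{
    assert(line != NULL && parts != NULL);
    size_t count = 0;
    char *start = line;
    for (;;)
    {
        if (count == max)
            return 0; /* no room left in the caller's array */
        parts[count++] = start;
        char *bar = strchr(start, '|');
        if (bar == NULL)
            break;       /* last segment */
        *bar = '\0';     /* terminate this segment */
        start = bar + 1; /* the next segment begins after the '|' */
    }
    return count;
}
```

Each segment is visited once and the input buffer is reused, so no per-segment allocation is needed. The trade-off is that the caller has to pick an upper bound for the number of segments.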

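Unfriendly error handling

In this article I handle every error in the simplest and least friendly way.
That is not always reasonable. In myshell, for example, a mistyped command is a common and non-fatal error, yet it is handled in the same simple way, so the program is less forgiving than it should be.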

Questionable call structure

Note this fragment in launch():

if (BuiltInCommand(&cmd[i]))
     {
         execute(&cmd[i]);
     }

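I do not think this design is reasonable. A better approach might be for BuiltInCommand() to call execute() itself once it has determined that the command is not a built-in, instead of having the caller test the return value of BuiltInCommand() and then call execute().

Looking back at the myshell I wrote nearly three months ago, I have to admit how limited my ability was. Fortunately, I have improved enough in those three months to see the problems in the original program.

Test environment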

GNU bash : 5.1.4(1)-release

zsh : 5.8

OS : Arch Linux

Kernel : 5.9.14-arch1-1

Appendix: complete source code

#define NDEBUG
#include <ctype.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h> //signal(), SIG_IGN
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>
//
#include <readline/history.h>
#include <readline/readline.h>
extern char **environ;
#define P_SIZE 128 //size limit of the prompt buffer
#define RED "\033[31m" //red
#define CLOSE "\033[0m" //reset attributes
struct COMMAND
{
    int argc;           //number of arguments
    int Redirect_FD[3]; //redirection targets for stdin, stdout and stderr
    char **argv;
};
char *oldpath;
void PathAnalyze(char **path);
int OpenFile(char *string, int flags);
void execute(struct COMMAND *cmd);
_Bool BuiltInCommand(struct COMMAND *cmd);
void launch(char *command);
void myerror(char *string, int line);
_Bool analyze(char *string, struct COMMAND *cmd);
int main(void)
{
    /* Ignore the relevant signals */
    signal(SIGHUP, SIG_IGN);
    signal(SIGINT, SIG_IGN);
    signal(SIGTTIN, SIG_IGN);
    signal(SIGTTOU, SIG_IGN);
    signal(SIGTSTP, SIG_IGN);
    read_history(NULL);  //readline library call: load the saved history
    char Prompt[P_SIZE]; //prompt buffer
    while (1)
    {
        strcpy(Prompt, RED);
        char *pwd = getcwd(NULL, 0); //with a NULL buffer, getcwd allocates the string itself
        if (pwd == NULL)
        {
            myerror("getcwd", __LINE__);
        }
        strncat(Prompt, pwd, 100);
        free(pwd);
        strcat(Prompt, " MYSHELL$" CLOSE);

        char *command = readline(Prompt);
        if (command == NULL) //readline returns NULL on EOF (Ctrl-D)
        {
            break;
        }
        if (*command != '\0') //skip empty lines
        {
            add_history(command); //add the command to the history
            write_history(NULL);
            launch(command); //run the command
        }
        free(command); //readline allocates the line; free it to avoid a leak
    }

    free(oldpath); //free(NULL) is a no-op, so no check is needed
}
void launch(char *command)
{
    int pipe = 0; //管道计数器
    for (char *pr = command; *pr != '\0'; pr++)
    {
        if (*pr == '|')
        {
            pipe++;
        }
    }
    struct COMMAND *cmd = (struct COMMAND *)calloc(pipe + 1, sizeof(struct COMMAND));
    if (cmd == NULL)
    {
        myerror("malloc", __LINE__);
    }
    char *remain = NULL;
    char *part = strtok_r(command, "|", &remain);
    for (int i = 0; i <= pipe; i++)
    {
        /* 初始化 */
        cmd[i].Redirect_FD[STDIN_FILENO] = -1;
        cmd[i].Redirect_FD[STDOUT_FILENO] = -1;
        cmd[i].Redirect_FD[STDERR_FILENO] = -1;
    }
    for (int i = 0; i <= pipe; i++)
    {

        if (pipe && i < pipe)
        {
            /* 生成临时文件 */
            char TempFile[] = "/tmp/MyShell_XXXXXX";
            int TempFile_FD = mkstemp(TempFile);
            /* 检测生成临时文件是否成功 */
            if (TempFile_FD == -1)
            {
                myerror("mkstemp", __LINE__);
            }
            /* 将本段命令的标准输出重定向至临时文件 */
            cmd[i].Redirect_FD[STDOUT_FILENO] = TempFile_FD;
            /* 将下段命令的标准输入重定向至临时文件 */
            cmd[i + 1].Redirect_FD[STDIN_FILENO] = TempFile_FD;
            unlink(TempFile);
            //删除临时文件（临时文件在被close前依然可用，不会被立即删除）
        }
        analyze(part, &cmd[i]); //分析与检测本段命令中的参数与重定向符
        if (BuiltInCommand(&cmd[i]))
        {
            execute(&cmd[i]);
        }
        if (pipe && i < pipe)
        {
            lseek(cmd[i].Redirect_FD[STDOUT_FILENO], 0, SEEK_SET);
            cmd[i].Redirect_FD[STDOUT_FILENO] = -1;
        }
        part = strtok_r(NULL, "|", &remain);
        for (int IO_Steam = 0; IO_Steam < 3; IO_Steam++)
        {
            if (cmd[i].Redirect_FD[IO_Steam] >= 0)
            {
                close(cmd[i].Redirect_FD[IO_Steam]);
                //关闭文件，释放相关资源
            }
        }

        for (int j = 0; j < cmd[i].argc; j++)
        {
            free(cmd[i].argv[j]);
        }
        free(cmd[i].argv);
    }
    free(cmd);
}

int OpenFile(char *string, int flags)
{

    size_t len = 0;
    char *pr = string;
    while (!isspace(*pr) && *pr++ != '\0')
    {
        len++;
    }

    char *dest = malloc((len + 1) * sizeof(char));
    if (dest == NULL)
    {
        myerror("malloc failed", __LINE__);
    }
    strncpy(dest, string, len);
    dest[len] = '\0';
    memset(string, ' ', sizeof(char) * len);
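    if (*dest == '~')
    {
        PathAnalyze(&dest); //only paths that begin with ~ need expanding
    }
    int fd = open(dest, flags, S_IRUSR | S_IWUSR);
    if (fd == -1)
    {
        printf("error:fd:%d  path:%s\n", fd, dest); //string has been blanked out above, so print dest
        myerror("open", __LINE__);
    }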
    free(dest);
    return fd;
}
void myerror(char *string, int line)
{
    fprintf(stderr, "\aLine:%d,error:\a\n", line);
    fprintf(stderr, "%s:%s\n", string, strerror(errno));
    exit(EXIT_FAILURE);
}
_Bool analyze(char *OriginString, struct COMMAND *cmd)
{
    //string 代表使用管道分割后的「命令段」
    //返回值表示重定向情况,0代表无重定向,1代表有重定向

    char *string = strdup(OriginString);
    if (string == NULL)
    {
        myerror("malloc", __LINE__);
    }
    char *end = string + strlen(string);
    char *result = NULL;
#ifndef NDEBUG
    printf("DEBUG:string:%s\n", string);
#endif
    //处理标准输出、错误输出重定向
    while ((result = strchr(string, '>')) != NULL)
    {
        *result = ' ';
        int IO_Steam = 1;

        result--;
        if (result > string && isdigit(*result))
        {
            if (*result - '0' != STDOUT_FILENO && *result - '0' != STDERR_FILENO)
            {
                printf("Unknow COMMAND\n");
                exit(EXIT_FAILURE);
            }
            else
            {
                IO_Steam = *result - '0';
                *result = ' ';
            }
        }
        if (cmd->Redirect_FD[IO_Steam] >= 0)
        {
            printf("Unknow COMMAND\n");
            exit(EXIT_FAILURE);
        }
        result += 2;
        _Bool Append = 0;
        if (result < end && *result == '>')
        {
            Append = 1; //附加模式
            *result = ' ';
        }
        while (result < end && isspace(*result))
        {
            result++;
        }

        if (result < end)
        {
            cmd->Redirect_FD[IO_Steam] = OpenFile(result, O_WRONLY | O_CREAT | (Append ? O_APPEND : O_TRUNC));
        }
        else
        {
            printf("Unknow COMMAND\n");
            exit(EXIT_FAILURE);
        }
    }
    //处理输入重定向
    while ((result = strchr(string, '<')) != NULL)
    {
        *result = ' ';
        if (cmd->Redirect_FD[STDIN_FILENO] >= 0)
        {
            printf("Unknow COMMAND\n");
            exit(EXIT_FAILURE);
        }
        while (result < end && isspace(*result))
        {
            result++;
        }
        if (result < end)
        {
            cmd->Redirect_FD[STDIN_FILENO] = OpenFile(result, O_RDONLY);
        }
        else
        {
            printf("Unknow COMMAND\n");
            exit(EXIT_FAILURE);
        }
    }

    /* 解析命令行参数 */
    int arg_max = 16; //参数上限，在无法满足需求时会自动增加

    cmd->argv = (char **)calloc(arg_max, sizeof(char *));
    if (cmd->argv == NULL)
    {
        myerror("malloc", __LINE__);
    }
    char *remain = NULL;
    result = strtok_r(string, " ", &remain);
    while (result != NULL)
    {
        if (cmd->argc + 1 >= arg_max) //always keep one free slot for the terminating NULL
        {
            arg_max *= 2; //double the capacity
            char **grown = (char **)realloc(cmd->argv, arg_max * sizeof(char *));
            if (grown == NULL)
            {
                myerror("realloc", __LINE__);
            }
            cmd->argv = grown;
        }
        cmd->argv[cmd->argc] = strdup(result);
        if (cmd->argv[cmd->argc] == NULL)
        {
            myerror("strdup", __LINE__);
        }
        cmd->argc++;
        result = strtok_r(NULL, " ", &remain);
    }
    cmd->argv[cmd->argc] = NULL; //argv[argc] must be NULL
    free(string);
    return 0;
}
void PathAnalyze(char **path) //处理~开头的相对路径
{
    char *RelativePath = *path;
    if (isalpha(*(RelativePath + 1)))
    {
        *path = malloc(strlen(RelativePath) + 1 + strlen("/home/") + 1);
        if (*path == NULL)
        {
            myerror("malloc", __LINE__);
        }
        strcpy(*path, "/home/");
    }
    else
    {
        char *home = getenv("HOME"); //获得HOME环境变量的值
        *path = malloc(strlen(RelativePath) + 1 + strlen(home) + 1);
        if (*path == NULL)
        {
            myerror("malloc", __LINE__);
        }
        strcpy(*path, home);
    }
    strcat(*path, RelativePath + 1);
    free(RelativePath);
}
void execute(struct COMMAND *cmd)
{

    pid_t pid = fork();
    if (pid == -1)
    {
        myerror("fork", __LINE__);
    }
    if (pid > 0)
    {
        wait(NULL); //parent: wait for the child to finish
        return;
    }
    for (int i = 1; i < cmd->argc; i++)
    {
#ifndef NDEBUG
        printf("DEBUG,pid: %d LINE:%d\n", pid, __LINE__);
#endif
        if (*cmd->argv[i] == '~')
        {
            PathAnalyze(&cmd->argv[i]);
        }
    }
#ifndef NDEBUG
    printf("DEBUG:argv[0]:%s\n", cmd->argv[0]);
#endif
    for (int IO_Steam = 0; IO_Steam < 3; IO_Steam++)
    {
        if (cmd->Redirect_FD[IO_Steam] >= 0 && dup2(cmd->Redirect_FD[IO_Steam], IO_Steam) == -1)
        {
            myerror("dup2", __LINE__);
        }
    }
    execvp(cmd->argv[0], cmd->argv);
    myerror("exec", __LINE__);
}

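_Bool BuiltInCommand(struct COMMAND *cmd)
{
    /* An empty command has no argv[0]; treat it as already handled */
    if (cmd->argv[0] == NULL)
    {
        return 0;
    }
    /* Built-in: command history */
    if (strcmp(cmd->argv[0], "history") == 0)
    {
        HIST_ENTRY **history = history_list(); /* NULL when the history is empty */
        for (int i = 0; history != NULL && history[i] != NULL; i++)
        {
            printf("%s\n", history[i]->line);
        }
        return 0;
    }
    /* Built-in: change working directory */
    if (strcmp(cmd->argv[0], "cd") == 0)
    {
        const char *target = cmd->argv[1];
        if (target == NULL)
        {
            target = getenv("HOME"); /* plain "cd" goes to the home directory */
        }
        else if (strcmp(target, "-") == 0)
        {
            target = oldpath; /* "cd -" returns to the previous directory */
        }
        else if (*target == '~')
        {
            PathAnalyze(&cmd->argv[1]); /* replaces argv[1] with the expanded path */
            target = cmd->argv[1];
        }
        char *previous = getcwd(NULL, 0); /* remember where we were */
        if (target == NULL)
        {
            fprintf(stderr, "cd: no directory to change to\n");
        }
        else if (chdir(target) == -1)
        {
            perror("cd");
        }
        else
        {
            free(oldpath); /* free(NULL) is a no-op */
            oldpath = previous;
            previous = NULL;
        }
        free(previous);
        return 0;
    }
    /* Built-in: exit */
    if (strcmp(cmd->argv[0], "exit") == 0 || strcmp(cmd->argv[0], "q") == 0)
    {
        exit(EXIT_SUCCESS);
    }
    return 1;
}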

References

1. 童永清. Linux C 编程实战 [M]. 1st ed. Beijing: 人民邮电出版社.

2. W. Richard Stevens, Stephen A. Rago. UNIX环境高级编程 [M]. 3rd ed. Translated by 戚正伟. Beijing: 人民邮电出版社.

3. Linux Programmer's Manual.

4. General Commands Manual.

5. 鸟哥. 鸟哥的Linux私房菜 [M]. 4th ed. Beijing: 人民邮电出版社.

Published 2020-11-29 · Dev · 39 min read (about 5813 characters)

GNU/Linux C Development in Practice: myls

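Requirements

Print file names in different colors depending on file type and permissions
Support any combination of the following seven ls options
- -a do not hide entries whose names begin with .
- -i print the index number (inode number) of each file
- -l use the long listing format
- -s print the allocated size of each file, in blocks
- -t sort by time, newest first
- -r reverse the sort order
- -R list subdirectories recursively

Required headers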

#include <dirent.h>
#include <errno.h>
#include <fcntl.h>
#include <grp.h>
#include <locale.h>
#include <pwd.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

Development

Reading and parsing user input

Seven global variables of type _Bool record which options were given.

_Bool Options_a;
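_Bool Options_i; //print the i-node number
_Bool Options_l;
_Bool Options_r; //reverse order
_Bool Options_R;
_Bool Options_s; //print the allocated size of each file in blocks
_Bool Options_t; //sort by time

Whether a string in argv is an option or a path is decided by checking whether its first character is -.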

_Bool p = 0;       //set once a path has been read
char *path = NULL; //points to the path string
for (int i = 1; i < argc; i++)
{
    if (*argv[i] == '-')                                   //option or path?
    {                                                      //option
        for (unsigned int n = 1; n < strlen(argv[i]); n++) //walk every letter
            switch (argv[i][n])
            {
            case 'a':
                Options_a = 1;
                break;
            case 'i':
                Options_i = 1;
                break;
            case 'l':
                Options_l = 1;
                break;
            case 'r':
                Options_r = 1;
                break;
            case 'R':
                Options_R = 1;
                break;
            case 's':
                Options_s = 1;
                break;
            case 't':
                Options_t = 1;
                break;
            default: //unknown option
                printf("%s error:Unknown option: %s\n", __FILE__, argv[i]);
                exit(EXIT_FAILURE);
                break;
            }
    }
    else
    {          //path
        p = 1; //a path has been read
        path = argv[i];
    }
}

As in ls, if the user supplies a path, the files under that path are listed; otherwise the path defaults to the current directory.

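if (!p) //no path was read, so the path comes from getcwd
{
    path = getcwd(NULL, 0); //get the current directory
    if (path == NULL)
        myerror("getcwd", " ", __LINE__);
}

The code above calls myerror(), a function I wrote to simplify error handling. It is defined as follows: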

void myerror(const char *string1, const char *string2, int line)
{
    printf("\033[31mline:%d:file:%s\n%s:%s\033[0m\n", line, string2, string1, strerror(errno)); //strerror() requires string.h

    exit(EXIT_FAILURE);
}

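Careful readers may think the p flag is unnecessary: one could set path = NULL; beforehand, test if (path == NULL) after parsing to tell whether a path was read, and drop the variable p. That approach has a flaw, however.

When the user supplies a path, path points to a string owned by argv, and free(path) must not be called.
When the user supplies no path, path points to the current directory string allocated by getcwd(), and free(path) must be called.

The variable p therefore records whether free() is needed, which prevents a memory leak.

Opening directories recursively

Of the seven options in the requirements, -R is without doubt the hardest to implement.
I implement it with a function that takes the path of the target directory as its argument and calls itself recursively.

First, a few macros: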

#define StackPush(x) FileStack[++FileStackTop] = (x)
#define StackTop FileStack[FileStackTop]
#define StackPop free(FileStack[FileStackTop--])

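The overall flow of OpenADirectory is shown below.

Tip: to make the flow of the function easier to follow, many details are omitted from the code below. For now, please focus on the overall flow and idea rather than on the details; the omitted parts are all explained later in the article.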

typedef struct
{
    struct stat FileStat;
    struct dirent File_di;

} FileInfo;

void OpenADirectory(const char *path)
{
    /* Save the original directory */
    char *oldpath = getcwd(NULL, 0);
    if (oldpath == NULL)
        myerror("getcwd", " ", __LINE__);

    DIR *CurrentDir = opendir(path);
    /* Error handling for a failed opendir() is omitted here */

    /* Change directory */
    if (chdir(path) == -1)
    {
        /* Error handling for a failed chdir() is omitted here */
    }

    /* File stack */
    FileInfo **FileStack = (FileInfo **)malloc(sizeof(FileInfo *) * FileNumberMax);
    if (FileStack == NULL)
        myerror("malloc", " ", __LINE__);
    int FileStackTop = -1;


    /* Read the directory entries */
    struct dirent *CurrentFile;
    errno = 0; /* lets a NULL from readdir() be told apart from an error */
    while ((CurrentFile = readdir(CurrentDir)) != NULL)
    {
        FileInfo *temp = (FileInfo *)malloc(sizeof(FileInfo));
        if (temp == NULL)
            myerror("malloc", "", __LINE__);
        temp->File_di = *CurrentFile;
        if (lstat(CurrentFile->d_name, &(temp->FileStat)) == -1)
        {
            printf("\033[31mError:Line:%d: can't get stat of %s,%s\033[0m\n", __LINE__, CurrentFile->d_name, strerror(errno));
            free(temp);
            errno = 0; /* already reported; must not be mistaken for a readdir() error */
            continue;
        }
        if (FileStackTop < FileNumberMax - 1) /* StackPush increments before storing */
            StackPush(temp);
        else
            myerror("\033[31mToo many files\033[0m\n", " ", __LINE__);
    }
    /* readdir() error check */
    if (errno) //requires errno.h
        printf("\033[31mline:%d:error:%s\033[0m\n", __LINE__, strerror(errno));

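At the start, the function saves the path of the current working directory and then opens the directory named by its argument.

OpenADirectory() creates a pointer named FileStack that points to pointers to FileInfo; in other words, FileStack is a pointer to a pointer. Because malloc() allocates sizeof(FileInfo *) * FileNumberMax bytes for it, that is, room for FileNumberMax objects of type FileInfo *, FileStack behaves like an array of FileNumberMax pointers. I use this array as a stack that holds the FileInfo of every file in the directory named by path.

Note: some readers may be thinking that FileStack is just an array of pointers, so FileInfo *FileStack[FileNumberMax]; would allocate it automatically, and wonder why malloc() is needed.
The pointer to a pointer is not there for show; it is genuinely necessary. FileInfo *FileStack[FileNumberMax]; defines an automatic variable, which lives on the stack. The stack is usually small, and with many levels of recursion it can easily overflow. Memory from malloc() comes from the heap, which is much larger than the stack, so the program keeps working.
Another question might be whether malloc() could be called like this instead: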

FileInfo *array=malloc(sizeof(FileInfo) * FileNumberMax);

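This changes the array of FileInfo * into an array of FileInfo. It also stays off the stack and avoids the pointer to a pointer, which some find hard to follow, but it wastes far more memory. Most directories contain far fewer than FileNumberMax files. For the same directory, the pointer array wastes only the space of the unused pointers, whereas the FileInfo array wastes the space of the unused FileInfo objects, and a FileInfo is much larger than a FileInfo *. The FileInfo array is therefore not a good choice either.

If the directory is opened successfully, the working directory of the program is changed to it (that is, to the directory given as the argument), because lstat() has to be called for every file in the directory to get its attributes.
lstat() takes a file path. After changing directory, the file name can be passed directly as a relative path; without changing directory the file would not be found. Concatenating strings is another option, but it means one concatenation per file, which is less efficient, and because the length is not known in advance the buffer is easily wasted or overflowed. Changing directory avoids that trouble and is more efficient.

A loop then visits every file in the directory, gets its attributes, and stores its struct stat together with its struct dirent in a FileInfo.
This has many advantages: once this step is done, everything needed to print the file information is collected in one FileInfo structure, which makes the later detailed output and sorting much more convenient.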

qsort(FileStack, FileStackTop + 1, sizeof(FileInfo *), cmp);

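qsort() is then used to sort FileStack; how to write cmp, which is passed as a function pointer, is covered later.
Note that what actually gets sorted is the FileInfo * values, while every FileInfo stays where it was. Sorting FileInfo * instead of FileInfo is a very useful trick that makes sorting more efficient.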
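
The pointer-sorting idea can be shown in isolation. The sketch below is an illustration only: Record, by_name and sort_records are hypothetical names standing in for FileInfo, cmp and the qsort() call in OpenADirectory().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct
{
    char name[256]; /* large payload, stands in for FileInfo */
    long mtime;
} Record;

/* qsort() hands the comparator pointers to the array elements. The
 * elements are Record *, so each argument is really a Record **. */
static int by_name(const void *a, const void *b)
{
    const Record *A = *(const Record *const *)a;
    const Record *B = *(const Record *const *)b;
    return strcmp(A->name, B->name);
}

/* Sort an index of pointers; the Record objects themselves never move. */
static void sort_records(Record **index, size_t n)
{
    assert(index != NULL);
    qsort(index, n, sizeof *index, by_name);
}
```

Each swap inside qsort() moves one pointer rather than a structure of several hundred bytes, which is where the saving comes from.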

    for (int i = FileStackTop; i >= 0; i--)
    {
        /* The code that prints the file information is omitted here */
    }

    while (FileStackTop >= 0)
    {
        if (Options_R && S_ISDIR(StackTop->FileStat.st_mode) && strcmp(StackTop->File_di.d_name, ".") != 0 && strcmp(StackTop->File_di.d_name, "..") != 0)
            OpenADirectory(StackTop->File_di.d_name);
        StackPop;
    }

    if (chdir(oldpath) == -1) //change back to the original directory
        myerror("chdir", path, __LINE__);

    /* Release and close */
    free(oldpath);
    closedir(CurrentDir);
    free(FileStack);
}

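As shown above, a for loop walks the stack from the top and prints the required information for each element, which produces the sorted output.

The stack is then visited from the top once more. When -R is enabled and the element at the top of the stack is a directory other than . and .., OpenADirectory() is called recursively with the path of that directory. Afterwards the FileInfo that the top element points to is freed and the stack is popped.
Finally the stack itself and the other allocations are released.

Why lstat() rather than stat(), which also returns file attributes?

The reason is simple: lstat() returns the attributes of a symbolic link itself, whereas stat() returns the attributes of the file the link points to.

Incidentally, because FileStack has already been sorted by qsort(), the subdirectories are also opened in order during the recursion, so everything myls prints appears in the correct order.

That completes the description of how OpenADirectory() works.

Details of opening a directory

The first thing to look at is error handling, and readdir() needs particular care.

Tip: readdir() returns NULL both at the end of the directory and on error; errno is set only on error.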

If the end of the directory stream is reached, NULL is returned and errno is not changed. If an error occurs, NULL is returned and errno is set appropriately. To distinguish end of stream from an error, set errno to zero before calling readdir() and then check the value of errno if NULL is returned.

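The NULL return value of readdir() therefore has two meanings, and the only way to tell whether readdir() succeeded is to check whether errno is 0.
errno must be 0 before the call.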

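opendir() can easily fail, for example because of insufficient permissions. When that happens I do not exit the program; I report the error and skip the directory that could not be opened.
Remember to free the string that getcwd() allocated for the current working directory and to clear errno.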

DIR *CurrentDir = opendir(path);
if (CurrentDir == NULL)
{
    printf("\033[31mLine:%d:readfailed:%s/%s\t %s\033[0m\n", __LINE__, oldpath, path, strerror(errno));
    errno = 0;
    free(oldpath);
    return;
}

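Changing directory can also fail because of insufficient permissions: for example, a user without x permission on a directory cannot enter it. The error check here is therefore just as necessary.
Again, do not forget to free the memory and clear errno, and in addition close the directory that was opened.

/* Change directory */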
if (chdir(path) == -1)
{
    printf("\033[31mLine:%d:chdir:%s\t %s\033[0m\n", __LINE__, path, strerror(errno));
    errno = 0;
    free(oldpath);
    closedir(CurrentDir);
    return;
}

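The error handling for malloc() needs no further explanation; readers will know what to do.

At the end of OpenADirectory the working directory is changed back, so that the remaining directories can be opened during the recursion.

Printing detailed file information

Formatting the file size

This part is easy: read the st_size member of the corresponding struct stat and pass it to the formatting function.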

void FormatBytes(off_t size)
{
    char *array[] = {"B", "KB", "MB", "GB", "TB", "PB", "EB"};
    int n = 0;
    while (size >= 1024 && n < 6) //never index past the last unit
    {
        n++;
        size /= 1024;
    }
    printf("%lld%s\t", (long long)size, array[n]);
}

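Formatting the modification time

void FormatTime(time_t mtime)
{
    char string[20];
    struct tm *timeinfo = localtime(&mtime); //local time; gmtime() would give UTC
    strftime(string, sizeof(string), "%b %e %R", timeinfo);
    printf("%s\t", string);
}

The modification time is stored in the st_mtim.tv_sec member of struct stat. localtime() converts it to the local time zone (UTC+8 in my case). The month name printed by %b follows the LC_TIME locale, which is set in main().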

/* Use the LC_TIME locale from the environment */
if (setlocale(LC_TIME, "") == NULL)
    myerror("setlocale", " ", __LINE__);
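
Note that setlocale() only affects how strftime() spells things such as the month name; the conversion to the local time zone is done by localtime(). A minimal sketch of a formatter built on localtime_r() follows. format_local_time is a hypothetical name, and writing into a caller-supplied buffer is my own choice rather than the article's.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Format `mtime` in the local time zone, e.g. "Nov 29 20:15", into `buf`.
 * Returns the number of characters written, or 0 on failure. */
static size_t format_local_time(time_t mtime, char *buf, size_t size)
{
    assert(buf != NULL);
    struct tm local;
    if (localtime_r(&mtime, &local) == NULL) /* uses TZ or the system zone */
        return 0;
    return strftime(buf, size, "%b %e %R", &local);
}
```

localtime_r() is used instead of localtime() only because it writes into a caller-owned struct tm; for a single-threaded program such as myls either would do.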

Formatting the owner and group

void FormateUserAndGroup(uid_t userid, gid_t groupid)
{
    struct passwd *owner = getpwuid(userid); //#include <pwd.h>
    struct group *group = getgrgid(groupid); //#include <grp.h>

    //fall back to the numeric ID when there is no matching entry
    if (owner != NULL)
        printf("%s\t", owner->pw_name);
    else
        printf("%u\t", (unsigned int)userid);

    if (group != NULL)
        printf("%s\t", group->gr_name);
    else
        printf("%u\t", (unsigned int)groupid);
}

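The function takes the st_uid and st_gid members of struct stat as arguments, calls getpwuid() and getgrgid() with them to get the corresponding structures, and prints the user name and group name.

Tip:
getpwuid() is declared in pwd.h
getgrgid() is declared in grp.h

Formatting the file permissions

Formatting the permissions is the simplest part: it is a mechanical sequence of tests and prints.

Because of the special permissions SUID, SGID and SBIT, I did not try to reuse code with shift operators, which makes this part rather long.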

void prauthority(mode_t mode)
{
    if (S_ISREG(mode))
        putchar('-');
    else if (S_ISDIR(mode))
        putchar('d');
    else if (S_ISLNK(mode))
        putchar('l');
    else if (S_ISFIFO(mode))
        putchar('p'); //ls marks a FIFO (named pipe) with p
    else if (S_ISBLK(mode))
        putchar('b');
    else if (S_ISCHR(mode))
        putchar('c');
    else if (S_ISSOCK(mode))
        putchar('s');
    //Owner
    if (S_IRUSR & mode)
        putchar('r');
    else
        putchar('-');
    if (S_IWUSR & mode)
        putchar('w');
    else
        putchar('-');
    if (S_IXUSR & mode)
    {
        if (S_ISUID & mode)
            putchar('s');
        else
            putchar('x');
    }
    else
        putchar('-');

    //group
    if (S_IRGRP & mode)
        putchar('r');
    else
        putchar('-');
    if (S_IWGRP & mode)
        putchar('w');
    else
        putchar('-');
    if (S_IXGRP & mode)
    {
        if (S_ISGID & mode)
            putchar('s');
        else
            putchar('x');
    }
    else
        putchar('-');

    //Other
    if (S_IROTH & mode)
        putchar('r');
    else
        putchar('-');
    if (S_IWOTH & mode)
        putchar('w');
    else
        putchar('-');
    if (S_IXOTH & mode)
    {
        if (S_ISVTX & mode)
            putchar('t');
        else
            putchar('x');
    }
    else
        putchar('-');
    putchar('\t');
}

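Printing the `i-node` number and the file size in blocks

Both values are read directly from struct stat and printed.

if (Options_i)
    printf("%-10lu\t", (unsigned long)FileStack[i]->FileStat.st_ino);
if (Options_s)
    printf("%-8ld\t", (long)FileStack[i]->FileStat.st_blocks); //allocated blocks (512-byte units on Linux); st_blksize is only the preferred I/O size

Printing file names in colors that depend on type and permissions

The name comes from char d_name[256] in struct dirent; the only work is choosing a color for each type.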

if (S_ISREG(FileStack[i]->FileStat.st_mode) &&
    ((S_IXUSR & FileStack[i]->FileStat.st_mode) ||
     (S_IXGRP & FileStack[i]->FileStat.st_mode) ||
     (S_IXOTH & FileStack[i]->FileStat.st_mode)))
    printf("\033[32m%s\033[0m\n", FileStack[i]->File_di.d_name);
else if (S_ISREG(FileStack[i]->FileStat.st_mode))
    printf("%s\n", FileStack[i]->File_di.d_name);
else if (S_ISDIR(FileStack[i]->FileStat.st_mode))
    printf("\033[34m%s\033[0m\n", FileStack[i]->File_di.d_name);
else if (S_ISLNK(FileStack[i]->FileStat.st_mode))
    printf("\033[31m%s\033[0m\n", FileStack[i]->File_di.d_name);

Sorted output

OpenADirectory() calls qsort(). The function name cmp is implicitly converted to a function pointer, and qsort() uses it to sort the pointer array FileStack.

qsort(FileStack, FileStackTop + 1, sizeof(FileInfo *), cmp);

cmp() is implemented as follows.

int cmp(const void *a, const void *b)
{
    const FileInfo *A = *(FileInfo **)a;
    const FileInfo *B = *(FileInfo **)b;
    int i;
    if (Options_t)
    {
        time_t t = B->FileStat.st_mtim.tv_sec - A->FileStat.st_mtim.tv_sec;
        if (t > 0)
            i = -1;
        else if (t == 0)
            i = 0;
        else
            i = 1;
    }
    else
        i = strcmp(B->File_di.d_name, A->File_di.d_name);
    if (Options_r)
        i = -i;
    return i;
}

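Whether the order is reversed depends on -r, and the sort key depends on -t.

myls is now complete; the full code is in the appendix at the end of this article.

Reflections

Allocating `FileStack` dynamically

The implementation above crudely uses FileNumberMax as the number of pointers in FileStack, which is not the best solution.

Most directories contain far fewer than FileNumberMax files, so a lot of space is wasted.

A more reasonable approach is to give FileStack an initial capacity larger than the number of files in most directories and to grow it with realloc() when it fills up.

This inevitably slows myls down to some extent; the balance between speed and memory use is left for the reader to weigh.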
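
The realloc() approach described above could be sketched as follows. PtrStack, ptr_stack_push and ptr_stack_free are hypothetical names, and both the initial capacity of 64 and the doubling policy are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct
{
    void **items;    /* stored pointers, e.g. FileInfo * */
    size_t count;    /* number of pointers in use */
    size_t capacity; /* number of pointers allocated */
} PtrStack;

/* Push `item`, doubling the array when it is full.
 * Returns 0 on success and -1 when memory runs out. */
static int ptr_stack_push(PtrStack *s, void *item)
{
    assert(s != NULL);
    if (s->count == s->capacity)
    {
        size_t new_cap = s->capacity ? s->capacity * 2 : 64;
        void **grown = realloc(s->items, new_cap * sizeof *grown);
        if (grown == NULL)
            return -1; /* s->items is still valid and unchanged */
        s->items = grown;
        s->capacity = new_cap;
    }
    s->items[s->count++] = item;
    return 0;
}

/* Release the array itself; the pointed-to objects belong to the caller. */
static void ptr_stack_free(PtrStack *s)
{
    assert(s != NULL);
    free(s->items);
    s->items = NULL;
    s->count = s->capacity = 0;
}
```

Because the capacity doubles, pushing n items triggers only about log2(n) reallocations, so the slowdown mentioned above stays small in practice.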

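Getting file attributes

Warning: this part contains a good deal of my own unverified opinion and is not guaranteed to be correct. Corrections are welcome.

In OpenADirectory(), readdir() reads the directory entries, and the struct dirent it returns contains the file name and the i-node number.

lstat() is then called with the file path (I use the file name as a relative path) to read the attributes.

On file systems that use i-nodes, the attributes are stored in the i-node, and lstat() may read them as follows:

1. Open the directory that contains the file and walk through it
2. Read directory entries until the one for the given file is found
3. Take the i-node number from that entry
4. Use the i-node number to find the i-node and read the attributes

The struct dirent already provides the i-node number, yet calling lstat() repeats steps 1 to 3.

I have not found a better way to read the attributes; readers who know more about this are welcome to tell me.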


Appendix: complete source code

/* myls.c */
#include <dirent.h>
#include <errno.h>
#include <fcntl.h>
#include <grp.h>
#include <locale.h>
#include <pwd.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

// #define NDEBUG

#define StackPush(x) FileStack[++FileStackTop] = (x)
#define StackTop FileStack[FileStackTop]
#define StackPop free(FileStack[FileStackTop--])
#define FileNumberMax 40960

_Bool Options_a;
_Bool Options_i; //显示i-node
_Bool Options_l;
_Bool Options_r; //逆序
_Bool Options_R;
_Bool Options_s; //以块数形式显示每个文件分配的尺寸
_Bool Options_t; //时间排序

typedef struct
{
    struct stat FileStat;
    struct dirent File_di;

} FileInfo;
void prauthority(mode_t mode);
void myerror(const char *string, const char *filename, int line);
void OpenADirectory(const char *path);
int cmp(const void *a, const void *b);
void FormateUserAndGroup(uid_t userid, gid_t groupid);
void FormatTime(time_t mtime);
void FormatBytes(off_t size);

int main(int argc, char **argv)
{

    /* 本地化时间设置 */
    if (setlocale(LC_TIME, "") == NULL)
        myerror("setlocale", " ", __LINE__);
    signal(SIGTTIN, SIG_IGN); //忽略SIGTTIN信号

    _Bool p = 0; //表明是否读取到路径
    char *path;  //指向存储路径字符串的指针
    for (int i = 1; i < argc; i++)
    {
        if (*argv[i] == '-')                                   //判断是参数还是路径
        {                                                      //是参数
            for (unsigned int n = 1; n < strlen(argv[i]); n++) //遍历每一格字母
                switch (argv[i][n])
                {
                case 'a':
                    Options_a = 1;
                    break;
                case 'i':
                    Options_i = 1;
                    break;
                case 'l':
                    Options_l = 1;
                    break;
                case 'r':
                    Options_r = 1;
                    break;
                case 'R':
                    Options_R = 1;
                    break;
                case 's':
                    Options_s = 1;
                    break;
                case 't':
                    Options_t = 1;
                    break;
                default: //错误的参数
                    printf("%s error:Unknow options: %s\n", __FILE__, argv[i]);
                    exit(EXIT_FAILURE);
                    break;
                }
        }
        else
        {          //是路径
            p = 1; //表明已经读到了路径
            path = argv[i];
        }
    }
    if (!p) //如果没读取到路径（等价于路径是通过getcwd获得的）
    {
        path = getcwd(NULL, 0); //获取当前路径
        if (path == NULL)
            myerror("getcwd", " ", __LINE__);
    }
    OpenADirectory(path);
    if (!p)
        free(path);
}

int cmp(const void *a, const void *b)
{
    const FileInfo *A = *(FileInfo **)a;
    const FileInfo *B = *(FileInfo **)b;
    int i;
    if (Options_t)
    {
        time_t t = B->FileStat.st_mtim.tv_sec - A->FileStat.st_mtim.tv_sec;
        if (t > 0)
            i = -1;
        else if (t == 0)
            i = 0;
        else
            i = 1;
    }
    else
        i = strcmp(B->File_di.d_name, A->File_di.d_name);
    if (Options_r)
        i = -i;
    return i;
}

void OpenADirectory(const char *path)
{
    /* 保存原目录 */
    char *oldpath = getcwd(NULL, 0);
    if (oldpath == NULL)
        myerror("getcwd", " ", __LINE__);

    DIR *CurrentDir = opendir(path);
    if (CurrentDir == NULL)
    {
        printf("\033[31mLine:%d:readfailed:%s/%s\t %s\033[0m\n", __LINE__, oldpath, path, strerror(errno));
        errno = 0;
        free(oldpath);
        return;
    }

    /* 切换目录 */
    if (chdir(path) == -1)
    {
        printf("\033[31mLine:%d:chdir:%s\t %s\033[0m\n", __LINE__, path, strerror(errno));
        errno = 0;
        free(oldpath);
        closedir(CurrentDir);
        return;
    }

    /* 文件堆 */
    FileInfo **FileStack = (FileInfo **)malloc(sizeof(FileInfo *) * FileNumberMax);
    if (FileStack == NULL)
        myerror("malloc", " ", __LINE__);
    int FileStackTop = -1;

    if (Options_R) /* 如果开启了递归显示子目录，则输出切换结果 */
    {
        char *NewPath = getcwd(NULL, 0);
        if (NewPath == NULL)
            myerror("getcwd", " ", __LINE__);
        printf("%s:\n", NewPath);
        free(NewPath);
    }

    /* Read the directory entries */
    struct dirent *CurrentFile;
    errno = 0; /* lets a NULL from readdir() be told apart from an error */
    while ((CurrentFile = readdir(CurrentDir)) != NULL)
    {
        FileInfo *temp = (FileInfo *)malloc(sizeof(FileInfo));
        if (temp == NULL)
            myerror("malloc", "", __LINE__);
        temp->File_di = *CurrentFile;
        if (lstat(CurrentFile->d_name, &(temp->FileStat)) == -1)
        {
            printf("\033[31mError:Line:%d: can't get stat of %s,%s\033[0m\n", __LINE__, CurrentFile->d_name, strerror(errno));
            free(temp);
            errno = 0; /* already reported; must not be mistaken for a readdir() error */
            continue;
        }
        if (FileStackTop < FileNumberMax - 1) /* StackPush increments before storing */
            StackPush(temp);
        else
            myerror("\033[31mToo many files\033[0m\n", " ", __LINE__);
    }
    /* readdir() error check */
    if (errno) //requires errno.h
        printf("\033[31mline:%d:error:%s\033[0m\n", __LINE__, strerror(errno));

    qsort(FileStack, FileStackTop + 1, sizeof(FileInfo *), cmp);

    for (int i = FileStackTop; i >= 0; i--)
    {
        if (Options_a == 0 && *FileStack[i]->File_di.d_name == '.')
            continue;
        if (Options_l)
        {
            if (Options_i)
                printf("%-10lu\t", FileStack[i]->FileStat.st_ino);
            if (Options_s)
                printf("%-8ld\t", (long)FileStack[i]->FileStat.st_blocks); //allocated blocks, not the preferred I/O size
            prauthority(FileStack[i]->FileStat.st_mode);
            FormateUserAndGroup(FileStack[i]->FileStat.st_uid, FileStack[i]->FileStat.st_gid);
            FormatBytes(FileStack[i]->FileStat.st_size);
            FormatTime(FileStack[i]->FileStat.st_mtim.tv_sec);
        }

        if (S_ISREG(FileStack[i]->FileStat.st_mode) &&
            ((S_IXUSR & FileStack[i]->FileStat.st_mode) ||
             (S_IXGRP & FileStack[i]->FileStat.st_mode) ||
             (S_IXOTH & FileStack[i]->FileStat.st_mode)))
            printf("\033[32m%s\033[0m\n", FileStack[i]->File_di.d_name);
        else if (S_ISREG(FileStack[i]->FileStat.st_mode))
            printf("%s\n", FileStack[i]->File_di.d_name);
        else if (S_ISDIR(FileStack[i]->FileStat.st_mode))
            printf("\033[34m%s\033[0m\n", FileStack[i]->File_di.d_name);
        else if (S_ISLNK(FileStack[i]->FileStat.st_mode))
            printf("\033[31m%s\033[0m\n", FileStack[i]->File_di.d_name);
    }

    while (FileStackTop >= 0)
    {
        if (Options_R && S_ISDIR(StackTop->FileStat.st_mode) && strcmp(StackTop->File_di.d_name, ".") != 0 && strcmp(StackTop->File_di.d_name, "..") != 0)
            OpenADirectory(StackTop->File_di.d_name);
        StackPop;
        // FileStackTop--;
    }

    if (chdir(oldpath) == -1) //切回目录
        myerror("chdir", path, __LINE__);

    /* 释放与关闭 */
    free(oldpath);
    closedir(CurrentDir);
    free(FileStack);
}
void myerror(const char *string1, const char *string2, int line)
{
    printf("\033[31mline:%d:file:%s\n%s:%s\033[0m\n", line, string2, string1, strerror(errno)); //strerror()需要 string.h

    exit(EXIT_FAILURE);
}

void FormatBytes(off_t size)
{
    char *array[] = {"B", "KB", "MB", "GB", "TB", "PB"};
    int n = 0;
    while (size >= 1024)
    {
        n++;
        size /= 1024;
    }
    printf("%ld%s\t", size, array[n]);
}

void FormatTime(time_t mtime)
{
    char string[20];
    struct tm *timeinfo = localtime(&mtime); //local time; gmtime() would give UTC
    strftime(string, sizeof(string), "%b %e %R", timeinfo);
    printf("%s\t", string);
}

void FormateUserAndGroup(uid_t userid, gid_t groupid)
{
    struct passwd *owner = getpwuid(userid); //#include <pwd.h>
    struct group *group = getgrgid(groupid); //#include <grp.h>

    //fall back to the numeric ID when there is no matching entry
    if (owner != NULL)
        printf("%s\t", owner->pw_name);
    else
        printf("%u\t", (unsigned int)userid);

    if (group != NULL)
        printf("%s\t", group->gr_name);
    else
        printf("%u\t", (unsigned int)groupid);
}
void prauthority(mode_t mode)
{
    if (S_ISREG(mode))
        putchar('-');
    else if (S_ISDIR(mode))
        putchar('d');
    else if (S_ISLNK(mode))
        putchar('l');
    else if (S_ISFIFO(mode))
        putchar('p'); // ls marks a FIFO (named pipe) with 'p'
    else if (S_ISBLK(mode))
        putchar('b');
    else if (S_ISCHR(mode))
        putchar('c');
    else if (S_ISSOCK(mode))
        putchar('s');
    //Owner
    if (S_IRUSR & mode)
        putchar('r');
    else
        putchar('-');
    if (S_IWUSR & mode)
        putchar('w');
    else
        putchar('-');
    if (S_IXUSR & mode)
    {
        if (S_ISUID & mode)
            putchar('s');
        else
            putchar('x');
    }
    else
        putchar('-');

    //group
    if (S_IRGRP & mode)
        putchar('r');
    else
        putchar('-');
    if (S_IWGRP & mode)
        putchar('w');
    else
        putchar('-');
    if (S_IXGRP & mode)
    {
        if (S_ISGID & mode)
            putchar('s');
        else
            putchar('x');
    }
    else
        putchar('-');

    //Other
    if (S_IROTH & mode)
        putchar('r');
    else
        putchar('-');
    if (S_IWOTH & mode)
        putchar('w');
    else
        putchar('-');
    if (S_IXOTH & mode)
    {
        if (S_ISVTX & mode)
            putchar('t');
        else
            putchar('x');
    }
    else
        putchar('-');
    putchar('\t');
}

参考资料

¹. 童永清.Linux C 编程实战[M].第1版.北京:人民邮电出版社 ↩

². W.RichardStevens.Stephen.UNIX环境高级编程[M].第3版.戚正伟,译.北京:人民邮电出版社 ↩

³. Linux Programmer’s Manual ↩

⁴. General Commands Manual ↩

⁵. 鸟哥.鸟哥的Linux私房菜[M].第四版.北京:人民邮电出版社 ↩

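Published 2020-10-20 · Dev · 29 min read (about 4325 characters)

Parsing Command-Line Arguments

LICENSE
This article is released under the GNU General Public License, version 3 (GPLv3).

Command-line arguments came up again in my recent programming work, so here is another post about them. To keep any single post from growing too long, the material is split across a series of posts. The series covers only getopt(), getopt_long(), and getopt_long_only(); other approaches are out of scope.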

`getopt()` basics

#include <unistd.h>
int getopt(int argc, char *const argv[], const char *optstring);
extern char *optarg;
extern int optind, opterr, optopt;

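The prototype of getopt() and its related global variables are shown above. Note that using getopt() requires including unistd.h.

getopt() handles command-line options that follow the Single UNIX Specification.

The Single UNIX Specification requires[1]:

Each command-line option is a single alphanumeric character
Every option starts with the character -

For example, getopt() can handle a command such as command -t 123 -p 456.txt -uroot, but not command --times 123 --path 456.txt --userroot.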

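`getopt()` parameters

`argv` and `argc`

These two parameters are the count of the command-line arguments to parse and the array of pointers to the argument strings. The values passed are usually the ones the program itself received through int main(int argc, char *argv[]). Readers who are unsure about this can refer to my post "Misconceptions about Command-Line Arguments".

`optstring`

optstring is a string that describes the expected format of the command-line options. Its role is somewhat similar to that of the conversion specifications in scanf().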

optstring is a string containing the legitimate option characters. If such a character is followed by a colon, the option requires an argument, so getopt() places a pointer to the following text in the same argv-element, or the text of the following argv-element, in optarg. Two colons mean an option takes an optional arg; if there is text in the current argv-element (i.e., in the same word as the option name itself, for example, -oarg), then it is returned in optarg, otherwise optarg is set to zero. This is a GNU extension.
If optstring contains W followed by a semicolon, then -W foo is treated as the long option --foo. (The -W option is reserved by POSIX.2 for implementation extensions.) This behavior is a GNU extension, not available with libraries before glibc 2.

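In other words, optstring lists the valid option characters. A character followed by : requires an argument, and getopt() makes optarg point either to the rest of the same argv element (as in -p456.txt) or to the next argv element (as in -p 456.txt). A character followed by :: takes an optional argument.

For an optional argument, only text in the same argv element as the option itself (for example -oarg) is returned in optarg; otherwise optarg is set to NULL. This is a GNU extension.
If optstring contains W followed by a semicolon, -W foo is treated as the long option --foo. (POSIX.2 reserves -W for implementation extensions.) This behavior is also a GNU extension and is not available in libraries older than glibc 2.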

`getopt()` return values

If an option was successfully found, then getopt() returns the option character.
If all command-line options have been parsed, then getopt() returns -1.
If getopt() encounters an option character that was not in optstring, then ? is returned.
If getopt() encounters an option with a missing argument, then the return value depends on the first character in optstring:
if it is :, then : is returned
otherwise ? is returned.[3]

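My summary of the passage above follows.

warning

The summaries in this post are my own reading of the manual and may contain mistakes. They are for reference only; when in doubt, rely on the original text quoted above.

If an option is found successfully, getopt() returns its option character.
If all options have been parsed, getopt() returns -1.
If an option that is not in optstring is found, getopt() returns ?.
If an option is missing its argument, the return value depends on optstring[0]:
- if optstring[0] is :, getopt() returns :
- otherwise it returns ?, that is, return optstring[0] == ':' ? ':' : '?';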

`getopt()` scanning modes

If the first character of optstring is + or the environment variable POSIXLY_CORRECT is set, then option processing stops as soon as a nonoption argument is encountered.
If the first character of optstring is -, then each nonoption argv-element is handled as if it were the argument of an option with character code 1. (This is used by programs that were written to expect options and other argv-elements in any order and that care about the ordering of the two.)
The special argument -- forces an end of option-scanning regardless of the scanning mode.[3]

In summary:

If optstring[0] is + or the environment variable POSIXLY_CORRECT is set, getopt() stops as soon as it meets the first non-option argument, not merely an option missing from optstring.
If optstring[0] is -, every non-option element of argv is handled as if it were the argument of an option whose character code is 1. (This suits programs that accept options and other argv elements in any order and care about the relative order of the two.)
The special argument -- forces option scanning to end, whatever the scanning mode.

Consider this example:

/* getopt1.c */
#include <stdio.h>
#include <unistd.h>

/* Passing NULL to %s is undefined behavior, so print a visible marker instead. */
#define STR(s) ((s) ? (s) : "(null)")

int main(int argc, char *argv[])
{
    int ch;
    // opterr = 0;
    while ((ch = getopt(argc, argv, "a:b:c::d::e::fxg:")) != -1)
    {
        printf("    ch\t函数的返回值\t%c\n", ch);
        printf("optarg\t当前选项的参数\t%s\n", STR(optarg));
        printf("optind\t指向下个字符串\t%d\n", optind);
        printf("argv[optind]\t\t%s\n", STR(argv[optind]));
        printf("optopt\t指向出错字符串\t%d\n", optopt);
        printf("opterr\t若出错输出消息\t%d\n", opterr);
        printf("\n");
    }
    return 0;
}

Compile the program and run it with the arguments -a -- -c-- -- -g.

Output on my machine:

    ch  函数的返回值    a
optarg  当前选项的参数  --
optind  指向下个字符串  3
argv[optind]            -c--
optopt  指向出错字符串  0
opterr  若出错输出消息  1

    ch  函数的返回值    c
optarg  当前选项的参数  --
optind  指向下个字符串  4
argv[optind]            --
optopt  指向出错字符串  0
opterr  若出错输出消息  1

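Note that when -- is consumed as the argument of an option, getopt() returns that option character as usual. When -- is read on its own rather than as an option argument, getopt() returns -1 and the loop ends.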

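danger

The use of + and - in optstring is a GNU extension.[3]

Programs that rely on + and - in optstring may therefore behave differently with C libraries that do not implement this GNU extension. The behavior comes from the C library (glibc), not from the compiler.

A related but separate point: when compiling with a strict standard option such as -std=c99 or -std=c11, glibc's headers do not declare POSIX functions such as getopt() unless a feature test macro (for example _POSIX_C_SOURCE) is defined. Using -std=gnu99 or -std=gnu11 instead avoids the problem.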

`getopt()` error handling

While processing the option list, getopt() can detect two kinds of errors:
an option character that was not specified in optstring
a missing option argument (i.e., an option at the end of the command line without an expected argument).
Such errors are handled and reported as follows:
By default, getopt() prints an error message on standard error, places the erroneous option character in optopt, and returns ? as the function result.
If the caller has set the global variable opterr to zero, then getopt() does not print an error message. The caller can determine that there was an error by testing whether the function return value is ?. (By default, opterr has a nonzero value.)
If the first character (following any optional + or - described above) of optstring is a colon (:), then getopt() likewise does not print an error message. In addition, it returns : instead of ? to indicate a missing option argument. This allows the caller to distinguish the two different types of errors.[3]

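In summary, getopt() detects two kinds of errors while processing the option list:

an option character that is not listed in optstring
an option that is missing its argument (that is, an option at the end of the command line without the argument it expects)

These errors are handled and reported as follows:

By default, getopt() prints an error message on standard error, stores the offending option character in optopt, and returns ?.
If the caller has set the global variable opterr to zero, getopt() prints no message, and the caller detects the error by checking for the return value ?. (opterr is nonzero by default.)
If the first character of optstring (after any optional + or -) is a colon (:), getopt() also prints no message and returns : instead of ? for a missing argument, so the caller can distinguish the two kinds of error.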

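Global variables related to `getopt()`

Next come the four global variables related to getopt().

optarg: if an option takes an argument, getopt() sets optarg to point to that argument string while processing the option.
opterr: when an option error occurs, getopt() prints an error message by default. An application can disable this by setting opterr to 0.
optind: the index in argv of the next string to be processed. It starts at 1 and advances as getopt() consumes argv elements, so a single call may leave it unchanged (in the middle of a cluster such as -fx) or advance it by 2 (an option followed by a separate argument).
optopt: if an error occurs while processing options, getopt() stores the option character that caused the error in optopt.[1]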

Consider this example:

/* getopt1.c */
#include <stdio.h>
#include <unistd.h>

/* Passing NULL to %s is undefined behavior, so print a visible marker instead. */
#define STR(s) ((s) ? (s) : "(null)")

int main(int argc, char *argv[])
{
    int ch;
    // opterr = 0;
    while ((ch = getopt(argc, argv, "a:b:c::d::e::fxg:")) != -1)
    {
        printf("    ch\t函数的返回值\t%c\n", ch);
        printf("optarg\t当前选项的参数\t%s\n", STR(optarg));
        printf("optind\t指向下个字符串\t%d\n", optind);
        printf("argv[optind]\t\t%s\n", STR(argv[optind]));
        printf("optopt\t指向出错字符串\t%d\n", optopt);
        printf("opterr\t若出错输出消息\t%d\n", opterr);
        printf("\n");
    }
    return 0;
}

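This code demonstrates how getopt() is used and how the related variables change from call to call.
Run the program with the arguments -a 234 -b -c456 -d 789 -e -f -h -g and work through the output block by block.

Output on my machine: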

    ch  函数的返回值    a
optarg  当前选项的参数  234
optind  指向下个字符串  3
argv[optind]            -b
optopt  指向出错字符串  0
opterr  若出错输出消息  1

    ch  函数的返回值    b
optarg  当前选项的参数  -c456
optind  指向下个字符串  5
argv[optind]            -d
optopt  指向出错字符串  0
opterr  若出错输出消息  1

    ch  函数的返回值    d
optarg  当前选项的参数  (null)
optind  指向下个字符串  6
argv[optind]            789
optopt  指向出错字符串  0
opterr  若出错输出消息  1

    ch  函数的返回值    e
optarg  当前选项的参数  (null)
optind  指向下个字符串  8
argv[optind]            -f
optopt  指向出错字符串  0
opterr  若出错输出消息  1

    ch  函数的返回值    f
optarg  当前选项的参数  (null)
optind  指向下个字符串  9
argv[optind]            -h
optopt  指向出错字符串  0
opterr  若出错输出消息  1

./getopt1.out: invalid option -- 'h'
    ch  函数的返回值    ?
optarg  当前选项的参数  (null)
optind  指向下个字符串  10
argv[optind]            -g
optopt  指向出错字符串  104
opterr  若出错输出消息  1

./getopt1.out: option requires an argument -- 'g'
    ch  函数的返回值    ?
optarg  当前选项的参数  (null)
optind  指向下个字符串  11
argv[optind]            (null)
optopt  指向出错字符串  103
opterr  若出错输出消息  1

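Points worth special attention:

In block 2, -c456 is taken as the argument of option b, not as option c with the argument 456.
In block 3, option d, which is declared with an optional argument, gets a null argument rather than 789, because an optional argument must follow its option without a space. To make 789 the argument of option d, write -d789.

Now run the program again with the arguments -ab -c123 -de -fx.

Output on my machine: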

    ch  函数的返回值    a
optarg  当前选项的参数  b
optind  指向下个字符串  2
argv[optind]            -c123
optopt  指向出错字符串  0
opterr  若出错输出消息  1

    ch  函数的返回值    c
optarg  当前选项的参数  123
optind  指向下个字符串  3
argv[optind]            -de
optopt  指向出错字符串  0
opterr  若出错输出消息  1

    ch  函数的返回值    d
optarg  当前选项的参数  e
optind  指向下个字符串  4
argv[optind]            -fx
optopt  指向出错字符串  0
opterr  若出错输出消息  1

    ch  函数的返回值    f
optarg  当前选项的参数  (null)
optind  指向下个字符串  4
argv[optind]            -fx
optopt  指向出错字符串  0
opterr  若出错输出消息  1

    ch  函数的返回值    x
optarg  当前选项的参数  (null)
optind  指向下个字符串  5
argv[optind]            (null)
optopt  指向出错字符串  0
opterr  若出错输出消息  1

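Points worth special attention:

In block 1, b is taken as the argument of option a, not as a separate option b. Compare block 1 with blocks 4 and 5, where -fx is parsed as option f followed by option x.
In block 3, e is taken as the argument of option d, which is declared with an optional argument, because the text follows the option directly in the same argv element. It is not parsed as option e.

Long options

A long option starts with two "-" characters, and its argument can be written in either of two forms: "--arg=param" or "--arg param".

Basics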

#include <getopt.h>
struct option
{
    const char *name;
    int has_arg;
    int *flag;
    int val;
};
int getopt_long(int argc, char *const argv[], const char *optstring, const struct option *longopts, int *longindex);
int getopt_long_only(int argc, char *const argv[], const char *optstring, const struct option *longopts, int *longindex);

`getopt_long()`

argc and argv need no further explanation.
optstring: when the program accepts only long options, optstring should be "" (an empty string), not NULL.
longopts: an array of struct option.

name
is the name of the long option.
has_arg
is: no_argument (or 0) if the option does not take an argument; required_argument (or 1) if the option requires an argument; or optional_argument (or 2) if the option takes an optional argument.
flag
specifies how results are returned for a long option.
If flag is NULL, then getopt_long() returns val. (For example, the calling program may set val to the equivalent short option character.)
Otherwise, getopt_long() returns 0, and flag points to a variable which is set to val if the option is found, but left unchanged if the option is not found.
val
is the value to return, or to load into the variable pointed to by flag.
The last element of the array has to be filled with zeros.[3]

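In other words:

name
The option name.
has_arg
Must be one of the three macros no_argument (no argument), required_argument (argument required), or optional_argument (argument optional).
flag, val
When this long option is parsed:
- If flag is NULL, getopt_long() returns val. (For example, the caller can set val to the corresponding short option character.)
- If flag is not NULL, getopt_long() returns 0 and the variable that flag points to is set to val. If the option is not encountered, the variable that flag points to is left unchanged.
Every field of the last element of the longopts array must be 0.

If longindex is not NULL, it points to a variable which is set to the index of the long option relative to longopts.[3]
That is, if longindex is not NULL, the variable it points to is set to the index in longopts of the long option that was just parsed.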
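
The two ways of reporting a long option (through flag, or through the return value val) can be combined in one table. The sketch below is illustrative only: struct settings, parse_long(), and the option names --verbose and --file are assumptions made for this example.

```c
#include <assert.h>
#include <getopt.h>
#include <stddef.h>

/* Hypothetical settings filled in by parse_long(). */
struct settings {
    int verbose;      /* set to 1 through the flag pointer */
    const char *file; /* argument of --file or -f, or NULL */
};

/* Parse --verbose, --file=NAME and -f NAME. Returns 0 on success, -1 on error. */
int parse_long(int argc, char *argv[], struct settings *out)
{
    int ch;

    assert(argv != NULL && out != NULL);
    out->verbose = 0;
    out->file = NULL;

    const struct option longopts[] = {
        /* flag != NULL: getopt_long() stores val (1) in *flag and returns 0 */
        {"verbose", no_argument, &out->verbose, 1},
        /* flag == NULL: getopt_long() returns val, here the short option 'f' */
        {"file", required_argument, NULL, 'f'},
        {NULL, 0, NULL, 0} /* terminating element, all fields zero */
    };

    optind = 1; /* restart scanning at the first argument */
    while ((ch = getopt_long(argc, argv, "f:", longopts, NULL)) != -1) {
        switch (ch) {
        case 0: /* a flag option: the variable has already been updated */
            break;
        case 'f':
            out->file = optarg;
            break;
        default: /* '?' for an unknown option or a missing argument */
            return -1;
        }
    }
    return 0;
}
```

Setting val to the short option character lets --file and -f share one case label, which is the usage the manual hints at.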

`getopt_long_only()`

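getopt_long_only() is similar to getopt_long(), except that an option starting with a single - can also be a long option. When an option starts with -, getopt_long_only() first tries to treat it as a long option. If it matches no long option but does match a short option, it is parsed as a short option instead.

To summarize, for an option that starts with a single -:

getopt_long() treats it as one or more short options
getopt_long_only() tries to treat it as a long option first
In other words, getopt_long() reads -ab as either "option a and option b" or "option a with the argument b", whereas getopt_long_only() first tries "long option ab" and falls back to the short-option reading only if the longopts array contains no long option that matches ab.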
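
The difference can be checked by scanning the same argument with both functions and recording the first value returned. In this sketch first_option() is a hypothetical helper, and the long option add is mapped to 'A' purely so that it can be distinguished from the short option a.

```c
#include <assert.h>
#include <getopt.h>
#include <stddef.h>

/* Scan argv and return the first option value reported, or -1 if none.
 * long_only selects getopt_long_only() instead of getopt_long(). */
int first_option(int argc, char *argv[], int long_only)
{
    static const struct option longopts[] = {
        {"add", no_argument, NULL, 'A'}, /* long option reported as 'A' */
        {NULL, 0, NULL, 0}
    };
    int first = -1;
    int ch;

    assert(argv != NULL);
    optind = 1; /* restart scanning at the first argument */
    opterr = 0; /* no diagnostics in this sketch */

    for (;;) {
        ch = long_only ? getopt_long_only(argc, argv, "ad", longopts, NULL)
                       : getopt_long(argc, argv, "ad", longopts, NULL);
        if (ch == -1)
            break;
        if (first == -1)
            first = ch; /* remember only the first result, but finish the scan */
    }
    return first;
}
```

For the single argument -add, getopt_long() reports the short option a first, while getopt_long_only() reports the long option add (returned here as 'A'). The scan always runs to completion so that getopt's internal position is not left in the middle of an option cluster.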

Example

#include <stdio.h>     /* for printf */
#include <stdlib.h>    /* for exit */
#include <getopt.h>
int
main(int argc, char **argv)
{
   int c;
   int digit_optind = 0;
   while (1) {
       int this_option_optind = optind ? optind : 1;
       int option_index = 0;
       static struct option long_options[] = {
           {"add",     required_argument, 0,  0 },
           {"append",  no_argument,       0,  0 },
           {"delete",  required_argument, 0,  0 },
           {"verbose", no_argument,       0,  0 },
           {"create",  required_argument, 0, 'c'},
           {"file",    required_argument, 0,  0 },
           {0,         0,                 0,  0 }
       };
       c = getopt_long(argc, argv, "abc:d:012",
                long_options, &option_index);
       if (c == -1)
           break;
       switch (c) {
       case 0:
           printf("option %s", long_options[option_index].name);
           if (optarg)
               printf(" with arg %s", optarg);
           printf("\n");
           break;
       case '0':
       case '1':
       case '2':
           if (digit_optind != 0 && digit_optind != this_option_optind)
             printf("digits occur in two different argv-elements.\n");
           digit_optind = this_option_optind;
           printf("option %c\n", c);
           break;
       case 'a':
           printf("option a\n");
           break;
       case 'b':
           printf("option b\n");
           break;
       case 'c':
           printf("option c with value '%s'\n", optarg);
           break;
       case 'd':
           printf("option d with value '%s'\n", optarg);
           break;
       case '?':
           break;
       default:
           printf("?? getopt returned character code 0%o ??\n", c);
       }
   }
   if (optind < argc) {
       printf("non-option ARGV-elements: ");
       while (optind < argc)
           printf("%s ", argv[optind++]);
       printf("\n");
   }
   exit(EXIT_SUCCESS);
}

[3]

Compile the program and run it as ./getopt_long -add --append -d34 -1 --verbose. The output is:

option a
option d with value 'd'
option append
option d with value '34'
option 1
option verbose

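If the call c = getopt_long(argc, argv, "abc:d:012", ...) in the code above is changed to getopt_long_only, and the program is recompiled and run with the same arguments, the output becomes: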

option add with arg --append
option d with value '34'
option 1
option verbose

The difference is clear.

getopt_long() parses -add as option a followed by option d with the argument d.

getopt_long_only() parses -add as the long option add.

References

¹. W.RichardStevens.Stephen.UNIX环境高级编程[M].第3版.戚正伟,译.北京:人民邮电出版社 ↩

². C++ 命令行参数解析.[G/OL].CSDN.https://blog.csdn.net/qq_34719188/article/details/83788199 ↩

³. Linux Programmer’s Manual.[G/OL].https://man7.org/linux/man-pages/man3/getopt.3.html ↩

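Published 2020-10-15 · Dev · 20 min read (about 2952 characters)

Data Structure Alignment

For most x86-64 instructions, keeping data aligned improves efficiency but does not affect program behavior. On the other hand, on some models of Intel and AMD processors, some of the SSE instructions that implement multimedia operations do not work correctly with unaligned data. These instructions operate on 16-byte blocks of data, and the instructions that transfer data between the SSE unit and memory require the memory address to be a multiple of 16. Any attempt to access memory at an address that does not satisfy the alignment requirement causes an exception, and the default behavior is for the program to terminate.[1]

The size of a struct is not always the sum of its members' sizes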

struct test
{
    char a;
    long b;
    int c;
    char d;
};

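The size of a struct is not always equal to the sum of the sizes of its members, because there are often "gaps" (padding) between the members.
For example, the struct above occupies 24 bytes on my machine, while the sizes of its members add up to only 14 bytes.
A quick calculation shows that 10 of the 24 bytes, about 41.7% of the struct, go unused. This is not necessarily a bad thing, but creating many such structs on a device with little available memory may not be a good idea.

How alignment works

The "alignment requirement" of basic data types

As noted above, there are "gaps" between the members of a struct, but how exactly are these gaps distributed?

To answer that, we first need to know the "alignment requirement" of each basic data type.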

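INFO

C11 defines the _Alignof operator, which yields the "alignment requirement" of a type. It is used in much the same way as sizeof.

The _Alignof operator yields the alignment requirement of a type; put the type name in the parentheses that follow the keyword _Alignof:
size_t d_align = _Alignof(float);
Suppose d_align is 4. This means that the alignment requirement of objects of type float is 4; that is, 4 is the number of bytes between consecutive addresses at which values of this type can be stored. In general, an alignment value should be a non-negative integer power of 2.[2]

The meaning of _Alignof(type) is: given the definition type a;, the expression (uintptr_t)&a % _Alignof(type) == 0 holds.

Larger alignment values are called stricter or stronger, and smaller ones weaker.[2]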
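
The relationship between an object's address and its alignment requirement can be checked directly. The helper is_aligned() below is a hypothetical name introduced for this sketch; it relies on uintptr_t from stdint.h, which is optional in the C standard but available on common platforms.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return 1 if the address p is a multiple of align, 0 otherwise. */
int is_aligned(const void *p, size_t align)
{
    assert(align != 0);
    return (uintptr_t)p % align == 0;
}
```

For any correctly declared object x of type T, is_aligned(&x, _Alignof(T)) is expected to be 1, since the compiler places x at an address that satisfies the alignment requirement of T.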

#include <stdio.h>
int main(void)
{
    printf("char %zu\n", _Alignof(char));
    printf("short %zu\n", _Alignof(short));
    printf("int %zu\n", _Alignof(int));
    printf("void* %zu\n", _Alignof(void *));
    printf("long %zu\n", _Alignof(long));
    printf("long long %zu\n", _Alignof(long long));
    printf("float %zu\n", _Alignof(float));
    printf("double %zu\n", _Alignof(double));
    printf("long double %zu\n", _Alignof(long double));

    // const-qualified types
    printf("const char %zu\n", _Alignof(const char));
    printf("const short %zu\n", _Alignof(const short));
    printf("const int %zu\n", _Alignof(const int));
    printf("const void* %zu\n", _Alignof(const void *));
    printf("const long %zu\n", _Alignof(const long));
    printf("const long long %zu\n", _Alignof(const long long));
    printf("const float %zu\n", _Alignof(const float));
    printf("const double %zu\n", _Alignof(const double));
    printf("const long double %zu\n", _Alignof(const long double));

    // unsigned types
    printf("unsigned char %zu\n", _Alignof(unsigned char));
    printf("unsigned short %zu\n", _Alignof(unsigned short));
    printf("unsigned int %zu\n", _Alignof(unsigned int));
    printf("unsigned long %zu\n", _Alignof(unsigned long));
    printf("unsigned long long %zu\n", _Alignof(unsigned long long));
}

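As the output shows, the "alignment requirement" depends only on the underlying data type and is unaffected by const, signed, or unsigned.

The "alignment requirement" of a struct

A fully defined struct is a compound data type, so it has an "alignment requirement" of its own.
The alignment requirement of a struct is the largest of the "alignment requirements" of its members.
The alignment requirement of the struct below is therefore the largest of 1, 8, 4, and 1, that is, 8. This can be verified with _Alignof(struct test).
This gives rule 1 of data structure alignment: the address of a struct satisfies the struct's "alignment requirement".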

struct test
{
    char a;
    long b;
    int c;
    char d;
};

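Note in particular that for any data type, the size of the type must be an integer multiple of its "alignment requirement".
For basic data types this rule is satisfied automatically, since the size of a single element is always an integer multiple of its "alignment requirement". For a struct, a "gap" may have to be added after the last member so that the size of the struct is an integer multiple of its "alignment requirement".
This gives rule 2 of data structure alignment: the size of a struct is a multiple of the struct's "alignment requirement".

"Gaps" in a struct

With the "alignment requirement" of data types covered, we can now look at how the "gaps" in a struct are actually distributed.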

#include <stddef.h> // provides offsetof
#include <stdio.h>
struct test
{
    char a;
    long b;
    int c;
    char d;
};
int main(void)
{
    // offsetof yields a size_t, which is printed with %zu
    printf("offsetof(struct test, a)\t%zu\n", offsetof(struct test, a));
    printf("offsetof(struct test, b)\t%zu\n", offsetof(struct test, b));
    printf("offsetof(struct test, c)\t%zu\n", offsetof(struct test, c));
    printf("offsetof(struct test, d)\t%zu\n", offsetof(struct test, d));
}

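INFO

If you need to determine the actual position of a member of a structure, taking boundary alignment into account, you can use the offsetof macro (defined in stddef.h).
offsetof( type, member )
type is the type of the structure and member is the name of the member you want. The result is a size_t value giving the offset, in bytes, of the start of the specified member from the start of the structure.[3]

From the size of each member and its offset from the start of the struct, we obtain the following table.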

offset	content
0	char a
1-7	padding
8-15	long b
16-19	int c
20	char d
21-23	padding

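As stated above, the "alignment requirement" of a struct is the largest of its members' "alignment requirements". Since alignment requirements are normally powers of 2, the struct's "alignment requirement" is also the least common multiple of its members' alignment requirements. In other words, the start address of the struct satisfies the "alignment requirement" of every member. It is therefore enough for each member's offset to be a multiple of that member's "alignment requirement": the member's address will then satisfy its "alignment requirement" as well.
This gives rule 3 of data structure alignment: the offset of a member is a multiple of the member's "alignment requirement".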
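
The alignment rules described above are enough to compute a struct layout by hand. The sketch below does the same computation in code; round_up() and compute_layout() are hypothetical helpers, and real compilers may apply further platform-specific rules (for example packing pragmas) that this sketch ignores.

```c
#include <assert.h>
#include <stddef.h>

/* Round n up to the next multiple of align (align must be non-zero). */
static size_t round_up(size_t n, size_t align)
{
    return (n + align - 1) / align * align;
}

/* Compute the offsets of members laid out in declaration order.
 * size[i] and align[i] describe member i; offset[i] receives its offset.
 * Returns the total size of the struct, including trailing padding. */
size_t compute_layout(const size_t size[], const size_t align[],
                      size_t count, size_t offset[])
{
    size_t end = 0;       /* first free byte after the previous member */
    size_t max_align = 1; /* becomes the alignment of the whole struct */

    assert(size != NULL && align != NULL && offset != NULL);
    for (size_t i = 0; i < count; i++) {
        /* each member's offset is a multiple of its own alignment */
        offset[i] = round_up(end, align[i]);
        end = offset[i] + size[i];
        if (align[i] > max_align)
            max_align = align[i];
    }
    /* the total size is a multiple of the struct's alignment */
    return round_up(end, max_align);
}
```

For members with sizes and alignments {1, 8, 4, 1}, which correspond to char a, long b, int c, and char d on an LP64 system, the function yields the offsets 0, 8, 16, and 20 and a total size of 24 bytes.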

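The "alignment requirement" of a union

A union follows the same rule as a struct: its "alignment requirement" is the strictest "alignment requirement" among its members. This is not necessarily the requirement of its largest member, since a large char array, for example, still has an alignment requirement of 1.
For the union below, the "alignment requirement" is that of long y; (equivalently, of double z;), that is, 8.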

union test
{
    char x;
    long y;
    double z;
};
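
A union whose largest member has the weakest alignment makes the rule easier to see. The following sketch uses a hypothetical union sample and helper union_tail_padding(); the concrete numbers in the note after it assume an LP64 system where long is 8 bytes.

```c
#include <assert.h>
#include <stddef.h>

#define BUF_LEN 9

/* Hypothetical union: the largest member (buf) has the weakest alignment. */
union sample
{
    char buf[BUF_LEN];
    long y;
};

/* Bytes of padding that follow the largest member, buf. */
size_t union_tail_padding(void)
{
    return sizeof(union sample) - BUF_LEN;
}
```

The alignment of union sample comes from long y, not from the larger member buf. On an LP64 system the union is therefore padded from 9 to 16 bytes, so union_tail_padding() is expected to be 7 there.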

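Nesting compound data structures

When compound data structures are nested, simply treat each inner compound structure as a new data type. There is no need to care whether a member is a basic type, a struct, or a union: work through the levels one at a time and determine the "alignment requirement" at each level.

Here is an example: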

struct TEST
{
    union U1
    {
        char a;
        int b;
        short c;
    } d;
    long e;
    struct S1
    {
        int f;
        union U1 g;
        unsigned long h;
    } i;
    union U2
    {
        struct S1 j;
        union U1 k;
    } l;
    char m;
};

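WARNING

The code above exists only to illustrate nesting of compound data structures. It has no practical use and is hard to understand and maintain. In real development, do not write code like this without a very good reason.

Here is my analysis of struct TEST:

1. In U1 the member with the strictest alignment is int b;, so the "alignment requirement" of U1 is that of int b;, that is, 4. The size of U1 is also 4.
2. The "alignment requirement" of long e; is 8, so there is a 4-byte "gap" between d and e.
3. Now consider S1.
   1. From step 1, the "alignment requirement" of U1 is 4. Since int f; is 4 bytes, there is no "gap" between f and g.
   2. The "alignment requirement" of unsigned long h; is 8, and the members f and g before it occupy exactly the first 8 bytes of S1, so there is no "gap" between g and h.
   3. The members now occupy the first 16 bytes of S1, and the "alignment requirement" of S1 is that of unsigned long h;, that is, 8, so there is no "gap" after h.
   4. The "alignment requirement" of S1 is 8, and d and e together occupy the first 16 bytes of struct TEST, so there is no "gap" between e and i.
4. Now consider union U2.
   1. From the above, the "alignment requirement" of struct S1 j; is 8 and that of union U1 k; is 4, so the "alignment requirement" of union U2 is the larger of 4 and 8, that is, 8.
   2. All members of a union start at offset 0, so j and k overlap. The size of U2 is the size of its largest member j, 16 bytes, which is already a multiple of 8, so U2 has no trailing "gap".
   3. i ends at offset 32, which is a multiple of the "alignment requirement" of l (8), so there is no "gap" between i and l.
5. The "alignment requirement" of char m; is 1, so m can directly follow l at offset 48 and there is no "gap" between l and m.
6. From the above, the "alignment requirement" of struct TEST is the largest of 4, 8, 8, 8, and 1, that is, 8.
7. m ends at offset 49 and the size must be a multiple of 8, so there are 7 bytes of "gap" after m and the total size is 56 bytes.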
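
The analysis can be checked against the compiler with offsetof and sizeof. This sketch repeats the declaration of struct TEST and adds a hypothetical helper test_tail_padding(); the concrete numbers in the note below assume an LP64 platform such as x86-64 Linux, where long is 8 bytes.

```c
#include <assert.h>
#include <stddef.h>

struct TEST
{
    union U1
    {
        char a;
        int b;
        short c;
    } d;
    long e;
    struct S1
    {
        int f;
        union U1 g;
        unsigned long h;
    } i;
    union U2
    {
        struct S1 j;
        union U1 k;
    } l;
    char m;
};

/* Number of padding bytes after the last member, m. */
size_t test_tail_padding(void)
{
    return sizeof(struct TEST) - (offsetof(struct TEST, m) + sizeof(char));
}
```

On an LP64 system, offsetof reports 0, 8, 16, 32, and 48 for d, e, i, l, and m, sizeof(struct TEST) is 56, and test_tail_padding() returns 7, which matches the step-by-step analysis.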

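Reordering the members of a struct

With all of this groundwork in place, readers should now have their own understanding of data structure alignment. One question remains: how should members be ordered to improve the space efficiency of a struct?
The answer is simple: sort the members in descending order of "alignment requirement".
Let us return to the first example: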

struct test
{
    char a;
    long b;
    int c;
    char d;
};

Sorting its members in descending order of "alignment requirement" gives:

struct test
{
    long b;
    int c;
    char a;
    char d;
};

After this simple reordering, struct test needs only 16 bytes, a saving of 8 bytes.
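
The saving can be measured by declaring both orderings side by side. The type names test_original and test_sorted and the helper bytes_saved() are introduced only for this sketch, so that both layouts can coexist in one file.

```c
#include <assert.h>
#include <stddef.h>

/* Members in the original declaration order. */
struct test_original
{
    char a;
    long b;
    int c;
    char d;
};

/* The same members sorted by descending alignment requirement. */
struct test_sorted
{
    long b;
    int c;
    char a;
    char d;
};

/* Bytes saved per object by sorting the members. */
size_t bytes_saved(void)
{
    return sizeof(struct test_original) - sizeof(struct test_sorted);
}
```

On an LP64 system the two sizes are 24 and 16 bytes, so bytes_saved() is expected to return 8. On other platforms the exact numbers may differ, but the sorted layout should never be larger than the original one.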

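This is not always the best choice, however. Sometimes the members of a struct follow a logical order that helps developers understand it, and reordering would break that order.

TIP

Sometimes there are good reasons for not rearranging the members of a structure to reduce the space lost to alignment. For example, we may want to keep related members together to improve the maintainability and readability of the program. In the absence of such reasons, however, the members of a structure should be arranged according to their boundary requirements to reduce the memory lost to boundary alignment.
When the program will create hundreds or even thousands of such structures, reducing wasted memory becomes more pressing than readability. In that case, adding comments to the declaration may avoid the loss of readability.[3]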

References

¹. Randal E.Bryant.深入理解计算机系统[M].第三版.龚奕利,译.北京:机械工业出版社 ↩

². Stephen Prata.C Primer Plus[M].第六版.姜佑,译.北京:人民邮电出版社 ↩

³. Kenneth.A.Reek.C和指针[M].徐波,译.北京:人民邮电出版社 ↩

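Published 2020-10-04 · CyberspaceSecurity / Dev · 6 min read (about 833 characters)

Computing a File's Hash in C on GNU/Linux

Until today I had never imagined that computing the hash of a file (for example SHA-1, MD5, or SHA-256) with C/C++ on GNU/Linux would be this much trouble.

I expected to find a function along the lines of char * sha256sum(int fd) or char * sha256sum(FILE *stream) that returns a file's hash directly. That is not the case: getting the hash of a file is considerably more involved than I imagined.

Option 1: implement the `Hash` function yourself

This is the most laborious option, but it has the advantage of not depending on any third-party library or program. How to implement a hash function is not the focus of this post, so I will not go into it.

Option 2: call the `Openssl` library

My openssl version is 1.1.1i 8 Dec 2020. The header files provided in /usr/include/openssl are listed below.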

aes.h asn1err.h asn1.h asn1_mac.h asn1t.h asyncerr.h async.h bioerr.h bio.h blowfish.h bnerr.h bn.h buffererr.h buffer.h camellia.h cast.h cmac.h cmserr.h cms.h comperr.h comp.h conf_api.h conferr.h conf.h cryptoerr.h crypto.h cterr.h ct.h des.h dherr.h dh.h dsaerr.h dsa.h dtls1.h ebcdic.h ecdh.h ecdsa.h ecerr.h ec.h engineerr.h engine.h e_os2.h err.h evperr.h evp.h hmac.h idea.h kdferr.h kdf.h lhash.h md2.h md4.h md5.h mdc2.h modes.h objectserr.h objects.h obj_mac.h ocsperr.h ocsp.h opensslconf.h opensslv.h ossl_typ.h pem2.h pemerr.h pem.h pkcs12err.h pkcs12.h pkcs7err.h pkcs7.h rand_drbg.h randerr.h rand.h rc2.h rc4.h rc5.h ripemd.h rsaerr.h rsa.h safestack.h seed.h sha.h srp.h srtp.h ssl2.h ssl3.h sslerr.h ssl.h stack.h storeerr.h store.h symhacks.h tls1.h tserr.h ts.h txt_db.h uierr.h ui.h whrlpool.h x509err.h x509.h x509v3err.h x509v3.h x509_vfy.h

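#include <openssl/sha.h>
#include <stdio.h>

int main(void)
{
    unsigned char result[SHA256_DIGEST_LENGTH];
    const char *filename = "README.md";

    FILE *file = fopen(filename, "rb");
    SHA256_CTX hash;

    if (file == NULL)
    {
        perror("fopen");
        return 1;
    }
    // SHA256_Init/Update/Final are the OpenSSL 1.1.1 interface; OpenSSL 3.0 deprecates them in favor of the EVP API
    SHA256_Init(&hash);

    size_t size; // fread() returns size_t, not ssize_t
    unsigned char buf[4096];

    // Feed the file to the hash in chunks of up to 4096 bytes
    while ((size = fread(buf, 1, sizeof buf, file)) != 0)
    {
        SHA256_Update(&hash, buf, size);
    }
    SHA256_Final(result, &hash);
    // Print the 32-byte digest as 64 hexadecimal characters
    for (size_t i = 0; i < SHA256_DIGEST_LENGTH; i++)
    {
        printf("%02x", result[i]);
    }
    putchar('\n');
    fclose(file);
    return 0;
}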

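This option uses sha.h from openssl, which is a little more convenient than implementing the hash function yourself. A program that uses it must be linked against libcrypto with -lcrypto. The SHA functions live in libcrypto, so -lssl is not required for this example.

Option 3: call another program through inter-process communication

GNU/Linux systems usually ship programs such as sha256sum, sha512sum, and md5sum, which can be invoked directly in the form sha256sum <path>. The popen function can therefore be used for inter-process communication to obtain a file's hash directly.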

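#include <stdio.h>
int main(void)
{
    FILE *target;
    // popen() runs the command through the shell, so the shell expands ~ here
    target = popen("sha256sum ~/README.md", "r");
    if (target == NULL)
    {
        perror("popen");
        return 1; // do not read from a NULL stream
    }
    char hash[65]; // 64 hexadecimal characters plus the terminating '\0'
    if (fscanf(target, "%64s", hash) == 1)
        printf("%s\n", hash);
    pclose(target);
    return 0;
}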

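The advantage of this approach is obvious: it is convenient, and it is also the easiest to understand.

One more remark: the length of the hash that a hash function produces is a constant determined by the function (for example, a SHA-256 result printed as a string has 64 printable characters). This property means that the array length and the number of characters to read are known in advance.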
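
The fixed length follows from the encoding: a SHA-256 digest is 32 bytes, and each byte is written as two hexadecimal characters, which gives 64 characters plus a terminating NUL. The helper to_hex() below is a hypothetical name used to sketch that conversion; it is not part of OpenSSL or of sha256sum.

```c
#include <assert.h>
#include <stddef.h>

/* Write n digest bytes as 2 * n lowercase hexadecimal characters followed
 * by a NUL terminator. out must have room for 2 * n + 1 bytes. */
void to_hex(const unsigned char *digest, size_t n, char *out)
{
    static const char digits[] = "0123456789abcdef";

    assert(digest != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        out[2 * i] = digits[digest[i] >> 4];       /* high four bits */
        out[2 * i + 1] = digits[digest[i] & 0x0f]; /* low four bits  */
    }
    out[2 * n] = '\0';
}
```

For a 32-byte digest the output buffer needs 65 bytes, which is why the popen example declares char hash[65].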

Test environment

OS : Arch Linux

Kernel : 5.9.14-arch1-1

openssl : 1.1.1i 8 Dec 2020

gcc : 10.2.0

References

¹. W.RichardStevens.Stephen.UNIX环境高级编程[M].第3版.戚正伟,译.北京:人民邮电出版社 ↩

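Published 2020-10-01 · Dev · 9 min read (about 1361 characters)

Misconceptions about Command-Line Arguments

Preface

We all know that C allows main to take zero or two parameters. Some operating systems pass additional arguments to a program, and some implementations extend the standard to allow main to take more parameters.
Command-line arguments are passed to the program as the two parameters of main, usually named int argc, char **argv, where argc is the number of arguments and argv points to an array of argc + 1 pointers to char.

This description alone may not be enough to form a correct picture of command-line arguments.
Let us start by analyzing a program.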

/* ShowCommandLineArgument.c   */
#include <stdio.h>
int main(int argc, char **argv)
{
    printf("argc:%d\n", argc);
    for (int i = 0; i < argc; i++)
        printf("argv[%d]:%s\n", i, argv[i]);
    // argv[i] is *(argv + i), which is clearly a pointer to char
    // The loop ends before argv[argc], so that element is never printed with %s

    // Do not forget: the expression (argv[argc] == NULL) is true
    printf("argv[%d]:%p\n", argc, (void *)argv[argc]); // %p expects a void *
    return 0;
}

On my computer the file is stored at /home/admin/blog/ShowCommandLineArgument.c. I compile it with gcc ShowCommandLineArgument.c to obtain a.out, and run it with cd && ./blog/a.out -f ~/bolg/test1.md >./blog/test2.md /home/admin/blog/test3.md ./blog/test4.md.

What will the program print? Do you expect the following output?

argc:6
argv[0]:a.out
argv[1]:-f
argv[2]:~/bolg/test1.md
argv[3]:>./blog/test2.md
argv[4]:/home/admin/blog/test3.md
argv[5]:./blog/test4.md
argv[6]:(nil)

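What, you see no output? Look closely at the command: it contains >./blog/test2.md, which redirects the standard output of a.out to the file ./blog/test2.md. So I use cat ./blog/test2.md to view the output. On my machine the program prints: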

argc:5
argv[0]:./blog/a.out
argv[1]:-f
argv[2]:/home/admin/bolg/test1.md
argv[3]:/home/admin/blog/test3.md
argv[4]:./blog/test4.md
argv[5]:(nil)

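Is this different from what you expected? Let me explain point by point.

Common misconceptions

Misconception 1: "argv[0] holds the file name"

In fact, argv[0] holds the first string of the command as it was invoked, not the name of a file: in the run above it is ./blog/a.out, not a.out. It is also unrelated to __FILE__, which is the name of the source file, so strcmp(argv[0], __FILE__) is generally not 0.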
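
A program that wants a short name for its messages therefore has to derive it from argv[0] itself. The helper program_name() below is a hypothetical sketch of one common way to do that; it assumes '/' is the path separator.

```c
#include <assert.h>
#include <string.h>

/* Return the part of argv0 after the last '/', or argv0 itself
 * when it contains no '/'. */
const char *program_name(const char *argv0)
{
    const char *slash;

    assert(argv0 != NULL);
    slash = strrchr(argv0, '/');
    return slash ? slash + 1 : argv0;
}
```

Calling program_name(argv[0]) in main() yields a.out whether the program was started as ./blog/a.out or as /home/admin/blog/a.out. Note that the caller of an exec function decides what argv[0] contains, so the value is a convention rather than a guarantee.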

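Misconception 2: "command-line arguments are always passed through unchanged"

In the example above, ~/bolg/test1.md was given on the command line, but the program actually received the absolute path /home/admin/bolg/test1.md.
The relative paths ./blog/a.out and ./blog/test4.md, on the other hand, reached the program unchanged and were not converted to absolute paths.
Are other ways of writing a path passed through unchanged? I tested further with the a.out built from ShowCommandLineArgument.c, using the command ~/blog/a.out ./test/../blog/test1.md ../test2.md ~admin/blog/test3.md. The two tests show that only paths beginning with ~ are rewritten before they reach the program. This is tilde expansion, which the shell performs before it starts the program: ~ becomes the current user's home directory and ~admin becomes the home directory of the user admin.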

argc:4
argv[0]:/home/admin/blog/a.out
argv[1]:./test/../blog/test1.md
argv[2]:../test2.md
argv[3]:/home/admin/blog/test3.md
argv[4]:(nil)

Why does the shell do this?

Consider the following program.

#include <stdio.h>
int main(void)
{
    char path1[] = "./blog/test2.md";
    char path2[] = "~/blog/test2.md";
    char path3[] = "~admin/blog/test2.md";
    char path4[] = "~/blog/test2.md";
    char path5[] = "../blog/test2.md";

    if (fopen(path1, "r") == NULL)
        perror("path1");
    else
        printf("1Success\n");
    if (fopen(path2, "r") == NULL)
        perror("path2");
    else
        printf("2Success\n");
    if (fopen(path3, "r") == NULL)
        perror("path3");
    else
        printf("3Success\n");
    if (fopen(path4, "r") == NULL)
        perror("path4");
    else
        printf("4Success\n");
    if (fopen(path5, "r") == NULL)
        perror("path5");
    else
        printf("5Success\n");
    return 0;
}

I ran the executable built from this program with cd && ./blog/a.out and got the following output:

1Success
path2: No such file or directory
path3: No such file or directory
path4: No such file or directory
path5: No such file or directory

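Surprisingly, only the first open succeeded; the other four all failed. The fifth failure is to be expected, because that file really does not exist relative to the home directory. After copying the executable to ~/test/a.out and running it again from the ~/test directory, the fifth open succeeded: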

path1: No such file or directory
path2: No such file or directory
path3: No such file or directory
path4: No such file or directory
5Success

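This shows that fopen() does not expand a leading ~: tilde expansion is done by the shell, not by the C library or the kernel. It also shows why the shell must convert paths beginning with ~ into absolute paths before passing them to the program.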
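
A program that receives such a path from somewhere other than the shell (a configuration file, for example) has to expand the ~ itself. The sketch below handles only the simplest case; expand_tilde() is a hypothetical helper, the caller is assumed to supply the home directory (for instance from getenv("HOME")), and the ~user form is deliberately left unchanged.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Expand a leading "~/" (or a lone "~") using the given home directory.
 * Any other path, including "~user/...", is copied unchanged.
 * Returns 0 on success, -1 if the result does not fit into out. */
int expand_tilde(const char *path, const char *home, char *out, size_t n)
{
    int len;

    assert(path != NULL && home != NULL && out != NULL);
    if (path[0] == '~' && (path[1] == '/' || path[1] == '\0'))
        len = snprintf(out, n, "%s%s", home, path + 1);
    else
        len = snprintf(out, n, "%s", path);
    return (len >= 0 && (size_t)len < n) ? 0 : -1;
}
```

With the home directory /home/admin, the path ~/blog/test2.md becomes /home/admin/blog/test2.md, which fopen() can then open. A real shell does more than this, such as looking up other users' home directories for ~user.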

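Misconception 3: "redirections are command-line arguments"

A redirection appears in the same place as command-line arguments, but it is fundamentally different from one.

As the test above shows, redirection operators are not passed to the program as command-line arguments; the shell consumes them.

Be careful during development not to mistake one for the other, and make sure that characters such as > inside a command-line argument are quoted so that the shell does not treat them as a redirection and pass the wrong arguments.

Test environment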

OS: Arch Linux
Kernel: x86_64 Linux 5.8.14-arch1-1

References

¹. Stephen Prata.C Primer Plus[M].第六版.姜佑,译.北京:人民邮电出版社 ↩

². Kenneth.A.Reek.C和指针[M].徐波,译.北京:人民邮电出版社.2008 ↩
