C++ Primer Plus, 8. Adventures in Functions

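Inline Functions

Differences Between Inline Functions and Macros

Macros cannot pass arguments by value.
A macro is expanded by text substitution during preprocessing, while an inline function is processed by the compiler itself.

OIers (competitive programmers) will know this well. Suppose you want a faster comparison function and write a macro like this: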

#define MIN(x, y) (x < y ? x : y)

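It has two flaws. The first is operator precedence: if x or y is an expression, the expansion may group differently than intended. The second is repeated evaluation: if an argument is a function call, the conditional operator evaluates it twice, once in the comparison and once in the selected branch.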
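
Both flaws can be observed in a small experiment. The sketch below compares the MIN macro from the text with an inline function. The names min_inline, next_value, calls_via_macro, and calls_via_inline are hypothetical and exist only for this illustration.

```cpp
#include <cassert>

// The macro from the text: arguments are pasted in as raw text.
#define MIN(x, y) (x < y ? x : y)

// Hypothetical inline replacement: each argument is evaluated exactly once.
inline int min_inline(int x, int y) {
    return x < y ? x : y;
}

// Hypothetical helper that records how many times it has been called.
inline int next_value(int &calls) {
    ++calls;
    return 1;
}

// Number of next_value calls when the call is passed to the macro.
inline int calls_via_macro() {
    int calls = 0;
    // Expands to (next_value(calls) < 5 ? next_value(calls) : 5).
    int result = MIN(next_value(calls), 5);
    assert(result == 1);
    return calls;
}

// Number of next_value calls when the call is passed to the inline function.
inline int calls_via_inline() {
    int calls = 0;
    int result = min_inline(next_value(calls), 5);
    assert(result == 1);
    return calls;
}
```

Here calls_via_macro() returns 2 and calls_via_inline() returns 1. The precedence problem appears with MIN(2 | 1, 2), which expands to (2 | 1 < 2 ? 2 | 1 : 2) and yields 3, whereas min_inline(2 | 1, 2) yields 2.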

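How to Use Inline

Linkage of inline functions: external by default in standard C++ (internal only if the function is also declared static). The practical restriction is that the definition must appear in every translation unit that uses the function.

Multiple Files

In C++, each .cpp file is a translation unit. The compiler emits an out-of-line copy of an inline function only in a translation unit that actually uses it, and the linker cannot generate a definition on its own. So defining an inline function in one .cpp file and calling it from another does not work: the defining file never places the function in its symbol table. For example: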

// test.cpp
inline int func() {
    return 1;
}

// main.cpp
#include <iostream>
inline int func();

int main() {
    int x = func();
    std::cout << x;
    return 0;
}

Running g++ test.cpp main.cpp -o a.exe produces a warning followed by a link error:

main.cpp:2:12: warning: inline function 'int func()' used but never defined
    2 | inline int func();
      |            ^~~~
D:/ToolChain/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/14.2.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:\Users\31070\AppData\Local\Temp\cc1Eckmn.o:main.cpp:(.text+0xe): undefined reference to `func()'
collect2.exe: error: ld returned 1 exit status

Now look at test.cpp on its own.

Compiling it with g++ test.cpp -S produces the following assembly:

	.file	"test.cpp"
	.text
	.ident	"GCC: (x86_64-posix-seh-rev0, Built by MinGW-Builds project) 14.2.0"

If inline is removed from test.cpp, the assembly becomes:

	.file	"test.cpp"
	.text
	.globl	_Z4funcv
	.def	_Z4funcv;	.scl	2;	.type	32;	.endef
	.seh_proc	_Z4funcv
_Z4funcv:
.LFB0:
	pushq	%rbp
	.seh_pushreg	%rbp
	movq	%rsp, %rbp
	.seh_setframe	%rbp, 0
	.seh_endprologue
	movl	$1, %eax
	popq	%rbp
	ret
	.seh_endproc
	.ident	"GCC: (x86_64-posix-seh-rev0, Built by MinGW-Builds project) 14.2.0"

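As you can see, with inline the compiler emits nothing for func at all: there is no .globl _Z4funcv and no function body. That is the cause of the link error above.

However, suppose we add a function to test.cpp that calls func(), as follows: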

// test.cpp
inline int func() {
    return 1;
}

int fc() {
    func();
}

// main.cpp
#include <iostream>
inline int func();

int main() {
    int x = func();
    std::cout << x;
    return 0;
}

Now, without optimization, the compiler does emit func and adds it to the symbol table. Running g++ test.cpp -S gives:

	.file	"test.cpp"
	.text
	.section	.text$_Z4funcv,"x"
	.linkonce discard
	.globl	_Z4funcv
	.def	_Z4funcv;	.scl	2;	.type	32;	.endef
	.seh_proc	_Z4funcv
_Z4funcv:
.LFB0:
	pushq	%rbp
	.seh_pushreg	%rbp
	movq	%rsp, %rbp
	.seh_setframe	%rbp, 0
	.seh_endprologue
	movl	$1, %eax
	popq	%rbp
	ret
	.seh_endproc
	.text
	.globl	_Z2fcv
	.def	_Z2fcv;	.scl	2;	.type	32;	.endef
	.seh_proc	_Z2fcv
_Z2fcv:
.LFB1:
	pushq	%rbp
	.seh_pushreg	%rbp
	movq	%rsp, %rbp
	.seh_setframe	%rbp, 0
	subq	$32, %rsp
	.seh_stackalloc	32
	.seh_endprologue
	call	_Z4funcv
	ud2
	.seh_endproc
	.ident	"GCC: (x86_64-posix-seh-rev0, Built by MinGW-Builds project) 14.2.0"

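There are two .globl directives: the compiler added both func and fc to the symbol table. (The ud2 at the end of fc is most likely there because fc is declared to return int but has no return statement.)

The compiler only actually inlines the call when optimization is enabled. Running g++ test.cpp -S -O1: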

	.file	"test.cpp"
	.text
	.globl	_Z2fcv
	.def	_Z2fcv;	.scl	2;	.type	32;	.endef
	.seh_proc	_Z2fcv
_Z2fcv:
.LFB1:
	.seh_endprologue
	.seh_endproc
	.ident	"GCC: (x86_64-posix-seh-rev0, Built by MinGW-Builds project) 14.2.0"

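In summary: do not define an inline function in one source file and call it from another, and keep in mind that the compiler only actually inlines once optimization is enabled.

Also, in a project split into .h headers and .cpp source files, an inline function should be both declared and defined in the header.

Single File

Consider the following code: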

// main.cpp
#include <iostream>
inline int func();

int func() {
    return 1;
}

int main() {
    int x = func();
    std::cout << x;
    return 0;
}

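In practice (tested on mingw64 only), as long as either the declaration or the definition is marked inline and optimization is enabled, the compiler inlines the call.

Reference Variables

In the compiled output, a reference is no different from a pointer. A reference is an abstraction over a raw pointer that is part of the C++ type system.

References and pointers do differ in some ways. For example, a reference must be initialized when it is declared, whereas a pointer need not be.

A reference is closer to a const pointer. That is, int &a = b; behaves much like int * const a = &b;, with each use of the reference standing for *a.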
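
The comparison with a const pointer can be checked directly. The sketch below is only an illustration, and reference_acts_like_const_pointer is a hypothetical name. It writes through a reference and through a const pointer to the same int and confirms that each sees the other's change.

```cpp
#include <cassert>

// Returns true if a reference and a const pointer to the same int
// observe and modify the same object.
inline bool reference_acts_like_const_pointer() {
    int b = 1;
    int &ref = b;         // must be initialized, cannot be reseated
    int *const ptr = &b;  // must be initialized, cannot be reseated

    ref = 2;                          // write through the reference
    bool seen_by_ptr = (*ptr == 2);   // the pointer sees the new value
    *ptr = 3;                         // write through the pointer
    bool seen_by_ref = (ref == 3);    // the reference sees the new value

    // &ref is the address of b, which is exactly what ptr holds.
    assert(&ref == ptr);
    return seen_by_ptr && seen_by_ref && b == 3;
}
```

The visible difference is syntactic: the pointer needs an explicit * at every use, while the reference is dereferenced implicitly.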

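If the function should not modify the referenced value, declare the parameter as const int &a. A mutable member, however, can still be modified through a const reference: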

// main.cpp
struct S {
    mutable int a;
    int b;
};

void run(const S &s) {
    s.a = 1;
    // s.b = 2;
}

int main() {
    S s{10, 20};
    run(s);
    return 0;
}

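This compiles without error, which shows that the mutable keyword takes effect for both const S & and const S.

Temporary Variables

When a parameter is a const reference and the argument's type does not match the referenced type, C++ creates a temporary variable. For example: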

void run(const int &x) {}

int main() {
    long long a = 1;
    run(a);
    return 0;
}

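This design is reasonable: the function cannot modify the referenced value, so binding to a temporary is harmless. Note that with default options the compiler does not warn about such temporaries, even for a narrowing conversion such as double to int (g++ reports these only when -Wconversion is enabled).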
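
One way to observe the temporary is to compare addresses inside the called function. This is a minimal sketch, and every function name in it is hypothetical. When the types match, the reference is bound to the caller's variable. When they differ, it is bound to a separate temporary int.

```cpp
#include <cassert>

// True if the const reference parameter is bound to the object at addr.
inline bool bound_to(const int &x, const void *addr) {
    return static_cast<const void *>(&x) == addr;
}

// Matching type: the reference binds directly to the caller's variable.
inline bool int_binds_directly() {
    int a = 1;
    return bound_to(a, &a);
}

// Mismatched type: a is converted to a temporary int, and x binds to that.
inline bool long_long_binds_directly() {
    long long a = 1;
    return bound_to(a, &a);
}

// Returns the value seen through the const reference.
inline int seen_through_const_ref(const int &x) { return x; }

inline void temporary_demo() {
    assert(int_binds_directly());
    assert(!long_long_binds_directly());
    double d = 2.9;
    assert(seen_through_const_ref(d) == 2);  // the temporary holds the truncated value
}
```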

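There is one more situation in which the compiler creates a temporary. According to C++ Primer Plus, if the parameter is a const reference, the compiler creates a temporary when:

The argument has the correct type but is not an lvalue (strictly, the type need not be correct either, as long as it can be converted to the correct type)
The argument has the wrong type but can be converted to the correct type

The second case is the one described above. So what is the first case for?

An lvalue is a data object that can be referenced, such as a variable, an array element, a structure member, a reference, or a dereferenced pointer. A const variable is also an lvalue. Non-lvalues include literal constants (except string literals) and expressions with multiple terms.

An informal way to picture lvalues is through a program's memory segments. String literals are stored in a read-only data segment, so they have an address and can be referenced. Literals such as 1 or 1.34 are typically encoded directly into instructions in the code segment, so they cannot be referenced. Multi-term expressions must be computed first, and their results have no named storage to refer to.

In effect, the first case lets a const reference refer to values that otherwise cannot be referenced, so a const reference parameter can be used much like a const variable.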
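
The first case can be sketched as follows. The functions twice and bump are hypothetical names used only here. The const reference parameter accepts a literal and a multi-term expression, while the non-const reference parameter accepts only an lvalue.

```cpp
#include <cassert>

// A const reference parameter accepts lvalues and non-lvalues alike.
inline int twice(const int &x) { return x * 2; }

// A non-const reference parameter accepts only modifiable lvalues.
inline void bump(int &x) { ++x; }

inline void non_lvalue_demo() {
    int a = 3;
    assert(twice(a) == 6);      // lvalue: binds directly to a
    assert(twice(1) == 2);      // literal: binds to a temporary holding 1
    assert(twice(a + 1) == 8);  // multi-term expression: binds to a temporary holding 4
    bump(a);                    // fine, a is an lvalue
    assert(a == 4);
    // bump(1);                 // would not compile: no temporary for a non-const reference
    // bump(a + 1);             // would not compile for the same reason
}
```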

In addition, a const char * or char * can be passed to a const string & parameter, so the following code is valid:

// main.cpp
#include <string>
void run(const std::string &s) {}

int main() {
    run("123");
    return 0;
}
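
To confirm that the callee really receives a std::string built from the characters, here is a small sketch. The names length_of and string_temporary_demo are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Receives a std::string, possibly a temporary built from a C string.
inline std::size_t length_of(const std::string &s) { return s.size(); }

inline void string_temporary_demo() {
    char buffer[] = "ab";
    char *p = buffer;
    const char *cp = "123";
    assert(length_of(cp) == 3);       // const char * converts to a temporary std::string
    assert(length_of(p) == 2);        // char * converts as well
    assert(length_of("hello") == 5);  // a string literal decays to const char * first
}
```

Each of these calls constructs a temporary std::string, so on a hot path the conversion cost may be worth keeping in mind.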

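Returning References

With the traditional return mechanism, compiling the code below with g++ main.cpp -S --std=c++11 -fno-elide-constructors produces the assembly that follows it. (The flag disables return value optimization. C++17 makes this elision mandatory, so C++11 is used here.)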

// main.cpp
struct S {
    int x1;
    int x2;
    int x3;
};

S func() {
    return S{1};
}

int main() {
    S s = func();
    return 0;
}

	.file	"main.cpp"
	.text
	.section	.text$_ZN1SC1EOS_,"x"
	.linkonce discard
	.align 2
	.globl	_ZN1SC1EOS_
	.def	_ZN1SC1EOS_;	.scl	2;	.type	32;	.endef
	.seh_proc	_ZN1SC1EOS_
_ZN1SC1EOS_:
.LFB3:
	pushq	%rbp
	.seh_pushreg	%rbp
	movq	%rsp, %rbp
	.seh_setframe	%rbp, 0
	.seh_endprologue
	movq	%rcx, 16(%rbp)
	movq	%rdx, 24(%rbp)
	movq	16(%rbp), %rax
	movq	24(%rbp), %rdx
	movq	(%rdx), %rcx
	movq	%rcx, (%rax)
	movl	8(%rdx), %edx
	movl	%edx, 8(%rax)
	nop
	popq	%rbp
	ret
	.seh_endproc
	.text
	.globl	_Z4funcv
	.def	_Z4funcv;	.scl	2;	.type	32;	.endef
	.seh_proc	_Z4funcv
_Z4funcv:
.LFB0:
	pushq	%rbp
	.seh_pushreg	%rbp
	movq	%rsp, %rbp
	.seh_setframe	%rbp, 0
	subq	$48, %rsp
	.seh_stackalloc	48
	.seh_endprologue
	movq	%rcx, 16(%rbp)
	movq	$0, -12(%rbp)
	movl	$0, -4(%rbp)
	movl	$1, -12(%rbp)
	leaq	-12(%rbp), %rax
	movq	16(%rbp), %rcx
	movq	%rax, %rdx
	call	_ZN1SC1EOS_
	movq	16(%rbp), %rax
	addq	$48, %rsp
	popq	%rbp
	ret
	.seh_endproc
	.globl	main
	.def	main;	.scl	2;	.type	32;	.endef
	.seh_proc	main
main:
.LFB4:
	pushq	%rbp
	.seh_pushreg	%rbp
	movq	%rsp, %rbp
	.seh_setframe	%rbp, 0
	subq	$64, %rsp
	.seh_stackalloc	64
	.seh_endprologue
	call	__main
	leaq	-12(%rbp), %rax
	movq	%rax, %rcx
	call	_Z4funcv
	leaq	-12(%rbp), %rdx
	leaq	-24(%rbp), %rax
	movq	%rax, %rcx
	call	_ZN1SC1EOS_
	movl	$0, %eax
	addq	$64, %rsp
	popq	%rbp
	ret
	.seh_endproc
	.def	__main;	.scl	2;	.type	32;	.endef
	.ident	"GCC: (x86_64-posix-seh-rev0, Built by MinGW-Builds project) 14.2.0"

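Here _ZN1SC1EOS_ is S's move constructor (under C++98 it would be the copy constructor _ZN1SC1ERKS_, but that makes no difference here). It takes the destination address (in %rcx) and the source address (in %rdx).

In the assembly above,

the func function:

Calls the move constructor (call _ZN1SC1EOS_) to move its local temporary into the return slot whose address the caller passed in
Loads that return slot's address from the stack into a register (movq 16(%rbp), %rax) and returns it

the main function:

Calls func (call _Z4funcv), passing the address of a temporary return slot
Computes the addresses of that temporary and of the local variable s and places them in registers (two leaq and one movq)
Calls the move constructor (call _ZN1SC1EOS_)

So the constructor is called twice.

Next, we change the source to return a reference (the code is actually wrong and serves only as an illustration), then run g++ main.cpp -S --std=c++11 -fno-elide-constructors to get the assembly: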

// main.cpp
struct S {
    int x1;
    int x2;
    int x3;
};

S & func() {
    S s = S{1};
    return s;
}

int main() {
    S s = func();
    return 0;
}

	.file	"main.cpp"
	.text
	.globl	_Z4funcv
	.def	_Z4funcv;	.scl	2;	.type	32;	.endef
	.seh_proc	_Z4funcv
_Z4funcv:
.LFB0:
	pushq	%rbp
	.seh_pushreg	%rbp
	movq	%rsp, %rbp
	.seh_setframe	%rbp, 0
	subq	$16, %rsp
	.seh_stackalloc	16
	.seh_endprologue
	movq	$0, -12(%rbp)
	movl	$0, -4(%rbp)
	movl	$1, -12(%rbp)
	movl	$0, %eax
	addq	$16, %rsp
	popq	%rbp
	ret
	.seh_endproc
	.section	.text$_ZN1SC1ERKS_,"x"
	.linkonce discard
	.align 2
	.globl	_ZN1SC1ERKS_
	.def	_ZN1SC1ERKS_;	.scl	2;	.type	32;	.endef
	.seh_proc	_ZN1SC1ERKS_
_ZN1SC1ERKS_:
.LFB7:
	pushq	%rbp
	.seh_pushreg	%rbp
	movq	%rsp, %rbp
	.seh_setframe	%rbp, 0
	.seh_endprologue
	movq	%rcx, 16(%rbp)
	movq	%rdx, 24(%rbp)
	movq	16(%rbp), %rax
	movq	24(%rbp), %rdx
	movq	(%rdx), %rcx
	movq	%rcx, (%rax)
	movl	8(%rdx), %edx
	movl	%edx, 8(%rax)
	nop
	popq	%rbp
	ret
	.seh_endproc
	.text
	.globl	main
	.def	main;	.scl	2;	.type	32;	.endef
	.seh_proc	main
main:
.LFB4:
	pushq	%rbp
	.seh_pushreg	%rbp
	movq	%rsp, %rbp
	.seh_setframe	%rbp, 0
	subq	$48, %rsp
	.seh_stackalloc	48
	.seh_endprologue
	call	__main
	call	_Z4funcv
	movq	%rax, %rdx
	leaq	-12(%rbp), %rax
	movq	%rax, %rcx
	call	_ZN1SC1ERKS_
	movl	$0, %eax
	addq	$48, %rsp
	popq	%rbp
	ret
	.seh_endproc
	.def	__main;	.scl	2;	.type	32;	.endef
	.ident	"GCC: (x86_64-posix-seh-rev0, Built by MinGW-Builds project) 14.2.0"

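Here _ZN1SC1ERKS_ is S's copy constructor. It takes two arguments: the destination address and the source address.

In func, what is returned is an address (movl $0, %eax). The compiler recognizes that this would be the address of a local variable, which must not be returned, so it replaces it with 0.

In main:

Calls func (call _Z4funcv) and copies the returned address into a register (movq %rax, %rdx)
Computes the address of the destination local variable (leaq -12(%rbp), %rax) and places it in a register (movq %rax, %rcx)
Calls the copy constructor (call _ZN1SC1ERKS_)

As you can see, when a reference is returned the compiler performs a single copy construction, one fewer than with the traditional return mechanism.

Of course, with return value optimization enabled (the compiler's default), things are different again. For the first version of the code, running g++ main.cpp -S --std=c++11 gives: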

	.file	"main.cpp"
	.text
	.globl	_Z4funcv
	.def	_Z4funcv;	.scl	2;	.type	32;	.endef
	.seh_proc	_Z4funcv
_Z4funcv:
.LFB0:
	pushq	%rbp
	.seh_pushreg	%rbp
	movq	%rsp, %rbp
	.seh_setframe	%rbp, 0
	.seh_endprologue
	movq	%rcx, 16(%rbp)
	movq	16(%rbp), %rax
	movq	$0, (%rax)
	movl	$0, 8(%rax)
	movq	16(%rbp), %rax
	movl	$1, (%rax)
	movq	16(%rbp), %rax
	popq	%rbp
	ret
	.seh_endproc
	.globl	main
	.def	main;	.scl	2;	.type	32;	.endef
	.seh_proc	main
main:
.LFB1:
	pushq	%rbp
	.seh_pushreg	%rbp
	movq	%rsp, %rbp
	.seh_setframe	%rbp, 0
	subq	$48, %rsp
	.seh_stackalloc	48
	.seh_endprologue
	call	__main
	leaq	-12(%rbp), %rax
	movq	%rax, %rcx
	call	_Z4funcv
	movl	$0, %eax
	addq	$48, %rsp
	popq	%rbp
	ret
	.seh_endproc
	.def	__main;	.scl	2;	.type	32;	.endef
	.ident	"GCC: (x86_64-posix-seh-rev0, Built by MinGW-Builds project) 14.2.0"

As you can see, the compiler performed return value optimization (RVO) and omitted the copy or move construction.

There is also named return value optimization (NRVO), which is not covered here.
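
The effect can also be observed without reading assembly, by counting constructor calls. This is a sketch under stated assumptions: Tracked, make_tracked, and count_for_rvo are hypothetical names, and the count depends on compiler flags.

```cpp
#include <cassert>

// Hypothetical type that counts its copy and move constructions.
struct Tracked {
    static int copies_or_moves;
    int value;
    explicit Tracked(int v) : value(v) {}
    Tracked(const Tracked &other) : value(other.value) { ++copies_or_moves; }
    Tracked(Tracked &&other) noexcept : value(other.value) { ++copies_or_moves; }
};
int Tracked::copies_or_moves = 0;

// Returns a temporary, the same shape as func in the first listing.
Tracked make_tracked() {
    return Tracked{1};
}

// Number of copy/move constructions needed to initialize t from the call.
inline int count_for_rvo() {
    Tracked::copies_or_moves = 0;
    Tracked t = make_tracked();
    assert(t.value == 1);
    return Tracked::copies_or_moves;
}
```

With g++'s default options, or under C++17 where this elision is mandatory, count_for_rvo() should return 0. With --std=c++11 -fno-elide-constructors it should return 2, matching the two constructor calls in the first listing.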

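References and Inheritance

As noted earlier, a const reference can cause a temporary to be created. Objects are a special case, though, because classes have inheritance and polymorphism. We leave polymorphism aside for now.

Consider the following code: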

// main.cpp
#include <iostream>
using namespace std;

class Base {
public:
    int x;
    void run() const {
        cout << "Base " << x << endl;
    }
};

class A : public Base {
public:
    int x;
    void run() const {
        cout << "Base A " << x << endl;
    }
};

void run(const Base &a) {
    a.run();
    cout << a.x << endl;
}

int main() {
    A a;
    a.x = 3;
    run(a);
    return 0;
}

The output is:

Base 1520310784
1520310784

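This is an uninitialized value.

So both the method and the field that get used are the base class's: a.x = 3 assigns A::x, which hides Base::x, while run reads the Base part of the object, whose x was never initialized. Removing const gives the same result.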
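
The hiding can be made explicit with a smaller sketch. Base and Derived here are hypothetical stand-ins for the classes above, with the members initialized to 0 so that the result is deterministic.

```cpp
#include <cassert>

struct Base {
    int x = 0;
    int get() const { return x; }  // reads Base::x
};

struct Derived : Base {
    int x = 0;                     // hides Base::x
    int get() const { return x; }  // hides Base::get, reads Derived::x
};

// The parameter's static type is Base, so Base::get is called.
inline int read_through_base(const Base &b) { return b.get(); }

inline void hiding_demo() {
    Derived d;
    d.x = 3;                            // assigns Derived::x only
    assert(d.get() == 3);
    assert(read_through_base(d) == 0);  // Base::x is still 0
    d.Base::x = 7;                      // a qualified name reaches the hidden member
    assert(read_through_base(d) == 7);
}
```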

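Function Overloading

Function signatures do not distinguish const from non-const for parameters passed by value
When the argument is an lvalue, the compiler cannot choose between a reference parameter and a non-reference parameter
When the argument is not an lvalue, the compiler cannot choose between a const reference parameter and a non-reference parameter

For example, the following code does not compile: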

// main.cpp
void run1(const int a) {}
void run1(int a) {}

void run2(int &a) {}
void run2(int a) {}

void run3(const int &a) {}
void run3(int a) {}

int main() {
    int x = 1;
    run2(x);
    run3(1);
}

The errors are:

main.cpp:3:6: error: redefinition of 'void run1(int)'
    3 | void run1(int a) {}
      |      ^~~~
main.cpp:2:6: note: 'void run1(int)' previously defined here
    2 | void run1(const int a) {}
      |      ^~~~
main.cpp: In function 'int main()':
main.cpp:13:9: error: call of overloaded 'run2(int&)' is ambiguous
   13 |     run2(x);
      |     ~~~~^~~
main.cpp:5:6: note: candidate: 'void run2(int&)'
    5 | void run2(int &a) {}
      |      ^~~~
main.cpp:6:6: note: candidate: 'void run2(int)'
    6 | void run2(int a) {}
      |      ^~~~
main.cpp:14:9: error: call of overloaded 'run3(int)' is ambiguous
   14 |     run3(1);
      |     ~~~~^~~
main.cpp:8:6: note: candidate: 'void run3(const int&)'
    8 | void run3(const int &a) {}
      |      ^~~~
main.cpp:9:6: note: candidate: 'void run3(int)'
    9 | void run3(int a) {}
      |      ^~~~

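Note that the first error differs in kind from the second and third:

The first occurs when the function is defined
The second and third occur during overload resolution, which is discussed below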
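
By contrast, overloading on int & versus const int & is not ambiguous, because the two reference bindings rank differently. A minimal sketch, with which as a hypothetical name:

```cpp
#include <cassert>
#include <string>

inline std::string which(int &) { return "int&"; }
inline std::string which(const int &) { return "const int&"; }

inline void reference_overload_demo() {
    int x = 1;
    const int c = 2;
    assert(which(x) == "int&");            // modifiable lvalue prefers the non-const overload
    assert(which(c) == "const int&");      // const lvalue can only bind to const int &
    assert(which(3) == "const int&");      // literal binds through a temporary
    assert(which(x + 1) == "const int&");  // so does a multi-term expression
}
```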

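Function Templates

A function template does not itself create a function. It only tells the compiler how to create a family of functions, and the compiler generates them as needed. For example: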

// main.cpp
template <typename T>
void run(T x);

int main() {
    run(1);
    run(1.1f);
}

The compiler sees that the arguments are an int and a float, so this code is equivalent to:

// main.cpp
void run(int x);
void run(float x);

int main() {
    run(1);
    run(1.1f);
}

Note that a template cannot simply have its declaration and definition split across two .cpp files. For example:

// test.cpp
template <typename T>
void run(T x) {}

// main.cpp
template <typename T>
void run(T x);

int main() {
    run(1);
    run(1.1f);
}

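This code fails with a link error, because while compiling test.cpp the compiler does not know which function definitions to generate. They have to be requested explicitly (explicit instantiation):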

// test.cpp
template <typename T>
void run(T x) {}

template void run<int>(int);
template void run<float>(float);

// main.cpp
template <typename T>
void run(T x);

int main() {
    run(1);
    run(1.1f);
}

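Usually, though, both the declaration and the definition of a function template go into an .hpp header instead of being instantiated explicitly. The cost is that every .cpp file compiles its own copy of each instantiation it needs rather than linking against a single one, which increases compile time and object file size. The benefit is convenience.

Note: to the compiler, template void run<int>(int); and void run(int); are different functions, and their mangled names differ as well.

Overloading and Template Overloading

In practice, when both an ordinary function and a function template (explicitly instantiated or not) match a call equally well, the compiler prefers the ordinary function. For example: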

// main.cpp
#include <iostream>
using namespace std;

template <typename T>
void run(T x) {
    cout << x << " from template run" << endl;
}

void run(const int &x) {
    cout << x << " from run" << endl;
}

int main() {
    int x = 1;
    run(x);
    run(2);
}

The output is:

1 from run
2 from run

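Overloading function templates: when several function templates match the same call equally well, the compiler reports an error, just as with ordinary overloads. For example: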

// main.cpp
#include <iostream>
using namespace std;

template <typename T>
void run(T x) {
    cout << x << " from template run" << endl;
}

template <typename T>
void run(const T &x) {
    cout << x << " from template run" << endl;
}

int main() {
    int x = 1;
    run(x);
    run(2);
}

The errors are:

main.cpp: In function 'int main()':
main.cpp:17:8: error: call of overloaded 'run(int&)' is ambiguous
   17 |     run(x);
      |     ~~~^~~
main.cpp:6:6: note: candidate: 'void run(T) [with T = int]'
    6 | void run(T x) {
      |      ^~~
main.cpp:11:6: note: candidate: 'void run(const T&) [with T = int]'
   11 | void run(const T &x) {
      |      ^~~
main.cpp:18:8: error: call of overloaded 'run(int)' is ambiguous
   18 |     run(2);
      |     ~~~^~~
main.cpp:6:6: note: candidate: 'void run(T) [with T = int]'
    6 | void run(T x) {
      |      ^~~
main.cpp:11:6: note: candidate: 'void run(const T&) [with T = int]'
   11 | void run(const T &x) {
      |      ^~~
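
Not every pair of overloaded templates is ambiguous. When one template is more specialized than the other, the compiler picks the more specialized one. The sketch below uses the hypothetical name kind to show a pair that resolves cleanly.

```cpp
#include <cassert>
#include <string>

// Accepts any type by value.
template <typename T>
std::string kind(T) { return "T"; }

// Accepts only pointers, so it is the more specialized template.
template <typename T>
std::string kind(T *) { return "T*"; }

inline void template_overload_demo() {
    int x = 1;
    assert(kind(x) == "T");    // only the first template is viable
    assert(kind(&x) == "T*");  // both are viable, and the pointer version wins
}
```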

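Overload Resolution

The above leads to the strategy C++ uses to decide which function a call refers to, known as overload resolution. Following C++ Primer Plus:

Build a list of candidate functions: the functions and function templates that have the same name as the called function
From the candidates, build a list of viable functions: those that can actually be called. Checking the number of arguments is straightforward. For the argument types there is an implicit conversion sequence, which covers the case where the argument type matches the parameter type exactly as well as matches reached by conversion (for example, a double can be converted to float and so matches a float parameter)
Determine whether there is a best viable function. If there is, use it; otherwise report an error. From best to worst:
1. Exact match, with regular functions outranking templates
2. Promotion (for example, short to int, or float to double)
3. Standard conversion (for example, int to long long, or long long to int)

For exact matches, C++ Primer Plus gives a table (trivial conversions allowed for an exact match):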

From an actual argument	To a formal argument
`Type`	`Type &`
`Type &`	`Type`
`Type[]`	`Type *`
`Type (argument-list)`	`Type (*)(argument-list)`
`Type`	`const Type`
`Type`	`volatile Type`
`Type *`	`const Type *`
`Type *`	`volatile Type *`

If more than one prototype matches equally well, the compiler reports an ambiguous error. The remaining corner cases are not covered here.
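
The ranking can be sketched with two small overload sets. The names pick and chosen are hypothetical. The first set shows promotion beating standard conversion. The second shows a regular function winning a tie on exact match but losing to a template whose match is strictly better.

```cpp
#include <cassert>
#include <string>

inline std::string pick(int) { return "int"; }
inline std::string pick(double) { return "double"; }

inline std::string chosen(int) { return "regular"; }
template <typename T>
std::string chosen(T) { return "template"; }

inline void ranking_demo() {
    short s = 1;
    float f = 1.0f;
    assert(pick(s) == "int");     // short to int is a promotion, short to double a conversion
    assert(pick(f) == "double");  // float to double is a promotion, float to int a conversion
    // pick(1L);                  // would be ambiguous: long to int and long to double both convert

    assert(chosen(1) == "regular");     // both match exactly, so the regular function wins
    assert(chosen(1.5) == "template");  // exact match via T = double beats double to int
    assert(chosen('a') == "template");  // exact match via T = char beats char to int promotion
    assert(chosen<>(1) == "template");  // empty angle brackets force the template
}
```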

C++ also lets you choose the function you want by hand, as in this example from C++ Primer Plus:

// choices.cpp -- choosing a template
#include <iostream>

template <typename T>
T lesser(T a, T b) {  // #1: template version
    return a < b ? a : b;
}

int lesser(int a, int b) {  // #2: regular function, compares absolute values
    a = a < 0 ? -a : a;
    b = b < 0 ? -b : b;
    return a < b ? a : b;
}

int main() {
    using namespace std;
    int m = 20;
    int n = -30;
    double x = 15.5;
    double y = 25.9;
    
    cout << lesser(m, n) << endl;      // use #2
    cout << lesser(x, y) << endl;      // use #1 with double
    cout << lesser<>(m, n) << endl;    // use #1 with int
    cout << lesser<int>(x, y) << endl; // use #1 with int
    
    return 0;
}

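The `decltype` Keyword

decltype deduces a type. The syntax is: decltype(expression) var;

The type is determined as follows:

If expression is an identifier that is not enclosed in extra parentheses, var has the same type as that identifier
If expression is a function call, var has the function's return type (the function is not actually called)
If expression is an lvalue (for a plain identifier this requires an extra pair of parentheses), var is a reference to the expression's type
If none of the above applies, var has the same type as expression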
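
The four rules can be checked at compile time with static_assert and std::is_same. This is a minimal sketch. The names answer and decltype_rules are hypothetical.

```cpp
#include <cassert>
#include <type_traits>

inline int answer() { return 42; }

inline void decltype_rules() {
    int n = 0;
    const int &r = n;

    // Rule 1: an unparenthesized identifier keeps its declared type.
    static_assert(std::is_same<decltype(n), int>::value, "rule 1");
    static_assert(std::is_same<decltype(r), const int &>::value, "rule 1 keeps const and &");

    // Rule 2: a function call gives the return type without calling the function.
    static_assert(std::is_same<decltype(answer()), int>::value, "rule 2");

    // Rule 3: a parenthesized lvalue gives a reference to its type.
    static_assert(std::is_same<decltype((n)), int &>::value, "rule 3");

    // Rule 4: otherwise, the type of the expression itself.
    static_assert(std::is_same<decltype(n + 1), int>::value, "rule 4");

    assert(r == n);
}
```

Because every check is a static_assert, the block compiling at all is the confirmation that the rules hold.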

C++11 Trailing Return Types

A trailing return type lets the return type be computed at compile time from the parameters. For example:

template <typename T1, typename T2>
auto h(T1 x, T2 y) -> decltype(x + y) {
    return x + y;
}
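
A short usage sketch shows what the deduced return type turns out to be for a few argument combinations. It repeats h so that the block is self-contained. The name trailing_return_demo is hypothetical.

```cpp
#include <cassert>
#include <type_traits>

template <typename T1, typename T2>
auto h(T1 x, T2 y) -> decltype(x + y) {
    return x + y;
}

// The return type follows the usual arithmetic conversions for x + y.
static_assert(std::is_same<decltype(h(1, 2.5)), double>::value, "int + double yields double");
static_assert(std::is_same<decltype(h('a', 'b')), int>::value, "char + char is promoted to int");

inline void trailing_return_demo() {
    assert(h(1, 2.5) == 3.5);
    assert(h(2, 3) == 5);
}
```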