[C++] Brief Notes on Multithreading / Asynchronous Programming

Preface

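Before C++11, the C++ standard library had no built-in multithreading support, so threaded code had to rely on platform APIs or third-party libraries. On Linux the usual choice was pthread.h (POSIX Threads); on Windows you had to call the Windows API, or go through the pthreads-win32 library or Boost.Thread. Writing portable code was painful. C++11 introduced the standard threading library <thread>, which makes such code much easier to write.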

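One caveat when using MinGW GCC on Windows: if you follow an older online tutorial and install MinGW 8.1.0, choose the posix threading model rather than win32 during installation, otherwise <thread> is incomplete. Downloading the latest build from GitHub is recommended.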

`std::thread`

In C++11, threads are created with std::thread, declared in the header <thread>.

Let's start with a simple example:

#include <chrono>
#include <cstdio>
#include <functional>
#include <thread>

void Func1() {
    puts("Func1");
}

void Func3(int a, int& b) {
    printf("Func3 %d %d\n", a, b);
}

void Func4() {
    std::this_thread::sleep_for(std::chrono::seconds(1));
    puts("Func4");
}

void Func5() {
    std::this_thread::sleep_for(std::chrono::seconds(1));
}

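int main() {
    std::thread t0; // empty std::thread object, not associated with any thread
    std::thread t1(Func1);  // create a new thread (entry point Func1); it starts immediately
    std::thread t2([]{ puts("Func2"); }); // create a new thread (entry point is a lambda); it starts immediately

    int a = 1;
    int b = 2;
    std::thread t3(Func3, a, std::ref(b));  // create a new thread and pass arguments; references need std::ref or std::cref

    t1.join();  // block the main thread until t1 has finished
    t2.join();
    t3.join();

    std::thread t4(Func4);
    t4.join();      // wait for the thread to finish

    std::thread t5(Func5);
    t5.detach();    // detach the thread: it runs in the background and can no longer be joined or controlled

    return 0;
}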

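The first argument of the std::thread constructor is a callable that serves as the entry point of the new thread; the arguments for that callable follow in order. Arguments are copied into the new thread by default, so a reference must be wrapped in std::ref or std::cref (const reference), both declared in <functional>.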

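A thread starts running as soon as it is created. You can call join() to wait for it to finish or detach() to let it run independently. One of the two must be called before the std::thread object is destroyed; otherwise its destructor calls std::terminate and the program aborts.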
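Because forgetting join() is fatal, a small RAII wrapper is a common way to make the call automatic. The following is only a minimal sketch: JoinGuard and run_guarded are hypothetical names made up for this note, not part of the standard library.

```cpp
#include <cassert>
#include <thread>
#include <utility>

// Hypothetical RAII wrapper (not a standard class): joins the owned
// thread in its destructor so join() cannot be forgotten.
class JoinGuard {
public:
    explicit JoinGuard(std::thread t) : t_(std::move(t)) {
        assert(t_.joinable());  // we expect to be handed a running thread
    }
    ~JoinGuard() {
        if (t_.joinable()) t_.join();
    }
    JoinGuard(const JoinGuard&) = delete;
    JoinGuard& operator=(const JoinGuard&) = delete;

private:
    std::thread t_;
};

// Hypothetical helper: runs a worker under the guard and returns its result.
int run_guarded() {
    int result = 0;
    {
        JoinGuard guard(std::thread([&result] { result = 7; }));
        // leaving this scope joins the worker, even if an exception is thrown
    }
    return result;  // safe to read: the worker has been joined
}
```

Calling run_guarded() from main returns 7. The point of the design is that the join happens on every path out of the scope, including early returns and exceptions.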

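std::thread can also be combined with std::function and std::bind, which is not covered here.

There are also some other member functions, plus helper functions in std::this_thread, for example: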

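.get_id() returns the ID of this thread
.joinable() reports whether join() can be called on this thread
std::this_thread::get_id() returns the ID of the current thread
std::this_thread::yield() hints to the scheduler that the current thread is willing to give up its time slice
std::this_thread::sleep_for sleeps for a given duration, used together with std::chrono
std::this_thread::sleep_until sleeps until a given time point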
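A small sketch of how these functions fit together. The function name worker_ran_on_other_thread is made up for illustration.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical helper: returns true if the worker really ran on a
// thread other than the caller's.
bool worker_ran_on_other_thread() {
    std::thread::id caller_id = std::this_thread::get_id();
    std::thread::id worker_id;  // default-constructed id: "no thread"

    std::thread t([&worker_id] {
        std::this_thread::sleep_for(std::chrono::milliseconds(10));
        worker_id = std::this_thread::get_id();
    });

    assert(t.joinable());               // a started thread is joinable
    std::thread::id seen = t.get_id();  // id as seen from the outside
    t.join();                           // after join(), reading worker_id is safe
    assert(!t.joinable());              // the object no longer owns a thread

    return worker_id == seen && worker_id != caller_id;
}
```

Calling worker_ran_on_other_thread() is expected to return true: the ID reported inside the worker matches the one returned by .get_id(), and both differ from the caller's ID.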

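`std::mutex` (mutual exclusion lock) and other locks

Why std::mutex is needed is not covered in detail here; an operating systems course or any material on multithreaded programming explains it. In short, when several threads access the same resource at the same time and at least one of them writes, the result is a data race, which is undefined behavior in C++. We therefore need a way to keep other threads away from a variable while one thread is using it: a lock.

First comes the mutex std::mutex, declared in the header <mutex>.

Usage is very simple, as the code below shows: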

#include <cstdio>
#include <mutex>
#include <thread>

int x = 0;
std::mutex m;

void x_plus_1() {
    m.lock();
    x += 1;
    m.unlock();
}

int main()
{
    std::thread t1(x_plus_1);
    std::thread t2(x_plus_1);
    t1.join();
    t2.join();
    printf("%d\n", x);
    return 0;
}

Member functions:

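.lock() acquires the lock, blocking the thread until it succeeds
.unlock() releases the lock; calling it from a thread that does not hold the lock is undefined behavior
.try_lock() tries to acquire the lock without blocking; returns true on success and false otherwise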
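To see the non-blocking behavior of .try_lock(), here is a minimal sketch in which one thread holds the mutex while another one tries to take it. The names demo_mutex and other_thread_can_lock_while_held are assumptions made for this example.

```cpp
#include <mutex>
#include <thread>

std::mutex demo_mutex;  // hypothetical mutex used only by this example

// Returns whether a second thread managed to lock demo_mutex
// while the calling thread was holding it.
bool other_thread_can_lock_while_held() {
    demo_mutex.lock();  // the calling thread now owns the mutex

    bool got_it = true;
    std::thread t([&got_it] {
        got_it = demo_mutex.try_lock();  // returns immediately instead of blocking
        if (got_it) demo_mutex.unlock();
    });
    t.join();

    demo_mutex.unlock();  // must be called by the thread that locked it
    return got_it;
}
```

The function returns false, because the mutex is held for the whole lifetime of the second thread. With .lock() instead of .try_lock() the second thread would block forever, since the first thread only unlocks after join().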

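In practice std::mutex is rarely locked and unlocked by hand, because misuse easily leads to deadlock (for example, forgetting .unlock() in the code above, or leaving the function through an exception before reaching it). Following C++'s RAII idiom, the standard library provides std::lock_guard, std::unique_lock, std::shared_lock (C++14) and std::scoped_lock (C++17) to manage locks. Taking std::lock_guard as an example: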

#include <cstdio>
#include <mutex>
#include <thread>

int x = 0;
std::mutex m;

void x_plus_1() {
    std::lock_guard<std::mutex> guard(m);
    x += 1;
    // RAII: the destructor of guard unlocks m when the function returns
}

int main()
{
    std::thread t1(x_plus_1);
    std::thread t2(x_plus_1);
    t1.join();
    t2.join();
    printf("%d\n", x);
    return 0;
}

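Compared with std::lock_guard, std::unique_lock is more flexible at the cost of slightly more overhead, so std::lock_guard is the usual choice when that flexibility is not needed. The std::unique_lock constructor accepts an optional second argument: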

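std::adopt_lock means the mutex was already locked by the current thread before the std::unique_lock was constructed, so std::unique_lock does not lock it again and is only responsible for unlocking. std::lock_guard accepts this argument too.
std::try_to_lock tries to lock without blocking the thread, and may fail. Call the member function .owns_lock() afterwards to check whether the lock was acquired.
std::defer_lock means the mutex is not locked on construction; you call the locking member functions yourself. The benefit is that you can unlock manually for a while in the middle of a scope. The current thread must not already hold the mutex when it later locks through the std::unique_lock; otherwise the behavior is undefined (typically a deadlock).

Some member functions of std::unique_lock: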
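- .lock() locks the mutex
- .unlock() unlocks the mutex (optional; the destructor of std::unique_lock unlocks automatically if the lock is still held)
- .try_lock() tries to lock and returns whether it succeeded
- .release() detaches the std::unique_lock from the std::mutex without unlocking it and returns a std::mutex*; the caller becomes responsible for unlocking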

std::unique_lock cannot be copied, but ownership of the lock can be transferred with std::move.
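The tag arguments and the move semantics are easier to follow in code. The sketch below is only an illustration: the Account type and the functions transfer, try_deposit and lock_account are hypothetical, and transfer assumes that the two accounts are different objects.

```cpp
#include <mutex>

struct Account {  // hypothetical type for this example
    std::mutex m;
    int balance = 0;
};

// std::defer_lock: construct both locks unlocked, then let std::lock
// acquire them together with its deadlock-avoidance algorithm.
// Assumes &from != &to.
void transfer(Account& from, Account& to, int amount) {
    std::unique_lock<std::mutex> lk_from(from.m, std::defer_lock);
    std::unique_lock<std::mutex> lk_to(to.m, std::defer_lock);
    std::lock(lk_from, lk_to);
    from.balance -= amount;
    to.balance += amount;
    lk_from.unlock();  // manual early unlock is allowed
    // lk_to is unlocked by its destructor
}

// std::try_to_lock: do not block; check owns_lock() to see the outcome.
bool try_deposit(Account& acc, int amount) {
    std::unique_lock<std::mutex> lk(acc.m, std::try_to_lock);
    if (!lk.owns_lock()) return false;  // somebody else holds the mutex
    acc.balance += amount;
    return true;
}

// A std::unique_lock can be returned (moved) out of a function;
// the caller then holds the lock.
std::unique_lock<std::mutex> lock_account(Account& acc) {
    std::unique_lock<std::mutex> lk(acc.m);
    return lk;
}
```

A typical use would be transfer(a, b, 30) followed by auto held = lock_account(b); to keep b locked for a longer critical section in the caller.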

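C++ also offers other mutex types such as std::timed_mutex (lock attempts with a timeout), std::recursive_mutex (the same thread may lock it repeatedly), std::recursive_timed_mutex and std::shared_timed_mutex (C++14). Look them up if you are interested.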
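As one illustration, std::recursive_mutex lets a function that already holds the lock call itself (or another function taking the same lock). This is a minimal sketch; rm, total and add_range are names invented for the example.

```cpp
#include <mutex>

std::recursive_mutex rm;  // hypothetical mutex for this example
int total = 0;            // protected by rm

// Adds n + (n-1) + ... + 1 to total, re-locking rm at every level.
void add_range(int n) {
    std::lock_guard<std::recursive_mutex> lk(rm);  // same thread may lock again
    if (n == 0) return;
    total += n;
    add_range(n - 1);  // the nested call locks rm a second time
}
```

After add_range(4), total is 10. With a plain std::mutex the nested lock would be undefined behavior and would typically deadlock.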

`std::atomic`

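To reduce the burden of using locks in multithreaded code, the C++ standard library provides atomic operations (operations that other threads can never observe half-done).

The simplest example is std::atomic_int (i.e. std::atomic<int>), declared in the header <atomic>: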

#include <atomic>
#include <cstdio>
#include <thread>

std::atomic_int x{0};  // brace initialization: "= 0" only compiles from C++17 on, since std::atomic is not copyable

void x_plus_1() {
    x += 1;
}

int main()
{
    std::thread t1(x_plus_1);
    std::thread t2(x_plus_1);
    t1.join();
    t2.join();
    printf("%d\n", (int)x);
    return 0;
}

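For the other types and member functions, see the cplusplus reference. Most of the time a std::atomic can be used just like an ordinary variable. Note that before C++20 the arithmetic operations (+=, fetch_add and so on) are defined only for integral and pointer types; std::atomic<float> and std::atomic<double> can be loaded, stored and exchanged, but gain fetch_add and += only in C++20.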
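Beyond += there are explicit member functions such as fetch_add and compare_exchange_weak. The sketch below counts calls and tracks a running maximum without any mutex; hits, max_seen, record and run_atomic_demo are names made up for this example.

```cpp
#include <atomic>
#include <thread>
#include <vector>

std::atomic<int> hits{0};      // how many times record() was called
std::atomic<int> max_seen{0};  // largest value passed to record()

void record(int value) {
    hits.fetch_add(1);  // atomic increment, same effect as ++hits

    // Classic compare-and-swap loop: retry until either our value is
    // stored or somebody else has already stored a larger one.
    int cur = max_seen.load();
    while (value > cur && !max_seen.compare_exchange_weak(cur, value)) {
        // on failure, cur has been updated to the current value of max_seen
    }
}

// Starts 8 threads that record 10, 20, ..., 80 and returns the hit count.
int run_atomic_demo() {
    std::vector<std::thread> threads;
    for (int i = 1; i <= 8; ++i) threads.emplace_back(record, i * 10);
    for (auto& t : threads) t.join();
    return hits.load();
}
```

After one call of run_atomic_demo(), hits is 8 and max_seen is 80 regardless of the order in which the threads run.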

`std::condition_variable`

std::condition_variable from C++11 (header <condition_variable>) is used for synchronization and communication between threads. It must be used together with std::unique_lock<std::mutex>.

A simple example:

#include <chrono>
#include <condition_variable>
#include <iostream>
#include <mutex>
#include <thread>

std::mutex m;
std::condition_variable cv;
bool data_ready = false;

void worker_thread() {
    // wait on the condition variable until data_ready becomes true
    std::unique_lock<std::mutex> lk(m);
    cv.wait(lk, []{return data_ready;});

    // woken up with the condition satisfied: process the data
    std::cout << "Worker thread is processing data" << std::endl;
}

int main() {
    std::thread worker(worker_thread);

    // the main thread prepares the data
    std::cout << "Main thread is preparing data" << std::endl;
    std::this_thread::sleep_for(std::chrono::seconds(1));

    // RAII: the lock is released at the end of this block
    {
        std::lock_guard<std::mutex> lk(m);
        data_ready = true;
    }

    // notify the worker thread
    cv.notify_one();
    worker.join();
    return 0;
}

Now let's look at the related member functions:

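.wait()
Two forms:
- .wait(std::unique_lock<std::mutex>& lock)
  Releases the mutex and blocks on the condition variable. When another thread calls .notify_one() or .notify_all(), it reacquires the mutex and continues. It may also wake up spuriously, so the caller has to recheck the condition.
- .wait(std::unique_lock<std::mutex>& lock, Predicate pred)
  Checks the second argument (usually a lambda) before waiting and again after every wakeup, and keeps waiting while it returns false. Equivalent to while (!pred()) wait(lock);
.wait_for()
.wait() with a timeout given as a duration
.wait_until()
.wait() with a timeout given as a time point
.notify_one()
Wakes one of the waiting threads (which one is unspecified)
.notify_all()
Wakes all waiting threads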
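A slightly larger sketch that uses .wait_for() with a predicate, .notify_one() and .notify_all() in a producer/consumer queue. All names here (queue_mutex, queue_cv, items, done, producer, consume_all, run_queue_demo) are assumptions for illustration only.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>

std::mutex queue_mutex;
std::condition_variable queue_cv;
std::queue<int> items;  // protected by queue_mutex
bool done = false;      // protected by queue_mutex

// Pushes 1..n, then signals that no more items will arrive.
void producer(int n) {
    for (int i = 1; i <= n; ++i) {
        {
            std::lock_guard<std::mutex> lk(queue_mutex);
            items.push(i);
        }
        queue_cv.notify_one();  // notify after releasing the lock
    }
    {
        std::lock_guard<std::mutex> lk(queue_mutex);
        done = true;
    }
    queue_cv.notify_all();
}

// Sums up everything the producer pushes.
int consume_all() {
    int sum = 0;
    std::unique_lock<std::mutex> lk(queue_mutex);
    for (;;) {
        // The predicate overload of wait_for returns the predicate's value:
        // false means the timeout expired and there is still nothing to do.
        bool ready = queue_cv.wait_for(lk, std::chrono::milliseconds(100),
                                       [] { return !items.empty() || done; });
        if (!ready) continue;  // timed out; a real program could do other work here
        while (!items.empty()) {
            sum += items.front();
            items.pop();
        }
        if (done) return sum;
    }
}

int run_queue_demo(int n) {
    std::thread p(producer, n);
    int sum = consume_all();
    p.join();
    return sum;
}
```

Calling run_queue_demo(10) once returns 55. The predicate protects against both spurious wakeups and notifications that were sent before the consumer started waiting.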

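`std::async` / `std::future` / `std::promise` asynchronous programming

Header: <future>

std::async (the C++11 function, not to be confused with the async/await style of C++20 coroutines in <coroutine>) is a higher-level wrapper around std::thread.

Unlike std::thread, std::async is a function, and it returns a std::future object. An example is more intuitive than an explanation: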

#include <chrono>
#include <future>
#include <iostream>
#include <thread>

int main() {
    // run a function asynchronously and obtain a future for its return value
    std::future<int> f = std::async([]() {
        // some time-consuming work goes here
        std::this_thread::sleep_for(std::chrono::seconds(2));
        return 42;
    });

    // wait for the asynchronous operation to finish and fetch the return value
    std::cout << "The answer is " << f.get() << std::endl;

    return 0;
}

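Whether std::async runs the task asynchronously or synchronously by default is left to the implementation: with the default policy, some implementations may defer the task until a function such as std::future::get() is called.

The policy can also be chosen explicitly through an extra first argument: std::async(std::launch::async, ...)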

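The first argument can take the following values (the numbers are those used by common implementations; the standard leaves them unspecified):

std::launch::async (0x1) runs the task asynchronously, as if on a new thread
std::launch::deferred (0x2) runs the task lazily on the calling thread when the result is first requested (e.g. by std::future::get())
std::launch::async | std::launch::deferred (0x3) lets the implementation decide; this is the default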
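The difference between the two policies can be observed by comparing thread IDs. A minimal sketch, with runs_on_calling_thread being a made-up helper name:

```cpp
#include <future>
#include <thread>

// Hypothetical helper: returns true if the task was executed on the
// same thread that called get().
bool runs_on_calling_thread(std::launch policy) {
    std::thread::id caller = std::this_thread::get_id();
    std::future<std::thread::id> f =
        std::async(policy, [] { return std::this_thread::get_id(); });
    return f.get() == caller;  // a deferred task only starts running here
}
```

runs_on_calling_thread(std::launch::deferred) returns true, while runs_on_calling_thread(std::launch::async) returns false, because the task runs as if on a new thread.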

As with std::thread, the arguments of the function are appended in order, references need std::ref or std::cref, and std::bind can be used as well:

#include <chrono>
#include <future>
#include <iostream>
#include <thread>

template <typename T>
T add(T a, T b) {
    // some time-consuming work goes here
    std::this_thread::sleep_for(std::chrono::seconds(2));
    return a + b;
}

int main()
{
    // run add asynchronously and obtain a future for its return value
    std::future<int> f = std::async(std::launch::async, add<int>, 3, 4);

    // wait for the asynchronous operation to finish and fetch the return value
    std::cout << "The answer is " << f.get() << std::endl;

    return 0;
}

Back to the std::future<> class, which has a few simple member functions:

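.get() blocks the current thread and returns the result (it can be called only once; use std::shared_future if the result must be read several times)
.wait() blocks the current thread until the result is ready
.wait_for()
Waits for at most the given duration and returns std::future_status::ready or std::future_status::timeout depending on whether the result became ready
If the std::async launch policy is std::launch::deferred, it returns std::future_status::deferred immediately
.wait_until() works the same way with a time point
.share() moves the shared state into a std::shared_future, after which the original std::future is no longer valid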
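A short sketch of these member functions in action. The function names poll_until_ready, deferred_reports_deferred and read_twice are invented for this example.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <thread>

// Polls a future with wait_for instead of blocking in get() right away.
int poll_until_ready() {
    std::future<int> f = std::async(std::launch::async, [] {
        std::this_thread::sleep_for(std::chrono::milliseconds(50));
        return 42;
    });
    while (f.wait_for(std::chrono::milliseconds(10)) != std::future_status::ready) {
        // not ready yet: a real program could do other work here
    }
    return f.get();  // does not block any more
}

// A deferred task never becomes "ready" by waiting; wait_for says so at once.
bool deferred_reports_deferred() {
    std::future<int> f = std::async(std::launch::deferred, [] { return 1; });
    return f.wait_for(std::chrono::seconds(0)) == std::future_status::deferred;
}

// shared_future allows get() to be called more than once.
int read_twice() {
    std::future<int> f = std::async(std::launch::async, [] { return 21; });
    std::shared_future<int> sf = f.share();
    assert(!f.valid());  // the original future gave up its shared state
    return sf.get() + sf.get();
}
```

poll_until_ready() and read_twice() both return 42, and deferred_reports_deferred() returns true.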

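With std::async and std::future covered, what is std::promise for?

In some situations we need to get a return value out of a std::thread. How? The obvious approaches are passing a reference or using a global variable, ~~but that does not look very fancy~~, and this is where std::promise comes in.

std::promise can be thought of as the writing end of a std::future: the associated std::future is obtained through the .get_future() member function, and the promise holds a value that is set through its interface. See the example: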

#include <chrono>
#include <future>
#include <iostream>
#include <thread>
#include <utility>

void workerThread(std::promise<int>&& promise) {
    std::this_thread::sleep_for(std::chrono::seconds(1));
    promise.set_value(42);
}

int main() {
    std::promise<int> promise;
    std::future<int> future = promise.get_future();
    std::thread worker(workerThread, std::move(promise));

    int result = future.get();
    std::cout << "Result: " << result << std::endl;
    worker.join();
    return 0;
}

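This is only one way to use it; the promise can also be passed to the thread by reference (wrapped in std::ref).

The member functions in detail: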

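.set_value() stores the value in the std::promise and makes the associated future ready (std::future_status::ready)
.get_future() returns the associated std::future object (it can be called only once)
.set_exception() stores an exception, which std::future::get() then rethrows
.set_value_at_thread_exit() stores the value immediately but makes the future ready only when the current thread exits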
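As a sketch of .set_exception(), the worker below reports either a value or an error through the same promise, and the caller sees the error as an exception thrown by get(). The names double_positive and run_worker are hypothetical.

```cpp
#include <exception>
#include <future>
#include <stdexcept>
#include <string>
#include <thread>
#include <utility>

// Hypothetical worker: doubles a positive input, reports an error otherwise.
void double_positive(int input, std::promise<int> p) {
    try {
        if (input <= 0) throw std::invalid_argument("input must be positive");
        p.set_value(input * 2);
    } catch (...) {
        // forward whatever was thrown to the thread waiting on the future
        p.set_exception(std::current_exception());
    }
}

// Runs the worker on its own thread and turns the outcome into a string.
std::string run_worker(int input) {
    std::promise<int> p;
    std::future<int> f = p.get_future();
    std::thread t(double_positive, input, std::move(p));

    std::string out;
    try {
        out = std::to_string(f.get());  // rethrows the stored exception, if any
    } catch (const std::invalid_argument& e) {
        out = std::string("error: ") + e.what();
    }
    t.join();
    return out;
}
```

run_worker(4) returns "8", and run_worker(-1) returns "error: input must be positive".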