C/C++ Development: An Introduction to TSan, a Data Race Detection Tool

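Introduction to TSan (ThreadSanitizer)

TSan (ThreadSanitizer) is a tool for detecting data races and other thread-safety problems in multithreaded programs. It is a dynamic analysis tool, meaning that it analyzes program behavior and catches errors while the program is running, and it helps developers find and fix latent concurrency bugs. Because it only observes code that actually executes, it cannot report problems on paths that a given run never reaches.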

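Key features of TSan

1. Data race detection

TSan detects data races. A data race occurs when multiple threads access the same memory location concurrently, at least one of the accesses is a write, and the accesses are not ordered by proper synchronization. For each race, TSan reports the threads involved, the memory location, the stack traces of the conflicting accesses, and the source locations.

2. Thread-safety problem detection

TSan also helps uncover lock-related problems, such as incorrect use of locks and potential deadlocks caused by inconsistent lock ordering. It checks whether synchronization between threads is used correctly, for example whether lock acquisitions and releases are properly paired.

3. Instrumentation and runtime checking

The compiler inserts monitoring code at compile time so that TSan can track memory accesses across threads while the program runs. These dynamic checks verify that accesses from different threads are ordered by the expected synchronization. The instrumentation has a noticeable cost in run time and memory, so TSan builds are normally used for testing rather than for production.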

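Basic usage of TSan (using TSan with gcc/g++ as an example)

1. Use gcc/g++ 4.8 or later, and make sure the TSan runtime library (libtsan) is installed. It usually ships with the GCC toolchain, although some distributions package it separately.
2. Compile the program with -fsanitize=thread enabled: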

g++ -fsanitize=thread -g -o example example.cpp

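-fsanitize=thread: enables TSan

-g: includes debug information so that TSan can show file names and line numbers in its reports

-o: specifies the name of the output executable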

3. Run the compiled executable:

./example

At run time, TSan prints a report whenever it detects an error.

4. Reading an error report

Example: example.cpp

#include <iostream>
#include <thread>

int counter = 0;  // shared variable

void increment() {
    for (int i = 0; i < 10000; ++i) {
        ++counter;  // no lock held: data race
    }
}

int main() {
    std::thread t1(increment);
    std::thread t2(increment);

    t1.join();
    t2.join();

    std::cout << "Counter: " << counter << std::endl;
    return 0;
}

Compile with TSan enabled and run:

g++ -fsanitize=thread -g example.cpp -o example -pthread
./example

TSan output:

==================
WARNING: ThreadSanitizer: data race (pid=12345)
  Write of size 4 at 0x0000004000a0 by thread T1:
    #0 increment() example.cpp:9
    #1 std::thread::_State_impl<...>::_M_run() ...

  Previous write of size 4 at 0x0000004000a0 by thread T2:
    #0 increment() example.cpp:9
    #1 std::thread::_State_impl<...>::_M_run() ...

  Location is global 'counter' of size 4 at 0x0000004000a0

  Thread T1 (tid=12345, running) created by main thread at:
    #0 std::thread::thread() ...
    #1 main example.cpp:14

  Thread T2 (tid=12346, finished) created by main thread at:
    #0 std::thread::thread() ...
    #1 main example.cpp:15
==================

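How to read the report:

Report section	Explanation
`WARNING: ThreadSanitizer: data race`	A data race was detected
`Write of size 4 at ... by thread T1`	Thread T1 wrote 4 bytes to the given address
`#0 increment() example.cpp:9`	The access happened at line 9 of example.cpp, inside `increment()`
`Previous write of size 4 at ... by thread T2`	Thread T2 also wrote to the same address; this is the other side of the race
`Location is global 'counter'`	Both threads accessed the global variable `counter`
`Thread T1 ... created by main thread`	Shows how T1 was created: the creation site and the parent thread
`Thread T2 ... created by main thread`	Likewise, shows where T2 was created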
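
The race on counter reported above could be removed in several ways. The following is a minimal sketch of one of them, assuming a mutex is acceptable for this workload; the names counter_mutex and run_counter_demo are illustrative and do not appear in the original example.cpp.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

int counter = 0;           // shared variable
std::mutex counter_mutex;  // hypothetical name: protects counter

void increment() {
    for (int i = 0; i < 10000; ++i) {
        // The read-modify-write of counter now happens under the lock.
        std::lock_guard<std::mutex> lock(counter_mutex);
        ++counter;
    }
}

// Hypothetical helper: runs two incrementing threads and returns the total.
int run_counter_demo() {
    counter = 0;  // no other thread is running yet
    std::thread t1(increment);
    std::thread t2(increment);
    t1.join();
    t2.join();
    // join() orders both threads' writes before this read.
    assert(counter == 20000);
    return counter;
}
```

Calling run_counter_demo() from main and building with -fsanitize=thread is one way to check that the report goes away. A std::atomic<int> counter would be an alternative when only the increment itself needs protection.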

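Verifying common thread-safety bugs with TSan

1. Read-write race: simple race

Example:

#include <iostream>
#include <thread>

int shared_var = 0;

void writer() {
    shared_var = 42;  // write
}

void reader() {
    std::cout << "Read: " << shared_var << std::endl;  // read
}

int main() {
    std::thread t1(writer);
    std::thread t2(reader);

    t1.join();
    t2.join();
    return 0;
}

Problem description:

In this program, shared_var is a shared global variable. The writer thread writes to it and the reader thread reads it. The two threads run concurrently and access the variable without any synchronization (such as a lock or an atomic variable), so this is a textbook read-write data race: one thread may be writing while the other is reading. The reader may observe an inconsistent or intermediate value, and the program has undefined behavior.

TSan result: data race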

==================
WARNING: ThreadSanitizer: data race (pid=12345)
  Write of size 4 at 0x0000004000a0 by thread T1:
    #0 writer() simple_race.cpp:8
    #1 std::thread::_State_impl<...>::_M_run() ...

  Previous read of size 4 at 0x0000004000a0 by thread T2:
    #0 reader() simple_race.cpp:12
    #1 std::thread::_State_impl<...>::_M_run() ...

  Location is global 'shared_var' of size 4 at 0x0000004000a0

  Thread T1 (tid=12345, running) created by main thread at:
    #0 std::thread::thread() ...
    #1 main simple_race.cpp:16

  Thread T2 (tid=12346, finished) created by main thread at:
    #0 std::thread::thread() ...
    #1 main simple_race.cpp:17
==================
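
A minimal sketch of one possible fix for the simple race, assuming that making the variable atomic is enough for the program's needs. The helper run_simple_race_fixed is a hypothetical name added for illustration; it returns the value the reader saw instead of printing it.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

std::atomic<int> shared_var{0};  // atomic instead of a plain int

void writer() {
    shared_var.store(42);  // atomic write
}

// Hypothetical helper: returns the value observed by the reader thread.
int run_simple_race_fixed() {
    shared_var.store(0);
    int seen = -1;
    std::thread t1(writer);
    std::thread t2([&seen] { seen = shared_var.load(); });  // atomic read
    t1.join();
    t2.join();
    // The reader may run before or after the writer, but it can never
    // observe anything other than the old or the new value.
    assert(seen == 0 || seen == 42);
    return seen;
}
```

The atomic removes the data race but not the ordering question: whether the reader sees 0 or 42 still depends on scheduling. If the reader must see 42, the threads need an explicit ordering, for example joining the writer before starting the reader.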

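2. Double-checked locking

Example code:

#include <iostream>
#include <mutex>
#include <thread>

class Singleton {
public:
    static Singleton* getInstance() {
        if (instance == nullptr) {                // first check (unsynchronized)
            std::lock_guard<std::mutex> lock(mtx);
            if (instance == nullptr) {            // second check (under the lock)
                instance = new Singleton();       // allocate, construct, publish
            }
        }
        return instance;
    }

private:
    Singleton() { std::cout << "Singleton created\n"; }
    static Singleton* instance;
    static std::mutex mtx;
};

Singleton* Singleton::instance = nullptr;
std::mutex Singleton::mtx;

void accessSingleton() {
    Singleton* s = Singleton::getInstance();
}

int main() {
    std::thread t1(accessSingleton);
    std::thread t2(accessSingleton);
    t1.join();
    t2.join();
    return 0;
}

Problem description:

This program implements the classic double-checked locking singleton. The second check runs under the lock, so only one thread ever executes new Singleton(). The first check, however, reads instance without holding the lock, and instance is a plain pointer rather than a std::atomic. That unlocked read can run concurrently with the write instance = new Singleton() performed by another thread inside the critical section, which is a data race. As a result, a thread may see a non-null instance and return a pointer to an object whose construction is not yet visible to it. This is a classic concurrency bug.

Double-checked locking is meant as an optimization: the lock is taken only while instance is still nullptr, so later calls avoid the cost of locking on every access. The catch is that instance = new Singleton() is not a single atomic step. It consists of:

Memory allocation: operator new allocates a block of heap memory to hold the Singleton object.
Object initialization: the constructor runs and initializes the object's members.
Pointer publication: the resulting address is stored into instance.

Nothing orders these steps with respect to the unlocked read in the first check. The compiler or the CPU may make the store to instance visible before the constructor's writes, so one thread can start using the object while another thread is still initializing it, and end up accessing a partially constructed object.

TSan result: data race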

==================
WARNING: ThreadSanitizer: data race (pid=12345)
  Write of size 8 at 0x0000004000a0 by thread T1:
    #0 Singleton::getInstance() double_checked.cpp:10
    ...

  Previous read of size 8 at 0x0000004000a0 by thread T2:
    #0 Singleton::getInstance() double_checked.cpp:7
    ...

  Location is global 'Singleton::instance' of size 8 at 0x0000004000a0

  Thread T1 (tid=12345, running) created by main thread at:
    #0 std::thread::thread ...
    #1 main double_checked.cpp:29

  Thread T2 (tid=12346, running) created by main thread at:
    #0 std::thread::thread ...
    #1 main double_checked.cpp:30
==================
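
One way to avoid hand-written double-checked locking altogether is a function-local static, whose initialization is guaranteed to be thread-safe since C++11. The sketch below swaps in that technique; it changes getInstance to return a reference, and run_singleton_demo is a hypothetical helper added only to exercise it.

```cpp
#include <cassert>
#include <thread>

class Singleton {
public:
    static Singleton& getInstance() {
        // Initialized exactly once, even if several threads arrive together;
        // the language runtime provides the necessary synchronization.
        static Singleton instance;
        return instance;
    }

    Singleton(const Singleton&) = delete;
    Singleton& operator=(const Singleton&) = delete;

private:
    Singleton() = default;
};

// Hypothetical helper: returns true if both threads got the same object.
bool run_singleton_demo() {
    Singleton* a = nullptr;
    Singleton* b = nullptr;
    std::thread t1([&a] { a = &Singleton::getInstance(); });
    std::thread t2([&b] { b = &Singleton::getInstance(); });
    t1.join();
    t2.join();
    assert(a != nullptr && b != nullptr);
    return a == b;
}
```

If the pointer-based interface has to stay, an alternative is to declare instance as std::atomic<Singleton*> and use acquire loads and a release store, which keeps the fast path lock-free.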

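3. vptr race: data race on vptr

Example:

#include <iostream>
#include <thread>

class Base {
public:
    virtual ~Base() = default;  // needed because main deletes through Base*
    virtual void show() {
        std::cout << "Base class\n";
    }
};

class Derived : public Base {
public:
    void show() override {
        std::cout << "Derived class\n";
    }
};

Base* ptr;

void writer() {
    ptr = new Derived();  // write: constructs the object (setting its vptr), then stores ptr
}

void reader() {
    ptr->show();  // read: loads ptr, then loads the vptr for the virtual call
}

int main() {
    std::thread t1(writer);  // thread that performs the write
    std::thread t2(reader);  // thread that performs the read

    t1.join();
    t2.join();

    delete ptr;
    return 0;
}

Problem description:

This program has a race involving the virtual table pointer (vptr). The vptr is part of how C++ compilers typically implement polymorphism: for a class that declares virtual functions, the compiler generates a virtual function table (vtable), and every object of that class carries a hidden pointer (the vptr) to the vtable.

The race arises as follows:

The writer thread allocates a Derived object, whose constructors set the object's vptr, and then assigns its address to ptr.
The reader thread calls ptr->show(), which reads ptr and then reads the vptr of the object it points to, possibly before the writer has finished (or even started) its work.

ptr points to an object whose hidden vptr member selects the vtable used for virtual calls. Because nothing synchronizes the two threads, the write and read of ptr, and the write and read of the object's vptr, can all happen concurrently, which is a data race.

Key points:

While the writer constructs the object, the vptr typically refers to Base's vtable first and is then updated to Derived's vtable; only after that is ptr assigned.
The reader relies on the vptr to dispatch to the correct show(). If it runs while the writer is still working, it may see ptr as null, or see a vptr that is not yet valid, which leads to undefined behavior.

The program therefore has no guaranteed behavior under multithreading: it may crash, access invalid memory, or call the wrong function.

TSan result: data race on vptr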

=================================================================
WARNING: ThreadSanitizer: data race on vptr
  Location 1: write of size 8 at 0x7fb5c081a010 by thread T1 (writer):
    #0 writer() /path/to/example.cpp:14 (example+0x104)
    #1 std::thread::_State_impl<std::thread::_Invoker<std::tuple<> > >::_M_run() /usr/include/c++/10/thread (libstdc++.so+0x10b09)
    #2 start_thread /usr/lib/x86_64-linux-gnu/libpthread.so.0 (libpthread.so.0+0x76ba)
    #3 __clone (libc.so.6+0x1040f0)

  Location 2: previous read of size 8 at 0x7fb5c081a010 by thread T2 (reader):
    #0 reader() /path/to/example.cpp:17 (example+0x123)
    #1 std::thread::_State_impl<std::thread::_Invoker<std::tuple<> > >::_M_run() /usr/include/c++/10/thread (libstdc++.so+0x10b09)
    #2 start_thread /usr/lib/x86_64-linux-gnu/libpthread.so.0 (libpthread.so.0+0x76ba)
    #3 __clone (libc.so.6+0x1040f0)

  Location is heap block of size 8 at 0x7fb5c081a010 allocated by thread T1 (writer):
    #0 operator new(unsigned long) /path/to/example.cpp:13 (example+0x13d)
    #1 writer() /path/to/example.cpp:14 (example+0x104)
    #2 std::thread::_State_impl<std::thread::_Invoker<std::tuple<> > >::_M_run() /usr/include/c++/10/thread (libstdc++.so+0x10b09)

  Thread T1 (writer):
    #0 writer() /path/to/example.cpp:14 (example+0x104)
    #1 std::thread::_State_impl<std::thread::_Invoker<std::tuple<> > >::_M_run() /usr/include/c++/10/thread (libstdc++.so+0x10b09)
    #2 start_thread /usr/lib/x86_64-linux-gnu/libpthread.so.0 (libpthread.so.0+0x76ba)
    #3 __clone (libc.so.6+0x1040f0)

  Thread T2 (reader):
    #0 reader() /path/to/example.cpp:17 (example+0x123)
    #1 std::thread::_State_impl<std::thread::_Invoker<std::tuple<> > >::_M_run() /usr/include/c++/10/thread (libstdc++.so+0x10b09)
    #2 start_thread /usr/lib/x86_64-linux-gnu/libpthread.so.0 (libpthread.so.0+0x76ba)
    #3 __clone (libc.so.6+0x1040f0)

=================================================================
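
A minimal sketch of one way to hand the object from writer to reader safely, assuming a one-shot handoff is all that is needed. It swaps the global pointer for a std::promise/std::future pair, which makes the reader wait until construction has finished. To keep the sketch checkable, show() is replaced by a hypothetical name() that returns a string, and run_vptr_demo is an illustrative helper.

```cpp
#include <cassert>
#include <future>
#include <memory>
#include <string>
#include <thread>
#include <utility>

class Base {
public:
    virtual ~Base() = default;
    virtual std::string name() const { return "Base"; }
};

class Derived : public Base {
public:
    std::string name() const override { return "Derived"; }
};

// Writer: fully constructs the object, then hands it over via the promise.
void writer(std::promise<std::unique_ptr<Base>> out) {
    out.set_value(std::make_unique<Derived>());
}

// Reader: blocks until the object is available, then makes the virtual call.
std::string reader(std::future<std::unique_ptr<Base>> in) {
    std::unique_ptr<Base> obj = in.get();
    assert(obj != nullptr);
    return obj->name();
}

// Hypothetical helper: wires the two threads together.
std::string run_vptr_demo() {
    std::promise<std::unique_ptr<Base>> p;
    std::future<std::unique_ptr<Base>> f = p.get_future();
    std::string result;
    std::thread t1(writer, std::move(p));
    std::thread t2([&result, &f] { result = reader(std::move(f)); });
    t1.join();
    t2.join();
    return result;
}
```

The promise/future pair provides the ordering between construction and the virtual call, and std::unique_ptr takes care of deleting the object through the virtual destructor.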

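4. Publishing objects without synchronization

Example:

#include <iostream>
#include <thread>

class Data {
public:
    int value;
};

Data* shared_data = nullptr;

void writer() {
    Data* d = new Data();
    d->value = 42;           // initialize the object
    shared_data = d;         // publish the object (no synchronization)
}

void reader() {
    if (shared_data) {
        std::cout << shared_data->value << std::endl;  // read the shared object (no synchronization)
    }
}

int main() {
    std::thread t1(writer);
    std::thread t2(reader);

    t1.join();
    t2.join();

    delete shared_data;
    return 0;
}

Problem description:

This program shows a typical "publish without synchronization" bug. The writer thread publishes a heap-allocated object to other threads through shared_data = d, and the reader thread may read the object concurrently. No synchronization (such as a mutex, an atomic variable, or a memory barrier) guarantees that the object's initialization becomes visible before the pointer does. The reader may therefore see a non-null shared_data whose value member is not yet initialized from its point of view, which is undefined behavior. This is a classic publication race.

TSan result: data race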

==================
WARNING: ThreadSanitizer: data race (use-of-uninitialized value)
  Write of size 8 at 0x7f3d34005010 by thread T1:
    #0 writer() /path/to/example.cpp:13 (example+0x105)
    #1 std::thread::_State_impl<...>::_M_run() ... (libstdc++.so+0xabc12)
    #2 start_thread (libpthread.so.0+0x76ba)
    #3 __clone (libc.so.6+0xfef50)

  Previous read of size 4 at 0x7f3d34005010 by thread T2:
    #0 reader() /path/to/example.cpp:18 (example+0x124)
    #1 std::thread::_State_impl<...>::_M_run() ... (libstdc++.so+0xabc12)
    #2 start_thread (libpthread.so.0+0x76ba)
    #3 __clone (libc.so.6+0xfef50)

  Location is heap block of size 8 at 0x7f3d34005010 allocated by thread T1:
    #0 operator new(unsigned long) (libtsan.so+0x12345)
    #1 writer() /path/to/example.cpp:12 (example+0x0f3)

  Thread T1 (writer):
    #0 writer() /path/to/example.cpp:13 (example+0x105)

  Thread T2 (reader):
    #0 reader() /path/to/example.cpp:18 (example+0x124)

SUMMARY: ThreadSanitizer: data race /path/to/example.cpp:18 in reader()
==================
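
A minimal sketch of safe publication, assuming the pointer can be made atomic. The writer publishes with a release store and the reader picks the pointer up with an acquire load, so a reader that sees the pointer also sees the initialized value. Here reader returns the value (or -1 if nothing has been published yet) instead of printing it, and run_publish_demo is a hypothetical helper.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

struct Data {
    int value = 0;
};

std::atomic<Data*> shared_data{nullptr};

void writer() {
    Data* d = new Data();
    d->value = 42;                                    // initialize first
    shared_data.store(d, std::memory_order_release);  // then publish
}

// Returns the published value, or -1 if the object is not published yet.
int reader() {
    Data* d = shared_data.load(std::memory_order_acquire);
    return d ? d->value : -1;
}

// Hypothetical helper: runs both threads and cleans up afterwards.
int run_publish_demo() {
    int seen = 0;
    std::thread t1(writer);
    std::thread t2([&seen] { seen = reader(); });
    t1.join();
    t2.join();
    delete shared_data.exchange(nullptr);  // both threads have finished
    assert(seen == -1 || seen == 42);
    return seen;
}
```

The release store pairs with the acquire load: everything the writer did before the store is visible to a reader whose load returns that pointer. A mutex around both accesses would give the same guarantee at a somewhat higher cost.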

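5. Initializing objects without synchronization

Example:

#include <thread>

struct A {
    int val;
    A() {}
};

A* shared = nullptr;

void init_once() {
    if (!shared) {
        shared = new A();  // initialization with no synchronization
    }
}

int main() {
    std::thread t1(init_once);
    std::thread t2(init_once);

    t1.join();
    t2.join();

    delete shared;
    return 0;
}

Problem description:

shared is a raw global pointer, and two threads call init_once() at the same time. If both happen to see shared == nullptr, both execute shared = new A();.

Because there is no mutual exclusion:

the two threads may execute new A() almost simultaneously, each constructing an object;
both then write their object's address to shared;
only one of the objects survives: the delete in main destroys that one, while the pointer to the other is overwritten and never deleted, causing a memory leak.

Independently of the leak, the unsynchronized reads and writes of shared are themselves a data race, which is what TSan reports.

TSan result: data race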

==================
WARNING: ThreadSanitizer: data race (pid=12345)
  Write of size 8 at 0x7ffff7f9a0c0 by thread T1:
    #0 operator new(unsigned long) <...>
    #1 A::A() ./double_init_race.cpp:7
    #2 init_once() ./double_init_race.cpp:14

  Previous write of size 8 at 0x7ffff7f9a0c0 by thread T2:
    #0 operator new(unsigned long) <...>
    #1 A::A() ./double_init_race.cpp:7
    #2 init_once() ./double_init_race.cpp:14

  Location is global 'shared' of size 8 at 0x7ffff7f9a0c0

  Thread T1 (tid=123456):
    #0 init_once() ./double_init_race.cpp:14
    ...

  Thread T2 (tid=123457):
    #0 init_once() ./double_init_race.cpp:14
    ...

SUMMARY: ThreadSanitizer: data race ./double_init_race.cpp:14 in init_once()
==================
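
One-time initialization of this kind is what std::call_once is designed for. The sketch below is one possible fix under that assumption; the names shared_once, constructed, and run_init_demo are illustrative additions, and the constructor counter exists only to make the effect observable.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>

std::atomic<int> constructed{0};  // demo only: counts constructor calls

struct A {
    int val = 0;
    A() { ++constructed; }
};

A* shared = nullptr;
std::once_flag shared_once;  // hypothetical name: guards the initialization

void init_once() {
    // Exactly one caller runs the lambda; the others wait until it is done.
    std::call_once(shared_once, [] { shared = new A(); });
}

// Hypothetical helper. A once_flag cannot be reset, so this demo is meant
// to run a single time per process.
int run_init_demo() {
    std::thread t1(init_once);
    std::thread t2(init_once);
    t1.join();
    t2.join();
    assert(shared != nullptr);
    delete shared;
    shared = nullptr;
    return constructed.load();
}
```

With std::call_once, only one object is ever constructed, so the leak disappears together with the race on shared.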

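6. Unsafe concurrent access to bit-fields

Example:

#include <iostream>
#include <thread>

struct Flags {
    unsigned int flag1 : 1;
    unsigned int flag2 : 1;
};

Flags shared_flags;

void setFlag() {
    shared_flags.flag1 = 1;  // write flag1
}

void readFlag() {
    int val = shared_flags.flag2;  // read flag2
    std::cout << "flag2 = " << val << std::endl;
}

int main() {
    std::thread t1(setFlag);
    std::thread t2(readFlag);

    t1.join();
    t2.join();
    return 0;
}

Problem description:

flag1 and flag2 look like two independent fields, but they are bit-fields packed into the same underlying unsigned int. The C++ memory model treats a run of adjacent non-zero-width bit-fields as a single memory location, so the two fields are not independent as far as concurrency is concerned. Writing flag1 is typically compiled into a read-modify-write of the whole word, so the two threads' accesses to flag1 and flag2 are really a write and a read of the same memory location, which is a data race.

TSan result: data race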

==================
WARNING: ThreadSanitizer: data race (store/load on bitfield)
  Write of size 4 at 0x7f99b5800000 by thread T1:
    #0 setFlag() /path/to/bitfield.cpp:11 (bitfield+0x105)
    #1 std::thread::_State_impl<...>::_M_run() ... (libstdc++.so+0xabc12)
    #2 start_thread (libpthread.so.0+0x76ba)
    #3 __clone (libc.so.6+0xfef50)

  Previous read of size 4 at 0x7f99b5800000 by thread T2:
    #0 readFlag() /path/to/bitfield.cpp:15 (bitfield+0x124)
    #1 std::thread::_State_impl<...>::_M_run() ... (libstdc++.so+0xabc12)
    #2 start_thread (libpthread.so.0+0x76ba)
    #3 __clone (libc.so.6+0xfef50)

  Location is global 'shared_flags' of size 4 at 0x7f99b5800000

  Thread T1 (setFlag):
    #0 setFlag() /path/to/bitfield.cpp:11

  Thread T2 (readFlag):
    #0 readFlag() /path/to/bitfield.cpp:15

SUMMARY: ThreadSanitizer: data race /path/to/bitfield.cpp:15 in readFlag()
==================
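
A minimal sketch of one possible fix, assuming the flags can be stored as bits of a single atomic word instead of bit-fields. The constants kFlag1 and kFlag2 and the helper run_flags_demo are hypothetical names introduced for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

constexpr unsigned kFlag1 = 1u << 0;  // hypothetical bit assignments
constexpr unsigned kFlag2 = 1u << 1;

std::atomic<unsigned> shared_flags{0};

void setFlag() {
    // Atomic read-modify-write: sets flag1 without disturbing other bits.
    shared_flags.fetch_or(kFlag1);
}

bool readFlag2() {
    return (shared_flags.load() & kFlag2) != 0;
}

// Hypothetical helper: starts with flag2 set, then sets flag1 concurrently
// with a read of flag2, and returns the final word.
unsigned run_flags_demo() {
    shared_flags.store(kFlag2);
    bool flag2_seen = false;
    std::thread t1(setFlag);
    std::thread t2([&flag2_seen] { flag2_seen = readFlag2(); });
    t1.join();
    t2.join();
    assert(flag2_seen);  // fetch_or never clears flag2
    return shared_flags.load();
}
```

Other options include protecting the whole struct with a mutex, or giving each flag its own non-bit-field member so that each one is a separate memory location.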

7. Lock order inversion

Example:

#include <chrono>
#include <mutex>
#include <thread>

std::mutex mutex1;
std::mutex mutex2;

void threadA() {
    std::lock_guard<std::mutex> lock1(mutex1);
    std::this_thread::sleep_for(std::chrono::milliseconds(10)); // simulate a delay
    std::lock_guard<std::mutex> lock2(mutex2);
}

void threadB() {
    std::lock_guard<std::mutex> lock2(mutex2);
    std::this_thread::sleep_for(std::chrono::milliseconds(10)); // simulate a delay
    std::lock_guard<std::mutex> lock1(mutex1);
}

int main() {
    std::thread t1(threadA);
    std::thread t2(threadB);

    t1.join();
    t2.join();
    return 0;
}

Problem description:

This program has a classic lock order inversion problem:

threadA() acquires mutex1 first, then mutex2
threadB() does the opposite: it acquires mutex2 first, then mutex1

If the two threads run at the same time, each may end up waiting for the other to release its lock, causing a deadlock.

The program does not necessarily deadlock on every run, but this is a serious design flaw. ThreadSanitizer detects this kind of inconsistent lock ordering and reports it as lock-order-inversion.

TSan result: lock-order-inversion

==================
WARNING: ThreadSanitizer: lock-order-inversion (potential deadlock)

Cycle in lock order graph: M1 (mutex1) => M2 (mutex2) => M1

  Mutex M1 acquired here while holding M2:
    #0 std::__1::mutex::lock() ...
    #1 std::lock_guard<std::mutex>::lock_guard(...) ...
    #2 threadB() /path/to/lock_inversion.cpp:16

  Mutex M2 previously acquired by the same thread:
    #0 std::__1::mutex::lock() ...
    #1 std::lock_guard<std::mutex>::lock_guard(...) ...
    #2 threadB() /path/to/lock_inversion.cpp:14

  Mutex M2 acquired here while holding M1:
    #0 std::__1::mutex::lock() ...
    #1 std::lock_guard<std::mutex>::lock_guard(...) ...
    #2 threadA() /path/to/lock_inversion.cpp:11

  Mutex M1 previously acquired by the same thread:
    #0 std::__1::mutex::lock() ...
    #1 std::lock_guard<std::mutex>::lock_guard(...) ...
    #2 threadA() /path/to/lock_inversion.cpp:9

SUMMARY: ThreadSanitizer: lock-order-inversion (potential deadlock)
==================
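
A minimal sketch of one possible fix, assuming C++17 is available: both mutexes are acquired together through std::scoped_lock, which uses a deadlock-avoidance algorithm, so the order in which the mutexes are named no longer matters. The counter completed and the helper run_lock_demo are hypothetical additions used only to show that both threads finish.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

std::mutex mutex1;
std::mutex mutex2;
int completed = 0;  // demo only: modified while both mutexes are held

void threadA() {
    std::scoped_lock lock(mutex1, mutex2);  // acquires both without deadlock
    ++completed;
}

void threadB() {
    std::scoped_lock lock(mutex2, mutex1);  // argument order is irrelevant
    ++completed;
}

// Hypothetical helper: returns how many threads ran their critical section.
int run_lock_demo() {
    completed = 0;  // no other thread is running yet
    std::thread t1(threadA);
    std::thread t2(threadB);
    t1.join();
    t2.join();
    assert(completed == 2);
    return completed;
}
```

When C++17 is not an option, std::lock(mutex1, mutex2) followed by std::lock_guard with std::adopt_lock achieves the same effect, as does simply acquiring the mutexes in the same fixed order everywhere.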

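Note: each of the problems above can be fixed, for example by introducing appropriate synchronization. For details, see 《c++并发编程篇》 (the C++ concurrent programming series).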
