课后作业(编码) 在本节中，我们将编写一些简单的多线程程序，并使用一个叫helgrind的特定工具来查找程序中的问题。阅读作业中的 README 文件，以获取有关如何构建程序和运行helgrind的详细信息。

问题:

1.首先构建main-race.c,查看代码，以便您可以在代码中看到（非常明显的）数据竞争。现在运行 helgrind（通过输入valgrind --tool=helgrind main-race）来查看其追踪结果。它指向正确的代码行吗？它还能为您提供什么其他信息？

输入make main-race构建,代码中的竞争条件很明显,多线程同时修改balance变量

输入valgrind --tool=helgrind ./main-race 查看结果,

跟踪结果:

==25676== Helgrind, a thread error detector
==25676== Copyright (C) 2007-2017, and GNU GPL'd, by OpenWorks LLP et al.
==25676== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
==25676== Command: ./main-race
==25676== 
==25676== ---Thread-Announcement------------------------------------------
==25676== 
==25676== Thread #1 is the program's root thread
==25676== 
==25676== ---Thread-Announcement------------------------------------------
==25676== 
==25676== Thread #2 was created
==25676==    at 0x49B2282: clone (clone.S:71)
==25676==    by 0x48752EB: create_thread (createthread.c:101)
==25676==    by 0x4876E0F: pthread_create@@GLIBC_2.2.5 (pthread_create.c:817)
==25676==    by 0x4842917: ??? (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_helgrind-amd64-linux.so)
==25676==    by 0x109513: Pthread_create (mythreads.h:51)
==25676==    by 0x1095F1: main (main-race.c:14)
==25676== 
==25676== ----------------------------------------------------------------
==25676== 
==25676== Possible data race during read of size 4 at 0x10C014 by thread #1
==25676== Locks held: none
==25676==    at 0x1095F2: main (main-race.c:15)
==25676== 
==25676== This conflicts with a previous write of size 4 by thread #2
==25676== Locks held: none
==25676==    at 0x1095A6: worker (main-race.c:8)
==25676==    by 0x4842B1A: ??? (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_helgrind-amd64-linux.so)
==25676==    by 0x4876608: start_thread (pthread_create.c:477)
==25676==    by 0x49B2292: clone (clone.S:95)
==25676==  Address 0x10c014 is 0 bytes inside data symbol "balance"
==25676== 
==25676== ----------------------------------------------------------------
==25676== 
==25676== Possible data race during write of size 4 at 0x10C014 by thread #1
==25676== Locks held: none
==25676==    at 0x1095FB: main (main-race.c:15)
==25676== 
==25676== This conflicts with a previous write of size 4 by thread #2
==25676== Locks held: none
==25676==    at 0x1095A6: worker (main-race.c:8)
==25676==    by 0x4842B1A: ??? (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_helgrind-amd64-linux.so)
==25676==    by 0x4876608: start_thread (pthread_create.c:477)
==25676==    by 0x49B2292: clone (clone.S:95)
==25676==  Address 0x10c014 is 0 bytes inside data symbol "balance"
==25676== 
==25676== 
==25676== Use --history-level=approx or =none to gain increased speed, at
==25676== the cost of reduced accuracy of conflicting-access information
==25676== For lists of detected and suppressed errors, rerun with: -s
==25676== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0)

可以看到有竞争状态提示:Possible data race during read of size 4 at 0x10C014 by thread #1

代码行提示:at 0x1095A6: worker (main-race.c:8)

地址提示: Address 0x10c014 is 0 bytes inside data symbol "balance"

2.删除有问题的代码行之一会发生什么？现在，在一个共享变量的更新附近添加锁，然后在所有变量更新周围添加锁。在每种情况下，Helgrind报告什么？

删除第 15 行代码或第 8 行代码,程序正确运行

加一个锁报错,依然有竞争条件, 所有变量更新周围添加锁:cd code && make && valgrind --tool=helgrind ./main-race 结果正确

3.现在让我们看一下main-deadlock.c. 查看代码。代码中有一个死锁的问题（我们将在下一章中对此进行更深入的讨论）。您知道它可能有什么问题吗？

死锁情况:

线程 0 获取锁 1,中断,线程 1 执行, 获取锁 2,造成死锁

4.现在运行helgrind检查这段代码。 Helgrind报告什么？

追踪:

make main-deadlock 
valgrind --tool=helgrind ./main-deadlock

结果:

==28961== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 7 from 7)

5.现在使用main-deadlock-global.c运行helgrind。查看代码；它有和main-deadlock.c有一样的问题吗？ Helgrind是否应该报告相同的错误？对于helgrind之类的工具,结果说明了什么？

执行:

make main-deadlock-global
valgrind --tool=helgrind ./main-deadlock-global

执行结果:

ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 7 from 7)

main-deadlock-global.c 代码是没有问题的,但 valgrind 工具依然报错, 因此推断 helgrind 靠执行周期与上下文切换次数判断死锁,因此,helgrind 并不能很好地判断死锁

6.接下来让我们看一下main-signal.c。这段代码使用变量（done）来表示子进程已完成，并且父线程现在可以继续运行了。为什么这段代码效率低下？（父线程最终会花时间做什么，特别是当子线程需要很长时间才能执行完成时？）

执行:

make main-signal
valgrind --tool=helgrind ./main-signal

父线程使用大量时间自旋等待

7.现在在这个程序上运行helgrind。它报告什么?代码正确吗?

==30421== Possible data race during write of size 1 at 0x52861A5 by thread #1
==30421==    by 0x109633: main (main-signal.c:17)
==30421==    by 0x1095CC: worker (main-signal.c:8)
=30326== ERROR SUMMARY: 23 errors from 2 contexts (suppressed: 40 from 36)

竞争条件是 done 变量, 工具指示到 printf 去了,

glibc 的 printf 函数是线程安全的函数,参考:stackoverflow

8.现在看一下main-signal.c稍微修改的版本:main-signal-cv.c。该版本使用条件变量来发送信号（并进行加锁）。为什么此代码比以前的版本更好？是正确，还是性能，或两者兼而有之？

执行:

make main-signal-cv
valgrind --tool=helgrind ./main-signal-cv

结果正确,性能比之前的版本要好, 对 done 变量加锁, 且将自旋等待替换为 Pthread_cond_wait, 释放锁并让出 CPU,收到信号时被唤醒,重新获取锁

9.再一次在main-signal-cv.c上运行helgrind。它报告错误吗?

没有

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Questions-答案.md

Questions-答案.md

Files

Questions-答案.md

Latest commit

History

Questions-答案.md

File metadata and controls