When you author software to simultaneously handle multiple tasks, you may use multithreaded programming—programs with constructs such as multiple entry points, interleaving of threads, and asynchronous interrupts. However, multithreaded programming is highly complex and introduces subtle defects such as data races and deadlocks. When such a defect occurs, it can take a long time to reproduce the issue and even longer to identify the root cause and fix it.
Deadlocks are one of the most subtle defects in this category. Deadlocks occur when multiple tasks or threads cannot make progress because each task is waiting for a lock held by another task that is also stuck.
Example of a Deadlock
Suppose that two tasks, Task 1
and Task 2
, write values to the shared resources, sharedVar1
and sharedVar2
. The write operations are protected with two mutual exclusion locks, Mutex #1
and Mutex #2
, held in sequence. The important thing to note is that the sequence of lock acquisition is reversed in the two tasks.
You can see this full code example to review the details.
If you build and run this code, the execution freezes at run-time sporadically. The following example show a program executed several times, and on the eighth run, the console becomes nonresponsive.
The program freeze at execution traces back to a deadlock in the code. The deadlock results when the following events occur in this sequence:
Task 1
acquires the lockMutex #1
and continues up to the point whereMutex #2
needs to be acquired.- The execution switches to
Task 2
, which first acquires the lockMutex #2
. ThenTask 2
continues up to the point whereMutex #1
needs to be acquired.
At this point, Task 2
is waiting for Task 1
to release Mutex #1
, while Task 1
is waiting for Task 2
to release Mutex #2
. Neither task can proceed past this state and release the lock required by the other task. This situation is a classic example of a deadlock.
Since a deadlock depends strictly on the execution order of instructions, you may not hit one in every run. If the possibility of a deadlock exists in your program, you will sporadically encounter the program freezing inexplicably. Because of the sporadic nature of this issue, deadlocks can escape undetected in normal testing. Even when detected, deadlocks are sometimes difficult to reproduce, making them appear like a Heisenbug, a bug that seems to disappear under scrutiny.
How to Detect and Fix Deadlocks
Unlike dynamic testing, a static analysis tool analyzes the structure of your program and can cover a wide variety of execution scenarios. Deadlocks like the previous ones are detectable in a single run of a static analysis tool.
In the example below, we used the concurrency checkers of the static analysis tool, Polyspace Bug Finder, on the previous program. Not only does the tool detect the deadlock, but it also provides the sequence of instructions that leads to the deadlock.
In the event list, each item accompanying a deadlock is clickable and takes you to a relevant location in the code. With the help of the events, you can navigate to the points where locks are acquired to understand how the deadlock happened.
There are many ways to fix a deadlock. In the previous example, you can fix the deadlock by simply reversing the order in which the locks are acquired in one of the tasks. For instance, Task 2
can acquire Mutex #1
first and then Mutex #2
and release them in the reverse order (just like Task 1
). In general, if more than two locks are involved in a deadlock cycle, you can set a fixed acquisition order of the locks and ensure that the order is observed in all the tasks. However, this approach also requires careful consideration of the program logic and might not be possible in all situations.
Another fix can block some tasks from starting while other tasks are running by assigning priorities to the tasks. After implementing the fix, you can see the deadlock is resolved if your static analysis tool supports task priorities. You can set up Polyspace Bug Finder to recognize that certain tasks have higher priorities than others.
Summary
Whatever the fix is, the first step is to detect the presence of a deadlock. Simple dynamic testing cannot do the job. You need to use an advanced static analysis tool that can recognize the various concepts in multithreaded programming and detect issues related to a concurrent execution model.
Polyspace Bug Finder has a host of concurrency checkers that can detect issues with the placement of locks, such as deadlocks, double locks, and unresolved data races. For more examples, see Concurrency Defects in Polyspace Bug Finder.
Written by Yoo Yong-chul and Anirban Gangopadhyay.
Yoo Yong-chul works as an application engineer at MathWorks Korea and is responsible for code verification products.
Anirban Gangopadhyay works as a documentation writer at MathWorks US. He oversees technical documentation of Polyspace® products.
Original post: Naver blog post
MathWorks Korea 2020.5.19