Common Multi-Threading Issues
1. Data Race
A data race occurs when two or more threads access the same shared resource without proper synchronization and at least one of the accesses is a write operation. The result is undefined or unpredictable behavior.
A race condition can occur without a data race, and a data race can occur without a race condition. For example, the order of events can be consistent from run to run, but if a read always happens at the same time as an unsynchronized write, there is still a data race.
Data races can be addressed by using synchronization primitives such as locks, semaphores, or atomic operations to ensure that conflicting accesses to the shared resource do not overlap. Programming languages and frameworks also provide constructs such as mutexes, monitors, and concurrent collections to help developers prevent data races.
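Data Race Example: In the following code, a programmer attempts to use a mutex to synchronize access to the global variable globalX. However, there is still a storage conflict on this variable: the mutex cs is declared inside increment, so each thread locks its own private mutex and the increments of globalX are never serialized.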
#include <stdio.h>
#include <pthread.h>

#define NTHREADS 4

int globalX = 0;

void * increment (void *arg)
{
    /* BUG (intentional): cs is a local variable, so every thread
       creates and locks its own private mutex. No two threads ever
       contend for the same lock, and globalX++ is left unprotected. */
    pthread_mutex_t cs;
    pthread_mutex_init (&cs, 0);
    pthread_mutex_lock (&cs);
    globalX++;
    pthread_mutex_unlock (&cs);
    pthread_mutex_destroy (&cs);
    return 0;
}

int main (int argc, char *argv[])
{
    pthread_t h[NTHREADS];
    int rc;
    int i;
    printf ("START\n");
    for (i = 0; i < NTHREADS; i++)
    {
        rc = pthread_create (&h[i], 0, increment, 0);
    }
    for (i = 0; i < NTHREADS; i++)
    {
        rc = pthread_join (h[i], 0);
    }
    printf ("TOTAL = %d\n", globalX);
    printf ("STOP\n");
    return 0;
}
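One possible repair of the example above, offered as a sketch rather than the only fix, is to give all threads a single shared mutex. The names counter, counter_lock, increment_shared, and run_shared_mutex_demo are hypothetical and chosen for this illustration; the mutex is assumed to be statically initialized with PTHREAD_MUTEX_INITIALIZER.

```c
#include <pthread.h>
#include <stddef.h>

#define NTHREADS 4

static int counter = 0;
/* One mutex shared by every thread, initialized once at file scope. */
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

static void *increment_shared(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&counter_lock);   /* all threads contend on the same lock */
    counter++;
    pthread_mutex_unlock(&counter_lock);
    return NULL;
}

/* Starts NTHREADS workers, waits for them, and returns the final count,
   or -1 if a thread could not be created. */
int run_shared_mutex_demo(void)
{
    pthread_t h[NTHREADS];
    int started = 0;
    int ok = 1;
    int i;

    counter = 0;
    for (i = 0; i < NTHREADS; i++) {
        if (pthread_create(&h[i], NULL, increment_shared, NULL) != 0) {
            ok = 0;
            break;
        }
        started++;
    }
    for (i = 0; i < started; i++)
        pthread_join(h[i], NULL);

    /* Safe to read without the lock: every worker has been joined. */
    return ok ? counter : -1;
}
```

Calling run_shared_mutex_demo() from main should return NTHREADS when every thread starts successfully, because each increment now happens while the same lock is held.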
The following strategies can help prevent a data race:
· Use synchronization primitives: Locks, semaphores, or atomic operations can ensure that only one thread accesses a shared resource at a time, which prevents conflicting updates.
· Use thread-safe programming constructs: Programming languages and frameworks provide constructs such as mutexes, monitors, and concurrent collections to help developers prevent data races.
· Use immutable data structures: Immutable data structures cannot be modified once they are created. Because no thread can write to them, sharing them between threads cannot cause a data race.
· Avoid shared mutable state: Shared mutable state is a common cause of data races. Where possible, design your program to avoid it, using approaches such as message passing or immutability to ensure thread safety.
· Use memory barriers: Memory barriers enforce the order in which memory operations become visible to other threads, preventing unexpected behavior due to compiler or CPU reordering. They are typically used together with atomic operations rather than in place of them.
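As an illustration of the atomic-operations strategy, the following sketch increments a shared counter without any mutex. It assumes a compiler that supports C11 <stdatomic.h>; the names atomic_counter, atomic_worker, and run_atomic_demo are hypothetical.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define ATOMIC_DEMO_THREADS 4
#define ATOMIC_DEMO_ITERS   100000

/* Shared counter; every access goes through an atomic operation. */
static atomic_int atomic_counter;

static void *atomic_worker(void *arg)
{
    int i;
    (void)arg;
    for (i = 0; i < ATOMIC_DEMO_ITERS; i++)
        atomic_fetch_add(&atomic_counter, 1);  /* indivisible read-modify-write */
    return NULL;
}

/* Runs the workers and returns the final counter value. */
int run_atomic_demo(void)
{
    pthread_t h[ATOMIC_DEMO_THREADS];
    int started = 0;
    int i;

    atomic_store(&atomic_counter, 0);
    for (i = 0; i < ATOMIC_DEMO_THREADS; i++) {
        if (pthread_create(&h[i], NULL, atomic_worker, NULL) != 0)
            break;
        started++;
    }
    for (i = 0; i < started; i++)
        pthread_join(h[i], NULL);

    return atomic_load(&atomic_counter);
}
```

With a plain int and counter++ in place of atomic_fetch_add, the same loop would be a data race and could lose updates. Atomics suit single-variable updates like this one; protecting several related variables usually still calls for a lock.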
2. Deadlock
A deadlock occurs when two or more threads are blocked and unable to proceed because each is waiting for another to release a resource that it needs to continue execution. For example, a group of people sit around a circular table with a fork between each pair of them, and each person needs two forks to eat. If every person picks up one fork and then waits for a neighbor to release the second, nobody can eat and nobody lets go, so a deadlock occurs.
Deadlock Example: The following code illustrates the potential for deadlock in a bad locking hierarchy: work0 locks cs0 and then cs1, while work1 locks cs1 and then cs0. If one thread happens to acquire both locks before the other thread starts, no deadlock occurs, so the program may appear to work. However, concurrent programs that rely on a particular order of execution without enforcing that order will eventually fail.
#include <stdio.h>
#include <pthread.h>

pthread_mutex_t cs0, cs1;
int globalX = 0;
int globalY = 0;

/* Locks cs0 first, then cs1. */
void * work0 (void *arg)
{
    pthread_mutex_lock (&cs0);
    globalX++;
    pthread_mutex_lock (&cs1);
    globalY++;
    pthread_mutex_unlock (&cs1);
    pthread_mutex_unlock (&cs0);
    return 0;
}

/* BUG (intentional): locks cs1 first, then cs0, the opposite order
   to work0. If each thread holds its first lock, neither can proceed. */
void * work1 (void *arg)
{
    pthread_mutex_lock (&cs1);
    globalX++;
    pthread_mutex_lock (&cs0);
    globalY++;
    pthread_mutex_unlock (&cs0);
    pthread_mutex_unlock (&cs1);
    return 0;
}

int main (int argc, char *argv[])
{
    pthread_t h[2];
    int rc;
    pthread_mutex_init (&cs0, 0);
    pthread_mutex_init (&cs1, 0);
    printf ("START\n");
    rc = pthread_create (&h[0], 0, work0, 0);
    rc = pthread_create (&h[1], 0, work1, 0);
    rc = pthread_join (h[0], 0);
    rc = pthread_join (h[1], 0);
    /* Print only after both joins, so the totals are read once the
       workers have finished. */
    printf ("TOTAL = (%d,%d)\n", globalX, globalY);
    printf ("STOP\n");
    pthread_mutex_destroy (&cs0);
    pthread_mutex_destroy (&cs1);
    return 0;
}
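A minimal sketch of how the example above could be made deadlock-free is to have every thread take the two locks in the same order. The names lock_a, lock_b, ordered_worker, and run_ordered_demo are hypothetical, and the loop count is an arbitrary choice to give the threads time to overlap.

```c
#include <pthread.h>
#include <stddef.h>

#define ORDERED_ITERS 10000

static pthread_mutex_t lock_a = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t lock_b = PTHREAD_MUTEX_INITIALIZER;
static int ordered_x = 0;   /* protected by lock_a */
static int ordered_y = 0;   /* protected by lock_b */

/* Every thread runs the same function, so every thread acquires
   lock_a before lock_b. A circular wait cannot form. */
static void *ordered_worker(void *arg)
{
    int i;
    (void)arg;
    for (i = 0; i < ORDERED_ITERS; i++) {
        pthread_mutex_lock(&lock_a);
        ordered_x++;
        pthread_mutex_lock(&lock_b);
        ordered_y++;
        pthread_mutex_unlock(&lock_b);   /* release in reverse order */
        pthread_mutex_unlock(&lock_a);
    }
    return NULL;
}

/* Runs two workers and returns ordered_x + ordered_y. */
int run_ordered_demo(void)
{
    pthread_t h[2];
    int started = 0;
    int i;

    ordered_x = 0;
    ordered_y = 0;
    for (i = 0; i < 2; i++) {
        if (pthread_create(&h[i], NULL, ordered_worker, NULL) != 0)
            break;
        started++;
    }
    for (i = 0; i < started; i++)
        pthread_join(h[i], NULL);

    return ordered_x + ordered_y;
}
```

Because no thread ever holds lock_b while waiting for lock_a, the two workers can interleave in any way and both joins still return.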
The following strategies can help prevent a deadlock:
· Avoid circular wait: One of the main causes of deadlocks is circular wait, where two or more threads hold resources and wait for other threads to release resources they need. To avoid circular wait, ensure that every thread acquires resources in the same fixed order.
· Use timeouts: If a thread has been waiting for a resource for an extended period of time, a timeout mechanism lets it give up, release what it already holds, and retry later.
· Use resource ordering: This involves acquiring resources in a specific order to avoid deadlock scenarios. For example, if multiple threads need to acquire two resources A and B, they can agree to always acquire them in the order A then B, to avoid circular wait.
· Use deadlock detection: Some operating systems and programming languages provide deadlock detection mechanisms that can detect deadlocks and, in some cases, resolve them automatically.
· Use lock-free data structures: Lock-free data structures, built on non-blocking or wait-free algorithms, eliminate the need for locks and so reduce the likelihood of deadlocks.
· Keep critical sections short: To minimize the time that threads are holding resources, critical sections should be kept as short as possible, so that other threads are not blocked for an extended period of time.
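When a fixed lock order is not practical, a back-off approach in the spirit of the timeout strategy can be sketched with pthread_mutex_trylock: a thread that cannot get its second lock releases the first and tries again. The helper lock_both and the other names below are hypothetical. This pattern avoids deadlock but can in principle livelock if threads keep colliding, so it is a sketch under those assumptions rather than a general recommendation.

```c
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

#define BACKOFF_ITERS 1000

static pthread_mutex_t res_a = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t res_b = PTHREAD_MUTEX_INITIALIZER;
static int backoff_total = 0;   /* only modified while both locks are held */

/* Hypothetical helper: block on 'first', then only try 'second'.
   If 'second' is busy, drop 'first' and start over, so this thread
   never waits while holding a lock. */
static void lock_both(pthread_mutex_t *first, pthread_mutex_t *second)
{
    for (;;) {
        pthread_mutex_lock(first);
        if (pthread_mutex_trylock(second) == 0)
            return;                      /* got both locks */
        pthread_mutex_unlock(first);     /* back off */
        sched_yield();                   /* give the other thread a chance */
    }
}

static void *a_then_b(void *arg)
{
    int i;
    (void)arg;
    for (i = 0; i < BACKOFF_ITERS; i++) {
        lock_both(&res_a, &res_b);
        backoff_total++;
        pthread_mutex_unlock(&res_b);
        pthread_mutex_unlock(&res_a);
    }
    return NULL;
}

/* Deliberately requests the locks in the opposite order. */
static void *b_then_a(void *arg)
{
    int i;
    (void)arg;
    for (i = 0; i < BACKOFF_ITERS; i++) {
        lock_both(&res_b, &res_a);
        backoff_total++;
        pthread_mutex_unlock(&res_a);
        pthread_mutex_unlock(&res_b);
    }
    return NULL;
}

/* Runs both workers and returns the number of completed iterations. */
int run_backoff_demo(void)
{
    pthread_t t0, t1;
    int ok0, ok1;

    backoff_total = 0;
    ok0 = (pthread_create(&t0, NULL, a_then_b, NULL) == 0);
    ok1 = (pthread_create(&t1, NULL, b_then_a, NULL) == 0);
    if (ok0) pthread_join(t0, NULL);
    if (ok1) pthread_join(t1, NULL);
    return backoff_total;
}
```

The two workers request the locks in opposite orders, which is the situation that can deadlock with plain pthread_mutex_lock calls, yet neither thread ever blocks while holding a lock.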
3. Stall
A stall occurs when a thread is blocked and unable to proceed, often due to a long-running operation, a synchronization bottleneck, or a wait for I/O operations, for other threads to complete their work, or for locks and other synchronization primitives. A thread that is stalled for a long period of time can delay the overall execution of the program and degrade the responsiveness and performance of the system.
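Stall Example: In the following code, the main thread acquires global_lock before creating the workers and never releases it, so both workers stall in pthread_mutex_lock.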
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>   /* exit */
#include <unistd.h>   /* sleep */

#define ITERS (1024*1024)

double sum = 0;
pthread_mutex_t global_lock;

void * work( void *arg )
{
    int i;
    sleep( 1 );
    /* Blocks here: the main thread still holds global_lock. */
    pthread_mutex_lock( & global_lock );
    for (i = 0; i < ITERS; ++i) {
        sum = sum + 1;
    }
    pthread_mutex_unlock( & global_lock );
    return 0;
}

void stall_test ( void )
{
    pthread_t thr1, thr2;
    pthread_attr_t attr;
    double arg1, arg2;
    printf( "START\n" );
    arg1 = 1; arg2 = 2;
    pthread_attr_init( & attr );
    pthread_attr_setstacksize( &attr, 1024 * 1024 );
    pthread_mutex_init( & global_lock, NULL );
    /* BUG (intentional): the lock is taken here and never released. */
    pthread_mutex_lock( & global_lock );
    pthread_create( &thr1, &attr, work, &arg1 );
    pthread_create( &thr2, &attr, work, &arg2 );
    /* wait for the error to happen */
    sleep( 15 );
    pthread_cancel( thr1 );
    pthread_cancel( thr2 );
    pthread_mutex_destroy( &global_lock );
    printf( "STOP\n" );
}

int main ( int argc, char *argv[] )
{
    stall_test();
    exit( 0 );
}
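One way the stall in the example above could be removed, shown here only as a sketch, is for the main thread to release the lock once its own work is done and then join the workers instead of sleeping and cancelling them. The names stall_sum, stall_lock, sum_worker, and run_no_stall_demo are hypothetical.

```c
#include <pthread.h>
#include <stddef.h>

#define ITERS (1024*1024)

static double stall_sum = 0;
static pthread_mutex_t stall_lock = PTHREAD_MUTEX_INITIALIZER;

static void *sum_worker(void *arg)
{
    int i;
    (void)arg;
    pthread_mutex_lock(&stall_lock);
    for (i = 0; i < ITERS; ++i)
        stall_sum = stall_sum + 1;
    pthread_mutex_unlock(&stall_lock);
    return NULL;
}

/* Runs two workers to completion and returns the final sum. */
double run_no_stall_demo(void)
{
    pthread_t thr1, thr2;
    int ok1, ok2;

    pthread_mutex_lock(&stall_lock);
    stall_sum = 0;                       /* setup done under the lock */
    ok1 = (pthread_create(&thr1, NULL, sum_worker, NULL) == 0);
    ok2 = (pthread_create(&thr2, NULL, sum_worker, NULL) == 0);

    /* Release the lock so the workers can make progress. */
    pthread_mutex_unlock(&stall_lock);

    /* Wait for completion instead of sleeping for a fixed time. */
    if (ok1) pthread_join(thr1, NULL);
    if (ok2) pthread_join(thr2, NULL);
    return stall_sum;
}
```

Each worker still holds the lock for its entire loop, so the two loops run one after the other; the difference from the original is that neither worker waits on a lock that is never released.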
The following strategies can help prevent a stall:
· Use asynchronous I/O: Asynchronous I/O allows threads to continue processing while waiting for I/O operations to complete, avoiding stalls that would otherwise occur.
· Use thread pooling: Thread pooling involves creating a pool of worker threads that can be reused to handle multiple tasks, reducing the need for creating and destroying threads and minimizing the overhead of thread creation.
· Use non-blocking algorithms: Non-blocking algorithms rely on operations that do not block threads, such as atomic compare-and-swap, so a delayed thread does not prevent other threads from making progress.
· Use lock-free data structures: Lock-free data structures, built on non-blocking or wait-free algorithms, eliminate the need for locks and so reduce the likelihood of stalls.
· Use timeouts: If a thread has been waiting for a resource for an extended period of time, a timeout mechanism lets it stop waiting and retry later.
· Optimize performance: By optimizing the performance of your code and reducing the amount of time spent in critical sections, you can minimize the likelihood of stalls occurring.
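The timeout strategy can be sketched with pthread_mutex_timedlock, which gives up after an absolute deadline instead of waiting forever. This assumes a platform that provides the function (it is part of an optional POSIX feature and is not available everywhere); lock_with_timeout, busy_lock, and run_timeout_demo are hypothetical names, and the 100 ms limit is arbitrary.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <errno.h>
#include <pthread.h>
#include <stddef.h>
#include <time.h>

static pthread_mutex_t busy_lock = PTHREAD_MUTEX_INITIALIZER;

/* Tries to lock m, waiting at most timeout_ms milliseconds.
   Returns 0 on success or ETIMEDOUT if the lock stayed busy. */
int lock_with_timeout(pthread_mutex_t *m, long timeout_ms)
{
    struct timespec deadline;

    /* pthread_mutex_timedlock takes an absolute CLOCK_REALTIME time. */
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec  += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec++;
        deadline.tv_nsec -= 1000000000L;
    }
    return pthread_mutex_timedlock(m, &deadline);
}

static void *timed_worker(void *arg)
{
    int *result = arg;
    *result = lock_with_timeout(&busy_lock, 100);
    if (*result == 0)
        pthread_mutex_unlock(&busy_lock);
    return NULL;
}

/* If hold_lock is nonzero, the caller keeps busy_lock locked while the
   worker runs, forcing the worker to time out. Returns the worker's
   result, or -1 if the worker could not be started. */
int run_timeout_demo(int hold_lock)
{
    pthread_t t;
    int result = -1;

    if (hold_lock)
        pthread_mutex_lock(&busy_lock);
    if (pthread_create(&t, NULL, timed_worker, &result) == 0)
        pthread_join(t, NULL);
    if (hold_lock)
        pthread_mutex_unlock(&busy_lock);
    return result;
}
```

A caller that receives ETIMEDOUT can log the event, do other work, or retry, rather than stalling indefinitely on the lock.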