Common Multi-Threading Issues

1. Data Race

A data race is a specific type of concurrency issue in which two or more threads access the same shared resource without proper synchronization and at least one of the accesses is a write. The result is undefined or unpredictable behavior.

A race condition can occur without a data race, and a data race can occur without a race condition. For example, the order of events may be perfectly consistent from run to run, but if a read always happens at the same time as a write to the same location, there is still a data race.

Data races can be addressed by using synchronization primitives such as locks, semaphores, or atomic operations to ensure that only one thread accesses the shared resource at a time. Most programming languages and frameworks also provide constructs such as mutexes, monitors, and concurrent collections to help developers prevent data races.
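
As a minimal sketch of the atomic-operation approach, the following C11 program (assuming <stdatomic.h> is available; the names counter and bump are illustrative only) increments a shared counter from several threads without a lock and without a data race:

#include <stdatomic.h>
#include <pthread.h>
#include <stdio.h>
#define NTHREADS 4

atomic_int counter = 0;                      /* shared counter, updated atomically */

void * bump (void *arg)
{
    atomic_fetch_add (&counter, 1);          /* indivisible read-modify-write: no data race */
    return 0;
}

int main (void)
{
    pthread_t h[NTHREADS];
    int i;
    for (i = 0; i < NTHREADS; i++)
        pthread_create (&h[i], 0, bump, 0);
    for (i = 0; i < NTHREADS; i++)
        pthread_join (h[i], 0);
    printf ("counter = %d\n", atomic_load (&counter));   /* always NTHREADS */
    return 0;
}

Because the increment is a single indivisible operation, the final value is always NTHREADS no matter how the threads interleave.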

Data Race Example

A programmer attempts to use a mutex to synchronize access to the global variable globalX in the following code. However, there is still a data race (storage conflict) on this variable, because each thread creates and locks its own private mutex.

#include <stdio.h>
#include <pthread.h>
#define NTHREADS 4
int globalX = 0;
void * increment (void *arg)
{
    /* Each thread creates, locks, and destroys its OWN mutex, so the lock
       protects nothing: the increments of globalX still race. */
    pthread_mutex_t cs;
    pthread_mutex_init (&cs, 0);
    pthread_mutex_lock (&cs);
    globalX++;
    pthread_mutex_unlock (&cs);
    pthread_mutex_destroy (&cs);
    return 0;
}

int main (int argc, char *argv[])
{
    pthread_t h[NTHREADS];
    int rc;
    int i;
    printf ("START\n");
    for (i = 0; i < NTHREADS; i++)
    {
        rc = pthread_create (&h[i], 0, increment, 0);
    }
    for (i = 0; i < NTHREADS; i++)
    {
        rc = pthread_join (h[i], 0);
    }
    printf ("TOTAL = %d\n", globalX);
    printf ("STOP\n");
}

The following strategies can be implemented to prevent a data race:

·        Use synchronization primitives: Synchronization primitives such as locks, semaphores, or atomic operations can be used to ensure that only one thread accesses a shared resource at a time, so conflicting updates are avoided. A corrected version of the example above is sketched after this list.

·        Use thread-safe programming constructs: Programming languages and frameworks provide constructs such as mutexes, monitors, and concurrent collections to help developers prevent data races.

·        Use immutable data structures: Immutable data structures are read-only data structures that cannot be modified once they are created. By using immutable data structures, you can avoid data races altogether, since they are guaranteed to be thread-safe.

·        Avoid shared mutable state: Shared mutable state is a common cause of data races. If possible, you should design your program to avoid shared mutable state and use other approaches such as message passing or immutability to ensure thread-safety.

·        Use memory barriers: Memory barriers can be used to ensure that the order of operations is enforced between threads, preventing unexpected behavior due to out-of-order execution. 
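
As a minimal sketch of the first two strategies, the buggy example above can be corrected by making the mutex a single object shared by every thread instead of a per-thread local:

#include <stdio.h>
#include <pthread.h>
#define NTHREADS 4
int globalX = 0;
pthread_mutex_t cs = PTHREAD_MUTEX_INITIALIZER;   /* ONE mutex shared by all threads */

void * increment (void *arg)
{
    pthread_mutex_lock (&cs);      /* every thread contends for the same lock */
    globalX++;
    pthread_mutex_unlock (&cs);
    return 0;
}

int main (int argc, char *argv[])
{
    pthread_t h[NTHREADS];
    int i;
    printf ("START\n");
    for (i = 0; i < NTHREADS; i++)
        pthread_create (&h[i], 0, increment, 0);
    for (i = 0; i < NTHREADS; i++)
        pthread_join (h[i], 0);
    printf ("TOTAL = %d\n", globalX);   /* now always prints NTHREADS */
    printf ("STOP\n");
    return 0;
}

Because all threads lock the same mutex, only one increment of globalX can be in flight at a time.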

2. Deadlock

A deadlock occurs when two or more threads are blocked and unable to proceed because each is waiting for another to release a resource it needs to continue execution. The classic illustration is a group of people sitting around a circular table with a single fork between each pair of neighbors, where each person needs two forks to eat. If every person grabs the fork on one side and then waits for a neighbor to release the other fork, no one can ever eat: a deadlock.

Deadlock Example

The following code illustrates the potential for deadlock in a bad locking hierarchy: work0 takes cs0 then cs1, while work1 takes cs1 then cs0. It is possible for one thread to lock both critical sections first and avoid deadlock, but concurrent programs that rely on a particular order of execution without enforcing that order will eventually fail.

#include <stdio.h>
#include <pthread.h>

pthread_mutex_t cs0, cs1;
int globalX = 0;
int globalY = 0;

void * work0 (void *arg)
{
    pthread_mutex_lock (&cs0);      /* takes cs0 first ... */
    globalX++;
    pthread_mutex_lock (&cs1);      /* ... then cs1 */
    globalY++;
    pthread_mutex_unlock (&cs1);
    pthread_mutex_unlock (&cs0);
    return 0;
}

void * work1 (void *arg)
{
    pthread_mutex_lock (&cs1);      /* takes cs1 first ... */
    globalX++;
    pthread_mutex_lock (&cs0);      /* ... then cs0: the opposite order, so the two
                                       threads can end up waiting on each other forever */
    globalY++;
    pthread_mutex_unlock (&cs0);
    pthread_mutex_unlock (&cs1);
    return 0;
}

int main (int argc, char *argv[])
{
    pthread_t h[2];
    int rc;
    pthread_mutex_init (&cs0, 0);
    pthread_mutex_init (&cs1, 0);
    printf ("START\n");
    rc = pthread_create (&h[0], 0, work0, 0);
    rc = pthread_create (&h[1], 0, work1, 0);
    printf ("TOTAL = (%d,%d)\n", globalX, globalY);   /* printed before the joins, so it may show partial results */
    rc = pthread_join (h[0], 0);
    rc = pthread_join (h[1], 0);
    printf ("STOP\n");
    pthread_mutex_destroy (&cs0);
    pthread_mutex_destroy (&cs1);
}

The following strategies can be implemented to prevent a deadlock:

·        Avoid circular wait: One of the main causes of deadlocks is circular wait, where two or more threads hold resources and wait for other threads to release resources they need. To avoid circular wait, you can ensure that resources are acquired and released in a specific order.

·        Use timeouts: If a thread is waiting for a resource for an extended period of time, it may be necessary to implement a timeout mechanism that allows the thread to exit and retry later.

·        Use resource ordering: Acquire resources in a fixed, agreed-upon order. For example, if multiple threads need to acquire two resources A and B, they can agree to always acquire A before B, which makes a circular wait impossible; a fix for the example above along these lines is sketched after this list.

·        Use deadlock detection: Some operating systems and programming languages provide deadlock detection mechanisms that can detect and resolve deadlocks automatically.

·        Use lock-free data structures: Lock-free data structures such as non-blocking algorithms and wait-free algorithms can be used to eliminate the need for locks and reduce the likelihood of deadlocks.

·        Keep critical sections short: To minimize the time that threads are holding resources, critical sections should be kept as short as possible, so that other threads are not blocked for an extended period of time.
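
As a sketch of the resource-ordering strategy, work1 from the example above can be rewritten to take the locks in the same order as work0; this is one possible fix under that convention, not the only one:

void * work1 (void *arg)
{
    pthread_mutex_lock (&cs0);      /* same order as work0: cs0 first ... */
    pthread_mutex_lock (&cs1);      /* ... then cs1, so a circular wait cannot form */
    globalX++;
    globalY++;
    pthread_mutex_unlock (&cs1);
    pthread_mutex_unlock (&cs0);
    return 0;
}

With both routines agreeing on the order cs0 then cs1, neither thread can hold one lock while waiting for the other.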

 

3. Stall

A stall occurs when a thread is blocked and unable to proceed for an extended period, typically because it is waiting on a long-running operation, an I/O request, other threads to finish their work, or a lock or other synchronization primitive that is held for too long. A stalled thread delays the overall execution of the program and can hurt the responsiveness and performance of the system.

Stall Example

In the following code, the main thread locks global_lock before creating the two worker threads and never unlocks it, so both workers stall waiting for a lock that is never released.

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>     /* exit() */
#include <unistd.h>     /* sleep() */
#define ITERS       (1024*1024)
double sum = 0;
pthread_mutex_t global_lock;
void * work( void *arg )
{
    int i;
    sleep( 1 );
    pthread_mutex_lock( & global_lock );   /* both workers block here: main holds the lock */
    for (i = 0; i < ITERS; ++i) {
        sum = sum + 1;
    }
    pthread_mutex_unlock( & global_lock );
    return 0;
}
void stall_test ( void )
{
    pthread_t thr1, thr2;
    pthread_attr_t attr;
    double arg1, arg2;
    printf( "START\n" );
    arg1 = 1; arg2 = 2;
    pthread_attr_init( & attr );
    pthread_attr_setstacksize( &attr, 1024 * 1024 );
    pthread_mutex_init( & global_lock, NULL );
    pthread_mutex_lock( & global_lock );   /* taken by the main thread and never released */
    pthread_create( &thr1, &attr, work, &arg1 );
    pthread_create( &thr2, &attr, work, &arg2 );
    /* wait for the error to happen */
    sleep( 15 );
    pthread_cancel( thr1 );
    pthread_cancel( thr2 );
    pthread_mutex_destroy( &global_lock );
    printf( "STOP\n" );
}
int main ( int argc, char *argv[] )
{
    stall_test();
    exit( 0 );
}

The following strategies can be implemented to prevent a stall:

·        Use asynchronous I/O: Asynchronous I/O allows threads to continue processing while waiting for I/O operations to complete, avoiding stalls that would otherwise occur.

·        Use thread pooling: Thread pooling involves creating a pool of worker threads that can be reused to handle multiple tasks, reducing the need for creating and destroying threads and minimizing the overhead of thread creation.

·        Use non-blocking algorithms: Non-blocking algorithms use synchronization primitives that do not block threads, allowing them to continue processing while waiting for resources to become available.

·        Use lock-free data structures: Lock-free data structures such as non-blocking algorithms and wait-free algorithms can be used to eliminate the need for locks and reduce the likelihood of stalls.

·        Use timeouts: If a thread has been waiting for a resource for an extended period of time, a timeout mechanism can let it give up and retry later instead of stalling indefinitely; a sketch using a timed lock follows this list.

·        Optimize performance: By optimizing the performance of your code and reducing the amount of time spent in critical sections, you can minimize the likelihood of stalls occurring.
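
As a sketch of the timeout strategy, the work routine from the stall example could use pthread_mutex_timedlock (available on POSIX systems that provide it) to bound how long it waits. The name work_with_timeout is illustrative and not part of the original example:

#include <pthread.h>
#include <stdio.h>
#include <time.h>
#include <errno.h>

extern pthread_mutex_t global_lock;   /* the mutex from the stall example above */
extern double sum;

void * work_with_timeout( void *arg )
{
    struct timespec deadline;
    clock_gettime( CLOCK_REALTIME, &deadline );
    deadline.tv_sec += 2;                                    /* wait at most about 2 seconds */
    if ( pthread_mutex_timedlock( &global_lock, &deadline ) != 0 ) {
        printf( "lock still busy, backing off\n" );          /* give up and retry later instead of stalling */
        return 0;
    }
    sum = sum + 1;
    pthread_mutex_unlock( &global_lock );
    return 0;
}

If the lock cannot be acquired before the deadline, the thread returns and can be rescheduled later rather than blocking indefinitely.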

 

