Peterson 的解决方案实现在 C 中不起作用

Question

我有以下代码试图理解彼得森的解决方案。当我运行这个循环的小值直到 9999 的实现时，输出正确显示为 0，但是当我用更高的循环值（如 9999999）测试它时，我得到的值接近 0，但不是 0，是否可能递增和递减线程都可以在 (*(a))++; 部分执行吗？为什么下面的实现不起作用？我的程序有错误吗？

#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <semaphore.h>

#define LOOP 9999999

int interest[2] = {0,0};
int turn = 0;
int loops[2] = {LOOP,LOOP};

void increment(int *a) {
  printf("Incrementing %p\n",a);
  for(int i=0;i<LOOP;i++) {
    interest[0] = 1;
    turn = 1;
    while(interest[1] == 1 && turn == 1);
    (*(a))++;
    loops[0]--;
    interest[0] = 0;
  }
}

void decrement(int *a) {
  printf("Decrementing %p\n",a);
  for(int i=0;i<LOOP;i++) {
    interest[1] = 1;
    turn = 0;
    while(interest[0] == 1 && turn == 0);
    (*(a))--;
    loops[1]--;
    interest[1] = 0;
  }
}

void print_val(int *a) {
  while(1) {
    getchar();
    printf("value at mem %d\niInc(%d) iDec(%d)\n",*a,loops[0],loops[1]);
  }
}

int main() {

  pthread_t t1, t2, t3;
  int *mem = malloc(sizeof(int));

  pthread_create(&t1, NULL, (void*)decrement, (void*)mem);
  pthread_create(&t2, NULL, (void*)increment, (void*)mem);
  pthread_create(&t3, NULL, (void*)print_val, (void*)mem);

  pthread_join(t1,NULL);  
  pthread_join(t2,NULL);  
  printf("operation complete\n");
  pthread_join(t3,NULL);  

  return 0;
}

输出：

$:~/Arena/sem $ gcc op.c -pthread && ./a.out
Incrementing 0xd16010
Decrementing 0xd16010
operation complete

value at mem -2
iInc(0) iDec(0)
^Z
[13]+  Stopped                 ./a.out
$:~/Arena/sem $ gcc op.c -pthread && ./a.out
Decrementing 0x2432010
Incrementing 0x2432010
operation complete

value at mem 16
iInc(0) iDec(0)
^Z
[14]+  Stopped                 ./a.out
meow:~/Arena/sem $

编辑：

我已经尝试过 Wrong implementation of Peterson's algorithm? 并且添加 volatile 没有帮助，我也没有像上一个线程中所说的那样使用 ++ 操作。
Implementation of Peterson's solution doesn't work properly and Peterson algorithm in Java? 不在 C 中，我的线程不是骗子。

查看答案：

大多数答案都暗示编译器可能会进行一些重新排序，我已经添加了程序集转储，有人能帮我理解这两个进程是如何在关键部分结束的吗？

自增函数

 for(int i=0;i<LOOP;i++) {
  25:   c7 45 f4 00 00 00 00    mov    DWORD PTR [ebp-0xc],0x0
  2c:   eb 50                            jmp    7e <increment+0x7e>
    interest[0] = 1;
    turn = 1;
  2e:   c7 05 00 00 00 00 01   mov    DWORD PTR ds:0x0,0x1
  35:   00 00 00 
    while(interest[1] == 1 && turn == 1);
  38:   c7 05 00 00 00 00 01   mov    DWORD PTR ds:0x0,0x1
  3f:   00 00 00 
    //while(turn == 1 && interest[1] == 1);
  42:   90                                 nop
  43:   a1 04 00 00 00             mov    eax,ds:0x4
  48:   83 f8 01                        cmp    eax,0x1
  4b:   75 0a                            jne    57 <increment+0x57>
  4d:   a1 00 00 00 00             mov    eax,ds:0x0
  52:   83 f8 01                        cmp    eax,0x1
  55:   74 ec                             je     43 <increment+0x43>
    (*(a->location))++;
  57:   8b 45 08                        mov    eax,DWORD PTR [ebp+0x8]
  5a:   8b 00                             mov    eax,DWORD PTR [eax]
  5c:   8b 10                             mov    edx,DWORD PTR [eax]
  5e:   83 c2 01                        add    edx,0x1
  61:   89 10                             mov    DWORD PTR [eax],edx
    loops[0]--;
  63:   a1 00 00 00 00              mov    eax,ds:0x0
  68:   83 e8 01                        sub    eax,0x1
  6b:   a3 00 00 00 00              mov    ds:0x0,eax
    interest[0] = 0;
  70:   c7 05 00 00 00 00 00    mov    DWORD PTR ds:0x0,0x0
  77:   00 00 00

递减函数

for(int i=0;i<LOOP;i++) {
  8f:   8b 45 08                       mov    eax,DWORD PTR [ebp+0x8]
  92:   8b 50 04                      mov    edx,DWORD PTR [eax+0x4]
  95:   8b 45 08                      mov    eax,DWORD PTR [ebp+0x8]
  98:   8b 00                           mov    eax,DWORD PTR [eax]
  9a:   89 54 24 08                 mov    DWORD PTR [esp+0x8],edx
  9e:   89 44 24 04                 mov    DWORD PTR [esp+0x4],eax
  a2:   c7 04 24 28 00 00 00  mov    DWORD PTR [esp],0x28
  a9:   e8 fc ff ff ff                 call   aa <decrement+0x21>
    interest[1] = 1;
  ae:   c7 45 f4 00 00 00 00   mov    DWORD PTR [ebp-0xc],0x0
  b5:   eb 4f                            jmp    106 <decrement+0x7d>
    turn = 0;
    while(interest[0] == 1 && turn == 0);
  b7:   c7 05 04 00 00 00 01  mov    DWORD PTR ds:0x4,0x1
  be:   00 00 00 
    //while(turn == 0 && interest[0] == 1);
  c1:   c7 05 00 00 00 00 00  mov    DWORD PTR ds:0x0,0x0
  c8:   00 00 00 
    (*(a->location))--;
  cb:   90                                nop
  cc:   a1 00 00 00 00            mov    eax,ds:0x0
  d1:   83 f8 01                      cmp    eax,0x1
  d4:   75 09                           jne    df <decrement+0x56>
  d6:   a1 00 00 00 00            mov    eax,ds:0x0
  db:   85 c0                           test   eax,eax
  dd:   74 ed                           je     cc <decrement+0x43>
    loops[1]--;
  df:   8b 45 08                       mov    eax,DWORD PTR [ebp+0x8]
  e2:   8b 00                            mov    eax,DWORD PTR [eax]
  e4:   8b 10                            mov    edx,DWORD PTR [eax]
  e6:   83 ea 01                       sub    edx,0x1
  e9:   89 10                            mov    DWORD PTR [eax],edx
    interest[1] = 0;
  eb:   a1 04 00 00 00             mov    eax,ds:0x4
  f0:   83 e8 01                        sub    eax,0x1
  f3:   a3 04 00 00 00              mov    ds:0x4,eax

Answer 1

我不确定 POSIX，但至少在标准 C11 中，您需要使用 atomic_int 来计算轮次和利息。不是 volatile.

我也可能弄错了（只知道 c++ 的规则）但是 print_val 函数中的非同步访问可能会导致未定义的行为，如果你在打印 operation complete 之前使用它。

Answer 2

你有竞争条件而reading/writing变量interest（内存位置interest[0]和interest[1]），turn 和 *a。即递增和递减线程都对这些变量进行非同步访问。 print_val 也访问 *a 而其他线程可能正在更新它。

您需要一种同步机制，例如互斥锁或对这些变量的原子访问，以便您的程序运行。

Answer 3

您似乎在使用 pthreads，它是 POSIX 标准的一部分。 POSIX 不允许您像那样 "roll your own" 同步原语 - 它在 4.11 Memory Synchronization 中有这样的说法：

Applications shall ensure that access to any memory location by more than one thread of control (threads or processes) is restricted such that no thread of control can read or modify a memory location while another thread of control may be modifying it. Such access is restricted using functions that synchronize thread execution and also synchronize memory with respect to other threads.

这样做的实际结果是允许编译器进行转换和重新排序，这可能会破坏您的假设。

例如，我的编译器将 while() 循环优化为无限循环，因为它可以看到 turn 在设置它和在循环中测试它之间无法合法修改，因为没有 POSIX 同步函数被调用。这显然不会发生在你身上，但其他类似的问题也是可能的。

在这种情况下，困扰您的可能不是编译器优化，而是未能遵守 CPU 内存模型。 This article 描述了 Peterson 锁如何要求内存栅栏在多处理器 x86 上是正确的。（POSIX 同步原语的实现将包括必要的内存栅栏，但您的代码不包括）。

就您的代码而言，interest[1] 的负载可以在写入 interest[0] 之前由 x86 重新排序，并且 interest[0] 的负载同样可以在写入 interest[0] 之前重新排序写入 interest[1]。这意味着两个线程可以同时看到interest[0] == 0和interest[1] == 0，并且都进入临界区

Peterson 的解决方案实现在 C 中不起作用

Peterson's solution implementation not working in C

c

concurrency

operating-system

pthreads

concurrent-programming

编辑：

查看答案：

自增函数

递减函数