C infinite loop breaks with code 141 when opening two threads after accepting connection

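In a simple C program, I open a new thread whenever an inbound connection on a TCP socket is accepted, so that client input is processed asynchronously. Accepting happens in an infinite loop. The client data is processed in the callback function passed to pthread_create. When the client sends data and disconnects, the socket is closed immediately.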

As long as I connect a single telnet client to the port the program listens on, the program stays ready to accept new connections. So far so good.

Now, when I connect two clients at the same time and give them some input one after the other, the main program exits with code 141.

Server console

cehrig@devbox /home/cehrig/projects/SystemMonitor/build $ ./sysmon 
Thu Apr  9 12:03:17 2015: Finished reading configuration file
Thu Apr  9 12:03:17 2015: Initializing server socket
Thu Apr  9 12:03:17 2015: Accepting connections...
Thu Apr  9 12:03:19 2015: Inbound connection from  127.0.0.1
Using Thread: 0
Thu Apr  9 12:03:19 2015: Accepting connections...
Thu Apr  9 12:03:22 2015: Inbound connection from  127.0.0.1
Using Thread: 1
Thu Apr  9 12:03:22 2015: Accepting connections...
Client msg: asdf
Client msg: asdfasdfdsfadsf
cehrig@devbox /home/cehrig/projects/SystemMonitor/build $ echo $?
141
cehrig@devbox /home/cehrig/projects/SystemMonitor/build $ 

Client console 1

cehrig@devbox /home/cehrig/projects/SystemMonitor/build $ telnet 127.0.0.1 50231
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.
asdfasdfdsfadsf
Connection closed by foreign host.
cehrig@devbox /home/cehrig/projects/SystemMonitor/build $ 

Client console 2

cehrig@devbox /home/cehrig/projects/SystemMonitor/build $ telnet 127.0.0.1 50231
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.
asdf
Message was: asdf
Connection closed by foreign host.
cehrig@devbox /home/cehrig/projects/SystemMonitor/build $ 

Here is a snippet of the function that accepts connections.

int connections = 0;
pthread_t * newthread = malloc(sizeof(pthread_t));

while(1) {
    _print(stdout, "messages.socketacceppt", cfg, 1);
    newsockfd = accept(sockfd, (struct sockaddr *) &cli_addr, &clilen);

    thrpass_st thr_pass;
    thr_pass.sockfd = newsockfd;
    thr_pass.cfg = cfg;

    _print(stdout, "messages.socketreceived", cfg, 0);
    fprintf(stdout, "%d.%d.%d.%d\n",
        cli_addr.sin_addr.s_addr&0xFF,
        (cli_addr.sin_addr.s_addr&0xFF00)>>8,
        (cli_addr.sin_addr.s_addr&0xFF0000)>>16,
        (cli_addr.sin_addr.s_addr&0xFF000000)>>24);

    printf("Using Thread: %d\n", connections);
    pthread_create(newthread+connections, NULL, &read_socket, (void *) &thr_pass);

    newthread = (pthread_t *) realloc(newthread, (++connections+1)*sizeof(pthread_t));
}

Here is the callback function / entry point of each new thread.

void * read_socket(void * args)
{
    thrpass_st * _args = (thrpass_st *) malloc(sizeof(thrpass_st));
    _args = (thrpass_st *) args;

    int n;
    char * _buf = (char *) malloc(512*sizeof(char));
    char * _cor = (char *) malloc(512*sizeof(char));
    char * _out = _cor;

    bzero(_buf, 512);
    bzero(_cor, 512);


    size_t bread = 0;
    do {
        if((n = read(_args->sockfd, _buf+bread, 512-bread)) < 0) {
            _print(stderr, "messages.socketreadfail", _args->cfg, 1);
            _exit(0);
        }
        bread+=n;
    } while(strchr(_buf, '\n') == NULL && bread <= 512);


    int x = 0;
    while(*_buf != '\n' && x++ <= 512) {
        *_cor++ = *_buf++;
    }

    printf("Client msg: %s\n", _out);
    fflush(stdout);

    FILE * sstream = fdopen(_args->sockfd, "w+");
    fprintf(sstream, "Message was: %s\n", _out);
    fflush(sstream);
    shutdown(_args->sockfd, 2);
}

I think the problem is somewhere near the end of this function, because the second telnet client never receives the "Message was:" line.

Any help is appreciated! Cheers.

This is caused by undefined behavior (UB) in your thread function, and it stems from these lines:

thrpass_st * _args = (thrpass_st *) malloc(sizeof(thrpass_st));
_args = (thrpass_st *) args;

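The first line allocates memory, but you then overwrite the pointer to that memory with another pointer. That second pointer refers to a local variable inside another function, and the variable goes out of scope as soon as the accept loop iterates. The allocation from the first line is leaked as well.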
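As for the status 141 itself: shells such as bash report 128 plus the signal number when a process is killed by a signal, and signal 13 on Linux is SIGPIPE. One plausible chain of events (an inference from the question's logs, not something confirmed there) is that both threads end up using the sockfd stored most recently in the shared structure, so the thread that finishes second writes to a socket the first one has already shut down. The sketch below assumes Linux; the helper names shell_status_for_signal and errno_after_shutdown are made up for illustration. It shows that sending on a shut-down socket fails with EPIPE. Without MSG_NOSIGNAL the same send would raise SIGPIPE, whose default action terminates the process.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: exit status a shell such as bash reports for a
 * process that was terminated by signal signo. */
int shell_status_for_signal(int signo)
{
    assert(signo > 0);
    return 128 + signo;
}

/* Hypothetical helper: returns the errno produced by sending on a socket
 * that has already been shut down, or -1 if the setup fails or the send
 * unexpectedly succeeds. */
int errno_after_shutdown(void)
{
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    /* Same kind of call the thread function makes: shutdown(fd, 2). */
    shutdown(sv[0], SHUT_RDWR);

    /* MSG_NOSIGNAL suppresses SIGPIPE so the error can be inspected. */
    int result = -1;
    if (send(sv[0], "x", 1, MSG_NOSIGNAL) < 0)
        result = errno;

    close(sv[0]);
    close(sv[1]);
    return result;
}
```

On Linux, shell_status_for_signal(SIGPIPE) evaluates to 141, which matches the status shown in the server console.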

The simplest solution is to copy the structure:

*_args = *(thrpass_st *) args;

In fact, you don't need a pointer at all in this case; just do e.g.

thrpass_st _args = *(thrpass_st *) args;

That way you also avoid a memory leak from forgetting to free the pointer at the end of the thread.


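Also note that there is a race condition here. If two clients connect in very quick succession, there is a risk that two threads are handed the same structure data as their argument, because the accept loop may overwrite the structure before the first thread has copied it.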

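The correct solution is to allocate memory for the argument structure inside the loop and pass that pointer to the thread function. And of course, don't forget to free the memory at the end of the thread function.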
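A minimal sketch of that pattern, under some assumptions: conn_args is a hypothetical stand-in for the question's thrpass_st, the "descriptor" is just a number rather than a real socket, and spawn_workers replaces the accept loop so the example runs without a network. Each thread gets its own heap block and frees it when done.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for the question's thrpass_st. */
typedef struct {
    int sockfd;   /* per-connection descriptor (just a number here) */
    int *seen;    /* where the worker records the descriptor it got */
} conn_args;

static void *worker(void *arg)
{
    conn_args *args = arg;       /* the thread now owns this allocation */
    *args->seen = args->sockfd;  /* real code would read()/write() here */
    free(args);                  /* released by the thread that used it */
    return NULL;
}

/* Starts n workers, each with its own heap-allocated argument block.
 * seen[] must hold n ints. Returns 0 on success, -1 on failure. */
int spawn_workers(int n, int *seen)
{
    assert(n >= 0 && seen != NULL);
    pthread_t *tids = malloc((n > 0 ? n : 1) * sizeof *tids);
    if (tids == NULL)
        return -1;

    int started = 0, rc = 0;
    for (int i = 0; i < n; i++) {
        conn_args *args = malloc(sizeof *args);  /* fresh block per connection */
        if (args == NULL) { rc = -1; break; }
        args->sockfd = 100 + i;
        args->seen = &seen[i];
        if (pthread_create(&tids[i], NULL, worker, args) != 0) {
            free(args);  /* no thread took ownership */
            rc = -1;
            break;
        }
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    free(tids);
    return rc;
}
```

Because every thread reads only its own block, a later loop iteration cannot change what an earlier thread sees. A long-running server that never joins its threads would probably want pthread_detach (or a detached thread attribute) instead of the join loop used here.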