Segmentation fault in a multithreaded clients/server prototype

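I am developing a prototype of an algorithm that works on a group of nodes, where each node keeps a connection to every other node and sends messages to them.

To send a message, a node first sends a fixed-size header and then the data.

After a lot of work, I concluded that the problem is in the multi-threaded part of the code, so I reduced it to this PoC.

The prototype has one server and a number of clients that is fixed at compile time.

The server listens to each client on a separate thread. Since this is only a prototype, the received data is discarded.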

Each client sends data to the server in two steps, header then body, using send_message.
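The two-step framing can be sketched as follows. This is a minimal illustration, not the code from the listings below: `write_all` and `send_framed` are hypothetical helper names, and the loop assumes that `write` may accept fewer bytes than requested, which POSIX allows on sockets.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper (not in the original code): keep calling write()
// until all n bytes have been accepted, because a single write() on a
// socket may transfer fewer bytes than requested.
bool write_all(int fd, const void *buf, size_t n) {
    const uint8_t *p = static_cast<const uint8_t *>(buf);
    while (n > 0) {
        ssize_t sent = write(fd, p, n);
        if (sent < 0) {
            if (errno == EINTR) continue;  // interrupted by a signal, retry
            return false;                  // real error
        }
        p += sent;                         // advance by bytes actually written
        n -= static_cast<size_t>(sent);
    }
    return true;
}

// Two-step framing: the fixed-size header first, then the body.
bool send_framed(int fd, const void *header, size_t header_len,
                 const void *body, size_t body_len) {
    return write_all(fd, header, header_len) &&
           write_all(fd, body, body_len);
}
```

The receiver then reads exactly `header_len` bytes before it interprets anything, and only afterwards reads the body.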

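By the way, the algorithm should generate data from each client to the server at a specific bandwidth for benchmarking. By default, each client sends 100 Mb/s to the server.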
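The pacing arithmetic behind that target rate can be sketched on its own. The function names `expected_seconds` and `pacing_delay_us` are hypothetical; the sketch assumes 1 Mb = 1024 * 1024 bits, matching the `mega` constant in func.h, and mirrors the comparison that `measure_throughput` performs in network.h.

```cpp
#include <cassert>
#include <cstdint>

// Seconds that sending `bytes` should take at `mbps` megabits per second,
// with 1 Mb = 1024 * 1024 bits (same convention as the `mega` constant).
double expected_seconds(uint64_t bytes, unsigned mbps) {
    const double bits_per_second = static_cast<double>(mbps) * 1024.0 * 1024.0;
    return (static_cast<double>(bytes) * 8.0) / bits_per_second;
}

// Microseconds the sender should sleep so that its average rate does not
// exceed the target. Returns 0 when the sender is already behind schedule.
uint64_t pacing_delay_us(uint64_t bytes, unsigned mbps, double elapsed_seconds) {
    const double expected = expected_seconds(bytes, mbps);
    if (expected <= elapsed_seconds) return 0;
    return static_cast<uint64_t>((expected - elapsed_seconds) * 1e6);
}
```

For example, at 100 Mb/s a client may send 13107200 bytes per second, so after sending that much in half a second it would sleep for the remaining half second.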

The code consists of:

client.cpp:

#include "network.h"

int main(int argc, char *argv[]){

    int sockfd;

    std::cout << "HEADER: " << HEADER << std::endl;

    // The hostname is mandatory: without it argv[1] is NULL
    if (argc < 2)
        usage (argv[0]);

    // Read the server's IP
    struct hostent *server = gethostbyname(argv[1]);

    // Read the arguments from console
    get_client_arguments (argc, argv);

    // Connect to the server
    sockfd = connect_to_server(server);

    sleep (1);

    // Start sending (Start the experiment)
    multi_unicaster (sockfd);

    close (sockfd);

    return 0;
}

/**********************************************/

void usage (char *argv){

    std::cout << "usage: " << argv << " hostname [-p port] [-t throughput]" << std::endl;
    exit(0);
}

/**********************************************/

server.cpp:

#include "network.h"

/**********************************************/

int main(int argc, char *argv[]){

    std::cout << "HEADER: " << HEADER << std::endl;
    get_server_arguments(argc, argv);

    // Create several threads to listen for incoming connections
    // and read data from several clients simultaneously
    start_listening_threads ();

    return 0;
}

/**********************************************/

void usage (char *argv){

    std::cout << "usage: " << argv << " [-p port]" << std::endl;
    exit(0);
}

/**********************************************/

message.h:

#ifndef __MESSAGE__
#define __MESSAGE__

#include <vector>
#include <queue>
#include <string>
#include <cstring>
#include <cstdio>
#include <algorithm>
#include <stdexcept>
#include <iostream>
#include <cstdlib>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <netdb.h>
#include "func.h"

/***************************************/

// The header structure
typedef struct {

    // Message ID
    unsigned mID;

    // IP of sender
    struct in_addr sender;

    // Message size
    size_t datasize;

}header_type;

/***************************************/

#define HEADER sizeof(header_type)

/***************************************/

/*
 * Message Class
 */
class message
{
    private:
        // Message header
        header_type * header;

        // Message text
        byte * text;

    public:

        // Message Accessors, mutators and related functions
        byte * get_text();
        header_type * get_header();
        void set_datasize(size_t);
        size_t get_datasize();
        struct in_addr get_sender();
        void set_ID(unsigned);
        unsigned get_ID();
        void print();

        message(int,struct in_addr,size_t);
        message(header_type *, size_t);
        ~message();

        message & operator = (const message&);
        message(const message&);
};
#endif


/***************************************/

extern std::queue <message * > sending_messages_queue;

/***************************************/
/*
 * Constructor used for initializing complete messages
 * 
 */

message::message(int ID,struct in_addr IP,size_t d_s){

    header = (header_type *) malloc (HEADER);

    header -> mID = ID;
    header -> datasize = d_s;
    header -> sender.s_addr = IP.s_addr;

    if (d_s > 0){
        text = (byte *) malloc (d_s);
        memset (text, '.', d_s);
    }
    else text = NULL;
}

/***************************************/
/*
 * Copy constructor (Initialize)
 * 
 */

message::message(const message& other){

    header = (header_type *) malloc (HEADER);

    std::memcpy (header, other.header, HEADER);

    if (header -> datasize > 0){
        text = (byte *) malloc (header -> datasize);
        std::memcpy (text, other.text, header -> datasize);
    } else
        text = NULL;
}

/***************************************/
/*
 * destructor
 *
 * Message destructor
 */

message::~message() {
    // Release both buffers allocated with malloc in the constructors
    if (text != NULL){
        free(text);
        text = NULL;
    }
    free(header);
    header = NULL;
}

/***************************************/
/*
 * Assignment operator (Update)
 * 
 */

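message & message::operator = (const message& other) {

    // Guard against self-assignment, then release the old buffers
    // so that assigning over an existing message does not leak them
    if (this == &other)
        return *this;

    free(header);
    free(text);

    header = (header_type *) malloc (HEADER);

    std::memcpy (header, other.header, HEADER);

    if (header -> datasize >0){
        text = (byte *) malloc (header -> datasize);
        std::memcpy (text, other.text, header -> datasize);
    } else
        text = NULL;

    return *this;
}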

/***************************************/

/*
* another constructor
*
*/

message::message(header_type *h, size_t s){

    header = (header_type *) malloc (HEADER);

    std::memcpy (header, h, HEADER);

    if (s > 0){
        text = (byte *) malloc (s);
        std::memset (text, '.', s);
    } else
        text = NULL;
}

/***************************************/

/*
* get_header
*
* Header accessor
*/

header_type * message::get_header(){
    return header;
}

/***************************************/

/*
* get_text
*
* Text accessor
*/

byte * message::get_text(){
    return text;
}

/***************************************/

/*
* get_sender
*
* Sender IP accessor
*/

struct in_addr message::get_sender(){
    return header -> sender;
}

/***************************************/

/*
* get_datasize
*
* datasize accessor
*/

size_t message::get_datasize(){
    return header -> datasize;
}

/***************************************/

/*
* set_ID
*
* ID mutator
*/
void message::set_ID(unsigned ID) {
    header -> mID = ID;
}

/***************************************/

/*
* get_ID
*
* ID Accessor
*/
unsigned message::get_ID() {
    return header -> mID;
}

/***************************************/

/*
* set_datasize
*
* datasize mutator
*/
void message::set_datasize(size_t d) {
    header -> datasize = d;
}

/***************************************/

/*
* print
*
*/
void message::print() {
    std::cout << header -> mID << "," << inet_ntoa (header -> sender)  << "," << header -> datasize;

    std::cout << std::endl;
}

/***************************************/

func.h:

// Some support functions

using namespace std;

/***********************************************************************/

#include <stdio.h>
#include <sys/types.h>
#include <cstdlib>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>
#include <unistd.h>
#include <strings.h>
#include <cstring>
#include <iostream>
#include <string>
#include <sys/time.h>
#include <math.h>

/***********************************************************************/

typedef uint8_t byte;
const unsigned long kilo = 1024;
const unsigned long mega = 1024 * kilo;
const unsigned long giga = 1024 * mega;
const unsigned MESSAGE_SIZE = 10 * kilo;

/***********************************************************************/

int port = 4444;
int throughput = 100;
int newsockfd [CLIENTS];

/***********************************************************************/

void usage (char *argv);

/***********************************************************************/

/* 
 * 
 * subtract_time
 * 
 * Subtracts time to handle negative values
 * 
 */

struct timeval subtract_time (struct timeval * left_operand, struct timeval * right_operand){

    // Default to zero so the result is defined when left < right
    struct timeval res = {0, 0};

    if (left_operand -> tv_sec >= right_operand -> tv_sec){
        if (left_operand -> tv_usec >= right_operand -> tv_usec){

            res.tv_sec = left_operand -> tv_sec - right_operand -> tv_sec;
            res.tv_usec = left_operand -> tv_usec - right_operand -> tv_usec;
        }else{
            res.tv_sec = left_operand -> tv_sec - right_operand -> tv_sec - 1;
            res.tv_usec = 1000000 + left_operand -> tv_usec - right_operand -> tv_usec;
        }
    }

    return res;
}

/***********************************************************************/

void get_server_arguments (int argc, char *argv[]){

    int i = 1;
    while (i < argc){

        if (strcmp (argv [i], "-p") ==0){

            port = atoi (argv [i + 1]);
            i+= 2;
        }
        else usage (argv [0]);
    }
}

/***********************************************************************/

void get_client_arguments (int argc, char *argv[]){

    int i = 2;

    while (i < argc){

        if (strcmp (argv [i], "-p") ==0){

            port = atoi (argv [i + 1]);
            i+= 2;
        }

        else if (strcmp (argv [i], "-t") ==0){

            throughput = atoi (argv [i + 1]);
            i+= 2;
        }
        else usage (argv [0]);
    }
}

/***********************************************************************/

void print_bandwidth(unsigned long long sz){

    double size;

    if (sz > giga){

        // Round result and show two decimal values
        size = round (sz / (giga /1000));
        std::cout << size /1000 << " Gb/s"<< std::endl;
    }
    else if (sz > mega){

        // Round result and show two decimal values
        size = round (sz / (mega /100));
        std::cout << size /100 << " Mb/s"<< std::endl;
    }
    else if (sz > kilo){

        // Round result and show one decimal value
        size = round (sz /( kilo /10));
        std::cout << size /10 << " Kb/s"<< std::endl;
    }
    else{
        std::cout << sz << " b/s"<< std::endl;
    }
}

/***********************************************************************/

network.h:

// Network related functions

#include "message.h"
#include <netinet/tcp.h>
#include <arpa/inet.h>

/***********************************************************************/

void read_message (int);
int accept_connection (int);
void * listening (void *);

/***********************************************************************/

unsigned burst_size;
bool NAGLE = false;
struct sockaddr_in serv_addr;
struct timeval recent_elapsed_time_val {0,0};
struct timeval start_tv;
int initial_listening_socket;
unsigned connections = 0;

/***********************************************************************/

void listen_for_connections (){

    // Server: Listens for connections from clients
    struct sockaddr_in serv_addr;

    initial_listening_socket = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
    if (initial_listening_socket < 0)
        std::cerr << "ERROR opening socket";
    bzero((char *) &serv_addr, sizeof(serv_addr));
    serv_addr.sin_family = AF_INET;
    serv_addr.sin_port = htons(port);
    serv_addr.sin_addr.s_addr = INADDR_ANY;

    if (bind(initial_listening_socket, (struct sockaddr *) &serv_addr,sizeof(serv_addr)) < 0)
        std::cerr << "ERROR on binding"<<std::endl;

    listen(initial_listening_socket,CLIENTS);
}

/***********************************************************************/

int accept_connection (){

    // Server: Accepts connections from client
    int newsockfd;
    socklen_t clilen;
    struct sockaddr_in cli_addr;

    clilen = sizeof(cli_addr);

    std::cout << "waiting for new connection .." << std::endl; 
    newsockfd = accept(initial_listening_socket, (struct sockaddr *) &cli_addr, &clilen);

    std::cout << "received new connection .." << std::endl;

    connections ++;

    if (connections == CLIENTS)
        close(initial_listening_socket);

    return newsockfd;
}

/***********************************************************************/

int connect_to_server(struct hostent *server){

    // Client: Connects to server
    int sockfd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
    if (sockfd < 0)
        std::cerr << "ERROR opening socket";

    if (server == NULL){
        std::cerr << "ERROR, no such host"<< std::endl;
        exit(0);
    }

    int flag;
    if (NAGLE) flag = 0;
    else flag = 1;

    if (setsockopt (sockfd, IPPROTO_TCP, TCP_NODELAY, (char *) &flag, sizeof(int)) ==-1){
        perror ("ERROR on setting TCP_NODELAY!");
        std::terminate ();
    }

    bzero((char *) &serv_addr, sizeof(serv_addr));
    serv_addr.sin_family = AF_INET;
    bcopy((char *)server->h_addr,(char *)&serv_addr.sin_addr.s_addr,server->h_length);
    serv_addr.sin_port = htons(port);

    if (connect(sockfd,(struct sockaddr *)&serv_addr,sizeof(serv_addr)) < 0)
        std::cerr <<"ERROR connecting"<< std::endl;

    return sockfd;
}


/***********************************************************************/

void start_listening_threads (){

    // Server: Creates listening threads
    pthread_t listening_thread [CLIENTS];

    listen_for_connections ();

    for(unsigned i=0;i< CLIENTS;i++){

        unsigned * arg = (unsigned *) malloc(sizeof(*arg));

        if ( arg == NULL ) {
            fprintf(stderr, "Couldn't allocate memory for thread arg.\n");
            exit(EXIT_FAILURE);
        }

        *arg = i;

        pthread_create(&listening_thread[i], NULL,(void* (*)(void*))&listening, arg);
    }

    for(unsigned i=0;i< CLIENTS;i++){

        pthread_join (listening_thread[i], NULL);
    }
}

/***********************************************************************/

void * listening (void *a){

    // Server: Start listening after establishing a connection with the client
    // The index was allocated by start_listening_threads: copy it, then free it
    unsigned i = *((unsigned *) a);
    free (a);

    newsockfd [i] = accept_connection ();

    while (1){

        read_message (newsockfd [i]);
    }

    return NULL;
}

/***********************************************************************/

void measure_throughput(unsigned counter){

    // Client: Tracks throughput and keeps on the wanted threshold
    struct timeval current_time;

    // Get the current time in order to track the throughput
    gettimeofday (&current_time, NULL);
    struct timeval elapsed_time_val = subtract_time (&current_time, &start_tv);

    double elapsed = elapsed_time_val.tv_sec+ (elapsed_time_val.tv_usec/1000000.0);

    unsigned long long sent_bytes = counter * (MESSAGE_SIZE + HEADER);
    if (elapsed > 0){

        // Calculate the expected time to send sent_bytes
        double theoretical_time = (sent_bytes) / ((throughput * mega) / 8.0);

        // Compare the expected time with the real elapsed time
        if (theoretical_time > elapsed){
            __useconds_t additional_time = (theoretical_time - elapsed) * 1000000;
            usleep (additional_time);
        }
    }

    if (elapsed_time_val.tv_sec > recent_elapsed_time_val.tv_sec){
        unsigned sending_throughput = (unsigned)((sent_bytes * 8) / (mega * elapsed * 1.0));
        std::cout << "throughput: " << sending_throughput << std::endl;
        recent_elapsed_time_val = elapsed_time_val;
    }
}

/***********************************************************************/

void send_message (message * m, int sockfd){

    // Client: Send on message header then data.
    if (write (sockfd, m -> get_header(), HEADER) == -1){

        perror ("Error exporting Header to socket");
        close (sockfd);
        exit (1);
    }

    if (write (sockfd, m -> get_text (), MESSAGE_SIZE) == -1){

        perror ("Error exporting data to socket");
        close (sockfd);
        exit (1);
    }
}

/***********************************************************************/

void read_message (int sockfd){

    // Server: Listens for one message header then text.
    int receivedPackage = 0;
    int pos = 0;
    int expected_bytes = HEADER;
    header_type header;

    while (expected_bytes >0){

        if ((receivedPackage = read(sockfd, &header + pos, expected_bytes)) < 0){
            perror ("ERROR importing message header from socket!");
            std::terminate();
        }
        pos += receivedPackage;
        expected_bytes -= receivedPackage;
    }

    if (header.datasize != MESSAGE_SIZE){
        // Use a stack object so the diagnostic message is not leaked
        message m (&header, (size_t)0);
        m.print ();
    }

    pos = 0;
    receivedPackage = 0;
    expected_bytes = MESSAGE_SIZE;
    byte text [MESSAGE_SIZE];

    while (expected_bytes >0){

        if ((receivedPackage = read(sockfd, text + pos, expected_bytes)) < 0){
            perror ("ERROR importing message body from socket!");
            std::terminate();
        }
        pos += receivedPackage;
        expected_bytes -= receivedPackage;
    }
}

/***********************************************************************/

void multi_unicaster (int sockfd){

    unsigned counter=0;

    gettimeofday (&start_tv, NULL);

    while (1){

        counter ++;

        struct in_addr IP;

        inet_aton ("127.0.0.1",&IP);

        message * m = new message (counter, IP, MESSAGE_SIZE);

        if (m -> get_datasize () != MESSAGE_SIZE)
            m-> print ();

        send_message (m,sockfd);

        delete m;

        measure_throughput(counter);
    }
}

/***********************************************************************/

Makefile:

all: server client

FLAGS=-Wall -Wextra -Werror -pedantic -pthread $(ARGS) -std=c++11 -g -rdynamic -lpthread
CXXFLAGS=$(DEF) $(FLAGS)

output/%.o: %.cpp
    g++ $(CXXFLAGS) -c -o $@ $<

client: output/client.o
    g++ $(FLAGS) -o $@ $^

server: output/server.o
    g++ $(FLAGS) -o $@ $^

clean:
    rm -rf output/* *~ server client

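When I run the code on loopback everything works fine, but when I test it on different servers (actually, in different data centers), it sometimes runs without any problem and sometimes does not.

If the received data is correct, the received header should be correct as well. To verify that the data arrived without errors, the datasize received in the header should have the expected value (i.e. 10 * kilo); otherwise the data is garbled.

This check is done in the read_message function in network.h, which is where I guess the problem is.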
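One detail worth checking in that function: `&header + pos` is arithmetic on a `header_type *`, so it advances by `pos * sizeof(header_type)` bytes rather than by `pos` bytes. That only matters when `read` returns part of a header, which is probably rarer on loopback than across data centers. A minimal sketch of a read loop that advances by bytes and also stops on end-of-file, assuming a hypothetical helper named `read_exact` that is not part of the listings above:

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: read exactly n bytes into buf.
// Returns true on success, false on error or if the peer closed early.
bool read_exact(int fd, void *buf, size_t n) {
    // Byte pointer: p + k moves k bytes, whatever type buf really has.
    uint8_t *p = static_cast<uint8_t *>(buf);
    while (n > 0) {
        ssize_t got = read(fd, p, n);
        if (got == 0) return false;        // end of stream before n bytes
        if (got < 0) {
            if (errno == EINTR) continue;  // interrupted by a signal, retry
            return false;                  // real error
        }
        p += got;                          // advance by bytes actually read
        n -= static_cast<size_t>(got);
    }
    return true;
}
```

With such a helper, the header would be read as `read_exact(sockfd, &header, HEADER)` and the body as `read_exact(sockfd, text, MESSAGE_SIZE)`.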

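I have provided all of the code in case somebody needs to test it.

I am posting the answer I gave in the comments, in case it helps anyone who runs into a similar problem.

The message header type in the code looks like this: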

typedef struct {
    // Message ID
    unsigned mID;
    // IP of sender
    struct in_addr sender;
    // Message size
    size_t datasize;
} header_type;

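This message header type is non-portable and architecture-dependent.

On some architectures the unsigned size_t may be 32 bits wide; on others it may be 64 bits or 16 bits...

In addition, struct in_addr is implementation-specific, so the message header may look different on different operating systems (which OS is the server running? which version?).

Unless all network nodes (server and clients) run on the same OS and architecture, you need a byte stream and fixed-width types (i.e. uint64_t datasize, uint8_t client_addr[16]).

Another related issue is the architecture (vs. network) byte order of the message size.

Different architectures have different endianness, so it is important to make sure the message length is stored and read correctly.

I would consider making the message size a union, or limiting the message size to 32 bits (uint32_t), so that I can use the POSIX network byte order API.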

typedef struct {
    // Network byte ordered Message ID
    uint32_t nb_mID;
    // IP of sender as either an IPv4 or an IPv6 string (up to 39 characters)
    uint8_t sender[39];
    // IPv4 vs. IPv6 data identifier
    uint8_t sender_type;
    // Network byte ordered Message size
    uint32_t nb_datasize;
} header_type;
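To make the layout independent of compiler padding as well as byte order, the fields can be copied one by one into a byte buffer. The sketch below is only an illustration under assumptions: `host_header`, `pack_header`, `unpack_header` and `WIRE_HEADER_SIZE` are hypothetical names, and the 48-byte layout (4 + 39 + 1 + 4) is simply the field order of the struct above.

```cpp
#include <arpa/inet.h>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Same fields as the suggested header, kept in host byte order in memory.
struct host_header {
    uint32_t mID;
    uint8_t sender[39];
    uint8_t sender_type;
    uint32_t datasize;
};

// Bytes on the wire: 4 (ID) + 39 (sender) + 1 (type) + 4 (size).
const size_t WIRE_HEADER_SIZE = 48;

// Write the header into `out` (at least WIRE_HEADER_SIZE bytes),
// converting the 32-bit fields to network byte order with htonl.
void pack_header(const host_header &h, uint8_t *out) {
    const uint32_t id = htonl(h.mID);
    const uint32_t size = htonl(h.datasize);
    std::memcpy(out, &id, 4);
    std::memcpy(out + 4, h.sender, 39);
    out[43] = h.sender_type;
    std::memcpy(out + 44, &size, 4);
}

// Inverse of pack_header: rebuild the host-order header from wire bytes.
host_header unpack_header(const uint8_t *in) {
    host_header h;
    uint32_t id = 0;
    uint32_t size = 0;
    std::memcpy(&id, in, 4);
    std::memcpy(h.sender, in + 4, 39);
    h.sender_type = in[43];
    std::memcpy(&size, in + 44, 4);
    h.mID = ntohl(id);
    h.datasize = ntohl(size);
    return h;
}
```

The sender writes the 48 packed bytes instead of the raw struct, and the receiver reads 48 bytes and calls `unpack_header`, so both sides agree on the layout regardless of `sizeof(size_t)` or endianness.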

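Side note

As a side note, it should be mentioned that a thread-per-connection design causes slowdowns due to excessive context switching and may make the server more vulnerable to DoS attacks.

In general, running more threads (or processes) than there are CPU cores leads to excessive context switching.

To some extent this is often acceptable because of other considerations, but a thread per connection will exhaust system resources very quickly, and the system can easily reach the point where it spends more time on context switches than on performing tasks.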
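One common alternative, which is not something prescribed above, is to let a single thread watch many connections with `poll`. The sketch below only illustrates the idea: `drain_ready` is a hypothetical helper that discards whatever data is available, much as the prototype server discards its input.

```cpp
#include <cassert>
#include <cstddef>
#include <poll.h>
#include <sys/types.h>
#include <unistd.h>
#include <vector>

// Hypothetical helper: wait up to timeout_ms for any descriptor in `fds`
// to become readable, read and discard what is available, and return the
// number of bytes read. Descriptors that hit EOF or an error are closed
// and removed from `fds`.
size_t drain_ready(std::vector<int> &fds, int timeout_ms) {
    if (fds.empty()) return 0;

    std::vector<pollfd> pfds;
    for (int fd : fds) pfds.push_back(pollfd{fd, POLLIN, 0});

    const int ready = poll(pfds.data(), pfds.size(), timeout_ms);
    if (ready <= 0) return 0;  // timeout or poll error: nothing was read

    size_t total = 0;
    std::vector<int> still_open;
    char buf[4096];
    for (const pollfd &p : pfds) {
        bool keep = true;
        if (p.revents & (POLLIN | POLLHUP | POLLERR)) {
            const ssize_t got = read(p.fd, buf, sizeof buf);
            if (got > 0) {
                total += static_cast<size_t>(got);
            } else {           // EOF (0) or error (-1): drop this connection
                close(p.fd);
                keep = false;
            }
        }
        if (keep) still_open.push_back(p.fd);
    }
    fds.swap(still_open);
    return total;
}
```

A server loop would call this repeatedly on the accepted sockets, so the number of threads no longer grows with the number of clients.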