Segmentation Fault in Pthread Loop
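I have been doing a lot of numerical analysis for work lately, mostly on small amounts of data and relatively simple concepts. In preparation for an upcoming project, I have started looking at more complex systems where the amount of computation grows exponentially. My run times have gone from tens of seconds to tens of minutes. To speed things up, I decided to learn how to write code using pthreads.
To that end, I have been working on a program that fills a matrix using both a serial approach and pthreads. I wrote the program to do each of these n times and take the average time per run. When I run the program with only one pthread_t, it works as expected. When I add a second thread, I get a "Segmentation fault" error.
My code follows: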
fill.h
#ifndef FILL_H_
#define FILL_H_
#include <pthread.h>  // Allows access to pthreads
#include <sys/time.h> // Allows the ability to pull the system time
#include <time.h>     // Provides time(), used to seed rand()
#include <stdio.h>    // Allows input and output
#include <stdlib.h>   // Allows for several fundamental calls
#define NUM_THREADS 2
#define MAT_DIM 50
#define RUNS 1
pthread_t threads[NUM_THREADS];
pthread_mutex_t mutexmat;
typedef struct{
    int id;
    int column;
    int* matrix[NUM_THREADS];
}WORKER;
#endif
fill.c
/* This routine will fill an array both in serial and parallel with random
 * numbers. It will also display the real time it took to accomplish each task */
/* C includes */
#include "fill.h"
/* Fills a matrix */
void fill(int start, int stop, int** matrix)
{
    int i, j;
    for(i = start; i < stop; i++)
    {
        for(j = 0; j < MAT_DIM; j++)
            matrix[i][j] = rand() % 10;
    }
}
void* work(void* threadarg)
{
    /* Creates a pointer to a worker type variable */
    WORKER *this_worker;
    /* Points this_worker at what thread arg is pointing to */
    this_worker = (WORKER*) threadarg;
    /* Calculates my stopping point for this thread */
    int stop = this_worker->column + (MAT_DIM / NUM_THREADS);
    /* Used to drive matrix */
    int i, j;
    /* Fills portion of Matrix */
    for(i = this_worker->column; i < stop; i++)
    {
        /* Prints the column that matrix is working on */
        printf("Worker %d working on column %d\n", this_worker->id, i);
        for(j = 0; j < MAT_DIM; j++)
        {
            this_worker->matrix[i][j] = rand() % 10;
        }
    }
    /* Signals thread is done */
    printf("Thread %d done.\n", this_worker->id);
    /* Terminates thread */
    pthread_exit(NULL);
}
int main()
{
    /* Seeding rand */
    srand(time(NULL));
    /* These will be used for loops */
    int i, j, r, t;
    /* Creating my matrices */
    int* matrix_serial[MAT_DIM];
    int* matrix_thread[MAT_DIM];
    /* creating timeval variables */
    struct timeval t0, t1;
    /* Beginning serial solution */
    /* Creating timer for serial solution */
    gettimeofday(&t0, 0);
    /* Creating serial matrix */
    for(i = 0; i < MAT_DIM; i++)
        matrix_serial[i] = (int*)malloc(MAT_DIM * sizeof(int));
    /* Filling the matrix */
    for(r = 0; r < RUNS; r++)
        fill(0, MAT_DIM, matrix_serial);
    /* Calculating how long it took to run */
    gettimeofday(&t1, 0);
    unsigned long long int delta_t = (t1.tv_sec * 1000000 + t1.tv_usec)
                                   - (t0.tv_sec * 1000000 + t0.tv_usec);
    double t_dbl = (double)delta_t/1000000.0;
    double dt_avg = t_dbl / (double)r;
    printf("\nSerial Run Time for %d runs: %f\t Average:%f\n", r, t_dbl, dt_avg);
    /* Begin multithread solution */
    /* Creating the offset where each matrix will start */
    int offset = MAT_DIM / NUM_THREADS;
    /* Creating a variable to store a return code */
    int rc;
    /* Creates a WORKER type variable named mat_work_t */
    WORKER mat_work_t[NUM_THREADS];
    /* Allocating a chunk of memory for my matrix */
    for(i = 0; i < MAT_DIM; i++)
        matrix_thread[i] = (int*)malloc(MAT_DIM * sizeof(int));
    /* Begin main loop */
    for(r = 0; r < RUNS; r++)
    {
        /* Begin multithread population of matrix */
        for(t = 0; t < NUM_THREADS; t++)
        {
            /* Sets the values for mat_work_t[t] */
            mat_work_t[t].id = t;
            mat_work_t[t].column = t * offset;
            /* Points the mat_work_t[t].matrix at the matrix_thread */
            for(i = 0; i < MAT_DIM; i++)
                mat_work_t[t].matrix[i] = &matrix_thread[i][0];
            /* Creates thread placing its return value into rc */
            rc = pthread_create(&threads[t],
                                NULL,
                                work,
                                (void*) &mat_work_t[t]);
            /* Prints that a thread was successfully created */
            printf("Thread %d created.\n", mat_work_t[t].id);
            /* Checks to see if a return code was sent. If it was it will print it. */
            if (rc)
            {
                printf("ERROR: return code from pthread_create() is %d\n", rc);
                return(-1);
            }
        }
        /* Makes sure all threads are done doing work before progressing */
        printf("Waiting for workers to finish.\n");
        for(i = 0; i < NUM_THREADS; i++)
            pthread_join(threads[i], NULL);
        printf("Work complete!\n");
    }
    /* Prints out the last matrix that was created by the loop */
    for(i = 0; i < MAT_DIM; i++)
    {
        for(j = 0; j < MAT_DIM; j++)
            printf("%d ", matrix_thread[i][j]);
        printf("\n");
    }
    /* Terminates thread */
    pthread_exit(NULL);
}
When I run it under gdb I get:
[New Thread 0x7ffff7fd3700 (LWP 27907)]
Thread 0 created.
Worker 0 working on column 0
Worker 0 working on column 1
Worker 0 working on column 2
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff7fd3700 (LWP 27907)]
0x0000000000400924 in work (threadarg=0x7fffffffd9c0) at fill.c:35
35 this_worker-> matrix[i][j] = rand() % 10;
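My understanding of segmentation faults is strictly textbook: a segmentation fault occurs when you try to access memory that is not "yours" to access. From this I gather that the code is having trouble accessing the memory where this matrix is stored.
My questions:
- Is my logic about the nature of the problem correct?
- Why does adding a thread cause this program to crash?
- How do I troubleshoot this kind of problem in the future (any tips would be appreciated)?
- Finally, how do I fix it (a clue or a solution would be appreciated)?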
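Are you sure the matrix member of struct WORKER should have only NUM_THREADS elements?
You are accessing that array beyond its declared size in 2 places.
One is in the main function, where NUM_THREADS (i.e. 2) is far too small compared with MAT_DIM (50):
    for(i = 0; i < MAT_DIM; i++)
        mat_work_t[t].matrix[i] = &matrix_thread[i][0];
The other is in the work function: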
    for(i = this_worker->column; i < stop; i++)
    {
        /* Prints the column that matrix is working on */
        printf("Worker %d working on column %d\n", this_worker->id, i);
        for(j = 0; j < MAT_DIM; j++)
        {
            this_worker->matrix[i][j] = rand() % 10;
        }
    }
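The loop runs fine while it accesses matrix[0][j] and matrix[1][j]. When it reaches matrix[2][j] you get the segmentation fault, because the array was declared with 2 elements and matrix[2] is a third element that does not exist. Writing or reading past the end of an array is undefined behavior, so the single-thread run only appeared to work. On a typical 64-bit layout, mat_work_t[0].matrix[2] likely occupies the same bytes as mat_work_t[1].id and mat_work_t[1].column, so setting up the second worker overwrites that pointer with garbage, which would explain why the crash shows up at column 2 once a second thread is added. Sizing the member by MAT_DIM instead (int* matrix[MAT_DIM];) keeps both loops in bounds.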
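One possible way to restructure this, as a minimal sketch rather than the asker's actual code: let each worker hold a plain int** that points at the caller's table of row pointers, so the struct contains no fixed-size array to overrun. The names first_row, last_row, state, next_digit, alloc_matrix, free_matrix and fill_parallel are hypothetical, introduced only for this illustration. The small local generator is also an assumption: it stands in for rand() so that the threads do not share rand()'s hidden state.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define NUM_THREADS 2
#define MAT_DIM 50

/* The worker borrows the row-pointer table instead of copying it,
 * so nothing inside the struct depends on MAT_DIM or NUM_THREADS. */
typedef struct {
    int id;
    int first_row;      /* first row this worker fills (inclusive) */
    int last_row;       /* one past the last row this worker fills */
    int **matrix;       /* MAT_DIM row pointers owned by the caller */
    unsigned int state; /* per-thread generator state (hypothetical) */
} WORKER;

/* Tiny per-thread generator returning a digit 0..9 (stand-in for rand()). */
static int next_digit(unsigned int *state)
{
    *state = *state * 1103515245u + 12345u;
    return (int)((*state >> 16) % 10u);
}

static void *work(void *threadarg)
{
    WORKER *w = threadarg;
    for (int i = w->first_row; i < w->last_row; i++)
        for (int j = 0; j < MAT_DIM; j++)
            w->matrix[i][j] = next_digit(&w->state);
    return NULL;
}

/* Allocates MAT_DIM rows of MAT_DIM ints, each cell preset to -1. */
int **alloc_matrix(void)
{
    int **m = malloc(MAT_DIM * sizeof *m);
    if (m == NULL)
        return NULL;
    for (int i = 0; i < MAT_DIM; i++) {
        m[i] = malloc(MAT_DIM * sizeof *m[i]);
        if (m[i] == NULL) {
            while (i-- > 0)
                free(m[i]);
            free(m);
            return NULL;
        }
        for (int j = 0; j < MAT_DIM; j++)
            m[i][j] = -1;
    }
    return m;
}

void free_matrix(int **m)
{
    for (int i = 0; i < MAT_DIM; i++)
        free(m[i]);
    free(m);
}

/* Fills the matrix with NUM_THREADS threads.
 * Returns 0 on success, otherwise the pthread_create error code. */
int fill_parallel(int **matrix)
{
    assert(matrix != NULL);
    pthread_t threads[NUM_THREADS];
    WORKER workers[NUM_THREADS];
    int rows_per_thread = MAT_DIM / NUM_THREADS;
    int created = 0, rc = 0;

    for (int t = 0; t < NUM_THREADS; t++) {
        workers[t].id = t;
        workers[t].first_row = t * rows_per_thread;
        /* The last worker also takes any rows left over by the division. */
        workers[t].last_row = (t == NUM_THREADS - 1) ? MAT_DIM
                                                     : (t + 1) * rows_per_thread;
        workers[t].matrix = matrix;
        workers[t].state = 12345u + (unsigned int)t;
        rc = pthread_create(&threads[t], NULL, work, &workers[t]);
        if (rc != 0)
            break;
        created++;
    }
    /* Join only the threads that were actually started. */
    for (int t = 0; t < created; t++)
        pthread_join(threads[t], NULL);
    return rc;
}
```

A caller would obtain a matrix from alloc_matrix(), pass it to fill_parallel(), check for a zero return value, and release it with free_matrix(). Because every worker indexes the same MAT_DIM-sized table through a pointer, changing NUM_THREADS no longer affects how many rows can be addressed.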