将单词存储到哈希表中

Storing words into a hashtable

我有一个包含英文单词的 txt 文件,每个单词占一行。 我是 C 语言的初学者。我正在使用 loadunload 函数将所有单词存储到哈希表中(单独链接)并从内存中卸载它们,但是 运行成一些问题。

函数(main.c中的代码是正确的):

load:

#include <stdbool.h>
#include <string.h>
#include <stdlib.h>
#include <ctype.h>
#include <stdio.h>

#include "dictionary.h"

#define SIZE 26

typedef  struct node
{
    char word[LENGTH+1];
    struct node *next;
}node;

unsigned int hash_num = 0;
node *hashtable[SIZE];  //26 letters in alphabet

node *head = NULL;

// hashfunction
unsigned int hash(char const *key)  
{
    unsigned int hash= tolower(key[0]) - 'a';
    return hash % SIZE;
}

/**
 * Loads dictionary into memory.  Returns true if successful else false.
 */
bool load(const char* dictionary)
{

    unsigned int hash_index=0;

    FILE *fp = fopen(dictionary, "r");
    if(fp == NULL)
    {
        fprintf(stderr, "Couldn't open %s",dictionary);
        return false;
    }

    //dictionary 


    while(!feof(fp))
    {
        node *temp = malloc(sizeof(node));
        if (temp == NULL)
        {
            unload();
            fclose(fp);
            return false;
        }


        if(fscanf(fp , "%s", temp->word) == 1)   //storing word of dictionary in my new_node -> word
        {
            hash_index = hash(temp->word); 
            head= hashtable[hash_index];    //head always points to first element of linked list (containting word of dictionary)


            temp->next = head;  
            head = temp;        


            hash_num++;
        }
        else    //if fscanf couldn't store the word (return 0)
        {
            free(temp);    //free last temp
            break;
        }
    }

    fclose(fp);
    return true;

}

unload:

bool unload(void)
{

    for(int i=0; i<SIZE; i++)
    {
        if(hashtable[i] != NULL)      //if hashtable isn't NULL (has nodes)
        {
            node *cursor = hashtable[i];        //cursor points at head of individual linked list
            while(cursor != NULL)       //free them
            {
                node *temp = cursor;
                cursor = cursor->next;
                free(temp);
            }
        }
    }

    return true;
}

谁能告诉我逻辑是否正确?每当我 运行 valgrind 它告诉我我的所有节点都已分配但只有 3 个空闲。

total heap usage: 143,094 allocs, 3 frees, 8,014,288 bytes allocated
LEAK SUMMARY:
==15903==    definitely lost: 8,013,040 bytes in 143,090 blocks
==15903==    indirectly lost: 0 bytes in 0 blocks
==15903==      possibly lost: 0 bytes in 0 blocks

查看提供的源代码时(缺少"dictionary.h"),主要问题在load()函数中定位。

问题 1(主要) - 添加新的 word/node 时 hashtable[] 永远不会更新(在计算 hash_index = hash(temp->word); 之后)。

To store the updated linked-list (managed as reversed), it is necessary to update the hashtable[hash_index] with the new node pointer (the allocated temp node).

temp->next = head;
head = temp;

hashtable[hash_index] = head; // update the hashtable[] pointer

hash_num++;

Alternate solution without global variable head.

temp->next = hashtable[hash_index]; //head always points to first element...
hashtable[hash_index] = temp; // update the hashtable[] pointer

hash_num++;

而不是

temp->next = head;
head = temp;

hash_num++;

问题 2(小) - hashtable[SIZE] 从未初始化。

In the unload() function, in the for-loop, the if-condition if(hashtable[i] != NULL) assumes that each item of the array is initialized to NULL.

load() 函数的开头或调用它之前添加一个 for 循环来初始化每个指针。

for(int i=0; i<SIZE; i++)
{
    hashtable[i] = NULL;
}

问题 3(潜在错误源) - 正如审阅者所建议的,使用 head,声明为全局变量 node *head = NULL; 可能是错误的潜在来源。

In the load() function, the variable head is used as a temporary storage but could store value during software run. If a read operation is performed without a well-known write operation before, the result could be an unexpected error even if the compilation doesn't detect error or warning.

The best way is to reduce the number of global variable as much as possible.

Enhancement (Reverse the linked-list) - 因为managed linked-list是在前面添加新项,所以这里是在最后添加新项的解决方案。

node *first = hashtable[hash_index];
if (first == NULL) {
    hashtable[hash_index] = temp;
}
else {
    temp->next = NULL; // ending the list
    while (first->next!=NULL) {
        first = first->next;  // loop until last node
    }
    first->next = temp; // linking to the last node
}

hash_num++;

而不是

head= hashtable[hash_index];    //head always points to first element ...

temp->next = head;  
head = temp;        

hash_num++;