Check if a string is included in an array and append if not (C)

Question

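I have 2 arrays: one called 'edges', which contains a list of city names, and another called cityNames, which is initialised to empty strings.

What I would like to do is move through the edges array element by element and check whether each element is already in the cityNames array. If it is, move on to the next element in edges; if it isn't, append the value to the cityNames array.

The code below adds edges[i].startCity to the cityNames array, but it does not check for duplicates and I can't figure out why.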

for (int i = 1; i < noEdges; i++) {
        for (int j = 0; j < noCities; j++) {
            if(strcmp(edges[i].startCity, cityNames[j].cityName) != 0) {
                strcpy(cityNames[i].cityName, edges[i].startCity);
            }
        }
        noCities += 1;
    }

Thanks in advance

Answer 1

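I assume that:

edges is an array of structures of known length noEdges, each structure containing a string (either a char pointer or a char array)
cityNames is an array of structures whose size is at least the number of distinct names (it could be noEdges, or the size of the edges array)
the cityNames structure contains a char array element whose size is at least the longest name + 1 (+1 for the terminating null)

Then the following code should give the unique names: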

noCities = 0;
for (int i = 0; i < noEdges; i++) {
    int dup = 0;       // expect edges[i].startCity not to be a duplicate
    for (int j = 0; j < noCities; j++) {
        if (strcmp(edges[i].startCity, cityNames[j].cityName) == 0) {
            dup = 1;   // got a duplicate
            break;     // no need to go further ...
        }
    }
    if (dup == 0) {    // not a duplicate: add it to cityNames
        strcpy(cityNames[noCities].cityName, edges[i].startCity);
        noCities += 1; // we now have one more city
    }
}
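For reference, here is a minimal sketch of the same idea as a function, with the assumptions listed above spelled out as types. The struct layouts, MAX_NAME, the capacity parameter and the name collectUniqueCities are illustrative assumptions, since the question does not show the real declarations.

```c
#include <stdio.h>
#include <string.h>

#define MAX_NAME 64   /* assumed maximum name length, including the '\0' */

/* Hypothetical struct layouts; the question does not show the real ones. */
struct Edge { char startCity[MAX_NAME]; };
struct City { char cityName[MAX_NAME]; };

/* Copies each distinct startCity into cityNames in first-seen order and
   returns how many were stored. Never stores more than capacity entries. */
int collectUniqueCities(const struct Edge *edges, int noEdges,
                        struct City *cityNames, int capacity)
{
    int noCities = 0;

    for (int i = 0; i < noEdges; i++) {
        int dup = 0;

        /* Compare only against the names stored so far. */
        for (int j = 0; j < noCities; j++) {
            if (strcmp(edges[i].startCity, cityNames[j].cityName) == 0) {
                dup = 1;
                break;
            }
        }

        if (!dup && noCities < capacity) {
            /* snprintf always writes the terminating '\0'. */
            snprintf(cityNames[noCities].cityName, MAX_NAME, "%s",
                     edges[i].startCity);
            noCities++;
        }
    }
    return noCities;
}
```

Called on edges whose start cities are Paris, Rome, Paris, Oslo, Rome, this should store Paris, Rome and Oslo in that order and return 3. The capacity check is an addition of the sketch; the loop above relies on cityNames being large enough.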

Answer 2

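If you can, it is better not to work with strings at all (or at least to manipulate strings only when you actually need to).

You could start by assigning each city name a number, so that you have an array of ints, which is faster and easier to use. Scanning for duplicates becomes trivial, since you are now only comparing numbers.

When you need to display the actual text on screen or write the city name to a file, you can use the index associated with the city to look up its text. You could then change the data type of cityNames[] to int, so that each 'node' the 'edges' connect is a number instead of text.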

char* actualCityNames[n];  // all city names, duplicates included (could also come from a file)
char* indexedCityNames[n]; // unique city names in order of first appearance (not alphabetical)
// indexedCityNames will most likely not use all n slots if duplicates occur,
// which is why the unique names have their own counter

int indexedCount = 0; // number of unique city names
int duplicates = 0;

// loop over the actualCityNames slots
for(int i=0; i<n; i++){

    // loop over the unique names found so far
    for(int j=0; j<indexedCount; j++){

        // strcmp returns 0 if both strings are the same
        if(strcmp(actualCityNames[i],indexedCityNames[j]) == 0){
            // duplicate found, mark flag and stop searching
            duplicates = 1;
            break;
        }
    }

    if(!duplicates){
        // indexedCityNames holds pointers with no storage behind them, so
        // store the pointer rather than strcpy into an uninitialized one
        indexedCityNames[indexedCount] = actualCityNames[i];
        indexedCount++;
    }
    duplicates = 0;
}
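The step from names to numbers could be sketched as below. The CityTable struct, MAX_CITIES and the cityId function are hypothetical names made up for illustration; the table only stores pointers, so it assumes the strings passed in outlive it.

```c
#include <string.h>

#define MAX_CITIES 16   /* assumed capacity of the name table */

/* Hypothetical name table: the id of a city is its index in names[]. */
struct CityTable {
    const char *names[MAX_CITIES];  /* strings are owned by the caller */
    int count;                      /* number of ids handed out so far */
};

/* Returns the id of name, adding it to the table if it is new.
   Returns -1 if the name is new and the table is already full. */
int cityId(struct CityTable *t, const char *name)
{
    for (int i = 0; i < t->count; i++) {
        if (strcmp(t->names[i], name) == 0) {
            return i;               /* already known: reuse its id */
        }
    }
    if (t->count == MAX_CITIES) {
        return -1;
    }
    t->names[t->count] = name;      /* store the pointer, not a copy */
    return t->count++;
}
```

Once every edge stores the ids returned by cityId, checking two cities for equality is a plain int comparison, and t.names[id] recovers the text when it has to be printed.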

Answer 3

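Your code snippet does not check for duplicates because, in the inner loop, the if statement appends startCity as soon as it meets the first cityName that differs from the current startCity, without looking at the remaining names.

Moreover, in this statement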

strcpy(cityNames[i].cityName, edges[i].startCity);
                ^^^

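an incorrect index is used: the new name must be written at index noCities, not i.

And the variable noCities should be incremented only when a new startCity is appended.

Also, the outer loop should start from index 0.

Rewrite the loops the following way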

int noCities = 0;

for ( int i = 0; i < noEdges; i++ ) {
    int j = 0;

    while ( j < noCities && strcmp(edges[i].startCity, cityNames[j].cityName) != 0 ) {
        ++j;
    }
        
    if ( j == noCities ) strcpy(cityNames[noCities++].cityName, edges[i].startCity);
}
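If the lookup is needed in more than one place, the while loop can be pulled out into a helper. This is only a sketch: the City layout, MAX_NAME and the names findCity and addCityIfMissing are assumptions, and the append assumes that the array has room for one more entry and that the name fits in the buffer.

```c
#include <string.h>

#define MAX_NAME 64   /* assumed buffer size for a name */

struct City { char cityName[MAX_NAME]; };  /* hypothetical layout */

/* Linear search: index of name in cityNames[0..noCities), or -1. */
int findCity(const struct City *cityNames, int noCities, const char *name)
{
    int j = 0;

    while (j < noCities && strcmp(name, cityNames[j].cityName) != 0) {
        ++j;
    }
    return j < noCities ? j : -1;
}

/* Appends name unless it is already present; returns the new count.
   Assumes there is room for one more entry and name fits in MAX_NAME. */
int addCityIfMissing(struct City *cityNames, int noCities, const char *name)
{
    if (findCity(cityNames, noCities, name) == -1) {
        strcpy(cityNames[noCities++].cityName, name);
    }
    return noCities;
}
```

The outer loop then reduces to calling noCities = addCityIfMissing(cityNames, noCities, edges[i].startCity) for each i from 0 to noEdges - 1.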
