Which lookup method is most efficient for simple integer to string lookup

Question

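I need to look up string identifiers in some C code and am thinking about how to code the lookup. The identifiers and strings are fixed at compile time and are unlikely to change. I thought that an index into an array of strings would be the most efficient, that is, lookup1.

Sometimes the identifiers in the code do not start from 0, or there are gaps in the numbering, so I chose lookup2 for those cases. lookup2 uses a switch statement.

Another option is lookup3, which uses a struct with an integer to string mapping.

Some pros and cons I thought about:

lookup2 is more flexible if identifiers do not start from zero or if there are gaps.

If identifiers start from zero and there are no gaps then lookup1 is better? If not then go for lookup2 method?

How about lookup3?

This is legacy code and the defines are already there. For new code, enums would be better for this?

Generally there would be 5-20 defines in a category. There can be over 100.

Here is the code.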

#include <stdio.h>

#define RINGING     0x0
#define DIALING     0x1
#define IDLE        0x2
#define ENGAGED     0x3
#define CONNECTED   0x4

static const char* const lookup1(int id) {
    static const char* const identifiers[] = {
        "RINGING",
        "DIALING",
        "IDLE",
        "ENGAGED",
        "CONNECTED" };

    int size = sizeof(identifiers) / sizeof(identifiers[0]);
    if (id >= 0 && id < size) {
        return identifiers[id];
    }
    return "Unknown identifier";
}


static const char* const lookup2(int id) {
    switch (id) {
    case RINGING: return "RINGING";
    case DIALING: return "DIALING";
    case IDLE: return "IDLE";
    case ENGAGED: return "ENGAGED";
    case CONNECTED: return "CONNECTED";
    default: return "unknown";
    }
}

static const char* const lookup3(int id) {
    struct id2name {
        int id;
        const char* const name;
    };

    static struct id2name pairings[] = {
        { RINGING, "RINGING" },
        { DIALING, "DIALING" },
        { IDLE, "IDLE" },
        { ENGAGED, "ENGAGED" },
        { CONNECTED, "CONNECTED" } };

    int size = sizeof(pairings) / sizeof(pairings[0]);
    if (id >= 0 && id < size) {
        return pairings[id].name;
    }

    return "Unknown identifier";
}


int main() {
    const int identifiers[] = { RINGING, DIALING, IDLE, ENGAGED, CONNECTED };
    const int size = sizeof(identifiers) / sizeof(identifiers[0]);
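    for (int i = 0; i < size; ++i) {
        /* look up the identifier value itself rather than the loop counter */
        const int id = identifiers[i];
        printf("using lookup1 id %d is: %s\n", id, lookup1(id));
        printf("using lookup2 id %d is: %s\n", id, lookup2(id));
        printf("using lookup3 id %d is: %s\n", id, lookup3(id));
    }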
}

Answer 1

If identifiers start from zero and there are no gaps then lookup1 is better?

Yes.

If not then go for lookup2 method?

Yes.

How about lookup3?

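lookup3 is buggy: it indexes the array by id instead of comparing against the stored IDs. You need to iterate over all the pairings and check each ID, i.e.: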

static struct id2name pairings[] = {
    { RINGING, "RINGING" },
    { DIALING, "DIALING" },
    { IDLE, "IDLE" },
    { ENGAGED, "ENGAGED" },
    { CONNECTED, "CONNECTED" } };

int size = sizeof(pairings) / sizeof(pairings[0]);
for (int i = 0; i < size; i++) {
    /* compare the stored ID, not the array position */
    if (pairings[i].id == id) {
        return pairings[i].name;
    }
}
return "Unknown identifier";
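For reference, the fragment above can be wrapped into a complete function. This is only a minimal sketch: the name lookup3_search is made up for illustration, the struct is moved to file scope, and the values are the ones from the question.

```c
#include <stddef.h>

#define RINGING     0x0
#define DIALING     0x1
#define IDLE        0x2
#define ENGAGED     0x3
#define CONNECTED   0x4

struct id2name {
    int id;
    const char *name;
};

/* Hypothetical name. Linear search: each stored id is compared with the
 * requested one, so the result does not depend on array positions and the
 * ids may be sparse or start anywhere. */
const char *lookup3_search(int id) {
    static const struct id2name pairings[] = {
        { RINGING,   "RINGING" },
        { DIALING,   "DIALING" },
        { IDLE,      "IDLE" },
        { ENGAGED,   "ENGAGED" },
        { CONNECTED, "CONNECTED" }
    };
    const size_t size = sizeof(pairings) / sizeof(pairings[0]);

    for (size_t i = 0; i < size; i++) {
        if (pairings[i].id == id) {
            return pairings[i].name;
        }
    }
    return "Unknown identifier";
}
```

Written this way, the cost grows with the number of entries, which is the price paid for not caring how the IDs are numbered.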

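If the IDs in pairings[] are sorted in ascending order, you can leave the for loop early, as soon as the stored ID exceeds the one requested, i.e.

for (int i = 0; i < size && pairings[i].id <= id; i++) {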
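Once the table is sorted, a binary search is another option alongside the early-exit loop. The sketch below uses the standard bsearch() from <stdlib.h>. The gapped ID values and the names compare_id, sorted_pairings and lookup_sorted are assumptions for illustration, not part of the original code.

```c
#include <stdlib.h>

struct id2name {
    int id;
    const char *name;
};

/* Hypothetical sparse ids, listed in ascending order as bsearch() requires. */
static const struct id2name sorted_pairings[] = {
    { 0x10, "RINGING" },
    { 0x11, "DIALING" },
    { 0x20, "IDLE" },
    { 0x21, "ENGAGED" },
    { 0x40, "CONNECTED" }
};

/* bsearch() passes the key first and a table element second. */
static int compare_id(const void *key, const void *element) {
    const int id = *(const int *)key;
    const struct id2name *pair = element;
    return (id > pair->id) - (id < pair->id);
}

const char *lookup_sorted(int id) {
    const struct id2name *hit = bsearch(
        &id, sorted_pairings,
        sizeof(sorted_pairings) / sizeof(sorted_pairings[0]),
        sizeof(sorted_pairings[0]), compare_id);
    return hit ? hit->name : "Unknown identifier";
}
```

For the 5-20 entries mentioned in the question the difference from a linear scan is likely to be small; whether it matters at 100 or more entries would need measuring.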

For new code, enums would be better for this?

Not in terms of performance, but they would look nicer.
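One way enums can pay off in new code is that the enum and the name table can be generated from a single list, so they cannot drift apart. The following is a sketch of that idea (often called an X-macro), which goes beyond what the answer states. Every name in it (CALL_STATES, AS_ENUM, AS_STRING, call_state, CALL_STATE_COUNT, call_state_name) is hypothetical.

```c
/* Single list of states; each use supplies its own meaning for X. */
#define CALL_STATES(X) \
    X(RINGING)         \
    X(DIALING)         \
    X(IDLE)            \
    X(ENGAGED)         \
    X(CONNECTED)

/* Expand the list once as enum constants (values 0..4)... */
#define AS_ENUM(name) name,
enum call_state { CALL_STATES(AS_ENUM) CALL_STATE_COUNT };

/* ...and once as designated initializers: [RINGING] = "RINGING", etc. */
#define AS_STRING(name) [name] = #name,

const char *call_state_name(int state) {
    static const char *const names[] = { CALL_STATES(AS_STRING) };

    if (state >= 0 && state < CALL_STATE_COUNT) {
        return names[state];
    }
    return "Unknown identifier";
}
```

Adding a state then means adding one line to CALL_STATES. The lookup itself is still a plain array index, so the performance is the same as lookup1.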

Answer 2

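A table lookup such as your lookup1() is hard to beat for clarity, concision, and speed. That is not to say, however, that other approaches might not be at least competitive on speed. For relative performance questions, you really need to benchmark.

A direct array-based table lookup is problematic if the maximum index number is large, if any index number is less than zero, or if you cannot rely on at least C99. Otherwise, gaps between the indices are not a particular problem, including a gap between the start of the array and the lowest index used. Consider this: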
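A benchmark for this can be quite small. The sketch below is one rough way to time two variants with the standard clock() function. The names lookup_table, lookup_switch and time_lookup are made up, and because the lookups are called through a function pointer, the numbers may not match what inlined calls would cost in real code.

```c
#include <stddef.h>
#include <time.h>

/* Array-indexed lookup, same shape as lookup1() in the question. */
const char *lookup_table(int id) {
    static const char *const names[] = {
        "RINGING", "DIALING", "IDLE", "ENGAGED", "CONNECTED"
    };
    const int count = (int)(sizeof(names) / sizeof(names[0]));
    return (id >= 0 && id < count) ? names[id] : "unknown";
}

/* switch-based lookup, same shape as lookup2() in the question. */
const char *lookup_switch(int id) {
    switch (id) {
    case 0: return "RINGING";
    case 1: return "DIALING";
    case 2: return "IDLE";
    case 3: return "ENGAGED";
    case 4: return "CONNECTED";
    default: return "unknown";
    }
}

/* Calls fn `iterations` times with ids cycling through 0..7 (so the
 * "unknown" path is exercised too) and returns the CPU time in seconds. */
double time_lookup(const char *(*fn)(int), long iterations) {
    const char *volatile sink = NULL; /* volatile store keeps each result used */
    const clock_t start = clock();

    for (long i = 0; i < iterations; i++) {
        sink = fn((int)(i & 7));
    }

    const clock_t end = clock();
    (void)sink;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Calling time_lookup(lookup_table, 100000000L) and time_lookup(lookup_switch, 100000000L) from an optimized build gives two numbers to compare. The outcome depends on the compiler, flags, and hardware, so no particular result is claimed here.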

#define INITIALIZER(x) [x] = #x,

const char *lookup4(int x) {
    static const char * const table[] = {
        INITIALIZER(RINGING)
        INITIALIZER(DIALING)
        INITIALIZER(IDLE)
        INITIALIZER(ENGAGED)
        INITIALIZER(CONNECTED)
        // initializers have the form:
        // [MACRO] = "MACRO",
    };

    /* out-of-range indices map to NULL, just like unused table slots */
    const char *result = (x < 0 || (size_t)x >= sizeof(table) / sizeof(table[0]))
        ? NULL
        : table[x];

    return result ? result : "unknown";
}

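This uses designated initializers (produced by the INITIALIZER() macro) to initialize those elements of the lookup table that correspond to valid strings; the other elements will be NULL. It ends up being very similar to your lookup1().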
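To make the gap handling concrete, here is a small self-contained variant of the same technique. The ID values 2, 5 and 9 and the name lookup_sparse are assumptions chosen only to produce gaps; they are not the values from the question.

```c
#include <stddef.h>

/* Hypothetical ids with gaps (including one before the first id). */
#define RINGING    2
#define IDLE       5
#define CONNECTED  9

/* INITIALIZER(IDLE) expands to: [5] = "IDLE", */
#define INITIALIZER(x) [x] = #x,

const char *lookup_sparse(int x) {
    /* The array gets 10 slots (0..9); slots without a designated
     * initializer are implicitly NULL. */
    static const char *const table[] = {
        INITIALIZER(RINGING)
        INITIALIZER(IDLE)
        INITIALIZER(CONNECTED)
    };
    const size_t count = sizeof(table) / sizeof(table[0]);

    if (x < 0 || (size_t)x >= count || table[x] == NULL) {
        return "unknown";
    }
    return table[x];
}
```

The table size is driven by the largest ID, not by the number of entries, which is why a very large maximum index makes this approach wasteful.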

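There is nothing particularly wrong with your lookup2(). It is clean and clear, and I imagine most compilers will generate very efficient code for it.

As lookup3() is presented, however, I see no reason to prefer it over either of the others. You do not use the message numbers, so why store them in your struct? Without them, though, you do not need a struct, so you basically have a more complicated implementation of lookup1(). And if you did use the numbers (for example, by searching the array for the requested message number) then that would be more expensive on average than the other approaches.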
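The switch form also copes with IDs that a direct array cannot index at all, such as negative or very large values. In the sketch below, LINE_FAULT, TEST_MODE, their values, and the function name lookup_switch_sparse are hypothetical additions for illustration.

```c
/* Hypothetical ids: one negative, two small, one very large. */
#define LINE_FAULT  (-1)
#define RINGING     0x0
#define CONNECTED   0x4
#define TEST_MODE   0x7FFF

const char *lookup_switch_sparse(int id) {
    switch (id) {
    case LINE_FAULT: return "LINE_FAULT";
    case RINGING:    return "RINGING";
    case CONNECTED:  return "CONNECTED";
    case TEST_MODE:  return "TEST_MODE";
    default:         return "unknown";
    }
}
```

How the compiler implements such a switch (jump table, comparison chain, or a mix) is up to the compiler, so its speed relative to the array version is something to measure rather than assume.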

