Why, when using g++ 4.8.4, does this code result in a map with a single element?

Question

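For the past year and a half I have been working on porting an older Win32-MFC project to Linux, and I have finally run into something I don't fully understand. At first I thought it might be caused by the introduction of C++11 move semantics, but I'm not sure that is the problem. Under g++ 4.8.4 with the -std=c++11 flag, the following code: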

#include <map>
#include <string>
#include <iostream>
#include <iomanip>
#include <cstring>

const char* foo[] = { "biz", "baz", "bar", "foo", "yin" };
const int sizes[] = { 3, 3, 3, 3, 3 };

typedef std::map <std::string, int> simpleMap_t;
typedef std::pair<std::string, int> simplePair_t;

int main()
{
    simpleMap_t map;
    std::string key;
    for (int i = 0; i<5; i++)
    {
        key.resize(sizes[i]);
        memcpy(const_cast<char *>(key.data()), foo[i], sizes[i]);
        simplePair_t pair = std::make_pair(key, 0);
        std::cout << "key: \""         << key        << "\" - " << static_cast<const void*>(key.data())
                  << " pair.first: \"" << pair.first << "\" - " << static_cast<const void*>(pair.first.data())
                  << std::endl;
        map.insert(map.end(), pair);
    }

    std::cout << "map size =  " << map.size() << std::endl;
    return 0;
}

produces this output:

key: "biz" - 0x1dec028 pair.first: "biz" - 0x1dec028
key: "baz" - 0x1dec028 pair.first: "baz" - 0x1dec028
key: "bar" - 0x1dec028 pair.first: "bar" - 0x1dec028
key: "foo" - 0x1dec028 pair.first: "foo" - 0x1dec028
key: "yin" - 0x1dec028 pair.first: "yin" - 0x1dec028
map size =  1

The same code compiled with Visual Studio 2013, however, produces the following:

key: "biz" - 0039FE14 pair.first: "biz" - 0039FDE0
key: "baz" - 0039FE14 pair.first: "baz" - 0039FDE0
key: "bar" - 0039FE14 pair.first: "bar" - 0039FDE0
key: "foo" - 0039FE14 pair.first: "foo" - 0039FDE0
key: "yin" - 0039FE14 pair.first: "yin" - 0039FDE0
map size =  5

Interestingly, when the size of the string changes on each iteration, the code will "work" when compiled with g++. Replacing:

const char* foo[] = { "biz", "baz", "bar", "foo", "yin" };
const int sizes[] = { 3, 3, 3, 3, 3 };

with:

const char* foo[] = { "bizbiz", "baz", "barbar", "foo", "yinyin" };
const int sizes[] = { 6, 3, 6, 3, 6 };

produces:

key: "bizbiz" - 0xc54028 pair.first: "bizbiz" - 0xc54028
key: "baz" - 0xc54098 pair.first: "baz" - 0xc54098
key: "barbar" - 0xc54108 pair.first: "barbar" - 0xc54108
key: "foo" - 0xc54178 pair.first: "foo" - 0xc54178
key: "yinyin" - 0xc541e8 pair.first: "yinyin" - 0xc541e8
map size =  5

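My understanding of move semantics is incomplete, but I wonder whether that is the cause here. Is ownership of the std::string's internal buffer being given away when the std::pair is created? Or is it something else, such as an optimization in std::string::resize() that fails to allocate a new character buffer when it should?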

Answer 1

Your code has undefined behaviour because of the following strange lines:

key.resize(sizes[i]);
memcpy(const_cast<char *>(key.data()), foo[i], sizes[i]);

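Almost any time you find yourself needing to cast away const-ness (and, for that matter, using memcpy), you are doing something wrong.

Indeed, a cursory glance at the documentation for std::string::data (which you did read, right? Right?) confirms: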

Modifying the character array accessed through data is [sic] undefined behavior.

What is wrong with good old-fashioned assignment?

key.assign(foo[i], sizes[i]);

Because of the UB, analysing this any further is folly.

c++

g++

c++11