Why does CLANG 3.5 on Linux clean up a "std::string" twice when the DTOR is called while throwing in the CTOR?

Question

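I maintain a project that sticks to C++ 98 without additional dependencies, but it needs to manage dynamically allocated memory. Smart pointers are not available, so code to clean things up manually has been added. The approach is to explicitly set variables to NULL in the CTOR, read some data during which memory might be allocated dynamically, catch any exception that occurs, and clean up memory as necessary by manually calling the DTOR. The DTOR has to free the memory anyway for the case where everything succeeded, so it has simply been extended with safeguards that check whether memory has been allocated at all.

The following is the most relevant available code for this question: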

default_endian_expr_exception_t::doc_t::doc_t(kaitai::kstream* p__io, default_endian_expr_exception_t* p__parent, default_endian_expr_exception_t* p__root) : kaitai::kstruct(p__io) {
    m__parent = p__parent;
    m__root = p__root;
    m_main = 0;

    try {
        _read();
    } catch(...) {
        this->~doc_t();
        throw;
    }
}

void default_endian_expr_exception_t::doc_t::_read() {
    m_indicator = m__io->read_bytes(2);
    m_main = new main_obj_t(m__io, this, m__root);
}

default_endian_expr_exception_t::doc_t::~doc_t() {
    if (m_main) {
        delete m_main; m_main = 0;
    }
}

The most relevant part of the header is the following:

class doc_t : public kaitai::kstruct {
    public:
        doc_t(kaitai::kstream* p__io, default_endian_expr_exception_t* p__parent = 0, default_endian_expr_exception_t* p__root = 0);

    private:
        void _read();

    public:
        ~doc_t();

    private:
        std::string m_indicator;
        main_obj_t* m_main;
        default_endian_expr_exception_t* m__root;
        default_endian_expr_exception_t* m__parent;
    };

The code is tested in three different environments, clang3.5_linux, clang7.3_osx and msvc141_windows_x64, to explicitly throw exceptions when reading data and to check whether it leaks memory under those conditions. The problem is that this triggers SIGABRT on CLANG 3.5 for Linux only. The most interesting stack frames are the following:

<frame>
  <ip>0x577636E</ip>
  <obj>/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.19</obj>
  <fn>std::basic_string&lt;char, std::char_traits&lt;char&gt;, std::allocator&lt;char&gt; &gt;::~basic_string()</fn>
</frame>
<frame>
  <ip>0x5ECFB4</ip>
  <obj>/home/travis/build/kaitai-io/ci_targets/compiled/cpp_stl_98/bin/ks_tests</obj>
  <fn>default_endian_expr_exception_t::doc_t::doc_t(kaitai::kstream*, default_endian_expr_exception_t*, default_endian_expr_exception_t*)</fn>
  <dir>/home/travis/build/kaitai-io/ci_targets/tests/compiled/cpp_stl_98</dir>
  <file>default_endian_expr_exception.cpp</file>
  <line>51</line>
</frame>

[...]

<frame>
  <ip>0x577636E</ip>
  <obj>/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.19</obj>
  <fn>std::basic_string&lt;char, std::char_traits&lt;char&gt;, std::allocator&lt;char&gt; &gt;::~basic_string()</fn>
</frame>
<frame>
  <ip>0x5ED17E</ip>
  <obj>/home/travis/build/kaitai-io/ci_targets/compiled/cpp_stl_98/bin/ks_tests</obj>
  <fn>default_endian_expr_exception_t::doc_t::~doc_t()</fn>
  <dir>/home/travis/build/kaitai-io/ci_targets/tests/compiled/cpp_stl_98</dir>
  <file>default_endian_expr_exception.cpp</file>
  <line>62</line>
</frame>

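Lines 51 and 62 are the last lines of the CTOR and DTOR given above, so they are really the closing braces. This looks as if some code added by the compiler tries to free the std::string member twice, once in the DTOR and a second time in the CTOR, most likely only when an exception is thrown.

Is this analysis correct at all?

If so, is this the expected behaviour of C++ in general or only of this particular compiler? I wonder because the other compilers do not SIGABRT, even though the code is the same for all of them. Does that mean different compilers clean up non-pointers like std::string differently? How does one know how each compiler behaves?

Looking at what the standard says, I would have expected the std::string to be freed only by the CTOR, because of the exception: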

C++11 15.2 Constructors and destructors (2)

An object of any storage duration whose initialization or destruction is terminated by an exception will have destructors executed for all of its fully constructed subobjects (excluding the variant members of a union-like class), that is, for subobjects for which the principal constructor (12.6.2) has completed execution and the destructor has not yet begun execution.

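In this case, destruction is not terminated by an exception, only construction is. But because the DTOR is a DTOR, is it designed to clean things up automatically as well? And if so, is that true in general for all compilers or only for this one?

Is calling a DTOR manually reliable?

According to my research, calling a DTOR manually should not be too bad. Is that a wrong impression, and is it a big no-go given what I am seeing now? My impression was that a DTOR that is called manually simply needs to be written so that it is compatible with being called this way, which the one above should be, to my understanding. It only fails because of the compiler's auto-generated code that I was not aware of.

How to fix this?

Instead of calling the DTOR manually and triggering the auto-generated code, should one simply use a custom cleanUp function that frees the memory and sets the pointers to NULL? It should be safe to call that in the CTOR in case of an exception and always in the DTOR, correct? Or is there some way to keep calling the DTOR that is compatible with all compilers?

Thanks!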
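The cleanUp idea from the paragraph above can be sketched roughly as follows. This is a minimal C++ 98 sketch, not the project's actual code: Payload, Doc, read, cleanUp and the two helper functions are hypothetical names made up for illustration, and Payload merely stands in for main_obj_t. The point is that cleanUp only touches what the class manages by hand, while the std::string member is left to the language, which destroys it exactly once.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-in for main_obj_t; counts live instances so leaks are visible.
struct Payload {
    static int live;
    Payload() { ++live; }
    ~Payload() { --live; }
};
int Payload::live = 0;

// Hypothetical stand-in for doc_t.
class Doc {
public:
    explicit Doc(bool fail) : m_indicator(), m_main(0) {
        try {
            read(fail);
        } catch (...) {
            cleanUp();  // frees only the manually managed memory
            throw;      // m_indicator is then destroyed once, by the language
        }
    }
    ~Doc() { cleanUp(); }

private:
    Doc(const Doc&);             // non-copyable, C++ 98 style
    Doc& operator=(const Doc&);

    void read(bool fail) {
        m_indicator = "ab";
        m_main = new Payload();
        if (fail) throw std::runtime_error("simulated read error");
    }

    void cleanUp() {
        delete m_main;  // deleting a null pointer is a no-op
        m_main = 0;
    }

    std::string m_indicator;
    Payload* m_main;
};

// Number of Payload objects still alive after a construction that throws.
int liveAfterFailedConstruction() {
    try {
        Doc d(true);
    } catch (const std::runtime_error&) {
        // construction failed; nothing should have leaked
    }
    return Payload::live;
}

// Number of Payload objects still alive after a normal lifetime.
int liveAfterSuccessfulConstruction() {
    { Doc d(false); }
    return Payload::live;
}
```

Both helpers should return 0 under these assumptions: the failed construction releases the Payload through cleanUp in the catch block, and the successful one releases it through the DTOR.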

Answer 1

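Once the destructor has been called, the object no longer exists (only uninitialized memory is left behind). This means a destructor may omit "finalising" memory writes, such as setting a pointer to zero (the object ceases to exist, so its value can never be read). It also means that basically any further operation on that object is UB.

One would assume there is some leeway for destroying *this if the this pointer is not used in any way afterwards. That is not the case in your example, because the destructor is called twice.

I know of one case where calling the destructor manually is correct and another where it is mostly correct. The correct case is when the object was created with placement new (in which case no operation will call the destructor automatically). The mostly correct case is when destroying the object is immediately followed by re-initialising it via a call to placement new at the same location.

As to your second question: why do you want to call the destructor explicitly in the first place? As far as I can see, your code should work fine without any contortions: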
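To illustrate the placement new case, here is a minimal sketch. Tracked and placementNewLifetime are hypothetical names used only for this example; the object records its construction ("C") and destruction ("D") in a string so the single destructor call is observable.

```cpp
#include <new>     // placement new
#include <string>

// Hypothetical type that records its lifetime events in a shared log.
struct Tracked {
    explicit Tracked(std::string& log) : m_log(log) { m_log += "C"; }
    ~Tracked() { m_log += "D"; }
    std::string& m_log;
};

// Builds an object in raw storage and destroys it by hand, exactly once.
std::string placementNewLifetime() {
    std::string log;
    // Storage returned by operator new is suitably aligned for a Tracked.
    void* raw = ::operator new(sizeof(Tracked));
    Tracked* obj = new (raw) Tracked(log);  // construct into the raw storage
    obj->~Tracked();                        // the only destructor call this object gets
    ::operator delete(raw);                 // release the storage; no destructor runs here
    return log;
}
```

Here the explicit destructor call is required rather than harmful, because nothing else will ever destroy the object: the storage is released with operator delete, which does not run any destructor.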

default_endian_expr_exception_t::doc_t::doc_t(kaitai::kstream* p__io, default_endian_expr_exception_t* p__parent, default_endian_expr_exception_t* p__root)
  // initializers listed in member declaration order; m_main() value-initializes the pointer to null
  : kaitai::kstruct(p__io), m_main(), m__root(p__root), m__parent(p__parent) {
    _read();
}

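Before the body of the user-provided constructor runs, the object is initialized to a valid state. If _read throws an exception, that should still be the case (otherwise fix _read!), so the implicit destruction of the already-constructed members and bases should clean everything up nicely.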
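What happens automatically when a constructor throws can be made visible with a small sketch. Member, Owner and throwingConstructorLog are hypothetical names invented for this illustration. The log shows that the fully constructed member is destroyed once without any explicit destructor call, and that the body of the class's own destructor does not run, because the Owner object never finished construction.

```cpp
#include <string>

// Hypothetical member type that logs its construction and destruction.
struct Member {
    explicit Member(std::string& log) : m_log(log) { m_log += "[member ctor]"; }
    ~Member() { m_log += "[member dtor]"; }
    std::string& m_log;
};

// Hypothetical class whose constructor body always throws.
class Owner {
public:
    explicit Owner(std::string& log) : m_log(log), m_member(log) {
        m_log += "[owner ctor body]";
        throw 42;  // construction fails after m_member is fully constructed
    }
    ~Owner() { m_log += "[owner dtor body]"; }  // not run for a failed construction

private:
    std::string& m_log;
    Member m_member;
};

// Returns the sequence of events recorded while construction fails.
std::string throwingConstructorLog() {
    std::string log;
    try {
        Owner o(log);
    } catch (int) {
        log += "[caught]";
    }
    return log;
}
```

The returned log is "[member ctor][owner ctor body][member dtor][caught]". A consequence worth keeping in mind: memory held through a raw pointer and allocated before the throw is not released by this mechanism, since only member and base destructors run and a raw pointer has none.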

Answer 2

Here is a simplified example that is similar to your case and makes the behaviour obvious:

#include <iostream>

struct S {
    S() { std::cout << "S constructed\n";}
    ~S() { std::cout << "S destroyed\n";}
};

class Throws {
    S s;
public:
    Throws() {
        try {
            throw 42;
        } catch (int) {
            this->~Throws();
            throw;
        }
    }
};


int main() {
  try {
      Throws t;
  } catch (int) {}
}

Output:

S constructed
S destroyed
S destroyed

Demo with clang, demo with gcc.

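The example exhibits undefined behaviour by destroying the same S instance twice. Since the destructor does not do much, and in particular does not access this, the undefined behaviour manifests itself as two successful runs of the destructor, so it can easily be observed in action.

Apparently the OP doubts whether a destructor should really destroy the whole object, together with all its members and base classes. To dispel these doubts, here is the relevant quote from the standard: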

[class.dtor]/14 After executing the body of the destructor and destroying any objects with automatic storage duration allocated within the body, a destructor for class X calls the destructors for X’s direct non-variant non-static data members, the destructors for X’s non-virtual direct base classes and, if X is the most derived class (11.10.2), its destructor calls the destructors for X’s virtual base classes...
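The order described in this quote can be observed with a short sketch. Base, Part, Whole and destructionOrder are hypothetical names for illustration only; each destructor appends a tag to a shared log.

```cpp
#include <string>

// Hypothetical base class; logs when its destructor runs.
struct Base {
    explicit Base(std::string& log) : m_log(log) {}
    ~Base() { m_log += "base;"; }
    std::string& m_log;
};

// Hypothetical data member type; logs when its destructor runs.
struct Part {
    explicit Part(std::string& log) : m_log(log) {}
    ~Part() { m_log += "member;"; }
    std::string& m_log;
};

// Derived class with one data member and a user-provided destructor body.
struct Whole : Base {
    explicit Whole(std::string& log) : Base(log), m_part(log) {}
    ~Whole() { m_log += "body;"; }  // member and base destructors follow automatically
    Part m_part;
};

// Returns the order in which the pieces of a Whole are destroyed.
std::string destructionOrder() {
    std::string log;
    { Whole w(log); }  // w is destroyed at the end of this scope
    return log;
}
```

The function returns "body;member;base;". Any call to ~Whole(), explicit or implicit, runs all three steps, which is why an explicit call followed by the automatic cleanup destroys the members twice.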
