Which of these new expressions with char arrays are well-formed?

For the following program:

int main() 
{
    new char[4] {"text"};  // #1
    new char[5] {"text"};  // #2
    new char[] {"text"};   // #3
}

clang gives an error for #1, which says:

error: initializer-string for char array is too long

and accepts #2 and #3.

gcc gives the following error for all of the statements:

error: invalid conversion from 'const char*' to 'char' [-fpermissive]

In addition, for #3 it gives the error:

error: expected primary-expression before ']' token

So what does the language say about whether this code is well-formed?

I'd like to know the current rules, but I'd also like to know whether this has changed in previous versions of the language.

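OK, this is pretty simple to trace. The presence of {} means that list-initialization is being performed, so we get to visit our favorite part of the spec: [dcl.init.list]/3.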

The object being initialized in case #1 is a char[4]. The braced-init-list is not a designated initializer list, so 3.1 is ignored. char[4] is not a class, so 3.2 is ignored. That brings us to 3.3:

Otherwise, if T is a character array and the initializer list has a single element that is an appropriately-typed string-literal ([dcl.init.string]), initialization is performed as described in that subclause.

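Well, char[4] is definitely a character array, the initializer list definitely contains a single element, and that element does in fact match the type of the character array. So off to [dcl.init.string] we go.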

That tells us (after a fashion):

Successive characters of the value of the string-literal initialize the elements of the array.

But the next paragraph warns:

There shall not be more initializers than there are array elements.

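Well, that makes #1 ill-formed: "text" supplies five initializers (four characters plus the terminating null) for a four-element array.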

So we redo the process for char[5]. This time the restriction doesn't trigger, since 5 is large enough.
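As a minimal sketch of the #1 versus #2 distinction, the function below allocates the array from #2 and copies its contents out. The name text_from_new_array is hypothetical and only serves the illustration. The GCC from the question rejected this form, so the sketch assumes a compiler that follows the rules as traced above, such as the Clang from the question.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: allocates the array from #2 and returns a copy of it.
std::string text_from_new_array()
{
    // "text" supplies 5 initializers: 't', 'e', 'x', 't', '\0'.
    char* p = new char[5]{"text"};   // #2: exactly enough elements
    // new char[4]{"text"};          // #1: ill-formed, 5 initializers for 4 elements

    assert(p[4] == '\0');            // the terminator occupies the fifth element
    std::string copy(p);             // copies up to, not including, the '\0'
    delete[] p;
    return copy;
}
```

Calling text_from_new_array() yields a std::string holding the four characters of "text"; uncommenting the #1 line should produce a diagnostic along the lines of the one Clang reports in the question.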

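Finally, we come to char[]. As far as initialization is concerned, this is no different from using a number. char[] is a character array, so it follows the rules above. C++17 would choke on using char[] in a new-expression, but C++20 is fine with it: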

If the type-id or new-type-id denotes an array type of unknown bound ([dcl.array]), the new-initializer shall not be omitted; the allocated object is an array with n elements, where n is determined from the number of initial elements supplied in the new-initializer ([dcl.init.aggr], [dcl.init.string]).
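The counting rule in that paragraph can be checked with an ordinary declaration, which every standard mode accepts. This is only an analogy for #3, sketched under that assumption; the names deduced and deduced_bound are hypothetical, and the new-expression form appears solely in a comment because it needs C++20 (or a compiler that applies the change retroactively).

```cpp
#include <cassert>
#include <cstddef>

// Ordinary-declaration analogue of #3: the bound is omitted and deduced
// from the braced string literal.
constexpr char deduced[] = {"text"};

// Five elements: four characters plus the implied '\0'.
static_assert(sizeof(deduced) == 5, "bound is deduced from the string literal");
static_assert(deduced[4] == '\0', "last element is the null terminator");

// Hypothetical helper so the deduced bound can also be inspected at run time.
std::size_t deduced_bound()
{
    assert(deduced[sizeof(deduced) - 1] == '\0');
    return sizeof(deduced);
}

// The C++20 new-expression form from the question would be spelled:
//     char* p = new char[]{"text"};   // allocates 5 elements
//     delete[] p;
```

The result of a new-expression is a plain char*, so sizeof cannot recover the deduced bound there; that is why the sketch uses a named array instead.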

Which means that #2 and #3 are supposed to be legal, so GCC is wrong to reject them. It also rejects #1 for the wrong reason.

Clang is correct that #1 is ill-formed and #2 is OK.

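As noted by Ted Lyngmo in the comments, #3 was not valid under the C++ grammar rules until paper P1009R2 made a change to allow it. A new-expression simply didn't allow for an empty [] in the type, a holdover from when there was no syntax to initialize an array created by a new-expression, and therefore no way for the compiler to determine the actual size. The paper's changes were accepted into C++20 (but compiler writers sometimes choose to retroactively support "fixes" in earlier -std= modes).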

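As for the difference between #1 and #2, [expr.new] specifies that the initialization of the array object follows the direct-initialization rules of [dcl.init]. The general rules near the start of [dcl.init] say that if the initializer is a braced-init-list, it's list-initialization. The rules for that in [dcl.init.list] go like this: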

List-initialization of an object or reference of type T is defined as follows:

  • [C++20 only:] If the braced-init-list contains a designated-initializer-list, ...

  • If T is an aggregate class and...

  • Otherwise, if T is a character array and the initializer list has a single element that is an appropriately-typed string-literal ([dcl.init.string]), initialization is performed as described in that subclause.

  • ...

So [dcl.init.string] (C++17, latest) gives the actual initialization rules that apply to this code:

An array of {C++17: narrow character type}{C++20: ordinary character type ([basic.fundamental])}, char8_t array, char16_t array, char32_t array, or wchar_t array can be initialized by {C++17: a narrow}{C++20: an ordinary} string literal, UTF-8 string literal, UTF-16 string literal, UTF-32 string literal, or wide string literal, respectively, or by an appropriately-typed string-literal enclosed in braces ([lex.string]). Successive characters of the value of the string-literal initialize the elements of the array.

There shall not be more initializers than there are array elements. [ Example:

char cv[4] = "asdf";            // error

is ill-formed since there is no space for the implied trailing '\0'. — end example ]

If there are fewer initializers than there are array elements, each element not explicitly initialized shall be zero-initialized ([dcl.init]).

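Just as for a plain variable definition, when the character array type in a new-expression has a specified bound, it must be large enough for all the characters of the string literal initializing it, including the trailing null character.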
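The other direction, a bound larger than the literal needs, can be sketched with a plain variable definition. The helper name trailing_zeros and the bound of 8 are assumptions chosen for illustration; they do not come from the question.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: counts the zero elements that follow the four
// characters of "text" in an array with room to spare.
std::size_t trailing_zeros()
{
    char buf[8] = "text";     // 5 initializers for 8 elements: well-formed
    // char cv[4] = "text";   // ill-formed in C++: no room for the '\0'

    assert(buf[3] == 't');    // last character of the literal

    std::size_t zeros = 0;
    for (std::size_t i = 4; i < sizeof(buf); ++i) {
        if (buf[i] == '\0') {
            ++zeros;          // buf[4] is the literal's terminator,
        }                     // buf[5..7] are zero-initialized
    }
    return zeros;
}
```

Here the function returns 4: one element holds the literal's own terminator, and the remaining three are zero-initialized under the "fewer initializers" paragraph quoted above.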

(This is an old difference between C and C++: C does allow char cv[4] = "asdf"; and simply drops the null character.)