Is it legal to call memcpy with zero length on a pointer just past the end of an array?

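As answered elsewhere, calling a function like memcpy with an invalid or NULL pointer is undefined behavior, even if the length argument is zero. In the context of such a function, especially memcpy and memmove, is a pointer just past the end of an array a valid pointer?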

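I'm asking this question because a pointer just past the end of an array is legal to obtain (as opposed to, e.g., a pointer two elements past the end of an array), but you are not allowed to dereference it. Yet footnote 106 of ISO 9899:2011 indicates that such a pointer points into the address space of the program, which is a criterion required for pointer validity according to §7.1.4.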

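Such usage occurs in code where I want to insert an item into the middle of an array, which requires me to move all items after the insertion point: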

void make_space(type *array, size_t old_length, size_t index)
{
    memmove(array + index + 1, array + index, (old_length - index) * sizeof *array);
}

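If we want to insert at the end of the array, index is equal to old_length, so array + index + 1 points just past the end of the array, but the number of elements copied is zero.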
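One way to sidestep the question entirely is to skip the call when there is nothing to move. The following is only a sketch under assumptions: `insert_at` is a hypothetical helper that is not part of the original code, it is specialised to `int`, and the caller is assumed to guarantee room for at least old_length + 1 elements.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: insert `value` at `index` into an int array that
 * currently holds `old_length` elements.
 * Assumption: the caller guarantees capacity >= old_length + 1. */
static void insert_at(int *array, size_t old_length, size_t index, int value)
{
    assert(index <= old_length);

    size_t tail = old_length - index;   /* elements after the insertion point */

    /* Only call memmove when something actually moves, so the zero-length
     * call with a one-past-the-end destination never happens. */
    if (tail > 0) {
        memmove(array + index + 1, array + index, tail * sizeof *array);
    }
    array[index] = value;
}
```

With this guard, appending (index equal to old_length) never forms a call whose destination is one past the end of the storage; whether the guard is strictly required is exactly what the question asks.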

If we look at the C99 standard, there is this:

7.21.1.p2

Where an argument declared as size_t n specifies the length of the array for a function, n can have the value zero on a call to that function. Unless explicitly stated otherwise in the description of a particular function in this subclause, pointer arguments on such a call shall still have valid values, as described in 7.1.4. On such a call, a function that locates a character finds no occurrence, a function that compares two character sequences returns zero, and a function that copies characters copies zero characters. ...

7.21.2.1

The description of memcpy contains no explicit statement to the contrary.

7.1.4.p1

... If a function argument is described as being an array, the pointer actually passed to the function shall have a value such that all address computations and accesses to objects (that would be valid if the pointer did point to the first element of such an array) are in fact valid.

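Emphasis added. It seems that the pointers have to point to valid locations (in the sense of being dereferenceable), and the paragraph about pointer arithmetic being allowed to point to end + 1 does not apply here.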

Then there is the question of whether the arguments to memcpy are arrays. Of course they are not declared as arrays, but

7.21.1.p1

The header string.h declares one type and several functions, and defines one macro useful for manipulating arrays of character type and other objects treated as arrays of character type.

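memcpy is declared in string.h.
So I would assume that memcpy does treat its arguments as arrays of characters. Because the macro mentioned is NULL, the "useful for..." part of the sentence clearly applies to the functions.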

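Passing a past-the-end pointer as the first argument to memmove has several pitfalls and may result in a nasal demon attack. Strictly speaking, there is no watertight guarantee that this is well defined.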

(Unfortunately, the standard does not say much about the "past the last element" concept.)

Note: Sorry for having changed direction now...

The question basically is whether the "one past the end pointer" is a valid first argument to memmove if 0 bytes are moved:

T array[length];
memmove(array + length, array + length - 1u, 0u);

The requirement in question is the validity of the first argument.

N1570, 7.1.4, 1

If a function argument is described as being an array, the pointer actually passed to the function shall have a value such that all address computations and accesses to objects (that would be valid if the pointer did point to the first element of such an array) are in fact valid.

If an argument to a function has an invalid value (such as a value outside the domain of the function, or a pointer outside the address space of the program, or a null pointer, or a pointer to non-modifiable storage when the corresponding parameter is not const-qualified) or a type (after promotion) not expected by a function with variable number of arguments, the behavior is undefined.

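This makes the argument valid if the pointer

  1. is not outside the address space,
  2. is not a null pointer,
  3. is not a pointer to const memory,

and if the argument type

  4. is not of array type.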

1. Address space

N1570, 6.5.6, 8

Moreover, if the expression P points to the last element of an array object, the expression (P)+1 points one past the last element of the array object, and if the expression Q points one past the last element of an array object, the expression (Q)-1 points to the last element of the array object.

N1570, 6.5.6, 9

Moreover, if the expression P points either to an element of an array object or one past the last element of an array object, and the expression Q points to the last element of the same array object, the expression ((Q)+1)-(P) has the same value as ((Q)-(P))+1 and as -((P)-((Q)+1)), and has the value zero if the expression P points one past the last element of the array object, even though the expression (Q)+1 does not point to an element of the array object. 106)

106) Another way to approach pointer arithmetic is first to convert the pointer(s) to character pointer(s): In this scheme the integer expression added to or subtracted from the converted pointer is first multiplied by the size of the object originally pointed to, and the resulting pointer is converted back to the original type. For pointer subtraction, the result of the difference between the character pointers is similarly divided by the size of the object originally pointed to.

When viewed in this way, an implementation need only provide one extra byte (which may overlap another object in the program) just after the end of the object in order to satisfy the "one past the last element" requirements.

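Although the footnote is not normative, as pointed out by Lundin, it does give us the explanation that "an implementation need only provide one extra byte". I can't prove it by quoting, but I suspect this is a hint that the standard means to require the implementation to include memory in the program's address space at the location the past-the-end pointer points to.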
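To make the quoted arithmetic concrete, here is a small sketch that checks the identities of 6.5.6 paragraph 9 and the character-pointer view from footnote 106 on a sample array. The names `distance_via_char` and `check_identities` are made up for this illustration. The one-past-the-end pointer is only computed and subtracted, never dereferenced, which is the part the standard clearly allows.

```c
#include <stddef.h>

/* Hypothetical helper: element distance between two pointers into the same
 * int array (or one past its end), computed via character pointers in the
 * way footnote 106 describes. */
static ptrdiff_t distance_via_char(const int *from, const int *to)
{
    const char *cf = (const char *)from;
    const char *ct = (const char *)to;
    return (ct - cf) / (ptrdiff_t)sizeof(int);
}

/* Returns 1 if the identities of 6.5.6 paragraph 9 hold for a sample array. */
static int check_identities(void)
{
    int a[4] = {0};
    int *q = &a[3];     /* last element */
    int *p = a + 1;     /* some element of the array */
    int *end = a + 4;   /* one past the last element; never dereferenced */

    return ((q + 1) - p) == ((q - p) + 1)
        && ((q + 1) - p) == -(p - (q + 1))
        && ((q + 1) - end) == 0
        && distance_via_char(a, end) == 4
        && distance_via_char(p, q + 1) == (q + 1) - p;
}
```

None of this settles whether the one-past-the-end address counts as "inside the address space" for 7.1.4; it only shows that forming and subtracting such pointers is well defined.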

2. Null pointer

The past-the-end pointer is not a null pointer.

3. Pointing to const memory

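Apart from giving some information about the result of several operations, the standard imposes no further requirements on the past-the-end pointer, and the (again non-normative ;)) footnote clarifies that it may overlap with another object. Thus, there is no guarantee that the memory the past-the-end pointer points to is non-constant. Since the first argument of memmove is a pointer to non-constant memory, passing a past-the-end pointer is not guaranteed to be valid and is potentially undefined behavior.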

4. Validity of array arguments

Chapter 7.21.1 (7.24.1 in N1570) describes the string handling header <string.h>, and its first clause states:

The header declares one type and several functions, and defines one macro useful for manipulating arrays of character type and other objects treated as arrays of character type.

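I think the standard isn't very clear here about whether "objects treated as arrays of character type" refers to the functions or only to the macro. If this sentence actually implies that memmove treats its first argument as an array of characters, then passing a past-the-end pointer to memmove is undefined behavior as per 7.1.4 (which requires a pointer to a valid object).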

3.15 object

  1. object region of data storage in the execution environment, the contents of which can represent values

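The memory that a pointer one past the last element of an array object (or one past an object) points to cannot represent values, since the pointer cannot be dereferenced (6.5.6 Additive operators, paragraph 8).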

7.24.2.1 The memcpy function

  1. The memcpy function copies n characters from the object pointed to by s2 into the object pointed to by s1. If copying takes place between objects that overlap, the behavior is undefined.

Pointers passed to memcpy must point to an object.

6.5.3.4 The sizeof and _Alignof operators

  1. When sizeof is applied to an operand that has type char, unsigned char, or signed char, (or a qualified version thereof) the result is 1. When applied to an operand that has array type, the result is the total number of bytes in the array. When applied to an operand that has structure or union type, the result is the total number of bytes in such an object, including internal and trailing padding.

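The sizeof operator doesn't count the one-past element as part of the object, since it doesn't count towards the size of the object. Yet it clearly gives the size of the entire object.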
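As a rough illustration of that point, the sketch below checks that the one-past-the-end address of a sample array sits exactly sizeof(array) bytes after its start, i.e. at the first byte that sizeof does not count. The function name `one_past_is_outside_sizeof` is invented for this example, and the pointer is only computed, never dereferenced.

```c
#include <stddef.h>

/* Returns 1 if the one-past-the-end pointer of a sample array lies exactly
 * sizeof(array) bytes after its start, i.e. at the first byte that sizeof
 * does not count as part of the array object. */
static int one_past_is_outside_sizeof(void)
{
    double a[5];
    const char *begin = (const char *)a;
    const char *end = (const char *)(a + 5);   /* computed, never dereferenced */

    return sizeof a == 5 * sizeof a[0]
        && (size_t)(end - begin) == sizeof a;
}
```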

6.3.2.1 Lvalues, arrays, and function designators

  1. An lvalue is an expression (with an object type other than void) that potentially designates an object; 64) if an lvalue does not designate an object when it is evaluated, the behavior is undefined.

I think that a one-past pointer to an array object or to an object does not designate an object.

int a;
int *p = &a + 1;

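p is defined, but it doesn't point to an object: it cannot be dereferenced, the memory it points to cannot represent a value, and sizeof doesn't count that memory as part of the object. memcpy requires a pointer to an object.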

Therefore, passing a one-past pointer to memcpy causes undefined behavior.

Update:

This part also supports the conclusion:

6.5.9 Equality operators

  1. Two pointers compare equal if and only if both are null pointers, both are pointers to the same object (including a pointer to an object and a subobject at its beginning) or function, both are pointers to one past the last element of the same array object, or one is a pointer to one past the end of one array object and the other is a pointer to the start of a different array object that happens to immediately follow the first array object in the address space.

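This implies that a pointer to an object, if incremented to one past the object, can point to a different object. In that case it certainly cannot point to the object it originally pointed to, which shows that a pointer one past an object does not point to that object.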