Pointer returned by open_memstream represented as b'' in Cython

I have the following Cython code in a cdef class:

def __getstate__(self):

    cdef char *bp
    cdef size_t size
    cdef cn.FILE *stream

    stream = cn.open_memstream(&bp, &size)

    cn.write_padded_binary(self.im, self.n, 256, stream)
    cn.fflush(stream)

    cn.fclose(stream)

    print("pointer", bp, "size_t:", size)
    # ('pointer', b'', 'size_t:', 6144)
    bt = c.string_at(bp, size)
    print("bt", bt)

    cn.free(bp)

    return bt

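But the pointer printed by print("pointer", bp, "size_t:", size) and the byte string printed by print("bt", bt) make me worry that something is wrong. The pointer is just ('pointer', b'', 'size_t:', 6144), and the byte string appears to contain text from Python source code: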

x00\x00 Normalize an encoding name.\n\n Normalization works as follows: all non-alphanumeric\n characters except the dot used for Python package names are\n collapsed and replaced with a single underscore, e.g. \' -;#\'\n becomes \'_\'. Leading and trailing underscores are removed.\n\n Note that encoding names should be ASCII only; if they do use\n non-ASCII characters, these must be Latin-1 compatible.\n\n \x00\x00\

(though it is mostly just escaped bytes).

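I am sure that write_padded_binary works, because it works when I give it a regular file descriptor. I am also sure that open_memstream works, because when I use cn.fprintf(stream, "hello") instead of write_padded_binary, the output is ('bt', b'hello'). However, the pointer is then also ('pointer', b'hello', 'size_t:', 5), so I must be misunderstanding something about pointers, I think...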

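The problem you have (as already diagnosed) is that you can't pass a char* directly to a Python function. When you do, Cython tries to convert it to a Python bytes string. That makes no sense here, because the buffer holds binary data: interpreting it as a null-terminated C string makes Cython read an arbitrary length until it finds a 0 byte. Your data presumably starts with a 0 byte, which would explain why the pointer prints as b''.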
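The effect can be reproduced in plain Python with ctypes. This is only a sketch: the payload below is made up, and a ctypes buffer stands in for the memory that open_memstream allocates. It shows that reading a buffer as a C string stops at the first 0 byte, while reading it with an explicit size returns everything.

```python
import ctypes

# Hypothetical payload: binary data that begins with a NUL byte.
payload = b"\x00\x00\x01\x02binary\x00tail"

# A raw C buffer holding exactly these bytes.
buf = ctypes.create_string_buffer(payload, len(payload))

# Interpreted as a null-terminated C string: stops at the first 0 byte.
as_c_string = ctypes.cast(buf, ctypes.c_char_p).value

# Read through an integer address with an explicit length: all bytes come back.
as_sized = ctypes.string_at(ctypes.addressof(buf), len(payload))

print("as C string:", as_c_string)
print("with size:  ", as_sized)
```

With data that begins with a 0 byte, the C-string view is empty, which matches the b'' seen in the question.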

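In this case both print and ctypes.string_at are affected. With string_at, it is presumably the converted bytes object rather than the original address that gets passed, so it reads size bytes from unrelated memory. In both cases the trick is to cast the pointer to an appropriately sized integer first. The C type uintptr_t is guaranteed to be large enough to hold a pointer, so it is the appropriate choice: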

from libc.stdint cimport uintptr_t

print("pointer", <uintptr_t>bp, "size_t:", size)
bt = c.string_at(<uintptr_t>bp, size)
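As a rough illustration of why freeing bp after the string_at call is safe, the following pure-Python sketch passes an integer address to ctypes.string_at and then modifies the original buffer. The buffer and its contents are hypothetical stand-ins for the memory behind bp; ctypes.addressof plays the role of the <uintptr_t> cast.

```python
import ctypes

# Hypothetical binary payload standing in for the memstream contents.
payload = bytes(range(8))

# A C array of unsigned bytes initialised from the payload.
buf = (ctypes.c_ubyte * len(payload))(*payload)

# A plain Python int holding the address, analogous to <uintptr_t>bp.
address = ctypes.addressof(buf)

# string_at copies `size` bytes starting at that address into a new bytes object.
copy = ctypes.string_at(address, len(payload))

# Changing the original buffer afterwards does not affect the copy.
buf[0] = 255

print(type(address).__name__, copy)
```

Because the returned bytes object owns its own copy of the data, releasing the original buffer afterwards does not invalidate it.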