Parsing binary Stanford polygon files (PLY) with Pyparsing

Question

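For a larger project, I am currently writing a parser for Stanford polygon (PLY) files. The example in the GitHub Gist can currently parse ASCII-format PLY files into a Mesh data abstraction. For those interested, it also contains a description of the actual grammar.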

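However, the format definition (PLY - Polygon File Format) also includes two binary formats (little-endian and big-endian). Since these two formats are more common (and more storage-efficient), I would like to be able to parse such files with pyparsing as well.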

If possible, I would greatly appreciate some advice on how to do this.

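The idea of a binary PLY file is that the header contains an ASCII description of the file's actual data, while the body contains the data itself. An example (the data in brackets are hexadecimal bytes):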

ply
format binary_little_endian 1.0          
element vertex 1
property float x
property float y
property float z
property uchar red
property uchar green
property uchar blue
property uchar alpha
end_header
[84 72 F1 C1 D8 FD 9F C1 00 00 00 00 3B 45 CB FF]

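My first approach was simply to load the input file in binary mode (using bytes instead of str) and adjust the parser accordingly, but this somehow makes pyparsing raise the traceback below. Besides, I do not really know how to tell pyparsing how to interpret groups of bytes.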

  File "components.py", line 338, in create
    mesh = PlyParser.create().load(mesh_path)
  File "model_parser.py", line 120, in create
    property_position = aggregate_property("position", b"x", b"y", b"z")
  File "model_parser.py", line 113, in aggregate_property
    aggregates.append(pp.Group(property_simple_prefix + keyword_or(*keywords)("name")))
  File "model_parser.py", line 87, in keyword_or
    return pp.Or(pp.CaselessKeyword(literal) for literal in keywords)
  File "pyparsing.py", line 3418, in __init__
    super(Or,self).__init__(exprs, savelist)
  File "pyparsing.py", line 3222, in __init__
    exprs = list(exprs)
  File "model_parser.py", line 87, in <genexpr>
    return pp.Or(pp.CaselessKeyword(literal) for literal in keywords)
  File "pyparsing.py", line 2496, in __init__
    super(CaselessKeyword,self).__init__( matchString, identChars, caseless=True )
  File "pyparsing.py", line 2422, in __init__
    self.matchLen = len(matchString)
TypeError: object of type 'int' has no len()

Answer 1

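There is already a module for parsing binary PLY files: python-plyfile.

You can use it, or at least look at its source code to see how it works.

It reads the binary data with numpy.fromfile, which is described as a "highly efficient way of reading binary data with a known data-type".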
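As a rough illustration of the numpy.fromfile idea (not python-plyfile's actual code), the sketch below hard-codes a structured dtype that matches the example header from the question, skips the ASCII header line by line, and reads the single vertex record. The names `vertex_dtype` and `example.ply` are made up for this example, and a real reader would derive the dtype from the parsed header instead of writing it by hand.

```python
import os
import tempfile

import numpy as np

# Hand-written dtype for the example header: three little-endian float32
# values followed by four unsigned bytes (16 bytes per vertex, no padding).
vertex_dtype = np.dtype([
    ("x", "<f4"), ("y", "<f4"), ("z", "<f4"),
    ("red", "u1"), ("green", "u1"), ("blue", "u1"), ("alpha", "u1"),
])

header = (
    b"ply\n"
    b"format binary_little_endian 1.0\n"
    b"element vertex 1\n"
    b"property float x\n"
    b"property float y\n"
    b"property float z\n"
    b"property uchar red\n"
    b"property uchar green\n"
    b"property uchar blue\n"
    b"property uchar alpha\n"
    b"end_header\n"
)
# The 16 body bytes shown in the question.
body = bytes.fromhex("8472F1C1D8FD9FC1000000003B45CBFF")

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.ply")
    with open(path, "wb") as f:
        f.write(header + body)

    with open(path, "rb") as f:
        # Consume header lines until the end_header marker.
        for line in iter(f.readline, b""):
            if line.strip() == b"end_header":
                break
        # Read one record starting at the current file position.
        vertices = np.fromfile(f, dtype=vertex_dtype, count=1)

print(vertices["x"][0], vertices["y"][0], vertices["z"][0])
print(vertices["red"][0], vertices["green"][0],
      vertices["blue"][0], vertices["alpha"][0])
```

For a big-endian file the float fields would use `">f4"` instead of `"<f4"`.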

Answer 2

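You might want to try opening the file as text, parsing the header with pyparsing, and capturing the end position of the "end_header" marker. Use the structure information extracted from the header to build a Python struct reader that will process the binary content. Then reopen the file in binary mode, seek to that position, and load the binary content with the struct reader. That is probably simpler than bending pyparsing to handle both text and binary.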
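A minimal sketch of that header-then-struct approach, under some loudly stated assumptions: plain string splitting stands in for the pyparsing header grammar, the file has a single element with scalar properties only (no `property list`), and the whole file is read from one binary stream rather than being opened twice. The names `PLY_TO_STRUCT` and `read_binary_ply` are invented for this example.

```python
import io
import struct

# PLY scalar type names mapped to struct format characters.
PLY_TO_STRUCT = {
    "char": "b", "uchar": "B",
    "short": "h", "ushort": "H",
    "int": "i", "uint": "I",
    "float": "f", "double": "d",
}
BYTE_ORDER = {"binary_little_endian": "<", "binary_big_endian": ">"}


def read_binary_ply(stream):
    """Read a single-element, scalar-only binary PLY from a binary stream."""
    byte_order = None
    count = 0
    names, codes = [], []
    # Header lines are ASCII; stop at the end_header marker.
    while True:
        raw = stream.readline()
        if not raw:
            raise ValueError("end_header not found")
        tokens = raw.decode("ascii").split()
        if not tokens:
            continue
        if tokens[0] == "end_header":
            break
        if tokens[0] == "format":
            byte_order = BYTE_ORDER[tokens[1]]
        elif tokens[0] == "element":
            count = int(tokens[2])
        elif tokens[0] == "property":
            codes.append(PLY_TO_STRUCT[tokens[1]])
            names.append(tokens[2])
    if byte_order is None:
        raise ValueError("format line not found")
    # An explicit byte-order prefix makes struct use standard sizes
    # without alignment padding, which matches the packed PLY body.
    record = struct.Struct(byte_order + "".join(codes))
    rows = []
    for _ in range(count):
        values = record.unpack(stream.read(record.size))
        rows.append(dict(zip(names, values)))
    return rows


header = (
    b"ply\n"
    b"format binary_little_endian 1.0\n"
    b"element vertex 1\n"
    b"property float x\n"
    b"property float y\n"
    b"property float z\n"
    b"property uchar red\n"
    b"property uchar green\n"
    b"property uchar blue\n"
    b"property uchar alpha\n"
    b"end_header\n"
)
# The 16 body bytes shown in the question.
body = bytes.fromhex("8472F1C1D8FD9FC1000000003B45CBFF")

vertices = read_binary_ply(io.BytesIO(header + body))
print(vertices[0])
```

Reading the header from the binary stream avoids having to translate a text-mode position into a byte offset; after the `end_header` line the stream is already positioned at the first body byte.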
