Incomplete coordinate values for Google Vision OCR

Question

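I have a script that iterates over images of different forms. When parsing the Google Vision text detection response, I use the XY coordinates in 'boundingPoly' for each text item to look for data in specific parts of the form.

The problem I'm running into is that some responses come back with only X coordinates. Example: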

{u'description': u'sometext', u'boundingPoly': {u'vertices': [{u'x': 5595}, {u'x': 5717}, {u'y': 122, u'x': 5717}, {u'y': 122, u'x': 5595}

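I set up a try/except (using Python 2.7) to catch this, but it is always the same error: KeyError: 'y'. I'm iterating over thousands of forms; so far it has happened in 10 rows out of 1000.

Has anyone run into this before? Is there a fix other than resubmitting the request when this error occurs?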

Answer 1

From the docs:

boundingPoly

object(BoundingPoly)

The bounding polygon around the face. The coordinates of the bounding box are in the original image's scale, as returned in ImageParams. The bounding box is computed to "frame" the face in accordance with human expectations. It is based on the landmarker results. Note that one or more x and/or y coordinates may not be generated in the BoundingPoly (the polygon will be unbounded) if only a partial face appears in the image to be annotated.

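I take this to mean that the 'y' value is 0 in this case or, more generally, an edge value. In other words, the API does not know where the bounding polygon really ends: the text runs right up to the edge of the image, so the image does not contain enough information to confirm that the text actually ends there. As far as the image is concerned, it ends at a 'y' of 0.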
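
If that reading is right, the missing keys can be defaulted rather than caught with try/except. Below is a minimal sketch, assuming the response has already been parsed into plain dicts shaped like the example in the question. The helper names vertex_xy and bounding_box are hypothetical, and the default of 0 is the assumption made in this answer, not something the quoted docs state outright.

```python
def vertex_xy(vertex, default=0):
    """Return (x, y) for one vertex dict, using `default` for a missing key."""
    # dict.get avoids the KeyError raised by vertex['y'] when the key is absent.
    return vertex.get('x', default), vertex.get('y', default)


def bounding_box(annotation, default=0):
    """Return (min_x, min_y, max_x, max_y) for one text annotation dict."""
    vertices = annotation['boundingPoly']['vertices']
    points = [vertex_xy(v, default) for v in vertices]
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return min(xs), min(ys), max(xs), max(ys)


# Same shape as the example in the question: two vertices lack 'y'.
annotation = {
    'description': 'sometext',
    'boundingPoly': {'vertices': [
        {'x': 5595}, {'x': 5717},
        {'y': 122, 'x': 5717}, {'y': 122, 'x': 5595},
    ]},
}

print(bounding_box(annotation))  # (5595, 0, 5717, 122)
```

This runs on both Python 2.7 and Python 3. If treating an absent coordinate as 0 turns out to be wrong for some images, passing default=None to vertex_xy makes the gap explicit so those rows can be logged and skipped instead.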

python

ocr

google-cloud-vision