Selecting all related rows with BigQuery (reading logs from GAE)

My Google App Engine logs are being exported to BigQuery via the standard streaming export tool. I'd like to run a query along the lines of "show me all log lines for requests in which any log line contains a string".

This query gives me the request IDs I'm interested in:

SELECT protoPayload.requestId AS reqId
  FROM TABLE_QUERY(logs, 'true') 
  WHERE protoPayload.line.logMessage contains 'INTERNAL_SERVICE_ERROR'

...which then lets me query for the related lines:

SELECT
  metadata.timestamp AS Time,
  protoPayload.host AS Host,
  protoPayload.status AS Status,
  protoPayload.resource AS Path,
  protoPayload.line.logMessage
FROM
  TABLE_QUERY(logs, 'true')
WHERE
  protoPayload.requestId in ("requestid1", "requestid2", "etc")
ORDER BY time

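However, I can't combine the two into a single query. BQ doesn't seem to allow subselects in the WHERE clause, and I get confusing error messages when I try a traditional self-join with named tables. What's the secret?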

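To select the rows of every request in which at least one logMessage contains the given string, you can use the OMIT RECORD IF construct. It drops a whole record (one request, with all of its repeated log lines) when the condition holds, so omitting records where EVERY line does NOT contain the string keeps exactly the requests where some line does: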

SELECT
  metadata.timestamp AS Time,
  protoPayload.host AS Host,
  protoPayload.status AS Status,
  protoPayload.resource AS Path,
  protoPayload.line.logMessage
FROM
  TABLE_QUERY(logs, 'true')
OMIT RECORD IF
  EVERY(NOT (protoPayload.line.logMessage contains 'INTERNAL_SERVICE_ERROR'))
ORDER BY time
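
To make the record-level logic concrete, here is a minimal Python sketch of what the OMIT RECORD IF EVERY(NOT ...) filter does conceptually. It is only a model of the semantics, not BigQuery code: the sample records, the request IDs, and the helper names keep_record and select_rows are all hypothetical, and the nested dictionaries merely imitate the protoPayload.line repeated field from the question.

```python
NEEDLE = "INTERNAL_SERVICE_ERROR"

# Hypothetical sample data: one dict per request, with repeated log lines.
SAMPLE = [
    {
        "metadata": {"timestamp": "2015-01-01T00:00:01Z"},
        "protoPayload": {
            "requestId": "req-a",
            "line": [
                {"logMessage": "starting handler"},
                {"logMessage": "INTERNAL_SERVICE_ERROR: backend down"},
            ],
        },
    },
    {
        "metadata": {"timestamp": "2015-01-01T00:00:02Z"},
        "protoPayload": {
            "requestId": "req-b",
            "line": [{"logMessage": "ok"}],
        },
    },
]


def keep_record(record, needle=NEEDLE):
    """Keep a request unless EVERY one of its log lines lacks the needle."""
    lines = record["protoPayload"]["line"]
    # Assumption of this sketch: a request with no log lines is omitted,
    # because all() over an empty list is True. BigQuery may treat that
    # case differently.
    every_line_misses = all(needle not in ln["logMessage"] for ln in lines)
    return not every_line_misses


def select_rows(records, needle=NEEDLE):
    """Return (timestamp, requestId, logMessage) for ALL lines of kept requests."""
    rows = []
    for record in records:
        if not keep_record(record, needle):
            continue  # the whole record is dropped, not just individual lines
        for ln in record["protoPayload"]["line"]:
            rows.append((
                record["metadata"]["timestamp"],
                record["protoPayload"]["requestId"],
                ln["logMessage"],
            ))
    return sorted(rows, key=lambda row: row[0])


if __name__ == "__main__":
    for row in select_rows(SAMPLE):
        print(row)
```

The point of the sketch is that the filter is applied per request rather than per log line: once a request survives, all of its lines come back, including the ones that do not contain the string. That is why no subselect or self-join is needed.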