Understanding jq JOIN()

I am trying to understand jq's JOIN() builtin.

From the jq manual (https://stedolan.github.io/jq/manual):

JOIN($idx; stream; idx_expr; join_expr):

This builtin joins the values from the given stream to the given index. 
The index's keys are computed by applying the given index expression to each value from the given stream. 
An array of the value in the stream and the corresponding value from the index is fed to the given join expression to produce each result.

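I find this hard to understand without an example.

Could you give an example of its usage?

This function is meant to resemble the JOIN clause in SQL, which combines rows from two (or, in SQL, more) tables based on a related column between them.

Let's build some "tables".

The first one is a list of orders, each with its own ID plus ID references to the ordering customer and the ordered product: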

[
  {
    "OrderID": "10",
    "CustomerIDRef": "2",
    "ProductIDRef": "7"
  },
  {
    "OrderID": "11",
    "CustomerIDRef": "1",
    "ProductIDRef": "7"
  },
  {
    "OrderID": "12",
    "CustomerIDRef": "2",
    "ProductIDRef": "14"
  },
  {
    "OrderID": "13",
    "CustomerIDRef": "2",
    "ProductIDRef": "7"
  }
]
as $orders

Let the second one be a list of customers, mapping customer IDs to names:

[
  {
    "CustomerID": "1",
    "CustomerName": "Alfred"
  },
  {
    "CustomerID": "2",
    "CustomerName": "Bill"
  },
  {
    "CustomerID": "3",
    "CustomerName": "Caroline"
  }
]
as $customers

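Since jq's JOIN can only handle two tables at a time (more would require cascading), let's ignore the missing Products table.

Before we get to JOIN, we first need to look at INDEX, which turns an array like the tables above into an object with the table's "primary keys" as field names. This is reasonable because field names are unique, so a lookup always returns no more than one record.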

INDEX($customers[]; .CustomerID)
{
  "1": {
    "CustomerID": "1",
    "CustomerName": "Alfred"
  },
  "2": {
    "CustomerID": "2",
    "CustomerName": "Bill"
  },
  "3": {
    "CustomerID": "3",
    "CustomerName": "Caroline"
  }
}
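
To try this from a shell, a minimal sketch follows. It assumes a jq version that ships the SQL-style builtins (1.6 or later, as far as I know). The fourth customer record is hypothetical and not part of the data above; it is only there to show that a repeated key overwrites the earlier entry, which is why a lookup never yields more than one record.

```shell
# The customers table from above, plus one hypothetical duplicate of ID "3"
# (not in the original data) to show what INDEX does with repeated keys.
customers='[
  {"CustomerID": "1", "CustomerName": "Alfred"},
  {"CustomerID": "2", "CustomerName": "Bill"},
  {"CustomerID": "3", "CustomerName": "Caroline"},
  {"CustomerID": "3", "CustomerName": "Carol (duplicate)"}
]'

# -n: read no input; --argjson: bind the JSON text to $customers; -c: compact output.
index=$(jq -n -c --argjson customers "$customers" \
  'INDEX($customers[]; .CustomerID)')

# Prints an object with three keys; key "3" holds the last record seen for that ID.
echo "$index"
```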

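Now we can easily perform a JOIN operation between the orders (as the "left table") and their customers (as the "right table"). We provide the "right table" as an INDEXed object, the "left table" as a stream (here $orders[]), and the "related column" as the field of the left table's objects that matches the right table's "primary key" (the field names in the lookup object). For now, let the last parameter be just . (the identity). We get: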

JOIN(INDEX($customers[]; .CustomerID); $orders[]; .CustomerIDRef; .)
[
  {
    "OrderID": "10",
    "CustomerIDRef": "2",
    "ProductIDRef": "7"
  },
  {
    "CustomerID": "2",
    "CustomerName": "Bill"
  }
]
[
  {
    "OrderID": "11",
    "CustomerIDRef": "1",
    "ProductIDRef": "7"
  },
  {
    "CustomerID": "1",
    "CustomerName": "Alfred"
  }
]
[
  {
    "OrderID": "12",
    "CustomerIDRef": "2",
    "ProductIDRef": "14"
  },
  {
    "CustomerID": "2",
    "CustomerName": "Bill"
  }
]
[
  {
    "OrderID": "13",
    "CustomerIDRef": "2",
    "ProductIDRef": "7"
  },
  {
    "CustomerID": "2",
    "CustomerName": "Bill"
  }
]

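As you can see, we get a stream of arrays, one for each order. Each array has two elements: the record from the left table and the matching record from the right table. An unsuccessful lookup would produce null on the right side.

Finally, the fourth parameter, the "join expression", describes how to combine the two matching records. It essentially acts as a map over the resulting pairs.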
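
The null case can be checked with a small sketch. Order 14 below is hypothetical (it is not in the tables above) and references a customer ID that does not exist in the index; the tables are shortened to keep the example compact.

```shell
customers='[
  {"CustomerID": "1", "CustomerName": "Alfred"},
  {"CustomerID": "2", "CustomerName": "Bill"}
]'
# Order 14 is hypothetical: it points at customer "9", who is not in the table.
orders='[
  {"OrderID": "11", "CustomerIDRef": "1"},
  {"OrderID": "14", "CustomerIDRef": "9"}
]'

pairs=$(jq -n -c --argjson customers "$customers" --argjson orders "$orders" \
  'JOIN(INDEX($customers[]; .CustomerID); $orders[]; .CustomerIDRef; .)')

# One [order, customer] pair per line; the pair for order 14 ends in null.
echo "$pairs"
```

In SQL terms this behaves like a LEFT JOIN: every record of the left stream shows up, whether or not the lookup succeeds.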

JOIN(INDEX($customers[]; .CustomerID); $orders[]; .CustomerIDRef;
  "\(.[0].OrderID): \(.[1].CustomerName) ordered Product #\(.[0].ProductIDRef)."
)
10: Bill ordered Product #7.
11: Alfred ordered Product #7.
12: Bill ordered Product #14.
13: Bill ordered Product #7.
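
The join expression does not have to build a string. As a sketch of one alternative, the following merges each pair into a single object, much like a row in an SQL result set, and drops pairs without a match to get inner-join behaviour. Order 14 is again a hypothetical record added only to have an unmatched row, and removing CustomerIDRef is a cosmetic choice of mine, not something JOIN does on its own.

```shell
customers='[
  {"CustomerID": "1", "CustomerName": "Alfred"},
  {"CustomerID": "2", "CustomerName": "Bill"}
]'
# Orders 10 and 11 come from the table above; order 14 is hypothetical.
orders='[
  {"OrderID": "10", "CustomerIDRef": "2", "ProductIDRef": "7"},
  {"OrderID": "11", "CustomerIDRef": "1", "ProductIDRef": "7"},
  {"OrderID": "14", "CustomerIDRef": "9", "ProductIDRef": "7"}
]'

joined=$(jq -n -c --argjson customers "$customers" --argjson orders "$orders" '
  JOIN(INDEX($customers[]; .CustomerID); $orders[]; .CustomerIDRef;
    select(.[1] != null)     # keep matched pairs only (inner-join behaviour)
    | .[0] + .[1]            # merge the order and the customer into one object
    | del(.CustomerIDRef)    # drop the now-redundant reference column
  )')

# One merged object per matched order; order 14 is filtered out.
echo "$joined"
```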
