Understanding jq JOIN()
I am trying to understand jq's JOIN() builtin.
From the jq manual (https://stedolan.github.io/jq/manual):
JOIN($idx; stream; idx_expr; join_expr):
This builtin joins the values from the given stream to the given index.
The index's keys are computed by applying the given index expression to each value from the given stream.
An array of the value in the stream and the corresponding value from the index is fed to the given join expression to produce each result.
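I find this hard to understand without an example.
Could you demonstrate its usage with an example?
This function is meant to resemble the JOIN clause in SQL, which combines rows from two (or, in SQL, more) tables based on a related column between them.
Let's build some "tables".
The first is a list of orders, each with an ID plus ID references to the ordering customer and the ordered product: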
[
{
"OrderID": "10",
"CustomerIDRef": "2",
"ProductIDRef": "7"
},
{
"OrderID": "11",
"CustomerIDRef": "1",
"ProductIDRef": "7"
},
{
"OrderID": "12",
"CustomerIDRef": "2",
"ProductIDRef": "14"
},
{
"OrderID": "13",
"CustomerIDRef": "2",
"ProductIDRef": "7"
}
]
as $orders
The second is a list of customers, mapping each customer ID to a name:
[
{
"CustomerID": "1",
"CustomerName": "Alfred"
},
{
"CustomerID": "2",
"CustomerName": "Bill"
},
{
"CustomerID": "3",
"CustomerName": "Caroline"
}
]
as $customers
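Since jq's JOIN can only handle two tables at a time (joining more requires cascading), let's skip the missing Products table.
Before we get to JOIN, we first need to look at INDEX, which turns an array like the tables above into an object with each row's "primary key" as the field name. This is reasonable because field names are unique, so a lookup always returns at most one record.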
INDEX($customers[]; .CustomerID)
{
"1": {
"CustomerID": "1",
"CustomerName": "Alfred"
},
"2": {
"CustomerID": "2",
"CustomerName": "Bill"
},
"3": {
"CustomerID": "3",
"CustomerName": "Caroline"
}
}
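One consequence of using field names as keys is worth seeing in action: if two rows share the same key, only one of them can survive in the index. The sketch below assumes jq 1.6 or later run from a shell, and adds a made-up third row that reuses CustomerID "2" (the name "Bob" is hypothetical). Because INDEX assigns rows to keys one after another, the later row is expected to replace the earlier one.

```shell
# Sketch: INDEX with a duplicate key (assumes jq >= 1.6 is installed).
index_demo() {
  jq -n -c '
    [ {"CustomerID": "1", "CustomerName": "Alfred"},
      {"CustomerID": "2", "CustomerName": "Bill"},
      {"CustomerID": "2", "CustomerName": "Bob"} ] as $customers  # hypothetical duplicate ID
    | INDEX($customers[]; .CustomerID)   # one field per distinct CustomerID
    | map_values(.CustomerName)          # keep only the names for compact output
  '
}

index_demo
```

This should print {"1":"Alfred","2":"Bob"}, so INDEX only behaves like a primary-key lookup when the key column really is unique.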
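Now we can easily perform a JOIN between the orders (as the "left table") and their customers (as the "right table"). We supply the "right table" as an INDEXed object, the "left table" as a stream (.[]), and the "related column" as the field of the left table's objects that matches the right table's primary key (the field names in the lookup object). For now, let the last parameter simply be . and we get: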
JOIN(INDEX($customers[]; .CustomerID); $orders[]; .CustomerIDRef; .)
[
{
"OrderID": "10",
"CustomerIDRef": "2",
"ProductIDRef": "7"
},
{
"CustomerID": "2",
"CustomerName": "Bill"
}
]
[
{
"OrderID": "11",
"CustomerIDRef": "1",
"ProductIDRef": "7"
},
{
"CustomerID": "1",
"CustomerName": "Alfred"
}
]
[
{
"OrderID": "12",
"CustomerIDRef": "2",
"ProductIDRef": "14"
},
{
"CustomerID": "2",
"CustomerName": "Bill"
}
]
[
{
"OrderID": "13",
"CustomerIDRef": "2",
"ProductIDRef": "7"
},
{
"CustomerID": "2",
"CustomerName": "Bill"
}
]
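As you can see, we get a stream of arrays, one for each order. Each array has two elements: the record from the left table and the matching record from the right. Unsuccessful lookups would produce null on the right side.
Finally, the fourth parameter is the "join expression", which describes how to combine the two matching records. It is essentially a map over the resulting pairs.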
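To illustrate the null case, here is a small sketch, assuming jq 1.6 or later and a made-up order "14" that references a customer "9" who is not in the customers table. The shell variables ORDERS and CUSTOMERS and the function names left_join and inner_join are just illustrative helpers. The first function keeps unmatched orders (like a SQL LEFT JOIN) and substitutes a placeholder name; the second drops them with select, which approximates an INNER JOIN.

```shell
# Hypothetical data: order 14 points at a customer that does not exist.
ORDERS='[{"OrderID":"10","CustomerIDRef":"2"},{"OrderID":"14","CustomerIDRef":"9"}]'
CUSTOMERS='[{"CustomerID":"2","CustomerName":"Bill"}]'

# Keep every order; a failed lookup leaves null in .[1], so fall back with //.
left_join() {
  jq -n -r --argjson orders "$ORDERS" --argjson customers "$CUSTOMERS" '
    JOIN(INDEX($customers[]; .CustomerID); $orders[]; .CustomerIDRef;
         "\(.[0].OrderID): \(.[1].CustomerName // "unknown customer")")'
}

# Keep only orders whose lookup succeeded, then emit the order ID.
inner_join() {
  jq -n -r --argjson orders "$ORDERS" --argjson customers "$CUSTOMERS" '
    JOIN(INDEX($customers[]; .CustomerID); $orders[]; .CustomerIDRef;
         select(.[1] != null) | .[0].OrderID)'
}

left_join
inner_join
```

The first call should print "10: Bill" and "14: unknown customer"; the second should print only "10".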
JOIN(INDEX($customers[]; .CustomerID); $orders[]; .CustomerIDRef;
"\(.[0].OrderID): \(.[1].CustomerName) ordered Product #\(.[0].ProductIDRef)."
)
10: Bill ordered Product #7.
11: Alfred ordered Product #7.
12: Bill ordered Product #14.
13: Bill ordered Product #7.
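Because the join expression receives the two-element array, passing add merges each pair into a single object, and the merged stream can itself be fed into another JOIN. The following is a sketch of the cascading mentioned earlier, assuming jq 1.6 or later and a hypothetical Products table (the ProductID/ProductName fields and their values are made up for illustration, since the original data has no such table).

```shell
# Sketch: cascading two JOINs (assumes jq >= 1.6; $products is hypothetical).
cascade_demo() {
  jq -n -c '
    [ {"OrderID": "10", "CustomerIDRef": "2", "ProductIDRef": "7"},
      {"OrderID": "12", "CustomerIDRef": "2", "ProductIDRef": "14"} ] as $orders
    | [ {"CustomerID": "2", "CustomerName": "Bill"} ] as $customers
    | [ {"ProductID": "7",  "ProductName": "Widget"},
        {"ProductID": "14", "ProductName": "Gadget"} ] as $products
    # The inner JOIN merges each order with its customer; the outer JOIN
    # uses that merged stream as its "left table" and looks up the product.
    | JOIN(INDEX($products[]; .ProductID);
           JOIN(INDEX($customers[]; .CustomerID); $orders[]; .CustomerIDRef; add);
           .ProductIDRef;
           add)
    | {OrderID, CustomerName, ProductName}   # keep only the columns of interest
  '
}

cascade_demo
```

This should print one merged object per order, the first being {"OrderID":"10","CustomerName":"Bill","ProductName":"Widget"}. Since adding null to an object leaves the object unchanged, an unmatched row would keep its left-hand fields rather than fail.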