Is there a way to count the total number of records in a PCollection in Python?
I need the total number of records produced after joining the fact and dimension BQ tables using a PCollection.
all_dim_joined_pcol = join_fact_dim_tbl_obj.join_fact_dim_using_cogbk()
I want the number of records in the PCollection above, all_dim_joined_pcol.
I found a solution that uses Count.Globally() to count the elements in a PCollection. Count is defined in the apache_beam.transforms.combiners module.
from apache_beam.transforms.combiners import Count

counts = self.all_dim_joined_pcol | Count.Globally()

def collect(count):
    # Count.Globally() emits a single element: the total number of records.
    print("Count value is:", count)
    message = "Join done successfully between {} and {} having count as {}".format(tbl1, tbl2, count)
    return message

counts | "printing record count for " + fact_table_name + dimension_table_name >> beam.Map(collect)
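For anyone who wants to try this outside the original job, here is a minimal sketch of the same idea. It assumes apache-beam is installed and uses the default local runner. The sample rows, the table names "fact_tbl" and "dim_tbl", and the helper names format_join_message and run_count_demo are made up for illustration and are not part of the code above.

```python
def format_join_message(tbl1, tbl2, count):
    """Build the status message for one join; `count` is a plain int."""
    return "Join done successfully between {} and {} having count as {}".format(
        tbl1, tbl2, count
    )


def run_count_demo():
    # Imported here so format_join_message stays usable without Beam installed.
    import apache_beam as beam

    # With no options given, the pipeline runs on the local DirectRunner.
    with beam.Pipeline() as pipeline:
        # Hypothetical stand-in for the joined fact/dimension PCollection.
        joined = pipeline | "create sample rows" >> beam.Create(
            [{"id": 1}, {"id": 2}, {"id": 3}]
        )
        (
            joined
            # Emits exactly one element: the number of rows in `joined`.
            | "count rows" >> beam.combiners.Count.Globally()
            | "format message"
            >> beam.Map(lambda n: format_join_message("fact_tbl", "dim_tbl", n))
            | "print message" >> beam.Map(print)
        )
```

The count only exists as an element inside the pipeline, so anything that needs it (logging, building a message, writing it out) is best done in a downstream transform such as beam.Map, rather than by appending to a Python list defined outside the pipeline, which may not be shared with workers on a distributed runner.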