PySpark: extract the JSON value column and POST it via REST using requests

I have a DataFrame in PySpark with 1 row and 1 column, named json:

----------------------------------------------------------------------
|json                                                                |
----------------------------------------------------------------------
|[{"a":{"b":0,"c":{"50":0.005,"60":0,"100":0},"d":0.01,"e":0,"f":2}}]|
----------------------------------------------------------------------

I need to extract the JSON value and POST it to a REST endpoint using requests.

from pyspark.sql import SparkSession
import json

spark = SparkSession.builder.appName("AuthorsAges").getOrCreate()

# Create a one-row, one-column DataFrame holding the JSON text
data_df = spark.createDataFrame(
    [['[{"a":{"b":0,"c":{"50":0.005,"60":0,"100":0},"d":0.01,"e":0,"f":2}}]']],
    ["json"],
)
data_df.show(1, False)

# Pull the single cell back to the driver as a Python string
extract_text = data_df.collect()[0][0]

# The text is a JSON array holding one object; parse it and take that object
extract_json = json.loads(extract_text)[0]

# You can access any of the JSON fields like this afterwards
print(extract_json['a']['c'])
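
The snippet above stops at parsing, so here is a minimal sketch of the POST step, under some assumptions. The helper name post_payload, its post parameter (a hook for swapping in a different sender) and the URL in the usage note are all hypothetical. It also assumes the endpoint accepts the whole JSON array as the request body. requests is imported inside the function only so the parsing logic can be exercised without the library installed.

```python
import json


def post_payload(url, extract_text, post=None, timeout=10):
    """Parse the JSON text from the DataFrame cell and POST it to `url`.

    `post` is a hypothetical hook: any callable with the same keyword
    arguments as requests.post. It defaults to requests.post.
    """
    if post is None:
        import requests  # third-party: pip install requests
        post = requests.post

    # Validate the text by parsing it; json.loads raises on malformed input
    payload = json.loads(extract_text)

    # json=... serializes the payload and sets Content-Type: application/json
    response = post(url, json=payload, timeout=timeout)

    # Raise an exception on 4xx/5xx answers instead of failing silently
    response.raise_for_status()
    return response
```

With the extract_text value collected from the DataFrame, a call might look like post_payload("https://example.invalid/api", extract_text), where the URL is a placeholder for your real endpoint. If the service expects only the inner object rather than the array, pass json.dumps(extract_json) as the text instead.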