Convert DataFrame to list of rows (PySpark, AWS Glue)
How can I convert my DataFrame df into a list of rows?
Code
df = glueContext.create_dynamic_frame_from_options(
    connection_type="s3",
    connection_options={"paths": ["s3://data/tmp1/file.csv"]},
    format="csv",
)
df = df.toDF()
list = df.values.tolist()
Error
dataframe has no attribute values
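IMHO, you can use toPandas(). It is a method of the Spark DataFrame, so the Glue DynamicFrame has to be converted with toDF() first. A pandas DataFrame does have the values attribute: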
df = glueContext.create_dynamic_frame_from_options(
    connection_type="s3",
    connection_options={"paths": ["s3://data/tmp1/file.csv"]},
    format="csv")
# DynamicFrame -> Spark DataFrame -> pandas DataFrame (loads all rows onto the driver)
df = df.toDF().toPandas()
liste = df.values.tolist()
In Glue, you can use the DynamicFrame.map() method (https://docs.aws.amazon.com/glue/latest/dg/aws-glue-api-crawler-pyspark-extensions-dynamic-frame.html#aws-glue-api-crawler-pyspark-extensions-dynamic-frame-map):
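def to_list(rec):
    # Combine the two columns into a single list-valued field
    rec["list"] = [rec["col1"], rec["col2"]]
    del rec["col1"]
    del rec["col2"]
    return rec  # map() builds the new frame from the returned records

# df must still be a DynamicFrame here (call map() before toDF())
mapped = df.map(f=to_list)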
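The per-record function can be checked locally before it goes into a Glue job. This is only a rough sketch: plain Python dicts stand in for the records that DynamicFrame.map() would pass in, and the sample values and the names sample, mapped and rows are made up for illustration. The column names col1 and col2 are the ones used in the answer.

```python
def to_list(rec):
    """Fold col1 and col2 into a single "list" field and drop the originals."""
    rec["list"] = [rec["col1"], rec["col2"]]
    del rec["col1"]
    del rec["col2"]
    return rec  # the mapped record has to be returned


# Hypothetical sample rows; in a Glue job these would come from the DynamicFrame.
sample = [
    {"col1": "a", "col2": "1"},
    {"col1": "b", "col2": "2"},
]

# Copy each dict so the sample data is not modified in place.
mapped = [to_list(dict(r)) for r in sample]

# Pull out just the list-valued field to get a list of rows.
rows = [r["list"] for r in mapped]
print(rows)
```

If the function forgets the return statement, mapped ends up as a list of None values, which is an easy thing to spot in a local run like this.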