pyspark: TypeError: condition should be a Column with with otherwise
I wrote a function that reads a condition from a parameter file and adds a column value based on that condition, but I keep getting the error TypeError: condition should be a Column:
condition = "type_txt = 'clinic'"
input_df = input_df.withColumn(
    "prm_data_category",
    F.when(condition, F.lit("clinic"))  # this doesn't work
    .when(F.col("type_txt") == 'office', F.lit("office"))  # this works
    .otherwise(F.lit("other")),
)
Is there a way to use the condition as a SQL condition, so it can be passed in easily as a parameter rather than as a Column?
You can use a SQL expression via F.expr, which parses the condition string into a Column:
from pyspark.sql import functions as F

condition = "type_txt = 'clinic'"
input_df1 = input_df.withColumn(
    "prm_data_category",
    F.when(F.expr(condition), F.lit("clinic"))
    .when(F.col("type_txt") == 'office', F.lit("office"))
    .otherwise(F.lit("other")),
)