PySpark SQL: how do I use the WHERE keyword?
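How do I use the WHERE keyword to get the sex of the passengers who survived the Titanic disaster, along with each sex's percentage of the survivors?
My code: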
spark.sql(
"SELECT Sex Where Survived=1 ,count(Sex) \
as gender_count,count(sex)*100/sum(count(sex)) over() \
as percent from titanic_table GROUP BY sex"
).show()
Error:
ParseException: "
mismatched input ',' expecting <EOF>(line 1, pos 28)
== SQL ==
SELECT Sex Where Survived=1 ,count(Sex)
as gender_count,count(sex)*100/sum(count(sex)) over()
as percent from titanic_table GROUP BY sex
----------------------------^^^
"
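WHERE is a clause of its own, not part of the SELECT list. You should put it after FROM and before GROUP BY. In your query the parser reads "SELECT Sex Where Survived=1" as a complete statement, which is why it rejects the comma at position 28.
Your code should be: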
spark.sql("""
    SELECT Sex,
           count(Sex) AS gender_count,
           100 * count(Sex) / sum(count(Sex)) OVER () AS percent
    FROM titanic_table
    WHERE Survived = 1
    GROUP BY Sex
""").show()
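If you want to check the clause order without a Spark session, here is a minimal sketch that runs a query of the same shape through Python's built-in sqlite3 module. It is only a stand-in for Spark, chosen because it needs no setup. The sample rows are made up for illustration and are not the real Titanic data, the helper name survivor_share is hypothetical, and the sketch assumes the bundled SQLite is version 3.25 or newer (needed for window functions). The literal 100.0 is used because SQLite does integer division on integers.

```python
import sqlite3

# Made-up sample rows (Sex, Survived); NOT the real Titanic data.
SAMPLE_ROWS = [
    ("female", 1), ("female", 1), ("female", 1), ("female", 0),
    ("male", 1), ("male", 0), ("male", 0), ("male", 0),
]

# Clause order: SELECT ... FROM ... WHERE ... GROUP BY ... ORDER BY
QUERY = """
    SELECT Sex,
           count(Sex) AS gender_count,
           100.0 * count(Sex) / sum(count(Sex)) OVER () AS percent
    FROM titanic_table
    WHERE Survived = 1
    GROUP BY Sex
    ORDER BY Sex
"""


def survivor_share(rows):
    """Load rows into an in-memory table and return (Sex, count, percent) tuples."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE titanic_table (Sex TEXT, Survived INTEGER)")
        conn.executemany("INSERT INTO titanic_table VALUES (?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


result = survivor_share(SAMPLE_ROWS)
for sex, gender_count, percent in result:
    print(sex, gender_count, percent)
```

Because WHERE is applied before GROUP BY, the non-survivors are dropped first, so the counts and percentages are computed over survivors only.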