Spark SQL query with a percent sign (%) in a column name
I can't get a Spark SQL query to work when the column name contains a % sign. Suppose I have the following DataFrame:
val df = sc.parallelize(Seq(("Peter",123,23.5),("John",45,45.5)))
.toDF("Name","Age","score(%)")
df.show
The table looks like this:
+-----+---+--------+
| Name|Age|score(%)|
+-----+---+--------+
|Peter|123| 23.5|
| John| 45| 45.5|
+-----+---+--------+
I can do:
sqlContext.sql("SELECT Name FROM df")
which shows:
+-----+
| Name|
+-----+
|Peter|
| John|
+-----+
But when I do:
sqlContext.sql("SELECT score(%) FROM df")
it throws the following (the % seems to be causing the problem; I tried escaping it as \% but that didn't help):
java.lang.RuntimeException: [1.14] failure: ``distinct'' expected but `%' found
SELECT score(%) FROM df
^
at scala.sys.package$.error(package.scala:27)
at org.apache.spark.sql.catalyst.AbstractSparkSQLParser.parse(AbstractSparkSQLParser.scala:36)
at org.apache.spark.sql.catalyst.DefaultParserDialect.parse(ParserDialect.scala:67)
at org.apache.spark.sql.SQLContext$$anonfun.apply(SQLContext.scala:175)
at org.apache.spark.sql.SQLContext$$anonfun.apply(SQLContext.scala:175)
at org.apache.spark.sql.SparkSQLParser$$anonfun$org$apache$spark$sql$SparkSQLParser$$others.apply(SparkSQLParser.scala:115)
at org.apache.spark.sql.SparkSQLParser$$anonfun$org$apache$spark$sql$SparkSQLParser$$others.apply(SparkSQLParser.scala:114)
at scala.util.parsing.combinator.Parsers$Success.map(Parsers.scala:137)
at scala.util.parsing.combinator.Parsers$Success.map(Parsers.scala:136)
at scala.util.parsing.combinator.Parsers$Parser$$anonfun$map.apply(Parsers.scala:237)
at scala.util.parsing.combinator.Parsers$Parser$$anonfun$map.apply(Parsers.scala:237)
at scala.util.parsing.combinator.Parsers$$anon.apply(Parsers.scala:217)
at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$$anonfun$apply.apply(Parsers.scala:249)
at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$$anonfun$apply.apply(Parsers.scala:249)
at scala.util.parsing.combinator.Parsers$Failure.append(Parsers.scala:197)
at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append.apply(Parsers.scala:249)
at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append.apply(Parsers.scala:249)
at scala.util.parsing.combinator.Parsers$$anon.apply(Parsers.scala:217)
at scala.util.parsing.combinator.Parsers$$anon$$anonfun$apply.apply(Parsers.scala:882)
at scala.util.parsing.combinator.Parsers$$anon$$anonfun$apply.apply(Parsers.scala:882)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
at scala.util.parsing.combinator.Parsers$$anon.apply(Parsers.scala:881)
at scala.util.parsing.combinator.PackratParsers$$anon.apply(PackratParsers.scala:110)
at org.apache.spark.sql.catalyst.AbstractSparkSQLParser.parse(AbstractSparkSQLParser.scala:34)
at org.apache.spark.sql.SQLContext$$anonfun.apply(SQLContext.scala:172)
at org.apache.spark.sql.SQLContext$$anonfun.apply(SQLContext.scala:172)
at org.apache.spark.sql.execution.datasources.DDLParser.parse(DDLParser.scala:42)
at org.apache.spark.sql.SQLContext.parseSql(SQLContext.scala:195)
at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:725)
... 48 elided
(I ran into this while writing a Spark job that ingests a large number of CSVs with spark-csv. When I try to run a SQL SELECT, I hit this % problem. I'd like to avoid modifying the headers if at all possible...)
Try using backticks to delimit the column name:
sqlContext.sql("SELECT `score(%)` FROM df")