Spark SQL analog for Redshift (Oracle) "SYSDATE" or "CURRENT_DATE"

Does Spark SQL provide any date function for getting the current date, analogous to Redshift's (Oracle's) SYSDATE or CURRENT_DATE?

I found a solution to my problem by using org.apache.spark.sql.hive.HiveContext instead of org.apache.spark.sql.SQLContext. The following code now works as expected:

import org.apache.spark.sql.hive.HiveContext

lazy val sc = ... // create the SparkContext
lazy val hc = new HiveContext(sc)

// current_date is resolved through Hive's function registry
val results = hc.sql("SELECT record_name AS name FROM test_table WHERE day < current_date")
results.take(10)
  .map(r => s"name: ${r.getAs("name")}")
  .foreach(println)

current_date() is a built-in function as of Spark 1.5.

Example usage:

sqlContext.sql("SELECT current_date() as today FROM eventsAvro").first

Output:

res360: org.apache.spark.sql.Row = [2016-01-29]
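For completeness, since Spark 1.5 the same function is also exposed on the DataFrame side as org.apache.spark.sql.functions.current_date. A minimal sketch, assuming a DataFrame df with a date column named day (both names are illustrative, not from the original question):

```scala
import org.apache.spark.sql.functions.current_date

// Keep rows whose `day` column precedes today's date,
// mirroring the SQL `WHERE day < current_date` above.
val filtered = df.filter(df("day") < current_date())

// Or materialize today's date as a new column:
val withToday = df.withColumn("today", current_date())
```

This avoids string SQL entirely, so it works with a plain SQLContext as well as a HiveContext.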