Snowflake vs Spark - Insufficient privileges to operate on schema

I created a new free standard Snowflake account "xxxxx", and I can access the default databases, schemas, and tables from the Snowflake web UI.

I then tried to connect to Snowflake from Spark:

  import org.apache.spark.sql.{DataFrame, SparkSession}

  val spark = SparkSession.builder()
    .master("local[1]")
    .appName("SparkByExamples.com")
    .getOrCreate()

  // The session database and schema point at the shared sample data
  val sfOptions = Map(
    "sfURL" -> "https://xxxxxx.us-east-1.snowflakecomputing.com/",
    "sfAccount" -> "xxxxxx",
    "sfUser" -> "xxxxx",
    "sfPassword" -> "#########",
    "sfDatabase" -> "snowflake_sample_data",
    "sfSchema" -> "tpch_sf1"
  )

  val df: DataFrame = spark.read
    .format("net.snowflake.spark.snowflake")
    .options(sfOptions)
    .option("query", "SELECT l_returnflag,l_linestatus,sum(l_quantity) as sum_qty FROM lineitem GROUP BY l_returnflag,l_linestatus")
    .load()

After running this, I get the error "Insufficient privileges to operate on schema 'TPCH_SF1'". Can anyone help?

20/02/23 19:35:12 WARN SnowflakeStrategy: Pushdown failed: SQL access control error: Insufficient privileges to operate on schema 'TPCH_SF1'
20/02/23 19:35:13 INFO SnowflakeSQLStatement: Spark Connector Master: execute query with bind variable: create temporary stage if not exists identifier(?)
20/02/23 19:35:13 WARN SnowflakeStrategy: Pushdown failed: SQL access control error: Insufficient privileges to operate on schema 'TPCH_SF1'
20/02/23 19:35:14 INFO SnowflakeSQLStatement: Spark Connector Master: execute query with bind variable: create temporary stage if not exists identifier(?)
Exception in thread "main" net.snowflake.client.jdbc.SnowflakeSQLException: SQL access control error: Insufficient privileges to operate on schema 'TPCH_SF1'
    at net.snowflake.client.jdbc.SnowflakeUtil.checkErrorAndThrowExceptionSub(SnowflakeUtil.java:152)
    at net.snowflake.client.jdbc.SnowflakeUtil.checkErrorAndThrowException(SnowflakeUtil.java:77)
    at net.snowflake.client.core.StmtUtil.pollForOutput(StmtUtil.java:495)
    at net.snowflake.client.core.StmtUtil.execute(StmtUtil.java:372)
    at net.snowflake.client.core.SFStatement.executeHelper(SFStatement.java:575)
    at net.snowflake.client.core.SFStatement.executeQueryInternal(SFStatement.java:265)
    at net.snowflake.client.core.SFStatement.executeQuery(SFStatement.java:203)
    at net.snowflake.client.core.SFStatement.execute(SFStatement.java:874)

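When you use the Spark connector, you need the CREATE STAGE privilege on the schema the session uses. As the log shows, the connector tries to create a temporary stage in that schema, and you are not allowed to do that in the shared tpch_sf1 schema. Instead of tpch_sf1, use a schema in which you have the privilege to create stages, and refer to the table by its fully qualified name (database, schema, table name):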

  // Session database and schema: ones where you hold the CREATE STAGE privilege
  val sfOptions = Map(
    "sfURL" -> "https://xxxxx.us-east-1.snowflakecomputing.com/",
    "sfAccount" -> "xxxxx",
    "sfUser" -> "xxxxx",
    "sfPassword" -> "#########",
    "sfDatabase" -> "your_own_database",
    "sfSchema" -> "your_own_Schema"
  )

  // The sample table is addressed by its fully qualified name
  val df: DataFrame = spark.read
    .format("net.snowflake.spark.snowflake")
    .options(sfOptions)
    .option("query", "SELECT l_returnflag,l_linestatus,sum(l_quantity) as sum_qty FROM snowflake_sample_data.tpch_sf1.lineitem GROUP BY l_returnflag,l_linestatus")
    .load()

Another option is to create a view over the shared database table in a schema you have access to, and then query that view:

CREATE VIEW PUBLIC.TPCH_SF10_REGION 
AS 
  SELECT * 
  FROM   "SNOWFLAKE_SAMPLE_DATA"."TPCH_SF10"."REGION"
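
Once the view exists, reading it from Spark could look roughly like the sketch below. It assumes the view was created in the PUBLIC schema of a database you own (the placeholder your_own_database from the earlier snippet) and that your role may create stages in that schema. The object name ReadSnowflakeView and the helper readRegionView are made up for illustration; the account and credential values are the same placeholders as above. It reads the view through the connector's dbtable option rather than query.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object ReadSnowflakeView {
  // Placeholder connection settings (assumptions): substitute your own values.
  // sfDatabase must be the database in which the view was created.
  val sfOptions: Map[String, String] = Map(
    "sfURL" -> "https://xxxxx.us-east-1.snowflakecomputing.com/",
    "sfAccount" -> "xxxxx",
    "sfUser" -> "xxxxx",
    "sfPassword" -> "#########",
    "sfDatabase" -> "your_own_database",
    "sfSchema" -> "PUBLIC"
  )

  // Reads the whole view by name; it is resolved against sfDatabase and sfSchema.
  def readRegionView(spark: SparkSession): DataFrame =
    spark.read
      .format("net.snowflake.spark.snowflake")
      .options(sfOptions)
      .option("dbtable", "TPCH_SF10_REGION")
      .load()

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("ReadSnowflakeView")
      .getOrCreate()

    readRegionView(spark).show()
    spark.stop()
  }
}
```

Because the session schema is now one of your own, the temporary stage the connector creates should land there instead of in the shared sample schema.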