Does caching a new table with an existing table name remove old contents from memory?

Using Spark 1.5.2:

dfOld.registerTempTable("oldTableName")
hiveContext.cacheTable("oldTableName") 
// ....
// do something
// ....
dfNew.registerTempTable("oldTableName")
hiveContext.cacheTable("oldTableName") 

Now, when I use the "oldTableName" table, I do get the latest contents from dfNew, but have the contents of dfOld been removed from memory?

Or is this the correct way to do it:

dfOld.registerTempTable("oldTableName")
hiveContext.cacheTable("oldTableName") 
// ....
// do something
// ....
dfNew.registerTempTable("oldTableName")
hiveContext.uncacheTable("oldTableName") // <========== un-cache the old contents first
hiveContext.cacheTable("oldTableName") 

No. The old contents are not uncached until you explicitly ask Spark's CacheManager to do so, using hiveContext.uncacheTable("tableName") or hiveContext.clearCache() [warning: the latter uncaches all tables].
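A minimal sketch of one way to swap the cached contents behind a table name, assuming the Spark 1.5.x `SQLContext` API (`HiveContext` extends it). The helper `recacheAs` and the column names `name` and `hex` are hypothetical, introduced only for illustration. The helper uncaches before re-registering on the assumption that `uncacheTable` resolves the name to whatever plan is currently registered under it, so the call has to happen while the name still points at the old DataFrame.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.{DataFrame, SQLContext}
import scala.util.Try

object RecacheSketch {

  // Hypothetical helper, not part of Spark: release whatever is cached under
  // `tableName`, then register and cache `df` under the same name.
  def recacheAs(sqlContext: SQLContext, df: DataFrame, tableName: String): Unit = {
    // isCached throws if the table does not exist yet, so treat that as "not cached".
    val oldIsCached = Try(sqlContext.isCached(tableName)).getOrElse(false)
    // Uncache while the name still resolves to the old DataFrame.
    if (oldIsCached) sqlContext.uncacheTable(tableName)
    df.registerTempTable(tableName)
    sqlContext.cacheTable(tableName)
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("recache-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._

    // Sample rows taken from the spark-shell session in this answer;
    // the column names are assumptions.
    val dfOld = sc.parallelize(
      Seq(("blue", "#0033FF"), ("red", "#FF0000"), ("green", "#FSKA"))).toDF("name", "hex")
    val dfNew = sc.parallelize(
      Seq(("blue", "#0033FF"), ("red", "#FF0000"))).toDF("name", "hex")

    recacheAs(sqlContext, dfOld, "myColorsTable") // first registration: nothing to uncache
    recacheAs(sqlContext, dfNew, "myColorsTable") // uncaches dfOld, then caches dfNew

    println(sqlContext.isCached("myColorsTable"))
    sc.stop()
  }
}
```

If only one table needs to be released, this per-table approach avoids clearCache(), which would also drop every other cached table.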
Proof: in an experiment, the "Storage" tab clearly showed duplicate entries for the same table. For this code snippet:

dfOld.registerTempTable("myColorsTable")
hiveContext.cacheTable("myColorsTable") 
// ....
// do something
// ....
dfNew.registerTempTable("myColorsTable")
hiveContext.cacheTable("myColorsTable") 

In ./bin/spark-shell:

scala> df.collect
res54: Array[org.apache.spark.sql.Row] = Array([blue,#0033FF], [red,#FF0000], [green,#FSKA])  <=== 3 rows

scala> df2.collect
res55: Array[org.apache.spark.sql.Row] = Array([blue,#0033FF], [red,#FF0000])  <=== 2 rows

scala> df.registerTempTable("myColorsTable")

scala> sqlContext.isCached("myColorsTable") 
res58: Boolean = false

scala> sqlContext.cacheTable("myColorsTable") <=== cache table in df(3 rows)

scala> sqlContext.isCached("myColorsTable")
res60: Boolean = true

scala> sqlContext.sql("select * from myColorsTable").foreach(println) <=== sql is running on df(3 rows)
[blue,#0033FF]
[red,#FF0000]
[green,#FSKA]

scala> df2.registerTempTable("myColorsTable") <=== register another table with the same table name

scala> sqlContext.isCached("myColorsTable")
res63: Boolean = false

scala> sqlContext.sql("select * from myColorsTable").foreach(println) <=== sql is running on df2(2 rows)
[blue,#0033FF]
[red,#FF0000]

scala> sqlContext.cacheTable("myColorsTable")
15/12/19 09:53:55 WARN CacheManager: Asked to cache already cached data. <=====

From CacheManager#cacheQuery():

 if (lookupCachedData(planToCache).nonEmpty) {
      logWarning("Asked to cache already cached data.")
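The check quoted above looks up cached data by the plan being cached (`planToCache`), not by table name. The following toy model is only an illustration of that idea and is not Spark's implementation: `Plan`, `ToyCacheManager`, and `ToyContext` are hypothetical stand-ins. It shows how re-registering a name can leave the old entry behind while `isCached` reports false for the new one.

```scala
import scala.collection.mutable

// Hypothetical stand-in for a DataFrame's logical plan.
final case class Plan(id: String)

// Toy cache keyed by plan, loosely mirroring the lookup in the quoted check.
final class ToyCacheManager {
  private val cached = mutable.ArrayBuffer.empty[Plan]

  def cacheQuery(plan: Plan): Unit =
    if (cached.contains(plan)) println("Asked to cache already cached data.")
    else cached += plan

  def uncacheQuery(plan: Plan): Unit = cached -= plan
  def isCached(plan: Plan): Boolean = cached.contains(plan)
  def entries: List[Plan] = cached.toList
}

// Toy context: temp table names are just pointers to plans.
final class ToyContext {
  val cacheManager = new ToyCacheManager
  private val tempTables = mutable.Map.empty[String, Plan]

  def registerTempTable(name: String, plan: Plan): Unit = tempTables(name) = plan
  def cacheTable(name: String): Unit = cacheManager.cacheQuery(tempTables(name))
  def uncacheTable(name: String): Unit = cacheManager.uncacheQuery(tempTables(name))
  def isCached(name: String): Boolean = cacheManager.isCached(tempTables(name))
}

object ToyCacheDemo {
  def main(args: Array[String]): Unit = {
    val ctx = new ToyContext
    ctx.registerTempTable("myColorsTable", Plan("df"))
    ctx.cacheTable("myColorsTable")

    // Re-pointing the name does not touch the entry cached for Plan("df").
    ctx.registerTempTable("myColorsTable", Plan("df2"))
    println(ctx.isCached("myColorsTable")) // the name now resolves to an uncached plan

    ctx.cacheTable("myColorsTable")
    println(ctx.cacheManager.entries) // both plans are still held by the cache
  }
}
```

In this model, calling `uncacheTable` before the second `registerTempTable` is what removes the old entry, because afterwards the name no longer leads to the old plan.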