Apache Spark join/cogroup on generic type RDD
I have a question about the join and cogroup methods on RDDs. In detail, I have to join two RDDs, one of which holds values of a generic type used with a wildcard.
val indexedMeasures = measures.map(m => (m.id(), m)) // RDD[(String, Measure[_])]
val indexedRegistry = registry.map(r => (r.id, r)) // RDD[(String, Registry)]
indexedRegistry.cogroup(indexedMeasures)
The last statement gives a compile-time error, as follows:
no type parameters for method cogroup: (other: org.apache.spark.rdd.RDD[(String, W)])org.apache.spark.rdd.RDD[(String, (Iterable[Registry],
Iterable[W]))] exist so that it can be applied to arguments (org.apache.spark.rdd.RDD[(String, Measure[?0]) forSome { type ?0 }]) --- because --- argument expression's type is not compatible
with formal parameter type; found : org.apache.spark.rdd.RDD[(String, Measure[?0]) forSome { type ?0 }] required: org.apache.spark.rdd.RDD[(String, ?W)] Note: (String,
Measure[?0]) forSome { type ?0 } >: (String, ?W), but class RDD is invariant in type T. You may wish to define T as -T instead. (SLS 4.5)
What is going on here? Why can't I cogroup RDDs that use a generic wildcard type?
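To make the mismatch concrete, the two type shapes involved can be spelled out as plain Scala types. This is only an illustration; the Measure trait below is a hypothetical stand-in for the real one:

object TypesInvolved {
  import scala.language.existentials
  import org.apache.spark.rdd.RDD

  trait Measure[T] { def id(): String } // hypothetical stand-in for the question's Measure

  // What scalac infers for measures.map(m => (m.id(), m)) when measures is RDD[Measure[_]]:
  // the existential quantifier is scoped over the whole pair, not just the value slot.
  type Inferred = RDD[(String, Measure[t]) forSome { type t }]

  // What cogroup[W] expects to receive: the value slot must be one single type W.
  type Expected[W] = RDD[(String, W)]
}

Because RDD is invariant in its element type, Inferred only conforms to Expected[W] if the pair types are equivalent, and no single W makes (String, W) equivalent to the existentially quantified pair, which is exactly what the error message says.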
Thanks for all the replies. The problem turns out to be the one discussed in the paper Towards Equal Rights for Higher-kinded Types:
Generics are a very popular feature of contemporary OO languages, such as Java, C# or Scala. Their support for genericity is lacking, however. The problem is that they only support abstracting over proper types, and not over generic types. This limitation makes it impossible to, e.g., define a precise interface for Iterable, a core abstraction in Scala’s collection API. We implemented “type constructor polymorphism” in Scala 2.5, which solves this problem at the root, thus greatly reducing the duplication of type signatures and code.
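For what it's worth, one workaround that I believe typechecks is to annotate indexedMeasures explicitly as RDD[(String, Measure[_])], so that the existential stays inside the value slot of the tuple and cogroup can infer W as Measure[_]. Below is a minimal sketch under that assumption, with hypothetical stand-ins for the Measure and Registry types from the question:

object CogroupWorkaround {
  import org.apache.spark.rdd.RDD

  trait Measure[T] { def id(): String } // hypothetical stand-in
  case class Registry(id: String)       // hypothetical stand-in

  // On Spark versions before 1.3, also import org.apache.spark.SparkContext._
  // to bring the pair-RDD operations (cogroup, join, ...) into scope.
  def cogroupMeasures(measures: RDD[Measure[_]], registry: RDD[Registry])
      : RDD[(String, (Iterable[Registry], Iterable[Measure[_]]))] = {
    // The explicit annotation keeps the existential scoped to the value type,
    // so cogroup's type parameter W can be inferred as Measure[_].
    val indexedMeasures: RDD[(String, Measure[_])] = measures.map(m => (m.id(), m))
    val indexedRegistry: RDD[(String, Registry)] = registry.map(r => (r.id, r))
    indexedRegistry.cogroup(indexedMeasures)
  }
}

The same annotation trick (or an explicit cast to the annotated type) should apply to join as well, since it has the same RDD[(K, W)] parameter shape.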