Scala - merge a list to map

I need to merge a list from an RDD into a set, but I can't work out how to do it in Scala:

var accounts = Set("name" -> "", "id" -> 0, ....)

//Split the RDD into lines and split each line by `|` to get the values
stream.foreachRDD {_.map(_._2).flatMap(_.split("\\|")).foreach(f => /*merge here ?*/)}

How can I associate these values with my accounts set?

For example, suppose an RDD is loaded from a CSV (I made this data up):

 Donald|Trump|US|Election|March|Spring|Rubio|Ted Cruz|Ben Carson|Primary|Winner|...
 Donald|Trump|US|Election|March|Spring|Rubio|Ted Cruz|Ben Carson|Primary|Winner|...
 Donald|Trump|US|Election|March|Spring|Rubio|Ted Cruz|Ben Carson|Primary|Winner|...
 Donald|Trump|US|Election|March|Spring|Rubio|Ted Cruz|Ben Carson|Primary|Winner|...
 Donald|Trump|US|Election|March|Spring|Rubio|Ted Cruz|Ben Carson|Primary|Winner|...
 Donald|Trump|US|Election|March|Spring|Rubio|Ted Cruz|Ben Carson|Primary|Winner|...
 Donald|Trump|US|Election|March|Spring|Rubio|Ted Cruz|Ben Carson|Primary|Winner|...
 ...

The RDD has up to 300 columns/fields.

My main objective is to convert this to JSON, but I need to associate each value with a key by loading the values into a map or a class.

var election = Map("firstname" -> "Donald",
"lastname" -> "Trump",
"country" -> "US",
"event" -> "Election",
"period" -> "March",
"var1" -> "Spring",
 ....
"varN" -> "...")

I'm not sure I've understood you correctly, but does this help?

val data = List(
  "Donald|Trump|US|Election|March",
  "John|Smith|UK|Election|February"
)

val mapKeys = List("firstname", "lastname", "country", "event", "period")

val election = data.map { row =>
  (mapKeys zip row.split("\\|").toList).map {
    case (key, value) => key -> value
  }.toMap
}

So you end up with a list of maps: for each row of your data you get a map containing the key/value pairs, as you described.
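One detail worth spelling out is how the delimiter is passed to `split`. The sketch below is only an illustration (the object name `SplitDemo` and the sample rows are made up for this example). It shows that a `String` argument is treated as a regular expression, so the pipe has to be escaped or passed as a `Char`, and that trailing empty fields are dropped unless a negative limit is given. Because `zip` stops at the shorter of its two inputs, dropped trailing fields would silently leave the last keys out of the resulting map.

```scala
object SplitDemo {
  val row = "Donald|Trump|US"

  // "|" is the regex alternation operator, so this matches the empty
  // string and the pipe itself comes back as one of the pieces.
  val unescaped: Array[String] = row.split("|")

  // Escaping the pipe, or passing it as a Char, splits on the literal character.
  val escaped: Array[String] = row.split("\\|")   // Array("Donald", "Trump", "US")
  val byChar: Array[String]  = row.split('|')     // Array("Donald", "Trump", "US")

  // Trailing empty fields are removed by default ...
  val dropped: Array[String] = "Donald|Trump||".split("\\|")      // 2 elements
  // ... and kept when a negative limit is passed.
  val kept: Array[String]    = "Donald|Trump||".split("\\|", -1)  // 4 elements
}
```

If every row has to produce the same set of keys, using the `-1` limit before zipping with `mapKeys` is probably the safer choice.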

A bit of cleanup on @slouc's answer:

stream.foreachRDD {_.map(_._2).map(l => (mapKeys zip l.split("\\|")).toMap).saveToEs(conf)}
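The question also mentions loading the values into a class rather than a map. A minimal sketch of that variant follows, assuming only the first five fields are needed. The `Election` case class and its `fromLine` helper are hypothetical names introduced for this example, not part of either answer. Rows with fewer than five fields yield `None` instead of throwing.

```scala
// Hypothetical record type; the field names mirror the keys used above.
final case class Election(
  firstname: String,
  lastname: String,
  country: String,
  event: String,
  period: String
)

object Election {
  // Parse one pipe-delimited line. The -1 limit keeps trailing empty
  // fields, and `_*` ignores any columns after the fifth.
  def fromLine(line: String): Option[Election] =
    line.split("\\|", -1) match {
      case Array(first, last, country, event, period, _*) =>
        Some(Election(first, last, country, event, period))
      case _ =>
        None // fewer than five fields
    }
}
```

A map scales more naturally to the 300 columns mentioned in the question; a class like this mainly pays off when a small, fixed subset of fields is used with typed access.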