How to pass a constant value to a map function in Spark Scala
val SCHEMA : Schema = ....
dStream.map(b => deserialize(b))
def deserialize(b: Array[Byte]): GenericRecord = {
  new GenericDatumReader[GenericRecord](SCHEMA)
    .read(null, DecoderFactory.get().jsonDecoder(SCHEMA, new ByteArrayInputStream(b)))
}
I need to pass SCHEMA to the map function. How can I pass the SCHEMA variable into the deserialize method?
Use currying:
def deserialize(schema: Schema)(b: Array[Byte]): GenericRecord = { ... }
dStream.map(deserialize(SCHEMA))
Or use a two-argument function:
def deserialize(b: Array[Byte], schema: Schema): GenericRecord = { ... }
dStream.map(b => deserialize(b, SCHEMA))
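To see the two options side by side without a Spark or Avro dependency, here is a minimal sketch. It makes several assumptions: a plain Seq stands in for the DStream, and the Schema and Record case classes, along with the names CurryingSketch and deserialize2, are hypothetical stand-ins and not the Avro types from the question.

```scala
// Hypothetical stand-ins so the sketch runs without Spark or Avro on the classpath.
final case class Schema(name: String)
final case class Record(schemaName: String, payload: String)

object CurryingSketch {
  // Curried form: the constant comes first, the per-element argument second.
  // Applying only the first parameter list yields an Array[Byte] => Record function.
  def deserialize(schema: Schema)(b: Array[Byte]): Record =
    Record(schema.name, new String(b, "UTF-8"))

  // Two-argument form: the constant is supplied explicitly inside the lambda.
  def deserialize2(b: Array[Byte], schema: Schema): Record =
    Record(schema.name, new String(b, "UTF-8"))

  def main(args: Array[String]): Unit = {
    val SCHEMA = Schema("user")
    // A Seq plays the role of the DStream here (assumption for illustration).
    val input: Seq[Array[Byte]] = Seq("a".getBytes("UTF-8"), "b".getBytes("UTF-8"))

    val curried = input.map(deserialize(SCHEMA))
    val twoArg = input.map(b => deserialize2(b, SCHEMA))

    // Both styles produce the same records.
    println(curried == twoArg)
  }
}
```

In both forms the function passed to map closes over SCHEMA, so the choice between them is mostly a matter of style: the curried version reads more compactly at the call site, while the two-argument version keeps the method signature conventional.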