Excluding "None" from output of map function

I have this code:

fileRDD.map(positive)\
       .map(lambda x: [x,1])\
       .reduceByKey(lambda x,y: x+y)\
       .take(10)

The output is:

[(None, 3194395),
 (0, 240597),
 (1, 224805),
 (2, 210585),
 (3, 198246),
 (4, 202869),
 (5, 92615),
 (6, 60493)]

How can I remove the None row from the output? (I only need the results for keys 0 to 6.)

Use the filter function on the RDD:

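from pyspark.sql import SparkSession

# Reuse the active session, or start one if none exists
spark = SparkSession.builder.getOrCreate()

rdd = spark.sparkContext.parallelize([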
    (None, 3194395), (0, 240597), (1, 224805),
    (2, 210585), (3, 198246), (4, 202869),
    (5, 92615), (6, 60493)
])

# Keep only the pairs whose key is not None
rdd1 = rdd.filter(lambda x: x[0] is not None)

print(rdd1.collect())
#[(0, 240597), (1, 224805), (2, 210585), (3, 198246), (4, 202869), (5, 92615), (6, 60493)]
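
The same filter can also be applied inside the original pipeline, before reduceByKey, so that None keys are never aggregated in the first place. The following is only a sketch: has_key and count_labels are hypothetical helper names, rdd is assumed to be a PySpark RDD such as fileRDD, and label_fn stands in for the positive function from the question, whose body is not shown there.

```python
from operator import add


def has_key(pair):
    """Return True when the key of a (key, value) pair is not None."""
    return pair[0] is not None


def count_labels(rdd, label_fn):
    """Count records per label, skipping records whose label is None.

    Assumptions: `rdd` is a PySpark RDD and `label_fn` plays the role of
    `positive` from the question.
    """
    return (
        rdd.map(lambda x: (label_fn(x), 1))  # (key, 1) pairs as tuples
           .filter(has_key)                  # drop pairs whose key is None
           .reduceByKey(add)                 # sum the counts per key
    )
```

With the names from the question, this would be called as count_labels(fileRDD, positive).take(10). Note that the test is "is not None" rather than a plain truthiness check: a filter such as lambda x: x[0] would also discard the key 0, because 0 is falsy in Python.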