Saving PairedRDD as a text file

users_grpd = pairs.groupByKey()

users_grpd_flattened = users_grpd.map(
    lambda keyValue: (keyValue[0], ' '.join(map(str, keyValue[1]))))

users_grpd_flattened.saveAsTextFile('pairedrddresults.txt')

Output:

(u'3300975212', '120818 120519 120850 120521')

(u'3200272220', '120036 105037')

(u'13101231222', '2024574 12024')

Is there a way to save this paired RDD as a text file without the leading u and the quotation marks?

If you need a specific format, you can map each record directly to a string:

# Index into the (key, values) pair; tuple-unpacking lambdas only work in Python 2.
users_grpd_flattened = (pairs.groupByKey()
    .map(lambda kv: "{0}, {1}".format(kv[0], ' '.join(str(x) for x in kv[1]))))

If you want the parentheses, just replace the format string with "({0}, {1})".
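
To see why this works, note that saveAsTextFile writes the string representation of each element, so a tuple is written with its parentheses and quoted strings, while a plain string is written as-is. Here is a minimal sketch in plain Python, without Spark, that compares the two. The `grouped` sample data and the `format_record` helper are made up for illustration and are not part of the original code:

```python
# Hypothetical sample data standing in for the contents of pairs.groupByKey().
grouped = [
    ("3300975212", [120818, 120519, 120850, 120521]),
    ("3200272220", [120036, 105037]),
]


def format_record(key_values, template="{0}, {1}"):
    """Turn one (key, values) pair into a single line of text."""
    key, values = key_values
    return template.format(key, " ".join(str(v) for v in values))


# What you get when each element is already a string.
plain = [format_record(kv) for kv in grouped]

# Same thing with the parenthesised format string.
with_parens = [format_record(kv, "({0}, {1})") for kv in grouped]

# What you get when each element is still a tuple: str() of a tuple
# uses repr() on its items, which is where the quotes come from.
tuple_style = [str((k, " ".join(map(str, v)))) for k, v in grouped]

for line in plain + with_parens + tuple_style:
    print(line)
```

In the Spark job, the same function could presumably be passed to map in place of the lambda, e.g. pairs.groupByKey().map(format_record), before calling saveAsTextFile.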