Sum values of JavaRDD (Tuple3<String, String, Double>)

I have a JavaRDD containing the values of a Cassandra table:

URL | Name | Value
A   |   x  |    1
A   |   x  |    2   
A   |   x  |    1.5
B   |   y  |    3
B   |   y  |    2.75
C   |   z  |    1.25
C   |   z  |    3 
C   |   z  |    1

I want to reduce it so that there is only one row each for A, B and C, with the values summed. I tried it like this:

JavaPairRDD<Tuple3<String, String, Double>,Double> x = y.mapToPair(new PairFunction<Tuple3<String, String, Double>, Tuple3<String, String, Double>, Double>(){

        @Override
        public Tuple2<Tuple3<String, String, Double>, Double> call(
                Tuple3<String, String, Double> arg0) throws Exception {
            // TODO Auto-generated method stub
            return null;
        }

    }); // To Do reduce

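y is of type JavaRDD<Tuple3<String, String, Double>>, but I get an error saying it is not applicable for the argument. Is it even possible to solve it this way, or is there a better approach?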

Use the reduceByKey function of JavaPairRDD. It merges the values that share a key and returns a new RDD with one entry per key.

Try this code:

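import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.function.Function2;
import org.apache.spark.api.java.function.PairFunction;
import scala.Tuple2;
import scala.Tuple3;

JavaRDD<Tuple3<String, String, Double>> x = ...........;

// Key each row by (URL, Name), then sum the values that share a key.
JavaPairRDD<Tuple2<String, String>, Double> result = x.mapToPair(
        new PairFunction<Tuple3<String, String, Double>, Tuple2<String, String>, Double>() {
            @Override
            public Tuple2<Tuple2<String, String>, Double> call(
                    Tuple3<String, String, Double> t) throws Exception {
                // (URL, Name, Value) -> ((URL, Name), Value)
                return new Tuple2<Tuple2<String, String>, Double>(
                        new Tuple2<String, String>(t._1(), t._2()), t._3());
            }
        }).reduceByKey(new Function2<Double, Double, Double>() {
            @Override
            public Double call(Double v1, Double v2) throws Exception {
                // Combine two values of the same key by adding them.
                return v1 + v2;
            }
        });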
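
To check what the result should look like for the sample table in the question, here is a minimal plain-Java sketch of the same "key by (URL, Name), then sum" idea. It does not use Spark at all. The class `SumByKeySketch`, the `Row` holder and the "URL|Name" string key are names made up for this illustration, so treat it as a way to verify the expected sums rather than as a replacement for reduceByKey.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

final class SumByKeySketch {

    /** Hypothetical stand-in for one (URL, Name, Value) row; not a Spark type. */
    static final class Row {
        final String url;
        final String name;
        final double value;

        Row(String url, String name, double value) {
            this.url = url;
            this.name = name;
            this.value = value;
        }
    }

    /** Groups rows by the composite key "URL|Name" and sums each group's values. */
    static Map<String, Double> sumByKey(List<Row> rows) {
        return rows.stream().collect(Collectors.groupingBy(
                r -> r.url + "|" + r.name,              // plays the role of Tuple2<String, String>
                TreeMap::new,                           // sorted keys, so output order is stable
                Collectors.summingDouble(r -> r.value)));
    }

    /** The sample table from the question. */
    static List<Row> sampleRows() {
        return Arrays.asList(
                new Row("A", "x", 1), new Row("A", "x", 2), new Row("A", "x", 1.5),
                new Row("B", "y", 3), new Row("B", "y", 2.75),
                new Row("C", "z", 1.25), new Row("C", "z", 3), new Row("C", "z", 1));
    }

    public static void main(String[] args) {
        sumByKey(sampleRows()).forEach((key, sum) -> System.out.println(key + " -> " + sum));
    }
}
```

Running main prints A|x -> 4.5, B|y -> 5.75 and C|z -> 5.25, one per line. The Spark result should hold the same three sums, keyed by Tuple2 instead of a string.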