Using HashMaps to map a total and average value to a key

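I have a Country class and have read data from a .csv file containing many country names, the region each one is in, each country's population, area, etc., and stored it in an ArrayList. I'm mainly using the Java Collections Framework for the data analysis, and I want to find the total and the average population of each region.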

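I think a HashMap is the best fit for this, but I don't know how to go about it, because I've never used one in any complex way or with objects. I also know that I will have to change the data type from int to long for the total population.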

public class Country {
    

    private String name;
    private String region;
    private int population;
    private int area;
    private double density;

    /**
     * Default constructor
     */
    public Country() {

    }

    /**
     * Creates a country with all args
     * 
     * @param name
     * @param region
     * @param population
     * @param area
     * @param density
     */
    public Country(String name, String region, int population, int area, double density) {
        super();
        this.name = name;
        this.region = region;
        this.population = population;
        this.area = area;
        this.density = density;
    }

    /**
     * @return the region
     */
    public String getRegion() {
        return region;
    }

    /**
     * @param region the region to set
     */
    public void setRegion(String region) {
        this.region = region;
    }

    /**
     * @return the population
     */
    public int getPopulation() {
        return population;
    }

    /**
     * @param population the population to set
     */
    public void setPopulation(int population) {
        this.population = population;
    }
}

public static void totalPopulationByRegion(Collection<Country> countries) {
        Map<String, Integer> map = new HashMap<String, Integer>();

        int total = 0;

        for (Country country : countries) {
            if (map.containsKey(country.getRegion())) {
                map.put(country.getRegion(), total);
                total+=country.getPopulation();
            } else
                map.put(country.getRegion(), total);
        }

        for (Map.Entry m : map.entrySet()) {
            System.out.println(m.getKey() + " " + m.getValue());
        }
    }

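From the output I get on the console, I realise my arithmetic here is completely wrong, even before taking into account that I haven't handled totals that are too large to store as an int. I'm not getting duplicate keys, which is what I want; I just don't know how to get a running total of the populations mapped to each region. Any help would be appreciated.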

Output I get when calling it from the main method:


Near east 41843152
Asia -478957430
Europe -7912568
Africa 54079957
Latin amer. & carib 17926472
Northern america -35219702
Baltics -1102504495
Oceania -616300040

Sample from the csv file:

Country,Region,Population,Area (sq. mi.)
Afghanistan,ASIA,31056997,647500
Albania,EASTERN EUROPE                     ,3581655,28748
Algeria ,NORTHERN AFRICA                    ,32930091,2381740
American Samoa ,OCEANIA                            ,57794,199
Andorra ,WESTERN EUROPE                     ,71201,468
Angola ,SUB-SAHARAN AFRICA                 ,12127071,1246700
Anguilla ,LATIN AMER. & CARIB    ,13477,102
Antigua & Barbuda ,LATIN AMER. & CARIB    ,69108,443
Argentina ,LATIN AMER. & CARIB    ,39921833,2766890

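If you only want to group regions with their total population, you need to modify your code a bit. The variable total should be declared inside your for loop and initialised with the current country's population; if the region is already in the map, add the total stored so far before writing it back.

public static void totalPopulationByRegion(Collection<Country> countries) {
        Map</*Region*/ String, /*Population*/ Long> map = new HashMap<>();

        for (Country country : countries) {
            long total = country.getPopulation();
            if (map.containsKey(country.getRegion())) {
                // add the running total already stored for this region
                total += map.get(country.getRegion());
            }
            map.put(country.getRegion(), total);
        }

        for (Map.Entry<String, Long> m : map.entrySet()) {
            System.out.println(m.getKey() + " " + m.getValue());
        }
    }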

However, if you want to do more processing on the data, it is easier to group the Country objects themselves by region and cache the result for later use:

Map<String, List<Country>> groupData(Collection<Country> countries) {
        Map</*Region*/String, List<Country>> map = new HashMap<>();

        for (Country country : countries) {
            List<Country> regionCountries = new ArrayList<>();
            if (map.containsKey(country.getRegion())) {
                regionCountries = map.get(country.getRegion());
            }
            regionCountries.add(country);
            map.put(country.getRegion(), regionCountries);
        }
        return map;
    }
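
As a side note, the contains/get/put sequence above can be collapsed with Map#computeIfAbsent on Java 8 or later. The following is a minimal sketch of that variant, not a replacement for the method above; the nested Country class and the sampleCountries() helper are assumptions added only to make it self-contained, and the rows come from the CSV sample in the question with whitespace trimmed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class GroupByRegion {

    // Trimmed-down stand-in for the question's Country class (an assumption).
    public static class Country {
        private final String name;
        private final String region;
        private final long population;

        public Country(String name, String region, long population) {
            this.name = name;
            this.region = region;
            this.population = population;
        }

        public String getName() { return name; }
        public String getRegion() { return region; }
        public long getPopulation() { return population; }
    }

    // Groups countries by region; computeIfAbsent creates the list
    // the first time a region is seen and returns the existing one otherwise.
    public static Map<String, List<Country>> groupData(Collection<Country> countries) {
        Map<String, List<Country>> map = new HashMap<>();
        for (Country country : countries) {
            map.computeIfAbsent(country.getRegion(), region -> new ArrayList<>())
               .add(country);
        }
        return map;
    }

    // Hypothetical helper: a few rows from the question's CSV sample.
    public static List<Country> sampleCountries() {
        return Arrays.asList(
                new Country("Afghanistan", "ASIA", 31056997L),
                new Country("Anguilla", "LATIN AMER. & CARIB", 13477L),
                new Country("Antigua & Barbuda", "LATIN AMER. & CARIB", 69108L),
                new Country("Argentina", "LATIN AMER. & CARIB", 39921833L));
    }

    public static void main(String[] args) {
        Map<String, List<Country>> grouped = groupData(sampleCountries());
        for (Map.Entry<String, List<Country>> e : grouped.entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue().size() + " countries");
        }
    }
}
```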

This data can then be used to aggregate the total and the average population per region, like so (I've used the Java 8 Stream API for convenience):

Map<String, Long> getTotalPopulationPerRegion(Map<String, List<Country>> data) {
        // sum as long: a regional total can exceed Integer.MAX_VALUE
        Map<String, Long> result = data.entrySet()
                .stream()
                .collect(Collectors.toMap(entry -> entry.getKey(), entry -> entry.getValue().stream().mapToLong(country -> country.getPopulation()).sum()));
        return result;
    }

Map<String, Double> getAveragePopulationPerRegion(Map<String, List<Country>> data) {
        Map<String, Double> result = data.entrySet()
                .stream()
                .collect(Collectors.toMap(entry -> entry.getKey(), entry -> entry.getValue().stream().mapToDouble(country -> country.getPopulation()).average().orElse(Double.NaN)));
        return result;
    }
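
If both figures are needed together, one possible shortcut is to ask the stream for a LongSummaryStatistics per region, which carries the sum, the average and the count from a single pass. This is only a sketch under assumptions: the nested Country class, getStatsPerRegion and sampleData() are hypothetical names introduced for illustration, and the populations are taken from the CSV sample in the question.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.LongSummaryStatistics;
import java.util.Map;
import java.util.stream.Collectors;

public class RegionStats {

    // Trimmed-down stand-in for the question's Country class (an assumption).
    public static class Country {
        private final String region;
        private final long population;

        public Country(String region, long population) {
            this.region = region;
            this.population = population;
        }

        public String getRegion() { return region; }
        public long getPopulation() { return population; }
    }

    // For each region, reduce its list of countries to one statistics object
    // that exposes getSum(), getAverage() and getCount().
    public static Map<String, LongSummaryStatistics> getStatsPerRegion(
            Map<String, List<Country>> data) {
        return data.entrySet()
                .stream()
                .collect(Collectors.toMap(
                        entry -> entry.getKey(),
                        entry -> entry.getValue().stream()
                                .mapToLong(country -> country.getPopulation())
                                .summaryStatistics()));
    }

    // Hypothetical helper: already-grouped data built from the CSV sample.
    public static Map<String, List<Country>> sampleData() {
        Map<String, List<Country>> data = new HashMap<>();
        data.put("ASIA", Arrays.asList(new Country("ASIA", 31056997L)));
        data.put("LATIN AMER. & CARIB", Arrays.asList(
                new Country("LATIN AMER. & CARIB", 13477L),
                new Country("LATIN AMER. & CARIB", 69108L),
                new Country("LATIN AMER. & CARIB", 39921833L)));
        return data;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, LongSummaryStatistics> e
                : getStatsPerRegion(sampleData()).entrySet()) {
            System.out.println(e.getKey()
                    + " total=" + e.getValue().getSum()
                    + " average=" + e.getValue().getAverage());
        }
    }
}
```

Note that getAverage() returns 0.0 for an empty list, unlike the orElse(Double.NaN) used above, so the two approaches differ slightly for regions without countries.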

Assuming you have already changed the type of population in your Country class from int to long:

public static class Country {
    private String name;
    private String region;
    private long population;
    ...
}
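
To illustrate why the accumulator type matters, here is a small sketch of what happens when a running total is kept in an int: once it passes Integer.MAX_VALUE it silently wraps around to a negative number, which is consistent with the negative values in the question's output. The class name, method names and the two figures are made up for the demonstration.

```java
public class OverflowDemo {

    // Sums into an int accumulator; wraps around past Integer.MAX_VALUE.
    public static int sumAsInt(int[] populations) {
        int total = 0;
        for (int p : populations) {
            total += p;
        }
        return total;
    }

    // Same loop with a long accumulator; each int is widened before the add.
    public static long sumAsLong(int[] populations) {
        long total = 0L;
        for (int p : populations) {
            total += p;
        }
        return total;
    }

    public static void main(String[] args) {
        int[] populations = {2_000_000_000, 500_000_000}; // made-up figures
        System.out.println(sumAsInt(populations));  // prints -1794967296
        System.out.println(sumAsLong(populations)); // prints 2500000000
    }
}
```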

Here are a few ways to do what you need:

public static void totalPopulationByRegion(Collection<Country> countries) {
    Map<String, Long> map = new HashMap<>();

    for (Country country : countries) {
        if (map.containsKey(country.getRegion())) {
            //if the map contains the region get the value and add the population of current country
            map.put(country.getRegion(), map.get(country.getRegion()) + country.getPopulation());
        } else{
            //else just put region of current country and population into the map
            map.put(country.getRegion(), country.getPopulation());
        }
    }

    for (Map.Entry<String, Long> m : map.entrySet()) {
        System.out.println(m.getKey() + " " + m.getValue());
    }
}

If you are using Java 8 or later, you can shorten the above with Map#computeIfPresent and Map#computeIfAbsent and avoid the if/else block:

public static void totalPopulationByRegion2(Collection<Country> countries) {
    Map<String, Long> map = new HashMap<>();

    for (Country country : countries) {
        map.computeIfPresent(country.getRegion(), (reg, pop)->  pop + country.getPopulation());
        map.computeIfAbsent(country.getRegion(), reg -> country.getPopulation());                   
    }

    for (Map.Entry<String, Long> m : map.entrySet()) {
        System.out.println(m.getKey() + " " + m.getValue());
    }
}
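
The two calls can arguably be folded into one with Map#merge, also available since Java 8: it stores the value when the key is absent and otherwise combines the old and new values with the given function. A minimal sketch, assuming population is a long; the nested Country class and sampleCountries() are hypothetical stand-ins, with rows taken from the question's CSV sample.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MergeTotals {

    // Trimmed-down stand-in for the question's Country class (an assumption).
    public static class Country {
        private final String region;
        private final long population;

        public Country(String region, long population) {
            this.region = region;
            this.population = population;
        }

        public String getRegion() { return region; }
        public long getPopulation() { return population; }
    }

    // merge() inserts the population for a new region, or adds it to the
    // stored total via Long::sum when the region is already present.
    public static Map<String, Long> totalPopulationByRegion(Collection<Country> countries) {
        Map<String, Long> map = new HashMap<>();
        for (Country country : countries) {
            map.merge(country.getRegion(), country.getPopulation(), Long::sum);
        }
        return map;
    }

    // Hypothetical helper: a few rows from the question's CSV sample.
    public static List<Country> sampleCountries() {
        return Arrays.asList(
                new Country("ASIA", 31056997L),
                new Country("LATIN AMER. & CARIB", 13477L),
                new Country("LATIN AMER. & CARIB", 69108L),
                new Country("LATIN AMER. & CARIB", 39921833L));
    }

    public static void main(String[] args) {
        for (Map.Entry<String, Long> m : totalPopulationByRegion(sampleCountries()).entrySet()) {
            System.out.println(m.getKey() + " " + m.getValue());
        }
    }
}
```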

With the Stream API, building the map becomes a one-liner using Collectors#groupingBy and Collectors#summingLong:

public static void totalPopulationByRegion3(Collection<Country> countries) {
    Map<String, Long> map = 
            countries.stream()
                     .collect(Collectors.groupingBy(Country::getRegion, 
                                                    Collectors.summingLong(Country::getPopulation)));

    for (Map.Entry<String, Long> m : map.entrySet()) {
        System.out.println(m.getKey() + " " + m.getValue());
    }
}
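
The question also asks for the average per region. Following the same pattern, a sketch with Collectors#averagingLong as the downstream collector could look like this; the nested Country class, the method name and sampleCountries() are assumptions for illustration, using rows from the question's CSV sample.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class AverageByRegion {

    // Trimmed-down stand-in for the question's Country class (an assumption).
    public static class Country {
        private final String region;
        private final long population;

        public Country(String region, long population) {
            this.region = region;
            this.population = population;
        }

        public String getRegion() { return region; }
        public long getPopulation() { return population; }
    }

    // Groups by region and averages the populations within each group.
    public static Map<String, Double> averagePopulationByRegion(Collection<Country> countries) {
        return countries.stream()
                        .collect(Collectors.groupingBy(Country::getRegion,
                                                       Collectors.averagingLong(Country::getPopulation)));
    }

    // Hypothetical helper: a few rows from the question's CSV sample.
    public static List<Country> sampleCountries() {
        return Arrays.asList(
                new Country("ASIA", 31056997L),
                new Country("LATIN AMER. & CARIB", 13477L),
                new Country("LATIN AMER. & CARIB", 69108L),
                new Country("LATIN AMER. & CARIB", 39921833L));
    }

    public static void main(String[] args) {
        for (Map.Entry<String, Double> m : averagePopulationByRegion(sampleCountries()).entrySet()) {
            System.out.println(m.getKey() + " " + m.getValue());
        }
    }
}
```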