如何使我的 Dijkstra 算法更高效

How can I make my Dijkstra algorithm more efficient

我有一个定向网络模型,该模型由一组节点组成,这些节点通过随着每次模型迭代而增长的链接连接起来。为了在最终模型迭代中找到 "average shortest path",我实现了 Dijkstra 算法,该算法计算从所有节点到所有节点的最短路径。更具体地说,该算法计算从每个网络的 3,000 个节点到所有其他 3,000 个节点(如果存在路径)的最短路径,大约 9,000,000 个路径长度,然后找到平均路径长度。当我尝试这个时,我 运行 内存不足。我能够获得直到大约 500 个节点的平均路径长度,其中大约 250,000 个路径长度在 12 小时内计算得出。我的问题是,是否有任何方法可以改进代码以使其更有效率?还是计算那么多路径不可行?

下面的代码...算法本身改编自 Vogella http://www.vogella.com/tutorials/JavaAlgorithmsDijkstra/article.html

网络中的节点代表树,边或链接代表网。

Dijstra 算法

package network;

imports...

public class DijkstraAlgorithm {
    private Context<Object> context;
    private Geography<Object> geography;
    private int id; 
    List<Tree> vertices = Tree.getVertices();
    List<Nets> edges = Nets.getEdges();
    private Set<Tree> settledNodes;
    private Set<Tree> unSettledNodes;
    private Map<Tree, Tree> predecessors;
    private Map<Tree, Integer> distance;

public DijkstraAlgorithm(Graph graph) {
    this.context = context;
    this.geography = geography;
    this.id = id;
    this.vertices = vertices;
    this.edges = edges;
}


setters and getters....

public void execute(Tree source){
    settledNodes = new HashSet<Tree>();
    unSettledNodes = new HashSet<Tree>();
    distance = new HashMap<Tree, Integer>();
    predecessors = new HashMap<Tree, Tree>();
    distance.put(source, 0);
    unSettledNodes.add(source);
    while (unSettledNodes.size()>0){
        Tree node = getMinimum(unSettledNodes);
        settledNodes.add(node);
        unSettledNodes.remove(node);
        findMinimalDistances(node);
    }
}

private void findMinimalDistances(Tree node){
    List<Tree>adjacentNodes = getNeighbors(node);
    for (Tree target: adjacentNodes){
        if (getShortestDistance(target)>getShortestDistance(node)+getDistance(node,target)){
            distance.put(target, getShortestDistance(node) + getDistance(node, target));
            predecessors.put(target, node);
            unSettledNodes.add(target);
        }

    }
}

private int getDistance(Tree node, Tree target){
    for (Nets edge: edges){
        if (edge.getStartTrees().equals(node) && edge.getEndTrees().equals(target)){
            return edge.getId();
        }
    }
    throw new RuntimeException("Should not happen");
}

private List<Tree> getNeighbors(Tree node){
    List<Tree> neighbors = new ArrayList<Tree>();
    for (Nets edge: edges) {
        if(edge.getStartTrees().equals(node) && !isSettled(edge.getEndTrees())){
            neighbors.add(edge.getEndTrees());
        }
    }
    return neighbors;
}

private Tree getMinimum(Set<Tree>vertexes){
    Tree minimum = null;
    for (Tree vertex: vertexes) {
        if (minimum == null){
            minimum = vertex;
        } else {
            if (getShortestDistance(vertex)< getShortestDistance(minimum)){
                minimum = vertex;
            }
        }
    }

    return minimum;

}

private boolean isSettled(Tree vertex){
    return settledNodes.contains(vertex);
}

private int getShortestDistance(Tree destination) {
    Integer d = distance.get(destination);
    if (d == null) {
        return Integer.MAX_VALUE;
    } else {
        return d;
    }
}

public LinkedList<Tree> getPath(Tree target){
    LinkedList<Tree>path = new LinkedList<Tree>();
    Tree step = target;
    if(predecessors.get(step)== null){
        return null;
    }
    path.add(step);
    while (predecessors.get(step)!=null){
        step = predecessors.get(step);
        path.add(step);

    }
    Collections.reverse(path);
    return path;
}



}

图表

package network;

imports...

public class Graph {
     private Context<Object> context;
     private Geography<Object> geography;
     private int id; 
     List<Tree> vertices = new ArrayList<>();
     List<Nets> edges = new ArrayList<>();
     List <Integer> intermediateNodes = new ArrayList<>();

public Graph(Context context, Geography geography, int id, List vertices, List edges) {
    this.context = context;
    this.geography = geography;
    this.id = id;
    this.vertices = vertices;
    this.edges = edges;
}

setters... getters...

//updates graph
@ScheduledMethod(start =1, interval =1, priority =1)
public void countNodesAndVertices() {
    this.setVertices(Tree.getVertices());
    this.setEdges(Nets.getEdges());

//run Dijkstra at the 400th iteration of network development
@ScheduledMethod(start =400, priority =1)
public void Dijkstra(){

    Graph graph2 = new Graph (context, geography, id, vertices, edges);
    graph2.setEdges(this.getEdges());
    graph2.setVertices(this.getVertices());
    for(Tree t: graph2.getVertices()){
    }
    DijkstraAlgorithm dijkstra = new DijkstraAlgorithm(graph2);

    //create list of pathlengths (links, not nodes)
    List <Double> pathlengths = new ArrayList<>();
    //go through all nodes as starting nodes
    for (int i = 0; i<vertices.size();i++){

        //find the shortest path to all nodes as end nodes
        for (int j = 0; j<vertices.size();j++){
            if(i != j){

                Tree startTree = vertices.get(i);
                Tree endTree = vertices.get(j);
                dijkstra.execute(vertices.get(i));
                //create a list that contains the path of nodes
                LinkedList<Tree> path = dijkstra.getPath(vertices.get(j));
                     //if the path is not null and greater than 0
                    if (path != null && path.size()>0){
                    //calculate the pathlength (-1, which is the size of the path length of links) 
                    double listsize = path.size()-1;
                    //add it to the list
                    pathlengths.add(listsize);
                }

            }
        }



    }   
    calculateAvgShortestPath(pathlengths);

}
//calculate the average
public void calculateAvgShortestPath(List<Double>pathlengths){
    Double sum = 0.0;
    for (Double cc: pathlengths){
        sum+= cc;
    }
    Double avgPathLength = sum/pathlengths.size();
    System.out.println("The average path length is: " + avgPathLength);

}

}

您可以进行多项优化。例如使用 Fibonacci 堆(甚至是标准的 java 优先级队列)肯定会加快速度。然而,对于这么大的数据集,内存问题可能会持续存在。处理这么大的数据集的唯一真正方法是使用分布式实现。我相信您可以使用 Spark Graphx 库中的最短路径实现。

一个快速的改进是移动线:

dijkstra.execute(vertices.get(i));

最多 6 行(因此它在 i 循环中,但不在 j 循环中)。

这应该会增加节点数量(即快 3000 倍)。

它应该仍然会给出相同的结果,因为 Dijkstra 算法计算的是从起始节点到所有目标节点的最短路径,因此无需为每对 start/end.

重新运行它