查找彼此相距 2 到 3 跳的节点对

Find pairs of nodes that are between 2 and 3 hops away from each other

我有一个包含大约 50,000 个节点和 2,000,000 条边的大图。我需要找到彼此相距 2 到 3 跳的所有节点对。

最简单(也是肮脏)的解决方案是首先创建组合展开,然后检查每对节点:

import networkx as nx
import itertools

g = nx.erdos_renyi_graph(n=5000, p=0.05)
L = list(G.nodes())
# Create a complete graph
G = nx.Graph()
G.add_nodes_from(L)
G.add_edges_from(itertools.combinations(L, 2))

但是,当我尝试创建一个包含 45,000 个节点和 200 万条边的完整图时,我 运行 内存不足。

是否有任何其他解决方案可以在合理的时间内检查大图?感谢您的任何建议或指示。

获取边缘列表并将其转换为 adjacency matrix, see for how to do it memory-efficiently with scipy.sparse. Then use numpy.linalg.matrix_power to raise the adjacency matrix to 2nd power. For each entry in the squared matrix, if the entry is non-zero, there exists a path of length two between the nodes (in fact the entry gives you the number of paths of length 2 between the nodes). See answers :

Powers of the adjacency matrix are concatenating walks. The ijth entry of the kth power of the adjacency matrix tells you the number of walks of length k from vertex i to vertex j.

您可以对长度为 3 的路径执行相同的操作,方法是将其提高到 3 次方。