如何在 python 中实现重新启动的随机游走

How to implement random walks with restarts in python

我有以下使用 networkx 创建的图表。

import networkx as nx

G = nx.Graph()

G.add_nodes_from(["John", "Mary", "Jill", "Todd",
                  "iPhone5", "Kindle Fire", "Fitbit Flex Wireless", "Harry Potter", "Hobbit"])

G.add_edges_from([
    ("John", "iPhone5"),
    ("John", "Kindle Fire"),
    ("Mary", "iPhone5"),
    ("Mary", "Kindle Fire"),
    ("Mary", "Fitbit Flex Wireless"),
    ("Jill", "iPhone5"),
    ("Jill", "Kindle Fire"),
    ("Jill", "Fitbit Flex Wireless"),
    ("Todd", "Fitbit Flex Wireless"),
    ("Todd", "Harry Potter"),
    ("Todd", "Hobbit"),
])

现在,我想执行 random walk with restarts 来识别与 John 最相似的用户。我在 networkx 中搜索了文档,但在 networkx.

中找不到它的实现

如果 python library/code random walk with restarts 可以做到这一点,请告诉我。

如果需要,我很乐意提供更多详细信息。

编辑

如果我现有的网络加权如下,我是否仍会按如下方式计算重新启动的随机游走:nx.pagerank_numpy(G, personalization={"John": 1})?

import networkx as nx

G = nx.Graph()

G.add_nodes_from(["John", "Mary", "Jill", "Todd",
                  "iPhone5", "Kindle Fire", "Fitbit Flex Wireless", "Harry Potter", "Hobbit"])

G.add_weighted_edges_from([
    ("John", "iPhone5", 0.1),
    ("John", "Kindle Fire", 0.2),
    ("Mary", "iPhone5", 0.3),
    ("Mary", "Kindle Fire", 0.4),
    ("Mary", "Fitbit Flex Wireless", 0.5),
    ("Jill", "iPhone5", 0.9),
    ("Jill", "Kindle Fire", 0.1),
    ("Jill", "Fitbit Flex Wireless", 0.1),
    ("Todd", "Fitbit Flex Wireless", 0.1),
    ("Todd", "Harry Potter", 0.1),
    ("Todd", "Hobbit", 0.1),
])

个性化 PageRank—implemented in networkx—如果个性化向量的起始节点为 1 而其他所有节点为 0,则本质上是重新开始的随机游走。

以下代码

nx.pagerank_numpy(G, personalization={"John": 1})

然后生成一个字典,其中包含在每个节点中结束的概率

{'John': 0.24958826532666656,
 'Mary': 0.1229452674202248,
 'Jill': 0.12294526742022475,
 'Todd': 0.04506174037342413,
 'iPhone5': 0.17574399763529416,
 'Kindle Fire': 0.17574399763529416,
 'Fitbit Flex Wireless': 0.08243647797726429,
 'Harry Potter': 0.012767493105803515,
 'Hobbit': 0.012767493105803515}

从这本词典中,您可以select概率最高的用户。

对于加权图pagerank_numpy方法有一个weight参数,您可以在其中设置要使用的边数据键。使用 add_weighted_edges_from 添加边时,此数据键称为 "weight",因此代码如下。

nx.pagerank_numpy(G, personalization={"John": 1}, weight="weight")