How to create a graph using nx.read_edgelist if my csv file is present at S3?

Question

I have a csv file in one of my S3 buckets (s3://abc/FB/train_woheader.csv). When I write..

g=nx.read_edgelist('s3://abc/FB/train_woheader.csv',delimiter=',',create_using=nx.DiGraph(),nodetype=int, encoding='utf-8')
print(nx.info(g))

it says

FileNotFoundError: [Errno 2] No such file or directory: 's3://abc/FB/train_woheader.csv'

However, if I save the csv on the Jupyter instance, I can create the graph with this line:

g=nx.read_edgelist('train_woheader.csv',delimiter=',',create_using=nx.DiGraph(),nodetype=int, encoding='utf-8')

The csv is a large file, so it has to stay in S3. It cannot be saved on the Jupyter instance because it takes up too much space.

Any help?

Answer 1

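read_edgelist expects a file object or a filename as its argument.
What you can do is read the file from S3 (using boto3), write its contents into a StringIO, and pass the populated file object to read_edgelist: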

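import io
import networkx as nx

with io.StringIO() as f:
    # Placeholder: replace with the CSV text fetched from S3 via boto3
    f.write('data_coming_from_s3_using_boto3')
    f.seek(0)  # rewind so read_edgelist starts at the beginning
    g=nx.read_edgelist(f,delimiter=',',create_using=nx.DiGraph(),nodetype=int, encoding='utf-8')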
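
To make the boto3 step concrete, here is a minimal sketch of one way to fill in the placeholder. It assumes boto3 is installed and AWS credentials with read access to the bucket are already configured. The helper names (split_s3_url, graph_from_csv_bytes, read_graph_from_s3) are hypothetical and not part of networkx or boto3. It uses io.BytesIO rather than StringIO because boto3 returns the object body as bytes, and read_edgelist decodes them with the given encoding.

```python
import io
from urllib.parse import urlsplit

import networkx as nx


def split_s3_url(url):
    """Split 's3://bucket/key' into a (bucket, key) tuple (hypothetical helper)."""
    parts = urlsplit(url)
    if parts.scheme != "s3" or not parts.netloc:
        raise ValueError(f"not an S3 URL: {url!r}")
    return parts.netloc, parts.path.lstrip("/")


def graph_from_csv_bytes(data):
    """Build a directed graph from the raw bytes of a headerless edge-list CSV."""
    # read_edgelist accepts an already-open file object, so wrap the bytes.
    with io.BytesIO(data) as f:
        return nx.read_edgelist(
            f,
            delimiter=",",
            create_using=nx.DiGraph(),
            nodetype=int,
            encoding="utf-8",
        )


def read_graph_from_s3(url):
    """Fetch the object at an s3:// URL and parse it as an edge list."""
    import boto3  # imported here so the helpers above work without boto3

    bucket, key = split_s3_url(url)
    body = boto3.client("s3").get_object(Bucket=bucket, Key=key)["Body"]
    # body.read() loads the whole object into memory; nothing is written to disk.
    return graph_from_csv_bytes(body.read())
```

Usage would look like g = read_graph_from_s3('s3://abc/FB/train_woheader.csv'). Note that this avoids using disk space on the Jupyter instance, but the file contents and the resulting graph still have to fit in memory.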

python

csv

amazon-s3

amazon-web-services

networkx