Making a dictionary from each line in a file
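I am trying to make a dictionary from this file: the key is the first word on each line, and the value is all the words that follow it.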
andrew fred
fred
judy andrew fred
george judy andrew
john george
Here is my code:
follows_file = open(r"C:\Users\Desktop\Python\follows.txt")
followers = {}
for line in follows_file:  #==> [Judy Andrew Fred]
    users = line.split(' ')  #==> [Judy, andrew, Fred, ....]
    follower = users[0]  #==> [Judy]
    followed_by = users[1:]  #==> [Andrew, Fred]
    for user in followed_by:
        # Add the follower to the list of followers of user
        if user not in followers:
            followers[user] = []
        followers[user].append(follower)
print(followers.items())
When I print the follower and followed_by variables they are correct, but I cannot get them into the dictionary correctly. This is the output:
dict_items([('fred\n', ['andrew', 'judy']), ('andrew', ['judy']), ('judy', ['george']), ('andrew\n', ['george']), ('george', ['john'])])
The output I want is:
(Andrew[Fred])(Fred[])(judy[Andrew Fred])(George[Judy Fred])(john[george])
Any help is much appreciated!
You can use collections.defaultdict() as a dictionary factory and simply append the followed users to each person, e.g.:
import collections
followers = collections.defaultdict(list)  # use a dict factory to save some time on checks
with open("path/to/your_file", "r") as f:  # open the file for reading
    for line in f:  # read the file line by line
        users = line.split()  # split on any whitespace
        followers[users[0]] += users[1:]  # append the followers for the current user
For your data, this will produce:
{'andrew': ['fred'],
'fred': [],
'judy': ['andrew', 'fred'],
'george': ['judy', 'andrew'],
'john': ['george']}
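As a small illustration of why splitting on any whitespace matters here, the sketch below compares the two calls on one line of the sample data. The variable names by_space and by_whitespace are made up for this example. It suggests where the stray 'fred\n' and 'andrew\n' keys in the question's output come from.

```python
line = "judy andrew fred\n"  # one line as it is read from the file

# Splitting on a single space leaves the newline attached to the last word.
by_space = line.split(' ')

# Splitting with no argument splits on any run of whitespace,
# so the trailing newline is dropped.
by_whitespace = line.split()

print(by_space)       # ['judy', 'andrew', 'fred\n']
print(by_whitespace)  # ['judy', 'andrew', 'fred']
```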
This will also let you append multiple lists to a user when a record is repeated. Otherwise you can just use a normal dict for followers and set the values with followers[users[0]] = users[1:].
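To make the difference concrete, here is a minimal sketch with a repeated record. The input lines are hypothetical (the original file has no duplicates), and the names merged and overwritten are just for this example.

```python
import collections

# Hypothetical input with a repeated record for 'judy' (not in the original file).
lines = ["judy andrew\n", "fred\n", "judy fred\n"]

merged = collections.defaultdict(list)
overwritten = {}
for line in lines:
    users = line.split()
    if not users:
        continue  # skip blank lines
    merged[users[0]] += users[1:]      # extends whatever list is already stored
    overwritten[users[0]] = users[1:]  # replaces any earlier list

print(dict(merged))  # {'judy': ['andrew', 'fred'], 'fred': []}
print(overwritten)   # {'judy': ['fred'], 'fred': []}
```

With the defaultdict both 'judy' records are combined, while the plain dict keeps only the last one.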
The data structure you show as your desired output is not valid Python. Do you really want it displayed that way? If you insist, you can do it like this:
print("".join("({}[{}])".format(k, " ".join(v)) for k, v in followers.items()))
# (andrew[fred])(fred[])(judy[andrew fred])(george[judy andrew])(john[george])
Edited answer, improved thanks to the comments from @PM2Ring and @IljaEverilä.
Here is my original solution using a dictionary comprehension:
followers = {line.split()[0]: line.split()[1:] for line in follows_file}
A more efficient alternative, proposed by @IljaEverilä, avoids calling split twice:
followers = {follower: followees for follower, *followees in map(str.split, follows_file)}
Result:
{'andrew': ['fred'],
'fred': [],
'george': ['judy', 'andrew'],
'john': ['george'],
'judy': ['andrew', 'fred']}
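For readers unfamiliar with the starred target in that comprehension, the sketch below shows how it behaves on lines with and without followees. The blank-line handling with filter(str.strip, ...) is an addition of this example, not part of the answer above, and the list named lines is hypothetical input.

```python
# The starred name collects every remaining word into a list.
follower, *followees = "judy andrew fred\n".split()
print(follower, followees)  # judy ['andrew', 'fred']

# A line with only a key simply yields an empty list.
follower, *followees = "fred\n".split()
print(follower, followees)  # fred []

# A blank line splits to [], which cannot be unpacked into `follower`.
try:
    follower, *followees = "\n".split()
except ValueError as exc:
    print("blank line:", exc)

# One possible way to tolerate blank lines: filter them out first.
lines = ["andrew fred\n", "\n", "fred\n"]  # hypothetical input
followers = {follower: followees
             for follower, *followees in map(str.split, filter(str.strip, lines))}
print(followers)  # {'andrew': ['fred'], 'fred': []}
```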
Note that both of the above solutions assume that your file contains no duplicate keys.
Don't forget to close the file afterwards:
follows_file.close()
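Or better yet, just use a context manager, which takes care of closing the file for you. Note the raw string, which keeps the backslashes in the Windows path from being read as escape sequences:
with open(r'C:\Users\zacan\Desktop\Python\follows.txt', 'r') as follows_file:
    followers = {follower: followees for follower, *followees in map(str.split, follows_file)}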
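If you want to try this without the file from the question, the following sketch writes the sample data to a temporary file first. The use of tempfile is only an assumption made so the example runs anywhere. It also checks that the file really is closed once the with block ends.

```python
import os
import tempfile

# Write the sample data to a temporary file (a stand-in for follows.txt).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("andrew fred\nfred\njudy andrew fred\ngeorge judy andrew\njohn george\n")

try:
    with open(tmp.name, "r") as follows_file:
        followers = {follower: followees
                     for follower, *followees in map(str.split, follows_file)}
    # The context manager has already closed the file at this point.
    print(follows_file.closed)  # True
finally:
    os.remove(tmp.name)  # clean up the temporary file

print(followers["judy"])  # ['andrew', 'fred']
```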
followers = dict()
with open(r'C:\Users\zacan\Desktop\Python\follows.txt', 'r') as f:
    for line in f:
        users = line.split()  # split on any whitespace so the trailing newline is dropped
        followers[users[0]] = users[1:]
This should work, though I have not tested it.
Here is a solution using str.split and a try / except clause to catch instances where only a key exists.
Note that io.StringIO lets us read from a string as if it were a file.
from io import StringIO
import csv
mystr = StringIO("""andrew fred
fred
judy andrew fred
george judy andrew
john george""")
# replace mystr with open(r"C:\Users\zacan\Desktop\Python\follows.txt")
with mystr as follows_file:
    d = {}
    for users in csv.reader(follows_file):
        try:
            # starred unpacking already gives value == [] when only the key is present
            key, *value = users[0].split()
        except ValueError:
            key, value = users[0], []
        d[key] = value
print(d)
{'andrew': ['fred'],
'fred': [],
'george': ['judy', 'andrew'],
'john': ['george'],
'judy': ['andrew', 'fred']}