Python self-generating variables?
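I am writing a program that reads a large .txt file of lottery numbers. Each draw is an array of exactly 7 ints (6 out of 49, plus a final supernumber).
For example: [[1, 11, 25, 37, 39, 47, 0],[3, 13, 15, 18, 37, 46, 0], ...]
The .txt file is grouped by month, so it looks like this: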
January:
[1, 11, 25, 37, 39, 47, 0]
[3, 13, 15, 18, 37, 46, 2]
[3, 6, 9, 12, 37, 46, 6]
February:
[3, 13, 15, 18, 37, 46, 0]
[1, 23, 17, 18, 37, 46, 8]
...
and so on.
How can I build an array that holds only the numbers of a given month?
I have a solution, but its coding style is terrible:
jan_tipps = []
feb_tipps = []
mar_tipps = []
# flags that track which month is currently being read
jan = False
feb = False
mar = False
for line in wholefile:
    if line == '\n':
        pass
    elif line == 'January:\n':
        jan = True
    elif line == 'February:\n':
        jan = False
        feb = True
    elif line == 'March:\n':
        feb = False
        mar = True
    elif jan == True:
        jan_tipps.append(line.split())
    elif feb == True:
        feb_tipps.append(line.split())
    elif mar == True:
        mar_tipps.append(line.split())
I think I need something like generics or self-generating variables. I don't know what to search for on the internet.
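Create a dictionary of months, with the month name as the key and the list of lists you want as the value:
import ast

month = {
    m: []
    for m in ['January', 'February']
}
with open('file.txt') as file:
    latest = None
    for line in file:
        line = line.strip()
        if line == '':  # stripped empty line
            continue
        name = line.rstrip(':')  # headers in the file look like "January:"
        if name in month:
            latest = name
        elif latest is not None:
            # a line such as "[1, 2]" is a list literal, so parse it instead of calling split()
            month[latest].append(ast.literal_eval(line))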
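To show how one dictionary stands in for the separate jan_tipps, feb_tipps, ... variables, here is a minimal sketch of the same idea wrapped in a function. The name read_tipps and the use of calendar.month_name are illustrative choices, not part of the answer above, and the sketch assumes English month headers (calendar.month_name follows the current locale, which is English by default).

```python
import ast
import calendar


def read_tipps(lines):
    """Group draws by month name; `lines` is any iterable of text lines."""
    # One empty list per month name ('January' ... 'December' in the default locale).
    tipps = {name: [] for name in calendar.month_name[1:]}
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        header = line.rstrip(':')
        if header in tipps:
            current = header  # a header such as "January:" switches the month
        elif current is not None and line.startswith('['):
            # a draw such as "[1, 11, 25, 37, 39, 47, 0]" is parsed as a list literal
            tipps[current].append(ast.literal_eval(line))
    return tipps


sample = """January:
[1, 11, 25, 37, 39, 47, 0]
[3, 13, 15, 18, 37, 46, 2]
February:
[3, 13, 15, 18, 37, 46, 0]
""".splitlines()

tipps = read_tipps(sample)
print(tipps['January'])   # plays the role of jan_tipps
print(tipps['February'])  # plays the role of feb_tipps
```

Because the function accepts any iterable of lines, an open file object can be passed to it in place of the sample list.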
You can use a regular expression:
import re
lotto = """
January:
[1, 11, 25, 37, 39, 47, 0]
[3, 13, 15, 18, 37, 46, 2]
[3, 6, 9, 12, 37, 46, 6]
February:
[3, 13, 15, 18, 37, 46, 0]
[1, 23, 17, 18, 37, 46, 8]
"""
def getMonthlyNumbers(month=None):
    rx = re.compile(r'''
        ^{}:[\n\r]
        (?P<numbers>(?:^\[.+\][\n\r]?)+)'''.format(month), re.M | re.X)
    for match in rx.finditer(lotto):
        # print it or do sth. else here
        print(match.group('numbers'))

getMonthlyNumbers('January')
getMonthlyNumbers('February')
Or, for all months at once, using a dict comprehension:
rx = re.compile(r'^(?P<month>\w+):[\n\r](?P<numbers>(?:^\[.+\][\n\r]?)+)', re.MULTILINE)
result = {m.group('month'): m.group('numbers') for m in rx.finditer(lotto)}
print(result)
which yields
{'January': '[1, 11, 25, 37, 39, 47, 0]\n[3, 13, 15, 18, 37, 46, 2]\n[3, 6, 9, 12, 37, 46, 6]\n', 'February': '[3, 13, 15, 18, 37, 46, 0]\n[1, 23, 17, 18, 37, 46, 8]\n'}
The idea here is to look for the month name at the start of a line and then capture every [...] pair that follows it. See a demo on regex101.com.
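To make the pieces of that pattern easier to follow, here is a sketch of the same expression written in verbose mode with one comment per part. The shortened lotto string and the blocks name are only for illustration.

```python
import re

lotto = """January:
[1, 11, 25, 37, 39, 47, 0]
[3, 13, 15, 18, 37, 46, 2]
February:
[3, 13, 15, 18, 37, 46, 0]
"""

rx = re.compile(r'''
    ^(?P<month>\w+):      # month name at the start of a line, then a colon
    [\n\r]                # the line break after the header
    (?P<numbers>          # capture the whole block of draws
        (?:
            ^\[.+\]       # one bracketed line of numbers
            [\n\r]?       # optional line break (the last line may lack one)
        )+                # one or more such lines
    )
''', re.MULTILINE | re.VERBOSE)

# Map each month to the raw text lines that were captured for it.
blocks = {m.group('month'): m.group('numbers').splitlines()
          for m in rx.finditer(lotto)}
print(blocks)
```

The captured lines are still strings at this point; turning them into lists of ints is a separate step.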
Most likely you want each line as a separate list (rather than one string), so you could go for:
import re
from ast import literal_eval
lotto = """
January:
[1, 11, 25, 37, 39, 47, 0]
[3, 13, 15, 18, 37, 46, 2]
[3, 6, 9, 12, 37, 46, 6]
February:
[3, 13, 15, 18, 37, 46, 0]
[1, 23, 17, 18, 37, 46, 8]
"""
rx = re.compile(r'^(?P<month>\w+):[\n\r](?P<numbers>(?:^\[.+\][\n\r]?)+)', re.MULTILINE)
result = {m.group('month'):
          [literal_eval(numbers)
           for numbers in m.group('numbers').split("\n") if numbers]
          for m in rx.finditer(lotto)}
print(result)
You can use a regular expression to extract the month names, ast.literal_eval() to parse the lists of numbers for each month, and a defaultdict to store them, without having to check whether the month exists before adding a list:
from collections import defaultdict
import ast
import re

with open('file.txt') as file:
    months = defaultdict(list)
    month = None
    for line in file:
        line = line.strip()
        m = re.match(r'([A-Z][a-z]+):', line)
        if m is not None:
            # header line such as "January:" starts a new month
            month = m.group(1)
        elif line.startswith('['):
            months[month].append(ast.literal_eval(line))

for month, numbers in months.items():
    print('{}: {}'.format(month, numbers))
Output:
January: [[1, 11, 25, 37, 39, 47, 0], [3, 13, 15, 18, 37, 46, 2], [3, 6, 9, 12, 37, 46, 6]]
February: [[3, 13, 15, 18, 37, 46, 0], [1, 23, 17, 18, 37, 46, 8]]
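Once the draws are stored per month like this, a quick sanity check can confirm that every entry matches the format described in the question (7 ints, 6 out of 49 plus a supernumber). The sketch below is only an illustration: the function name check_draw is made up here, and the 0..9 range for the supernumber is an assumption that fits the sample data but is not stated in the question.

```python
def check_draw(draw):
    """Return a list of problems found in one draw (empty if it looks fine)."""
    problems = []
    if len(draw) != 7:
        problems.append(f"expected 7 numbers, got {len(draw)}")
        return problems
    main, supernumber = draw[:6], draw[6]
    if len(set(main)) != 6:
        problems.append("main numbers are not distinct")
    if any(not 1 <= n <= 49 for n in main):
        problems.append("main number outside 1..49")
    if not 0 <= supernumber <= 9:  # ASSUMPTION: the supernumber is a single digit
        problems.append("supernumber outside 0..9")
    return problems


months = {
    'January': [[1, 11, 25, 37, 39, 47, 0], [3, 13, 15, 18, 37, 46, 2]],
    'February': [[3, 13, 15, 18, 37, 46, 0], [1, 1, 25, 37, 39, 50, 12]],
}

for month, draws in months.items():
    for draw in draws:
        for problem in check_draw(draw):
            print(f"{month} {draw}: {problem}")
```

The last February entry is deliberately broken so that the loop has something to report.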
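As Klaus D. commented, you need a dictionary. But I suspect that alone is not enough, so here is a broader answer.
One problem: your code is inconsistent with the input data you gave. Your code splits the numbers on whitespace, but the input data uses square brackets and commas instead. The code below works with the input you provided.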
# Parser states:
# 0: waiting for a month name
# 1: expecting numbers in the format [1, 11, 25, 37, 39, 47, 0]

from collections import defaultdict

state = 0
tipps = defaultdict(list)
monthname = None

with open("numbers.txt", "r") as f:
    for line in f:
        if state == 0:
            if line.strip().endswith(":"):
                monthname = line.split(":")[0]
                state = 1
            continue
        if state == 1:
            if line.startswith("["):
                line = line.strip().strip("[]")
                numbers = line.split(",")
                tipps[monthname].append([int(n) for n in numbers])
            elif not line.strip():
                state = 0
            else:
                print(f"Unexpected data, parser stuck: {line}")
                break

for k, v in tipps.items():
    print(f"{k}: {v}")
The output is:
January: [[1, 11, 25, 37, 39, 47, 0], [3, 13, 15, 18, 37, 46, 2], [3, 6, 9, 12, 37, 46, 6]]
February: [[3, 13, 15, 18, 37, 46, 0], [1, 23, 17, 18, 37, 46, 8]]
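The state machine above returns to state 0 only on a blank line. If the file has no blank line between one month's last draw and the next header, a variation that accepts a header in either state may be useful. The following is a sketch under that assumption; the names parse, WAIT_MONTH and READ_NUMBERS are invented for the example.

```python
from collections import defaultdict

WAIT_MONTH, READ_NUMBERS = 0, 1


def parse(lines):
    """Two-state parser that accepts a month header in either state."""
    state = WAIT_MONTH
    tipps = defaultdict(list)
    monthname = None
    for line in lines:
        text = line.strip()
        if text.endswith(":"):
            # a header such as "January:" always starts a new month
            monthname = text[:-1]
            state = READ_NUMBERS
        elif state == READ_NUMBERS and text.startswith("["):
            numbers = text.strip("[]").split(",")
            tipps[monthname].append([int(n) for n in numbers])
        elif not text:
            state = WAIT_MONTH  # a blank line ends the current block
        elif state == READ_NUMBERS:
            raise ValueError(f"Unexpected data: {text!r}")
        # any other line while waiting for a month name is ignored
    return dict(tipps)


sample = """January:
[1, 11, 25, 37, 39, 47, 0]
[3, 13, 15, 18, 37, 46, 2]
February:
[3, 13, 15, 18, 37, 46, 0]

March:
[1, 23, 17, 18, 37, 46, 8]
"""

result = parse(sample.splitlines())
for k, v in result.items():
    print(f"{k}: {v}")
```

Raising an exception instead of printing and breaking is a design choice of this sketch; it makes malformed input harder to overlook.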