Python self-generating variables?

Question

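I am writing a program that reads a large .txt file containing lotto numbers. Each draw is always an array of 7 ints (6 out of 49, and the last one is the supernumber).

For example: [[1, 11, 25, 37, 39, 47, 0],[3, 13, 15, 18, 37, 46, 0], ...]

The .txt file contains every month, so it looks like this: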

January:
[1, 11, 25, 37, 39, 47, 0]
[3, 13, 15, 18, 37, 46, 2]
[3,  6,  9, 12, 37, 46, 6]

February:
[3, 13, 15, 18, 37, 46, 0]
[1, 23, 17, 18, 37, 46, 8]

...

and so on.

How can I generate an array that holds only the numbers of a given month?

I have a solution, but its coding style is terrible:

jan_tipps = []
feb_tipps = []
mar_tipps = []

#variable which month has to be checked
jan = False
feb = False
mar = False

for line in wholefile:
    if line == '\n':
        pass
    elif line == 'January:\n':
        jan = True
    elif line == 'February:\n':
        jan = False
        feb = True
    elif line == 'March:\n':
        feb = False
        mar = True
    elif jan == True:
        jan_tipps.append(line.split())

    elif feb == True:
        feb_tipps.append(line.split())
        
    elif mar == True:
        mar_tipps.append(line.split())

I think I need something like generics or self-generating variables. I don't know what to search for on the internet.

Answer 1

Create a dictionary of months, with the month names as keys and the list of lists you want as values:

month = {
    m: []
    for m in ['January', 'February']
}
with open('file.txt') as file:
    latest = None
    for line in file:
        line = line.strip()
        if line == '':  # skip lines that are empty after stripping
            continue
        name = line.rstrip(':')  # header lines look like "January:"
        if name in month:
            latest = name
        elif latest is not None:  # ignore data before the first month header
            # if line = "[1, 2]", ast.literal_eval(line) is a better choice than split()
            month[latest].append(line.split())
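
As a minimal sketch of the dictionary idea on the bracketed input from the question: the function name parse_months and the in-memory sample (an io.StringIO standing in for the real file) are assumptions made for illustration, and ast.literal_eval is used in place of split() so that each line becomes a list of ints.

```python
import io
from ast import literal_eval

# Hypothetical sample standing in for the real .txt file.
sample = io.StringIO("""January:
[1, 11, 25, 37, 39, 47, 0]
[3, 13, 15, 18, 37, 46, 2]

February:
[3, 13, 15, 18, 37, 46, 0]
""")

def parse_months(lines):
    """Group bracketed number lines under the most recent 'Month:' header."""
    months = {}
    latest = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # blank separator line
        if line.endswith(':'):
            latest = line[:-1]
            months.setdefault(latest, [])
        elif latest is not None:
            # "[1, 11, ...]" is a valid Python list literal
            months[latest].append(literal_eval(line))
    return months

tipps = parse_months(sample)
print(tipps['January'])  # [[1, 11, 25, 37, 39, 47, 0], [3, 13, 15, 18, 37, 46, 2]]
```

Because the keys are created as headers are encountered, this variant does not need the list of month names up front.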

Answer 2

You can use a regular expression:

import re

lotto = """
January:
[1, 11, 25, 37, 39, 47, 0]
[3, 13, 15, 18, 37, 46, 2]
[3,  6,  9, 12, 37, 46, 6]

February:
[3, 13, 15, 18, 37, 46, 0]
[1, 23, 17, 18, 37, 46, 8]
"""

def getMonthlyNumbers(month=None):
    rx = re.compile(r'''
        ^{}:[\n\r]
        (?P<numbers>(?:^\[.+\][\n\r]?)+)'''.format(month), re.M | re.X)

    for match in rx.finditer(lotto):
        # print it or do sth. else here
        print(match.group('numbers'))

getMonthlyNumbers('January')
getMonthlyNumbers('February')

Or, for all months at once, using a dict comprehension:

rx = re.compile(r'^(?P<month>\w+):[\n\r](?P<numbers>(?:^\[.+\][\n\r]?)+)', re.MULTILINE)

result = {m.group('month'): m.group('numbers') for m in rx.finditer(lotto)}

print(result)

This yields

{'January': '[1, 11, 25, 37, 39, 47, 0]\n[3, 13, 15, 18, 37, 46, 2]\n[3,  6,  9, 12, 37, 46, 6]\n', 'February': '[3, 13, 15, 18, 37, 46, 0]\n[1, 23, 17, 18, 37, 46, 8]\n'}

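The idea here is to look for a month name at the start of a line and then capture any [...] pairs that follow. See a demo on regex101.com.

You probably want each line as a list of its own (rather than one string), so you could go for: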

import re
from ast import literal_eval

lotto = """
January:
[1, 11, 25, 37, 39, 47, 0]
[3, 13, 15, 18, 37, 46, 2]
[3,  6,  9, 12, 37, 46, 6]

February:
[3, 13, 15, 18, 37, 46, 0]
[1, 23, 17, 18, 37, 46, 8]
"""

rx = re.compile(r'^(?P<month>\w+):[\n\r](?P<numbers>(?:^\[.+\][\n\r]?)+)', re.MULTILINE)

result = {m.group('month'): 
    [literal_eval(numbers) 
    for numbers in m.group('numbers').split("\n") if numbers] 
    for m in rx.finditer(lotto)}

print(result)
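
To get the numbers of a single month, the same pattern could be wrapped in a small helper. This is only a sketch: the names MONTH_BLOCK and numbers_for are hypothetical, and the sample text is shortened.

```python
import re
from ast import literal_eval

# Same pattern as in the answer; the constant and helper names are made up here.
MONTH_BLOCK = re.compile(
    r'^(?P<month>\w+):[\n\r](?P<numbers>(?:^\[.+\][\n\r]?)+)', re.MULTILINE)

def numbers_for(text, month):
    """Return the parsed draws of one month, or an empty list if absent."""
    for m in MONTH_BLOCK.finditer(text):
        if m.group('month') == month:
            return [literal_eval(row)
                    for row in m.group('numbers').splitlines() if row]
    return []

lotto = """
January:
[1, 11, 25, 37, 39, 47, 0]
[3, 13, 15, 18, 37, 46, 2]

February:
[3, 13, 15, 18, 37, 46, 0]
"""

print(numbers_for(lotto, 'February'))  # [[3, 13, 15, 18, 37, 46, 0]]
print(numbers_for(lotto, 'March'))     # []
```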

Answer 3

You can use a regular expression to extract the month names, ast.literal_eval() to parse the lists of numbers for each month, and a defaultdict to store them, so you don't have to check whether a month exists before adding a list to it:

from collections import defaultdict
import ast
import re

with open('file.txt') as file:
    months = defaultdict(list)
    month = None
    for line in file:
        line = line.strip()
        m = re.match('([A-Z][a-z]+):', line)
        if m is not None:
            month = m.group(1)
        elif line.startswith('['):
            months[month].append(ast.literal_eval(line))
    for month, numbers in months.items():
        print('{}: {}'.format(month, numbers))

Output:

January: [[1, 11, 25, 37, 39, 47, 0], [3, 13, 15, 18, 37, 46, 2], [3, 6, 9, 12, 37, 46, 6]]
February: [[3, 13, 15, 18, 37, 46, 0], [1, 23, 17, 18, 37, 46, 8]]
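
The point about not having to check whether the month exists can be seen in isolation. A small sketch, comparing a plain dict with collections.defaultdict (the variable names are illustrative only):

```python
from collections import defaultdict

draw = [1, 11, 25, 37, 39, 47, 0]

plain = {}
try:
    plain['January'].append(draw)
except KeyError:
    # A plain dict requires the key to exist before appending.
    plain.setdefault('January', []).append(draw)

months = defaultdict(list)
# The missing key is created with an empty list on first access.
months['January'].append(draw)

print(dict(months) == plain)  # True
```

One consequence worth keeping in mind: with a defaultdict, a number line that appears before any month header would be stored silently under the key None instead of raising an error.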

Answer 4

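As Klaus D. commented, you need a dictionary. But I suspect that alone is not enough, so here is a more extensive answer.

One issue: your code is inconsistent with the input data you provided. Your code splits the numbers on whitespace, but the input data uses square brackets and commas instead. The following code works with the input you provided.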

# Parser states:
# 0: waiting for a month name
# 1: expecting numbers in the format [1, 11, 25, 37, 39, 47, 0]

from collections import defaultdict

state = 0
tipps = defaultdict(list)
monthname = None

with open("numbers.txt","r") as f:
    for line in f:
        if state == 0:
            if line.strip().endswith(":"):
                monthname = line.split(":")[0]
                state = 1
            continue
        if state == 1:
            if line.startswith("["):
                line = line.strip().strip("[]")
                numbers = line.split(",")
                tipps[monthname].append([int(n) for n in numbers])
            elif not line.strip():
                state = 0
            else:
                print (f"Unexpected data, parser stuck: {line}")
                break

for k,v in tipps.items():
    print (f"{k}: {v}")

The output is:

January: [[1, 11, 25, 37, 39, 47, 0], [3, 13, 15, 18, 37, 46, 2], [3, 6, 9, 12, 37, 46, 6]]
February: [[3, 13, 15, 18, 37, 46, 0], [1, 23, 17, 18, 37, 46, 8]]
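
The mismatch between whitespace splitting and the bracketed input can be illustrated on a single line. A short sketch, using one line of the sample data:

```python
line = "[3,  6,  9, 12, 37, 46, 6]\n"

# Whitespace splitting keeps the brackets and commas inside the strings.
as_tokens = line.split()
print(as_tokens)  # ['[3,', '6,', '9,', '12,', '37,', '46,', '6]']

# Stripping the brackets first and splitting on commas gives clean integers;
# int() tolerates the spaces around each number.
as_numbers = [int(n) for n in line.strip().strip("[]").split(",")]
print(as_numbers)  # [3, 6, 9, 12, 37, 46, 6]
```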

