Python dictionary storing values incorrectly?

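I'm using Python's openpyxl package to read an Excel file and store cell values together with their parent values in a dictionary. Cells that are not bold are treated as 'Tasks', and bold cells are treated as 'Summaries'.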

Here is an example of the Excel file I'm trying to read:

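For each task, I want to store the task name and its summaries (as a list) in a dictionary. For example, in the sample Excel file, Task 4 would be stored under the name 'Task 4', and its summaries would be ['First Summary', 'Nested Summary 2']. I work out the nested parent summaries from the number of leading spaces.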
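The num_left_spaces helper that the code relies on is not shown in the question. A minimal sketch of what it might look like, assuming the nesting level is encoded purely as leading space characters in the cell text (this implementation is a guess, not the original):

```python
def num_left_spaces(text):
    """Count leading space characters in a cell's text (hypothetical helper)."""
    return len(text) - len(text.lstrip(' '))

print(num_left_spaces('First Summary'))         # 0
print(num_left_spaces('    Nested Summary 1'))  # 4
```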

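My problem is that inside the while loop the summaries list is computed correctly, but when I print all the task names and summaries from the dictionary afterwards, the summaries are wrong.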

from openpyxl import load_workbook

wb = load_workbook(filename='example.xlsx')
sheet = wb['Sheet1']

tasks = {}

task_summaries = []
curr_left_spaces = -1

i = 2
current_cell = sheet[f'A{i}']

while current_cell.value:
    if current_cell.font.bold:
        # calculate number of leading spaces to determine nesting level
        left_spaces = num_left_spaces(current_cell.value) 
        curr_summary = current_cell.value.strip()

        if left_spaces > curr_left_spaces:
            task_summaries.append(curr_summary)
            curr_left_spaces = left_spaces
        elif left_spaces < curr_left_spaces:
            task_summaries = [curr_summary]
            curr_left_spaces = left_spaces
        else:
            assert (left_spaces == curr_left_spaces)
            task_summaries.pop()
            task_summaries.append(curr_summary)

    else:
        task_name = current_cell.value.strip() 

        # prints correct task_summaries list here
        print(task_name, task_summaries) 

        tasks[task_name] = task_summaries

    i += 1
    current_cell = sheet[f'A{i}']


for name, summary in tasks.items():
    print(name, summary) # summary is incorrect here

Expected result:

Task 1 ['First Summary']
Task 2 ['First Summary', 'Nested Summary 1']
Task 3 ['First Summary', 'Nested Summary 1']
Task 4 ['First Summary', 'Nested Summary 2']
Task 5 ['Second Summary']
Task 6 ['Second Summary']
Task 1 ['First Summary']
Task 2 ['First Summary', 'Nested Summary 1']
Task 3 ['First Summary', 'Nested Summary 1']
Task 4 ['First Summary', 'Nested Summary 2']
Task 5 ['Second Summary']
Task 6 ['Second Summary']

Actual result:

Task 1 ['First Summary']
Task 2 ['First Summary', 'Nested Summary 1']
Task 3 ['First Summary', 'Nested Summary 1']
Task 4 ['First Summary', 'Nested Summary 2']
Task 5 ['Second Summary']
Task 6 ['Second Summary']
Task 1 ['First Summary', 'Nested Summary 2']
Task 2 ['First Summary', 'Nested Summary 2']
Task 3 ['First Summary', 'Nested Summary 2']
Task 4 ['First Summary', 'Nested Summary 2']
Task 5 ['Second Summary']
Task 6 ['Second Summary']

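Your problem is that you use the same task_summaries list for every entry: each new task you add to the dictionary gets a value that refers to that same list object, so later changes to the list show up under every task that shares it.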

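So after Task 4, every entry stored so far has the value ['First Summary', 'Nested Summary 2']. Then, at 'Second Summary' (just before Task 5), you do task_summaries = [curr_summary], which binds task_summaries to a new list object, and the last two tasks both refer to that new list.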
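The aliasing can be checked without the spreadsheet. The sketch below replays the same loop over a hard-coded list of (text, is_bold) pairs standing in for the cells. The rows data and its indentation widths are made up for illustration, and num_left_spaces is a guessed implementation of the helper the question does not show.

```python
def num_left_spaces(text):
    # Hypothetical helper: number of leading spaces in the cell text
    return len(text) - len(text.lstrip(' '))

# Stand-in for column A: (cell text, is the cell bold?)
rows = [
    ('First Summary', True),
    ('Task 1', False),
    ('    Nested Summary 1', True),
    ('Task 2', False),
    ('Task 3', False),
    ('    Nested Summary 2', True),
    ('Task 4', False),
    ('Second Summary', True),
    ('Task 5', False),
    ('Task 6', False),
]

tasks = {}
task_summaries = []
curr_left_spaces = -1

for text, bold in rows:
    if bold:
        left_spaces = num_left_spaces(text)
        curr_summary = text.strip()
        if left_spaces > curr_left_spaces:
            task_summaries.append(curr_summary)  # mutates the shared list
        elif left_spaces < curr_left_spaces:
            task_summaries = [curr_summary]      # rebinds the name to a new list
        else:
            task_summaries.pop()                 # mutates the shared list
            task_summaries.append(curr_summary)
        curr_left_spaces = left_spaces
    else:
        tasks[text.strip()] = task_summaries     # stores a reference, not a copy

print(tasks['Task 1'] is tasks['Task 4'])  # True: Tasks 1-4 share one list object
print(tasks['Task 5'] is tasks['Task 6'])  # True: Tasks 5-6 share the second list
print(tasks['Task 1'] is tasks['Task 5'])  # False
print(tasks['Task 1'])                     # ['First Summary', 'Nested Summary 2']
```

The `is` operator compares object identity, so `True` here means the two dictionary values are literally the same list, not just equal lists.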

What you need to do is give each entry its own list, so change this line:

tasks[task_name] = task_summaries

to:

tasks[task_name] = list(task_summaries)
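With that one-line change, each dictionary value is a snapshot taken at the moment the task is stored. A minimal sketch using made-up stand-in data instead of the real worksheet (the build_tasks function and the rows list are illustrative names, not part of the original code). Note that list(x), x.copy() and x[:] all make a shallow copy, which is enough here because the list elements are strings.

```python
def build_tasks(rows):
    """rows: iterable of (text, is_bold) pairs standing in for column A."""
    tasks = {}
    task_summaries = []
    curr_left_spaces = -1
    for text, bold in rows:
        if bold:
            # Leading spaces determine the nesting level (assumed encoding)
            left_spaces = len(text) - len(text.lstrip(' '))
            curr_summary = text.strip()
            if left_spaces > curr_left_spaces:
                task_summaries.append(curr_summary)
            elif left_spaces < curr_left_spaces:
                task_summaries = [curr_summary]
            else:
                task_summaries.pop()
                task_summaries.append(curr_summary)
            curr_left_spaces = left_spaces
        else:
            # Copy the list so later mutations don't affect this entry
            tasks[text.strip()] = list(task_summaries)
    return tasks

rows = [
    ('First Summary', True),
    ('Task 1', False),
    ('    Nested Summary 1', True),
    ('Task 2', False),
    ('Task 3', False),
    ('    Nested Summary 2', True),
    ('Task 4', False),
    ('Second Summary', True),
    ('Task 5', False),
    ('Task 6', False),
]

result = build_tasks(rows)
for name, summary in result.items():
    print(name, summary)
```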

A simpler example to demonstrate:

>>> l = [1, 2]
>>> d = {}
>>> d['a'] = l   #  'a' gets a reference to l
>>> l[0] = 3     # so that changes 'a's value too
>>> print(l)
[3, 2]
>>> print(d)
{'a': [3, 2]}

>>> d['a'] = list(l)  # now 'a' gets a new copy of l
>>> l[0] = 4          # so this doesn't affect 'a'
>>> print(l)
[4, 2]
>>> print(d)
{'a': [3, 2]}