Export Python dict of nested lists of varying lengths to csv. If a nested list has > 1 entry, expand it down the column before moving to the next key
I have the following dictionary of lists:
d = {1: ['1','B1',['C1','C2','C3']], 2: ['2','B2','C15','D12'], 3: ['3','B3'], 4: ['4', 'B4', 'C4', ['D1', 'D2']]}
Writing it to csv using
with open('test.csv', "w", newline='') as f:
    writer = csv.writer(f)
    writer.writerow(headers)
    writer.writerows(d.values())
gives me a csv that looks like
A  B   C                 D
1  B1  ['C1','C2','C3']
2  B2  C15               D12
3  B3
4  B4  C4                ['D1','D2']
If a value contains a list with multiple items (a nested list?), I would like that list to be expanded down the column, like this:
A  B   C    D
1  B1  C1
1      C2
1      C3
2  B2  C15  D12
3  B3
4  B4  C4   D1
4           D2
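I'm fairly new to Python, and after a few days of sifting through forums and banging my head against the wall I can't seem to find a way to do what I need. I think I may need to split up the nested lists, but I need to keep them tied to their respective "A" value. Columns A and B will always have 1 entry; columns C and D can have 1 to X entries.
Any help is much appreciated.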
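It seems like it would be easier to build a list of lists, with blank spaces in the appropriate places, than what you're doing now. Here's something that might do it: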
import csv
from itertools import zip_longest

def condense(dct):
    # get the maximum number of columns of any list (not counting the index)
    num_cols = len(max(dct.values(), key=len)) - 1
    # Ignore the key, it's not really relevant.
    for _, v in dct.items():
        # first, remember the index of this list,
        # since we need to repeat it no matter what
        idx = v[0]
        # next, use zip_longest to pad the row out to num_cols columns.
        # We deliberately make a 2d list (one inner list per column),
        # and we will later withdraw elements from it one by one.
        matrix = [([] if elem is None else
                   [elem] if not isinstance(elem, list) else
                   elem[:]  # shallow copy to avoid altering original dict
                   ) for elem, _ in zip_longest(v[1:], range(num_cols), fillvalue=None)
                  ]
        # Now, we output the top row of the matrix as long as it has contents
        while any(matrix):
            # If a column in the matrix is empty, we put an empty string.
            # Otherwise, we pop its first element, so each row we output
            # is also removed from the matrix, emptying it top-to-bottom.
            # *-notation is more convenient than concatenating these two lists.
            yield [idx, *((col.pop(0) if col else '') for col in matrix)]
            # e.g. for key 0 and a matrix that looks like this:
            #   [['a1', 'a2'],
            #    ['b1'],
            #    ['c1', 'c2', 'c3']]
            # this would yield the following three lists before moving on:
            #   ['0', 'a1', 'b1', 'c1']
            #   ['0', 'a2', '', 'c2']
            #   ['0', '', '', 'c3']
            # where '' should parse into an empty column in the resulting CSV.
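The most important thing to note here is that I use isinstance(elem, list) as a shorthand to check whether the thing is a list (you need to be able to do that, one way or another, in order to flatten or pad out lists the way we do here). If you have more complicated or more varied data structures, you'll need to improvise with this check: maybe write a helper function isiterable() that tries to iterate over the value and returns a boolean based on whether doing so raised an error.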
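As a rough illustration of that idea, here is a minimal sketch of what such a helper could look like. The name isiterable() comes from the paragraph above, but the body is only one possible take on it, and to_column() is a hypothetical helper added here for illustration. One assumption is worth stating loudly: strings and bytes are technically iterable, so the sketch treats them as single cell values, otherwise 'C15' would be split into characters.

```python
def isiterable(obj):
    """Return True if obj can be iterated over and is not a string.

    Assumption: str and bytes count as single cell values here,
    even though Python can iterate over them character by character.
    """
    if isinstance(obj, (str, bytes)):
        return False
    try:
        iter(obj)
    except TypeError:
        # iter() raises TypeError for non-iterable objects such as int or None
        return False
    return True


def to_column(elem):
    """Hypothetical helper: turn one cell into a list of values for its column."""
    if elem is None:
        return []
    if isiterable(elem):
        return list(elem)  # copy, so the original data is not modified
    return [elem]
```

With something like this, the isinstance(elem, list) check in condense() could be swapped for to_column(elem), which would also accept tuples or other iterables as nested values.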
Having done that, we can call condense() on d and let the csv module handle the output.
headers = ['A', 'B', 'C', 'D']
d = {1: ['1','B1',['C1','C2','C3']], 2: ['2','B2','C15','D12'], 3: ['3','B3'], 4: ['4', 'B4', 'C4', ['D1', 'D2']]}
# condense(d) produces
# [['1', 'B1', 'C1', '' ],
# ['1', '', 'C2', '' ],
# ['1', '', 'C3', '' ],
# ['2', 'B2', 'C15', 'D12'],
# ['3', 'B3', '', '' ],
# ['4', 'B4', 'C4', 'D1' ],
# ['4', '', '', 'D2' ]]
with open('test.csv', "w", newline='') as f:
    writer = csv.writer(f)
    writer.writerow(headers)
    writer.writerows(condense(d))
This produces the following file:
A,B,C,D
1,B1,C1,
1,,C2,
1,,C3,
2,B2,C15,D12
3,B3,,
4,B4,C4,D1
4,,,D2
This is equivalent to your expected output. Hopefully the solution is extensible enough for you to apply it to your non-MVCE problem.
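If it helps when adapting this, the same expansion can also be sketched more compactly by letting zip_longest line up the columns directly, and the result can be checked in memory before writing a real file. This is an alternative sketch, not the code above: expand_rows() and its width parameter are names made up for this example, and it assumes, as in the question, that the first element of each list is the index to repeat.

```python
import csv
import io
from itertools import zip_longest


def expand_rows(dct, width):
    """Yield CSV rows, expanding nested lists down their column.

    width is the total number of columns (assumed to match the header).
    """
    for values in dct.values():
        idx, *rest = values
        # Wrap plain values so that every column is a list
        columns = [c if isinstance(c, list) else [c] for c in rest]
        # zip_longest pairs up the n-th item of each column, padding with ''
        for line in zip_longest(*columns, fillvalue=''):
            row = [idx, *line]
            row += [''] * (width - len(row))  # pad short rows to full width
            yield row


d = {1: ['1', 'B1', ['C1', 'C2', 'C3']], 2: ['2', 'B2', 'C15', 'D12'],
     3: ['3', 'B3'], 4: ['4', 'B4', 'C4', ['D1', 'D2']]}

# Write to an in-memory buffer instead of a file to inspect the result
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator='\n')
writer.writerow(['A', 'B', 'C', 'D'])
writer.writerows(expand_rows(d, 4))
text = buffer.getvalue()
print(text)
```

Unlike condense(), this version does not copy or pop from the nested lists, since zip_longest only reads them.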