openpyxl dataframe filtering

Question

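I've just started using Python, and I need to put everything in column K into a list after filtering columns F and I by their respective values.

So basically, when column F matches stringA and column I matches stringC, save all the values of column K from those rows to a list. My code already imports the right modules and opens and saves the worksheet; I only need help with this part.

I'm sure there are different ways to achieve this.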

l = []
for icol in sheet1.columns:
    coll = icol[0].column
    for cell in icol:
        if(coll == 'F' and cell.value == 'stringA' or coll == 'I' and cell.value == 'stringC'):
            print(coll, cell.value)
            if (coll == 'K'):
                l.append(cell.value)
print(l)

What I really need is to specify the cell name in the append line. Maybe there is a very pythonic way to do this. If I figure it out, I'll share.

Answer 1

Assuming you have pandas, xlrd, and openpyxl installed, this will work:

import pandas as pd

# this example data should result in a list with only 'value 1' and 'value 6'
df = pd.DataFrame([
    [None, None, None, None, None, 'stringA', None, None, 'stringC', None, 'value 1'],
    [None, None, None, None, None, 'stringX', None, None, 'stringC', None, 'value 2'],
    [None, None, None, None, None, 'stringA', None, None, 'stringX', None, 'value 3'],
    [None, None, None, None, None, None     , None, None, 'stringC', None, 'value 4'],
    [None, None, None, None, None, 'stringA', None, None, None     , None, 'value 5'],
    [None, None, None, None, None, 'stringA', None, None, 'stringC', None, 'value 6'],
])

# just writing the file, so you can verify it matches your input data
df.to_excel('test.xlsx', header=False, index=False)

# As @JiWei suggests, but using the column index instead of the name
print(df[(df[5] == 'stringA') & (df[8] == 'stringC')][10].tolist())

Result:

['value 1', 'value 6']
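
For readers new to pandas, the one-line filter can be unpacked into a named boolean mask followed by a `.loc` lookup. This is only a sketch of the same idea on a smaller, made-up frame: the integer column labels 5, 8, and 10 stand in for columns F, I, and K, and the names `mask` and `values` are illustrative, not part of the original answer.

```python
import pandas as pd

# Small illustrative frame; labels 5, 8, 10 correspond to columns F, I, K
# when a sheet is read with header=None (assumed layout, made-up data).
df = pd.DataFrame({
    5: ['stringA', 'stringX', 'stringA', None],
    8: ['stringC', 'stringC', None, 'stringC'],
    10: ['value 1', 'value 2', 'value 3', 'value 4'],
})

# Each comparison yields a boolean Series; & combines them row by row.
# The parentheses are required because & binds tighter than ==.
mask = (df[5] == 'stringA') & (df[8] == 'stringC')

# Select column 10 (K) only for the rows where both conditions hold.
values = df.loc[mask, 10].tolist()
print(values)
```

Only the first row satisfies both conditions here, so only its column 10 value is kept. Empty cells (`None`) simply compare as not equal, so they never match.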

So if you already have a file like test.xlsx, all you need is:

import pandas as pd

df = pd.read_excel('test.xlsx', header=None)
print(df[(df[5] == 'stringA') & (df[8] == 'stringC')][10].tolist())
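
If you would rather stay with openpyxl alone, the same filter can be written by walking the sheet row by row instead of column by column, so that the F, I, and K cells of one row are available together. The sketch below is a hedged illustration, not the original answer's method: `column_values_where` and `make_row` are hypothetical helpers, and the function works on any iterable of row tuples so it can be tried without a workbook.

```python
def column_values_where(rows, conditions, target):
    """Collect row[target] for every row that matches all conditions.

    rows       -- iterable of tuples of cell values, one tuple per row
    conditions -- dict mapping zero-based column index to the required value
    target     -- zero-based index of the column to collect
    (Hypothetical helper written for this illustration.)
    """
    result = []
    for row in rows:
        # Skip rows too short to contain the target column.
        if len(row) <= target:
            continue
        if all(idx < len(row) and row[idx] == wanted
               for idx, wanted in conditions.items()):
            result.append(row[target])
    return result


def make_row(f, i, k):
    """Build an 11-cell row with values in columns F, I, K (indexes 5, 8, 10)."""
    row = [None] * 11
    row[5], row[8], row[10] = f, i, k
    return tuple(row)


# Made-up sample rows mirroring the test data in the answer above.
rows = [
    make_row('stringA', 'stringC', 'value 1'),
    make_row('stringX', 'stringC', 'value 2'),
    make_row('stringA', 'stringX', 'value 3'),
    make_row(None, 'stringC', 'value 4'),
    make_row('stringA', None, 'value 5'),
    make_row('stringA', 'stringC', 'value 6'),
]

k_values = column_values_where(rows, {5: 'stringA', 8: 'stringC'}, 10)
print(k_values)

# With a real worksheet (sheet1, as in the question), the rows could come from
# openpyxl's iter_rows, which yields plain value tuples when values_only=True:
#   k_values = column_values_where(
#       sheet1.iter_rows(values_only=True), {5: 'stringA', 8: 'stringC'}, 10)
```

Iterating by row avoids the problem in the question's loop, where the check on column K can never succeed inside a branch that already requires the column to be F or I.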
