How to extract a specific value from multiple csv files in a directory, and append them to a dataframe?

Question

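I have a directory with hundreds of csv files, each holding the pixels of a thermal camera image (288x383). I want to get the center value of each file (e.g. 144 x 191), collect those values, and add them to a dataframe that also lists the name of each file.

Here is my code so far, where I create a dataframe containing the list of csv files: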

import os
import glob
import numpy as np
import pandas as pd
os.chdir("/Programming/Proj1/Code/Image_Data")

!ls

Output:
2021-09-13_13-42-16.csv
2021-09-13_13-42-22.csv
2021-09-13_13-42-29.csv
2021-09-13_13-42-35.csv
2021-09-13_13-42-47.csv
2021-09-13_13-42-53.csv
...

file_extension = '.csv'
all_filenames = [i for i in glob.glob(f"*{file_extension}")]
files = glob.glob('*.csv')

all_df = pd.DataFrame(all_filenames, columns=['Full_name'])

all_df.head()
    Full_name
0   2021-09-13_13-42-16.csv
1   2021-09-13_13-42-22.csv
2   2021-09-13_13-42-29.csv
3   2021-09-13_13-42-35.csv
4   2021-09-13_13-42-47.csv
5   2021-09-13_13-42-53.csv
6   2021-09-13_13-43-00.csv

Answer 1

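You can iterate over your files one by one, read each one in as a dataframe, and get the center value you want. Then save this value together with the file name. The resulting list can then be loaded into a new dataframe for you to use.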

result = []
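for file in files:
    # read in the file, you may need to specify some extra parameters
    # check the pandas docs for read_csv
    # (header=None assumes the files hold only pixel values, with no header row)
    df = pd.read_csv(file, header=None)

    # now select the value you want
    # this will vary depending on what your indexes look like (if any)
    # and also your column names
    # (144, 191 is the center position given in the question; adjust as needed)
    row, col = 144, 191
    value = df.iloc[row, col]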

    # append to the list
    result.append((file, value))

# you should now have a list in the format:
# [('2021-09-13_13-42-16.csv', 100), ('2021-09-13_13-42-22.csv', 255), ...

# load the list of tuples as a dataframe for further processing or analysis...
result_df = pd.DataFrame(result)
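
If you would rather not hard-code the center position, the same loop can be wrapped in a small function that works the position out from each file's shape. This is only a sketch: the function name center_values and the Center_value column are made up for illustration, and it assumes every file is a headerless, comma-separated grid of numbers, which may not match your export format.

```python
from pathlib import Path

import numpy as np
import pandas as pd


def center_values(directory, pattern="*.csv", delimiter=","):
    """Collect the center pixel of every matching csv file into a dataframe."""
    rows = []
    for path in sorted(Path(directory).glob(pattern)):
        # Assumption: the file contains only numbers, with no header row.
        frame = np.loadtxt(path, delimiter=delimiter, ndmin=2)
        n_rows, n_cols = frame.shape
        # Integer division gives the middle position,
        # e.g. (144, 191) for a 288x383 image.
        rows.append((path.name, frame[n_rows // 2, n_cols // 2]))
    return pd.DataFrame(rows, columns=["Full_name", "Center_value"])
```

Calling center_values("/Programming/Proj1/Code/Image_Data") should then give one row per file, sorted by file name. If your files use a different separator (some cameras export with ";"), pass it through the delimiter argument.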

python

csv

automation

dataframe