Specify the data type of each column in read_csv from each column of an HTTP header

Question

project_id = request.data['project']
list_fields = request.POST.getlist('headers')
type_fields = request.POST.getlist('type')

dataframe = pandas.read_csv(file_path, header=0)
for field in list_fields:
    for tipo in type_fields:
        dataframe[field] = dataframe[field].astype(type)

How can I assign each data type to its corresponding column, based on what is passed in the request?

Answer 1

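You can pass from the front end one list with all the columns you want to define, and another list with the types you want for those columns. After that you can cast them in a loop.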

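to_define_list_fields = request.POST.getlist('define')
type_list_fields = request.POST.getlist('types')

# Pair each column with its own type. Nested loops would cast every
# column to every type in turn, leaving all columns with the last type.
for dfield, dtype in zip(to_define_list_fields, type_list_fields):
    dataframe[dfield] = dataframe[dfield].astype(dtype)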
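
Since the question is about read_csv itself, the two lists can also be turned into a column-to-type mapping and handed to the dtype parameter, so the cast happens while parsing. The following is only a minimal sketch: the literal lists stand in for the request.POST.getlist(...) calls, the in-memory CSV stands in for the uploaded file, and it assumes the front end sends dtype names that pandas understands.

```python
import io

import pandas as pd

# Stand-ins for request.POST.getlist('define') and request.POST.getlist('types')
to_define_list_fields = ["a", "b", "c"]
type_list_fields = ["str", "int64", "float64"]

# Stand-in for the CSV file at file_path
csv_file = io.StringIO("a,b,c\n1,4,7\n2,5,8\n3,6,9\n")

# Build {column: dtype} and let read_csv apply it while parsing
dtype_map = dict(zip(to_define_list_fields, type_list_fields))
dataframe = pd.read_csv(csv_file, header=0, dtype=dtype_map)

print(dataframe.dtypes)
```

Columns missing from the mapping are still left to the usual type inference of read_csv.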

Answer 2

When you use from_csv(), pandas does a lot of type inference, in fact more than other methods such as convert_objects. I asked a question about it, which is somewhat related.

I assume the front-end user has to specify the data type of every column. In that case, it is straightforward:

import numpy as np
import pandas as pd

df = pd.DataFrame({'a':[1, 2, 3], 'b': [4, 5, 6], 'c': [7, 8, 9]}, 
                   dtype=int)

list_fields = ['a', 'b', 'c']
list_types = [str, int, np.float64]

for field, dtype in zip(list_fields, list_types):
    df[field] = df[field].astype(dtype)

print(df.dtypes)

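If the user does not have to specify the data types of all fields, then, on further thought, I think it will depend entirely on how you filter/process the user input.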
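
One possible way to filter the input in that partial case is sketched below. The ALLOWED_TYPES whitelist and the apply_user_types helper are hypothetical names introduced for illustration, not part of pandas or of the original code. The sketch assumes the request carries type names as strings and silently skips unknown columns or type names.

```python
import pandas as pd

# Hypothetical whitelist of type names the front end is allowed to send
ALLOWED_TYPES = {"str": str, "int": "int64", "float": "float64"}


def apply_user_types(df, fields, type_names):
    """Cast only the columns the user named, skipping unknown columns or types."""
    casts = {}
    for field, type_name in zip(fields, type_names):
        if field in df.columns and type_name in ALLOWED_TYPES:
            casts[field] = ALLOWED_TYPES[type_name]
    # astype accepts a {column: dtype} mapping; other columns are left unchanged
    return df.astype(casts)


df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6], "c": [7, 8, 9]})

# The user only specified 'a' and 'c'; 'missing' is not a column and is ignored
result = apply_user_types(df, ["a", "c", "missing"], ["float", "str", "int"])
print(result.dtypes)
```

Whether to skip bad entries, as here, or to reject the whole request is a design choice that depends on how strict the application needs to be.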

python

types

type-inference

http-headers

pandas