Pandas dataframe output formatting

Question

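I'm importing a list of trades and trying to consolidate it into a position file with summed quantities and average prices. I'm grouping on (ticker, type, expiration and strike). Two questions: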

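The output has the index group (ticker, type, expiration and strike) in the first column. How can I change this so that each index column is written to its own column and the output csv has the same format as the input data?
I currently force stock trades to have a value ("1"), because leaving the cells blank causes an error, but this adds bad data since "1" is not meaningful. Is there a way to keep "" without causing a problem?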

Dataframe:

    GM      stock   1           1       32      100
    AAPL    call    201612      120     3.5     1000
    AAPL    call    201612      120     3.25    1000
    AAPL    call    201611      120     2.5     2000
    AAPL    put     201612      115     2.5     500
    AAPL    stock   1            1      117     100

Code:

    import pandas as pd
    import numpy as np

    df = pd.read_csv(input_file, index_col=['ticker', 'type', 'expiration', 'strike'], names=['ticker', 'type', 'expiration', 'strike', 'price', 'quantity'])
    df_output = df.groupby(df.index).agg({'price':np.mean, 'quantity':np.sum})
    df_output.to_csv(output_file, sep=',')

The csv output has this format:

(ticker, type, expiration, strike), price, quantity

Desired format:

ticker, type, expiration, strike, price, quantity

Answer 1

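For the first question, group by the key columns themselves rather than by the index: use groupby(df.index_col) instead of groupby(df.index). Note that index_col is not a built-in pandas attribute; in the example below it is just a list of column names that I attach to the frame.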
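To make that concrete with the data from the question, here is a minimal sketch. It assumes the input file is comma-separated with no header row, and it inlines the rows through io.StringIO so it runs without a file. The names `raw`, `keys`, `positions` and `csv_text` are mine, not from the question. It leaves out index_col in read_csv and passes as_index=False to groupby, so every key stays an ordinary column.

```python
import io

import pandas as pd

# Rows from the question, inlined so the sketch runs without an input file.
raw = """GM,stock,1,1,32,100
AAPL,call,201612,120,3.5,1000
AAPL,call,201612,120,3.25,1000
AAPL,call,201611,120,2.5,2000
AAPL,put,201612,115,2.5,500
AAPL,stock,1,1,117,100
"""

keys = ['ticker', 'type', 'expiration', 'strike']

# No index_col here: the grouping keys are read as regular columns.
df = pd.read_csv(io.StringIO(raw), names=keys + ['price', 'quantity'])

# as_index=False keeps each grouping key in its own column of the result.
positions = df.groupby(keys, as_index=False).agg({'price': 'mean', 'quantity': 'sum'})

# index=False drops the row numbers, so the header is exactly the six column names.
csv_text = positions.to_csv(index=False)
print(csv_text)
```

If you would rather keep your existing groupby, calling reset_index() on the result before to_csv should get you the same layout.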

Second, I'm not sure why you can't keep "". Is the column numeric?

I mocked up some data as follows:

    import pandas as pd
    import numpy as np

    # Mock data; '' stands in for a blank strike
    d = [
        {'ticker': 'A', 'type': 'M', 'strike': '', 'price': 32},
        {'ticker': 'B', 'type': 'F', 'strike': 100, 'price': 3.5},
        {'ticker': 'C', 'type': 'F', 'strike': '', 'price': 2.5},
    ]
    df = pd.DataFrame(d)
    print(df)

    #dgroup = df.groupby(['ticker', 'type']).agg({'price':np.mean})
    # index_col is not a pandas attribute, just a list of column names stored on df
    df.index_col = ['ticker', 'type', 'strike']
    dgroup = df.groupby(df.index_col).agg({'price': np.mean})
    #dgroup = df.groupby(df.index).agg({'price':np.mean})
    print(dgroup)
    print(type(dgroup))
    dgroup.to_csv('check.csv')

Output in check.csv:

    ticker,type,strike,price
    A,M,,32.0
    B,F,100,3.5
    C,F,,2.5
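For the blank cells when reading from a file, one possible approach is to read the two optional key columns as strings and tell read_csv not to turn empty fields into NaN. This is only a sketch under assumptions: the input is comma-separated with no header, stock rows leave expiration and strike empty, and the names `raw`, `keys`, `positions` and `csv_text` are mine. A likely source of trouble with blanks is that read_csv turns them into NaN and groupby drops rows whose key is NaN by default, but I can't confirm that is the error you saw.

```python
import io

import pandas as pd

# Hypothetical input where stock rows leave expiration and strike empty.
raw = """GM,stock,,,32,100
AAPL,call,201612,120,3.5,1000
AAPL,call,201612,120,3.25,1000
AAPL,stock,,,117,100
"""

keys = ['ticker', 'type', 'expiration', 'strike']

df = pd.read_csv(
    io.StringIO(raw),
    names=keys + ['price', 'quantity'],
    dtype={'expiration': str, 'strike': str},  # keep the optional keys as text
    keep_default_na=False,                     # '' stays '' instead of becoming NaN
)

# '' is an ordinary string key, so the stock rows form groups like any other row.
positions = df.groupby(keys, as_index=False).agg({'price': 'mean', 'quantity': 'sum'})

csv_text = positions.to_csv(index=False)
print(csv_text)
```

The trade-off is that expiration and strike are text here, so they sort lexically rather than numerically. On pandas 1.1 or later, passing dropna=False to groupby is another option if you want to keep NaN keys instead.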

python

format

export-to-csv

pandas