python3: how to print groupby.last()?

Question

$ cat n2.txt
apn,date
3704-156,11/04/2019
3704-156,11/22/2019
5515-004,10/23/2019
3732-231,10/07/2019
3732-231,11/15/2019

$ python3
Python 3.7.5 (default, Oct 25 2019, 10:52:18) 
[Clang 4.0.1 (tags/RELEASE_401/final)] :: Anaconda, Inc. on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas as pd 
>>> df = pd.read_csv("n2.txt")
>>> df
        apn        date
0  3704-156  11/04/2019
1  3704-156  11/22/2019
2  5515-004  10/23/2019
3  3732-231  10/07/2019
4  3732-231  11/15/2019
>>> g = df.groupby('apn')
>>> g.last()
                date
apn                 
3704-156  11/22/2019
3732-231  11/15/2019
5515-004  10/23/2019
>>> f = g.last()

>>> for r in f.itertuples(index=True, name='Pandas'):
...     print(getattr(r,'apn'), getattr(r,'date'))
... 
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
AttributeError: 'Pandas' object has no attribute 'apn'

>>> for r in f.itertuples(index=True, name='Pandas'):
...     print(getattr(r,"apn"), getattr(r,"date"))
... 
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
AttributeError: 'Pandas' object has no attribute 'apn'

What is the correct way to print this to a file?

For example:

apn, date
3704-156,11/22/2019
3732-231,11/15/2019
5515-004,10/23/2019

Answer 1

Your code should be changed to:

df = pd.read_csv("n2.txt")
g = df.groupby('apn')
f = g.last()

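Use DataFrame.to_csv, because f is a one-column pandas DataFrame indexed by apn (not a Series), and the index is written as the first column by default. Here file is the output path or an open file object: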

f.to_csv(file)

Or convert the index into a regular column with reset_index to get a two-column DataFrame, then call DataFrame.to_csv without the index:

f.reset_index().to_csv(file, index=False)
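
As a minimal sketch of the two to_csv variants above: the rows of n2.txt are inlined through io.StringIO so the snippet runs without the file, and the output name last_dates.csv is a hypothetical choice, not something from the question. Because the index is named apn, both calls should produce the same text.

```python
import io
import os
import tempfile

import pandas as pd

# Same rows as n2.txt, inlined so the sketch runs on its own.
RAW = """apn,date
3704-156,11/04/2019
3704-156,11/22/2019
5515-004,10/23/2019
3732-231,10/07/2019
3732-231,11/15/2019
"""

df = pd.read_csv(io.StringIO(RAW))
f = df.groupby("apn").last()  # one-column DataFrame, indexed by apn

# With no path argument, to_csv returns the CSV text as a string.
with_index = f.to_csv()                      # index written as first column
flat = f.reset_index().to_csv(index=False)   # apn turned into a normal column

# Hypothetical output location, used only for illustration.
out_path = os.path.join(tempfile.mkdtemp(), "last_dates.csv")
f.to_csv(out_path)

with open(out_path) as fh:
    written = fh.read()
print(written)
```

Either form works here; reset_index is mainly useful when later steps need apn as an ordinary column.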

Or a solution with DataFrame.drop_duplicates:

df = pd.read_csv("n2.txt")
df = df.drop_duplicates('apn', keep='last')
df.to_csv(file, index=False)
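
One difference worth checking before choosing this variant: drop_duplicates keeps the surviving rows in their original file order, while groupby sorts by key. The sketch below inlines the n2.txt rows (an assumption made so it runs standalone) and adds a sort_values call to match the order shown in the question.

```python
import io

import pandas as pd

# Same rows as n2.txt, inlined so the sketch runs on its own.
RAW = """apn,date
3704-156,11/04/2019
3704-156,11/22/2019
5515-004,10/23/2019
3732-231,10/07/2019
3732-231,11/15/2019
"""

df = pd.read_csv(io.StringIO(RAW))

# Keep the last row seen for each apn; row order follows the input file.
dedup = df.drop_duplicates("apn", keep="last")
file_order = dedup["apn"].tolist()

# Sort by apn to get the same order that groupby('apn').last() produces.
sorted_csv = dedup.sort_values("apn").to_csv(index=False)
print(sorted_csv, end="")
```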

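In your solution, use Index to select the index value, because itertuples exposes the index of the DataFrame under the field name Index, not apn: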

for r in f.itertuples(index=True, name='Pandas'):
    print(getattr(r,'Index'), getattr(r,'date'))
3704-156 11/22/2019
3732-231 11/15/2019
5515-004 10/23/2019
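
To see why the original loop raised AttributeError, it can help to inspect the field names of the tuples that itertuples yields. This is a small sketch with the n2.txt rows inlined (an assumption for self-containment); the Series.items loop at the end is an alternative, not part of the answer above.

```python
import io

import pandas as pd

# Same rows as n2.txt, inlined so the sketch runs on its own.
RAW = """apn,date
3704-156,11/04/2019
3704-156,11/22/2019
5515-004,10/23/2019
3732-231,10/07/2019
3732-231,11/15/2019
"""

f = pd.read_csv(io.StringIO(RAW)).groupby("apn").last()

rows = list(f.itertuples(index=True, name="Pandas"))

# The index is exposed as 'Index'; there is no 'apn' field.
print(rows[0]._fields)

for r in rows:
    print(r.Index, r.date)

# Alternative: iterate over (index, value) pairs of the date column.
pairs = list(f["date"].items())
for apn, date in pairs:
    print(apn, date)
```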

Answer 2

df = pd.read_csv("n2.txt")
g = df.groupby('apn').last()
print(g.to_csv())

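should do what you want.

If you type g.to_csv() in the console, it returns a string that begins with 'apn,date\r\n...' (the line ending may be '\n' instead, depending on the platform). The print function starts a new line at each line ending, which gives the output you want.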
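
The difference between the returned string and its printed form can be checked with the sketch below. The n2.txt rows are inlined as an assumption so it runs standalone; passing sys.stdout to to_csv is an alternative to print, not something the answer above uses.

```python
import io
import sys

import pandas as pd

# Same rows as n2.txt, inlined so the sketch runs on its own.
RAW = """apn,date
3704-156,11/04/2019
3704-156,11/22/2019
5515-004,10/23/2019
3732-231,10/07/2019
3732-231,11/15/2019
"""

g = pd.read_csv(io.StringIO(RAW)).groupby("apn").last()

# With no path argument, to_csv returns the whole CSV as one string.
text = g.to_csv()
print(repr(text))     # shows the embedded line endings literally

# end="" avoids the extra blank line print would otherwise append.
print(text, end="")

# to_csv also accepts an open file-like object such as sys.stdout.
g.to_csv(sys.stdout)
```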
