Setting Character Limit on Pandas DataFrame Column

Background:
Given the following pandas df -

Holding Account Model Type Entity ID Direct Owner ID
WF LLC | 100 Jones Street 26th Floor San Francisco Ca Ltd Liability - Income Based Gross USA Only (486941515) 51364633 4564564 5646546
RF LLC | Neuberger | LLC | Aukai Services LLC-Neuberger Smid - Income Accuring Net of Fees Worldwide Fund (456456218) 46256325 1645365 4926654

Question:
What is the most pythonic way to enforce an 80-character limit on the values of the Holding Account column (dtype = object)?

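Context: I am writing the df to a .csv, which is then uploaded to a system with an 80-character limit. The values in the Holding Account column are unique, so I only want to sacrifice the characters that push a string past 80 characters.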

My attempt:
This is what I tried - df['column'] = df['column'].str[:80]

Why not use .str, as you already did?

df['Holding Account'] = df['Holding Account'].str[:80]

Output:

>>> df
                                                                    Holding Account  Model Type  Entity ID  Direct Owner ID
0  WF LLC | 100 Jones Street 26th Floor San Francisco Ca Ltd Liability - Income Bas    51364633    4564564          5646546
1  RF LLC | Neuberger | LLC | Aukai Services LLC-Neuberger Smid - Income Accuring N    46256325    1645365          4926654
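
Because the question says the Holding Account values are unique, it may be worth checking that they remain unique after truncation. The following is a minimal sketch, not part of the original answer: it rebuilds the sample frame from the question, and the names LIMIT, truncated, max_len and has_collisions are chosen here purely for illustration.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Holding Account": [
        "WF LLC | 100 Jones Street 26th Floor San Francisco Ca Ltd Liability - Income Based Gross USA Only (486941515)",
        "RF LLC | Neuberger | LLC | Aukai Services LLC-Neuberger Smid - Income Accuring Net of Fees Worldwide Fund (456456218)",
    ],
    "Model Type": [51364633, 46256325],
    "Entity ID": [4564564, 1645365],
    "Direct Owner ID": [5646546, 4926654],
})

LIMIT = 80  # character limit of the target system (from the question)

# Keep only the first LIMIT characters of each value
truncated = df["Holding Account"].str[:LIMIT]

# Longest value after truncation; should not exceed LIMIT
max_len = truncated.str.len().max()

# True if two different accounts became identical after truncation
has_collisions = truncated.duplicated().any()

df["Holding Account"] = truncated
```

If has_collisions turns out to be True on real data, plain slicing is no longer safe for a column that must stay unique, and a mapping approach may be the better fit.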

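Slicing discards some information. I suggest factorizing the column and keeping a mapping table instead. This also saves storage space on the server or db:
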
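# Integer codes, one per row; identical names share the same code
s = df['Holding Account'].factorize()[0]
# Build the code -> original name mapping before overwriting the column
d = dict(zip(s, df['Holding Account']))
df['Holding Account'] = s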

To get the original values back, map the codes through the dictionary:

df['new'] = df['Holding Account'].map(d)
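
The round trip can also be sketched in a self-contained form. This is an illustrative example under stated assumptions, not the answerer's exact code: the short sample names and the variables names, codes, uniques, lookup and restored are made up here, and it uses the top-level pd.factorize, which returns both the integer codes and the unique values.

```python
import pandas as pd

# Hypothetical sample values standing in for long account names
names = pd.Series(
    ["long name A", "long name B", "long name A"],
    name="Holding Account",
)

# codes: integer label per row; uniques: distinct values in order of first appearance
codes, uniques = pd.factorize(names)

# Mapping table from integer code back to the original string
lookup = dict(enumerate(uniques))

# Recover the original strings from the codes
restored = pd.Series(codes).map(lookup)
```

Only the short integer codes would be uploaded in this scheme, while the lookup table stays on your side, so no characters are lost.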