Multiplying columns with openpyxl or pandas

Here is my code:

import pandas as pd

data = pd.read_excel('A.xlsx')
df = pd.DataFrame(data, columns=['xCode', 'xYear'])
# Count how many rows share each xCode
df2 = df.assign(repeated=df.groupby('xCode')['xYear'].transform('count'))

print(df2)

      xCode  xYear   repeated
1.    100    1900    3
2.    100    1900    3
3.    100    1934    3
4.    200    1921    1
5.    157    1945    1
       .
       . 
999.  ...    ....    .
1000. ...    ....    .

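For each xCode, I need to find the values in the xYear column that are not associated with it (that is, years that do not appear in any row for that xCode). For example, in row 5 the xCode is 157 and the xYear is 1945, so I need to multiply 157 by each of 1900, 1934 and 1921.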

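My attempt so far is only pseudocode and does not run:

# pseudocode: map each xCode to the xYear values it appears with
dict1 = {xCode: xYear}
xc = xCode
xy = xYear
for x in dict1:
    if xy appears in the rows for xc: continue
    else: compute the product xc * xy

The result I want looks like this: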


          xCode  repeated    IsntConnectWith    multiplication
    1.    100    2           1921               100*1921 (just the result)
    2.    100    2           1945               100*1945
    3.    200    1           1900               200*1900
    4.    200    1           1934               200*1934
    5.    200    1           1945               200*1945
    6.    157    1           1900               157*1900
    7.    157    1           1934               157*1934
    8.    157    1           1921               157*1921

    999.  ...    .           ....               ........

This is the result I need, using either openpyxl or pandas.

Use pivot_table to reshape your dataframe and keep only the unconnected years:

out = df.pivot_table('xYear', 'xCode', 'xYear', fill_value=0, sort=False).astype(bool) \
        .replace({True: pd.NA, False: 1}).stack().index.to_frame(index=False) \
        .assign(multiplication=lambda x: x['xYear'] * x['xCode'])
print(out)

# Output
   xCode  xYear  multiplication
0    100   1921          192100
1    100   1945          194500
2    200   1900          380000
3    200   1934          386800
4    200   1945          389000
5    157   1900          298300
6    157   1921          301597
7    157   1934          303638
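
As a cross-check on the pivot_table approach, the same unconnected pairs can probably also be found with a cross join followed by an anti-join. This is only a sketch: the sample frame below is typed in from the five rows shown in the question rather than read from A.xlsx, and the names existing, all_pairs and merged are made up for illustration. It assumes pandas 1.2 or later for merge(how='cross').

```python
import pandas as pd

# Sample rows copied from the question; the real data would come from A.xlsx.
df = pd.DataFrame({
    'xCode': [100, 100, 100, 200, 157],
    'xYear': [1900, 1900, 1934, 1921, 1945],
})

# Distinct (xCode, xYear) pairs that actually occur in the data.
existing = df[['xCode', 'xYear']].drop_duplicates()

# Every possible pairing of a distinct xCode with a distinct xYear.
codes = df[['xCode']].drop_duplicates()
years = df[['xYear']].drop_duplicates()
all_pairs = codes.merge(years, how='cross')

# Anti-join: keep only the pairings that are absent from the data.
merged = all_pairs.merge(existing, on=['xCode', 'xYear'], how='left', indicator=True)
out = (
    merged.loc[merged['_merge'] == 'left_only', ['xCode', 'xYear']]
          .sort_values(['xCode', 'xYear'])   # fixed order, sorted by code then year
          .reset_index(drop=True)
          .assign(multiplication=lambda x: x['xCode'] * x['xYear'])
)
print(out)
```

The rows come out sorted by xCode and xYear, so the order differs from the listing above, but the eight pairs and their products should be the same.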

Note: I don't understand how the repeated column is computed.
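
One possible reading, which is only a guess since the question does not define it: the expected table shows repeated = 2 for xCode 100, while a plain row count gives 3, so repeated might be the number of distinct xYear values per xCode. The sketch below compares both counts on the sample rows and attaches the distinct count to a small stand-in for out (its three rows are copied from the output above). The names counts, rows and distinct_years are made up for illustration.

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    'xCode': [100, 100, 100, 200, 157],
    'xYear': [1900, 1900, 1934, 1921, 1945],
})

# Row count versus distinct-year count per xCode.
counts = df.groupby('xCode')['xYear'].agg(rows='count', distinct_years='nunique')
print(counts)

# Stand-in for the first rows of `out` from the pivot_table code.
out = pd.DataFrame({
    'xCode': [100, 100, 200],
    'xYear': [1921, 1945, 1900],
})

# Assumption: "repeated" is the distinct-year count, looked up by xCode.
out['repeated'] = out['xCode'].map(counts['distinct_years'])
print(out)
```

If repeated is really meant to be the plain row count, mapping counts['rows'] instead would give 3 for xCode 100.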