How to change values in a given column based on a condition and put those new values in a new column? (New information added)

I have a pandas DataFrame named trade_lanes that shows the following output:

Departure Country    Arrival Country
Malaysia             Poland
Germany              USA
Germany              Cameroon
Argentina            Vietnam
Algeria              Slovakia
China                Vietnam
Denmark              Singapore

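What I want to do is create 2 new columns in the trade_lanes DataFrame based on the country, "Departure Region" and "Arrival Region", so that the output looks like this: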

Departure Region    Departure Country    Arrival Region    Arrival Country
APAC                Malaysia             NECE              Poland
NECE                Germany              AMERICAS          USA
NECE                Germany              WEMEA             Cameroon
AMERICAS            Argentina            APAC              Viet Nam
WEMEA               Algeria              NECE              Slovakia
APAC                China                APAC              Vietnam
NECE                Denmark              APAC              Japan
Others              Tonga                APAC              Indonesia

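I have been looking at loops to solve this, but I think my approach is wrong, because in some cases a country is labeled as a region, which makes it more complicated. Also note that I believe I need to use a for-if-elif-else loop, because any country that does not belong to one of these lists should be labeled with the "Others" region.

I was thinking of duplicating the "Departure Country" and "Arrival Country" columns and then replacing the values manually, but I'm pretty sure there is an easier way using a for loop.

This is what I tried: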

for elements in range(len(trade_lane)):
    
    apac = {"AUSTRALIA": "APAC", "BANGLADESH": "APAC", 
                  "CHINA": "APAC", "HONG KONG": "APAC",
                  "INDIA": "APAC", "INDONESIA": "APAC",
                  "JAPAN": "APAC", "MALAYSIA": "APAC",
                  "MALAYSIA": "APAC", "NEW ZEALAND": "APAC",
                     "SINGAPORE": "APAC", "KOREA": "APAC",
                    "TAIWAN": "APAC", "THAILAND": "APAC", "VIET NAM": "APAC"}

    nece = {"BELGIUM": "NECE", "CZECH REPUBLIC": "NECE",
                  "DENMARK": "NECE", "GERMANY": "NECE", "HUNGARY": "NECE",
                  "LUXEMBOURG": "NECE", "NETHERLANDS": "NECE",
                  "NORWAY": "NECE", "POLAND": "NECE", "ROMANIA": "NECE",
                  "SLOVAKIA": "NECE", "SWEDEN": "NECE", "TURKEY": "NECE"}
    
    wemea = {"ALGERIA": "WEMEA", "BAHRAIN": "WEMEA", 
                   "CAMEROON": "WEMEA", "CHAD": "WEMEA", "FRANCE": "WEMEA",
                   "GREECE": "WEMEA", "IRISH REPUBLIC": "WEMEA",
                   "ITALY": "WEMEA", "MOROCCO": "WEMEA",
                   "PORTUGAL": "WEMEA", "QATAR": "WEMEA",
                   "SAUDI ARABIA": "WEMEA", "SOUTH AFRICA": "WEMEA",
                   "SPAIN": "WEMEA", "TUNISIA": "WEMEA", "UGANDA": "WEMEA",
                   "UNITED ARAB EMIRATES": "WEMEA", "UNITED KINDGOM":"WEMEA"}
    
    americas = {"ARGENTINA": "AMERICAS", "BRAZIL": "AMERICAS",
                      "CANADA": "AMERICAS", "CHILE": "AMERICAS",
                      "COLOMBIA": "AMERICAS", "MEXICO": "AMERICAS",
                      "PERU": "AMERICAS", "UNITED STATES": "AMERICAS"}
    
    for x,y in apac.items():
        trade_lane["Departure Region"].values = trade_lane["Departure Country"].values[elements].replace(x,y)
        
trade_lane

But I get a KeyError: 'Departure Region'
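The KeyError can be reproduced in isolation. The sketch below uses a tiny stand-in DataFrame (not the real data) and assumes the error comes from reading trade_lane["Departure Region"] before that column exists; assigning to a new column name, by contrast, creates the column.

```python
import pandas as pd

# Stand-in data for illustration only; the real trade_lane has more rows.
trade_lane = pd.DataFrame({"Departure Country": ["Malaysia", "Germany"]})

# Reading a column that has not been created yet raises KeyError.
try:
    trade_lane["Departure Region"]
    raised = False
except KeyError:
    raised = True

# Assigning to a new column name creates the column instead.
trade_lane["Departure Region"] = "Others"
print(raised, list(trade_lane.columns))
```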

Assuming your trade lanes DataFrame is called df:

all_regions = {**apac, **nece, **wemea, **americas}  # merge your dictionaries into one
# the dictionary keys are upper case, so upper-case the country names before mapping
df['Departure Region'] = df['Departure Country'].str.upper().map(all_regions)  # map countries to regions
df['Departure Region'] = df['Departure Region'].fillna('Others')  # any country not found in the map

You can follow the same process for 'Arrival Region'.
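To show both columns handled in one go, here is a minimal, self-contained sketch. The helper name add_region_columns is hypothetical, the dictionaries are abbreviated versions of the ones in the question, and the upper-casing step is an assumption based on the keys being upper case while the sample data is title case.

```python
import pandas as pd

# Abbreviated lookups for illustration; the full dictionaries from the
# question would be merged in exactly the same way.
apac = {"MALAYSIA": "APAC", "CHINA": "APAC", "SINGAPORE": "APAC", "VIET NAM": "APAC"}
nece = {"GERMANY": "NECE", "POLAND": "NECE", "DENMARK": "NECE", "SLOVAKIA": "NECE"}
wemea = {"ALGERIA": "WEMEA", "CAMEROON": "WEMEA"}
americas = {"ARGENTINA": "AMERICAS", "UNITED STATES": "AMERICAS"}
all_regions = {**apac, **nece, **wemea, **americas}


def add_region_columns(df, lookup, default="Others"):
    """Return a copy of df with a region column added for each country column."""
    out = df.copy()
    for side in ("Departure", "Arrival"):
        # Normalise case and whitespace so "Malaysia" matches the key "MALAYSIA".
        countries = out[f"{side} Country"].str.strip().str.upper()
        # Countries missing from the lookup become NaN, then the default label.
        out[f"{side} Region"] = countries.map(lookup).fillna(default)
    return out


trade_lanes = pd.DataFrame({
    "Departure Country": ["Malaysia", "Germany", "Germany", "Argentina",
                          "Algeria", "China", "Denmark"],
    "Arrival Country": ["Poland", "USA", "Cameroon", "Vietnam",
                        "Slovakia", "Vietnam", "Singapore"],
})
result = add_region_columns(trade_lanes, all_regions)
print(result)
```

Note that with these keys "USA" and "Vietnam" fall through to "Others", because the dictionaries spell them "UNITED STATES" and "VIET NAM"; spelling variants like these need to be added to the lookup or normalised first.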

A for-loop version of this could be:

all_regions = {**apac, **nece, **wemea, **americas}  # merge your dictionaries into one

temp = []
for i, row in df.iterrows():
    country = row['Departure Country'].upper()  # dictionary keys are upper case
    if country in all_regions:
        temp.append(all_regions[country])
    # elif ...:  # add corner cases here
    #     do something
    else:
        temp.append('Others')

df['Departure Region'] = temp
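One corner case visible in the sample data is spelling: "USA" and "Vietnam" do not match the dictionary keys "UNITED STATES" and "VIET NAM". The sketch below is one possible way to handle that, assuming a hypothetical aliases dictionary and a hypothetical helper region_for, neither of which is part of the original post.

```python
import pandas as pd

# Abbreviated region lookup for illustration.
all_regions = {"UNITED STATES": "AMERICAS", "VIET NAM": "APAC", "GERMANY": "NECE"}

# Hypothetical alias table mapping spelling variants to the dictionary keys.
aliases = {"USA": "UNITED STATES", "VIETNAM": "VIET NAM"}


def region_for(country, lookup, aliases, default="Others"):
    """Look up a region after normalising case, whitespace and known aliases."""
    key = str(country).strip().upper()
    key = aliases.get(key, key)  # swap in the canonical name if one is known
    return lookup.get(key, default)


df = pd.DataFrame({"Arrival Country": ["USA", "Vietnam", "Viet Nam", "Tonga"]})
df["Arrival Region"] = [region_for(c, all_regions, aliases) for c in df["Arrival Country"]]
print(df)
```

Keeping the aliases separate from the region lookup means each country's region is still defined in one place, and new spellings can be added without touching the region dictionaries.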