How to change values in a given column based on a condition and put those new values in a new column? (New information added)
I have a pandas DataFrame named trade_lanes that shows the following output:
Departure Country | Arrival Country |
---|---|
Malaysia | Poland |
Germany | USA |
Germany | Cameroon |
Argentina | Vietnam |
Algeria | Slovakia |
China | Vietnam |
Denmark | Singapore |
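What I want to do is create 2 new columns in the trade_lanes DataFrame, "Departure Region" and "Arrival Region", based on the country, so that the output looks like this:
Departure Region | Departure Country | Arrival Region | Arrival Country |
---|---|---|---|
APAC | Malaysia | NECE | Poland |
NECE | Germany | AMERICAS | USA |
NECE | Germany | WEMEA | Cameroon |
AMERICAS | Argentina | APAC | Viet Nam |
WEMEA | Algeria | NECE | Slovakia |
APAC | China | APAC | Vietnam |
NECE | Denmark | APAC | Japan |
Others | Tonga | APAC | Indonesia |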
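I have been looking at loops to solve this, but I think my approach is wrong, because in some cases a country is itself labelled as a region, which complicates things. Also note that I believe I need a for-if-elif-else loop, because any country that does not belong to one of these lists should be labelled with the "Others" region.
I was considering copying the "Departure Country" and "Arrival Country" columns and replacing the values manually, but I'm fairly sure there is an easier way with a for loop.
This is what I tried: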
    for elements in range(len(trade_lane)):

        apac = {"AUSTRALIA": "APAC", "BANGLADESH": "APAC",
                "CHINA": "APAC", "HONG KONG": "APAC",
                "INDIA": "APAC", "INDONESIA": "APAC",
                "JAPAN": "APAC", "MALAYSIA": "APAC",
                "MALAYSIA": "APAC", "NEW ZEALAND": "APAC",
                "SINGAPORE": "APAC", "KOREA": "APAC",
                "TAIWAN": "APAC", "THAILAND": "APAC", "VIET NAM": "APAC"}

        nece = {"BELGIUM": "NECE", "CZECH REPUBLIC": "NECE",
                "DENMARK": "NECE", "GERMANY": "NECE", "HUNGARY": "NECE",
                "LUXEMBOURG": "NECE", "NETHERLANDS": "NECE",
                "NORWAY": "NECE", "POLAND": "NECE", "ROMANIA": "NECE",
                "SLOVAKIA": "NECE", "SWEDEN": "NECE", "TURKEY": "NECE"}

        wemea = {"ALGERIA": "WEMEA", "BAHRAIN": "WEMEA",
                 "CAMEROON": "WEMEA", "CHAD": "WEMEA", "FRANCE": "WEMEA",
                 "GREECE": "WEMEA", "IRISH REPUBLIC": "WEMEA",
                 "ITALY": "WEMEA", "MOROCCO": "WEMEA",
                 "PORTUGAL": "WEMEA", "QATAR": "WEMEA",
                 "SAUDI ARABIA": "WEMEA", "SOUTH AFRICA": "WEMEA",
                 "SPAIN": "WEMEA", "TUNISIA": "WEMEA", "UGANDA": "WEMEA",
                 "UNITED ARAB EMIRATES": "WEMEA", "UNITED KINDGOM":"WEMEA"}

        americas = {"ARGENTINA": "AMERICAS", "BRAZIL": "AMERICAS",
                    "CANADA": "AMERICAS", "CHILE": "AMERICAS",
                    "COLOMBIA": "AMERICAS", "MEXICO": "AMERICAS",
                    "PERU": "AMERICAS", "UNITED STATES": "AMERICAS"}

        for x,y in apac.items():
            trade_lane["Departure Region"].values = trade_lane["Departure Country"].values[elements].replace(x,y)
    trade_lane
But I get a KeyError: 'Departure Region'
Assuming your trade-lanes DataFrame is called df:
    all_regions = {**apac, **nece, **wemea, **americas}  # merge your dictionaries into one
    df['Departure Region'] = df['Departure Country'].map(all_regions)  # map countries to regions
    df['Departure Region'] = df['Departure Region'].fillna('Others')  # countries not found in the map
You can repeat the same process for 'Arrival Region'.
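One thing to watch for: the dictionary keys in the question are upper case ("MALAYSIA"), while the sample data is mixed case ("Malaysia") and uses spellings such as "USA" and "Vietnam" that are not keys at all. A direct `map` would then send those rows to "Others". The following is a minimal sketch of one way to handle both columns, assuming it is acceptable to normalise the names first. The `aliases` dict and the `to_region` helper are hypothetical names introduced for illustration, and the lookup is shortened to keep the example small.

```python
import pandas as pd

# Shortened lookup for illustration; in practice use the full merged dict.
all_regions = {
    "MALAYSIA": "APAC", "VIET NAM": "APAC", "SINGAPORE": "APAC",
    "GERMANY": "NECE", "POLAND": "NECE",
    "CAMEROON": "WEMEA",
    "ARGENTINA": "AMERICAS", "UNITED STATES": "AMERICAS",
}

# Hypothetical aliases: spellings in the data that differ from the dict keys.
aliases = {"USA": "UNITED STATES", "VIETNAM": "VIET NAM"}


def to_region(countries: pd.Series) -> pd.Series:
    """Map a column of country names to regions, defaulting to 'Others'."""
    # Normalise whitespace and case, then rewrite known alternative spellings.
    keys = countries.str.strip().str.upper().replace(aliases)
    # Countries missing from the lookup become NaN, then 'Others'.
    return keys.map(all_regions).fillna("Others")


df = pd.DataFrame({
    "Departure Country": ["Malaysia", "Germany", "Argentina", "Tonga"],
    "Arrival Country": ["Poland", "USA", "Vietnam", "Singapore"],
})

# Same process for both sides of the trade lane.
for side in ("Departure", "Arrival"):
    df[f"{side} Region"] = to_region(df[f"{side} Country"])
```

The new columns are appended on the right. If the regions should appear before the countries, as in the desired output, the columns can be reordered afterwards by selecting them in the wanted order.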
A for-loop version of this could be:
    all_regions = {**apac, **nece, **wemea, **americas}  # merge your dictionaries into one
    temp = []
    for i, row in df.iterrows():
        if row['Departure Country'] in all_regions:
            temp.append(all_regions[row['Departure Country']])
        # elif ...:  # add corner cases here
        #     do something
        else:
            temp.append('Others')
    df['Departure Region'] = temp
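As for the KeyError in the question: the statement `trade_lane["Departure Region"].values = ...` first has to look up the "Departure Region" column before it can set anything on it, and that column does not exist yet. Assigning directly to a new column name creates the column instead. A small sketch of the difference, using a made-up two-row frame:

```python
import pandas as pd

df = pd.DataFrame({"Departure Country": ["Malaysia", "Germany"]})

# Reading a column that does not exist yet raises KeyError.
try:
    df["Departure Region"]
    read_failed = False
except KeyError:
    read_failed = True

# Assigning to a new column name creates the column.
df["Departure Region"] = ["APAC", "NECE"]
```

This is why both versions above build the full list or Series of regions first and then assign it to `df['Departure Region']` in a single step.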