Determining Odd and Even values with Pandas in Python
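I have the following IP addresses, and I want to classify them by their last number.
An IPv4 address consists of four numbers:
- each contains one to three digits (0-255)
- a single dot (.) separates each number or group of digits
Now I want to look at the last number of each IP address: if it is [Odd], the related column is filled with Odd, and if it is [Even], it is filled with Even.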
IP Address
192.168.1.1 #last digit is 1 and consider it as odd
192.168.1.2 #last digit is 2 and even
192.168.152.200 #last digit is 200 and is even
192.168.54.98 #last digit is 98 and is even
192.168.98.93 #last digit is 93 and is odd
.....
......
Expected result:
IP Address Status
192.168.1.1 Odd
192.168.1.2 Even
192.168.152.200 Even
192.168.54.98 Even
192.168.98.93 Odd
........
........
Data:
import pandas as pd

df = pd.DataFrame({"IP Address" :
                   ["192.168.1.1",
                    "192.168.1.2",
                    "192.168.152.200",
                    "192.168.54.98",
                    "192.168.98.93"]})
df:
IP Address
0 192.168.1.1
1 192.168.1.2
2 192.168.152.200
3 192.168.54.98
4 192.168.98.93
df['New-variable'] = df['IP Address'].apply(lambda x:"Odd" if int(x.split(".")[-1]) % 2 else "Even")
df:
IP Address New-variable
0 192.168.1.1 Odd
1 192.168.1.2 Even
2 192.168.152.200 Even
3 192.168.54.98 Even
4 192.168.98.93 Odd
You can try this:
import pandas as pd

IPList = ["192.168.1.1",
          "192.168.1.1",
          "192.168.1.2",
          "192.168.152.200",
          "192.168.54.98",
          "192.168.98.93"]
final_list = []
for ip in IPList:
    _ip = ip.split(".")
    # The last number decides the label
    if int(_ip[-1]) % 2 == 0:
        final_list.append([ip, 'Even'])
        continue
    final_list.append([ip, 'Odd'])
df = pd.DataFrame(final_list, columns=['IP Address', 'Type'])
df
Output:
IP Address Type
0 192.168.1.1 Odd
1 192.168.1.1 Odd
2 192.168.1.2 Even
3 192.168.152.200 Even
4 192.168.54.98 Even
5 192.168.98.93 Odd
Assuming your df is called x, you can do this:
import numpy as np
# First remove the '.' so that you can convert to float
x['IP_num'] = (x['IP'].apply(lambda x: ''.join([ch for ch in x if ch.isdigit()]))).astype(float)
# Then create a new column if the IP is odd / even
x['even_odd'] = np.where(x['IP_num'] % 2 == 0,'Even','Odd')
Output:
IP IP_num even_odd
0 192.168.1.1 19,216,811.00 Odd
1 192.168.1.1 19,216,811.00 Odd
2 192.168.152.200 192,168,152,200.00 Even
3 192.168.54.98 1,921,685,498.00 Even
4 192.168.98.93 1,921,689,893.00 Odd
If needed, you can drop the 'IP_num' column.
This just shows the input:
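A minimal sketch of dropping the helper column once the label exists. It assumes a DataFrame named x with an 'IP' column, as in this answer; the sample rows are illustrative, and str.replace is swapped in for the character-by-character join as a shorter way to strip the dots.

```python
import numpy as np
import pandas as pd

# Sample frame, assumed to mirror the one used in this answer
x = pd.DataFrame({"IP": ["192.168.1.1", "192.168.152.200", "192.168.98.93"]})

# Strip the dots (literal match, not regex) and convert to float
x["IP_num"] = x["IP"].str.replace(".", "", regex=False).astype(float)

# The parity of the concatenated number equals the parity of its last digit
x["even_odd"] = np.where(x["IP_num"] % 2 == 0, "Even", "Odd")

# Drop the helper column once the label is computed
x = x.drop(columns="IP_num")
print(x)
```

drop returns a new DataFrame, so the result is assigned back to x here.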
import pandas as pd
adr_df = pd.DataFrame(['192.168.1.1', '192.168.1.2', '192.168.152.200',
'192.168.54.98', '192.168.98.93'], columns=['IP Adress'])
As an example, if you really like using regex to capture the last number, you can use the following (take care to escape the dot character with \):
adr_df['Last Nr'] = adr_df['IP Adress'].str.extract(r'.*\..*\..*\.(.*)').astype(int)
Of course, there could be a more precise regex string to match an IP, but this one worked for me.
You can do the odd check with a small lambda function:
adr_df['Status'] = adr_df['Last Nr'].apply(lambda x: 'Odd' if x%2 else 'Even')
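One possible stricter pattern, as a sketch only: it anchors the match and requires four groups of one to three digits, but it does not check the 0-255 range. The names pattern and last_nr are introduced here for illustration and are not part of the answer.

```python
import pandas as pd

adr_df = pd.DataFrame(['192.168.1.1', '192.168.1.2', '192.168.152.200',
                       '192.168.54.98', '192.168.98.93'], columns=['IP Adress'])

# Anchored pattern: four dot-separated groups of 1-3 digits,
# capturing only the last group
pattern = r'^\d{1,3}\.\d{1,3}\.\d{1,3}\.(\d{1,3})$'

# expand=False returns a Series; a row that does not match would become NaN,
# and astype(int) would then raise, so this assumes every row is well formed
last_nr = adr_df['IP Adress'].str.extract(pattern, expand=False).astype(int)

adr_df['Status'] = last_nr.apply(lambda n: 'Odd' if n % 2 else 'Even')
print(adr_df)
```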
No for or apply loop is needed here: extract the value after the last ., convert it to integers, take % 2, and finally pass the result to numpy.where:
df['new'] = np.where(df['IP Address'].str.split('.').str[-1].astype(int) % 2,'Odd','Even')
print (df)
IP Address new
0 192.168.1.1 Odd
1 192.168.1.2 Even
2 192.168.152.200 Even
3 192.168.54.98 Even
4 192.168.98.93 Odd
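To make the chained expression easier to follow, here is the same idea unpacked step by step. The intermediate names (parts, last, last_int, is_odd) are introduced only for this sketch.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"IP Address": ["192.168.1.1", "192.168.1.2",
                                  "192.168.152.200", "192.168.54.98",
                                  "192.168.98.93"]})

parts = df["IP Address"].str.split(".")  # Series of lists of strings
last = parts.str[-1]                      # last element of each list, still a string
last_int = last.astype(int)               # integer Series
is_odd = (last_int % 2).astype(bool)      # True where the last number is odd

# np.where picks "Odd" where the condition is True, "Even" otherwise
df["new"] = np.where(is_odd, "Odd", "Even")
print(df)
```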
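# Take the last character of each address; the parity of the last digit
# equals the parity of the last number
df['Status'] = [int(str(x).strip()[-1]) for x in df['IP Address']]
df['Status'] = np.where(df['Status'] % 2, 'Odd', 'Even')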
You can extract the last number, then use the methods divmod and map:
df['IP Address'].str.extract(r'(\d+)$', expand=False).astype(int)\
    .divmod(2)[1].map({1: 'odd', 0: 'even'})
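For readers unfamiliar with Series.divmod, a small sketch of what the [1] index selects. The Series of last numbers is typed in by hand here, taken from the sample data in the question.

```python
import pandas as pd

# Last numbers of the sample addresses, entered by hand for illustration
last = pd.Series([1, 2, 200, 98, 93])

# Series.divmod returns a tuple: (quotient Series, remainder Series)
quotient, remainder = last.divmod(2)

# The remainder is 1 for odd values and 0 for even ones
status = remainder.map({1: "odd", 0: "even"})
print(status.tolist())
```

Indexing the tuple with [1] in the one-liner is the same as taking remainder here; last % 2 would give the same Series.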