Replace special characters with "N/A" in Python
I want to change every row that contains only emoji (for example df['Comments'][2]) to N/A.
df['Comments'][:6]
0 nice
1 Insane3
2 ❤️
3 @bertelsen1986
4 20 or 30 mm rise on the Renthal Fatbar?
5 Luckily I have one to
The following code does not return the output I expect:
df['Comments'].replace(';', ':', '!', '*', np.NaN)
Expected output:
df['Comments'][:6]
0 nice
1 Insane3
2 nan
3 @bertelsen1986
4 20 or 30 mm rise on the Renthal Fatbar?
5 Luckily I have one to
You can detect rows that contain only emoji by iterating over the Unicode characters in each row (using the emoji and unicodedata packages):
import unicodedata
import numpy as np
# UNICODE_EMOJI is only available in older releases of the emoji package
from emoji import UNICODE_EMOJI

df = {}
df['Comments'] = ["Test", "Hello ", ""]

for i in range(len(df['Comments'])):
    pure_emoji = True
    # Check every character of the NFC-normalized comment
    for unicode_char in unicodedata.normalize('NFC', df['Comments'][i]):
        if unicode_char not in UNICODE_EMOJI:
            pure_emoji = False
            break
    if pure_emoji:
        df['Comments'][i] = np.nan
print(df['Comments'])
Reference: the remove_emoji function.
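The per-character check in the answer above relies on a lookup table from the emoji package. As a rough alternative that needs only the standard library, the sketch below classifies characters by Unicode category instead. The helper name is_emoji_only and the JOINERS set are hypothetical, and the category test is only a heuristic: it also accepts non-emoji symbols such as © or ^, so treat it as an approximation rather than a real emoji detector.

```python
import unicodedata

# Characters that glue emoji sequences together and carry no text of their own.
JOINERS = {"\u200d", "\ufe0f"}  # zero-width joiner, variation selector-16


def is_emoji_only(text):
    """Heuristic (hypothetical helper): True if text holds only symbol characters."""
    chars = [c for c in text if c not in JOINERS and not c.isspace()]
    if not chars:
        return False  # empty or whitespace-only comments are left alone
    # "So" = other symbol (most emoji), "Sk" = modifier symbol (skin tones)
    return all(unicodedata.category(c) in ("So", "Sk") for c in chars)


# The third item is the red heart from the question, written with escapes.
comments = ["nice", "Insane3", "\u2764\ufe0f", "@bertelsen1986"]
cleaned = [None if is_emoji_only(c) else c for c in comments]
print(cleaned)
```

With a pandas Series, the same helper could be passed to apply in the same way as the lambda in the next answer.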
Try this. First install the emoji lib:
pip install emoji
import re
import emoji
import numpy as np

# Keep a comment unless it is empty once every :emoji_name: token is stripped
df.Comments.apply(lambda x: x if (re.sub(r'(:[!_\-\w]+:)', '', emoji.demojize(x)) != "") else np.nan)
0 nice
1 Insane3
2 NaN
3 @bertelsen1986
4 Luckily I have one to
Name: a, dtype: object
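To see what the regular expression in this answer does on its own, the sketch below applies it to a few hand-written strings. These strings are assumed stand-ins for demojized text, not output captured from the emoji package, and the helper name is_only_emoji_names is hypothetical.

```python
import re

EMOJI_NAME = re.compile(r'(:[!_\-\w]+:)')  # same pattern as in the answer above


def is_only_emoji_names(demojized):
    """True if nothing remains after removing every :name: token."""
    return EMOJI_NAME.sub('', demojized) == ""


# Hand-written stand-ins for demojized text (assumed, not produced by the emoji package)
samples = ["nice", ":red_heart:", "Hello :grinning_face:", ":smile:"]
for s in samples:
    print(repr(s), is_only_emoji_names(s))
```

Two side effects follow from the pattern: a comment in which the user literally typed something like :smile: is treated as emoji-only, and an empty comment also ends up as NaN because nothing remains after the substitution.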