How to convert R code syntax into Python syntax using a Pandas data frame?

Suppose we have the following code in R. What is the equivalent code in Python using Pandas data frame syntax/methods?

network_tickets <- contains(comcast_data$CustomerComplaint, match = 'network', ignore.case = T)
internet_tickets <- contains(comcast_data$CustomerComplaint, match = 'internet', ignore.case = T)
billing_tickets <- contains(comcast_data$CustomerComplaint, match = 'bill', ignore.case = T)
email_tickets <- contains(comcast_data$CustomerComplaint, match = 'email', ignore.case = T)
charges_ticket <- contains(comcast_data$CustomerComplaint, match = 'charge', ignore.case = T)
    
comcast_data$ComplaintType[internet_tickets] <- "Internet"
comcast_data$ComplaintType[network_tickets] <- "Network"
comcast_data$ComplaintType[billing_tickets] <- "Billing"
comcast_data$ComplaintType[email_tickets] <- "Email"
comcast_data$ComplaintType[charges_ticket] <- "Charges"
    
comcast_data$ComplaintType[-c(internet_tickets, network_tickets, billing_tickets,
                              charges_ticket, email_tickets)] <- "Others"

I can convert the first set of operations to Python like this:

network_tickets = df.ComplaintDescription.str.contains('network', regex=True, case=False)

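However, I am finding it a challenge to use the variable network_tickets to assign the value "Internet" to a new pandas data frame column (i.e. ComplaintType). In R, it seems you can do this in a single line.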

I am not sure how to do this in one line of code in Python. I tried the following, but each attempt raised an error:

a) df['ComplaintType'].apply(internet_tickets) = "Internet"
b) df['ComplaintType'] = df.apply(internet_tickets)
c) df['ComplaintType'] = internet_tickets.apply("Internet")

I suppose we could first create a new column in the data frame:

df['ComplaintType'] = internet_tickets

But I am not sure what the next steps would be.

Use Series.str.contains with DataFrame.loc to set the values from a list:

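import pandas as pd

df = pd.DataFrame(data={"ComplaintDescription": ["BiLLing is super", "email", "new"]})

# Each label doubles as its own search term; later matches overwrite earlier ones
L = ["Internet", "Network", "Billing", "Email", "Charges"]
for val in L:
    df.loc[df['ComplaintDescription'].str.contains(val, case=False), 'ComplaintType'] = val

# Rows that matched none of the terms are still NaN at this point
df['ComplaintType'] = df['ComplaintType'].fillna('Others')
print(df)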
  ComplaintDescription ComplaintType
0     BiLLing is super       Billing
1                email         Email
2                  new        Others
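
Note that the loop above searches for the label itself ('Billing', 'Charges'), whereas the R code searches for shorter keywords ('bill', 'charge'). If you want the R keywords, one possible sketch is to keep a keyword-to-label mapping. The name keyword_to_label and the sample rows below are illustrative assumptions, not part of the question's data:

```python
import pandas as pd

# Illustrative sample rows (assumed); the real data would be the Comcast complaints
df = pd.DataFrame(
    {"ComplaintDescription": ["My BILL is wrong", "Slow internet", "Rude staff"]}
)

# Hypothetical mapping: search keyword from the R code -> label to assign,
# listed in the same order as the R assignments
keyword_to_label = {
    "internet": "Internet",
    "network": "Network",
    "bill": "Billing",
    "email": "Email",
    "charge": "Charges",
}

# Start with the fallback label, then overwrite the rows that match a keyword
df["ComplaintType"] = "Others"
for keyword, label in keyword_to_label.items():
    # regex=False treats the keyword as a plain substring; na=False skips missing text
    mask = df["ComplaintDescription"].str.contains(
        keyword, case=False, regex=False, na=False
    )
    df.loc[mask, "ComplaintType"] = label

print(df)
```

As in the R code, a row that matches several keywords ends up with the label assigned last, so the order of the mapping matters.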

EDIT:

If you need to set the values separately:

df.loc[df['ComplaintDescription'].str.contains('network', case=False), 'ComplaintType'] = "Internet"
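
To stay closer to the shape of the R code, the masks can also be stored in named variables first and reused. This is only a sketch with assumed sample rows and just two of the five categories. The final line is meant to play the role of the R negative index, on the assumption that "Others" should cover the rows matched by none of the masks:

```python
import pandas as pd

# Illustrative sample rows (assumed, not from the question)
df = pd.DataFrame(
    {"ComplaintDescription": ["Network down", "Internet slow", "Late refund"]}
)
col = df["ComplaintDescription"]

# Boolean masks, one per category, named after the R variables
internet_tickets = col.str.contains("internet", case=False, na=False)
network_tickets = col.str.contains("network", case=False, na=False)

# One line per category, as in the R code
df.loc[internet_tickets, "ComplaintType"] = "Internet"
df.loc[network_tickets, "ComplaintType"] = "Network"

# Rows matched by none of the masks: combine with | and negate with ~
df.loc[~(internet_tickets | network_tickets), "ComplaintType"] = "Others"

print(df)
```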