Unstructured data: finding a column count
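I have unstructured data in my performance logs, and I want to extract the service details from it. I can split on the delimiter, but I can't count or print the column because the file has no header.
Please help me figure out this issue.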
import pandas as pd
df = pd.read_csv(r'/Users/Myhome/Documents/Py_Learning/log.csv', sep='|', skipinitialspace=True)
# df = pd.read_csv(r'/Users/Myhome/Documents/Py_Learning/log.csv', sep=':|,|[|]', engine='python', header=None)  ---> Multi separator is giving error.
# df.groupby("CLIENT")
SERVICE = df.columns[4]
print(SERVICE)
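How can I find the unique service names across all rows and get a count for each? I want to plot the result as a chart for last week's data.
Sample data: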
2019-10-22 15:35|Where:CARD|SERVICE:Dell|VERSION:1.0|CLIENT:HDD|OPERATION:boverdue|RESPONSETIME:0034|STATUS:100000:ERR_TRANSACTION_TIMED_OUT|SEVERITY:ERROR|STATUSCODE:SOAP-FAULT|STATUSMESSAGE:NA
2019-10-22 15:35|Where:Digital|SERVICE:Laptop|VERSION:1.0|CLIENT:mouse|OPERATION:connet|RESPONSETIME:3456|STATUS:NO_RECORDS_MATCH_SELECTION_CRITERIA|SEVERITY:INFO|STATUSCODE:1120|STATUSMESSAGE:NA
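I don't know exactly what your dataset looks like, but you can get the unique values and their counts with value_counts (see the pandas reference). This assumes the DataFrame already has a column named SERVICE.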
df_unique = (df['SERVICE'].value_counts()
             .rename_axis('service')
             .reset_index(name='COUNT'))
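Since the log has no header row, reading it with sep='|' alone leaves cells such as "SERVICE:Dell" rather than a column called SERVICE. Here is a minimal sketch of one way to get there, assuming every line looks like the sample data: a timestamp first, then KEY:value fields separated by '|'. The names parse_line, load_log, last_n_days and the TIMESTAMP column are made up for this illustration; they are not part of pandas or of the original post.

```python
import io
import pandas as pd

# Two records copied from the sample data in the question.
SAMPLE = """\
2019-10-22 15:35|Where:CARD|SERVICE:Dell|VERSION:1.0|CLIENT:HDD|OPERATION:boverdue|RESPONSETIME:0034|STATUS:100000:ERR_TRANSACTION_TIMED_OUT|SEVERITY:ERROR|STATUSCODE:SOAP-FAULT|STATUSMESSAGE:NA
2019-10-22 15:35|Where:Digital|SERVICE:Laptop|VERSION:1.0|CLIENT:mouse|OPERATION:connet|RESPONSETIME:3456|STATUS:NO_RECORDS_MATCH_SELECTION_CRITERIA|SEVERITY:INFO|STATUSCODE:1120|STATUSMESSAGE:NA
"""


def parse_line(line):
    """Turn one 'timestamp|KEY:value|KEY:value...' line into a dict (hypothetical helper)."""
    timestamp, *fields = line.strip().split("|")
    record = {"TIMESTAMP": timestamp}  # column name chosen for this sketch
    for field in fields:
        # Split on the first ':' only, so a value such as
        # '100000:ERR_TRANSACTION_TIMED_OUT' stays in one piece.
        key, _, value = field.partition(":")
        record[key.strip()] = value.strip()
    return record


def load_log(lines):
    """Build a DataFrame with named columns from an iterable of log lines."""
    records = [parse_line(line) for line in lines if line.strip()]
    df = pd.DataFrame(records)
    df["TIMESTAMP"] = pd.to_datetime(df["TIMESTAMP"], format="%Y-%m-%d %H:%M")
    return df


def last_n_days(df, days=7, now=None):
    """Keep only rows whose timestamp falls within `days` days before `now`."""
    now = pd.Timestamp.now() if now is None else pd.Timestamp(now)
    return df[df["TIMESTAMP"] >= now - pd.Timedelta(days=days)]


df = load_log(io.StringIO(SAMPLE))

# Same counting step as in the answer, now that a SERVICE column exists.
service_counts = (df["SERVICE"].value_counts()
                  .rename_axis("service")
                  .reset_index(name="COUNT"))
print(service_counts)
```

For a real file you would pass an open file handle instead of the sample string, e.g. `with open(path) as fh: df = load_log(fh)`, and apply `last_n_days(df)` before counting to restrict the result to the past week. If matplotlib is installed, `service_counts.plot.bar(x="service", y="COUNT")` should give a simple bar chart of the counts.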