How can you convert a large string into a data frame?
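Good morning, I have a question about pandas and Python. I'm new to Python and I'm using the Paramiko library, through which I get a response shaped like a table. I'd like to load it into a data frame so I can more easily get the values of the columns that matter most to me: "Interface", "InUti" and "OutUti". I've been reading about pandas, but I don't know how to use it to build a data frame from the string I obtained. Below is the string I get from the response, which I need to turn into a data frame. As noted above, only the "Interface", "InUti" and "OutUti" columns interest me.
Note the leading spaces before names such as GigabitEthernet7/1/0(10G) and GigabitEthernet16/1/0(10G): they come that way by default, so they have to be taken into account.
I've been reading about pandas but haven't found anything that quite fits my needs. If you know of any useful documentation, I'd appreciate it.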
print(data)
Interface PHY Protocol InUti OutUti inErrors outErrors
40GE7/0/0 up up 6.97% 14.85% 0 0
40GE16/0/0 up up 25.69% 0.75% 0 0
Eth-Trunk1 up up 18.55% 10.07% 506 0
GigabitEthernet7/1/0(10G) up up 17.61% 10.16% 222 0
GigabitEthernet16/1/0(10G) up up 19.49% 9.97% 284 0
Eth-Trunk8 up up 39.19% 46.10% 0 0
GigabitEthernet7/1/9(10G) up up 39.80% 46.09% 0 0
GigabitEthernet16/1/9(10G) up up 38.58% 46.11% 0 0
GigabitEthernet0/0/0 up up 0.01% 0.01% 0 0
GigabitEthernet1/2/0 up up 0.04% 0.01% 0 0
GigabitEthernet1/2/3 up up 0.01% 9.67% 0 0
GigabitEthernet1/2/6 up up 0.01% 0.01% 0 0
GigabitEthernet1/2/7 up up 13.94% 26.52% 0 0
GigabitEthernet1/2/8 up up 0.23% 0.01% 0 0
GigabitEthernet1/2/11 up up 0.39% 5.34% 0 0
GigabitEthernet1/2/12 up up 1.10% 4.09% 0 0
GigabitEthernet1/2/13 up up 21.65% 7.33% 0 0
GigabitEthernet1/2/15 up up 0.23% 4.76% 0 0
GigabitEthernet1/2/16 up up 5.10% 13.55% 0 0
GigabitEthernet7/1/6(10G) up up 4.23% 5.71% 0 0
GigabitEthernet7/1/7(10G) up up 4.48% 13.07% 0 0
GigabitEthernet7/1/11(10G) up up 0.92% 4.56% 0 0
GigabitEthernet7/1/12(10G) up up 4.53% 16.12% 0 0
GigabitEthernet7/1/13(10G) up up 6.43% 17.46% 0 0
GigabitEthernet16/1/7(10G) up up 2.85% 8.15% 0 0
GigabitEthernet16/1/11(10G) up up 6.75% 19.73% 0 0
GigabitEthernet16/1/12(10G) up up 0.01% 12.43% 0 0
LoopBack150 up up(s) 0% 0% 0 0
LoopBack160 up up(s) 0% 0% 0 0
LoopBack170 up up(s) 0% 0% 0 0
LoopBack199 up up(s) 0% 0% 0 0
LoopBack200 up up(s) 0% 0% 0 0
NULL0 up up(s) 0% 0% 0 0
[Update]: In addition to the solution proposed by fellow user @crayxt, I also implemented a solution that uses pandas.read_fwf. The code (for my problem) is as follows:
import io
import pandas as pd

df = pd.read_fwf(io.StringIO(data), widths=[27, 4, 11, 8, 9, 8, 11, 10])
print(df)
print(df.loc[0])
Interface 40GE7/0/0
PHY up
Protocol up
InUti 7.60%
OutUti 14.95%
inErrors 0
outErrors 0
Unnamed: 7 NaN
Name: 0, dtype: object
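The widths list above has eight entries while the header has seven names, which would explain the extra `Unnamed: 7` column in the output. A minimal sketch of one way to drop that empty column and keep only the three columns of interest follows. The sample rows are a hypothetical excerpt, padded here to the same widths because the exact spacing of the real device output is an assumption.

```python
import io
import pandas as pd

# Hypothetical excerpt of the device output; the real spacing is assumed
# to match the widths used in the question (27, 4, 11, 8, 9, 8, 11).
rows = [
    ("Interface", "PHY", "Protocol", "InUti", "OutUti", "inErrors", "outErrors"),
    ("40GE7/0/0", "up", "up", "6.97%", "14.85%", "0", "0"),
    (" GigabitEthernet7/1/0(10G)", "up", "up", "17.61%", "10.16%", "222", "0"),
    ("LoopBack150", "up", "up(s)", "0%", "0%", "0", "0"),
]
widths = [27, 4, 11, 8, 9, 8, 11]
data = "\n".join(
    "".join(f"{cell:<{w}}" for cell, w in zip(row, widths)) for row in rows
)

# Eight widths against seven headers: the eighth slice is empty on every line.
df = pd.read_fwf(io.StringIO(data), widths=widths + [10])

# Drop any column that holds nothing but NaN, then keep the three of interest.
df = df.dropna(axis=1, how="all")
subset = df[["Interface", "InUti", "OutUti"]]
print(subset)
```

Passing only seven widths would avoid the extra column in the first place; `dropna(axis=1, how="all")` is simply a way to clean up after the fact without touching the widths.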
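You can do this with the io library. It works well here because none of the columns contain internal spaces, and the leading whitespace at the start of some lines is swallowed by the parser: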
import io
import pandas as pd

data = your_string_here  # placeholder: the table text obtained via Paramiko
df = pd.read_csv(io.StringIO(data), sep=r'\s+')
>>> df.head()
Interface PHY Protocol InUti OutUti inErrors outErrors
0 40GE7/0/0 up up 6.97% 14.85% 0 0
1 40GE16/0/0 up up 25.69% 0.75% 0 0
2 Eth-Trunk1 up up 18.55% 10.07% 506 0
3 GigabitEthernet7/1/0(10G) up up 17.61% 10.16% 222 0
4 GigabitEthernet16/1/0(10G) up up 19.49% 9.97% 284 0
>>> df.shape
(33, 7)
The Interface column:
>>> df.Interface.values
array(['40GE7/0/0', '40GE16/0/0', 'Eth-Trunk1',
'GigabitEthernet7/1/0(10G)', 'GigabitEthernet16/1/0(10G)',
'Eth-Trunk8', 'GigabitEthernet7/1/9(10G)',
'GigabitEthernet16/1/9(10G)', 'GigabitEthernet0/0/0',
'GigabitEthernet1/2/0', 'GigabitEthernet1/2/3',
'GigabitEthernet1/2/6', 'GigabitEthernet1/2/7',
'GigabitEthernet1/2/8', 'GigabitEthernet1/2/11',
'GigabitEthernet1/2/12', 'GigabitEthernet1/2/13',
'GigabitEthernet1/2/15', 'GigabitEthernet1/2/16',
'GigabitEthernet7/1/6(10G)', 'GigabitEthernet7/1/7(10G)',
'GigabitEthernet7/1/11(10G)', 'GigabitEthernet7/1/12(10G)',
'GigabitEthernet7/1/13(10G)', 'GigabitEthernet16/1/7(10G)',
'GigabitEthernet16/1/11(10G)', 'GigabitEthernet16/1/12(10G)',
'LoopBack150', 'LoopBack160', 'LoopBack170', 'LoopBack199',
'LoopBack200', 'NULL0'], dtype=object)
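Since the question only cares about "Interface", "InUti" and "OutUti", the same `read_csv` call can be narrowed with `usecols`, and the percentage strings can be converted to numbers so they can be compared or sorted. This is a rough sketch on a hypothetical three-row excerpt of the output, not necessarily how the original poster finished the task.

```python
import io
import pandas as pd

# Hypothetical excerpt of the table; one row keeps its leading space.
data = """Interface PHY Protocol InUti OutUti inErrors outErrors
40GE7/0/0 up up 6.97% 14.85% 0 0
 GigabitEthernet7/1/0(10G) up up 17.61% 10.16% 222 0
LoopBack150 up up(s) 0% 0% 0 0
"""

# Read only the three columns of interest.
cols = ["Interface", "InUti", "OutUti"]
df = pd.read_csv(io.StringIO(data), sep=r"\s+", usecols=cols)

# Strip the trailing "%" and convert the utilisation columns to floats.
for col in ["InUti", "OutUti"]:
    df[col] = df[col].str.rstrip("%").astype(float)

# Example query: interfaces with more than 10% inbound utilisation.
busy = df[df["InUti"] > 10]
print(df)
print(busy)
```

With numeric columns, filters such as `df["InUti"] > 10` or `df.sort_values("OutUti")` behave as expected, which they would not on the raw strings.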