How can you convert a large string into a data frame?

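Good morning. I have a question about pandas and Python. I'm new to Python and I'm using the Paramiko library, through which I get a response formatted like a table. I would like to load it into a DataFrame so that I can easily get the values of the columns that are most useful to me: "Interface", "InUti" and "OutUti". I have been reading about pandas, but I don't know how to use it to build a DataFrame from the string I obtained. Below is the string I get from the response, which I need to turn into a DataFrame. As mentioned, only the "Interface", "InUti" and "OutUti" columns interest me.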

The leading spaces in front of names such as GigabitEthernet7/1/0(10G), GigabitEthernet16/1/0(10G), etc. are there by default in the output, so they have to be taken into account.

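I have been reading the pandas documentation but haven't found anything that really fits my needs. If you know of any useful documentation, I would appreciate it.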

print(data)

Interface                   PHY   Protocol  InUti OutUti   inErrors  outErrors
40GE7/0/0                   up    up        6.97% 14.85%          0          0
40GE16/0/0                  up    up       25.69%  0.75%          0          0
Eth-Trunk1                  up    up       18.55% 10.07%        506          0
  GigabitEthernet7/1/0(10G) up    up       17.61% 10.16%        222          0
  GigabitEthernet16/1/0(10G) up    up       19.49%  9.97%        284          0
Eth-Trunk8                  up    up       39.19% 46.10%          0          0
  GigabitEthernet7/1/9(10G) up    up       39.80% 46.09%          0          0
  GigabitEthernet16/1/9(10G) up    up       38.58% 46.11%          0          0
GigabitEthernet0/0/0        up    up        0.01%  0.01%          0          0
GigabitEthernet1/2/0        up    up        0.04%  0.01%          0          0
GigabitEthernet1/2/3        up    up        0.01%  9.67%          0          0
GigabitEthernet1/2/6        up    up        0.01%  0.01%          0          0
GigabitEthernet1/2/7        up    up       13.94% 26.52%          0          0
GigabitEthernet1/2/8        up    up        0.23%  0.01%          0          0
GigabitEthernet1/2/11       up    up        0.39%  5.34%          0          0
GigabitEthernet1/2/12       up    up        1.10%  4.09%          0          0
GigabitEthernet1/2/13       up    up       21.65%  7.33%          0          0
GigabitEthernet1/2/15       up    up        0.23%  4.76%          0          0
GigabitEthernet1/2/16       up    up        5.10% 13.55%          0          0
GigabitEthernet7/1/6(10G)   up    up        4.23%  5.71%          0          0
GigabitEthernet7/1/7(10G)   up    up        4.48% 13.07%          0          0
GigabitEthernet7/1/11(10G)  up    up        0.92%  4.56%          0          0
GigabitEthernet7/1/12(10G)  up    up        4.53% 16.12%          0          0
GigabitEthernet7/1/13(10G)  up    up        6.43% 17.46%          0          0
GigabitEthernet16/1/7(10G)  up    up        2.85%  8.15%          0          0
GigabitEthernet16/1/11(10G) up    up        6.75% 19.73%          0          0
GigabitEthernet16/1/12(10G) up    up        0.01% 12.43%          0          0
LoopBack150                 up    up(s)        0%     0%          0          0
LoopBack160                 up    up(s)        0%     0%          0          0
LoopBack170                 up    up(s)        0%     0%          0          0
LoopBack199                 up    up(s)        0%     0%          0          0
LoopBack200                 up    up(s)        0%     0%          0          0
NULL0                       up    up(s)        0%     0%          0          0

[Update]: In addition to the solution proposed by @crayxt, I also implemented a solution that uses pandas.read_fwf. The code (for my problem) is as follows:

import io
import pandas as pd

df = pd.read_fwf(io.StringIO(data), widths=[27, 4, 11, 8, 9, 8, 11, 10])
print(df)
print(df.loc[0])

   Interface     40GE7/0/0
   PHY                  up
   Protocol             up
   InUti             7.60%
   OutUti           14.95%
   inErrors              0
   outErrors             0
   Unnamed: 7          NaN
   Name: 0, dtype: object
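
The "Unnamed: 7" entry in the output above appears because eight widths were passed for a table that has seven columns. A minimal sketch of one way to drop such an empty column is shown here. The shortened sample and its widths are made up for illustration and are not the widths of the real device output:

```python
import io
import pandas as pd

# Hypothetical, shortened sample with exactly aligned columns.
sample = (
    "Interface   PHY InUti  OutUti\n"
    "40GE7/0/0   up   6.97% 14.85%\n"
    "Eth-Trunk1  up  18.55% 10.07%\n"
)

# Five widths for four real columns: the fifth slice is always empty,
# so pandas creates an extra all-NaN "Unnamed" column.
df = pd.read_fwf(io.StringIO(sample), widths=[12, 4, 7, 7, 5])

# Drop every column that contains only missing values.
df = df.dropna(axis=1, how="all")
print(df)
```

Passing exactly one width per real column avoids the extra column in the first place; the dropna call is only a safety net.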

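You can do this by wrapping the string in io.StringIO and passing it to pandas.read_csv. This works well here because no column contains internal spaces, but note that the leading spaces at the start of some lines are swallowed by the parser: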

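import io
import pandas as pd

data = your_string_here  # the text received from the device

# Split on runs of whitespace; the raw string avoids an invalid-escape warning
df = pd.read_csv(io.StringIO(data), sep=r'\s+')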
>>> df.head()
                    Interface PHY Protocol   InUti  OutUti  inErrors  outErrors
0                   40GE7/0/0  up       up   6.97%  14.85%         0          0
1                  40GE16/0/0  up       up  25.69%   0.75%         0          0
2                  Eth-Trunk1  up       up  18.55%  10.07%       506          0
3   GigabitEthernet7/1/0(10G)  up       up  17.61%  10.16%       222          0
4  GigabitEthernet16/1/0(10G)  up       up  19.49%   9.97%       284          0
>>> df.shape
(33, 7)
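
If the leading spaces matter (the question says they must be taken into account), one possible sketch is to split the lines by hand and record the indentation before it is discarded. The column name "indented" is a hypothetical name chosen for this example, and the sample is a shortened copy of the data in the question:

```python
import pandas as pd

sample = """\
Interface                   PHY   Protocol  InUti OutUti   inErrors  outErrors
40GE7/0/0                   up    up        6.97% 14.85%          0          0
Eth-Trunk1                  up    up       18.55% 10.07%        506          0
  GigabitEthernet7/1/0(10G) up    up       17.61% 10.16%        222          0
LoopBack150                 up    up(s)        0%     0%          0          0
"""

# Skip blank lines, then take the first line as the header.
lines = [ln for ln in sample.splitlines() if ln.strip()]
header = lines[0].split()

rows = []
for line in lines[1:]:
    # Record whether the line starts with whitespace before split() drops it.
    rows.append(line.split() + [line[0].isspace()])

df = pd.DataFrame(rows, columns=header + ["indented"])
print(df[["Interface", "indented"]])
```

With this approach every original column holds strings, so numeric columns such as inErrors would still need converting. Whether an indented row means "member port of the trunk listed above it" is an assumption about the device output, not something pandas can tell you.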

The Interface column:

>>> df.Interface.values
array(['40GE7/0/0', '40GE16/0/0', 'Eth-Trunk1',
       'GigabitEthernet7/1/0(10G)', 'GigabitEthernet16/1/0(10G)',
       'Eth-Trunk8', 'GigabitEthernet7/1/9(10G)',
       'GigabitEthernet16/1/9(10G)', 'GigabitEthernet0/0/0',
       'GigabitEthernet1/2/0', 'GigabitEthernet1/2/3',
       'GigabitEthernet1/2/6', 'GigabitEthernet1/2/7',
       'GigabitEthernet1/2/8', 'GigabitEthernet1/2/11',
       'GigabitEthernet1/2/12', 'GigabitEthernet1/2/13',
       'GigabitEthernet1/2/15', 'GigabitEthernet1/2/16',
       'GigabitEthernet7/1/6(10G)', 'GigabitEthernet7/1/7(10G)',
       'GigabitEthernet7/1/11(10G)', 'GigabitEthernet7/1/12(10G)',
       'GigabitEthernet7/1/13(10G)', 'GigabitEthernet16/1/7(10G)',
       'GigabitEthernet16/1/11(10G)', 'GigabitEthernet16/1/12(10G)',
       'LoopBack150', 'LoopBack160', 'LoopBack170', 'LoopBack199',
       'LoopBack200', 'NULL0'], dtype=object)
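
Since only "Interface", "InUti" and "OutUti" are of interest, a possible next step is to keep just those columns and turn the percentage strings into numbers. This is a sketch on a shortened copy of the data from the question; the variable name util is made up for the example:

```python
import io
import pandas as pd

sample = """\
Interface                   PHY   Protocol  InUti OutUti   inErrors  outErrors
40GE7/0/0                   up    up        6.97% 14.85%          0          0
Eth-Trunk1                  up    up       18.55% 10.07%        506          0
  GigabitEthernet7/1/0(10G) up    up       17.61% 10.16%        222          0
LoopBack150                 up    up(s)        0%     0%          0          0
"""

df = pd.read_csv(io.StringIO(sample), sep=r"\s+")

# Keep only the three columns of interest (copy to avoid modifying df).
util = df[["Interface", "InUti", "OutUti"]].copy()

# Strip the trailing "%" and convert the remaining text to floats.
for col in ("InUti", "OutUti"):
    util[col] = util[col].str.rstrip("%").astype(float)

print(util)
```

Once the values are floats, ordinary filtering works, for example util[util["InUti"] > 10] to list the interfaces with more than 10% inbound utilization.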