String splitting like sscanf in Python

Question

I have a file containing lines like

12  45 some text
56 78      #another type of text
22     34 after column 2 are other data

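I need to split each line, storing the first two elements in two variables and the text after the second column in a third variable. In C, this can be done with sscanf() as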

sscanf(line,"%d %d %s",&a,&b,textArray);

I know about the scanf Python module, but apparently it is not standard and is not included in Debian.

How can I do this using standard Python tools?

Answer 1

Assuming the first elements of your string are numbers, I would suggest something like

def split(line):
    # Split on any run of whitespace
    tokens = line.split()
    # Keep only the tokens made up entirely of digits
    numbers = [t for t in tokens if t.isdigit()]
    # Everything after the first two columns, rejoined with single spaces
    rest = ' '.join(tokens[2:])

    a = numbers[0]
    b = numbers[1]
    return a, b, rest

# ex:

print(split('22     34 after column 2 are other data'))

# output >> ('22', '34', 'after column 2 are other data')

Answer 2

split is all you need.

 line.split(None, 2)
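
As a minimal sketch of how this call applies to the sample lines from the question: the helper name parse_line is hypothetical, and converting the first two fields with int() is an assumption about what the asker wants.

```python
def parse_line(line):
    """Split a line into two integers and the remaining text."""
    # maxsplit=2 yields at most three parts; the third keeps its inner spacing
    first, second, rest = line.split(None, 2)
    return int(first), int(second), rest


print(parse_line("12  45 some text"))
print(parse_line("56 78      #another type of text"))
```

Lines read from a file still carry their trailing newline, which ends up in the last element, so calling line.rstrip() before splitting is probably what you want.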

The documentation for split (emphasis added):

string.split(s[, sep[, maxsplit]])

Return a list of the words of the string s. If the optional second argument sep is absent or None, the words are separated by arbitrary strings of whitespace characters (space, tab, newline, return, formfeed). If the second argument sep is present and not None, it specifies a string to be used as the word separator. The returned list will then have one more item than the number of non-overlapping occurrences of the separator in the string. If maxsplit is given, at most maxsplit number of splits occur, and the remainder of the string is returned as the final element of the list (thus, the list will have at most maxsplit+1 elements). If maxsplit is not specified or -1, then there is no limit on the number of splits (all possible splits are made).

The behavior of split on an empty string depends on the value of sep. If sep is not specified, or specified as None, the result will be an empty list. If sep is specified as any string, the result will be a list containing one element which is an empty string.
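
The behavior quoted above matters for short or empty lines, where unpacking the result into three names would fail. A small sketch, assuming that missing text after column 2 should become an empty string (the name split_fields is hypothetical):

```python
def split_fields(line):
    """Return (a, b, rest) as strings; rest is '' when the line has only two fields."""
    parts = line.split(None, 2)
    if len(parts) < 2:
        raise ValueError("expected at least two fields: %r" % line)
    # Pad with an empty string when there is no text after column 2
    a, b, rest = (parts + [""])[:3]
    return a, b, rest


print("".split())       # no separator given: empty list
print("".split(","))    # explicit separator: list with one empty string
print(split_fields("22 34"))
```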
