String splitting like sscanf in Python
I have a file containing lines like these:
12 45 some text
56 78 #another type of text
22 34 after column 2 are other data
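I need to split every line, storing the first two elements in two variables and the text after the second column in a single variable. In C, this could be done with sscanf():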
sscanf(line,"%d %d %s",&a,&b,textArray);
I know about the scanf Python module, but apparently it is not standard and is not included in Debian.
How can I do this with standard Python tools?
Assuming the first two elements of your string are numbers, I would suggest something like this:
def split(line):
    # Break the line into whitespace-separated tokens
    list0 = line.split()
    # Collect the numeric tokens; the first two are the columns we want
    list1 = [y for y in list0 if y.isdigit()]
    # Everything after the second column is the free text
    rest = ' '.join(list0[2:])
    a = list1[0]
    b = list1[1]
    return a, b, rest

# ex:
print(split('22 34 after column 2 are other data'))
# output >> ('22', '34', 'after column 2 are other data')
split is all you need:
line.split(None, 2)
The documentation for split, emphasis added:
string.split(s[, sep[, maxsplit]])
Return a list of the words of the string s. If the optional second argument sep is absent or None, the words are separated by arbitrary strings of whitespace characters (space, tab, newline, return, formfeed). If the second argument sep is present and not None, it specifies a string to be used as the word separator. The returned list will then have one more item than the number of non-overlapping occurrences of the separator in the string. If maxsplit is given, at most maxsplit number of splits occur, and the remainder of the string is returned as the final element of the list (thus, the list will have at most maxsplit+1 elements). If maxsplit is not specified or -1, then there is no limit on the number of splits (all possible splits are made).
The behavior of split on an empty string depends on the value of sep. If sep is not specified, or specified as None, the result will be an empty list. If sep is specified as any string, the result will be a list containing one element which is an empty string.
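To make the maxsplit behaviour concrete, here is a minimal sketch of how line.split(None, 2) could be wrapped to mimic the sscanf call from the question. The helper name parse_line, the int() conversion, and the handling of short lines are my own assumptions, not something the answer above specifies. Note that with sep=None the final element keeps any trailing whitespace (such as the newline read from a file), so the sketch strips it.

```python
def parse_line(line):
    """Split a line into two integers and the remaining text.

    parse_line is a hypothetical helper name used for illustration only.
    """
    # At most 2 splits, so the list has at most 3 items
    parts = line.split(None, 2)
    if len(parts) < 2:
        # Covers empty lines too: ''.split(None, 2) returns []
        raise ValueError("expected at least two columns: %r" % line)
    a, b = int(parts[0]), int(parts[1])
    # The text column may be missing; strip the trailing newline if present
    text = parts[2].rstrip() if len(parts) == 3 else ""
    return a, b, text


# Sample lines from the question, with the newline a file would give you
lines = [
    "12 45 some text\n",
    "56 78 #another type of text\n",
    "22 34 after column 2 are other data\n",
]
for line in lines:
    print(parse_line(line))
```

If you want the numbers as strings, as in the plain split call, just drop the int() conversion.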