Python: split string and get position

Question

I want to split a string into parts and also get the starting position of each part within the original string.

I can do this with the code below:

str_ = '  d     A7    g7'
flag_non_space_string_started = False
positions = []
for i, letter in enumerate(str_):
    if letter != ' ':
        if not flag_non_space_string_started:
            positions.append(i)
            flag_non_space_string_started = True
    else:
        flag_non_space_string_started = False
# this is what I want
print(str_.split())
print(positions)
# prints:
# ['d', 'A7', 'g7']
# [2, 8, 14]

Is there a shorter (more Pythonic) way to get the positions?

Answer 1

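You can use itertools.groupby with enumerate here. The key function, not str.isspace applied to each character, groups the items on whitespace, so k is True for runs of non-whitespace characters and False for runs of whitespace, hence the if k condition. Each group is an iterator, so we call next() on it to get the start index and the first character. To get the rest of the group's items, use a list comprehension and pass it to str.join to build a string. Don't forget to prepend the item we popped earlier to this string: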

from itertools import groupby

str_ = '  d     A7    g7'

for k, g in groupby(enumerate(str_), lambda x: not x[1].isspace()):
    if k:
        pos, first_item = next(g)
        print(pos, first_item + ''.join([x for _, x in g]))

Output:

2 d
8 A7
14 g7

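If the solution above looks complicated, you can also use re.finditer. The match objects it returns have methods such as .start() and .group(), which return the start index of the match and the matched text, respectively.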

import re

str_ = '  d     A7    g7'

for m in re.finditer(r'\S+', str_):
    index, item = m.start(), m.group()
    # now do something with index, item
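
To get the same two lists the question prints, the matches can be collected with list comprehensions. This is only a minimal sketch: the names matches, positions, and words are chosen here for illustration and are not part of the original answer.

```python
import re

str_ = '  d     A7    g7'

# Collect (start index, token) pairs for every run of non-whitespace characters.
matches = [(m.start(), m.group()) for m in re.finditer(r'\S+', str_)]

# Unpack the pairs into the two lists the question asks for.
positions = [pos for pos, _ in matches]
words = [word for _, word in matches]

print(words)      # ['d', 'A7', 'g7']
print(positions)  # [2, 8, 14]
```

Because \S+ matches any run of non-whitespace characters, this treats tabs and newlines as separators too, the same way str.split() with no arguments does.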
