Parse text for matching key then grab first set of matching table name rows

Question

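I'm trying to reconcile some large, old flat text files (a mess, honestly). Once I find my matching key, I want to grab the first set of consecutive rows that have a matching table name and ignore the rest. How would I read the rows I need and skip the others? I've been playing around with break, but the logic is escaping me.
Example: if I'm looking for a PK of 101 and a table name of drink, I want to print the following from the list below:

drink 25
drink 26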

FlatTextFile.txt
pk_tbl 23 100
food 0 0
drink 0 0
dessert 0 0

pk_tbl 101
food 0
drink 25
drink 26
dessert 0
drink 27
drink 28
drink 29

pk_tbl 102
food 0
drink 0
drink 0
drink 0
dessert 0

Pseudocode showing where I'm at:

        pk_flag = 0
        for row in d:  # d: iterable of rows already split into fields
            if row[0] == 'drink' and pk_flag == 1:
                print(row)
            if row[0] == 'pk_tbl' and row[1] == '101':
                pk_flag = 1
            elif row[0] == 'pk_tbl' and row[1] != '101':
                pk_flag = 0

It's a bit of a mess, haha. Any help is appreciated. Thanks!

Answer 1

def get_table_data(file_path='FlatTextFile.txt', table_keyword='pk_tbl', table_num='101', data_keyword='drink'):
    """Return the first consecutive run of data_keyword lines in the matching table."""
    output_ls = []
    with open(file_path, 'r') as fh:
        table = False  # True once the requested table header has been seen
        data = False   # True once the first data_keyword row has been seen
        for line in fh:
            row = line.split()
            if not row:  # Ignoring blank lines
                continue
            if not table:  # Searching for table keyword and number
                if row[0] == table_keyword and len(row) > 1 and row[1] == table_num:
                    table = True
            elif row[0] == table_keyword:  # Already at the next table
                break
            elif row[0] == data_keyword:  # First or subsequent consecutive data row
                data = True
                output_ls.append(line)
            elif data:  # The consecutive run has ended
                break
    return output_ls
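
The function above is effectively a scan through three phases: before the table, inside the table before the first data row, and inside the run of data rows. As a rough sketch of the same idea (not a replacement for the code above), the phases can also be written with itertools.dropwhile and itertools.takewhile. The name first_run and the SAMPLE list are made up for illustration, and the sketch assumes whitespace-separated fields with the key appearing somewhere after the table keyword on the header line.

```python
from itertools import dropwhile, takewhile

def first_run(lines, table_keyword="pk_tbl", table_num="101", data_keyword="drink"):
    """Return the first consecutive run of data_keyword rows in the matching table."""
    # Split every line into fields and drop blank lines.
    rows = (fields for fields in (line.split() for line in lines) if fields)
    # Phase 1: skip everything before the requested table header.
    rows = dropwhile(
        lambda r: not (r[0] == table_keyword and table_num in r[1:]), rows
    )
    next(rows, None)  # consume the header itself (no-op if it was never found)
    # Restrict the scan to this table: stop at the next header.
    body = takewhile(lambda r: r[0] != table_keyword, rows)
    # Phase 2: skip rows before the first data row.
    body = dropwhile(lambda r: r[0] != data_keyword, body)
    # Phase 3: keep rows only while they are still data rows.
    return [" ".join(r) for r in takewhile(lambda r: r[0] == data_keyword, body)]

# Hypothetical in-memory copy of part of the sample file.
SAMPLE = [
    "pk_tbl 23 100", "food 0 0", "drink 0 0", "dessert 0 0", "",
    "pk_tbl 101", "food 0", "drink 25", "drink 26", "dessert 0",
    "drink 27", "drink 28", "drink 29", "",
    "pk_tbl 102", "food 0", "drink 0", "dessert 0",
]
print(first_run(SAMPLE))
```

Because it accepts any iterable of lines, an open file object can be passed in directly, so the file is still read lazily.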

Answer 2

Assuming the pattern in the file FlatTextFile.txt is stored as:

table name
food # (none or one or more)
drink # (none or one or more)
dessert # (none or one or more)
(food, drink, dessert pattern can repeat for a table)
(blank line)
table name (next table name)
(food, drink, dessert pattern in any order)

You want to pick the drink records that come immediately after finding table pk_tbl 101. The table name can be pk_tbl + any string or no string + 101.
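
One fairly strict reading of that naming rule is "the first field is pk_tbl, the last field is the key, and anything may sit in between". The helper below is only a sketch of that reading; the name is_target_header is hypothetical and not part of the original answer.

```python
def is_target_header(line, prefix="pk_tbl", key="101"):
    """Return True if line looks like the header of the requested table.

    Assumption: fields are whitespace-separated, the table keyword comes
    first and the key comes last, with any number of fields in between.
    """
    fields = line.split()
    return len(fields) >= 2 and fields[0] == prefix and fields[-1] == key

print(is_target_header("pk_tbl 101"))
print(is_target_header("pk_tbl 23 100"))
```

Comparing whole fields rather than substrings means a header such as pk_tbl 1010 is not mistaken for table 101.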

Based on the above assumptions, here is code that picks the first run of drinks from table 101.

with open('FlatTextFile.txt', 'r') as f:
    table = False
    output = []
    for line in f:
        line = line.rstrip()
        x = line.split()
        if table and 'pk_tbl' in x:  # reached the next table without more drinks; stop
            break
        if {'pk_tbl', '101'} <= set(x):  # checks if 'pk_tbl' and '101' are both in x
            table = True
            continue
        if table and 'drink' in x:  # collects drink rows in the matched table
            output.append(line)
            continue
        if output:  # we are past the first run of drinks in table pk_tbl 101; stop processing
            break
print(output)

The output will be:

['drink 25', 'drink 26']
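
If several keys need to be reconciled, re-reading a large file once per key gets slow. A possible variation, offered only as a sketch, collects the first run of drink rows for every table in a single pass. The function name first_runs_by_table is made up, and the sketch assumes the key is the last field of each pk_tbl line; if a key appears twice, the later table overwrites the earlier one.

```python
def first_runs_by_table(lines, table_keyword="pk_tbl", data_keyword="drink"):
    """Map each table key to its first consecutive run of data_keyword rows."""
    runs = {}
    key = None    # key of the table currently being read
    done = False  # True once the first run in the current table has ended
    for line in lines:
        fields = line.split()
        if not fields:  # skip blank lines
            continue
        if fields[0] == table_keyword:  # new table header: reset the state
            key = fields[-1]
            runs[key] = []
            done = False
        elif key is None or done:  # before any table, or run already captured
            continue
        elif fields[0] == data_keyword:  # still inside the first run
            runs[key].append(" ".join(fields))
        elif runs[key]:  # a different row after the run started: run is over
            done = True
    return runs

# Hypothetical in-memory copy of part of the sample file.
sample = [
    "pk_tbl 23 100", "food 0 0", "drink 0 0", "dessert 0 0", "",
    "pk_tbl 101", "food 0", "drink 25", "drink 26", "dessert 0",
    "drink 27", "drink 28", "drink 29",
]
print(first_runs_by_table(sample)["101"])
```

The result can then be looked up per key, for example first_runs_by_table(open('FlatTextFile.txt'))['101'].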

python

text-parsing