Print words from the corresponding line numbers

Question

Hi all, I have two files, File1 and File2, which contain the following data.

File1:

 TOPIC:topic_0 30063951.0
 2 19195200.0

 1 7586580.0

 3 2622580.0

TOPIC:topic_1 17201790.0
1 15428200.0

2 917930.0

10 670854.0

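and so on. There are 15 topics in total, each with its own weight. The numbers in the first column, such as 2, 1, 3, are the line numbers of the corresponding words in file2. For example,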

File 2 has:

   1 i

   2 new

   3 percent

   4 people 

   5 year

   6 two

   7 million

   8 president

   9 last

   10 government

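and so on. There are about 10470 lines of words. In short, I want the corresponding words in the first column of file1 instead of the line numbers. My output should look like this: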

TOPIC:topic_0 30063951.0

new 19195200.0

i 7586580.0

percent 2622580.0

TOPIC:topic_1 17201790.0

i 15428200.0

new 917930.0

government 670854.0

My code:

import sys
d1 = {}
n = 1

with open("ap_vocab.txt") as in_file2:
     for line2 in in_file2:
            #print n, line2
            d1[n] = line2[:-1]
            n = n + 1

with open("ap_top_t15.txt") as in_file:
     for line1 in in_file:
            columns = line1.split(' ')
            firstwords = columns[0]
            #print firstwords[:-8]
            if firstwords[:-8] == 'TOPIC':
                    print columns[0], columns[1]
            elif firstwords[:-8] != '\n':
                    num = columns[0]
                    print d1[n], columns[1]

This code runs when I use print d1[2], columns[1] instead, which prints the second word of file2 on every line. But when I run the code above, it gives an error:

KeyError: 10472

File 2 has 10472 lines of words. Please help me solve this problem. Thanks in advance!

Answer 1

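In your first for loop, n is incremented on every line until it reaches a final value of 10472. However, you only set values of d1[n] up to 10471, because the increment comes after you set d1 for the given n, in these two lines: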

d1[n] = line2[:-1]
n = n + 1

Then, on the line

print d1[n], columns[1]

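in your second for loop (over in_file), you are trying to access d1[10472], which does not exist. Note also that d1 is a dictionary: looking up an integer key you actually inserted, such as d1[2], works, but list-style positional access such as d1[-1] does not. To get at the "last" entry by position you would have to use a list (d1 = []) or something like an OrderedDict, because dictionaries in Python 2.7 are unordered.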
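To make the off-by-one concrete, here is a small sketch that repeats the same fill-then-increment pattern on three sample words taken from the question (the list sample_words and the flag raised are just illustrative names of mine). After the loop, n sits one past the largest key, so d1[n] fails in the same way as in the question.

```python
# Same pattern as the question's first loop, on a tiny sample.
sample_words = ["i", "new", "percent"]

d1 = {}
n = 1
for word in sample_words:
    d1[n] = word   # store under the current n ...
    n = n + 1      # ... then move n past it

# The keys are 1, 2, 3, but n has already advanced to 4.
try:
    d1[n]
    raised = False
except KeyError:
    raised = True  # same kind of error as "KeyError: 10472"

print(sorted(d1))
print(n)
print(raised)
```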

You could either:
change your increments so that you do set a value for d1 at the d1[10472] position, or simply set the value for the last position after your for loop.

Depending on what you want to print, you could (with d1 as a list) replace your last line with

print d1[-1], columns[1]

to print the value at the final index position you currently have set.
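If the goal is the output shown in the question, one possible approach is to look each word up by the number read from the line, rather than by n. The following is only a sketch under some assumptions of mine: the vocabulary file has one word per non-blank line (optionally preceded by its number), blank lines are ignored, and the function names load_vocab and replace_numbers are hypothetical. It runs on the sample data from the question instead of the real files.

```python
def load_vocab(lines):
    """Map 1-based line number -> word, skipping blank lines (assumption)."""
    vocab = {}
    number = 0
    for line in lines:
        parts = line.split()
        if not parts:
            continue
        number += 1
        # Take the last token so both "new" and "2 new" give "new".
        vocab[number] = parts[-1]
    return vocab


def replace_numbers(topic_lines, vocab):
    """Return the topic lines with first-column numbers replaced by words."""
    out = []
    for line in topic_lines:
        columns = line.split()
        if len(columns) < 2:
            continue  # blank or malformed line
        if columns[0].startswith("TOPIC"):
            out.append("%s %s" % (columns[0], columns[1]))
        else:
            # Look up by the number on this line, not by a loop counter.
            out.append("%s %s" % (vocab[int(columns[0])], columns[1]))
    return out


sample_vocab = ["i", "new", "percent", "people", "year",
                "two", "million", "president", "last", "government"]
sample_topics = [
    " TOPIC:topic_0 30063951.0",
    " 2 19195200.0",
    "",
    " 1 7586580.0",
    " 3 2622580.0",
    "TOPIC:topic_1 17201790.0",
    "1 15428200.0",
    "2 917930.0",
    "10 670854.0",
]

result = replace_numbers(sample_topics, load_vocab(sample_vocab))
print("\n".join(result))
```

With the real data, the same functions could presumably be fed open file objects, for example load_vocab(open("ap_vocab.txt")) and replace_numbers(open("ap_top_t15.txt"), vocab), since both only iterate over lines.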

printing

words

file

python-2.7