Extracting data from columns in a text file using Python

Question

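I am new to processing file data in Python. I have the following text file, which contains a report for a new college campus. I want to extract data from the "colleges" column and the "book IDs_1" column, where the value for block_ABC_top is 23. I also want to check whether block_ABC_top is present in the colleges column and find its value in the book IDs_1 column. Can this be done with the text file as it is, or do I have to convert it to CSV? How do I write code for this data processing? Please help!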

Copyright 1986-2019, Inc. All Rights Reserved.

Design Information
-----------------------------------------------------------------------------------------------------------------
| Version : (lin64) Build 2729669 Thu Dec  5 04:48:12 MST 2019
| Date         : Wed Aug 26 00:46:08 2020
| Host         : running 64-bit Red Hat Enterprise Linux Server release 7.8 
| Command      : college report
| Design       : college
| Device       : laptop
| Design State : in construction
-----------------------------------------------------------------------------------------------------------------

Table of Contents
-----------------
1. Information by Hierarchy

1. Information by Hierarchy
---------------------------
+----------------------------------------------+--------------------------------------------+------------+------------+---------+------+-----+
|                   colleges                   |                   Module                   | Total mems | book IDs_1 | canteen | BUS  | UPS | 
+----------------------------------------------+--------------------------------------------+------------+------------+---------+------+-----+
| block_ABC_top                                |                                      (top) |         44 |         23 |       8 |    8 |   8 |   
|    (block_ABC_top_0)                         |                            block_ABC_top_0 |          5 |          5 |       5 |    2 |   9 |       
+----------------------------------------------+--------------------------------------------+------------+------------+---------+------+-----+

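I have a list, data, that contains college names, for example block_ABC_top, block_ABC_top_1, block_ABC_top, block_ABC_top_1, ... My code is below. The problem I am facing is that it only picks up the entry for data[0]. However, data[0] and data[2] hold the same college, and I want the check to happen twice.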

# data, filenames, fullpath and x are defined earlier in the script (not shown)
with open("utility.txt", "r") as f1:
    for line in f1:
        if data[x] in line:
            line_values = line.split('|')

            if int(line_values[4]) == 23 or int(line_values[7]) == 8:
                filecheck = fullpath + "/" + filenames[x]
                print(filecheck)

            x = x + 1

Answer 1

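# Table rows contain eight '|' characters; index 0 is the empty text before the first '|'.
rows = [x.split('|') for x in open(file) if x.count('|') >= 8]
print([r[1].strip() for r in rows])  # colleges column
print([r[4].strip() for r in rows])  # book IDs_1 column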

Try running these.

Answer 2

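Rather than relying on the exact character position of each field, a better approach is to use the split() function, since your fields are separated by the | symbol. You can loop over the lines of the file and process each one accordingly.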

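with open("utility.txt") as f:
    for line in f:
        line_values = line.split("|")
        # Each table row starts with "|", so index 0 is empty and the
        # colleges column is at index 1.
        if len(line_values) > 1:
            print(line_values[1].strip())  # e.g. block_ABC_top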
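Building on the split() idea, the sketch below reads the table into dictionaries so that a column can be looked up by its header name instead of by index. It is only an illustration: parse_table and sample_report are hypothetical names, the sample is a trimmed copy of the report in the question, and the code assumes the header row is the first row that has a cell reading exactly "colleges".

```python
def parse_table(lines):
    """Parse '|'-delimited table rows into a list of dicts keyed by header name."""
    header = None
    rows = []
    for line in lines:
        line = line.strip()
        # Skip borders (+---+), blank lines and anything that is not a row.
        if not line.startswith("|"):
            continue
        cells = [c.strip() for c in line.strip("|").split("|")]
        if header is None:
            # Assumption: the header is the first row with a 'colleges' cell.
            if "colleges" in cells:
                header = cells
            continue
        # Keep only rows that have one cell per header column.
        if len(cells) == len(header):
            rows.append(dict(zip(header, cells)))
    return rows


# Trimmed copy of the report from the question, for illustration only.
sample_report = """\
| Command      : college report
+-------------------+-----------------+------------+------------+---------+-----+-----+
|     colleges      |     Module      | Total mems | book IDs_1 | canteen | BUS | UPS |
+-------------------+-----------------+------------+------------+---------+-----+-----+
| block_ABC_top     |           (top) |         44 |         23 |       8 |   8 |   8 |
| (block_ABC_top_0) | block_ABC_top_0 |          5 |          5 |       5 |   2 |   9 |
+-------------------+-----------------+------------+------------+---------+-----+-----+
"""

rows = parse_table(sample_report.splitlines())
for row in rows:
    print(row["colleges"], row["book IDs_1"])
```

With a real file, the same function can be given the open file object, for example parse_table(open("utility.txt")), since it only needs an iterable of lines. Values come back as strings, so convert with int() before comparing numbers.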

Answer 3

To extract the book IDs column data, use the following code:

with open('report.txt') as f:
  for line in f:
    if 'block_ABC_top' in line:
      line_values = line.split('|')
      print(line_values[4].strip())  # prints 23 and 5: both rows contain 'block_ABC_top'
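Because 'block_ABC_top' in line is a substring test, it also matches the (block_ABC_top_0) row. One possible way to avoid that, and to handle a list in which the same college appears more than once, is to read the file once into a lookup keyed by the exact college name and then loop over the list. This is a sketch under assumptions: book_ids_by_college, report and the contents of data are hypothetical, and the column positions (1 for colleges, 4 for book IDs_1) are taken from the table in the question.

```python
def book_ids_by_college(lines, name_col=1, value_col=4):
    """Map each exact college name to its integer 'book IDs_1' value."""
    lookup = {}
    for line in lines:
        cells = [c.strip() for c in line.split("|")]
        # Skip borders, header rows and any line without a numeric value cell.
        if len(cells) <= value_col or not cells[value_col].isdigit():
            continue
        lookup[cells[name_col]] = int(cells[value_col])
    return lookup


# Trimmed copy of the report from the question, for illustration only.
report = """\
+-------------------+-----------------+------------+------------+---------+-----+-----+
|     colleges      |     Module      | Total mems | book IDs_1 | canteen | BUS | UPS |
+-------------------+-----------------+------------+------------+---------+-----+-----+
| block_ABC_top     |           (top) |         44 |         23 |       8 |   8 |   8 |
| (block_ABC_top_0) | block_ABC_top_0 |          5 |          5 |       5 |   2 |   9 |
+-------------------+-----------------+------------+------------+---------+-----+-----+
"""

# Hypothetical list with a repeated college, as described in the question.
data = ["block_ABC_top", "block_ABC_top_1", "block_ABC_top"]

lookup = book_ids_by_college(report.splitlines())

# Every entry of data is checked, so the repeated name is reported twice.
matches = [(i, name) for i, name in enumerate(data) if lookup.get(name) == 23]
for i, name in matches:
    print(i, name, lookup[name])
```

The index i in each match can then be used to pick the corresponding entry from a parallel list such as filenames in the question's code.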
