Extracting data from a text file with Python
Question
I have a large text file. It contains a bunch of information in the following format:
|NAME|NUMBER(1)|AST|TYPE(0)|TYPE|NUMBER(2)||NUMBER(3)|NUMBER(4)|DESCRIPTION|
Sorry for the vagueness. All the information is formatted like the line above, with the separator '|' between descriptors. I want to be able to search the file for the 'NAME' and then print each descriptor under its own label, as in this example:
Name
Number(1):
AST:
TYPE(0):
etc....
In case that is still unclear: I want to be able to search for a name and then print out the fields that follow it, each of which is separated by a '|'.
Can anyone help?
EDIT: Here is an example of part of the text file:
|Trevor Jones|70|AST|White|Earth|3||500|1500|Old Man Living in a retirement home|
Here is my code so far:
with open('LARGE.TXT') as fd:
    name = 'Trevor Jones'
    input = [x.split('|') for x in fd.readlines()]
    to_search = {x[0]: x for x in input}
    print('\n'.join(to_search[name]))
Recommended answer
Something like this:
# Open the file with a context manager so it is closed automatically
with open('large_text_file') as fd:
    # Split each line into fields. Strip the trailing newline first, then
    # the leading and trailing pipes, so that the name ends up at index 0.
    rows = [x.strip().strip('|').split('|') for x in fd if x.strip()]
# Build a dictionary keyed by name so records can be looked up directly
to_search = {x[0]: x for x in rows}
Then use
to_search[NAME]
Depending on the format you want the answer in, use
print(' '.join(to_search[NAME]))
or
print('\n'.join(to_search[NAME]))
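The question asks for each field to be printed next to its own label, which a plain join does not do. Here is a minimal sketch of one way to get that, assuming the labels are the ones in the format line from the question and that the empty slot between NUMBER(2) and NUMBER(3) should be skipped. The names LABELS and format_record are made up for this illustration.

```python
# Labels taken from the format line in the question; the empty string
# marks the blank field between NUMBER(2) and NUMBER(3).
LABELS = ['NAME', 'NUMBER(1)', 'AST', 'TYPE(0)', 'TYPE',
          'NUMBER(2)', '', 'NUMBER(3)', 'NUMBER(4)', 'DESCRIPTION']


def format_record(fields, labels=LABELS):
    """Return one 'LABEL: value' line per field, skipping unlabeled fields."""
    return '\n'.join(
        f'{label}: {value}'
        for label, value in zip(labels, fields)
        if label
    )


# Sample line copied from the question.
line = '|Trevor Jones|70|AST|White|Earth|3||500|1500|Old Man Living in a retirement home|\n'
fields = line.strip().strip('|').split('|')
print(format_record(fields))
```

With the dictionary built above, the same helper could be called as format_record(to_search[NAME]).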
A word of warning: this solution assumes that the names are unique. If they aren't, a more complex solution may be required.
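If names can repeat, one possible approach (a sketch, not the only option) is to map each name to a list of records instead of a single record. The helper name index_by_name is hypothetical, and the second sample record below is invented purely to show a duplicate name.

```python
from collections import defaultdict


def index_by_name(lines):
    """Map each name to a list of all records (field lists) that share it."""
    records = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        fields = line.strip('|').split('|')
        records[fields[0]].append(fields)
    return records


# Hypothetical sample data: the second record is invented for illustration.
sample = [
    '|Trevor Jones|70|AST|White|Earth|3||500|1500|Old Man Living in a retirement home|\n',
    '|Trevor Jones|35|AST|Blue|Mars|1||200|900|A second, made-up record|\n',
]
to_search = index_by_name(sample)
for fields in to_search['Trevor Jones']:
    print('\n'.join(fields))
    print('-' * 20)
```

An open file object can be passed in place of the list, since the function only iterates over lines.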