[Biopython] processing XML files in Biopython

Sheila the angel from.d.putto at gmail.com
Mon Jun 6 15:10:04 UTC 2011


@David - Yes, it works, but I have a few small questions.
1. How do I extract information that is not stored directly under one of
the keys in record[0].keys()? For example,
record[0]['GBSeq_feature-table']
gives output that looks like parsed XML. How can I extract the
'GBQualifier_name' from it?

2. Just out of curiosity, why do we use record[0] to extract information,
e.g. record[0]['GBSeq_definition']?
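
To make both questions concrete, here is a minimal sketch rather than a definitive answer. It assumes the parsed result follows the GBSeq layout, where each entry of 'GBSeq_feature-table' has a 'GBFeature_key' and an optional 'GBFeature_quals' list whose items hold 'GBQualifier_name' and 'GBQualifier_value'. The objects returned by Entrez.read behave like ordinary lists and dicts, so a plain-Python stand-in is used here to keep the example runnable offline. The helper name iter_qualifiers and all values other than the locus are hypothetical placeholders. It also illustrates why record[0] is needed: the parsed result is a list with one entry per sequence returned, so a single ID gives a one-element list.

```python
# Stand-in for the object returned by Entrez.read(handle). Plain lists and
# dicts are used so the sketch runs without network access; the qualifier
# values and the length below are placeholders, not real NCBI data.
record = [
    {
        "GBSeq_locus": "NP_997807",
        "GBSeq_length": "100",  # placeholder value
        "GBSeq_feature-table": [
            {
                "GBFeature_key": "source",
                "GBFeature_quals": [
                    {"GBQualifier_name": "organism",
                     "GBQualifier_value": "Example organism"},
                ],
            },
            {
                "GBFeature_key": "CDS",
                "GBFeature_quals": [
                    {"GBQualifier_name": "gene",
                     "GBQualifier_value": "EXAMPLE1"},
                    {"GBQualifier_name": "db_xref",
                     "GBQualifier_value": "GeneID:12345"},
                ],
            },
        ],
    }
]


def iter_qualifiers(gbseq):
    """Yield (feature_key, qualifier_name, qualifier_value) for one entry.

    Hypothetical helper: it walks feature table -> features -> qualifiers.
    """
    for feature in gbseq["GBSeq_feature-table"]:
        # Assumption: a feature may have no qualifiers, so default to [].
        for qual in feature.get("GBFeature_quals", []):
            yield (
                feature["GBFeature_key"],
                qual["GBQualifier_name"],
                qual.get("GBQualifier_value"),
            )


# The parsed result is a list, one entry per sequence, hence record[0].
first = record[0]
qualifiers = list(iter_qualifiers(first))

# All qualifier names found in the feature table, in document order.
names = [name for _, name, _ in qualifiers]

# Assumption: the GeneID appears as a db_xref qualifier "GeneID:<number>".
gene_ids = [
    value.split(":", 1)[1]
    for _, name, value in qualifiers
    if name == "db_xref" and value and value.startswith("GeneID:")
]

# The length is stored as a string under GBSeq_length.
length = int(first["GBSeq_length"])

print(names)
print(gene_ids)
print(length)
```

With a real result from Entrez.read, the same loop should work unchanged as long as the assumed key names match what record[0]['GBSeq_feature-table'] actually contains; printing one feature first is a quick way to check.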



On Mon, Jun 6, 2011 at 4:37 PM, David Suárez Pascal
<david.suarez at yahoo.com> wrote:

> Sheila,
> I don't think you have to deal with the XML directly. I tried your code
> and found that Entrez.read has already parsed the data.
> What I get when I run your code is a list:
> >>> type(record)
> <class 'Bio.Entrez.Parser.ListElement'>
>
> which contains a dict with the following keys:
> >>> record[0].keys()
> [u'GBSeq_moltype',
>  u'GBSeq_source',
>  u'GBSeq_sequence',
>  u'GBSeq_primary-accession',
>  u'GBSeq_definition',
>  u'GBSeq_accession-version',
>  u'GBSeq_topology',
>  u'GBSeq_length',
>  u'GBSeq_feature-table',
>  u'GBSeq_create-date',
>  u'GBSeq_other-seqids',
>  u'GBSeq_division',
>  u'GBSeq_taxonomy',
>  u'GBSeq_comment',
>  u'GBSeq_source-db',
>  u'GBSeq_references',
>  u'GBSeq_update-date',
>  u'GBSeq_organism',
>  u'GBSeq_locus']
>
> If you get the same response, then you can just do:
> >>> record[0]['GBSeq_locus']
> 'NP_997807'
>
> I hope this helps.
>
> David
>
> 2011/6/6 Sheila the angel <from.d.putto at gmail.com>
>
>> Hi All,
>>
>> I am new to Biopython. I have a simple question: how can I process XML
>> files in Biopython?
>> For example, I have the NCBI Reference Sequence ID 'NP_997807.1'.
>> I want to download the XML file and extract certain information from it
>> (e.g. GeneID, amino acid length, etc.).
>> To download the file I did:
>>
>> from Bio import Entrez
>> handle = Entrez.efetch(db="protein", id="NP_997807.1", retmode="xml")
>> record = Entrez.read(handle)
>> handle.close()
>>
>> Now I have no clue how to extract specific information (like GeneID) :(
>> Please help.
>>
>> --
>> Cheers
>>
>> Sheila d. Angela
>> _______________________________________________
>> Biopython mailing list  -  Biopython at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/biopython
>>
>
>
