[Biopython] change rec.id problems

Frederico Moraes Ferreira ferreirafm at usp.br
Mon Jun 24 21:53:17 UTC 2013


Hi list,
I'm trying to change rec.id so that the file name replaces the 
leading part of the id string.
The code is as follows:

     for inf in inflist:
         rec = SeqIO.read(open(inf, "rU"), "fasta")
         if inf[:-5] != rec.id.split('|')[0][:-3]:
             print rec.id
             rec.id = '%spep|%s' % (inf[:-5], '|'.join(rec.id.split('|')[1:]))
             print rec.id
             outf = '.'.join(inf.split('.')[:-1]) + '_new.fasta'
             SeqIO.write(rec, outf, 'fasta')

Judging by the prints below, the program seems to be working fine.

####output########
emm52.pep|166|Type:P
emm52.0.pep|166|Type:P
emm5-21.pep|178|Type:P
emm5.21.pep|178|Type:P
emm52-1.pep|240|Type:P
emm52.1.pep|240|Type:P
emm5-22.pep|219|Type:P
emm5.22.pep|219|Type:P
emm5-23.pep|231|Type:P
emm5.23.pep|231|Type:P
emm5-24.pep|157|Type:P
emm5.24.pep|157|Type:P
emm5-25.pep|110|Type:P

However, in the written file the new and old ids are concatenated.

 >emm52.0.pep|166|Type:P emm52.pep|166|Type:P <unknown description>
GTASVAVGLTVVGAGLASQTEVKADQPVDHHRYTEANDAVLQGRTVSARALLHEINKNGQ
LRSENEELKADLQKKEQELKNLNDDVKKLNDEVALERLKNERHVHDEEVELERLKNERHD
HDKKEAERKALEDKLADKQEHLDGALRYINEKEAERKEKEAEQKKL
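
My guess is that the writer builds the header line from rec.id plus 
rec.description, and that rec.description still holds the old title 
line because I only changed rec.id. Below is a minimal sketch of that 
guess in plain Python. compose_title is a hypothetical stand-in, not 
Biopython's actual code, and the old title string is assumed from the 
output above.

```python
def compose_title(rec_id, description):
    """Hypothetical stand-in for how a FASTA writer might build the '>' line.

    Assumption: the header is the id followed by the description, unless
    the description already starts with the id or is empty.
    """
    if description and description.split(None, 1)[0] == rec_id:
        return description
    if description:
        return "%s %s" % (rec_id, description)
    return rec_id


# Assumed state after parsing: the description keeps the full old title line.
old_title = "emm52.pep|166|Type:P <unknown description>"
new_id = "emm52.0.pep|166|Type:P"

# Changing only the id leaves the stale description in place.
stale = compose_title(new_id, old_title)
# Clearing the description leaves just the new id.
cleared = compose_title(new_id, "")

print(">" + stale)
print(">" + cleared)
```

If this guess is right, also resetting the description (for example 
rec.description = "") before calling SeqIO.write should give a header 
with only the new id.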

Am I doing something wrong?
All the best,
Fred









