[Biopython] SeqIO fasta "fakes" recognition
Marco Galardini
marco.galardini at unifi.it
Thu Feb 23 11:05:53 EST 2012
Hi all,
I was wondering whether anyone knows of a way to distinguish between
"real" FASTA files and files that merely happen to contain a ">" character.
I would like to scan a directory and return only the "real" FASTA files.
I tried parsing a .png file and, surprisingly, got the following
result instead of an error:
from Bio import SeqIO
SeqIO.parse(open('Screenshot.png'), 'fasta').next()
SeqRecord(seq=Seq('Ȏ;9r$?���8�n���˗�ݻ7M�4��ɓ\�r���0����$It��I...q+',
SingleLetterAlphabet()),
id='>>DEE\xd1\xaaU+\x8e\x1f?Nxx8g\xce\x9c1\xb8]``',
name='>>DEE\xd1\xaaU+\x8e\x1f?Nxx8g\xce\x9c1\xb8]``',
description='>>DEE\xd1\xaaU+\x8e\x1f?Nxx8g\xce\x9c1\xb8]``
\x81\x81\x81\xec\xdb\xb7Ok\xf9\xd5\xabW\xf1\xf0\xf0`\xe2\xc4\x89\x8c\x181\x82\x9e={j\x95+\x14',
dbxrefs=[])
I also tried specifying some Alphabets, but I got the same result.
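To make the goal concrete, something along these lines is roughly what I am after. It is only a minimal sketch that uses the standard library rather than SeqIO; the names looks_like_fasta, fasta_files, ALLOWED and max_bytes are made up for illustration, and it assumes that a "real" FASTA file is plain ASCII, starts with a ">" header, and has sequence lines made only of letters plus "*", "-" and ".":

```python
import os
import string

# Assumption: sequence lines contain only letters plus stop/gap symbols.
ALLOWED = set(string.ascii_letters + "*-.")


def looks_like_fasta(path, max_bytes=65536):
    """Heuristic check on the first max_bytes of a file (hypothetical helper)."""
    with open(path, "rb") as handle:
        chunk = handle.read(max_bytes)
    try:
        # Assumption: real FASTA files are ASCII; binary files such as PNG fail here.
        text = chunk.decode("ascii")
    except UnicodeDecodeError:
        return False
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    # The first non-blank line must be a header.
    if not lines or not lines[0].startswith(">"):
        return False
    seen_sequence = False
    for line in lines[1:]:
        if line.startswith(">"):
            continue  # another header line
        if not set(line) <= ALLOWED:
            return False  # unexpected characters in a sequence line
        seen_sequence = True
    # Require at least one sequence line after the header.
    return seen_sequence


def fasta_files(directory):
    """Return the sorted names of files in directory that pass the check."""
    names = []
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        if os.path.isfile(path) and looks_like_fasta(path):
            names.append(name)
    return sorted(names)
```

Calling fasta_files('.') would then list only the files in the current directory that pass the heuristic. It is stricter than the parser on purpose, so a FASTA file with non-ASCII characters in its headers would be rejected as well.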
Thanks in advance,
Marco
--
-------------------------------------------------
Marco Galardini
DBE - Department of Evolutionary Biology
University of Florence - Italy
e-mail: marco.galardini at unifi.it
www: http://www.unifi.it/dblage/CMpro-v-p-51.html
phone: +39 055 2288249
mobile: +39 340 2808041
-------------------------------------------------