[Biopython] cleaning sequences
Liam Thompson
dejmail at gmail.com
Tue Jul 14 07:09:49 EDT 2009
Hi everyone
I was wondering whether there is a built-in method for determining whether a
sequence (GenBank or FASTA) is ambiguous or unambiguous. The reason I ask is
that I am trying to subtype a couple of hundred viral DNA sequences. Because
of poor sequencing, they often contain ambiguous characters, which the
subtyping algorithm doesn't like. I realise I can loop over each genome and
compare every letter against G, A, T and C to detect ambiguity, but a
built-in function would be easier.
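
For reference, a minimal sketch of the manual check in plain Python, with no
Biopython calls. The names UNAMBIGUOUS_DNA, is_unambiguous and
ambiguous_positions are just placeholders for illustration, not an existing
API, and I am assuming the sequences are DNA and that case should be ignored:

```python
# Hypothetical helper names; nothing here is part of Biopython.
UNAMBIGUOUS_DNA = frozenset("ACGT")


def is_unambiguous(sequence):
    """Return True if the sequence contains only A, C, G and T.

    Case is ignored. An empty sequence counts as unambiguous.
    """
    return set(str(sequence).upper()) <= UNAMBIGUOUS_DNA


def ambiguous_positions(sequence):
    """Return (index, character) pairs for every character outside ACGT."""
    return [
        (index, char)
        for index, char in enumerate(str(sequence).upper())
        if char not in UNAMBIGUOUS_DNA
    ]
```

Both functions call str() on their argument, so a plain string works, and so
should the sequence of a parsed record (for example str(record.seq)).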
Thanks
Liam
--
-----------------------------------------------------------
Antiviral Gene Therapy Research Unit
University of the Witwatersrand
Faculty of Health Sciences, Room 7Q07
7 York Road, Parktown
2193
Tel: 2711 717 2465/7
Fax: 2711 717 2395
Email: liam.thompson at students.wits.ac.za / dejmail at gmail.com