[Biopython] cleaning sequences

Liam Thompson dejmail at gmail.com
Tue Jul 14 07:09:49 EDT 2009

Hi everyone

I was wondering whether there is a built-in method for determining whether a
sequence (GenBank or FASTA) is ambiguous or unambiguous. The reason I ask is
that I am trying to subtype a couple of hundred viral DNA sequences, and due
to bad sequencing, the sequences often contain ambiguous characters, which
the subtyping algorithm doesn't like. I realise I can compare each letter of
each genome with GATC in a loop to determine ambiguity, but it would be
easier if there were a built-in function.
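
For reference, the letter-by-letter comparison against GATC described above
can be sketched in plain Python as follows. This is only a rough sketch, not
a Biopython feature: the names UNAMBIGUOUS_DNA, is_unambiguous and
ambiguous_positions are made up for illustration, and it assumes the sequence
is available as an ordinary string and that lower-case letters should count
the same as upper-case ones.

```python
# Hypothetical helpers for illustration only; these are not Biopython functions.
UNAMBIGUOUS_DNA = frozenset("GATC")


def is_unambiguous(sequence, alphabet=UNAMBIGUOUS_DNA):
    """Return True if every letter of the sequence is in the alphabet.

    The comparison is case-insensitive. An empty sequence returns True,
    because it contains no offending letters.
    """
    return set(sequence.upper()) <= alphabet


def ambiguous_positions(sequence, alphabet=UNAMBIGUOUS_DNA):
    """Return (index, letter) pairs for every letter outside the alphabet."""
    return [
        (index, letter)
        for index, letter in enumerate(sequence.upper())
        if letter not in alphabet
    ]


if __name__ == "__main__":
    for seq in ("GATTACA", "GATNRCA"):
        print(seq, is_unambiguous(seq), ambiguous_positions(seq))
```

If the records are read with Bio.SeqIO.parse, passing str(record.seq) to
these helpers should work, since that gives the sequence as a plain string.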


Antiviral Gene Therapy Research Unit
University of the Witwatersrand
Faculty of Health Sciences, Room 7Q07
7 York Road, Parktown

Tel: 2711 717 2465/7
Fax: 2711 717 2395
Email: liam.thompson at students.wits.ac.za / dejmail at gmail.com
