[Biopython] cleaning sequences
Chris Fields
cjfields at illinois.edu
Tue Jul 14 14:48:04 UTC 2009
If you do come up with something, let us BioPerl guys know. We have a
preliminary trimming/cleaning version that we're thinking of adding,
but it would be nice to converge on a similar implementation.
chris
On Jul 14, 2009, at 7:45 AM, Brad Chapman wrote:
> Hi Liam;
> I don't believe there is built-in functionality for doing this. The
> problem itself is hard because it is a bit underspecified: what
> should be done when encountering ambiguous characters? Depending on
> your situation this can be a couple of different things:
>
> - Trim the sequence to remove the bases. This might be a
> post-sequencing step, and there was some discussion between Peter
> and Giles about the parameters of doing this earlier this month:
>
> http://lists.open-bio.org/pipermail/biopython/2009-July/005342.html
>
> - Replace the bases with an accepted ambiguity character (say, N or
> x)
>
> So it's a bit hard to generalize. That said, we'd be happy for
> thoughts on an implementation that would tackle these sorts of
> issues.
>
> Brad
>
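
The two options Brad lists (trimming, or replacing with an accepted
ambiguity character) could be sketched roughly as below. This is only an
illustration on plain Python strings, not an existing Biopython function:
the names trim_ambiguous_ends and mask_ambiguous are hypothetical, the
"accepted" alphabet is assumed to be A, C, G and T, and trimming is
assumed to mean dropping ambiguous characters from the two ends only.

```python
# Assumption: only A, C, G and T count as unambiguous DNA letters.
UNAMBIGUOUS = frozenset("ACGT")


def trim_ambiguous_ends(seq, allowed=UNAMBIGUOUS):
    """Drop leading and trailing characters that are not in `allowed`.

    Ambiguous characters in the interior of the sequence are kept.
    """
    start = 0
    end = len(seq)
    # Walk in from the left until an allowed character is found.
    while start < end and seq[start].upper() not in allowed:
        start += 1
    # Walk in from the right until an allowed character is found.
    while end > start and seq[end - 1].upper() not in allowed:
        end -= 1
    return seq[start:end]


def mask_ambiguous(seq, allowed=UNAMBIGUOUS, mask="N"):
    """Replace every character not in `allowed` with `mask`."""
    return "".join(c if c.upper() in allowed else mask for c in seq)
```

With a Biopython SeqRecord, the string to pass in would be something like
str(record.seq). Which of the two behaviours is appropriate depends on
the downstream tool, as noted above.
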
>> I was wondering if there was a built-in method for determining
>> whether a sequence (GenBank or FASTA) is an ambiguous or unambiguous
>> sequence. The reason I ask is I am trying to subtype a couple hundred
>> viral DNA sequences, and due to bad sequencing, the sequences often
>> have ambiguous characters in them, which the algorithm used to subtype
>> doesn't like. I realise I can compare each letter of each genome in a
>> loop with GATC to determine ambiguity, but it might be easier if there
>> was a built-in function.
>>
>> Thanks
>> Liam
>>
>>
>>
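
The letter-by-letter comparison against GATC that Liam describes can be
written fairly compactly. The following is a minimal sketch on plain
strings, not a built-in Biopython method; the function names are
hypothetical, and the check is assumed to be case-insensitive.

```python
# Assumption: a sequence is "unambiguous" if it uses only G, A, T and C.
UNAMBIGUOUS_DNA = frozenset("GATC")


def is_unambiguous(seq, allowed=UNAMBIGUOUS_DNA):
    """Return True if every character of `seq` is in `allowed`.

    An empty sequence is reported as unambiguous.
    """
    return all(c.upper() in allowed for c in seq)


def ambiguous_positions(seq, allowed=UNAMBIGUOUS_DNA):
    """Return (index, character) pairs for characters not in `allowed`."""
    return [(i, c) for i, c in enumerate(seq) if c.upper() not in allowed]
```

For a few hundred viral genomes this kind of check should be cheap, and
ambiguous_positions also reports where the problem characters are, which
may help in deciding whether to trim, mask, or discard a sequence.
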
>> --
>> -----------------------------------------------------------
>> Antiviral Gene Therapy Research Unit
>> University of the Witwatersrand
>> Faculty of Health Sciences, Room 7Q07
>> 7 York Road, Parktown
>> 2193
>>
>> Tel: 2711 717 2465/7
>> Fax: 2711 717 2395
>> Email: liam.thompson at students.wits.ac.za / dejmail at gmail.com
>> _______________________________________________
>> Biopython mailing list - Biopython at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/biopython