[Bioperl-l] removing duplicate fasta records

Paul Boutros pcboutro@engmail.uwaterloo.ca
Tue, 17 Dec 2002 16:29:33 -0500 (EST)


A non-BioPerl solution is to load the sequences into a database field with
a uniqueness constraint.  This assumes that your sequences are no longer
than a few kb; otherwise some databases won't enforce constraints on
them.
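
A rough sketch of the idea, not a definitive implementation. It assumes
SQLite through Python's standard sqlite3 module; neither is named in this
thread. The table name, the column names, and the helper functions
parse_fasta and dedupe_fasta are hypothetical. Upper-casing the sequence
before insertion is also an assumption, made so that "acgt" and "ACGT"
count as the same sequence.

```python
import sqlite3


def parse_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts; emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def dedupe_fasta(lines):
    """Return (header, sequence) rows, keeping the first copy of each sequence."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: the UNIQUE constraint does the deduplication.
    conn.execute(
        "CREATE TABLE seqs ("
        "id INTEGER PRIMARY KEY, header TEXT, sequence TEXT UNIQUE)"
    )
    for header, seq in parse_fasta(lines):
        # INSERT OR IGNORE silently skips a row that would violate UNIQUE.
        # Assumption: case is normalized so comparison is case-insensitive.
        conn.execute(
            "INSERT OR IGNORE INTO seqs (header, sequence) VALUES (?, ?)",
            (header, seq.upper()),
        )
    rows = conn.execute(
        "SELECT header, sequence FROM seqs ORDER BY id"
    ).fetchall()
    conn.close()
    return rows
```

Calling dedupe_fasta(open("input.fa")) on a hypothetical file input.fa
would return one row per distinct sequence, in the order first seen. Only
the header of the first copy is kept; the headers of later duplicates are
dropped.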

HTH,
Paul

> Date: Tue, 17 Dec 2002 12:41:15 -0700 (MST)
> From: "Amit Indap <indapa@cs.arizona.edu>"
> <indapa@amadeus.biosci.arizona.edu>
> To: bioperl-l@bioperl.org
> Subject: [Bioperl-l] removing duplicate fasta records
> 
> I have a file with a list of fasta sequences. Is there a way to
> remove records with the identical sequence? I am a newbie to BioPerl,
> and my search through the documentation hasn't found anything.
> 
> Thank you.
> 
> Amit Indap