counting fasta files
Florence Servant
flo at ebi.ac.uk
Wed Sep 11 17:32:52 UTC 2002
Fernan Aguero wrote:
> +----[ Asi hablaba Ted Chiang (tchiang at bioinfo.sickkids.on.ca):
> |
> | I have a file that contains several hundreds of fasta sequences. Is there
> | a function/program that will count the number of sequences in this file
> | and report it?
> |
> +----]
>
> Here's another one
>
> cat file.fasta | grep -c \>
>
Hi all,
I would suggest adding the constraint that the line must start with >:
cat file.fasta | grep -c ^\>
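
To see why the anchor matters, here is a minimal sketch on a made-up
sample file. The helper name count_headers and the sample contents are
assumptions for illustration only; they are not part of EMBOSS. It assumes
a file in which some non-header line (here a ";" comment) happens to
contain a ">" character.

```shell
# count_headers is a hypothetical helper name, not an EMBOSS tool.
# It counts the lines that begin with ">".
count_headers() {
    grep -c '^>' "$1"
}

# Made-up sample: two records plus a ";" comment line that contains ">".
sample=$(mktemp)
printf '%s\n' '>seq1 first record' 'ACGT' '; note: A>G here' '>seq2' 'GGCC' > "$sample"

grep -c '>' "$sample"      # unanchored: the comment line is counted too
count_headers "$sample"    # anchored: only the two header lines are counted
rm -f "$sample"
```

Quoting the pattern as '^>' keeps the shell from interpreting > as a
redirection, and passing the file name to grep directly avoids the extra
cat.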
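To be sure this works for every fasta file you may have, you also have to
take into account that several comment lines starting with > can occur for
a single sequence. The command below handles that by counting only the
first non-header line that follows each header (-A is a GNU grep option):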
ggrep -A 1 ^\> file.fasta | grep -v ^-- | grep -v ^\> | wc -l
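
Where GNU grep is not available, the same idea can be sketched with awk.
This is only an illustration under stated assumptions: the function name
count_records is hypothetical, and the rule it applies is that a run of
consecutive lines starting with > belongs to a single sequence. Unlike the
grep pipeline above, it also counts a header run at the very end of the
file that has no sequence line after it.

```shell
# count_records is a hypothetical helper name used for illustration.
# A run of consecutive ">" lines is treated as one record.
count_records() {
    awk '
        /^>/ {                        # header (or extra comment) line
            if (!in_header) n++       # count only the first line of a run
            in_header = 1
            next
        }
        { in_header = 0 }             # any other line ends the header run
        END { print n + 0 }           # n + 0 prints 0 for an empty file
    ' "$1"
}
```

Usage would be: count_records file.fasta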
Flo
--
Florence SERVANT
EBI - European Bioinformatics Institute - Room A2-40
Wellcome Trust Genome Campus
Hinxton
Cambridge CB10 1SD
United Kingdom
Tel : (+44) 01223 494 686