[Bioperl-l] Question: How to manipulate files
Marco Blanchette
mblanche at berkeley.edu
Thu Mar 30 01:20:22 UTC 2006
Michael--
Something like:
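#!/usr/bin/perl
use strict;
use warnings;
use Bio::SeqIO;
my $file = shift;
my $seqio_o = Bio::SeqIO->new(-file => $file, -format => 'fasta');
while (my $seq_o = $seqio_o->next_seq) {
    # The sequence number is the run of digits at the end of the ID, e.g. "..._008" -> 8
    my ($id) = $seq_o->display_id =~ /_(\d+)$/;
    # Drop sequences 1 to 7; print number 8 and up
    print ">", $seq_o->display_id, "\n", $seq_o->seq, "\n" if defined $id && $id > 7;
}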
If you redirect standard output to a new file, this script should do what you
are trying to achieve. The original file is left untouched.
Just call:
$ perl theScript.pl myfile.fasta > myNewFile.fasta
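
If BioPerl is not installed, the same filtering can be sketched in core Perl.
This is only a rough sketch, not a BioPerl method: filter_fasta is a
hypothetical helper name, the sample records are made up, and it assumes every
header ends in "_" followed by the sequence number, as in your example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): return only the FASTA records
# whose header ends in "_<number>" with <number> >= $first_kept.
sub filter_fasta {
    my ($text, $first_kept) = @_;
    my $out  = '';
    my $keep = 0;
    for my $line (split /^/, $text) {    # split into lines, keeping newlines
        if ($line =~ /^>/) {
            # Header line: decide whether the whole record is kept.
            my ($num) = $line =~ /_(\d+)\s*$/;
            $keep = defined $num && $num >= $first_kept;
        }
        $out .= $line if $keep;          # sequence lines follow their header
    }
    return $out;
}

# Made-up sample data, using the header style from the question.
my $sample = join '',
    ">10kb_NN_Analysis.txt.nmrc_007\n", "NNNNNNNN\n",
    ">10kb_NN_Analysis.txt.nmrc_008\n", "NTNTNNNN\n", "NNNN\n";

# Prints only the nmrc_008 record.
print filter_fasta($sample, 8);
```

To use it on a real file, read the file into a string, pass it in place of
$sample, and redirect standard output to the new file as above. Unlike the
Bio::SeqIO version, this keeps the original header line and line wrapping
as they are.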
On 3/29/06 14:41, "Michael Craige" <mcraige at genetics.emory.edu> wrote:
> I am attempting to develop a script to open a DNA file containing 15 FASTA
> sequences, delete the first 7 sequences, and close the file, leaving
> the remaining 8 sequences intact.
>
> Can someone help me with a Perl script or point me to some documentation that
> can help? Here is a sample; the first sequence in the file is shown below.
> All the headers are the same except for the number, "001" to "015".
>
>
> >10kb_NN_Analysis.txt.nmrc_001
> NTNTTNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNTNNNNNNNN
> AANNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
> NNNNNNNNNNNNNNNNNNNNNNNN
>
> I am trying to get the script to find the first sequence, ".nmrc_001", and
> then delete the file's content through the end of ".nmrc_007" without
> affecting the record whose header is ".nmrc_008".
>
> Does something already exist to do this?
>
>
> Michael Craige
> Emory University
______________________________
Marco Blanchette, Ph.D.
mblanche at uclink.berkeley.edu
Donald C. Rio's lab
Department of Molecular and Cell Biology
16 Barker Hall
University of California
Berkeley, CA 94720-3204
Tel: (510) 642-1084
Cell: (510) 847-0996
Fax: (510) 642-6062