[Bioperl-l] Genbank seq CODE

Peter Kos <kos@rite.or.jp>
Tue, 11 Jun 2002 21:46:04 +0900


Hi Melissa,

Strange message.
I agree with Jason on the matter of capital letters.

So what is the question? It is tough to troubleshoot if there is no 
trouble.
Does this work or not? Possibly not, but why don't you just try it and 
read the error messages?

STDOUT has nothing to do with write_seq() if you want to write to the 
file gbcu.fsa. (Note that the method is write_seq(), not write->seq().)
However, you need to put a > sign in front of the output file name so 
that the file is opened for writing, like this:

$fastafile = Bio::SeqIO->new('-file' => ">gbcu.fsa", '-format' => 'Fasta');
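
To make the read/write distinction concrete, here is a minimal sketch 
of the whole conversion. The sub name genbank_to_fasta is made up for 
illustration, and it assumes BioPerl is installed; the only difference 
between the two handles is the leading > on the output file name.

```perl
use strict;
use warnings;

# Hypothetical helper: copy every record from a GenBank flat file
# into a FASTA file and return the number of records written.
sub genbank_to_fasta {
    my ($in_path, $out_path) = @_;
    require Bio::SeqIO;    # loaded lazily; needs BioPerl installed

    # No prefix on the name: the file is opened for reading.
    my $in = Bio::SeqIO->new(-file => $in_path, -format => 'genbank');

    # Leading ">": the file is opened for writing.
    my $out = Bio::SeqIO->new(-file => ">$out_path", -format => 'fasta');

    my $count = 0;
    while (my $seq = $in->next_seq) {
        $out->write_seq($seq);
        $count++;
    }
    $out->close;
    return $count;
}

# Example call with the file names from the quoted script:
# my $n = genbank_to_fasta('gbcu.flat', 'gbcu.fsa');
```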

You can wrap the gzip -d ... call in system().
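
Something along these lines should do, as a rough sketch. The helper 
name run_or_die is my own invention; the point is only that system() 
returns a status you can check, instead of failing silently as with 
backticks.

```perl
use strict;
use warnings;

# Hypothetical helper: run an external command and die with a
# readable message if it cannot be started or exits non-zero.
sub run_or_die {
    my @cmd = @_;

    # The list form of system() bypasses the shell, so file names
    # need no extra quoting.
    my $status = system(@cmd);

    die "could not start '$cmd[0]': $!\n" if $status == -1;
    die sprintf("'%s' was killed by signal %d\n", "@cmd", $status & 127)
        if $status & 127;
    die sprintf("'%s' exited with status %d\n", "@cmd", $status >> 8)
        if $status != 0;

    return 1;
}

# Example call with the file name from the quoted script:
# run_or_die('gzip', '-d', 'gbcu.flat.gz');
```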

If you have doubts about the spelling of gonorrhoeae, why insist on 
being strict about the ending? It is not likely that there is 
something called gonorrhoeaplix or similar, so you may as well just 
search for gonorrhoea without the \b at the end. Then you do not need 
to chop off the last character of gonorrhoeae at all, because the 
pattern matches the full word as it is. Or use a looser regex. There 
can even be misspellings sometimes.
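
For example, a prefix match without chop() could look like the sketch 
below. The sub name is_gonorrhoeae is made up, and it assumes the 
string you test is the species name you get from 
$thespecies->species() in your script.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the species name starts with
# "gonorrhoea", whatever follows, ignoring case.
sub is_gonorrhoeae {
    my ($species) = @_;
    return 0 unless defined $species;

    # No \b after the stem, so "gonorrhoea" and "gonorrhoeae" both
    # match and the name does not have to be chopped first.
    return $species =~ /^gonorrhoea/i ? 1 : 0;
}

# Usage inside the loop from the quoted script:
# $fastafile->write_seq($sequence) if is_gonorrhoeae($specsci);
```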

I hope it helps (if you needed help at all).

Peter

On Tuesday, June 11, 2002 3:23 AM, Melissa L. Kimball 
[SMTP:mkimball@med.unc.edu] wrote:
> I THINK I AM DOING IT RIGHT????? MAYBE SINCE I OPENED AN "FTP" FILE HANDLE,
> DO I HAVE TO SPECIFY "STDOUT" BEFORE I USE  write->seq() ?
>
> #!/usr/bin/perl -v
>
> use Bio::SeqIO;
> use Bio::DB::GenBank;
> use Bio::Seq;
> use Bio::DB::NCBIHelper;
> use Bio::Annotation::Collection;
> use diagnostics;
>
> my $ftp = "/usr/bin/ftp";
> my $tmp = "genbankflatfile.txt";
> my $remotefile = "gbcu.flat.gz";
> my $localfile = "gbcu.flat.gz";
> my $host = "ftp.ncbi.nih.gov";
> my $dir = "/genbank/daily";
>
> open(FTP,"| $ftp -n -v $host > $tmp");
>
> print FTP "user anonymous mkimball\@med.unc.edu\n";
> print FTP "cd $dir\n";
> print FTP "binary\n";
> print FTP "get $remotefile $localfile\n";
> print FTP "quit\n";
>
> #close(FTP);
>
> #`gzip -d gbcu.flat.gz`
>
> $genbankfile = Bio::SeqIO->new('-file' => "gbcu.flat",'-format' => 'genbank');
> $fastafile = Bio::SeqIO->new('-file' => "gbcu.fsa", '-format' => 'Fasta');
>
> while (my $sequence = $genbankfile->next_seq())
> {
>         my $thespecies = $sequence->species();   //YOUR WAY IS MUCH BETTER!!
>         my $specsci = $thespecies->species();
>
>         chop($specsci);
>
>        if ($specsci =~ /^gonorrhoea\b/i) {
>
>                 print "$specsci\n\n";
>
>                 $fastafile->write_seq($sequence);
>         }
> }
>
>
> IN THE CONDITION, I CHECK FOR ALL THOSE ENTRIES THAT ARE "gonorrhoea."  WHEN
> I ACTUALLY LOOK AT A *.seq FILE IT IS SPELLED "gonorrhoeae."  ALL OTHER
> SCIENTIFIC LITERATURE SPELLS IT THIS WAY.  STRANGE.
>
> HERE IS A CHUNK OF ANNOTATION.  I WILL DEFINITELY NEED THE DEFINITION LINE,
> SOURCE LINE, AND ORGANISM LINE.  POSSIBLY KEYWORDS, TITLE, AND FEATURES.
> THE QUERY WOULD BE ON THE STRING "gonorrhoeae":
>
>
> LOCUS       AB032563                1407 bp    DNA     linear   BCT 23-SEP-2000
> DEFINITION  Neisseria gonorrhoeae gene for efflux transporter membrane protein
>             AgrA, complete cds.
> ACCESSION   AB032563
> VERSION     AB032563.1  GI:10280997
> KEYWORDS    AgrA.
> SOURCE      Neisseria gonorrhoeae (strain:ATCC19424) DNA.
>   ORGANISM  Neisseria gonorrhoeae
>             Bacteria; Proteobacteria; beta subdivision; Neisseriaceae;
>             Neisseria.
> REFERENCE   1  (bases 1 to 1407)
>   AUTHORS   Murata,T., Gotoh,N., Sakota,E., Otsuki,M. and Nishino,T.
>   TITLE     agrA gene involving to aminoglycoside resistance in Neisseria
>             gonorrhoeae
>   JOURNAL   Published Only in DataBase (2000) In press
> REFERENCE   2  (bases 1 to 1407)
>   AUTHORS   Murata,T., Gotoh,N., Sakota,E., Otsuki,M. and Nishino,T.
>   TITLE     Direct Submission
>   JOURNAL   Submitted (20-SEP-1999) Takeshi Murata, Kyoto Phamaceutical
>             University, Microbiology; Misasagi Yamashina, Kyoto, Kyoto
>             607-8414, Japan (E-mail:murata@mb.kyoto-phu.ac.jp,
>             Tel:81-75-595-4642)
> FEATURES             Location/Qualifiers
>      source          1..1407
>                      /organism="Neisseria gonorrhoeae"
>                      /strain="ATCC19424"
>                      /db_xref="taxon:485"
>      gene            1..1407
>                      /gene="agrA"
>
>
>
> THANK YOU! THANK YOU! THANK YOU!
>