[Bioperl-l] parsing coded_by subfeature

Jason Stajich jason at cgt.duhs.duke.edu
Sun Jul 20 15:57:47 EDT 2003


Bio::Factory::FTLocationFactory should parse these strings fine.

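The "<" and ">" signs mark "fuzzy" locations: the feature starts before
("<") or extends beyond (">") the stated coordinate, i.e. the CDS is
partial in that record. See the GenBank release notes and the
Feature Table definition: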
http://www.ncbi.nlm.nih.gov/projects/collab/FT/
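
To make the notation concrete, here is a minimal standalone sketch in
Python of what parsing the three coded_by strings quoted below involves.
It is not BioPerl code and not how Bio::Factory::FTLocationFactory is
implemented; the names Span and parse_coded_by are made up for this
illustration, and only plain ranges and join(...) are handled (no
complement(), order(), or other Feature Table operators).

```python
import re
from typing import List, NamedTuple, Optional


class Span(NamedTuple):
    """One contiguous piece of a location (hypothetical helper type)."""
    seq_id: Optional[str]  # accession.version of a remote location, else None
    start: int
    end: int
    start_fuzzy: bool      # True when written as "<start"
    end_fuzzy: bool        # True when written as ">end"


# [accession:][<]start[..[>]end]
_SPAN_RE = re.compile(
    r"^(?:(?P<acc>[A-Za-z0-9_.]+):)?"
    r"(?P<lt><)?(?P<start>\d+)"
    r"(?:\.\.(?P<gt>>)?(?P<end>\d+))?$"
)


def parse_coded_by(text: str) -> List[Span]:
    """Parse a plain range or a join(...) of ranges into a list of Spans."""
    s = "".join(text.split())  # drop spaces and line breaks
    if s.startswith("join(") and s.endswith(")"):
        parts = s[len("join("):-1].split(",")
    else:
        parts = [s]

    spans = []
    for part in parts:
        m = _SPAN_RE.match(part)
        if m is None:
            raise ValueError(f"unsupported location: {part!r}")
        start = int(m.group("start"))
        # A single position such as "acc:42" has no explicit end.
        end = int(m.group("end")) if m.group("end") else start
        spans.append(Span(m.group("acc"), start, end,
                          m.group("lt") is not None,
                          m.group("gt") is not None))
    return spans


if __name__ == "__main__":
    examples = [
        "AF264924.1:1749..>2110",
        "AF264924.1:<254..1563",
        "join(AY260053.1:497..545,AY260053.1:610..3342,\n"
        "AY260053.1:3409..3750,AY260053.1:3810..4511, AY260053.1:4569..4960)",
    ]
    for example in examples:
        for span in parse_coded_by(example):
            print(span)
```

A fuzzy end only says the true boundary lies outside the record, so the
sub-sequence you can actually fetch still runs from start to end.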

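On the translation mismatch in the example quoted below: the differences
(M vs I, S vs R, W vs *) look like what you get when a CDS that uses
NCBI translation table 5 (invertebrate mitochondrial) is translated with
the standard code, rather than an effect of "<" and ">". That this
record uses table 5 is an inference from the pattern, not something
stated in the thread; check the /transl_table qualifier on the CDS
feature. The Python sketch below (helper names are made up) translates
the quoted CDS with both tables. The first codon is read as M on the
assumption that it acts as an initiator.

```python
BASES = "TCAG"
# Amino acids for codons in TCAG order (TTT, TTC, TTA, TTG, TCT, ...).
STANDARD = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Table 5 differs from the standard code at TGA (W), ATA (M), AGA/AGG (S).
INVERT_MITO = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSSSSVVVVAAAADDEEGGGG"


def codon_table(amino_acids):
    """Map each of the 64 codons to its amino acid letter."""
    codons = [a + b + c for a in BASES for b in BASES for c in BASES]
    return dict(zip(codons, amino_acids))


def translate(dna, table):
    """Translate frame 1 codon by codon; unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(table.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


# Sequences copied from the quoted test.pl output.
CDS = "ATCCCACAAATAGCACCAATTAGATGATTATTACTATTTATTATTTTTTCTATTACATTTATTTTATTTTGTTCTATTAATTATTATTCTTATATGCCAAATTCACCTAAATCTAATGAATTAAAAAACATCAATTTAAATTCAATAAACTGAAAATGATAA"
PROTEIN = "MPQMAPISWLLLFIIFSITFILFCSINYYSYMPNSPKSNELKNINLNSMNWKW"

standard = translate(CDS, codon_table(STANDARD))
mito = translate(CDS, codon_table(INVERT_MITO))
# Read the first codon as initiator M and drop the terminal stop.
mito_as_protein = "M" + mito[1:].rstrip("*")

print("standard code:", standard)
print("table 5      :", mito)
print("table 5 matches GenBank protein:", mito_as_protein == PROTEIN)
```

In BioPerl the equivalent step is to pass the codon table id to the
sequence object's translate method instead of relying on the default.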

On Sun, 20 Jul 2003, Jack Chen wrote:

> Thanks Jason! I have looked at that script before, but it does not
> actually handle the join cases.
>
> I am also curious how to handle the cases where the 'coded_by' subfeature
> contains the ">" and "<" signs. I am not really sure what they mean. And I
> noticed that wherever these signs appear, the protein sequences retrieved
> are different from the conceptual translation from the nucleotide
> sequences. For example:
>
> [nchen at whey blast_db_checked]$ ./test.pl "gi|8573628|gb|AAF77462.1|"
> Protein obtained from GenBank:
> MPQMAPISWLLLFIIFSITFILFCSINYYSYMPNSPKSNELKNINLNSMNWKW
> CDS sequence is:
> ATCCCACAAATAGCACCAATTAGATGATTATTACTATTTATTATTTTTTCTATTACATTTATTTTATTTTGTTCTATTAATTATTATTCTTATATGCCAAATTCACCTAAATCTAATGAATTAAAAAACATCAATTTAAATTCAATAAACTGAAAATGATAA
> Conceptual translation is:
> IPQIAPIR*LLLFIIFSITFILFCSINYYSYMPNSPKSNELKNINLNSIN*K**
>
> Jack
>
> ++++++++++++++++++++++++++++++++++++++++++++
>     o-o     Jack Chen, Stein Laboratory
>     o---o   Cold Spring Harbor Laboratory
>   o----o    #5 Williams, 1 Bungtown Road
>  O----O     Cold Spring Harbor, NY, 11724
>  0--o       Tel: 1 516 367 8394
>    O        e-mail: chenn at cshl.org
>   o-o       Website: http://www.wormbase.org
> +++++++++++++++++++++++++++++++++++++++++++++
>
>
>
> On Sun, 20 Jul 2003, Jason Stajich wrote:
>
> > See question #5.4 in the FAQ:
> > http://www.bioperl.org/Core/Latest/FAQ.html#Q5.4
> >
> > -jason
> > On Sat, 19 Jul 2003, Jack Chen wrote:
> >
> > > Hi All,
> > >
> > > I'd like to retrieve the nucleotide sequence for a given protein
> > > sequence. I know that I could do it through the coded_by subfeature,
> > > which can be rather messy. Say, it could be in one of the following
> > > formats:
> > >
> > >     #AF264924.1:1749..>2110
> > >     #AF264924.1:<254..1563
> > >     #join(AY260053.1:497..545,AY260053.1:610..3342,
> > >     #     AY260053.1:3409..3750,AY260053.1:3810..4511,
> > >     #     AY260053.1:4569..4960)
> > >
> > > Is there a good, unified way to do it?
> > >
> > > Thanks
> > >
> > > Jack
> > >
> > >
> > > _______________________________________________
> > > Bioperl-l mailing list
> > > Bioperl-l at portal.open-bio.org
> > > http://portal.open-bio.org/mailman/listinfo/bioperl-l
> > >
> >
> > --
> > Jason Stajich
> > Duke University
> > jason at cgt.mc.duke.edu
> >
>

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu

