[Bioperl-l] Get variation included in genbank file

Jessica Sun jessica.sun at gmail.com
Fri Jun 11 16:33:07 UTC 2010


That is what I thought as well. It would be very nice if this could be done
within BioPerl too.
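
For the manual route Dave describes further down (adding the variations as
features on a RichSeq object and writing GenBank format), a minimal sketch
might look like the following. It assumes BioPerl is installed. The sequence,
ID, accession, SNP position, alleles and dbSNP number are placeholders made up
for illustration; they are not taken from NG_011506 or from dbSNP.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use Bio::Seq::RichSeq;
use Bio::SeqFeature::Generic;
use Bio::SeqIO;

# Placeholder sequence object (hypothetical ID, accession and sequence).
# In practice this would be the record read from a GenBank file via Bio::SeqIO.
my $seq = Bio::Seq::RichSeq->new(
    -display_id       => 'EXAMPLE1',
    -accession_number => 'EXAMPLE1',
    -alphabet         => 'dna',
    -seq              => 'ATGGCGTACGTTAGCCTAGGATCCGATTAA',
);

# One made-up SNP; position, alleles and dbSNP number are illustrative only.
my @snps = (
    { pos => 10, alleles => [ 'g', 'a' ], dbsnp => '1234567' },
);

for my $snp (@snps) {
    # A single-base 'variation' feature at the SNP position
    my $feat = Bio::SeqFeature::Generic->new(
        -start       => $snp->{pos},
        -end         => $snp->{pos},
        -strand      => 1,
        -primary_tag => 'variation',
    );

    # One /replace qualifier per allele, plus a /db_xref pointing at dbSNP
    $feat->add_tag_value( 'replace', $_ ) for @{ $snp->{alleles} };
    $feat->add_tag_value( 'db_xref', 'dbSNP:' . $snp->{dbsnp} );

    $seq->add_SeqFeature($feat);
}

# Write the annotated record in GenBank format to STDOUT
my $out = Bio::SeqIO->new( -format => 'genbank', -fh => \*STDOUT );
$out->write_seq($seq);
```

The SNP positions and alleles would still have to come from somewhere, for
example from the file written by Chris's efetch call quoted below; parsing that
file is not shown here.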


On Fri, Jun 11, 2010 at 12:30 PM, Chris Fields <cjfields at illinois.edu> wrote:

> My guess is that NCBI has something that does this internally, and the
> result is either cached or run on-the-fly.  When I retrieved the full-length
> record from NCBI it lacked SNPs as well.
>
> chris
>
> On Jun 11, 2010, at 11:26 AM, Jessica Sun wrote:
>
> > Great! But how do you automatically add each SNP to the gbk file as a
> > feature named "variation"?
> >
> > thx
> >
> >
> > On Thu, Jun 10, 2010 at 4:11 PM, Dave Messina <David.Messina at sbc.su.se> wrote:
> >
> >> Nice, Chris!
> >>
> >> I've added it to the EUtils cookbook.
> >>
> >> Dave
> >>
> >>
> >>
> >> On Jun 10, 2010, at 2:06 AM, Chris Fields wrote:
> >>
> >>> It's much easier to work with the GI than the accession.  NCBI
> >>> unfortunately just recently 'broke' their acc->gi stuff via efetch; you
> >>> have to use rettype='seqid' and munge ASN.1 to get everything (though it
> >>> is nice in a way for ID mapping).
> >>>
> >>> After the initial step of grabbing the GI for NG_011506, though, you can
> >>> use elink to grab the SNP IDs, then use efetch to get the actual SNP
> >>> files, or esummary for the summary info.
> >>>
> >>> #!/usr/bin/perl -w
> >>>
> >>> use Modern::Perl;
> >>> use Bio::DB::EUtilities;
> >>>
> >>> # GI of the nuccore record to look up
> >>> my $id = '224809339';
> >>>
> >>> # elink from nuccore to snp; neighbor_history keeps the linked SNP IDs
> >>> # on the NCBI history server
> >>> my $eutil = Bio::DB::EUtilities->new(-eutil   => 'elink',
> >>>                                      -id      => $id,
> >>>                                      -email   => 'setyourown@foo.bar',
> >>>                                      -verbose => 1,
> >>>                                      -dbfrom  => 'nuccore',
> >>>                                      -db      => 'snp',
> >>>                                      -cmd     => 'neighbor_history',
> >>> );
> >>>
> >>> my $hist = $eutil->next_History || die "No history data returned";
> >>>
> >>> # reuse the history to fetch the linked SNP records as text
> >>> $eutil->set_parameters(-eutil   => 'efetch',
> >>>                        -history => $hist,
> >>>                        -retmode => 'text',
> >>>                        # 'chr', 'flt', 'brief', 'rsr', 'docset'
> >>>                        -rettype => 'chr',
> >>> );
> >>>
> >>> $eutil->get_Response(-file => 'snps.txt');
> >>>
> >>> # or ...
> >>>
> >>> $eutil->set_parameters(-eutil   => 'esummary',
> >>>                        -history => $hist,
> >>> );
> >>>
> >>> $eutil->print_all;
> >>>
> >>> # chris
> >>>
> >>> On Jun 9, 2010, at 1:37 PM, Jessica Sun wrote:
> >>>
> >>>> Thanks Dave.
> >>>> "(the variation information is not present in the version of
> >>>> NG_011506 I found at Genbank.)" -- Yes, but if you click "Customize
> >>>> view" on the right side, there is a check box "Features added by
> >>>> NCBI: 209 SNPs"; if you check that, it adds all the variations to the
> >>>> gbk format. I think it would be a neat feature if BioPerl could load
> >>>> these automatically with an option turned on.
> >>>>
> >>>>
> >>>>
> >>>> On Wed, Jun 9, 2010 at 1:51 PM, Dave Messina <David.Messina at sbc.su.se> wrote:
> >>>>
> >>>>> Hi Jessica,
> >>>>>
> >>>>> Please keep the BioPerl list on the Cc line so everyone can follow
> >>>>> along.
> >>>>>
> >>>>>
> >>>>>> Following your approach, it does not seem to me that you can include
> >>>>>> the Variation tag, which lists the known dbSNP location, id and
> >>>>>> allele changes?
> >>>>>
> >>>>> Ah okay, I assumed the file you attached was obtained directly from
> >>>>> Genbank and that the variation info therein was already included. (It
> >>>>> appears that's not the case; the variation information is not present
> >>>>> in the version of NG_011506 I found at Genbank.)
> >>>>>
> >>>>> If you want to include your own custom information in a genbank file,
> >>>>> you'll have to pull it out of dbSNP (or wherever the variation info
> >>>>> is). There are a couple of scripts that might be able to help with
> >>>>> that (search for snp):
> >>>>>
> >>>>>     http://www.bioperl.org/wiki/Bioperl_scripts
> >>>>>
> >>>>>
> >>>>> You can then insert them into a RichSeq object as features and output
> >>>>> in genbank format. For that part, see the HOWTO:
> >>>>>
> >>>>>     http://www.bioperl.org/wiki/HOWTO:Feature-Annotation
> >>>>>
> >>>>>
> >>>>> Dave
> >>>>>
> >>>>>
> >>>>
> >>>>
> >>>> --
> >>>> Jessica Jingping Sun
> >>>>
> >>>> _______________________________________________
> >>>> Bioperl-l mailing list
> >>>> Bioperl-l at lists.open-bio.org
> >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >>>
> >>
> >>
> >
> >
> > --
> > Jessica Jingping Sun
> >
>
>


-- 
Jessica Jingping Sun



