[Bioperl-l] Get variation included in genbank file

Chris Fields cjfields at illinois.edu
Fri Jun 11 16:30:59 UTC 2010


My guess is that NCBI has something that does this internally, and the result is either cached or run on-the-fly.  When I retrieved the full-length record from NCBI it lacked SNPs as well.

chris
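
For the question quoted below (adding each SNP as a variation feature in the GenBank output), here is a minimal sketch of the approach Dave describes further down: build features and write the record out in genbank format. Everything specific in it is an assumption for illustration: the toy sequence, the record ID, and the two SNP entries (IDs, positions, alleles) are hypothetical placeholders, and in practice the SNP data would have to be parsed from whatever dbSNP output you fetch. It is not what NCBI does internally.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Bio::Seq::RichSeq;
use Bio::SeqFeature::Generic;
use Bio::SeqIO;

# Toy record standing in for the real sequence (placeholder ID and sequence).
my $seq = Bio::Seq::RichSeq->new(
    -display_id       => 'TOY_RECORD',
    -accession_number => 'TOY_RECORD',
    -alphabet         => 'dna',
    -seq              => 'ACGTACGTACGTACGTACGT',
);

# Hypothetical SNPs; real ones would come from parsed dbSNP data.
my @snps = (
    { rs => '0000001', pos => 5,  alleles => [ 'a', 'g' ] },
    { rs => '0000002', pos => 12, alleles => [ 'c', 't' ] },
);

# Turn each SNP into a single-base 'variation' feature on the record.
for my $snp (@snps) {
    my $feat = Bio::SeqFeature::Generic->new(
        -start       => $snp->{pos},
        -end         => $snp->{pos},
        -strand      => 1,
        -primary_tag => 'variation',
        -tag         => {
            replace => $snp->{alleles},
            db_xref => "dbSNP:$snp->{rs}",
        },
    );
    $seq->add_SeqFeature($feat);
}

# Write GenBank format into a string (use -file => '>out.gbk' for a file).
my $genbank = '';
open my $fh, '>', \$genbank or die "Cannot open in-memory filehandle: $!";
my $out = Bio::SeqIO->new(-fh => $fh, -format => 'genbank');
$out->write_seq($seq);
$out->close;

print $genbank;
```

To annotate an existing record instead of a toy one, the same loop should work on a sequence read with Bio::SeqIO (-format => 'genbank') and next_seq.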

On Jun 11, 2010, at 11:26 AM, Jessica Sun wrote:

> Great! But how do you add each SNP as a feature named "variation"
> to the GenBank (gbk) file automatically?
> 
> thx
> 
> 
> On Thu, Jun 10, 2010 at 4:11 PM, Dave Messina <David.Messina at sbc.su.se> wrote:
> 
>> Nice, Chris!
>> 
>> I've added it to the EUtils cookbook.
>> 
>> Dave
>> 
>> 
>> 
>> On Jun 10, 2010, at 2:06 AM, Chris Fields wrote:
>> 
>>> It's much easier to work with the GI than the accession.  NCBI
>>> unfortunately just recently 'broke' their acc->gi stuff via efetch; you
>>> have to use rettype='seqid' and munge ASN.1 to get everything (though it
>>> is nice in a way for ID mapping).
>>> 
>>> After the initial step of grabbing the GI for NG_011506, though, you can
>>> use elink to grab the SNP IDs, then use efetch to get the actual SNP
>>> files, or esummary for the summary info.
>>> 
>>> #!/usr/bin/perl -w
>>> 
>>> use Modern::Perl;
>>> use Bio::DB::EUtilities;
>>> 
>>> # GI of the nuccore record (NG_011506 in this thread)
>>> my $id = '224809339';
>>> 
>>> # Link from the nucleotide record to its SNPs in dbSNP and keep the
>>> # result on the NCBI history server.
>>> my $eutil = Bio::DB::EUtilities->new(-eutil   => 'elink',
>>>                                      -id      => $id,
>>>                                      -email   => 'setyourown at foo.bar',
>>>                                      -verbose => 1,
>>>                                      -dbfrom  => 'nuccore',
>>>                                      -db      => 'snp',
>>>                                      -cmd     => 'neighbor_history',
>>> );
>>> 
>>> my $hist = $eutil->next_History || die "No history data returned";
>>> 
>>> # Fetch the linked SNP records via the history and save them to a file.
>>> $eutil->set_parameters(-eutil   => 'efetch',
>>>                        -history => $hist,
>>>                        -retmode => 'text',
>>>                        # 'chr', 'flt', 'brief', 'rsr', 'docset'
>>>                        -rettype => 'chr'
>>> );
>>> 
>>> $eutil->get_Response(-file => 'snps.txt');
>>> 
>>> # or ...
>>> 
>>> # Get summary info for the same SNPs and print it.
>>> $eutil->set_parameters(-eutil   => 'esummary',
>>>                        -history => $hist,
>>> );
>>> 
>>> $eutil->print_all;
>>> 
>>> # chris
>>> 
>>> On Jun 9, 2010, at 1:37 PM, Jessica Sun wrote:
>>> 
>>>> Thanks Dave.
>>>> "the variation information is not present in the version of NG_011506 I
>>>> found at Genbank.)" -- Yes. If you open "Customize view" on the right
>>>> side of the page, there is a check box under "Features added by NCBI"
>>>> for 209 SNPs; checking it adds all the variations to the gbk format. It
>>>> would be a neat feature if BioPerl could load these automatically with
>>>> an option turned on.
>>>> 
>>>> 
>>>> 
>>>> On Wed, Jun 9, 2010 at 1:51 PM, Dave Messina <David.Messina at sbc.su.se> wrote:
>>>> 
>>>>> Hi Jessica,
>>>>> 
>>>>> Please keep the BioPerl list on the Cc line so everyone can follow
>>>>> along.
>>>>> 
>>>>> 
>>>>>> Following your approach, it does not seem to me that you can include
>>>>>> the Variation tag, which lists the known dbSNP location, ID and
>>>>>> allele changes?
>>>>> 
>>>>> Ah okay, I assumed the file you attached was obtained directly from
>>>>> Genbank and that the variation info therein was already included. (It
>>>>> appears that's not the case: the variation information is not present
>>>>> in the version of NG_011506 I found at Genbank.)
>>>>> 
>>>>> If you want to include your own custom information in a genbank file,
>>>>> you'll have to pull it out of dbSNP (or wherever the variation info
>>>>> is). There are a couple of scripts that might be able to help with
>>>>> that (search for snp):
>>>>> 
>>>>>     http://www.bioperl.org/wiki/Bioperl_scripts
>>>>> 
>>>>> 
>>>>> You can then insert them into a RichSeq object as features and output
>>>>> in genbank format. For that part, see the HOWTO:
>>>>> 
>>>>>     http://www.bioperl.org/wiki/HOWTO:Feature-Annotation
>>>>> 
>>>>> 
>>>>> Dave
>>>>> 
>>>>> 
>>>> 
>>>> 
>>>> --
>>>> Jessica Jingping Sun
>>>> 
>>>> _______________________________________________
>>>> Bioperl-l mailing list
>>>> Bioperl-l at lists.open-bio.org
>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>> 
>> 
>> 
> 
> 
> -- 
> Jessica Jingping Sun
> 




