[Bioperl-l] Polyproteins, ribo slippage, and mat_peptide in viruses?
bill at genenformics.com
Tue Oct 27 17:47:02 EDT 2009
These mature proteins do have GI numbers. They are listed in gene2accession:
ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/gene2accession.gz
>grep NC_001959 gene2accession
11983 1491970 REVIEWED - - NP_786945.1 28416959 NC_001959.2 106060735 4 5373 + -
11983 1491970 REVIEWED - - NP_786946.1 28416960 NC_001959.2 106060735 4 5373 + -
11983 1491970 REVIEWED - - NP_786947.1 28416961 NC_001959.2 106060735 4 5373 + -
11983 1491970 REVIEWED - - NP_786948.1 28416962 NC_001959.2 106060735 4 5373 + -
11983 1491970 REVIEWED - - NP_786949.1 28416963 NC_001959.2 106060735 4 5373 + -
11983 1491970 REVIEWED - - NP_786950.1 28416964 NC_001959.2 106060735 4 5373 + -
11983 1491971 PROVISIONAL - - NP_056822.1 9630806 NC_001959.2 106060735 6949 7587 + -
11983 1491972 PROVISIONAL - - NP_056821.2 106060736 NC_001959.2 106060735 5357 6949 + -
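
If you would rather do this lookup in a script than with grep, here is a minimal Python sketch. It groups the protein accessions and GIs by GeneID for one genomic accession. The column positions are an assumption read off the 13-field rows above (GeneID in field 2, protein accession and GI in fields 6 and 7, genomic accession in field 8); check them against the header line of your copy of the file. The names proteins_by_gene and SAMPLE are made up for illustration.

```python
from collections import defaultdict

# 0-based column positions, assumed from the rows shown above.
GENE_ID, PROTEIN_ACC, PROTEIN_GI, GENOMIC_ACC = 1, 5, 6, 7


def proteins_by_gene(lines, accession):
    """Return {GeneID: [(protein accession, protein GI), ...]} for the
    rows whose genomic accession (version suffix ignored) equals `accession`."""
    groups = defaultdict(list)
    for line in lines:
        if line.startswith("#"):
            continue  # skip a header/comment line if present
        # The downloaded file is tab-separated; rows pasted into a mail
        # are space-separated, so accept both.
        fields = line.rstrip("\n").split("\t") if "\t" in line else line.split()
        if len(fields) <= GENOMIC_ACC:
            continue  # too short to be a data row
        if fields[GENOMIC_ACC].split(".")[0] != accession:
            continue
        groups[fields[GENE_ID]].append((fields[PROTEIN_ACC], fields[PROTEIN_GI]))
    return dict(groups)


# Two rows copied from the grep output above, used as a small sample.
SAMPLE = [
    "11983 1491970 REVIEWED - - NP_786945.1 28416959 NC_001959.2 106060735 4 5373 + -",
    "11983 1491971 PROVISIONAL - - NP_056822.1 9630806 NC_001959.2 106060735 6949 7587 + -",
]

if __name__ == "__main__":
    for gene_id, proteins in proteins_by_gene(SAMPLE, "NC_001959").items():
        print(gene_id, proteins)
```

To run it over the real file, pass an open handle instead of SAMPLE, for example gzip.open("gene2accession.gz", "rt") from the standard library. In the output above, the six NP_7869xx entries all share GeneID 1491970 and the same 4..5373 span, so they end up in one group.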
Bill
> On Tue, Oct 27, 2009 at 8:46 PM, Chris Fields <cjfields at illinois.edu>
> wrote:
>>>
>>> Ah. That's a shame. I did just take a few minutes to try out the
>>> EFetch idea (using Biopython) and it does work beautifully for
>>> this "nice" example virus which the NCBI have annotated.
>>
>> Interesting thing about that example: if you follow the hyperlinks for
>> the mat_peptide feature key, they relate back to the full protein
>> sequence with from/to, not to the protein_id for the feature. Example:
>>
>> # link from the first mat_peptide
>> http://www.ncbi.nlm.nih.gov/protein/9630804?from=1&to=398&report=gpwithparts
>>
>> # protein_id
>> http://www.ncbi.nlm.nih.gov/protein/28416959
>
> Right - the protein ID link is just based on the GI number, 28416959.
> This link (or EFetch) gives you the (short) mature peptide.
>
>> This record doesn't appear to contain any mapping information along
>> those lines, which makes me think this is an autogenerated record using
>> the Gene record, which does have those mappings:
>>
>> http://www.ncbi.nlm.nih.gov/sites/entrez?db=gene&cmd=Retrieve&dopt=full_report&list_uids=1491970
>
> Are you suggesting one option is (if the mat_peptide annotation
> is lacking a protein or GI number) to go online to the Gene
> database using the gene ID of the precursor (parent) protein
> to find the IDs of the mature (child) peptides?
>
> Peter
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>