[Bioperl-l] Assistance with a BioPerl/Perl project

Thu Mar 24 12:54:50 EST 2005

Hello list,

I am a 22-year-old bioinformatics and molecular biology major at the
University of Denver. I just accepted a position with a researcher here, and
already have a first assignment. We are working on a comprehensive
chromosome 21 gene database and map, and my first task is to update a list of
known (and curated) human chromosome 21 genes. I have quickly become
familiar with BioPerl; however, my adviser needs me to use Entrez Gene to
compare the currently known Chr 21 genes (from the query '21[CHR] AND Homo
sapiens[ORGN] AND NOT Pseudogene') with a list of genes that she has
provided in xls and xml format.

The idea is to take the accession numbers in the provided files, retrieve the
nucleotide sequence for each, and compare those against the sequences for the
records returned by the Entrez Gene query, in order to find any newly
annotated (or newly discovered/elucidated) genes. I am familiar
with the current problem that BioPerl cannot directly parse the
EntrezGene object, but I have experimented with Bio::SeqIO::Gene2accession (and
geneinfo) and the egparser. My programming skills are not completely up to
par, so egparser is tough for me to grasp. Bio::SeqIO::Gene2accession is
more intuitive; however, I am having a terrible time figuring out how to
convert my desired entrezgene results into the legacy gene_info and
gene2accession formats. Any suggestions are greatly appreciated. I am very
new at this, so very simple coding examples and explanations are
the best way for me to learn.
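
To make the goal concrete, here is a rough sketch of the final comparison
step as I picture it, in plain Perl with no BioPerl calls. It is a
simplification: it compares identifiers rather than sequences, and it assumes
both lists have already been reduced to plain ID strings. The sub name
new_ids and the IDs themselves are made-up placeholders, not real records or
part of any library.

```perl
use strict;
use warnings;

# Return the IDs that appear in the Entrez Gene list but not in the
# curated list. Both arguments are array references of plain ID strings.
# (new_ids is a hypothetical helper name, not a BioPerl function.)
sub new_ids {
    my ($curated, $entrez) = @_;

    # Build a lookup table of curated IDs, ignoring case.
    my %seen;
    $seen{ uc($_) } = 1 for @$curated;

    # Keep only the Entrez IDs that are missing from the lookup table.
    return grep { !$seen{ uc($_) } } @$entrez;
}

# Placeholder IDs for illustration only.
my @curated = qw(GENE_A GENE_B GENE_C);
my @entrez  = qw(GENE_A GENE_B GENE_C GENE_D);

print "$_\n" for new_ids(\@curated, \@entrez);   # prints GENE_D
```

What I cannot work out is how to get from the Entrez Gene results to lists
like these in the first place.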

Thanks all!

colin