[Bioperl-l] Getting UTRs

Stephen Baird sbaird at mgcheo3.med.uottawa.ca
Tue Apr 29 17:33:14 EDT 2003


Mike,
 Having looked a bit at the UTRs, I find they are not annotated
consistently in the feature table.  The beginning and end of the CDS are
the only consistent landmarks. For the outer ends of the UTRs you will
have to use 'source', 'mRNA', or 'gene', depending on the entry. Entries
with multiple, alternatively spliced transcripts are more troublesome.
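
  As a rough sketch of the arithmetic only: assuming a forward-strand
entry, 1-based inclusive coordinates as in the feature table, and an
outer boundary taken from whichever of 'source', 'mRNA', or 'gene' the
entry provides, the UTR ranges fall out as below. This is plain Python
rather than bioperl, the function name utr_ranges is made up for
illustration, and it ignores the reverse strand and any introns inside
the UTRs.

```python
def utr_ranges(outer_start, outer_end, cds_start, cds_end):
    """Return (five_prime, three_prime) UTR ranges.

    All coordinates are 1-based and inclusive. Each range is a
    (start, end) tuple, or None when that UTR is absent.
    Assumption: forward strand; outer_* come from 'source', 'mRNA',
    or 'gene', and cds_* are the outermost bases of the CDS (first
    base of the first exon, last base of the last exon).
    """
    if not (outer_start <= cds_start <= cds_end <= outer_end):
        raise ValueError("CDS must lie within the outer feature")

    # 5' UTR stops one base before the CDS begins.
    five = (outer_start, cds_start - 1) if cds_start > outer_start else None
    # 3' UTR starts one base after the CDS ends.
    three = (cds_end + 1, outer_end) if cds_end < outer_end else None
    return five, three
```

For an invented entry with source 1..2000 and CDS 151..1650, this gives
a 5' UTR of 1..150 and a 3' UTR of 1651..2000.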
  One easier route is to use the RefSeq database, which is somewhat
more consistent, or to download the non-redundant database of UTRs
from http://bighost.area.ba.cnr.it/BIG/UTRHome/ which is in EMBL
format.  One note on parsing UTRdb with bioperl: the DR line carrying
the original entry's accession number must end with a "." or it will
not parse properly.
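
  One way to patch the file before parsing might look like the sketch
below. It is plain Python, the function name fix_dr_lines is
hypothetical, and it assumes only that the affected lines begin with
the "DR" line code followed by a space; the exact layout of the DR
lines in UTRdb is not checked here.

```python
def fix_dr_lines(lines):
    """Yield each line, adding a trailing '.' to DR lines that lack one.

    Assumption: DR lines start with 'DR ' as in EMBL flat files.
    Original line endings are preserved.
    """
    for line in lines:
        body = line.rstrip("\r\n")      # text without the newline
        newline = line[len(body):]      # whatever line ending was there
        stripped = body.rstrip()
        if body.startswith("DR ") and not stripped.endswith("."):
            yield stripped + "." + newline
        else:
            yield line
```

Run every line of the downloaded file through fix_dr_lines, write the
result to a new file, and give that file to the bioperl EMBL parser
instead of the original.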

Stephen Baird
Molecular Genetics
Children's Hospital of Eastern Ontario
Ottawa, Ontario
Canada

On Tue, 29 Apr 2003, Michael Muratet wrote:

> Greetings
>
> This may be a question for the database partners who define features (I
> did ask Genbank), but here it is:
>
> When dealing with a record that has a CDS feature (which may have
> introns) and no mRNA feature, how does one (or bioperl) deal with the
> UTRs? Is it as simple as the first base of source to the first base of
> CDS for 5' and the last base of CDS to the last base of source for 3'?
> The answer I got from Genbank was in terms of an mRNA feature. What if
> there isn't one?
>
> Cheers
>
> Mike
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at bioperl.org
> http://bioperl.org/mailman/listinfo/bioperl-l
>


