[Bioperl-l] Indexing CDS file
Chris Fields
cjfields at illinois.edu
Wed Feb 11 08:24:30 EST 2009
I'm guessing the PA line would be similar to DBSOURCE in GenPept files.
Could probably use Bio::Annotation::DBLink or Bio::Annotation::Target
for it (if it corresponds to a particular subset of the sequence).
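
For what it's worth, here is a rough sketch of the fallback logic Dave
describes below: use 'PA' when there is no 'AC' line, and turn 'OX' into
the same taxon db_xref a source feature would carry. It is plain Python
rather than BioPerl and is not the EMBL parser's actual code. The function
name parse_cds_header is made up for illustration, and the OX value format
(NCBI_TaxID=9606;) is an assumption; the README only says the line holds
the NCBI taxid.

```python
import re


def parse_cds_header(lines):
    """Collect an accession and a taxon db_xref from one record's header lines.

    Hypothetical helper: assumes two-letter line codes followed by spaces,
    and an OX value of the form 'NCBI_TaxID=<digits>;' (an assumption).
    """
    accession = None          # first accession on an AC line, if any
    parent_accession = None   # accession.version of the parent entry (PA line)
    taxon_id = None           # NCBI taxid taken from the OX line

    for line in lines:
        tag, _, value = line.partition(" ")
        value = value.strip()
        if tag == "AC" and accession is None:
            # AC lines may list several accessions separated by ';'
            accession = value.split(";")[0].strip()
        elif tag == "PA":
            parent_accession = value.rstrip(";").strip()
        elif tag == "OX":
            match = re.search(r"NCBI_TaxID=(\d+)", value)
            if match:
                taxon_id = int(match.group(1))

    return {
        # Fall back to the parent's accession only when no AC line was seen
        "accession": accession or parent_accession,
        "parent_accession": parent_accession,
        # Same tag/value shape as the source feature's db_xref
        "db_xref": f"taxon:{taxon_id}" if taxon_id is not None else None,
    }
```

Keeping parent_accession as its own field, instead of only overwriting the
accession, leaves room to store it as a DBLink-style annotation as well.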
chris
On Feb 11, 2009, at 6:44 AM, Heikki Lehvaslaiho wrote:
> Dave,
>
> Looks good. Are you going to make the changes to the EMBL parser?
>
> -Heikki
>
> 2009/2/11 Dave Messina <David.Messina at sbc.su.se>:
>> Thanks, Heikki.
>>
>> I took a closer look at the EBI ftp site where Sviya and I got the file,
>> and in their README
>> (ftp://ftp.ebi.ac.uk/pub/databases/embl/cds/README.txt) it says:
>>
>> PA line - contains the accession.version of the "parent" EMBL entry
>> (entry where the CDS is annotated)
>>
>>
>> So, unfortunately they've decided that a CDS record, which has no
>> accession of its own, doesn't get its parent's accession number, but
>> gets to refer to its parent's accession number via the PA line.
>>
>> Furthermore, there's an
>>
>> OX line - contains the NCBI taxid for the organism; taxonomic data are
>> taken from the parent EMBL entries
>>
>> which is also not part of the formal spec (although this one is a more
>> worthwhile addition, IMO).
>>
>> Sooooo, I think we'll need to add support for these.
>>
>> 'PA' seems easy enough -- the EMBL parser can look for it if there isn't
>> an 'AC' line.
>>
>> As for 'OX', is there a standard slot for a taxonID in a RichSeq
>> SeqFeature table? Coming from a Genbank record or a vanilla EMBL record,
>> this is normally encoded as
>>
>> primary tag: source
>> tag: db_xref
>> value: taxon:9606
>>
>> right?
>>
>> Should we do the same if we're coming from an EMBL CDS entry, even though
>> it's not actually in the feature table?
>>
>>
>> Dave
>>
>>
>
>
>
> --
> -Heikki
> Heikki Lehvaslaiho - heikki lehvaslaiho gmail com
> Sent from: Johannesburg Gauteng South Africa.