[Bioperl-l] SeqIO; writing to custom out format?

Brian Osborne brian_osborne at cognia.com
Mon Feb 24 08:33:14 EST 2003


Charles,

Postgres, yes! I'm pleading ignorance of the details, but a lot of work has
gone into biosql, chado, and bioperl-db over the last few weeks. You should
take a look at the latest bioperl-db; I believe you will find postgres
compatibility there. You should be able to find it here:

http://cvs.open-bio.org/

However, I can't say anything useful about Genbank and chado. I think you
want to discuss this with Chris Mungall, on this list.

Brian O.
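
On the original question about the FASTA header, here is a minimal sketch
(not from the thread) of one way it might be done with Bio::SeqIO: read each
GenBank record, take the first /gene qualifier found among its features, set
that as the display ID and the accession as the description, and let the
fasta writer emit ">gene accession". The helper names first_gene_name and
genbank_to_gene_fasta are hypothetical, and using only the first gene per
record is an assumption; records with several genes would need per-feature
handling.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the first /gene value found among the given features, or undef.
# Only relies on has_tag() and get_tag_values() from the feature objects.
sub first_gene_name {
    my @feats = @_;
    for my $feat (@feats) {
        next unless $feat->has_tag('gene');
        my ($gene) = $feat->get_tag_values('gene');
        return $gene;
    }
    return;
}

# Hypothetical driver: GenBank in, FASTA out with ">gene accession" headers.
sub genbank_to_gene_fasta {
    my ($infile, $outfile) = @_;
    require Bio::SeqIO;    # loaded lazily; needs BioPerl installed

    my $in  = Bio::SeqIO->new(-file => $infile,     -format => 'genbank');
    my $out = Bio::SeqIO->new(-file => ">$outfile", -format => 'fasta');

    while (my $seq = $in->next_seq) {
        my $acc  = $seq->accession_number;
        my $gene = first_gene_name($seq->get_SeqFeatures);

        # Assumption: fall back to the accession when no /gene is present.
        $seq->display_id(defined $gene ? $gene : $acc);
        $seq->desc($acc);
        $out->write_seq($seq);
    }
    return;
}

# Usage: perl gb2fasta.pl input.gb output.fa
genbank_to_gene_fasta(@ARGV) if @ARGV == 2;
```

Whether this or a GFF dump is the better input for a database load depends
on the schema; the sketch only covers the header format asked about.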

-----Original Message-----
From: bioperl-l-bounces at bioperl.org [mailto:bioperl-l-bounces at bioperl.org]
On Behalf Of Charles Hauser
Sent: Friday, February 21, 2003 10:05 AM
To: Brian Osborne
Cc: BioPerl-List
Subject: Re: [Bioperl-l] SeqIO; writing to custom out format?

Hi Brian,

Primary aim is to parse genbank data into chado schema (postgres).

In addition, I really need a better way to manage the data locally. I
have been swamped of late and have not had time to look into biosql, but
it may be well suited for the task.  My impression was that it was a
MySQL DB, is there a postgres port?

I have a postgres EST/contig DB on the web server for web-based queries.

In addition to the genbank data I have lots of EST sequences, associated
quality, and several sets of contigs assembled from the ESTs to
store/manage/retrieve.

I know I am depending too heavily on data retrieval from flat files -
just have not made the time to get it done yet.

All suggestions/pointers most welcome.

regards,

Charles


On Thu, 2003-02-20 at 20:39, Brian Osborne wrote:
> Charles,
>
> Do you mean you'd like to load your Genbank files into postgres? Do you
> need to use your own schema or can you use the biosql database? Are you
> simply going to discard the fasta files after? Excuse the many
> questions but the answers are slightly different depending on what you
> want to do.
>
> Brian O.
>
>
> On Friday, February 21, 2003, at 05:22 AM, Charles Hauser wrote:
>
> > All,
> >
> > I think there is a clean way using SeqIO to write to a custom format,
> > but am missing it.
> >
> > Parsing genbank files, I would like to write a modified fasta outfile
> > which includes/or uses the gene name as the top line in lieu of the
> > default
> >     $name = format_name($feat->_tag_value('gene'));
> >  to generate :
> >
> > > 'gene name'        'accession'
> > seq
> >
> >
> > Or am I better off outputting a GFF file?
> >
> > I am going to be using these to load a database(postgres).
> >
> > regards,
> >
> > Chuck
> >
> >
> > my %outfile = ('Cr' => {
> >                         'Fasta' => Bio::SeqIO->new('-file'   => '>Cr.fa',
> >                                                    '-format' => 'fasta')
> >                        }
> >                );
> >
> >
> > FEATURES             Location/Qualifiers
> >      source          1..5131
> >                      /organism="Chlamydomonas reinhardtii"
> >                      /strain="2137"
> >                      /db_xref="taxon:3055"
> >                      /dev_stage="vegetative"
> >      gene            join(21..117,199..264,618..685,1031..1123,2513..2578,
> >                      2892..3023,3355..3505,3906..4109,4383..4498)
> >                      /gene="Pgp1"
> >      CDS             join(21..117,199..264,618..685,1031..1123,2513..2578,
> >                      2892..3023,3355..3505,3906..4109,4383..4498)
> >                      /gene="Pgp1"
> >                      /codon_start=1
> >                      /product="phosphoglycolate phosphatase precursor"
> >                      /protein_id="BAC56941.1"
> >
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at bioperl.org
> > http://bioperl.org/mailman/listinfo/bioperl-l
> >
>
>





