[BioSQL-l] Are there any tools that can convert a bacterial accession number (whole genome) to ffn format (gene multi-FASTA)?

Peter biopython at maubp.freeserve.co.uk
Wed Jan 5 09:40:54 UTC 2011


On Wed, Jan 5, 2011 at 9:39 AM, Peter <biopython at maubp.freeserve.co.uk> wrote:
> On Wed, Jan 5, 2011 at 8:58 AM, 徐朋 <xupeng86 at gmail.com> wrote:
>>
>> Are there any tools that can convert a bacterial accession number (whole
>> genome) to ffn format (gene multi-FASTA),
>
> You can download *.ffn files from the NCBI's FTP site, e.g.
> ftp://ftp.ncbi.nih.gov/genomes/Bacteria/
>
> If you want most/all of the available genomes as ffn files, I would
> just download them all as a gzipped file:
> ftp://ftp.ncbi.nih.gov/genomes/Bacteria/all.ffn.tar.gz
>
> Alternatively, you can probably do this via the NCBI Entrez API.
> I've not tried this though. My guess is you'd need to map the genome
> accession to a list of gene IDs (using ELink), then fetch them
> as FASTA entries (using EFetch).

All of the above remarks would apply to BioPerl, Biopython, etc
(and are not really relevant to the BioSQL mailing list).
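
For the Entrez route, here is a minimal Python sketch using only the
standard library. Note that it does not follow the ELink-then-EFetch
guess above. Instead it assumes a single EFetch call on the nuccore
database with rettype=fasta_cds_na, which asks NCBI for the nucleotide
sequences of the annotated coding regions. The function names, the
accession and the email address are placeholders chosen for
illustration, and the output is not guaranteed to match the FTP *.ffn
files entry for entry.

```python
import shutil
from urllib.parse import urlencode
from urllib.request import urlopen

# Base URL of the NCBI E-utilities EFetch service.
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def cds_fasta_url(accession, email):
    """Build an EFetch URL requesting the CDS nucleotide FASTA of one record."""
    params = {
        "db": "nuccore",            # nucleotide database
        "id": accession,            # e.g. a genome accession (placeholder)
        "rettype": "fasta_cds_na",  # coding sequences as nucleotide FASTA
        "retmode": "text",
        "email": email,             # NCBI asks callers to identify themselves
    }
    return EFETCH + "?" + urlencode(params)


def download_cds_fasta(accession, email, filename):
    """Save the CDS FASTA for `accession` to `filename` (needs network access)."""
    with urlopen(cds_fasta_url(accession, email)) as response:
        with open(filename, "wb") as out:
            shutil.copyfileobj(response, out)
```

For example, download_cds_fasta("NC_000913", "you@example.org",
"NC_000913.ffn") would write one multi-FASTA file for that accession.
For many genomes the FTP tarball is still the kinder option for the
NCBI servers.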

>> or that can convert sequences in BioSQL to GenBank files?
>>
>> Many thanks!
>
> If you have loaded the genomes into a BioSQL database (e.g.
> from the GenBank files), then you can easily get the genomes
> back again as SeqRecord objects, and save those as GenBank
> files. However, in order to get the nucleotide sequences of the
> genes you would have to use the SeqFeature objects and their
> extract method.

The above applies if you are using BioSQL with Biopython.
I would expect BioPerl etc to offer similar functionality.
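
As a rough Biopython sketch of the BioSQL route described above: look
up a genome by accession, write it back out as GenBank, and use each
feature's extract method to build a gene multi-FASTA file. It assumes
that the genes wanted are the CDS features, that locus_tag is a
suitable name for each entry, and that the genome was loaded from a
GenBank file. The helper names, the namespace and the file stem are
hypothetical.

```python
def fasta_entry(title, sequence, width=70):
    """Return one FASTA entry with the sequence wrapped at `width` columns."""
    seq = str(sequence)
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return ">" + title + "\n" + "\n".join(lines) + "\n"


def gene_fasta(record, feature_type="CDS"):
    """Yield a FASTA entry for each feature of the given type in a SeqRecord."""
    for index, feature in enumerate(record.features):
        if feature.type != feature_type:
            continue
        # Qualifier values are lists; fall back to a positional name
        # when the feature has no locus_tag.
        default = "%s_feature_%d" % (record.id, index)
        name = feature.qualifiers.get("locus_tag", [default])[0]
        # extract() takes care of the strand and of joined locations.
        yield fasta_entry(name, feature.extract(record.seq))


def export_genome(server, namespace, accession, stem):
    """Write <stem>.gbk and <stem>.ffn for one genome held in BioSQL."""
    # Imported here so the helpers above can be used without Biopython.
    from Bio import SeqIO

    record = server[namespace].lookup(accession=accession)
    with open(stem + ".gbk", "w") as handle:
        SeqIO.write(record, handle, "genbank")
    with open(stem + ".ffn", "w") as handle:
        handle.writelines(gene_fasta(record))
```

Here server would come from BioSQL's BioSeqDatabase.open_database(...)
with whatever driver and credentials your database uses, and namespace
is the name of the sub-database the genomes were loaded into.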

Peter



