[Bioperl-l] Refseq Version

shalu sharma sharmashalu.bio at gmail.com
Mon Feb 8 16:37:54 UTC 2010


Thanks a lot, Russell.
But I am still confused. I asked the server admin, and he said that the
RefSeq database I am using is the latest version.
But the number of sequences I get from the BLAST report does not
match the RefSeq 38 release (or I don't know which numbers to
match).

For example, from the BLAST report I get:
$ fastacmd -I -d /db/ncbiblast/refseq/refseq_protein
Database: NCBI Protein Reference Sequences
           7,585,993 sequences; 2,644,770,521 total letters
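
(A minimal sketch for pulling the two figures out of the summary line above so they can be compared with other sources. It assumes the `fastacmd -I` summary always has the "N sequences; M total letters" layout shown here; the function name `parse_fastacmd_summary` is made up for illustration.)

```python
import re

# Summary line as printed by `fastacmd -I` in the output above.
SUMMARY = "           7,585,993 sequences; 2,644,770,521 total letters"


def parse_fastacmd_summary(text):
    """Return (sequences, total_letters) from a fastacmd -I summary line.

    Assumes the "<n> sequences; <m> total letters" layout shown above.
    """
    match = re.search(r"([\d,]+)\s+sequences;\s+([\d,]+)\s+total letters", text)
    if match is None:
        raise ValueError("no summary line found")
    # Strip the thousands separators before converting to int.
    sequences, letters = (int(g.replace(",", "")) for g in match.groups())
    return sequences, letters


print(parse_fastacmd_summary(SUMMARY))  # → (7585993, 2644770521)
```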

When I look at the RefSeq release notes, I don't understand which
numbers to match against, because I don't see these numbers in the
release notes.
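
(One rough way to see where the database sits is to check which releases bracket its total letter count. This is only a sketch: the amino-acid totals are copied from the release list quoted further down in this thread, they are release-wide figures, and Russell already notes that the FASTA release and the pre-built BLAST databases seem to differ, so it does not identify the release. The names `RELEASE_AMINO_ACIDS` and `bracketing_releases` are made up for illustration.)

```python
# Amino-acid totals per RefSeq release, copied from the release list quoted
# further down in this thread (releases 33 to 38 only).
RELEASE_AMINO_ACIDS = {
    33: 2204073443,
    34: 2299682138,
    35: 2565199170,
    36: 2756884219,
    37: 2965450333,
    38: 3115246540,
}


def bracketing_releases(total_letters, totals=RELEASE_AMINO_ACIDS):
    """Return (lower, upper) release numbers whose totals bracket total_letters.

    lower is the last release (in release order) whose total is <= total_letters;
    upper is the first release whose total is greater. Either may be None
    when total_letters falls outside the table.
    """
    lower = upper = None
    for release in sorted(totals):
        if totals[release] <= total_letters:
            lower = release
        elif upper is None:
            upper = release
    return lower, upper


# Total letters reported by fastacmd -I for refseq_protein in this message.
print(bracketing_releases(2644770521))  # → (35, 36)
```

So the 2,644,770,521 letters reported above fall between the release 35 and release 36 totals, even though the files are dated Jan 30, 2010, which again suggests the two sets of numbers are not directly comparable.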

Thanks a lot, I really appreciate your help.

Thanks
Shalu




On Sun, Feb 7, 2010 at 4:05 PM, Smithies, Russell <
Russell.Smithies at agresearch.co.nz> wrote:

> AAArrrgg, what is it with Outlook this morning!!!
> Formatting kaput again but I'm sure you can work it out from there!
>
> --Russell
>
> > -----Original Message-----
> > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> > bounces at lists.open-bio.org] On Behalf Of Smithies, Russell
> > Sent: Monday, 8 February 2010 9:59 a.m.
> > To: 'shalu sharma'
> > Cc: 'bioperl-l at lists.open-bio.org'
> > Subject: Re: [Bioperl-l] Refseq Version
> >
> > I should have known it would break the formatting :-(
> >
> > Try this:
> >
> > Release 1:June 30, 2003;Release Size: 4672871949 bases, 263588685 amino
> > acids, 1061675 records
> > Release 2:October 21, 2003;Release Size: 2124 organisms, 7745398573
> > nucleotide bases, 286957682 amino acids, 1097404 records
> > Release 3:January 13, 2004;Release Size: 2218 organisms, 7992741222
> > nucleotide bases, 294647847 amino acids, 1101244 records
> > Release 4:March 24, 2004;Release Size: 2358 organisms, 8175128887
> > nucleotide bases, 318253841 amino acids, 1193457 records
> > Release 5:May 2, 2004;Release Size: 2395 organisms, 8325515623
> > nucleotide bases, 337229387 amino acids, 1255613 records
> > Release 6:July 5, 2004;Release Size: 2467 organisms, 8696371716
> > nucleotide bases, 365446682 amino acids, 1367206 records
> > Release 7:September 12, 2004;Release Size: 2558 organisms, 21072808460
> > nucleotide bases, 405233619 amino acids, 1579579 records
> > Release 8:October 31, 2004;Release Size: 2645 organisms, 26814386658
> > nucleotide bases, 430300369 amino acids, 1709723 records
> > Release 9:January 9, 2005;Release Size:  2780 organisms, 36786975473
> > nucleotide bases, 470534907 amino acids, 1843944 records
> > Release 10:March 6, 2005;Release Size:2827 organisms, 36893741150
> > nucleotide bases, 482862858 amino acids, 1893478 records
> > Release 11:May 8, 2005;Release Size:2928 organisms, 39731702362
> > nucleotide bases, 507980644 amino acids, 2477893 records
> > Release 12:July 10, 2005;Release Size:2969 organisms, 43043256058
> > nucleotide bases, 608493108 amino acids, 2869675 records
> > Release 13:September 11, 2005;Release Size:3060 organisms, 44727484853
> > nucleotide bases, 686768902 amino acids, 3400773 records
> > Release 14:November 20, 2005;Release Size:3198 organisms, 47364955367
> > nucleotide bases, 763761075 amino acids, 3272776 records
> > Release 15:January 1, 2006;Release Size:3244 organisms, 52645441913
> > nucleotide bases, 810009733 amino acids, 3436263 records
> > Release 16:March 11, 2006;Release Size:3397 organisms, 56175443059
> > nucleotide bases, 887509001 amino acids, 3715260 records
> > Release 17:May 1, 2006;Release Size:3497 organisms, 62130037371
> > nucleotide bases, 927587669 amino acids, 3999859 records
> > Release 18:July 11, 2006;Release Size:3695 organisms, 70474041999
> > nucleotide bases, 974374765 amino acids, 4186692 records
> > Release 19:September 10, 2006;Release Size: 3774 organisms, 70694879544
> > nucleotide bases, 1012985077 amino acids, 4311543 records
> > Release 20:November 5, 2006;Release Size:3919 organisms, 72679681505
> > nucleotide bases, 1061797276 amino acids, 4567569 records
> > Release 21:January 6, 2007;Release Size:4079 organisms, 73864990566
> > nucleotide bases, 1144795927 amino acids, 4742335 records
> > Release 22:March 5, 2007;Release Size:4187 organisms, 82441128546
> > nucleotide bases, 1215085694 amino acids, 5207865 records
> > Release 23:May 8, 2007;Release Size:4300 organisms, 83148327110
> > nucleotide bases, 1291050995 amino acids, 5503385 records
> > Release 24:July 10, 2007;Release Size:4511 organisms, 89856995521
> > nucleotide bases, 1365916222 amino acids, 6073814 records
> > Release 25:September 11, 2007;Release Size:4646 organisms, 91265840843
> > nucleotide bases, 1470475398 amino acids, 6515132 records
> > Release 26:November 4, 2007;Release Size:4737 organisms, 99105705485
> > nucleotide bases, 1495032507 amino acids, 6698250 records
> > Release 27:January 6, 2008;Release Size:4926 organisms, 101059552113
> > nucleotide bases, 1556356987 amino acids, 7025715 records
> > Release 28:March 9, 2008;Release Size: 5059 organisms, 102051350525
> > nucleotide bases, 1770627427 amino acids, 7914560 records
> > Release 29:May 4, 2008;Release Size:5168 organisms, 104671101150
> > nucleotide bases, 1870214220 amino acids, 8376141 records
> > Release 30:July 7, 2008;Release Size:5395 organisms, 105074486709
> > nucleotide bases, 1913447691 amino acids, 8572852 records
> > Release 31:August 30, 2008;Release Size: 5513 organisms, 109214348591
> > nucleotide bases, 2026768719 amino acids, 9145702 records
> > Release 32:November 10, 2008;Release Size: 5726 organisms, 111122203221
> > nucleotide bases, 2089596746 amino acids, 9501764 records
> > Release 33:January 16, 2009;Release Size:7773 organisms, 116001583818
> > nucleotide bases, 2204073443 amino acids, 10325282 records
> > Release 34:March 6, 2009;Release Size: 8054 organisms, 111792574830
> > nucleotide bases, 2299682138 amino acids, 10021870 records
> > Release 35:May 4, 2009;Release Size: 8393 organisms, 113210655336
> > nucleotide bases, 2565199170 amino acids, 10993891 records
> > Release 36:July 2, 2009;Release Size: 8665 organisms, 117013741530
> > nucleotide bases, 2756884219 amino acids, 12141825 records
> > Release 37:September 3, 2009;Release Size: 9005 organisms, 119151229820
> > nucleotide bases, 2965450333 amino acids, 12941750 records
> > Release 38:November 7, 2009;Release Size: 9166 organisms, 119196622435
> > nucleotide bases, 3115246540 amino acids, 13436447 records
> >
> >
> >
> > > -----Original Message-----
> > > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> > > bounces at lists.open-bio.org] On Behalf Of Smithies, Russell
> > > Sent: Monday, 8 February 2010 9:47 a.m.
> > > To: 'shalu sharma'
> > > Cc: 'bioperl-l at lists.open-bio.org'
> > > Subject: Re: [Bioperl-l] Refseq Version
> > >
> > > Release 39 was Jan 30, and according to the README, releases only come out
> > > in odd months (January, March, May, July, September, November).
> > > The stats file is here:
> > > ftp://ftp.ncbi.nih.gov/refseq/release/release-statistics/RefSeq-release39.01232010.stats.txt
> > >
> > > The numbers of sequences in the FASTA release and the pre-built BLAST
> > > databases seem to differ, but I guess only NCBI can explain that.
> > > I can't see any way of extracting the release number from the pre-built
> > > BLAST databases (apart from the build date), but it might be worth asking
> > > NCBI if they'd include the information in future releases.
> > >
> > >
> > > FYI, here's the old release stats.
> > > (I wget'ed and grep'ed all the stats files)
> > >
> > > Release  Date    Year  Organisms  Nucleotide Bases    Amino Acids     Records
> > >       1  Jun-30  2003          -     4,672,871,949    263,588,685   1,061,675
> > >       2  Oct-21  2003      2,124     7,745,398,573    286,957,682   1,097,404
> > >       3  Jan-13  2004      2,218     7,992,741,222    294,647,847   1,101,244
> > >       4  Mar-24  2004      2,358     8,175,128,887    318,253,841   1,193,457
> > >       5  May-02  2004      2,395     8,325,515,623    337,229,387   1,255,613
> > >       6  Jul-05  2004      2,467     8,696,371,716    365,446,682   1,367,206
> > >       7  Sep-12  2004      2,558    21,072,808,460    405,233,619   1,579,579
> > >       8  Oct-31  2004      2,645    26,814,386,658    430,300,369   1,709,723
> > >       9  Jan-09  2005      2,780    36,786,975,473    470,534,907   1,843,944
> > >      10  Mar-06  2005      2,827    36,893,741,150    482,862,858   1,893,478
> > >      11  May-08  2005      2,928    39,731,702,362    507,980,644   2,477,893
> > >      12  Jul-10  2005      2,969    43,043,256,058    608,493,108   2,869,675
> > >      13  Sep-11  2005      3,060    44,727,484,853    686,768,902   3,400,773
> > >      14  Nov-20  2005      3,198    47,364,955,367    763,761,075   3,272,776
> > >      15  Jan-01  2006      3,244    52,645,441,913    810,009,733   3,436,263
> > >      16  Mar-11  2006      3,397    56,175,443,059    887,509,001   3,715,260
> > >      17  May-01  2006      3,497    62,130,037,371    927,587,669   3,999,859
> > >      18  Jul-11  2006      3,695    70,474,041,999    974,374,765   4,186,692
> > >      19  Sep-10  2006      3,774    70,694,879,544  1,012,985,077   4,311,543
> > >      20  Nov-05  2006      3,919    72,679,681,505  1,061,797,276   4,567,569
> > >      21  Jan-06  2007      4,079    73,864,990,566  1,144,795,927   4,742,335
> > >      22  Mar-05  2007      4,187    82,441,128,546  1,215,085,694   5,207,865
> > >      23  May-08  2007      4,300    83,148,327,110  1,291,050,995   5,503,385
> > >      24  Jul-10  2007      4,511    89,856,995,521  1,365,916,222   6,073,814
> > >      25  Sep-11  2007      4,646    91,265,840,843  1,470,475,398   6,515,132
> > >      26  Nov-04  2007      4,737    99,105,705,485  1,495,032,507   6,698,250
> > >      27  Jan-06  2008      4,926   101,059,552,113  1,556,356,987   7,025,715
> > >      28  Mar-09  2008      5,059   102,051,350,525  1,770,627,427   7,914,560
> > >      29  May-04  2008      5,168   104,671,101,150  1,870,214,220   8,376,141
> > >      30  Jul-07  2008      5,395   105,074,486,709  1,913,447,691   8,572,852
> > >      31  Aug-30  2008      5,513   109,214,348,591  2,026,768,719   9,145,702
> > >      32  Nov-10  2008      5,726   111,122,203,221  2,089,596,746   9,501,764
> > >      33  Jan-16  2009      7,773   116,001,583,818  2,204,073,443  10,325,282
> > >      34  Mar-06  2009      8,054   111,792,574,830  2,299,682,138  10,021,870
> > >      35  May-04  2009      8,393   113,210,655,336  2,565,199,170  10,993,891
> > >      36  Jul-02  2009      8,665   117,013,741,530  2,756,884,219  12,141,825
> > >      37  Sep-03  2009      9,005   119,151,229,820  2,965,450,333  12,941,750
> > >      38  Nov-07  2009      9,166   119,196,622,435  3,115,246,540  13,436,447
> > >
> > >
> > >
> > > --Russell
> > >
> > >
> > > From: shalu sharma [mailto:sharmashalu.bio at gmail.com]
> > > Sent: Saturday, 6 February 2010 3:56 a.m.
> > > To: Smithies, Russell
> > > Cc: bioperl-l at lists.open-bio.org
> > > Subject: Re: [Bioperl-l] Refseq Version
> > >
> > > Hi Russell,
> > > Thanks for your response.
> > > I am getting the number of sequences in the database but not the release
> > > number (like 38, 39).
> > > This is what I did:
> > >
> > > $ fastacmd -I -d /db/ncbiblast/refseq/refseq_protein
> > > Database: NCBI Protein Reference Sequences
> > >            7,585,993 sequences; 2,644,770,521 total letters
> > >
> > > File names:
> > > /db/ncbiblast/refseq/refseq_protein.00
> > >    Date: Jan 30, 2010  8:34 PM    Version: 4    Longest sequence: 36,805 res
> > > /db/ncbiblast/refseq/refseq_protein.01
> > >    Date: Jan 30, 2010  8:34 PM    Version: 4    Longest sequence: 33,403 res
> > > /db/ncbiblast/refseq/refseq_protein.02
> > >    Date: Jan 30, 2010  8:34 PM    Version: 4    Longest sequence: 15,830 res
> > >
> > > I am still confused about how I can get the release number. I know RefSeq 39
> > > was released on Jan 30, 2010, but I don't know how to confirm this. I also
> > > tried looking at the RefSeq release file but was not able to find anything.
> > >
> > > I would really appreciate it if anyone could help me out with this.
> > >
> > > Thanks
> > > Shalu
> > >
> > > On Thu, Feb 4, 2010 at 6:39 PM, Smithies, Russell
> > > <Russell.Smithies at agresearch.co.nz> wrote:
> > > If you have access to the blast database, use fastacmd -I -d databasename
> > > Otherwise, it's usually at the bottom of your blast result.
> > >
> > > --Russell
> > >
> > > > -----Original Message-----
> > > > From: bioperl-l-bounces at lists.open-bio.org
> > > > [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of shalu sharma
> > > > Sent: Friday, 5 February 2010 11:02 a.m.
> > > > To: bioperl-l at lists.open-bio.org
> > > > Subject: [Bioperl-l] Refseq Version
> > > >
> > > > Hi All,
> > > > This is not a BioPerl query.
> > > > Is there any way to check the RefSeq version (release)? I am using a
> > > > server to BLAST my sequences (blastall) against RefSeq. Is there any way
> > > > I can get the version information on the RefSeq database (from the BLAST
> > > > file or directly from the database)?
> > > >
> > > > Thanks
> > > > Shalu
> > > > _______________________________________________
> > > > Bioperl-l mailing list
> > > > Bioperl-l at lists.open-bio.org
> > > > http://lists.open-bio.org/mailman/listinfo/bioperl-l


