From jason at dev.open-bio.org Thu Nov 1 10:51:57 2007
From: jason at dev.open-bio.org (Jason Stajich)
Date: Thu, 01 Nov 2007 14:51:57 +0000
Subject: [Bioperl-guts-l] bioperl-live/t/data codeml4.mlc,NONE,1.1
Message-ID: <200711011451.lA1EpvFi031895@dev.open-bio.org>
Update of /home/repository/bioperl/bioperl-live/t/data
In directory dev.open-bio.org:/tmp/cvs-serv31869/t/data
Added Files:
codeml4.mlc
Log Message:
add Codeml pairwise result for PAML4
--- NEW FILE: codeml4.mlc ---
seed used = 480745521
5 855
human GTG CTG TCT CCT GCC GAC AAG ACC AAC GTC AAG GCC GCC TGG GGC AAG GTT GGC GCG CAC GCT GGC GAG TAT GGT GCG GAG GCC CTG GAG AGG ATG TTC CTG TCC TTC CCC ACC ACC AAG ACC TAC TTC CCG CAC TTC GAC CTG AGC CAC GGC TCT GCC CAG GTT AAG GGC CAC GGC AAG AAG GTG GCC GAC GCG CTG ACC AAC GCC GTG GCG CAC GTG GAC GAC ATG CCC AAC GCG CTG TCC GCC CTG AGC GAC CTG CAC GCG CAC AAG CTT CGG GTG GAC CCG GTC AAC TTC AAG CTC CTA AGC CAC TGC CTG CTG GTG ACC CTG GCC GCC CAC CTC CCC GCC GAG TTC ACC CCT GCG GTG CAC GCC TCC CTG GAC AAG TTC CTG GCT TCT GTG AGC ACC GTG CTG ACC TCC AAA TAC CGT CTG ACT CCT GAG GAG AAG TCT GCC GTT ACT GCC CTG TGG GGC AAG GTG AAC GTG GAT GAA GTT GGT GGT GAG GCC CTG GGC AGG CTG CTG GTG GTC TAC CCT TGG ACC CAG AGG TTC TTT GAG TCC TTT GGG GAT CTG TCC ACT CCT GAT GCT GTT ATG GGC AAC CCT AAG GTG AAG GCT CAT GGC AAG AAA GTG CTC GGT GCC TTT AGT GAT GGC CTG GCT CAC CTG GAC AAC CTC AAG GGC ACC TTT GCC ACA CTG AGT GAG CTG CAC TGT GAC AAG CTG CAC GTG GAT CCT !
GAG AAC TTC AGG CTC CTG GGC AAC GTG CTG GTC TGT GTG CTG GCC CAT CAC TTT GGC AAA GAA TTC ACC CCA CCA GTG CAG GCT GCC TAT CAG AAA GTG GTG GCT GGT GTG GCT AAT GCC CTG GCC CAC AAG TAT CAC
goat-cow GTG CTG TCT GCC GCC GAC AAG TCC AAT GTC AAG GCC GCC TGG GGC AAG GTT GGC GGC AAC GCT GGA GCT TAT GGC GCA GAG GCT CTG GAG AGG ATG TTC CTG AGC TTC CCC ACC ACC AAG ACC TAC TTC CCC CAC TTC GAC CTG AGC CAC GGC TCG GCC CAG GTC AAG GGC CAC GGC GAG AAG GTG GCC GCC GCG CTG ACC AAA GCG GTG GGC CAC CTG GAC GAC CTG CCC GGT ACT CTG TCT GAT CTG AGT GAC CTG CAC GCC CAC AAG CTG CGT GTG GAC CCG GTC AAC TTT AAG CTT CTG AGC CAC TCC CTG CTG GTG ACC CTG GCC TGC CAC CTC CCC AAT GAT TTC ACC CCC GCG GTC CAC GCC TCC CTG GAC AAG TTC TTG GCC AAC GTG AGC ACC GTG CTG ACC TCC AAA TAC CGT CTG ACT GCT GAG GAG AAG GCT GCC GTC ACC GCC TTT TGG GGC AAG GTG AAA GTG GAT GAA GTT GGT GGT GAG GCC CTG GGC AGG CTG CTG GTT GTC TAC CCC TGG ACT CAG AGG TTC TTT GAG TCC TTT GGG GAC TTG TCC ACT GCT GAT GCT GTT ATG AAC AAC CCT AAG GTG AAG GCC CAT GGC AAG AAG GTG CTA GAT TCC TTT AGT AAT GGC ATG AAG CAT CTC GAT GAC CTC AAG GGC ACC TTT GCT GCG CTG AGT GAG CTG CAC TGT GAT AAG CTG CAT GTG GAT CCT !
GAG AAC TTC AAG CTC CTG GGC AAC GTG CTA GTG GTT GTG CTG GCT CGC AAT TTT GGC AAG GAA TTC ACC CCG GTG CTG CAG GCT GAC TTT CAG AAG GTG GTG GCT GGT GTG GCC AAT GCC CTG GCC CAC AGA TAT CAT
rabbit GTG CTG TCT CCC GCT GAC AAG ACC AAC ATC AAG ACT GCC TGG GAA AAG ATC GGC AGC CAC GGT GGC GAG TAT GGC GCC GAG GCC GTG GAG AGG ATG TTC TTG GGC TTC CCC ACC ACC AAG ACC TAC TTC CCC CAC TTC GAC TTC ACC CAC GGC TCT GAG CAG ATC AAA GCC CAC GGC AAG AAG GTG TCC GAA GCC CTG ACC AAG GCC GTG GGC CAC CTG GAC GAC CTG CCC GGC GCC CTG TCT ACT CTC AGC GAC CTG CAC GCG CAC AAG CTG CGG GTG GAC CCG GTG AAT TTC AAG CTC CTG TCC CAC TGC CTG CTG GTG ACC CTG GCC AAC CAC CAC CCC AGT GAA TTC ACC CCT GCG GTG CAT GCC TCC CTG GAC AAG TTC CTG GCC AAC GTG AGC ACC GTG CTG ACC TCC AAA TAT CGT CTG TCC AGT GAG GAG AAG TCT GCG GTC ACT GCC CTG TGG GGC AAG GTG AAT GTG GAA GAA GTT GGT GGT GAG GCC CTG GGC AGG CTG CTG GTT GTC TAC CCA TGG ACC CAG AGG TTC TTC GAG TCC TTT GGG GAC CTG TCC TCT GCA AAT GCT GTT ATG AAC AAT CCT AAG GTG AAG GCT CAT GGC AAG AAG GTG CTG GCT GCC TTC AGT GAG GGT CTG AGT CAC CTG GAC AAC CTC AAA GGC ACC TTT GCT AAG CTG AGT GAA CTG CAC TGT GAC AAG CTG CAC GTG GAT CCT !
GAG AAC TTC AGG CTC CTG GGC AAC GTG CTG GTT ATT GTG CTG TCT CAT CAT TTT GGC AAA GAA TTC ACT CCT CAG GTG CAG GCT GCC TAT CAG AAG GTG GTG GCT GGT GTG GCC AAT GCC CTG GCT CAC AAA TAC CAC
rat GTG CTC TCT GCA GAT GAC AAA ACC AAC ATC AAG AAC TGC TGG GGG AAG ATT GGT GGC CAT GGT GGT GAA TAT GGC GAG GAG GCC CTA CAG AGG ATG TTC GCT GCC TTC CCC ACC ACC AAG ACC TAC TTC TCT CAC ATT GAT GTA AGC CCC GGC TCT GCC CAG GTC AAG GCT CAC GGC AAG AAG GTT GCT GAT GCC TTG GCC AAA GCT GCA GAC CAC GTC GAA GAC CTG CCT GGT GCC CTG TCC ACT CTG AGC GAC CTG CAT GCC CAC AAA CTG CGT GTG GAT CCT GTC AAC TTC AAG TTC CTG AGC CAC TGC CTG CTG GTG ACC TTG GCT TGC CAC CAC CCT GGA GAT TTC ACA CCC GCC ATG CAC GCC TCT CTG GAC AAA TTC CTT GCC TCT GTG AGC ACT GTG CTG ACC TCC AAG TAC CGT CTA ACT GAT GCT GAG AAG GCT GCT GTT AAT GCC CTG TGG GGA AAG GTG AAC CCT GAT GAT GTT GGT GGC GAG GCC CTG GGC AGG CTG CTG GTT GTC TAC CCT TGG ACC CAG AGG TAC TTT GAT AGC TTT GGG GAC CTG TCC TCT GCC TCT GCT ATC ATG GGT AAC CCT AAG GTG AAG GCC CAT GGC AAG AAG GTG ATA AAC GCC TTC AAT GAT GGC CTG AAA CAC TTG GAC AAC CTC AAG GGC ACC TTT GCT CAT CTG AGT GAA CTC CAC TGT GAC AAG CTG CAT GTG GAT CCT !
GAG AAC TTC AGG CTC CTG GGC AAT ATG ATT GTG ATT GTG TTG GGC CAC CAC CTG GGC AAG GAA TTC ACC CCC TGT GCA CAG GCT GCC TTC CAG AAG GTG GTG GCT GGA GTG GCC AGT GCC CTG GCT CAC AAG TAC CAC
marsupial GTG CTC TCG GAT GCT GAC AAG ACT CAC GTG AAA GCC ATC TGG GGT AAG GTG GGA GGC CAC GCC GGT GCC TAC GCA GCT GAA GCT CTT GCC AGA ACC TTC CTC TCC TTC CCC ACT ACC AAA ACT TAC TTC CCC CAC TTC GAC CTG TCC CCC GGC TCC GCC CAG ATC CAG GGT CAT GGT AAG AAG GTA GCC GAT GCC CTT TCC CAG GCT GTT GCC CAC CTG GAC GAC CTG CCC GGA ACC ATG TCC AAA CTA AGC GAC CTG CAC GCC CAC AAG CTG AGA GTG GAT CCC GTG AAC TTC AAG CTC CTC TCT CAC TGC CTG ATC GTG ACT CTG GCC GCC CAT CTG AGC AAG GAT TTG ACT CCC GAA GTG CAC GCC TCC ATG GAC AAG TTC TTT GCC TCT GTG GCT ACC GTG CTG ACC TCG AAG TAC CGT TTG ACT TCT GAG GAG AAG AAC TGC ATC ACT ACC ATC TGG TCT AAG GTG CAG GTT GAC CAG ACT GGT GGT GAG GCC CTT GGC AGG ATG CTC GTT GTC TAC CCC TGG ACC ACC AGG TTT TTT GGG AGC TTT GGT GAT CTG TCC TCT CCT GGC GCT GTC ATG TCA AAT TCT AAG GTT CAA GCC CAT GGT GCT AAG GTG TTG ACC TCC TTC GGT GAA GCA GTC AAG CAT TTG GAC AAC CTG AAG GGT ACT TAT GCC AAG TTG AGT GAG CTC CAC TGT GAC AAG CTG CAT GTG GAC CCT !
GAG AAC TTC AAG ATG CTG GGG AAT ATC ATT GTG ATC TGC CTG GCT GAG CAC TTT GGC AAG GAT TTT ACT CCT GAA TGT CAG GTT GCT TGG CAG AAG CTC GTG GCT GGA GTT GCC CAT GCC CTG GCC CAC AAG TAC CAC
Printing out site pattern counts
5 669 P
human GTG CTG TCT CCT GCC GAC AAG ACC AAC GTC AAG GCC GCC TGG GGC AAG GTT GGC GCG CAC GCT GGC GAG TAT GGT GCG GAG GCC CTG GAG AGG ATG TTC CTG TCC CCC ACC ACC TAC CCG CAC TTC GAC CTG AGC CAC GGC TCT GCC CAG GTT AAG GGC CAC GGC AAG GTG GCC GAC GCG CTG ACC AAC GCC GTG GCG GTG GAC ATG CCC AAC GCG CTG TCC GCC CTG AGC CTG CAC GCG CTT CGG GAC CCG GTC AAC TTC CTC CTA AGC TGC CTG CTG GCC GCC CTC CCC GCC GAG TTC ACC CCT GCG GTG CAC GCC TCC CTG GCT TCT AGC ACC TCC AAA TAC CGT CTG ACT CCT GAG GAG TCT GCC GTT ACT GCC CTG GGC AAC GTG GAT GAA GTT GGT GGT CTG AGG CTG GTG GTC CCT ACC CAG TTC TTT GAG TCC TTT GGG GAT CTG TCC ACT CCT GAT GCT GTT ATG GGC AAC CCT GTG AAG GCT CAT AAG AAA CTC GGT GCC TTT AGT GAT GGC CTG GCT CAC CTG GAC AAC CTC AAG TTT GCC ACA CTG AGT GAG TGT CAC GAT CCT AAC AGG CTC GGC AAC GTG CTG GTC TGT GTG GCC CAT CAC TTT AAA GAA TTC ACC CCA CCA GTG GCT GCC TAT GTG GGT AAT GCC AAG TAT CAC
goat-cow ... ... ... G.C ... ... ... T.. ..T ... ... ... ... ... ... ... ... ... .GC A.. ... ..A .CT ... ..C ..A ... ..T ... ... ... ... ... ... AG. ... ... ... ... ..C ... ... ... ... ... ... ... ..G ... ... ..C ... ... ... ... G.. ... ... .C. ... ... ... ..A ..G ... .GC C.. ... C.. ... GGT A.T ... ..T .AT ... ..T ... ... ..C ..G ..T ... ... ... ... ..T ..T ..G ... .C. ... ... ... TG. ... ... AAT ..T ... ... ..C ... ..C ... ... ... T.. ..C AAC ... ... ... ... ... ... ... ... G.. ... ... G.. ... ..C ..C ... T.T ... ..A ... ... ... ... ... ... ... ... ... ..T ... ..C ..T ... ... ... ... ... ... ... ..C T.. ... ... G.. ... ... ... ... AA. ... ... ... ... ..C ... ... ..G ..A .A. T.. ... ... A.. ... A.. AAG ..T ..C ..T G.. ... ... ... ..T G.G ... ... ... ... ..T ... ... ... .A. ... ... ... ... ..A ..G GT. ... ..T .GC A.T ... ..G ... ... ... ..G GTG C.. ... .A. .T. ... ... ... ... .GA ... ..T
rabbit ... ... ... ..C ..T ... ... ... ... A.. ... A.T ... ... .AA ... A.C ... AGC ... .G. ... ... ... ..C ..C ... ... G.. ... ... ... ... T.. GG. ... ... ... ... ..C ... ... ... T.C .C. ... ... ... .AG ... A.C ..A .C. ... ... ... ... T.. ..A ..C ... ... ..G ... ... .GC C.. ... C.. ... GG. ..C ... ..T A.T ..C ... ... ... ... ..G ... ... ... ..G ..T ... ... ..G TC. ... ... ... ... AA. .A. ... AGT ..A ... ... ... ... ... ..T ... ... ... ..C AAC ... ... ... ... ..T ... ... T.C AG. ... ... ... ..G ..C ... ... ... ... ..T ... ..A ... ... ... ... ... ... ... ..T ... ..A ... ... ... ..C ... ... ... ... ..C ... ... T.. G.A A.. ... ... ... AA. ..T ... ... ... ... ... ... ..G ..G .C. ... ..C ... ..G ..T ... AG. ... ... ... ... ... ..A ... ..T .AG ... ... ..A ... ... ... ... ... ... ... ... ... ... ... ..T AT. ... T.T ... ..T ... ... ... ... ..T ..T .AG ... ... ... ... ... ... ... ..T ..A ..C ...
rat ... ..C ... G.A .AT ... ..A ... ... A.. ... AA. TG. ... ..G ... A.. ..T .GC ..T .G. ..T ..A ... ..C .A. ... ... ..A C.. ... ... ... GCT G.. ... ... ... ... T.T ... A.T ..T G.A ... .C. ... ... ... ... ..C ... .CT ... ... ... ..T ..T ..T ..C T.. G.. ..A ..T .CA .AC ..C ..A C.. ..T GGT ..C ... ... A.T ... ... ... ..T ..C ..G ..T ..T ..T ... ... ... T.. ..G ... ... ... T.. ..T TG. .A. ..T .GA ..T ... ..A ..C ..C A.. ... ... ..T ..T ..C ... ... ..T ... ..G ... ... ..A ... GA. .CT ... G.. ..T ... .A. ... ... ..A ... CCT ... ..T ... ... ..C ... ... ... ..T ... ... ... ... .A. ... ..T AG. ... ... ..C ... ... T.. G.C TC. ... A.C ... ..T ... ... ... ... ..C ... ... ..G A.A AAC ... ..C .A. ... ... ... AAA ... T.. ... ... ... ... ... ..T CAT ... ... ..A ... ..T ... ... ... ... ... ... ..T A.. A.T ..G AT. ... .G. ..C ... C.G ..G ... ... ... ..C TGT .CA ... ... .TC ... ..A .G. ..T ... ..C ...
marsupial ... ..C ..G GA. ..T ... ... ..T C.. ..G ..A ... AT. ... ..T ... ..G ..A .GC ... ..C ..T .CC ..C .CA ..T ..A ..T ..T .CC ..A .CC ... ..C ... ... ..T ... ... ..C ... ... ... ... TC. .C. ... ..C ... ... A.C C.. ..T ..T ..T ... ..A ... ..T ..C ..T T.. C.G ..T ..T ..C C.. ... C.. ... GGA A.C A.. ... AAA ..A ... ... ... ..C ..G A.A ..T ..C ..G ... ... ... ..C TCT ... A.C ... ... ... ..G AG. AAG ..T ..G ..T ..C .AA ... ... ... ... T.T ..C ... GCT ... ..G ..G ... ... T.. ... T.. ... ... AAC TG. A.C ... A.. A.C TCT C.G ..T ..C C.G AC. ... ... ..T ... ..C ..T ... ..C ... ACC ..T ... .G. AG. ... ..T ... ... ... T.. ... .GC ... ..C ... TCA ..T T.. ..T C.A ..C ... GCT ..G T.G ACC T.. ..C G.. ..A .CA G.C AAG ..T T.. ... ... ..G ... .A. ... .AG T.. ... ... ... ..T ..C ... ... .A. A.G ..G ..T A.C A.T ..G ATC TGC ..T G.G ... ... ..G ..T ..T ..T ..T GA. TGT .T. ..T .GG C.C ..A C.. ... ... ..C ...
9 2 1 1 1 4 3 1 1 1 2 1 1 3 1
7 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 5 1 1 1 4 2 2 1 6 1 1 1 1
1 3 1 1 3 1 1 1 2 3 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 3 1 1
1 1 6 1 1 1 1 1 1 1 1 1 1 1 1
1 1 2 1 1 1 1 1 1 1 1 1 1 1 1
3 1 1 2 1 1 1 1 1 1 1 1 1 1 1
3 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 2 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 2 1 1 1 1 1 2 1 1 1
1 2 1 1 1 1 1 1 1 1 1 1 1 2 1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1
CODONML (in paml version 4, June 2007) abglobin.nuc Model: One dN/dS ratio
Codon frequencies: Fequal
ns = 5 ls = 285
Codon usage in sequences
--------------------------------------------------------------------------------------------------
Phe TTT 5 8 3 3 6 | Ser TCT 4 2 6 7 6 | Tyr TAT 3 2 3 1 1 | Cys TGT 2 1 1 2 2
TTC 10 9 13 11 8 | TCC 6 7 7 3 8 | TAC 3 3 3 5 5 | TGC 1 1 1 3 3
Leu TTA 0 0 0 0 0 | TCA 0 0 0 0 1 | *** TAA 0 0 0 0 0 | *** TGA 0 0 0 0 0
TTG 0 2 1 4 5 | TCG 0 1 0 0 2 | TAG 0 0 0 0 0 | Trp TGG 3 3 3 3 4
--------------------------------------------------------------------------------------------------
Leu CTT 1 1 0 1 3 | Pro CCT 7 2 4 7 3 | His CAT 2 4 4 5 6 | Arg CGT 1 2 1 2 1
CTC 5 4 4 4 7 | CCC 3 6 5 4 7 | CAC 16 11 15 14 12 | CGC 0 1 0 0 0
CTA 1 2 0 2 1 | CCA 2 0 1 0 0 | Gln CAA 0 0 0 0 1 | CGA 0 0 0 0 0
CTG 29 28 30 21 15 | CCG 2 2 1 0 0 | CAG 4 4 5 5 7 | CGG 1 0 1 0 0
--------------------------------------------------------------------------------------------------
Ile ATT 0 0 1 4 1 | Thr ACT 3 4 4 3 10 | Asn AAT 1 5 5 3 2 | Ser AGT 2 3 5 2 1
ATC 0 0 3 2 7 | ACC 12 11 12 9 9 | AAC 9 7 7 8 4 | AGC 4 4 3 5 3
ATA 0 0 0 1 0 | ACA 1 0 0 1 0 | Lys AAA 4 3 5 5 3 | Arg AGA 0 1 0 0 2
Met ATG 3 3 2 4 5 | ACG 0 0 0 0 0 | AAG 18 21 19 19 21 | AGG 4 3 4 4 2
--------------------------------------------------------------------------------------------------
Val GTT 5 5 4 4 6 | Ala GCT 8 11 8 13 10 | Asp GAT 5 8 1 11 6 | Gly GGT 5 4 5 6 10
GTC 4 6 2 4 3 | GCC 21 18 16 18 19 | GAC 10 10 10 8 10 | GGC 14 15 14 12 5
GTA 0 0 0 1 1 | GCA 0 1 1 3 2 | Glu GAA 2 2 7 4 4 | GGA 0 1 0 3 3
GTG 21 19 21 14 14 | GCG 7 4 3 0 0 | GAG 10 9 10 5 6 | GGG 1 1 1 2 2
--------------------------------------------------------------------------------------------------
Codon position x base (3x4) table for each sequence.
#1: human
position 1: T:0.12982 C:0.25965 A:0.21404 G:0.39649
position 2: T:0.29474 C:0.26667 A:0.30526 G:0.13333
position 3: T:0.18947 C:0.41404 A:0.03509 G:0.36140
Average T:0.20468 C:0.31345 A:0.18480 G:0.29708
#2: goat-cow
position 1: T:0.13684 C:0.23509 A:0.22807 G:0.40000
position 2: T:0.30526 C:0.24211 A:0.31228 G:0.14035
position 3: T:0.21754 C:0.39649 A:0.03509 G:0.35088
Average T:0.21988 C:0.29123 A:0.19181 G:0.29708
#3: rabbit
position 1: T:0.14386 C:0.24912 A:0.24561 G:0.36140
position 2: T:0.29474 C:0.23860 A:0.32982 G:0.13684
position 3: T:0.19298 C:0.40351 A:0.04912 G:0.35439
Average T:0.21053 C:0.29708 A:0.20819 G:0.28421
#4: rat
position 1: T:0.14737 C:0.22807 A:0.24561 G:0.37895
position 2: T:0.28070 C:0.23860 A:0.32632 G:0.15439
position 3: T:0.25965 C:0.38596 A:0.07018 G:0.28421
Average T:0.22924 C:0.28421 A:0.21404 G:0.27251
#5: marsupial
position 1: T:0.17895 C:0.22105 A:0.24561 G:0.35439
position 2: T:0.28772 C:0.27018 A:0.30877 G:0.13333
position 3: T:0.25965 C:0.38596 A:0.06316 G:0.29123
Average T:0.24211 C:0.29240 A:0.20585 G:0.25965
Sums of codon usage counts
------------------------------------------------------------------------------
Phe F TTT 25 | Ser S TCT 25 | Tyr Y TAT 10 | Cys C TGT 8
TTC 51 | TCC 31 | TAC 19 | TGC 9
Leu L TTA 0 | TCA 1 | *** * TAA 0 | *** * TGA 0
TTG 12 | TCG 3 | TAG 0 | Trp W TGG 16
------------------------------------------------------------------------------
Leu L CTT 6 | Pro P CCT 23 | His H CAT 21 | Arg R CGT 7
CTC 24 | CCC 25 | CAC 68 | CGC 1
CTA 6 | CCA 3 | Gln Q CAA 1 | CGA 0
CTG 123 | CCG 5 | CAG 25 | CGG 2
------------------------------------------------------------------------------
Ile I ATT 6 | Thr T ACT 24 | Asn N AAT 16 | Ser S AGT 13
ATC 12 | ACC 53 | AAC 35 | AGC 19
ATA 1 | ACA 2 | Lys K AAA 20 | Arg R AGA 3
Met M ATG 17 | ACG 0 | AAG 98 | AGG 17
------------------------------------------------------------------------------
Val V GTT 24 | Ala A GCT 50 | Asp D GAT 31 | Gly G GGT 30
GTC 19 | GCC 92 | GAC 48 | GGC 60
GTA 2 | GCA 7 | Glu E GAA 19 | GGA 7
GTG 89 | GCG 14 | GAG 40 | GGG 7
------------------------------------------------------------------------------
Codon position x base (3x4) table, overall
position 1: T:0.14737 C:0.23860 A:0.23579 G:0.37825
position 2: T:0.29263 C:0.25123 A:0.31649 G:0.13965
position 3: T:0.22386 C:0.39719 A:0.05053 G:0.32842
Average T:0.22129 C:0.29567 A:0.20094 G:0.28211
Codon frequencies under model, for use in evolver (TTT TTC TTA TTG ... GGG):
0.01639344 0.01639344 0.01639344 0.01639344
0.01639344 0.01639344 0.01639344 0.01639344
0.01639344 0.01639344 0.00000000 0.00000000
0.01639344 0.01639344 0.00000000 0.01639344
0.01639344 0.01639344 0.01639344 0.01639344
0.01639344 0.01639344 0.01639344 0.01639344
0.01639344 0.01639344 0.01639344 0.01639344
0.01639344 0.01639344 0.01639344 0.01639344
0.01639344 0.01639344 0.01639344 0.01639344
0.01639344 0.01639344 0.01639344 0.01639344
0.01639344 0.01639344 0.01639344 0.01639344
0.01639344 0.01639344 0.01639344 0.01639344
0.01639344 0.01639344 0.01639344 0.01639344
0.01639344 0.01639344 0.01639344 0.01639344
0.01639344 0.01639344 0.01639344 0.01639344
0.01639344 0.01639344 0.01639344 0.01639344
Nei & Gojobori 1986. dN/dS (dN, dS)
(Note: This matrix is not used in later m.l. analysis.
Use runmode = -2 for ML pairwise comparison.)
human
goat-cow 0.2507 (0.0863 0.3443)
rabbit 0.2627 (0.0867 0.3301) 0.2943 (0.1054 0.3581)
rat 0.2045 (0.1261 0.6164) 0.2462 (0.1493 0.6065) 0.2178 (0.1348 0.6187)
marsupial 0.1902 (0.1931 1.0148) 0.1891 (0.1910 1.0099) 0.2184 (0.2111 0.9668) 0.2716 (0.2404 0.8852)
pairwise comparison, codon frequencies: Fequal.
2 (goat-cow) ... 1 (human)
lnL =-1596.739984
0.43894 1.87997 0.29075
t= 0.4389 S= 236.2 N= 618.8 dN/dS= 0.2908 dN= 0.0874 dS= 0.3006
3 (rabbit) ... 1 (human)
lnL =-1593.524138
0.43016 1.74883 0.29827
t= 0.4302 S= 234.0 N= 621.0 dN/dS= 0.2983 dN= 0.0872 dS= 0.2924
3 (rabbit) ... 2 (goat-cow)
lnL =-1634.706680
0.49216 1.87718 0.33712
t= 0.4922 S= 236.1 N= 618.9 dN/dS= 0.3371 dN= 0.1063 dS= 0.3154
4 (rat) ... 1 (human)
lnL =-1722.645692
0.71450 1.56250 0.22967
t= 0.7145 S= 230.6 N= 624.4 dN/dS= 0.2297 dN= 0.1250 dS= 0.5445
4 (rat) ... 2 (goat-cow)
lnL =-1750.493581
0.75917 2.15224 0.28700
t= 0.7592 S= 240.3 N= 614.7 dN/dS= 0.2870 dN= 0.1490 dS= 0.5192
4 (rat) ... 3 (rabbit)
lnL =-1730.718528
0.73744 1.94983 0.24972
t= 0.7374 S= 237.3 N= 617.7 dN/dS= 0.2497 dN= 0.1340 dS= 0.5368
5 (marsupial) ... 1 (human)
lnL =-1854.925323
1.18627 1.01786 0.17389
t= 1.1863 S= 218.3 N= 636.7 dN/dS= 0.1739 dN= 0.1787 dS= 1.0276
5 (marsupial) ... 2 (goat-cow)
lnL =-1853.486254
1.12573 1.26005 0.19796
t= 1.1257 S= 224.3 N= 630.7 dN/dS= 0.1980 dN= 0.1819 dS= 0.9189
5 (marsupial) ... 3 (rabbit)
lnL =-1873.309372
1.18527 1.16455 0.20554
t= 1.1853 S= 222.0 N= 633.0 dN/dS= 0.2055 dN= 0.1972 dS= 0.9593
5 (marsupial) ... 4 (rat)
lnL =-1901.596050
1.20106 1.12357 0.24932
t= 1.2011 S= 221.0 N= 634.0 dN/dS= 0.2493 dN= 0.2251 dS= 0.9030
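Each pairwise block above ends in a summary line of `name= value` pairs. As a rough illustration of how such a line can be split up, here is a minimal sketch in plain Perl. It is not the BioPerl parser (`_parse_PairwiseCodon`); the helper name `parse_pairwise_line` is hypothetical and the regular expression is an assumption about the line layout shown in this file.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): split a codeml pairwise summary
# line such as
#   "t= 0.4389 S= 236.2 N= 618.8 dN/dS= 0.2908 dN= 0.0874 dS= 0.3006"
# into a hash reference of name => value pairs.
sub parse_pairwise_line {
    my ($line) = @_;
    my %stats;
    # A key is a run of word characters or "/", followed by "=" and a number.
    while ( $line =~ m{([\w/]+)\s*=\s*(-?\d+(?:\.\d+)?)}g ) {
        $stats{$1} = $2;
    }
    return \%stats;
}

my $stats = parse_pairwise_line(
    't= 0.4389 S= 236.2 N= 618.8 dN/dS= 0.2908 dN= 0.0874 dS= 0.3006');
printf "%s = %s\n", $_, $stats->{$_} for sort keys %$stats;
```

The values are kept as the strings that appear in the file, so they can be compared either as strings or as numbers.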
From jason at dev.open-bio.org Thu Nov 1 10:52:58 2007
From: jason at dev.open-bio.org (Jason Stajich)
Date: Thu, 01 Nov 2007 14:52:58 +0000
Subject: [Bioperl-guts-l] bioperl-live/Bio/Tools/Phylo PAML.pm,1.55,1.56
Message-ID: <200711011452.lA1EqwYM031935@dev.open-bio.org>
Update of /home/repository/bioperl/bioperl-live/Bio/Tools/Phylo
In directory dev.open-bio.org:/tmp/cvs-serv31902/Bio/Tools/Phylo
Modified Files:
PAML.pm
Log Message:
Parsing PAML4 and PAML3.15 should work now. Dealing with variable order for the sequences and summary results in the top of the MLC files
Index: PAML.pm
===================================================================
RCS file: /home/repository/bioperl/bioperl-live/Bio/Tools/Phylo/PAML.pm,v
retrieving revision 1.55
retrieving revision 1.56
diff -C2 -d -r1.55 -r1.56
*** PAML.pm 30 Oct 2007 16:33:54 -0000 1.55
--- PAML.pm 1 Nov 2007 14:52:56 -0000 1.56
***************
*** 267,270 ****
--- 267,271 ----
m/^pairwise comparison, codon frequencies:/) {
# runmode = -2, CODONML
+ $self->debug("pairwise Ka/Ks\n");
$self->_pushback($_);
%data = $self->_parse_PairwiseCodon;
***************
*** 369,373 ****
}
}
! } elsif ($seqtype eq 'YN00') {
while ($_ = $self->_readline) {
if( m/^Estimation by the method|\(B\) Yang & Nielsen \(2000\) method/ ) {
--- 370,375 ----
}
}
! } elsif ($seqtype eq 'YN00')
! {
while ($_ = $self->_readline) {
if( m/^Estimation by the method|\(B\) Yang & Nielsen \(2000\) method/ ) {
***************
*** 426,433 ****
/ox
) {
! @{$self->{'_summary'}}{qw(seqtype version seqfile model)} = ($1,
! $2,
! $3,
! $4);
defined $self->{'_summary'}->{'model'} &&
$self->{'_summary'}->{'model'} =~ s/Model:\s+//;
--- 428,433 ----
/ox
) {
! @{$self->{'_summary'}}{qw(seqtype version seqfile model)} =
! ($1, $2,$3,$4);
defined $self->{'_summary'}->{'model'} &&
$self->{'_summary'}->{'model'} =~ s/Model:\s+//;
***************
*** 437,440 ****
--- 437,443 ----
$self->{'_summary'} = {};
$self->{'_summary'}->{'multidata'}++;
+ } elsif( m/^Before\s+deleting\s+alignment\s+gaps/ ) {
+ my ($phylip_header) = $self->_readline;
+ $self->_parse_seqs;
}
}
***************
*** 657,665 ****
my ($patternct,@patterns,$ns,$ls);
while( defined($_ = $self->_readline) ) {
! if( /^Codon position/ ) {
! $self->_pushback($_);
! last;
! } elsif( /^Codon usage/ ) {
! $self->_pushback($_);
last;
} elsif( $patternct ) {
--- 660,665 ----
my ($patternct,@patterns,$ns,$ls);
while( defined($_ = $self->_readline) ) {
! if( /^Codon\s+(usage|position)/ ) {
! $self->_pushback($_);
last;
} elsif( $patternct ) {
***************
*** 686,696 ****
# an array but we'll stay with this for now
my ($self) = @_;
my (@firstseq,@seqs);
while( defined ($_ = $self->_readline) ) {
! if( /^(TREE|Codon)/ ) { $self->_pushback($_); last }
last if( /^\s+$/ && @seqs > 0 );
next if ( /^\s+$/ );
next if( /^\d+\s+$/ );
my ($name,$seqstr) = split(/\s+/,$_,2);
$seqstr =~ s/\s+//g; # remove whitespace
--- 686,701 ----
# an array but we'll stay with this for now
my ($self) = @_;
+ # Use this flag to deal with paml 4 vs 3 differences
+ # In PAML 4 the sequences precede the CODONML|BASEML|AAML
+ # while in PAML3 the files start off with this
+ return 1 if $self->{'_already_parsed_seqs'};
my (@firstseq,@seqs);
while( defined ($_ = $self->_readline) ) {
! if( /^(Printing|After|TREE|Codon)/ ) { $self->_pushback($_); last }
last if( /^\s+$/ && @seqs > 0 );
next if ( /^\s+$/ );
next if( /^\d+\s+$/ );
+ # we are reading PHYLIP format
my ($name,$seqstr) = split(/\s+/,$_,2);
$seqstr =~ s/\s+//g; # remove whitespace
***************
*** 708,717 ****
$i = $v;
}
- $self->debug( "adding seq $seqstr\n");
push @seqs, Bio::PrimarySeq->new(-display_id => $name,
-seq => $seqstr);
}
}
! $self->{'_summary'}->{'seqs'} = \@seqs;
1;
}
--- 713,724 ----
$i = $v;
}
push @seqs, Bio::PrimarySeq->new(-display_id => $name,
-seq => $seqstr);
}
}
! if( @seqs > 0 ) {
! $self->{'_summary'}->{'seqs'} = \@seqs;
! $self->{'_already_parsed_seqs'} = 1;
! }
1;
}
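The `_already_parsed_seqs` flag in the diff above makes sequence parsing happen at most once, wherever the sequence block appears in the file. The sketch below illustrates that guard in plain Perl. `SeqBlockReader` is a hypothetical stand-in class, not `Bio::Tools::Phylo::PAML`; it reads from an in-memory list of lines and, unlike the real parser, skips the PHYLIP header line itself.

```perl
use strict;
use warnings;

{
    package SeqBlockReader;    # hypothetical stand-in, not a BioPerl class

    sub new {
        my ( $class, @lines ) = @_;
        return bless { lines => [@lines], seqs => [], parsed => 0 }, $class;
    }

    sub parse_seqs {
        my ($self) = @_;
        # Guard: the sequence block may come before or after the program
        # banner, so only the first call does any work.
        return 1 if $self->{parsed};
        my @seqs;
        while ( defined( my $line = shift @{ $self->{lines} } ) ) {
            if ( $line =~ /^(Printing|After|TREE|Codon)/ ) {
                unshift @{ $self->{lines} }, $line;    # push the marker back
                last;
            }
            next if $line =~ /^\s*$/;                  # blank line
            next if $line =~ /^\d+\s+\d+\s*$/;         # PHYLIP header, e.g. "5 855"
            my ( $name, $seqstr ) = split /\s+/, $line, 2;
            next unless defined $seqstr;
            $seqstr =~ s/\s+//g;                       # remove whitespace
            push @seqs, { id => $name, seq => $seqstr };
        }
        if (@seqs) {
            $self->{seqs}   = \@seqs;
            $self->{parsed} = 1;
        }
        return 1;
    }
}

my $reader = SeqBlockReader->new(
    '5 855',
    'human GTG CTG TCT',
    'rat   GTG CTC TCT',
    'Printing out site pattern counts',
);
$reader->parse_seqs;
$reader->parse_seqs;    # second call returns early; nothing is re-read
print scalar( @{ $reader->{seqs} } ), " sequences\n";
```

Setting the flag only when at least one sequence was found mirrors the `if( @seqs > 0 )` check in the commit, so an early call that finds nothing does not block a later one.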
From jason at dev.open-bio.org Thu Nov 1 10:52:58 2007
From: jason at dev.open-bio.org (Jason Stajich)
Date: Thu, 01 Nov 2007 14:52:58 +0000
Subject: [Bioperl-guts-l] bioperl-live/t PAML.t,1.30,1.31
Message-ID: <200711011452.lA1EqweW031940@dev.open-bio.org>
Update of /home/repository/bioperl/bioperl-live/t
In directory dev.open-bio.org:/tmp/cvs-serv31902/t
Modified Files:
PAML.t
Log Message:
Parsing PAML4 and PAML3.15 should work now. Dealing with variable order for the sequences and summary results in the top of the MLC files
Index: PAML.t
===================================================================
RCS file: /home/repository/bioperl/bioperl-live/t/PAML.t,v
retrieving revision 1.30
retrieving revision 1.31
diff -C2 -d -r1.30 -r1.31
*** PAML.t 27 Jun 2007 10:16:37 -0000 1.30
--- PAML.t 1 Nov 2007 14:52:56 -0000 1.31
***************
*** 8,12 ****
use BioperlTest;
! test_begin(-tests => 193,
-requires_module => 'IO::String');
--- 8,12 ----
use BioperlTest;
! test_begin(-tests => 202,
-requires_module => 'IO::String');
***************
*** 385,386 ****
--- 385,404 ----
is($MLmat->[0]->[1]->{'dN'}, '0.0210');
is($MLmat->[0]->[1]->{'dS'}, 0.0644);
+
+ ## PAML 4
+ $paml = Bio::Tools::Phylo::PAML->new(-file => test_input_file('codeml4.mlc'));
+ $result = $paml->next_result;
+
+ is($result->model, 'One dN/dS ratio');
+ like($result->version, qr'4');
+ $MLmat = $result->get_MLmatrix;
+ $NGmat = $result->get_NGmatrix;
+
+ is($NGmat->[0]->[1]->{'omega'}, 0.2507);
+ is($NGmat->[0]->[1]->{'dN'}, 0.0863);
+ is($NGmat->[0]->[1]->{'dS'}, 0.3443);
+
+ is($MLmat->[0]->[1]->{'omega'}, 0.29075);
+ is($MLmat->[0]->[1]->{'dN'}, '0.0874');
+ is($MLmat->[0]->[1]->{'dS'}, 0.3006);
+ is($MLmat->[0]->[1]->{'lnL'}, -1596.739984);
From jason at dev.open-bio.org Thu Nov 1 11:18:12 2007
From: jason at dev.open-bio.org (Jason Stajich)
Date: Thu, 01 Nov 2007 15:18:12 +0000
Subject: [Bioperl-guts-l] bioperl-run/t PAML.t,1.15,1.16
Message-ID: <200711011518.lA1FICnA032195@dev.open-bio.org>
Update of /home/repository/bioperl/bioperl-run/t
In directory dev.open-bio.org:/tmp/cvs-serv32169/t
Modified Files:
PAML.t
Log Message:
PAML4 tests
Index: PAML.t
===================================================================
RCS file: /home/repository/bioperl/bioperl-run/t/PAML.t,v
retrieving revision 1.15
retrieving revision 1.16
diff -C2 -d -r1.15 -r1.16
*** PAML.t 14 Jun 2007 15:23:09 -0000 1.15
--- PAML.t 1 Nov 2007 15:18:10 -0000 1.16
***************
*** 23,27 ****
use Test;
! $NUMTESTS = 18;
plan tests => $NUMTESTS;
--- 23,27 ----
use Test;
! $NUMTESTS = 19;
plan tests => $NUMTESTS;
***************
*** 64,68 ****
use Bio::Tools::Run::Phylo::PAML::Yn00;
use Bio::AlignIO;
! my $codeml = Bio::Tools::Run::Phylo::PAML::Codeml->new(-verbose => $verbose);
unless ($codeml->executable) {
warn("PAML not is installed. skipping tests $Test::ntest to $NUMTESTS\n");
--- 64,73 ----
use Bio::Tools::Run::Phylo::PAML::Yn00;
use Bio::AlignIO;
! my $codeml = Bio::Tools::Run::Phylo::PAML::Codeml->new
! (-params => {'runmode' => -2,
! 'seqtype' => 1,
! 'model' => 0,
! },
! -verbose => $verbose);
unless ($codeml->executable) {
warn("PAML not is installed. skipping tests $Test::ntest to $NUMTESTS\n");
***************
*** 78,81 ****
--- 83,87 ----
ok($rc,1);
+
if( ! defined $results ) {
exit(0);
***************
*** 83,89 ****
my $result = $results->next_result;
my $MLmatrix = $result->get_MLmatrix;
! my ($vnum) = ($result->version =~ /(\d+\.\d+)/);
# PAML 2.12 results
if( $vnum == 3.12 ) {
--- 89,99 ----
my $result = $results->next_result;
+ if( ! defined $result ) {
+ exit(0);
+ }
+
my $MLmatrix = $result->get_MLmatrix;
! my ($vnum) = ($result->version =~ /(\d+(\.\d+)?)/);
# PAML 2.12 results
if( $vnum == 3.12 ) {
***************
*** 94,98 ****
ok($MLmatrix->[0]->[1]->{'N'}, 728.5);
ok($MLmatrix->[0]->[1]->{'t'}, 1.0895);
! } elsif( $vnum >= 3.13 ) {
# PAML 2.13 results
ok($MLmatrix->[0]->[1]->{'dN'}, 0.0713);
--- 104,111 ----
ok($MLmatrix->[0]->[1]->{'N'}, 728.5);
ok($MLmatrix->[0]->[1]->{'t'}, 1.0895);
!
! skip($MLmatrix->[0]->[1]->{'lnL'}, "I don't know what this should be, if you run this part, email the list so we can update the value");
!
! } elsif( $vnum >= 3.13 && $vnum < 4) {
# PAML 2.13 results
ok($MLmatrix->[0]->[1]->{'dN'}, 0.0713);
***************
*** 102,105 ****
--- 115,129 ----
ok($MLmatrix->[0]->[1]->{'N'}, 723.2);
ok(sprintf("%.4f",$MLmatrix->[0]->[1]->{'t'}), 1.1946);
+ skip($MLmatrix->[0]->[1]->{'lnL'}, "I don't know what this should be, if you run this part, email the list so we can update the value");
+
+ } elsif( $vnum == 4 ) {
+ ok($MLmatrix->[0]->[1]->{'dN'}, 0.0693);
+ ok($MLmatrix->[0]->[1]->{'dS'},1.1459);
+ ok(sprintf("%.4f",$MLmatrix->[0]->[1]->{'omega'}), 0.0605);
+ ok($MLmatrix->[0]->[1]->{'S'}, 273.5);
+ ok($MLmatrix->[0]->[1]->{'N'}, 728.5);
+ ok(sprintf("%.4f",$MLmatrix->[0]->[1]->{'t'}), 1.0895);
+ ok($MLmatrix->[0]->[1]->{'lnL'}, -1957.064254);
+
} else {
for( 1..6) {
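The change to the version regular expression in the diff above matters because a version string without a decimal point, such as a bare `4`, does not match `/(\d+\.\d+)/`. A minimal sketch in plain Perl follows; the helper `version_number` is hypothetical and the sample version strings are assumptions, not values taken from a real codeml run.

```perl
use strict;
use warnings;

# Hypothetical helper: return the first capture of $pattern in
# $version_string, or undef when the pattern does not match.
sub version_number {
    my ( $version_string, $pattern ) = @_;
    my ($num) = $version_string =~ $pattern;
    return $num;
}

my $old = qr/(\d+\.\d+)/;       # requires a decimal part
my $new = qr/(\d+(\.\d+)?)/;    # decimal part is optional

for my $version ( '3.15', '4' ) {
    my $with_old = version_number( $version, $old );
    my $with_new = version_number( $version, $new );
    printf "%-5s old=%s new=%s\n", $version,
        defined $with_old ? $with_old : 'undef',
        defined $with_new ? $with_new : 'undef';
}
```

With the optional group, `$vnum` is defined for both the 3.x and 4 style strings, so the numeric comparisons in the test script can select the right branch.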
From jason at dev.open-bio.org Thu Nov 1 11:28:57 2007
From: jason at dev.open-bio.org (Jason Stajich)
Date: Thu, 01 Nov 2007 15:28:57 +0000
Subject: [Bioperl-guts-l] bioperl-run/Bio/Tools/Run/Phylo/PAML Codeml.pm, 1.48, 1.49
Message-ID: <200711011528.lA1FSvuE032317@dev.open-bio.org>
Update of /home/repository/bioperl/bioperl-run/Bio/Tools/Run/Phylo/PAML
In directory dev.open-bio.org:/tmp/cvs-serv32291/Bio/Tools/Run/Phylo/PAML
Modified Files:
Codeml.pm
Log Message:
merge prepare and run codes so that there is no code duplication
Index: Codeml.pm
===================================================================
RCS file: /home/repository/bioperl/bioperl-run/Bio/Tools/Run/Phylo/PAML/Codeml.pm,v
retrieving revision 1.48
retrieving revision 1.49
diff -C2 -d -r1.48 -r1.49
*** Codeml.pm 29 Oct 2007 19:59:59 -0000 1.48
--- Codeml.pm 1 Nov 2007 15:28:55 -0000 1.49
***************
*** 457,480 ****
open(CODEML, ">$codeml_ctl") or $self->throw("cannot open $codeml_ctl for writing");
print CODEML "seqfile = $tempseqfile\n";
-
my $outfile = $self->outfile_name;
! my ($temptreeFH,$temptreefile);
! if( ! ref($tree) && -e $tree ) {
! $temptreefile = $tree;
! } else {
! ($temptreeFH,$temptreefile) = $self->io->tempfile
! ('-dir' => $tempdir,
! UNLINK => ($self->save_tempfiles ? 0 : 1));
!
! my $treeout = Bio::TreeIO->new('-format' => 'newick',
! '-fh' => $temptreeFH);
! $treeout->write_tree($tree);
! $treeout->close();
! close($temptreeFH);
}
- print CODEML "treefile = $temptreefile\n";
- print CODEML "outfile = $outfile\n";
my %params = $self->get_parameters;
while( my ($param,$val) = each %params ) {
--- 457,481 ----
open(CODEML, ">$codeml_ctl") or $self->throw("cannot open $codeml_ctl for writing");
print CODEML "seqfile = $tempseqfile\n";
my $outfile = $self->outfile_name;
+ print CODEML "outfile = $outfile\n";
! if( $tree ) {
! my ($temptreeFH,$temptreefile);
! if( ! ref($tree) && -e $tree ) {
! $temptreefile = $tree;
! } else {
! ($temptreeFH,$temptreefile) = $self->io->tempfile
! ('-dir' => $tempdir,
! UNLINK => ($self->save_tempfiles ? 0 : 1));
!
! my $treeout = Bio::TreeIO->new('-format' => 'newick',
! '-fh' => $temptreeFH);
! $treeout->write_tree($tree);
! $treeout->close();
! close($temptreeFH);
! }
! print CODEML "treefile = $temptreefile\n";
}
my %params = $self->get_parameters;
while( my ($param,$val) = each %params ) {
***************
*** 493,500 ****
=head2 run
Title : run
! Usage : my ($rc,$parser) = $codeml->run($aln);
Function: run the codeml analysis using the default or updated parameters
the alignment parameter must have been set
--- 494,502 ----
+
=head2 run
Title : run
! Usage : my ($rc,$parser) = $codeml->run($aln,$tree);
Function: run the codeml analysis using the default or updated parameters
the alignment parameter must have been set
***************
*** 507,568 ****
sub run {
! my ($self,$aln,$tree) = @_;
! unless ( $self->save_tempfiles ) {
! # brush so we don't get plaque buildup ;)
! $self->cleanup();
! }
! $tree = $self->tree unless $tree;
! $aln = $self->alignment unless $aln;
! if( ! $aln ) {
! $self->warn("must have supplied a valid alignment file in order to run codeml");
! return 0;
! }
! my ($tmpdir) = $self->tempdir();
! my ($tempseqFH,$tempseqfile);
! if( ! ref($aln) && -e $aln ) {
! $tempseqfile = $aln;
! } else {
! ($tempseqFH,$tempseqfile) = $self->io->tempfile
! ('-dir' => $tmpdir,
! UNLINK => ($self->save_tempfiles ? 0 : 1));
! my $alnout = Bio::AlignIO->new('-format' => 'phylip',
! '-fh' => $tempseqFH,
! '-interleaved' => 0,
! '-idlength' => $MINNAMELEN > $aln->maxdisplayname_length() ? $MINNAMELEN : $aln->maxdisplayname_length() +1);
!
! $alnout->write_aln($aln);
! $alnout->close();
! undef $alnout;
! close($tempseqFH);
! undef $tempseqFH;
! }
! # now let's print the codeml.ctl file.
! # many of the these programs are finicky about what the filename is
! # and won't even run without the properly named file. Ack
!
! my $codeml_ctl = "$tmpdir/codeml.ctl";
! open(my $mlfh, ">$codeml_ctl") or $self->throw("cannot open $codeml_ctl for writing");
! print $mlfh "seqfile = $tempseqfile\n";
!
my $outfile = $self->outfile_name;
!
! if( $tree ) {
! my ($temptreeFH,$temptreefile) = $self->io->tempfile
! ('-dir' => $tmpdir,
! UNLINK => ($self->save_tempfiles ? 0 : 1));
!
! my $treeout = Bio::TreeIO->new('-format' => 'newick',
! '-fh' => $temptreeFH);
! $treeout->write_tree($tree);
! $treeout->close();
! close($temptreeFH);
! print $mlfh "treefile = $temptreefile\n";
! }
! print $mlfh "outfile = $outfile\n";
! my %params = $self->get_parameters;
! while( my ($param,$val) = each %params ) {
! print $mlfh "$param = $val\n";
! }
! close($mlfh);
my ($rc,$parser) = (1);
--- 509,515 ----
sub run {
! my ($self) = shift;;
my $outfile = $self->outfile_name;
! my $tmpdir = $self->prepare(@_);
my ($rc,$parser) = (1);
***************
*** 592,600 ****
if( $@ ) {
$self->warn($self->error_string);
! }
!
chdir($cwd);
! }
!
return ($rc,$parser);
}
--- 539,545 ----
if( $@ ) {
$self->warn($self->error_string);
! }
chdir($cwd);
! }
return ($rc,$parser);
}
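In the diff above, the control file is written in one place and the `treefile` line is emitted only when a tree was supplied, which is what a pairwise run (`runmode = -2`) without a tree needs. The following plain-Perl sketch illustrates that idea only. `codeml_ctl_text` is a hypothetical helper, not a method of `Codeml.pm`, and the file names are made up for the example.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Codeml.pm: build the text of a codeml.ctl
# file. The treefile line is emitted only when a tree file is supplied.
sub codeml_ctl_text {
    my (%args) = @_;
    my @lines = (
        "seqfile = $args{seqfile}",
        "outfile = $args{outfile}",
    );
    push @lines, "treefile = $args{treefile}" if defined $args{treefile};
    my $params = $args{params} || {};
    # Sort the keys so the output order is stable from run to run.
    push @lines, map { "$_ = $params->{$_}" } sort keys %$params;
    return join( "\n", @lines ) . "\n";
}

# Pairwise run without a tree: no treefile line is written.
my $ctl = codeml_ctl_text(
    seqfile => 'aln.phy',
    outfile => 'mlc',
    params  => { runmode => -2, seqtype => 1, model => 0 },
);
print $ctl;
```

Keeping the file construction in a single routine is the same design choice as having `run` call `prepare`: there is only one place to change when the control-file layout changes.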
From sendu at dev.open-bio.org Fri Nov 2 12:29:52 2007
From: sendu at dev.open-bio.org (Senduran Balasubramaniam)
Date: Fri, 02 Nov 2007 16:29:52 +0000
Subject: [Bioperl-guts-l] bioperl-live/Bio/SeqFeature Annotated.pm, 1.42,
1.43
Message-ID: <200711021629.lA2GTqVH002565@dev.open-bio.org>
Update of /home/repository/bioperl/bioperl-live/Bio/SeqFeature
In directory dev.open-bio.org:/tmp/cvs-serv2534/Bio/SeqFeature
Modified Files:
Annotated.pm
Log Message:
fixed error when source() sets its own default value
Index: Annotated.pm
===================================================================
RCS file: /home/repository/bioperl/bioperl-live/Bio/SeqFeature/Annotated.pm,v
retrieving revision 1.42
retrieving revision 1.43
diff -C2 -d -r1.42 -r1.43
*** Annotated.pm 26 Oct 2007 16:47:08 -0000 1.42
--- Annotated.pm 2 Nov 2007 16:29:49 -0000 1.43
***************
*** 466,470 ****
unless ($self->get_Annotations('source')) {
! $self->source('.');
}
return $self->get_Annotations('source');
--- 466,470 ----
unless ($self->get_Annotations('source')) {
! $self->source(Bio::Annotation::SimpleValue->new(-value => '.'));
}
return $self->get_Annotations('source');
From bugzilla-daemon at portal.open-bio.org Fri Nov 2 12:30:07 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Fri, 2 Nov 2007 12:30:07 -0400
Subject: [Bioperl-guts-l] [Bug 2388] Incorrect parsing of wu-blast report
when echofilter option (of wu-blast) is used
In-Reply-To:
Message-ID: <200711021630.lA2GU78h003492@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2388
online at davemessina.com changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |online at davemessina.com
Status|NEW |ASSIGNED
------- Comment #3 from online at davemessina.com 2007-11-02 12:30 EST -------
Yep, I can confirm this behavior still exists on cvs HEAD. I'm poking through
the code to fix...
Dave
--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
From bugzilla-daemon at portal.open-bio.org Fri Nov 2 15:58:22 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Fri, 2 Nov 2007 15:58:22 -0400
Subject: [Bioperl-guts-l] [Bug 2392] New: Bio::Tools::Geneid - unable to
parse target id using _target_id
Message-ID:
http://bugzilla.open-bio.org/show_bug.cgi?id=2392
Summary: Bio::Tools::Geneid - unable to parse target id using
_target_id
Product: BioPerl
Version: unspecified
Platform: PC
OS/Version: Linux
Status: NEW
Severity: normal
Priority: P2
Component: Core Components
AssignedTo: bioperl-guts-l at bioperl.org
ReportedBy: dschneid at bsd.uchicago.edu
I am trying to pull out the sequence ID from my FASTA files using the
Bio::Tools::Geneid _target_id method. The method ends up giving me nothing
because it only looks for a ">" FASTA header. I am currently using the latest
version of geneid, with the default (geneid) output format.
############
here is an example of a geneid output where I am trying to fetch DDB0232583
## date Fri Sep 28 18:48:00 2007
## source-version: geneid_v1.3 -- geneid at imim.es
# Sequence DDB0232583 - Length = 196197 bps
# Optimal Gene Structure. 84 genes. Score = 10080.86
# Gene 1 (Forward). 2 exons. 138 aa. Score = 22.55
First 187 259 -4.24 + 0 1 8.34 9.35 -7.17 0.00AA
1: 25 DDB0232583_1
Terminal 1778 2118 26.78 + 2 0 -4.01 5.49 71.08 0.00AA
25:138 DDB0232583_1
# Gene 2 (Reverse). 5 exons. 679 aa. Score = 103.62
Terminal 3271 3542 9.96 - 2 0 2.20 5.13 31.58 0.00AA
589:679 DDB0232583_2
Internal 3840 4724 59.44 - 2 1 5.72 1.11 131.05 0.00AA
294:589 DDB0232583_2
Internal 4823 4946 4.39 - 0 1 2.56 12.21 13.01 0.00AA
253:294 DDB0232583_2
Internal 5261 5495 14.84 - 1 0 7.77 11.83 29.08 0.00AA
174:252 DDB0232583_2
First 5632 6152 14.99 - 0 2 7.19 8.94 32.85 0.00AA
1:174 DDB0232583_2
#################################
# in order to get this function to work properly I had to go in and make some
slight changes to the Geneid.pm file.
#first I added a new "or" statement
if (/^>(\S+)\|GeneId/ or /^# Sequence (\S+)/) ## ln 150
# also I removed "unless defined $self->_target_id;" in order to continue
# generating new sequence IDs in the case there are many outputs in one file
$self->_target_id($target_id) unless defined $self->_target_id;
# becomes
$self->_target_id($target_id);
###########
# patch file to be attached
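A minimal stand-alone sketch of the two changes described above, not the attached patch itself. The helper name target_ids and the sample lines are hypothetical; only the two regular expressions come from the report. It accepts either header style and records an ID every time one is seen, which is the effect of dropping the "unless defined" guard.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bio::Tools::Geneid): collect every
# sequence ID announced in a geneid report, accepting both header styles.
sub target_ids {
    my @lines = @_;
    my @ids;
    for my $line (@lines) {
        # Either the ">ID|GeneId" header or the "# Sequence ID" comment line.
        if ( $line =~ /^>(\S+)\|GeneId/ or $line =~ /^# Sequence (\S+)/ ) {
            push @ids, $1;    # keep each ID, not only the first one seen
        }
    }
    return @ids;
}

# Sample lines taken from the report above.
my @report = (
    '## date Fri Sep 28 18:48:00 2007',
    '# Sequence DDB0232583 - Length = 196197 bps',
    '# Gene 1 (Forward). 2 exons. 138 aa. Score = 22.55',
);
print join( ',', target_ids(@report) ), "\n";
```

Returning a list rather than a single value is only a convenience for the sketch; the module itself stores one current target ID at a time.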
--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
From bugzilla-daemon at portal.open-bio.org Fri Nov 2 16:02:22 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Fri, 2 Nov 2007 16:02:22 -0400
Subject: [Bioperl-guts-l] [Bug 2392] Bio::Tools::Geneid - unable to parse
target id using _target_id
In-Reply-To:
Message-ID: <200711022002.lA2K2MJ0012815@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2392
------- Comment #1 from dschneid at bsd.uchicago.edu 2007-11-02 16:02 EST -------
Created an attachment (id=802)
--> (http://bugzilla.open-bio.org/attachment.cgi?id=802&action=view)
patch for Geneid.pm file
--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
From dave_messina at dev.open-bio.org Sat Nov 3 20:11:47 2007
From: dave_messina at dev.open-bio.org (Dave Messina)
Date: Sun, 04 Nov 2007 00:11:47 +0000
Subject: [Bioperl-guts-l] bioperl-live/t SearchIO.t,1.120,1.121
Message-ID: <200711040011.lA40BlNv005331@dev.open-bio.org>
Update of /home/repository/bioperl/bioperl-live/t
In directory dev.open-bio.org:/tmp/cvs-serv5289/t
Modified Files:
SearchIO.t
Log Message:
Bug 2388 fix - added support for WU-BLAST -echofilter option. added test to SearchIO.t.
All SearchIO tests pass.
Index: SearchIO.t
===================================================================
RCS file: /home/repository/bioperl/bioperl-live/t/SearchIO.t,v
retrieving revision 1.120
retrieving revision 1.121
diff -C2 -d -r1.120 -r1.121
*** SearchIO.t 29 Aug 2007 15:13:50 -0000 1.120
--- SearchIO.t 4 Nov 2007 00:11:45 -0000 1.121
***************
*** 8,12 ****
use BioperlTest;
! test_begin(-tests => 1449);
use_ok('Bio::SearchIO');
--- 8,12 ----
use BioperlTest;
! test_begin(-tests => 1483);
use_ok('Bio::SearchIO');
***************
*** 1193,1196 ****
--- 1193,1255 ----
is($count, 2);
+ # WU-BLAST -echofilter option test (Bug 2388)
+ $searchio = Bio::SearchIO->new('-format' => 'blast',
+ '-file' => test_input_file('echofilter.wublastn'));
+
+ $result = $searchio->next_result;
+
+ is($result->database_name, 'NM_003201.fa');
+ is($result->database_letters, 1936);
+ is($result->database_entries, 1);
+ is($result->algorithm, 'BLASTN');
+ like($result->algorithm_version, qr/^2\.0MP\-WashU/);
+ like($result->query_name, qr/ref|NM_003201.1| Homo sapiens transcription factor A, mitochondrial \(TFAM\), mRNA/);
+ is($result->query_accession, 'NM_003201.1');
+
+ is($result->query_length, 1936);
+ is($result->get_statistic('lambda'), 0.192);
+ is($result->get_statistic('kappa'), 0.182);
+ is($result->get_statistic('entropy'), 0.357);
+ is($result->get_statistic('dbletters'), 1936);
+ is($result->get_statistic('dbentries'), 1);
+ is($result->get_parameter('matrix'), '+5,-4');
+
+ @valid = ( [ 'ref|NM_003201.1|', 1936, 'NM_003201', '0', 9680],);
+ $count = 0;
+ while( $hit = $result->next_hit ) {
+ my $d = shift @valid;
+
+ is($hit->name, shift @$d);
+ is($hit->length, shift @$d);
+ is($hit->accession, shift @$d);
+ is(sprintf("%g",$hit->significance), sprintf("%g",shift @$d) );
+ is($hit->raw_score, shift @$d );
+
+ if( $count == 0 ) {
+ my $hsps_left = 1;
+ while( my $hsp = $hit->next_hsp ) {
+ is($hsp->query->start, 1);
+ is($hsp->query->end, 1936);
+ is($hsp->hit->start, 1);
+ is($hsp->hit->end, 1936);
+ is($hsp->length('hsp'), 1936);
+
+ is($hsp->evalue , '0.');
+ is($hsp->pvalue , '0.');
+ is($hsp->score, 9680);
+ is($hsp->bits,1458.4);
+ is($hsp->percent_identity, 100);
+ is($hsp->frac_identical('query'), 1.00);
+ is($hsp->frac_identical('hit'), 1.00);
+ is($hsp->gaps, 0);
+ $hsps_left--;
+ }
+ is($hsps_left, 0);
+ }
+ last if( $count++ > @valid );
+ }
+ is(@valid, 0);
+
+
# Do a multiblast report test
$searchio = Bio::SearchIO->new('-format' => 'blast',
From dave_messina at dev.open-bio.org Sat Nov 3 20:11:47 2007
From: dave_messina at dev.open-bio.org (Dave Messina)
Date: Sun, 04 Nov 2007 00:11:47 +0000
Subject: [Bioperl-guts-l] bioperl-live/t/data echofilter.wublastn,NONE,1.1
Message-ID: <200711040011.lA40BlUv005334@dev.open-bio.org>
Update of /home/repository/bioperl/bioperl-live/t/data
In directory dev.open-bio.org:/tmp/cvs-serv5289/t/data
Added Files:
echofilter.wublastn
Log Message:
Bug 2388 fix - added support for WU-BLAST -echofilter option. added test to SearchIO.t.
All SearchIO tests pass.
--- NEW FILE: echofilter.wublastn ---
BLASTN 2.0MP-WashU [04-May-2006] [macosx-10.2-g4-ILP32F64 2006-05-07T21:56:24]
Copyright (C) 1996-2006 Washington University, Saint Louis, Missouri USA.
All Rights Reserved.
Reference: Gish, W. (1996-2006) http://blast.wustl.edu
Notice: this program and its default parameter settings are optimized to find
nearly identical sequences rapidly. To identify weak protein similarities
encoded in nucleic acid, use BLASTX, TBLASTN or TBLASTX.
Query= ref|NM_003201.1| Homo sapiens transcription factor A, mitochondrial
(TFAM), mRNA
(1936 letters)
>Unfiltered+1
CCTCGCTAGTGGCGGGCATGATAACACACGCCGGAGGGTCGCACGCGGGTTCCAGTTGTG
ATTGCTGGAGTTGTGTATTGCCAGGAGGCTCTCCGAGATTGGGGTCGGGTCACTGCCTCA
TCCACCGGAGCGATGGCGTTTCTCCGAAGCATGTGGGGCGTGCTGAGTGCCCTGGGAAGG
TCTGGAGCAGAGCTGTGCACCGGCTGTGGAAGTCGACTGCGCTCCCCCTTCAGTTTTGTG
TATTTACCGAGGTGGTTTTCATCTGTCTTGGCAAGTTGTCCAAAGAAACCTGTAAGTTCT
TACCTTCGATTTTCTAAAGAACAACTACCCATATTTAAAGCTCAGAACCCAGATGCAAAA
ACTACAGAACTAATTAGAAGAATTGCCCAGCGTTGGAGGGAACTTCCTGATTCAAAGAAA
AAAATATATCAAGATGCTTATAGGGCGGAGTGGCAGGTATATAAAGAAGAGATAAGCAGA
TTTAAAGAACAGCTAACTCCAAGTCAGATTATGTCTTTGGAAAAAGAAATCATGGACAAA
CATTTAAAAAGGAAAGCTATGACAAAAAAAAAAGAGTTAACACTGCTTGGAAAACCAAAA
AGACCTCGTTCAGCTTATAACGTTTATGTAGCTGAAAGATTCCAAGAAGCTAAGGGTGAT
TCACCGCAGGAAAAGCTGAAGACTGTAAAGGAAAACTGGAAAAATCTGTCTGACTCTGAA
AAGGAATTATATATTCAGCATGCTAAAGAGGACGAAACTCGTTATCATAATGAAATGAAG
TCTTGGGAAGAACAAATGATTGAAGTTGGACGAAAGGATCTTCTACGTCGCACAATAAAG
AAACAACGAAAATATGGTGCTGAGGAGTGTTAAAAGTAGAAGATTGAGATGTGTTCACAA
TGGATAGGCACAGGAAACCAGTTAGGTCTCAATACCTGAAGCTATCGTAAAATTAAGAAA
GGATAAAGTTGGTAAACCTTTTATATTTAGTATCTTTTTATTCAGCTCATGGACTTCTGC
CAGCATAATACTTGCTTTGGAAAACCCAGATAAAGGTTCATGCAAACTTTATTTTGTGTT
TAGGAACTACTGAGGATCAGAGTAATCCAAGCAAATGTGAATCATTTTACCTTTGACAAA
GGTAAATCAGACTATGAAGTTTTTTTTATACAGGATGATGACTATGGAAAGAGTACTCTT
GTTTCCTTATATTATGGAGGCAGGAGTTTCGTTTTCAAAATTGTTACAAATTGTAGAAGC
CACGGTGTTCTGTGATATAAGTGTGTGTTTTTCATAAAGCAGGCAGAACTCATCTAGGTA
AATTACAGTTCCTAGGTATAATTCACATTGTATTCAGAGTTGATGGTTGTACATATAAGT
GATTGCTGGTTTTAGTTGCAACTTTGTATAAAAGGGACTGAGAAATTTATAAACTTTTTT
CTTACTGTCTTTTTTCTAAAGTAAAAACAAAGAAATTATGTGCCAGATTTATGCATATTA
TTTTATGTTGCATAGAATAAAATTTTTAATCTTTAATTTTACATTTCCTAAATATATTTT
AAGACGAAACATTTGTTCTATAGCTTTTCCCTTTTTTTAAGTAAGGAATTTTATTTTTTT
CTGAATTATTTTCTCTCGTGAGTATATTGATCCAGAAAGAAAACTTGTATTATGTGTGTT
TTAAAATGAGAAATCTAAAAAACGAAAAGTCTCCAAAGTCTCTGGAATTTGAAACACTTT
GCATAACGTATAAAAGCCTGTTTAAGAGACAGCCAACTATGGCCTGTGGATCAAATCCAG
CCTGCTGCCTGCTTTTTATGGCCTGTGAGCTAGGAATTGTGTTTATAATTTTAAATGTTT
TTTTTTAAAGACTTTTATGATACTTGAAAATTAACATGAATATTTAGTGTTCATAAATAA
AGTTTGTTGAAACACA
>Unfiltered-1
TGTGTTTCAACAAACTTTATTTATGAACACTAAATATTCATGTTAATTTTCAAGTATCAT
AAAAGTCTTTAAAAAAAAACATTTAAAATTATAAACACAATTCCTAGCTCACAGGCCATA
AAAAGCAGGCAGCAGGCTGGATTTGATCCACAGGCCATAGTTGGCTGTCTCTTAAACAGG
CTTTTATACGTTATGCAAAGTGTTTCAAATTCCAGAGACTTTGGAGACTTTTCGTTTTTT
AGATTTCTCATTTTAAAACACACATAATACAAGTTTTCTTTCTGGATCAATATACTCACG
AGAGAAAATAATTCAGAAAAAAATAAAATTCCTTACTTAAAAAAAGGGAAAAGCTATAGA
ACAAATGTTTCGTCTTAAAATATATTTAGGAAATGTAAAATTAAAGATTAAAAATTTTAT
TCTATGCAACATAAAATAATATGCATAAATCTGGCACATAATTTCTTTGTTTTTACTTTA
GAAAAAAGACAGTAAGAAAAAAGTTTATAAATTTCTCAGTCCCTTTTATACAAAGTTGCA
ACTAAAACCAGCAATCACTTATATGTACAACCATCAACTCTGAATACAATGTGAATTATA
CCTAGGAACTGTAATTTACCTAGATGAGTTCTGCCTGCTTTATGAAAAACACACACTTAT
ATCACAGAACACCGTGGCTTCTACAATTTGTAACAATTTTGAAAACGAAACTCCTGCCTC
CATAATATAAGGAAACAAGAGTACTCTTTCCATAGTCATCATCCTGTATAAAAAAAACTT
CATAGTCTGATTTACCTTTGTCAAAGGTAAAATGATTCACATTTGCTTGGATTACTCTGA
TCCTCAGTAGTTCCTAAACACAAAATAAAGTTTGCATGAACCTTTATCTGGGTTTTCCAA
AGCAAGTATTATGCTGGCAGAAGTCCATGAGCTGAATAAAAAGATACTAAATATAAAAGG
TTTACCAACTTTATCCTTTCTTAATTTTACGATAGCTTCAGGTATTGAGACCTAACTGGT
TTCCTGTGCCTATCCATTGTGAACACATCTCAATCTTCTACTTTTAACACTCCTCAGCAC
CATATTTTCGTTGTTTCTTTATTGTGCGACGTAGAAGATCCTTTCGTCCAACTTCAATCA
TTTGTTCTTCCCAAGACTTCATTTCATTATGATAACGAGTTTCGTCCTCTTTAGCATGCT
GAATATATAATTCCTTTTCAGAGTCAGACAGATTTTTCCAGTTTTCCTTTACAGTCTTCA
GCTTTTCCTGCGGTGAATCACCCTTAGCTTCTTGGAATCTTTCAGCTACATAAACGTTAT
AAGCTGAACGAGGTCTTTTTGGTTTTCCAAGCAGTGTTAACTCTTTTTTTTTTGTCATAG
CTTTCCTTTTTAAATGTTTGTCCATGATTTCTTTTTCCAAAGACATAATCTGACTTGGAG
TTAGCTGTTCTTTAAATCTGCTTATCTCTTCTTTATATACCTGCCACTCCGCCCTATAAG
CATCTTGATATATTTTTTTCTTTGAATCAGGAAGTTCCCTCCAACGCTGGGCAATTCTTC
TAATTAGTTCTGTAGTTTTTGCATCTGGGTTCTGAGCTTTAAATATGGGTAGTTGTTCTT
TAGAAAATCGAAGGTAAGAACTTACAGGTTTCTTTGGACAACTTGCCAAGACAGATGAAA
ACCACCTCGGTAAATACACAAAACTGAAGGGGGAGCGCAGTCGACTTCCACAGCCGGTGC
ACAGCTCTGCTCCAGACCTTCCCAGGGCACTCAGCACGCCCCACATGCTTCGGAGAAACG
CCATCGCTCCGGTGGATGAGGCAGTGACCCGACCCCAATCTCGGAGAGCCTCCTGGCAAT
ACACAACTCCAGCAATCACAACTGGAACCCGCGTGCGACCCTCCGGCGTGTGTTATCATG
CCCGCCACTAGCGAGG
Database: NM_003201.fa
1 sequences; 1936 total letters.
Searching done
Smallest
Sum
High Probability
Sequences producing High-scoring Segment Pairs: Score P(N) N
ref|NM_003201.1| Homo sapiens transcription factor A, mit... 9680 0. 1
>ref|NM_003201.1| Homo sapiens transcription factor A, mitochondrial (TFAM),
mRNA
Length = 1936
Plus Strand HSPs:
Score = 9680 (1458.4 bits), Expect = 0., P = 0.
Identities = 1936/1936 (100%), Positives = 1936/1936 (100%), Strand = Plus / Plus
Query: 1 CCTCGCTAGTGGCGGGCATGATAACACACGCCGGAGGGTCGCACGCGGGTTCCAGTTGTG 60
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 1 CCTCGCTAGTGGCGGGCATGATAACACACGCCGGAGGGTCGCACGCGGGTTCCAGTTGTG 60
Query: 61 ATTGCTGGAGTTGTGTATTGCCAGGAGGCTCTCCGAGATTGGGGTCGGGTCACTGCCTCA 120
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 61 ATTGCTGGAGTTGTGTATTGCCAGGAGGCTCTCCGAGATTGGGGTCGGGTCACTGCCTCA 120
Query: 121 TCCACCGGAGCGATGGCGTTTCTCCGAAGCATGTGGGGCGTGCTGAGTGCCCTGGGAAGG 180
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 121 TCCACCGGAGCGATGGCGTTTCTCCGAAGCATGTGGGGCGTGCTGAGTGCCCTGGGAAGG 180
Query: 181 TCTGGAGCAGAGCTGTGCACCGGCTGTGGAAGTCGACTGCGCTCCCCCTTCAGTTTTGTG 240
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 181 TCTGGAGCAGAGCTGTGCACCGGCTGTGGAAGTCGACTGCGCTCCCCCTTCAGTTTTGTG 240
Query: 241 TATTTACCGAGGTGGTTTTCATCTGTCTTGGCAAGTTGTCCAAAGAAACCTGTAAGTTCT 300
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 241 TATTTACCGAGGTGGTTTTCATCTGTCTTGGCAAGTTGTCCAAAGAAACCTGTAAGTTCT 300
Query: 301 TACCTTCGATTTTCTAAAGAACAACTACCCATATTTAAAGCTCAGAACCCAGATGCAAAA 360
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 301 TACCTTCGATTTTCTAAAGAACAACTACCCATATTTAAAGCTCAGAACCCAGATGCAAAA 360
Query: 361 ACTACAGAACTAATTAGAAGAATTGCCCAGCGTTGGAGGGAACTTCCTGATTCAAAGAAA 420
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 361 ACTACAGAACTAATTAGAAGAATTGCCCAGCGTTGGAGGGAACTTCCTGATTCAAAGAAA 420
Query: 421 AAAATATATCAAGATGCTTATAGGGCGGAGTGGCAGGTATATAAAGAAGAGATAAGCAGA 480
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 421 AAAATATATCAAGATGCTTATAGGGCGGAGTGGCAGGTATATAAAGAAGAGATAAGCAGA 480
Query: 481 TTTAAAGAACAGCTAACTCCAAGTCAGATTATGTCTTTGGAAAAAGAAATCATGGACAAA 540
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 481 TTTAAAGAACAGCTAACTCCAAGTCAGATTATGTCTTTGGAAAAAGAAATCATGGACAAA 540
Query: 541 CATTTAAAAAGGAAAGCTATGACAAAAAAAAAAGAGTTAACACTGCTTGGAAAACCAAAA 600
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 541 CATTTAAAAAGGAAAGCTATGACAAAAAAAAAAGAGTTAACACTGCTTGGAAAACCAAAA 600
Query: 601 AGACCTCGTTCAGCTTATAACGTTTATGTAGCTGAAAGATTCCAAGAAGCTAAGGGTGAT 660
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 601 AGACCTCGTTCAGCTTATAACGTTTATGTAGCTGAAAGATTCCAAGAAGCTAAGGGTGAT 660
Query: 661 TCACCGCAGGAAAAGCTGAAGACTGTAAAGGAAAACTGGAAAAATCTGTCTGACTCTGAA 720
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 661 TCACCGCAGGAAAAGCTGAAGACTGTAAAGGAAAACTGGAAAAATCTGTCTGACTCTGAA 720
Query: 721 AAGGAATTATATATTCAGCATGCTAAAGAGGACGAAACTCGTTATCATAATGAAATGAAG 780
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 721 AAGGAATTATATATTCAGCATGCTAAAGAGGACGAAACTCGTTATCATAATGAAATGAAG 780
Query: 781 TCTTGGGAAGAACAAATGATTGAAGTTGGACGAAAGGATCTTCTACGTCGCACAATAAAG 840
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 781 TCTTGGGAAGAACAAATGATTGAAGTTGGACGAAAGGATCTTCTACGTCGCACAATAAAG 840
Query: 841 AAACAACGAAAATATGGTGCTGAGGAGTGTTAAAAGTAGAAGATTGAGATGTGTTCACAA 900
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 841 AAACAACGAAAATATGGTGCTGAGGAGTGTTAAAAGTAGAAGATTGAGATGTGTTCACAA 900
Query: 901 TGGATAGGCACAGGAAACCAGTTAGGTCTCAATACCTGAAGCTATCGTAAAATTAAGAAA 960
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 901 TGGATAGGCACAGGAAACCAGTTAGGTCTCAATACCTGAAGCTATCGTAAAATTAAGAAA 960
Query: 961 GGATAAAGTTGGTAAACCTTTTATATTTAGTATCTTTTTATTCAGCTCATGGACTTCTGC 1020
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 961 GGATAAAGTTGGTAAACCTTTTATATTTAGTATCTTTTTATTCAGCTCATGGACTTCTGC 1020
Query: 1021 CAGCATAATACTTGCTTTGGAAAACCCAGATAAAGGTTCATGCAAACTTTATTTTGTGTT 1080
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 1021 CAGCATAATACTTGCTTTGGAAAACCCAGATAAAGGTTCATGCAAACTTTATTTTGTGTT 1080
Query: 1081 TAGGAACTACTGAGGATCAGAGTAATCCAAGCAAATGTGAATCATTTTACCTTTGACAAA 1140
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 1081 TAGGAACTACTGAGGATCAGAGTAATCCAAGCAAATGTGAATCATTTTACCTTTGACAAA 1140
Query: 1141 GGTAAATCAGACTATGAAGTTTTTTTTATACAGGATGATGACTATGGAAAGAGTACTCTT 1200
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 1141 GGTAAATCAGACTATGAAGTTTTTTTTATACAGGATGATGACTATGGAAAGAGTACTCTT 1200
Query: 1201 GTTTCCTTATATTATGGAGGCAGGAGTTTCGTTTTCAAAATTGTTACAAATTGTAGAAGC 1260
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 1201 GTTTCCTTATATTATGGAGGCAGGAGTTTCGTTTTCAAAATTGTTACAAATTGTAGAAGC 1260
Query: 1261 CACGGTGTTCTGTGATATAAGTGTGTGTTTTTCATAAAGCAGGCAGAACTCATCTAGGTA 1320
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 1261 CACGGTGTTCTGTGATATAAGTGTGTGTTTTTCATAAAGCAGGCAGAACTCATCTAGGTA 1320
Query: 1321 AATTACAGTTCCTAGGTATAATTCACATTGTATTCAGAGTTGATGGTTGTACATATAAGT 1380
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 1321 AATTACAGTTCCTAGGTATAATTCACATTGTATTCAGAGTTGATGGTTGTACATATAAGT 1380
Query: 1381 GATTGCTGGTTTTAGTTGCAACTTTGTATAAAAGGGACTGAGAAATTTATAAACTTTTTT 1440
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 1381 GATTGCTGGTTTTAGTTGCAACTTTGTATAAAAGGGACTGAGAAATTTATAAACTTTTTT 1440
Query: 1441 CTTACTGTCTTTTTTCTAAAGTAAAAACAAAGAAATTATGTGCCAGATTTATGCATATTA 1500
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 1441 CTTACTGTCTTTTTTCTAAAGTAAAAACAAAGAAATTATGTGCCAGATTTATGCATATTA 1500
Query: 1501 TTTTATGTTGCATAGAATAAAATTTTTAATCTTTAATTTTACATTTCCTAAATATATTTT 1560
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 1501 TTTTATGTTGCATAGAATAAAATTTTTAATCTTTAATTTTACATTTCCTAAATATATTTT 1560
Query: 1561 AAGACGAAACATTTGTTCTATAGCTTTTCCCTTTTTTTAAGTAAGGAATTTTATTTTTTT 1620
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 1561 AAGACGAAACATTTGTTCTATAGCTTTTCCCTTTTTTTAAGTAAGGAATTTTATTTTTTT 1620
Query: 1621 CTGAATTATTTTCTCTCGTGAGTATATTGATCCAGAAAGAAAACTTGTATTATGTGTGTT 1680
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 1621 CTGAATTATTTTCTCTCGTGAGTATATTGATCCAGAAAGAAAACTTGTATTATGTGTGTT 1680
Query: 1681 TTAAAATGAGAAATCTAAAAAACGAAAAGTCTCCAAAGTCTCTGGAATTTGAAACACTTT 1740
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 1681 TTAAAATGAGAAATCTAAAAAACGAAAAGTCTCCAAAGTCTCTGGAATTTGAAACACTTT 1740
Query: 1741 GCATAACGTATAAAAGCCTGTTTAAGAGACAGCCAACTATGGCCTGTGGATCAAATCCAG 1800
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 1741 GCATAACGTATAAAAGCCTGTTTAAGAGACAGCCAACTATGGCCTGTGGATCAAATCCAG 1800
Query: 1801 CCTGCTGCCTGCTTTTTATGGCCTGTGAGCTAGGAATTGTGTTTATAATTTTAAATGTTT 1860
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 1801 CCTGCTGCCTGCTTTTTATGGCCTGTGAGCTAGGAATTGTGTTTATAATTTTAAATGTTT 1860
Query: 1861 TTTTTTAAAGACTTTTATGATACTTGAAAATTAACATGAATATTTAGTGTTCATAAATAA 1920
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 1861 TTTTTTAAAGACTTTTATGATACTTGAAAATTAACATGAATATTTAGTGTTCATAAATAA 1920
Query: 1921 AGTTTGTTGAAACACA 1936
||||||||||||||||
Sbjct: 1921 AGTTTGTTGAAACACA 1936
Parameters:
echofilter
ctxfactor=2.00
E=10
Query ----- As Used ----- ----- Computed ----
Strand MatID Matrix name Lambda K H Lambda K H
+1 0 +5,-4 0.192 0.182 0.357 same same same
Q=10,R=10 0.104 0.0151 0.0600 n/a n/a n/a
-1 0 +5,-4 0.192 0.182 0.357 same same same
Q=10,R=10 0.104 0.0151 0.0600 n/a n/a n/a
Query
Strand MatID Length Eff.Length E S W T X E2 S2
+1 0 1936 1936 9.2 89 11 n/a 73 0.042 83
134 2.1 89
-1 0 1936 1936 9.2 89 11 n/a 73 0.042 83
134 2.1 89
Statistics:
Database: NM_003201.fa
Title: NM_003201.fa
Posted: 7:39:50 PM CET Nov 2, 2007
Created: 7:39:50 PM CET Nov 2, 2007
Format: XDF-1
# of letters in database: 1936
# of sequences in database: 1
# of database sequences satisfying E: 1
No. of states in DFA: 257 (257 KB)
Total size of DFA: 344 KB (2092 KB)
Time to generate neighborhood: 0.00u 0.00s 0.00t Elapsed: 00:00:00
No. of threads or processors used: 1
Search cpu time: 0.00u 0.01s 0.01t Elapsed: 00:00:00
Total cpu time: 0.00u 0.01s 0.01t Elapsed: 00:00:00
Start: Fri Nov 2 19:42:26 2007 End: Fri Nov 2 19:42:26 2007
From dave_messina at dev.open-bio.org Sat Nov 3 20:11:47 2007
From: dave_messina at dev.open-bio.org (Dave Messina)
Date: Sun, 04 Nov 2007 00:11:47 +0000
Subject: [Bioperl-guts-l] bioperl-live/Bio/SearchIO blast.pm,1.119,1.120
Message-ID: <200711040011.lA40BlwH005328@dev.open-bio.org>
Update of /home/repository/bioperl/bioperl-live/Bio/SearchIO
In directory dev.open-bio.org:/tmp/cvs-serv5289/Bio/SearchIO
Modified Files:
blast.pm
Log Message:
Bug 2388 fix - added support for WU-BLAST -echofilter option. added test to SearchIO.t.
All SearchIO tests pass.
Index: blast.pm
===================================================================
RCS file: /home/repository/bioperl/bioperl-live/Bio/SearchIO/blast.pm,v
retrieving revision 1.119
retrieving revision 1.120
diff -C2 -d -r1.119 -r1.120
*** blast.pm 16 Jul 2007 03:41:32 -0000 1.119
--- blast.pm 4 Nov 2007 00:11:44 -0000 1.120
***************
*** 22,25 ****
--- 22,26 ----
# from WU-BLAST in frame-specific manner
# 20060216 - cjf - fixed blast parsing for BLAST v2.2.13 output
+ # 20071104 - dmessina - added support for WUBLAST -echofilter
=head1 NAME
***************
*** 578,581 ****
--- 579,591 ----
);
}
+ # added check for WU-BLAST -echofilter option (bug 2388)
+ elsif (/^>Unfiltered[+-]1$/) {
+ # skip all of the lines of unfiltered sequence
+ while($_ !~ /^Database:/) {
+ $self->debug("Bypassing features line: $_");
+ $_ = $self->_readline;
+ }
+ $self->_pushback($_);
+ }
elsif (/Sequences producing significant alignments:/) {
$self->debug("blast.pm: Processing NCBI-BLAST descriptions\n");
From bugzilla-daemon at portal.open-bio.org Sat Nov 3 20:13:35 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Sat, 3 Nov 2007 20:13:35 -0400
Subject: [Bioperl-guts-l] [Bug 2388] Incorrect parsing of wu-blast report
when echofilter option (of wu-blast) is used
In-Reply-To:
Message-ID: <200711040013.lA40DZBO012816@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2388
online at davemessina.com changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|ASSIGNED |RESOLVED
Resolution| |FIXED
------- Comment #4 from online at davemessina.com 2007-11-03 20:13 EST -------
I just committed a fix and a test for this. Thanks for the report and the test
case, Joe!
Dave
--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
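The idea behind the bug 2388 fix can be sketched outside the parser: everything from a ">Unfiltered+1" or ">Unfiltered-1" line up to the next "Database:" line is echoed query sequence and is skipped. The helper name strip_echofilter is hypothetical, and the sketch works on a plain list of lines rather than on the parser's own _readline/_pushback calls. Because it iterates over a fixed list, it also ends cleanly when no "Database:" line follows.

```perl
use strict;
use warnings;

# Hypothetical stand-alone helper: drop the sequence blocks that the
# WU-BLAST -echofilter option echoes into the report.
sub strip_echofilter {
    my @in = @_;
    my @out;
    my $skipping = 0;
    for my $line (@in) {
        $skipping = 1 if $line =~ /^>Unfiltered[+-]1$/;    # echoed block starts
        $skipping = 0 if $line =~ /^Database:/;            # normal report resumes
        push @out, $line unless $skipping;
    }
    return @out;
}

my @kept = strip_echofilter(
    'Query= ref|NM_003201.1|',
    '>Unfiltered+1',
    'CCTCGCTAGTGGCGGGCATG',
    '>Unfiltered-1',
    'TGTGTTTCAACAAACTTTAT',
    'Database: NM_003201.fa',
);
print "$_\n" for @kept;
```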
From scain at dev.open-bio.org Mon Nov 5 10:27:48 2007
From: scain at dev.open-bio.org (Scott Cain)
Date: Mon, 05 Nov 2007 15:27:48 +0000
Subject: [Bioperl-guts-l] bioperl-live/Bio/Graphics Glyph.pm,1.138,1.139
Message-ID: <200711051527.lA5FRmEJ009520@dev.open-bio.org>
Update of /home/repository/bioperl/bioperl-live/Bio/Graphics
In directory dev.open-bio.org:/tmp/cvs-serv9511/Graphics
Modified Files:
Glyph.pm
Log Message:
adding name to sort order options
Index: Glyph.pm
===================================================================
RCS file: /home/repository/bioperl/bioperl-live/Bio/Graphics/Glyph.pm,v
retrieving revision 1.138
retrieving revision 1.139
diff -C2 -d -r1.138 -r1.139
*** Glyph.pm 24 Aug 2007 21:15:40 -0000 1.138
--- Glyph.pm 5 Nov 2007 15:27:45 -0000 1.139
***************
*** 1756,1760 ****
will cause the longest (or shortest) features to be sorted first, and
"strand" will cause the features to be sorted by strand: "+1"
! (forward) then "0" (unknown, or NA) then "-1" (reverse).
In all cases, the "left" position will be used to break any ties. To
--- 1756,1761 ----
will cause the longest (or shortest) features to be sorted first, and
"strand" will cause the features to be sorted by strand: "+1"
! (forward) then "0" (unknown, or NA) then "-1" (reverse). Finally,
! "name" will sort by the display_name of the features.
In all cases, the "left" position will be used to break any ties. To
From jason at dev.open-bio.org Tue Nov 6 14:43:41 2007
From: jason at dev.open-bio.org (Jason Stajich)
Date: Tue, 06 Nov 2007 19:43:41 +0000
Subject: [Bioperl-guts-l] bioperl-live/Bio SeqFeatureI.pm,1.73,1.74
Message-ID: <200711061943.lA6Jhfvo012197@dev.open-bio.org>
Update of /home/repository/bioperl/bioperl-live/Bio
In directory dev.open-bio.org:/tmp/cvs-serv12171/Bio
Modified Files:
SeqFeatureI.pm
Log Message:
Don't we need to include Bio::Tools::GFF
Index: SeqFeatureI.pm
===================================================================
RCS file: /home/repository/bioperl/bioperl-live/Bio/SeqFeatureI.pm,v
retrieving revision 1.73
retrieving revision 1.74
diff -C2 -d -r1.73 -r1.74
*** SeqFeatureI.pm 29 Aug 2007 21:39:50 -0000 1.73
--- SeqFeatureI.pm 6 Nov 2007 19:43:38 -0000 1.74
***************
*** 84,88 ****
use vars qw($HasInMemory);
use strict;
-
BEGIN {
eval { require Bio::DB::InMemoryCache };
--- 84,87 ----
***************
*** 382,386 ****
sub _static_gff_formatter{
my ($self,@args) = @_;
!
if( !defined $static_gff_formatter ) {
$static_gff_formatter = Bio::Tools::GFF->new('-gff_version' => 2);
--- 381,385 ----
sub _static_gff_formatter{
my ($self, at args) = @_;
! require Bio::Tools::GFF; # on the fly inclusion -- is this better?
if( !defined $static_gff_formatter ) {
$static_gff_formatter = Bio::Tools::GFF->new('-gff_version' => 2);
***************
*** 663,667 ****
# DEPRECATED - us IDHandler
my $self = shift;
! require "Bio/SeqFeature/Tools/IDHandler.pm";
Bio::SeqFeature::Tools::IDHandler->new->generate_unique_persistent_id($self);
}
--- 662,666 ----
# DEPRECATED - us IDHandler
my $self = shift;
! require Bio::SeqFeature::Tools::IDHandler;
Bio::SeqFeature::Tools::IDHandler->new->generate_unique_persistent_id($self);
}
From jason at dev.open-bio.org Tue Nov 6 14:47:41 2007
From: jason at dev.open-bio.org (Jason Stajich)
Date: Tue, 06 Nov 2007 19:47:41 +0000
Subject: [Bioperl-guts-l] bioperl-live/Bio/AlignIO nexus.pm,1.31,1.32
Message-ID: <200711061947.lA6JlfKO012253@dev.open-bio.org>
Update of /home/repository/bioperl/bioperl-live/Bio/AlignIO
In directory dev.open-bio.org:/tmp/cvs-serv12227/Bio/AlignIO
Modified Files:
nexus.pm
Log Message:
allow . in the names
Index: nexus.pm
===================================================================
RCS file: /home/repository/bioperl/bioperl-live/Bio/AlignIO/nexus.pm,v
retrieving revision 1.31
retrieving revision 1.32
diff -C2 -d -r1.31 -r1.32
*** nexus.pm 14 Jun 2007 14:16:10 -0000 1.31
--- nexus.pm 6 Nov 2007 19:47:39 -0000 1.32
***************
*** 395,399 ****
foreach $seq ( $aln->each_seq() ) {
my $nmid = $aln->displayname($seq->get_nse());
! if( $nmid =~ /[^\w\d]/ ) {
# put name in single quotes incase it contains any of
# the following chars: ()[]{}/\,;:=*'"`+-<> that are not
--- 395,399 ----
foreach $seq ( $aln->each_seq() ) {
my $nmid = $aln->displayname($seq->get_nse());
! if( $nmid =~ /[^\w\d\.]/ ) {
# put name in single quotes incase it contains any of
# the following chars: ()[]{}/\,;:=*'"`+-<> that are not
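The effect of the character-class change can be checked in isolation. This is a sketch only: needs_quotes is a hypothetical name, and it uses /[^\w.]/, which should match the same strings as the committed /[^\w\d\.]/ because \w already covers digits and a dot needs no escape inside a character class.

```perl
use strict;
use warnings;

# Hypothetical helper: true when a sequence name contains a character
# other than word characters and '.', so that it would be single-quoted.
sub needs_quotes {
    my ($name) = @_;
    return $name =~ /[^\w.]/ ? 1 : 0;
}

for my $name ( 'NM_003201.1', 'goat-cow', 'human' ) {
    printf "%-12s %s\n", $name, needs_quotes($name) ? 'quoted' : 'bare';
}
```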
From jason at dev.open-bio.org Tue Nov 6 14:54:17 2007
From: jason at dev.open-bio.org (Jason Stajich)
Date: Tue, 06 Nov 2007 19:54:17 +0000
Subject: [Bioperl-guts-l] bioperl-live/Bio/Tools GFF.pm,1.68,1.68.2.1
Message-ID: <200711061954.lA6JsHVT012419@dev.open-bio.org>
Update of /home/repository/bioperl/bioperl-live/Bio/Tools
In directory dev.open-bio.org:/tmp/cvs-serv12393/Bio/Tools
Modified Files:
Tag: lightweight_feature_branch
GFF.pm
Log Message:
remove old naming
Index: GFF.pm
===================================================================
RCS file: /home/repository/bioperl/bioperl-live/Bio/Tools/GFF.pm,v
retrieving revision 1.68
retrieving revision 1.68.2.1
diff -C2 -d -r1.68 -r1.68.2.1
*** GFF.pm 1 Oct 2007 15:18:12 -0000 1.68
--- GFF.pm 6 Nov 2007 19:54:15 -0000 1.68.2.1
***************
*** 735,739 ****
$frame);
! foreach my $tag ( $feat->all_tags ) {
foreach my $value ( $feat->each_tag_value($tag) ) {
$str .= " $tag=$value" if $value;
--- 735,739 ----
$frame);
! foreach my $tag ( $feat->get_all_tags ) {
foreach my $value ( $feat->each_tag_value($tag) ) {
$str .= " $tag=$value" if $value;
***************
*** 812,821 ****
! my @all_tags = $feat->all_tags;
my @group;
if (@all_tags) { # only play this game if it is worth playing...
foreach my $tag ( @all_tags ) {
my @v;
! foreach my $value ( $feat->each_tag_value($tag) ) {
unless( defined $value && length($value) ) {
$value = '""';
--- 812,821 ----
! my @all_tags = $feat->get_all_tags;
my @group;
if (@all_tags) { # only play this game if it is worth playing...
foreach my $tag ( @all_tags ) {
my @v;
! foreach my $value ( $feat->get_tag_values($tag) ) {
unless( defined $value && length($value) ) {
$value = '""';
***************
*** 914,918 ****
foreach my $tag ( @all_tags ) {
my @v;
! foreach my $value ( $feat->each_tag_value($tag) ) {
unless( defined $value && length($value) ) {
$value = '""';
--- 914,918 ----
foreach my $tag ( @all_tags ) {
my @v;
! foreach my $value ( $feat->get_tag_values($tag) ) {
unless( defined $value && length($value) ) {
$value = '""';
From jason at dev.open-bio.org Tue Nov 6 14:56:15 2007
From: jason at dev.open-bio.org (Jason Stajich)
Date: Tue, 06 Nov 2007 19:56:15 +0000
Subject: [Bioperl-guts-l] bioperl-live/t SeqFeature_Slim.t,NONE,1.1.2.1
Message-ID: <200711061956.lA6JuF1f012460@dev.open-bio.org>
Update of /home/repository/bioperl/bioperl-live/t
In directory dev.open-bio.org:/tmp/cvs-serv12427/t
Added Files:
Tag: lightweight_feature_branch
SeqFeature_Slim.t
Log Message:
Slim Feature which *should* be SuperFast(Tm)
--- NEW FILE: SeqFeature_Slim.t ---
# -*-Perl-*- Test Harness script for Bioperl
# $Id: SeqFeature_Slim.t,v 1.1.2.1 2007/11/06 19:56:13 jason Exp $
use strict;
BEGIN {
use lib 't/lib';
use BioperlTest;
test_begin(-tests => 7);
use_ok('Bio::Seq');
use_ok('Bio::SeqIO');
use_ok('Bio::SeqFeature::Slim');
}
# predeclare variables for strict
my ($feat,$str,$feat2,$pair,$comp_obj1,$comp_obj2,@sft);
my $DEBUG = test_debug();
$feat = Bio::SeqFeature::Slim->new(-start => 40,
-end => 80,
-strand => 1,
-primary => 'exon',
-source => 'internal',
-tag => {
'silly' => 20,
'new' => 1
});
is $feat->start, 40, 'start of feature location';
is $feat->end, 80, 'end of feature location';
is $feat->primary_tag, 'exon', 'primary tag';
is $feat->source_tag, 'internal', 'source tag';
$str = $feat->gff_string() || ""; # placate -w
From jason at dev.open-bio.org Tue Nov 6 14:56:15 2007
From: jason at dev.open-bio.org (Jason Stajich)
Date: Tue, 06 Nov 2007 19:56:15 +0000
Subject: [Bioperl-guts-l] bioperl-live/Bio/SeqFeature Slim.pm,NONE,1.1.2.1
Message-ID: <200711061956.lA6JuFKu012465@dev.open-bio.org>
Update of /home/repository/bioperl/bioperl-live/Bio/SeqFeature
In directory dev.open-bio.org:/tmp/cvs-serv12427/Bio/SeqFeature
Added Files:
Tag: lightweight_feature_branch
Slim.pm
Log Message:
Slim Feature which *should* be SuperFast(Tm)
--- NEW FILE: Slim.pm ---
# $Id: Slim.pm,v 1.1.2.1 2007/11/06 19:56:13 jason Exp $
#
# BioPerl module for Bio::SeqFeature::Slim
#
# Cared for by Jason Stajich
#
# Copyright Jason Stajich
#
# You may distribute this module under the same terms as perl itself
# POD documentation - main docs before the code
=head1 NAME
Bio::SeqFeature::Slim - A very lightweight Bio::SeqFeatureI implementation
=head1 SYNOPSIS
Give standard usage here
=head1 DESCRIPTION
Describe the object here
=head1 FEEDBACK
=head2 Mailing Lists
User feedback is an integral part of the evolution of this and other
Bioperl modules. Send your comments and suggestions preferably to
the Bioperl mailing list. Your participation is much appreciated.
bioperl-l at bioperl.org - General discussion
http://bioperl.org/MailList.shtml - About the mailing lists
=head2 Reporting Bugs
Report bugs to the Bioperl bug tracking system to help us keep track
of the bugs and their resolution. Bug reports can be submitted via
the web:
http://bugzilla.bioperl.org/
=head1 AUTHOR - Jason Stajich
Email jason_AT_bioperl.org
Describe contact details here
=head1 CONTRIBUTORS
Additional contributors names and emails here
=head1 APPENDIX
The rest of the documentation details each of the object methods.
Internal methods are usually preceded with a _
=cut
# Let the code begin...
package Bio::SeqFeature::Slim;
use strict;
use base 'Bio::SeqFeatureI';
use constant {
SEQ_ID => 0,
SOURCE => 1,
PRIMARY => 2,
START => 3,
STOP => 4,
SCORE => 5,
STRAND => 6,
FRAME => 7,
TAGS => 8,
NAME => 9,
PARENT => 10,
GFF_TYPE => 11,
SUBFEATURES => 12,
SEQ_OBJ => 13,
};
=head2 new
Title : new
Usage : my $obj = new Bio::SeqFeature::Slim();
Function: Builds a new Bio::SeqFeature::Slim object
Returns : an instance of Bio::SeqFeature::Slim
Args :
=cut
sub new {
my($class) = shift;
my ($start, $end, $strand, $primary_tag, $source_tag, $primary,
$source, $frame, $score, $tag, $gff_string, $gff1_string,
$seqname, $seqid, $annot, $location,$display_name) =
Bio::Root::RootI->_rearrange([qw
(START
END
STRAND
PRIMARY_TAG
SOURCE_TAG
PRIMARY
SOURCE
FRAME
SCORE
TAG
GFF_STRING
GFF1_STRING
SEQNAME
SEQ_ID
ANNOTATION
LOCATION
DISPLAY_NAME
)], @_);
if( defined $primary_tag && defined $primary ) {
Bio::Root::RootI->warn("Both primary and primary_tag are defined, only use one");
}
if( defined $source_tag && defined $source ) {
Bio::Root::RootI->warn("Both source and source_tag are defined, only use one");
}
$primary_tag = $primary if defined $primary && ! defined $primary_tag;
$source_tag = $source if defined $source && ! defined $source_tag;
my $self = bless [$seqid, #0
$source_tag, #1
$primary_tag, #2
$start, #3
$end, #4
$score, #5
$strand, #6
$frame, #7
{}, #8 tags
$display_name, #9 display name
undef, #10 parent
undef, #11 gff_type
[], #12 seqfeatures
undef, #13 seqobj
], $class;
$tag && do {
foreach my $t ( keys %$tag ) {
$self->add_tag_value($t, UNIVERSAL::isa($tag->{$t}, "ARRAY") ?
@{$tag->{$t}} : $tag->{$t});
}
};
return $self;
}
=head1 Bio::SeqFeatureI specific methods
New method interfaces.
=cut
=head2 get_SeqFeatures
Title : get_SeqFeatures
Usage : @feats = $feat->get_SeqFeatures();
Function: Returns an array of sub Sequence Features
Returns : An array
Args : none
=cut
sub get_SeqFeatures{
return @{shift->[SUBFEATURES] || []};
}
=head2 display_name
Title : display_name
Usage : $name = $feat->display_name()
Function: Returns the human-readable name of the feature for displays.
Returns : a string
Args : none
=cut
sub display_name {
my ($self) = shift;
if( @_) {
($self->[NAME]) = shift @_;
}
return $self->[NAME];
}
=head2 primary_tag
Title : primary_tag
Usage : $tag = $feat->primary_tag()
Function: Returns the primary tag for a feature,
eg 'exon'
Returns : a string
Args : none
=cut
sub primary_tag{
my ($self) = shift;
if( @_) {
($self->[PRIMARY]) = shift @_;
}
return $self->[PRIMARY];
}
=head2 source_tag
Title : source_tag
Usage : $tag = $feat->source_tag()
Function: Returns the source tag for a feature,
eg, 'genscan'
Returns : a string
Args : none
=cut
sub source_tag{
my ($self) = shift;
if( @_) {
($self->[SOURCE]) = shift @_;
}
return $self->[SOURCE];
}
=head2 has_tag
Title : has_tag
Usage : $tag_exists = $self->has_tag('some_tag')
Function:
Returns : TRUE if the specified tag exists, and FALSE otherwise
Args :
=cut
sub has_tag{
my ($self,$tag) = @_;
return unless defined $tag;
return exists($self->[TAGS]->{$tag});
}
=head2 get_tag_values
Title : get_tag_values
Usage : @values = $self->get_tag_values('some_tag')
Function:
Returns : An array comprising the values of the specified tag.
Args : a string
throws an exception if there is no such tag
=cut
sub get_tag_values {
my ($self,$tag) = @_;
return unless defined $tag;
return @{ $self->[TAGS]->{$tag} || [] };
}
=head2 get_tagset_values
Title : get_tagset_values
Usage : @values = $self->get_tagset_values(qw(label transcript_id product))
Function:
Returns : An array comprising the values of the specified tags, in order of tags
Args : An array of strings
does NOT throw an exception if none of the tags are present
this method is useful for getting a human-readable label for a
SeqFeatureI; not all tags can be assumed to be present, so a list of
possible tags in preferential order is provided
=cut
# interface + abstract method
sub get_tagset_values {
my ($self, @args) = @_;
my @vals = ();
foreach my $arg (@args) {
if ($self->has_tag($arg)) {
push(@vals, $self->get_tag_values($arg));
}
}
return @vals;
}
=head2 get_all_tags
Title : get_all_tags
Usage : @tags = $feat->get_all_tags()
Function: gives all tags for this feature
Returns : an array of strings
Args : none
=cut
sub get_all_tags{
my ($self) = shift;
return keys %{$self->[TAGS] || {}};
}
=head2 attach_seq
Title : attach_seq
Usage : $sf->attach_seq($seq)
Function: Attaches a Bio::Seq object to this feature. This
Bio::Seq object is for the *entire* sequence: ie
from 1 to 10000
Note that it is not guaranteed that if you obtain a feature from
an object in bioperl, it will have a sequence attached. Also,
implementors of this interface can choose to provide an empty
implementation of this method. I.e., there is also no guarantee
that if you do attach a sequence, seq() or entire_seq() will not
return undef.
The reason that this method is here on the interface is to enable
you to call it on every SeqFeatureI compliant object, and
that it will be implemented in a useful way and set to a useful
value for the great majority of use cases. Implementors who choose
to ignore the call are encouraged to specifically state this in
their documentation.
Example :
Returns : TRUE on success
Args : a Bio::PrimarySeqI compliant object
=cut
sub attach_seq {
my ($self) = shift;
if(@_) {
$self->[SEQ_OBJ] = shift @_;
return 1 if defined $self->[SEQ_OBJ];
}
return 0;
}
=head2 seq
Title : seq
Usage : $tseq = $sf->seq()
Function: returns the truncated sequence (if there is a sequence attached)
for this feature
Example :
Returns : sub seq (a Bio::PrimarySeqI compliant object) on attached sequence
bounded by start & end, or undef if there is no sequence attached
Args : none
=cut
sub seq {
my ($self) = shift;
if(defined $self->[SEQ_OBJ] ) {
if( ! ref($self->[SEQ_OBJ]) ||
! $self->[SEQ_OBJ]->isa('Bio::PrimarySeqI') ) {
$self->throw("Have a seq_obj which is not Bio::PrimarySeqI compliant");
} else {
return $self->[SEQ_OBJ]->trunc($self->start, $self->end);
}
}
return undef;
}
=head2 entire_seq
Title : entire_seq
Usage : $whole_seq = $sf->entire_seq()
Function: gives the entire sequence that this seqfeature is attached to
Example :
Returns : a Bio::PrimarySeqI compliant object, or undef if there is no
sequence attached
Args : none
=cut
sub entire_seq {
my ($self) = shift;
if(defined $self->[SEQ_OBJ] ) {
if( ! ref($self->[SEQ_OBJ]) ||
! $self->[SEQ_OBJ]->isa('Bio::PrimarySeqI') ) {
$self->throw("Have a seq_obj which is not Bio::PrimarySeqI compliant");
} else {
return $self->[SEQ_OBJ];
}
}
return undef;
}
=head2 seq_id
Title : seq_id
Usage : $obj->seq_id($newval)
Function: There are many cases when you make a feature that you
do know the sequence name, but do not know its actual
sequence. This is an attribute such that you can store
the ID (e.g., display_id) of the sequence.
This attribute should *not* be used in GFF dumping, as
that should come from the collection in which the seq
feature was found.
Returns : value of seq_id
Args : newvalue (optional)
=cut
sub seq_id {
my ($self) = shift;
if( @_) {
($self->[SEQ_ID]) = shift @_;
}
return $self->[SEQ_ID];
}
=head2 gff_string
Title : gff_string
Usage : $str = $feat->gff_string;
$str = $feat->gff_string($gff_formatter);
Function: Provides the feature information in GFF format.
The implementation provided here returns GFF2 by default. If you
want a different version, supply an object implementing a method
gff_string() accepting a SeqFeatureI object as argument. E.g., to
obtain GFF1 format, do the following:
my $gff1io = Bio::Tools::GFF->new(-gff_version => 1);
$gff1str = $feat->gff_string($gff1io);
Returns : A string
Args : Optionally, an object implementing gff_string().
=cut
=head1 Decorating methods
These methods have an implementation provided by Bio::SeqFeatureI,
but can be validly overwritten by subclasses
=head2 spliced_seq
Title : spliced_seq
Usage : $seq = $feature->spliced_seq()
$seq = $feature_with_remote_locations->spliced_seq($db_for_seqs)
Function: Provides a sequence of the feature which is the most
semantically "relevant" feature for this sequence. A default
implementation is provided which for simple cases returns just
the sequence, but for split cases, loops over the split location
to return the sequence. In the case of split locations with
remote locations, eg
join(AB000123:5567-5589,80..1144)
in the case when a database object is passed in, it will attempt
to retrieve the sequence from the database object, and "Do the right thing",
however if no database object is provided, it will generate the correct
number of N's (DNA) or X's (protein, though this is unlikely).
This function is deliberately "magical" attempting to second guess
what a user wants as "the" sequence for this feature.
Implementing classes are free to override this method with their
own magic if they have a better idea what the user wants.
Args : [optional]
-db A L<Bio::DB::RandomAccessI> compliant object if
one needs to retrieve remote seqs.
-nosort boolean if the locations should not be sorted
by start location. This may occur, for instance,
in a circular sequence where a gene span starts
before the end of the sequence and ends after the
sequence start. Example : join(15685..16260,1..207)
-phase truncates the returned sequence based on the
intron phase (0,1,2).
Returns : A L<Bio::PrimarySeqI> object
=cut
=head2 location
Title : location
Usage : my $location = $seqfeature->location()
Function: returns a location object suitable for identifying location
of feature on sequence or parent feature
NOTE: in the implementation location is READ-ONLY!
and complicated locations can not be represented because
this is intended to be used with GFF generated locations which will
always only be start..stop
Returns : Bio::LocationI object
Args : none
=cut
sub location {
my ($self) = shift;
if( @_ ) {
$self->warn("this implementation does not allow setting the LOCATION object\n");
return undef;
}
# somewhat silly - maybe we should cache this?
Bio::Location::Simple->new(-start => $self->start,
-end => $self->end,
-strand=> $self->strand);
}
=head2 primary_id
Title : primary_id
Usage : $obj->primary_id($newval)
Function:
Example :
Returns : value of primary_id (a scalar)
Args : on set, new value (a scalar or undef, optional)
Primary ID is a synonym for the tag 'ID'
=cut
sub primary_id{
my $self = shift;
# note from cjm at fruitfly.org:
# I have commented out the following 2 lines:
#return $self->{'primary_id'} = shift if @_;
#return $self->{'primary_id'};
#... and replaced it with the following; see
# http://bioperl.org/pipermail/bioperl-l/2003-December/014150.html
# for the discussion that led to this change
if (@_) {
if ($self->has_tag('ID')) {
$self->remove_tag('ID');
}
$self->add_tag_value('ID', shift);
}
my ($id) = $self->get_tagset_values('ID');
return $id;
}
sub generate_unique_persistent_id {
# DEPRECATED - use IDHandler
my $self = shift;
require "Bio/SeqFeature/Tools/IDHandler.pm";
Bio::SeqFeature::Tools::IDHandler->new->generate_unique_persistent_id($self);
}
=head1 Bio::RangeI methods
These methods are inherited from RangeI and can be used
directly from a SeqFeatureI interface. Remember that a
SeqFeature is-a RangeI, and so wherever you see RangeI you
can use a feature ($r in the below documentation).
=cut
=head2 start()
See L<Bio::RangeI>
=cut
sub start {
my ($self) = shift;
if( @_) {
($self->[START]) = shift @_;
}
return $self->[START];
}
=head2 end()
See L<Bio::RangeI>
=cut
sub end {
my ($self) = shift;
if( @_) {
($self->[STOP]) = shift @_;
}
return $self->[STOP];
}
=head2 strand()
See L<Bio::RangeI>
=cut
sub strand {
my ($self) = shift;
if( @_) {
($self->[STRAND]) = shift @_;
}
return $self->[STRAND];
}
=head2 overlaps()
See L<Bio::RangeI>
=head2 contains()
See L<Bio::RangeI>
=head2 equals()
See L<Bio::RangeI>
=head2 intersection()
See L<Bio::RangeI>
=head2 union()
See L<Bio::RangeI>
=head1 Bio::AnnotatableI methods
=cut
=head2 add_tag_value
Title : add_tag_value
Usage : $self->add_tag_value('note',"this is a note");
Returns : TRUE on success
Args : tag (string) and one or more values (any scalar(s))
=cut
sub add_tag_value{
my $self = shift;
my $tag = shift;
$self->[TAGS] ||= {};
push (@{$self->[TAGS]->{$tag}},@_);
}
=head2 create_seqfeature_generic
Title : create_seqfeature_generic
Usage : my $feat = $slimfeat->create_seqfeature_generic
Function: Create a Bio::SeqFeature::Generic object from this Slim object
Returns : L<Bio::SeqFeature::Generic>
Args : None
=cut
sub create_seqfeature_generic{
my ($self) = shift;
return Bio::SeqFeature::Generic->new(-location => $self->location,
-score => $self->score,
-source_tag => $self->source_tag,
-primary_tag=> $self->primary_tag,
-frame => $self->frame,
-tag => $self->[TAGS],
-display_name=> $self->display_name,
);
}
1;
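The storage layout above (a blessed array whose slots are named by `use constant` indices) can be illustrated in isolation. What follows is only a minimal sketch under stated assumptions: `TinyFeature` and its three slots are hypothetical stand-ins, not part of BioPerl, and the accessors mirror the get/set style used in Slim.pm.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the array-backed layout used by Slim.pm;
# TinyFeature is not a BioPerl class.
package TinyFeature;

use constant {
    START => 0,
    STOP  => 1,
    TAGS  => 2,
};

sub new {
    my ($class, %args) = @_;
    # Slots are positional, so the order here must match the constants.
    return bless [ $args{start}, $args{end}, {} ], $class;
}

sub start {
    my $self = shift;
    $self->[START] = shift if @_;
    return $self->[START];
}

sub end {
    my $self = shift;
    $self->[STOP] = shift if @_;
    return $self->[STOP];
}

sub add_tag_value {
    my ($self, $tag, @values) = @_;
    # The tag store is a hash of array references.
    push @{ $self->[TAGS]{$tag} }, @values;
    return 1;
}

sub get_tag_values {
    my ($self, $tag) = @_;
    # Return a list (empty if the tag is absent), not a reference.
    return @{ $self->[TAGS]{$tag} || [] };
}

package main;

my $feat = TinyFeature->new(start => 40, end => 80);
$feat->add_tag_value('silly', 20);
my ($value) = $feat->get_tag_values('silly');
print join(' ', $feat->start, $feat->end, $value), "\n";
```

An array-backed object avoids a hash lookup per attribute, which is presumably the motivation for the "SuperFast" remark in the log message; the cost is that the constructor and the constants must stay in step.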
From jason at dev.open-bio.org Tue Nov 6 14:56:56 2007
From: jason at dev.open-bio.org (Jason Stajich)
Date: Tue, 06 Nov 2007 19:56:56 +0000
Subject: [Bioperl-guts-l] bioperl-live/Bio SeqFeatureI.pm,1.73,1.73.2.1
Message-ID: <200711061956.lA6Juuog012543@dev.open-bio.org>
Update of /home/repository/bioperl/bioperl-live/Bio
In directory dev.open-bio.org:/tmp/cvs-serv12517/Bio
Modified Files:
Tag: lightweight_feature_branch
SeqFeatureI.pm
Log Message:
I think we can do dynamic loading
Index: SeqFeatureI.pm
===================================================================
RCS file: /home/repository/bioperl/bioperl-live/Bio/SeqFeatureI.pm,v
retrieving revision 1.73
retrieving revision 1.73.2.1
diff -C2 -d -r1.73 -r1.73.2.1
*** SeqFeatureI.pm 29 Aug 2007 21:39:50 -0000 1.73
--- SeqFeatureI.pm 6 Nov 2007 19:56:54 -0000 1.73.2.1
***************
*** 84,88 ****
use vars qw($HasInMemory);
use strict;
-
BEGIN {
eval { require Bio::DB::InMemoryCache };
--- 84,87 ----
***************
*** 382,386 ****
sub _static_gff_formatter{
my ($self,@args) = @_;
!
if( !defined $static_gff_formatter ) {
$static_gff_formatter = Bio::Tools::GFF->new('-gff_version' => 2);
--- 381,385 ----
sub _static_gff_formatter{
my ($self,@args) = @_;
! require Bio::Tools::GFF; # on the fly inclusion -- is this better?
if( !defined $static_gff_formatter ) {
$static_gff_formatter = Bio::Tools::GFF->new('-gff_version' => 2);
***************
*** 663,667 ****
# DEPRECATED - use IDHandler
my $self = shift;
! require "Bio/SeqFeature/Tools/IDHandler.pm";
Bio::SeqFeature::Tools::IDHandler->new->generate_unique_persistent_id($self);
}
--- 662,666 ----
# DEPRECATED - use IDHandler
my $self = shift;
! require Bio::SeqFeature::Tools::IDHandler;
Bio::SeqFeature::Tools::IDHandler->new->generate_unique_persistent_id($self);
}
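The "on the fly inclusion" in this change relies on `require` being evaluated at run time, so the module is compiled only when the enclosing sub is first called. A minimal sketch of that behaviour, assuming nothing about BioPerl: `epoch_for_date` is a hypothetical helper, and Time::Local is used only because it is a core module.

```perl
use strict;
use warnings;

# Load a module only when the code path that needs it actually runs.
# Time::Local is just a convenient core module for the demonstration.
sub epoch_for_date {
    my ($year, $month, $day) = @_;
    require Time::Local;    # compiled on first call, a no-op afterwards
    return Time::Local::timegm(0, 0, 0, $day, $month - 1, $year);
}

my $loaded_before = exists $INC{'Time/Local.pm'} ? 1 : 0;
my $epoch         = epoch_for_date(2007, 11, 6);
my $loaded_after  = exists $INC{'Time/Local.pm'} ? 1 : 0;

print "loaded before: $loaded_before, after: $loaded_after, epoch: $epoch\n";
```

The bareword form `require Bio::SeqFeature::Tools::IDHandler;` and the string form `require "Bio/SeqFeature/Tools/IDHandler.pm";` load the same file; the bareword form lets Perl do the `::` to `/` translation itself.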
From jason at dev.open-bio.org Tue Nov 6 16:10:55 2007
From: jason at dev.open-bio.org (Jason Stajich)
Date: Tue, 06 Nov 2007 21:10:55 +0000
Subject: [Bioperl-guts-l] bioperl-live/Bio/Tools GFF.pm,1.68.2.1,1.68.2.2
Message-ID: <200711062110.lA6LAtAG012659@dev.open-bio.org>
Update of /home/repository/bioperl/bioperl-live/Bio/Tools
In directory dev.open-bio.org:/tmp/cvs-serv12621/Bio/Tools
Modified Files:
Tag: lightweight_feature_branch
GFF.pm
Log Message:
fixes to manage the frame setting and calculate length
Index: GFF.pm
===================================================================
RCS file: /home/repository/bioperl/bioperl-live/Bio/Tools/GFF.pm,v
retrieving revision 1.68.2.1
retrieving revision 1.68.2.2
diff -C2 -d -r1.68.2.1 -r1.68.2.2
*** GFF.pm 6 Nov 2007 19:54:15 -0000 1.68.2.1
--- GFF.pm 6 Nov 2007 21:10:53 -0000 1.68.2.2
***************
*** 134,138 ****
package Bio::Tools::GFF;
- use vars qw($HAS_HTML_ENTITIES);
use strict;
--- 134,137 ----
***************
*** 140,143 ****
--- 139,145 ----
use Bio::LocatableSeq;
use Bio::SeqFeature::Generic;
+ use Bio::SeqFeature::Slim;
+
+ use constant DEFAULT_FEATURE_TYPE => 'Bio::SeqFeature::Generic';
use base qw(Bio::Root::Root Bio::SeqAnalysisParserI Bio::Root::IO);
***************
*** 151,160 ****
my $writer = Bio::Tools::GFF->new(-gff_version => 3,
-file => ">filename.gff3");
! Function: Creates a new instance. Recognized named parameters are -file, -fh,
! and -gff_version.
! Returns : a new object
Args : named parameters
! -gff_version => [1,2,3]
!
=cut
--- 153,166 ----
my $writer = Bio::Tools::GFF->new(-gff_version => 3,
-file => ">filename.gff3");
! Function: Creates a new instance for reading or writing.
! Recognized named parameters are -file, -fh,
! and -gff_version.
! Returns : a new Bio::Tools::GFF object
Args : named parameters
! -gff_version => [1,2,2.5,3]
! -feature_type => type of seqfeature to create.
! default: Bio::SeqFeature::Generic
! for speed try Bio::SeqFeature::Slim
! -noparse => boolean
=cut
***************
*** 172,182 ****
}
-
sub new {
my ($class, @args) = @_;
my $self = $class->SUPER::new(@args);
! my ($gff_version, $noparse) = $self->_rearrange([qw(GFF_VERSION NOPARSE)],@args);
!
# initialize IO
$self->_initialize_io(@args);
--- 178,189 ----
}
sub new {
my ($class, @args) = @_;
my $self = $class->SUPER::new(@args);
! my ($gff_version, $noparse,
! $feature_type) = $self->_rearrange([qw(GFF_VERSION NOPARSE
! FEATURE_TYPE)],@args);
!
# initialize IO
$self->_initialize_io(@args);
***************
*** 189,192 ****
--- 196,201 ----
}
$self->{'_first'} = 1;
+ $feature_type ||= DEFAULT_FEATURE_TYPE;
+ $self->feature_type($feature_type);
return $self;
}
***************
*** 373,378 ****
}
return unless $gff_string;
!
! my $feat = Bio::SeqFeature::Generic->new();
$self->from_gff_string($feat, $gff_string);
--- 382,388 ----
}
return unless $gff_string;
!
! my $featuretype = $self->feature_type;
! my $feat = $featuretype->new();
$self->from_gff_string($feat, $gff_string);
***************
*** 965,1107 ****
sub _gff3_string {
! my ($gff, $origfeat) = @_;
! my $feat;
! if ($origfeat->isa('Bio::SeqFeature::FeaturePair')){
! $feat = $origfeat->feature2;
! } else {
! $feat = $origfeat;
! }
! my $ID = $gff->_incrementGFF3ID();
! my ($score,$frame,$name,$strand);
! if( $feat->can('score') ) {
! $score = $feat->score();
! }
! $score = '.' unless defined $score;
! if( $feat->can('frame') ) {
! $frame = $feat->frame();
! }
! $frame = '.' unless defined $frame;
! $strand = $feat->strand();
! if(! $strand) {
! $strand = ".";
! } elsif( $strand == 1 ) {
! $strand = '+';
! } elsif ( $feat->strand == -1 ) {
! $strand = '-';
! }
! if( $feat->can('seqname') ) {
! $name = $feat->seq_id();
! $name ||= 'SEQ';
! } else {
! $name = 'SEQ';
! }
! my @groups;
! # force leading ID and Parent tags
! my @all_tags = grep { !/ID/ && !/Parent/ } $feat->all_tags;
! unshift @all_tags, 'Parent' if $feat->has_tag('Parent');
! unshift @all_tags, 'ID' if $feat->has_tag('ID');
! for my $tag ( @all_tags ) {
! # next if $tag eq 'Target';
! if ($tag eq 'Target' && ! $origfeat->isa('Bio::SeqFeature::FeaturePair')){
! # simple Target,start,stop
! my($target_id, $b,$e,$strand) = $feat->get_tag_values($tag);
! next unless(defined($e) && defined($b) && $target_id);
! ($b,$e)= ($e,$b) if(defined $strand && $strand<0);
! $target_id =~ s/([\t\n\r%&\=;,])/sprintf("%%%X",ord($1))/ge;
! push @groups, sprintf("Target=%s %d %d", $target_id,$b,$e);
! next;
! }
!
! my $valuestr;
! # a string which will hold one or more values
! # for this tag, with quoted free text and
! # space-separated individual values.
! my @v;
! for my $value ( $feat->each_tag_value($tag) ) {
! if( defined $value && length($value) ) {
#$value =~ tr/ /+/; #spaces are allowed now
! if ($value =~ /[^a-zA-Z0-9\,\;\=\.:\%\^\*\$\@\!\+\_\?\-]/) {
! $value =~ s/\t/\\t/g; # substitute tab and newline
! # characters
! $value =~ s/\n/\\n/g; # to their UNIX equivalents
! # Unescaped quotes are not allowed in GFF3
! # $value = '"' . $value . '"';
! }
! $value =~ s/([\t\n\r%&\=;,])/sprintf("%%%X",ord($1))/ge;
! } else {
# if it is completely empty,
# then just make empty double
# quotes
! $value = '""';
! }
! push @v, $value;
! }
! $tag= lcfirst($tag) unless ($tag
! =~ /^(ID|Name|Alias|Parent|Gap|Target|Derives_from|Note|Dbxref|Ontology_term)$/);
!
! push @groups, "$tag=".join(",",@v);
}
! # Add Target information for Feature Pairs
! if( $feat->has_tag('Target') &&
! ! $feat->has_tag('Group') &&
! $origfeat->isa('Bio::SeqFeature::FeaturePair') ) {
! my $target_id = $origfeat->feature1->seq_id;
! $target_id =~ s/([\t\n\r%&\=;,])/sprintf("%%%X",ord($1))/ge;
!
! push @groups, sprintf("Target=%s %d %d",
! $target_id,
! ( $origfeat->feature1->strand < 0 ?
! ( $origfeat->feature1->end,
! $origfeat->feature1->start) :
! ( $origfeat->feature1->start,
! $origfeat->feature1->end)
! ));
! }
!
! # unshift @groups, "ID=autogenerated$ID" unless ($feat->has_tag('ID'));
! unshift @groups, 'Name=' . $feat->name if $feat->can('name') && defined($feat->name) ; # such as might be for Bio::DB::SeqFeature
! my $gff_string = "";
! if ($feat->location->isa("Bio::Location::SplitLocationI")) {
! my @locs = $feat->location->each_Location;
! foreach my $loc (@locs) {
! $gff_string .= join("\t",
! $name,
! $feat->source_tag() || '.',
! $feat->primary_tag(),
! $loc->start(),
! $loc->end(),
! $score,
! $strand,
! $frame,
! join(';', @groups)) . "\n";
! }
! chop $gff_string;
! return $gff_string;
! } else {
! $gff_string = join("\t",
! $name,
! $feat->source_tag() || '.',
! $feat->primary_tag(),
! $feat->start(),
! $feat->end(),
! $score,
! $strand,
! $frame,
! join(';', @groups));
}
return $gff_string;
}
--- 975,1117 ----
sub _gff3_string {
! my ($gff, $origfeat) = @_;
! my $feat;
! if ($origfeat->isa('Bio::SeqFeature::FeaturePair')){
! $feat = $origfeat->feature2;
! } else {
! $feat = $origfeat;
! }
! my $ID = $gff->_incrementGFF3ID();
! my ($score,$frame,$name,$strand);
! if( $feat->can('score') ) {
! $score = $feat->score();
! }
! $score = '.' unless defined $score;
! if( $feat->can('frame') ) {
! $frame = $feat->frame();
! }
! $frame = '.' unless defined $frame;
! $strand = $feat->strand();
! if(! $strand) {
! $strand = ".";
! } elsif( $strand == 1 ) {
! $strand = '+';
! } elsif ( $feat->strand == -1 ) {
! $strand = '-';
! }
! if( $feat->can('seqname') ) {
! $name = $feat->seq_id();
! $name ||= 'SEQ';
! } else {
! $name = 'SEQ';
! }
! my @groups;
! # force leading ID and Parent tags
! my @all_tags = grep { !/ID/ && !/Parent/ } $feat->all_tags;
! unshift @all_tags, 'Parent' if $feat->has_tag('Parent');
! unshift @all_tags, 'ID' if $feat->has_tag('ID');
! for my $tag ( @all_tags ) {
! # next if $tag eq 'Target';
! if ($tag eq 'Target' && ! $origfeat->isa('Bio::SeqFeature::FeaturePair')){
! # simple Target,start,stop
! my($target_id, $b,$e,$strand) = $feat->get_tag_values($tag);
! next unless(defined($e) && defined($b) && $target_id);
! ($b,$e)= ($e,$b) if(defined $strand && $strand<0);
! $target_id =~ s/([\t\n\r%&\=;,])/sprintf("%%%X",ord($1))/ge;
! push @groups, sprintf("Target=%s %d %d", $target_id,$b,$e);
! next;
! }
!
! my $valuestr;
! # a string which will hold one or more values
! # for this tag, with quoted free text and
! # space-separated individual values.
! my @v;
! for my $value ( $feat->each_tag_value($tag) ) {
! if( defined $value && length($value) ) {
#$value =~ tr/ /+/; #spaces are allowed now
! if ($value =~ /[^a-zA-Z0-9\,\;\=\.:\%\^\*\$\@\!\+\_\?\-]/) {
! $value =~ s/\t/\\t/g; # substitute tab and newline
! # characters
! $value =~ s/\n/\\n/g; # to their UNIX equivalents
! # Unescaped quotes are not allowed in GFF3
! # $value = '"' . $value . '"';
! }
! $value =~ s/([\t\n\r%&\=;,])/sprintf("%%%X",ord($1))/ge;
! } else {
# if it is completely empty,
# then just make empty double
# quotes
! $value = '""';
! }
! push @v, $value;
}
! $tag= lcfirst($tag) unless ($tag
! =~ /^(ID|Name|Alias|Parent|Gap|Target|Derives_from|Note|Dbxref|Ontology_term)$/);
! push @groups, "$tag=".join(",",@v);
! }
! # Add Target information for Feature Pairs
! if( $feat->has_tag('Target') &&
! ! $feat->has_tag('Group') &&
! $origfeat->isa('Bio::SeqFeature::FeaturePair') ) {
!
! my $target_id = $origfeat->feature1->seq_id;
! $target_id =~ s/([\t\n\r%&\=;,])/sprintf("%%%X",ord($1))/ge;
!
! push @groups, sprintf("Target=%s %d %d",
! $target_id,
! ( $origfeat->feature1->strand < 0 ?
! ( $origfeat->feature1->end,
! $origfeat->feature1->start) :
! ( $origfeat->feature1->start,
! $origfeat->feature1->end)
! ));
! }
!
! # unshift @groups, "ID=autogenerated$ID" unless ($feat->has_tag('ID'));
! unshift @groups, 'Name=' . $feat->name if $feat->can('name') && defined($feat->name) ; # such as might be for Bio::DB::SeqFeature
! my $gff_string = "";
! if ($feat->location->isa("Bio::Location::SplitLocationI")) {
! my @locs = $feat->location->each_Location;
! foreach my $loc (@locs) {
! $gff_string .= join("\t",
! $name,
! $feat->source_tag() || '.',
! $feat->primary_tag(),
! $loc->start(),
! $loc->end(),
! $score,
! $strand,
! $frame,
! join(';', @groups)) . "\n";
}
+ chop $gff_string;
return $gff_string;
+ } else {
+ $gff_string = join("\t",
+ $name,
+ $feat->source_tag() || '.',
+ $feat->primary_tag(),
+ $feat->start(),
+ $feat->end(),
+ $score,
+ $strand,
+ $frame,
+ join(';', @groups));
+ }
+ return $gff_string;
}
***************
*** 1230,1233 ****
--- 1240,1261 ----
}
+ =head2 feature_type
+
+ Title : feature_type
+ Usage : $obj->feature_type($newval)
+ Function:
+ Example :
+ Returns : value of feature_type (a scalar)
+ Args : on set, new value (a scalar or undef, optional)
+
+
+ =cut
+
+ sub feature_type{
+ my $self = shift;
+ return $self->{'feature_type'} = shift if @_;
+ return $self->{'feature_type'};
+ }
+
=head2 ignore_sequence
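The `-feature_type` option added in this change works because Perl allows a class name held in a string to be used as a method invocant (`$featuretype->new()`). A minimal sketch of that pattern, with loud assumptions: the `Demo::*` packages and `next_feature` below are hypothetical stand-ins, not the BioPerl classes or the real parser method.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for Bio::SeqFeature::Generic and
# Bio::SeqFeature::Slim; neither package below is BioPerl code.
package Demo::HashFeature;
sub new { my $class = shift; return bless {}, $class }

package Demo::ArrayFeature;
sub new { my $class = shift; return bless [], $class }

package Demo::Parser;
use constant DEFAULT_FEATURE_TYPE => 'Demo::HashFeature';

sub new {
    my ($class, %args) = @_;
    my $self = bless {}, $class;
    # Fall back to the default class when no type was requested.
    $self->feature_type($args{-feature_type} || DEFAULT_FEATURE_TYPE);
    return $self;
}

sub feature_type {
    my $self = shift;
    return $self->{feature_type} = shift if @_;
    return $self->{feature_type};
}

sub next_feature {
    my $self = shift;
    my $type = $self->feature_type;
    # A string holding a class name can be used as the invocant.
    return $type->new();
}

package main;

my $default = Demo::Parser->new();
my $slim    = Demo::Parser->new(-feature_type => 'Demo::ArrayFeature');

print ref($default->next_feature), "\n";
print ref($slim->next_feature), "\n";
```

With the change in this commit, the corresponding call would presumably look like `Bio::Tools::GFF->new(-gff_version => 3, -feature_type => 'Bio::SeqFeature::Slim')`; the named class must already be loaded, which is why GFF.pm now has `use Bio::SeqFeature::Slim;`.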
From jason at dev.open-bio.org Tue Nov 6 16:10:55 2007
From: jason at dev.open-bio.org (Jason Stajich)
Date: Tue, 06 Nov 2007 21:10:55 +0000
Subject: [Bioperl-guts-l] bioperl-live/Bio/SeqFeature Slim.pm, 1.1.2.1,
1.1.2.2
Message-ID: <200711062110.lA6LAtUi012654@dev.open-bio.org>
Update of /home/repository/bioperl/bioperl-live/Bio/SeqFeature
In directory dev.open-bio.org:/tmp/cvs-serv12621/Bio/SeqFeature
Modified Files:
Tag: lightweight_feature_branch
Slim.pm
Log Message:
fixes to manage the frame setting and calculate length
Index: Slim.pm
===================================================================
RCS file: /home/repository/bioperl/bioperl-live/Bio/SeqFeature/Attic/Slim.pm,v
retrieving revision 1.1.2.1
retrieving revision 1.1.2.2
diff -C2 -d -r1.1.2.1 -r1.1.2.2
*** Slim.pm 6 Nov 2007 19:56:13 -0000 1.1.2.1
--- Slim.pm 6 Nov 2007 21:10:53 -0000 1.1.2.2
***************
*** 17,21 ****
=head1 SYNOPSIS
! Give standard usage here
=head1 DESCRIPTION
--- 17,21 ----
=head1 SYNOPSIS
! use Bio::SeqFeature::Slim;
=head1 DESCRIPTION
***************
*** 232,235 ****
--- 232,255 ----
}
+ =head2 frame
+
+ Title : frame
+ Usage : $frame = $feat->frame()
+ Function: Returns the frame for a feature,
+ eg, '1'
+ Returns : '.', 0,1,2
+ Args : none
+
+
+ =cut
+
+ sub frame{
+ my ($self) = shift;
+ if( @_) {
+ ($self->[FRAME]) = shift @_;
+ }
+ return $self->[FRAME];
+ }
+
=head2 has_tag
***************
*** 619,622 ****
--- 639,660 ----
}
+ =head2 length
+
+ Title : length
+ Usage : $length = $range->length();
+ Function: get/set the length of this range
+ Returns : the length of this range
+ Args : optionally allows the length to be set
+ using $range->length($length)
+
+ =cut
+
+ sub length {
+ my $self = shift;
+ if(@_) {
+ $self->warn( ref($self). "->length() is read-only");
+ }
+ return abs($self->end - $self->start) + 1;
+ }
=head2 overlaps()
From jason at dev.open-bio.org Tue Nov 6 23:22:13 2007
From: jason at dev.open-bio.org (Jason Stajich)
Date: Wed, 07 Nov 2007 04:22:13 +0000
Subject: [Bioperl-guts-l] bioperl-live/t SeqFeature_Slim.t,1.1.2.1,1.1.2.2
Message-ID: <200711070422.lA74MDle013377@dev.open-bio.org>
Update of /home/repository/bioperl/bioperl-live/t
In directory dev.open-bio.org:/tmp/cvs-serv13351/t
Modified Files:
Tag: lightweight_feature_branch
SeqFeature_Slim.t
Log Message:
tests for expansion and sub-features
Index: SeqFeature_Slim.t
===================================================================
RCS file: /home/repository/bioperl/bioperl-live/t/Attic/SeqFeature_Slim.t,v
retrieving revision 1.1.2.1
retrieving revision 1.1.2.2
diff -C2 -d -r1.1.2.1 -r1.1.2.2
*** SeqFeature_Slim.t 6 Nov 2007 19:56:13 -0000 1.1.2.1
--- SeqFeature_Slim.t 7 Nov 2007 04:22:11 -0000 1.1.2.2
***************
*** 8,12 ****
use BioperlTest;
! test_begin(-tests => 7);
use_ok('Bio::Seq');
--- 8,12 ----
use BioperlTest;
! test_begin(-tests => 36);
use_ok('Bio::Seq');
***************
*** 35,36 ****
--- 35,114 ----
is $feat->source_tag, 'internal', 'source tag';
$str = $feat->gff_string() || ""; # placate -w
+ is $feat->length, 41, 'length';
+ is $feat->strand, 1, 'strand';
+
+ my @tags = $feat->get_all_tags;
+ is @tags, 2, 'number of tags';
+ is $feat->has_tag('silly'), 1, 'has silly tag';
+ my ($v) = $feat->get_tag_values('silly');
+ is $v, 20, 'tag value';
+
+ my $gf = $feat->create_seqfeature_generic;
+ is ref($gf) , 'Bio::SeqFeature::Generic', 'Create generic SF';
+
+ my $seq = Bio::Seq->new(-seq => 'ACGT'x100,
+ -id => 'generic');
+ $seq->add_SeqFeature($feat);
+
+ is $feat->entire_seq->seq, $seq->seq, 'Entire sequence object';
+ is $feat->seq->length, $feat->length, 'Feature length';
+
+ my $geneid = 'gene001';
+ my $mrnaid = 'mRNA001';
+
+ my $mRNA = Bio::SeqFeature::Slim->new(-start => 20,
+ -end => 70,
+ -strand=> 1,
+ -score => 0.70,
+ -primary=> 'mRNA',
+ -source => 'Curated',
+ -display_name => 'BTB_0001',
+ -id => $mrnaid,
+ -parent => $geneid,
+ );
+
+ is $mRNA->score, 0.70, 'Score';
+ is $mRNA->display_name, 'BTB_0001', 'Display name';
+ is $mRNA->parent_id, $geneid, 'gene parent id';
+ is $mRNA->primary_id, $mrnaid, 'mRNA id';
+
+ my $c = 0;
+ my @exons = ([20,29,1],[40,70,1]);
+ for my $subf ( @exons ) {
+ my $f = Bio::SeqFeature::Slim->new(-start => $subf->[0],
+ -end => $subf->[1],
+ -strand=> $subf->[2],
+ -primary=> 'CDS',
+ -source => 'Curated',
+ -parent => $mrnaid,
+ -id => sprintf('cds%03d',$c++),
+ );
+ $mRNA->add_SeqFeature($f);
+ }
+ is $mRNA->start, 20, 'mRNA start';
+ is $mRNA->end, 70, 'mRNA end';
+ is $mRNA->length, 51, '2 exon mRNA length';
+
+ my $i = 0;
+ for my $cds ( $mRNA->get_SeqFeatures ) {
+ is $cds->primary_tag, 'CDS', 'primary tag of cds feature';
+ is $cds->start, $exons[$i]->[0], 'exon start';
+ is $cds->end, $exons[$i]->[1], 'exon end';
+ is $cds->strand, $exons[$i]->[2], 'strand';
+ is $cds->parent_id, $mrnaid, 'cds parent id';
+ $i++;
+ }
+ is $i,2, '2 exons seen';
+
+ $mRNA->add_SeqFeature(Bio::SeqFeature::Slim->new(
+ -start => 80,
+ -end => 88,
+ -strand=> 1,
+ -primary=> 'CDS',
+ -source => 'Curated',
+ -parent => $mrnaid,
+ -id => sprintf('cds%03d',$c++),
+ ),'EXPAND');
+ is $mRNA->end, 88, '3 exon mRNA end';
+ is $mRNA->start, 20, '3 exon mRNA start';
+ is $mRNA->length, 69, '3 exon mRNA length';
From jason at dev.open-bio.org Tue Nov 6 23:32:21 2007
From: jason at dev.open-bio.org (Jason Stajich)
Date: Wed, 07 Nov 2007 04:32:21 +0000
Subject: [Bioperl-guts-l] bioperl-live/Bio/SeqFeature Slim.pm, 1.1.2.2,
1.1.2.3
Message-ID: <200711070432.lA74WLEl013432@dev.open-bio.org>
Update of /home/repository/bioperl/bioperl-live/Bio/SeqFeature
In directory dev.open-bio.org:/tmp/cvs-serv13406/Bio/SeqFeature
Modified Files:
Tag: lightweight_feature_branch
Slim.pm
Log Message:
more complete Bio::SeqFeature::Generic impl in the lightweight object
Index: Slim.pm
===================================================================
RCS file: /home/repository/bioperl/bioperl-live/Bio/SeqFeature/Attic/Slim.pm,v
retrieving revision 1.1.2.2
retrieving revision 1.1.2.3
diff -C2 -d -r1.1.2.2 -r1.1.2.3
*** Slim.pm 6 Nov 2007 21:10:53 -0000 1.1.2.2
--- Slim.pm 7 Nov 2007 04:32:18 -0000 1.1.2.3
***************
*** 21,25 ****
=head1 DESCRIPTION
! Describe the object here
=head1 FEEDBACK
--- 21,25 ----
=head1 DESCRIPTION
! Lightweight Bio::SeqFeatureI implementation.
=head1 FEEDBACK
***************
*** 82,85 ****
--- 82,86 ----
SUBFEATURES => 12,
SEQ_OBJ => 13,
+ VERBOSE => 14,
};
***************
*** 100,104 ****
my ($start, $end, $strand, $primary_tag, $source_tag, $primary,
$source, $frame, $score, $tag, $gff_string, $gff1_string,
! $seqname, $seqid, $annot, $location,$display_name) =
Bio::Root::RootI->_rearrange([qw
(START
--- 101,106 ----
my ($start, $end, $strand, $primary_tag, $source_tag, $primary,
$source, $frame, $score, $tag, $gff_string, $gff1_string,
! $seqname, $seqid, $annot, $location,$display_name,$pid,$id,
! $parent_id,$parent) =
Bio::Root::RootI->_rearrange([qw
(START
***************
*** 119,122 ****
--- 121,128 ----
LOCATION
DISPLAY_NAME
+ PRIMARY_ID
+ ID
+ PARENT_ID
+ PARENT
)], @_);
if( defined $primary_tag && defined $primary ) {
***************
*** 151,157 ****
--- 157,187 ----
}
};
+ if( defined $pid && defined $id ) {
+ Bio::Root::RootI->warn("Both primary_id and id are defined, only use one");
+ }
+ # save primary ID if it exists
+ $pid = $id if ! defined $pid;
+ defined $pid && $self->primary_id($pid);
+
+ if( defined $parent && defined $parent_id ) {
+ Bio::Root::RootI->warn("Both parent_id and parent are defined, only use one");
+ }
+
+ # save parent ID if it exists
+ $parent_id = $parent if ! defined $parent_id;
+ $parent_id && $self->parent_id($parent_id);
+
return $self;
}
+ sub verbose {
+ my ($self,$value) = @_;
+
+ if (defined $value || ! defined $self->[VERBOSE]) {
+ $self->[VERBOSE] = $value || 0;
+ }
+ return $self->[VERBOSE];
+ }
+
=head1 Bio::SeqFeatureI specific methods
***************
*** 160,177 ****
=cut
- =head2 get_SeqFeatures
-
- Title : get_SeqFeatures
- Usage : @feats = $feat->get_SeqFeatures();
- Function: Returns an array of sub Sequence Features
- Returns : An array
- Args : none
-
- =cut
-
- sub get_SeqFeatures{
- return @{shift->[SUBFEATURES] || []};
- }
-
=head2 display_name
--- 190,193 ----
***************
*** 269,272 ****
--- 285,306 ----
}
+ =head2 score
+
+ Title : score
+ Usage : $score = $feat->score()
+ Function: Returns the score
+ Returns : a string/number
+ Args : none
+
+ =cut
+
+ sub score {
+ my ($self) = shift;
+ if( @_) {
+ ($self->[SCORE]) = shift @_;
+ }
+ return $self->[SCORE];
+ }
+
=head2 get_tag_values
***************
*** 283,288 ****
sub get_tag_values {
my ($self,$tag) = @_;
! return unless defined $tag;
! return $self->[TAGS]->{$tag};
}
--- 317,322 ----
sub get_tag_values {
my ($self,$tag) = @_;
! return() unless defined $tag;
! return @{$self->[TAGS]->{$tag} || []};
}
***************
*** 534,538 ****
sub location {
! my ($self) = @_;
if( @_ ) {
$self->warn("this implementation does not let setting of LOCATION obj\n");
--- 568,572 ----
sub location {
! my ($self) = shift;
if( @_ ) {
$self->warn("this implementation does not let setting of LOCATION obj\n");
***************
*** 581,584 ****
--- 615,644 ----
}
+ =head2 parent_id
+
+ Title : parent_id
+ Usage : $obj->parent_id($newval)
+ Function:
+ Example :
+ Returns : value of parent_id (a scalar)
+ Args : on set, new value (a scalar or undef, optional)
+
+ Parent ID is a synonym for the tag 'Parent'
+
+ =cut
+
+ sub parent_id{
+ my $self = shift;
+
+ if (@_) {
+ if ($self->has_tag('Parent')) {
+ $self->remove_tag('Parent');
+ }
+ $self->add_tag_value('Parent', shift);
+ }
+ my ($id) = $self->get_tagset_values('Parent');
+ return $id;
+ }
+
sub generate_unique_persistent_id {
# DEPRECATED - use IDHandler
***************
*** 710,714 ****
=cut
! sub create_seqfeature_generic{
my ($self) = shift;
return Bio::SeqFeature::Generic->new(-location => $self->location,
--- 770,774 ----
=cut
! sub create_seqfeature_generic {
my ($self) = shift;
return Bio::SeqFeature::Generic->new(-location => $self->location,
***************
*** 722,724 ****
--- 782,907 ----
}
+ =head1 Methods to implement Bio::FeatureHolderI
+
+ This includes methods for retrieving, adding, and removing
+ features. Since this is already a feature, features held by this
+ feature holder are essentially sub-features.
+
+ =cut
+
+ =head2 get_SeqFeatures
+
+ Title : get_SeqFeatures
+ Usage : @feats = $feat->get_SeqFeatures();
+ Function: Returns an array of sub Sequence Features
+ Returns : An array
+ Args : none
+
+ =cut
+
+ sub get_SeqFeatures{
+ return @{shift->[SUBFEATURES] || []};
+ }
+
+ =head2 add_SeqFeature
+
+ Title : add_SeqFeature
+ Usage : $feat->add_SeqFeature($subfeat);
+ $feat->add_SeqFeature($subfeat,'EXPAND')
+ Function: adds a SeqFeature into the subSeqFeature array.
+ with no 'EXPAND' qualifier, subfeat will be tested
+ as to whether it lies inside the parent, and throw
+ an exception if not.
+
+ If EXPAND is used, the parent's start/end/strand will
+ be adjusted so that it grows to accommodate the new
+ subFeature
+ Returns : nothing
+ Args : An object which has the SeqFeatureI interface
+
+
+ =cut
+
+ #'
+ sub add_SeqFeature{
+ my ($self,$feat,$expand) = @_;
+ unless( defined $feat ) {
+ $self->warn("Called add_SeqFeature with no feature, ignoring");
+ return;
+ }
+ if ( ! $feat->isa('Bio::SeqFeatureI') ) {
+ $self->warn("$feat does not implement Bio::SeqFeatureI. Will add it anyway, but beware...");
+ }
+
+ if($expand && ($expand eq 'EXPAND')) {
+ $self->_expand_region($feat);
+ } else {
+ if ( ! $self->contains($feat) ) {
+ $self->throw("$feat is not contained within parent feature, and expansion is not valid");
+ }
+ }
+
+ $self->[SUBFEATURES] = [] unless defined ($self->[SUBFEATURES]);
+ push(@{$self->[SUBFEATURES]},$feat);
+ }
+
+ =head2 remove_SeqFeatures
+
+ Title : remove_SeqFeatures
+ Usage : $sf->remove_SeqFeatures
+ Function: Removes all sub SeqFeatures
+
+ If you want to remove only a subset, remove that subset from the
+ returned array, and add back the rest.
+
+ Example :
+ Returns : The array of Bio::SeqFeatureI implementing sub-features that was
+ deleted from this feature.
+ Args : none
+
+
+ =cut
+
+ sub remove_SeqFeatures {
+ my ($self) = @_;
+
+ my @subfeats = @{$self->[SUBFEATURES] || []};
+ $self->[SUBFEATURES] = []; # zap the array implicitly.
+ return @subfeats;
+ }
+
+ =head2 _expand_region
+
+ Title : _expand_region
+ Usage : $self->_expand_region($feature);
+ Function: Expand the total region covered by this feature to
+ accommodate the given feature.
+
+ May be called whenever any kind of subfeature is added to this
+ feature. add_SeqFeature() already does this when called with 'EXPAND'.
+ Returns :
+ Args : A Bio::SeqFeatureI implementing object.
+
+
+ =cut
+
+ sub _expand_region {
+ my ($self, $feat) = @_;
+ if(! $feat->isa('Bio::SeqFeatureI')) {
+ $self->warn("$feat does not implement Bio::SeqFeatureI");
+ }
+ # if this doesn't have start/end set - forget it!
+ if((! defined($self->start)) && (! defined $self->end)) {
+ $self->start($feat->start);
+ $self->end($feat->end);
+ $self->strand($feat->strand) unless $self->strand;
+ } else {
+ my ($start,$end,$strand) = $self->union($feat);
+ $self->start($start);
+ $self->end($end);
+ $self->strand($strand);
+ }
+ }
+
+
1;
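The EXPAND behaviour added in this revision can be sketched without BioPerl. The package Sketch::Feature, its slot layout and the method name add_sub_feature are hypothetical stand-ins, not the Slim.pm API. The sketch assumes only what the diff shows: an array-backed object indexed by constants, a containment check, and growth to the union of both ranges when 'EXPAND' is passed.

```perl
use strict;
use warnings;

package Sketch::Feature;

# Slot indices for the array-backed object (hypothetical layout).
use constant { START => 0, STOP => 1, STRAND => 2, SUBFEATURES => 3 };

sub new {
    my ($class, %args) = @_;
    return bless [ $args{start}, $args{end}, $args{strand} || 0, [] ], $class;
}

sub start  { $_[0][START] }
sub end    { $_[0][STOP] }
sub length { $_[0][STOP] - $_[0][START] + 1 }

# True when $other lies entirely inside this feature.
sub contains {
    my ($self, $other) = @_;
    return $other->start >= $self->start && $other->end <= $self->end;
}

sub add_sub_feature {
    my ($self, $feat, $expand) = @_;
    if (defined $expand && $expand eq 'EXPAND') {
        # Grow the parent to the union of both ranges.
        $self->[START] = $feat->start if $feat->start < $self->[START];
        $self->[STOP]  = $feat->end   if $feat->end   > $self->[STOP];
    }
    elsif (!$self->contains($feat)) {
        die "sub-feature lies outside the parent and EXPAND was not given\n";
    }
    push @{ $self->[SUBFEATURES] }, $feat;
    return;
}

package main;

my $mrna = Sketch::Feature->new(start => 20, end => 70, strand => 1);
$mrna->add_sub_feature(Sketch::Feature->new(start => 40, end => 70));
$mrna->add_sub_feature(Sketch::Feature->new(start => 80, end => 88), 'EXPAND');
printf "%d..%d length %d\n", $mrna->start, $mrna->end, $mrna->length;
# prints "20..88 length 69"
```

The numbers mirror the checks in SeqFeature_Slim.t: the parent spans 20..70 (length 51) until the third sub-feature is added with 'EXPAND', after which it spans 20..88 (length 69).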
From jason at dev.open-bio.org Wed Nov 7 14:44:00 2007
From: jason at dev.open-bio.org (Jason Stajich)
Date: Wed, 07 Nov 2007 19:44:00 +0000
Subject: [Bioperl-guts-l] bioperl-live/Bio/Graphics FeatureBase.pm, 1.33,
1.33.2.1
Message-ID: <200711071944.lA7Ji0lp015217@dev.open-bio.org>
Update of /home/repository/bioperl/bioperl-live/Bio/Graphics
In directory dev.open-bio.org:/tmp/cvs-serv15191/Bio/Graphics
Modified Files:
Tag: lightweight_feature_branch
FeatureBase.pm
Log Message:
alias frame and phase as Bio::Tools::GFF expects frame
Index: FeatureBase.pm
===================================================================
RCS file: /home/repository/bioperl/bioperl-live/Bio/Graphics/FeatureBase.pm,v
retrieving revision 1.33
retrieving revision 1.33.2.1
diff -C2 -d -r1.33 -r1.33.2.1
*** FeatureBase.pm 16 Oct 2007 19:28:22 -0000 1.33
--- FeatureBase.pm 7 Nov 2007 19:43:58 -0000 1.33.2.1
***************
*** 508,511 ****
--- 508,512 ----
}
sub phase { shift->{phase} }
+ *frame = \&phase;
sub class {
my $self = shift;
From jason at dev.open-bio.org Wed Nov 7 14:44:35 2007
From: jason at dev.open-bio.org (Jason Stajich)
Date: Wed, 07 Nov 2007 19:44:35 +0000
Subject: [Bioperl-guts-l] bioperl-live/Bio/Tools GFF.pm,1.68.2.2,1.68.2.3
Message-ID: <200711071944.lA7JiZFt015251@dev.open-bio.org>
Update of /home/repository/bioperl/bioperl-live/Bio/Tools
In directory dev.open-bio.org:/tmp/cvs-serv15225/Bio/Tools
Modified Files:
Tag: lightweight_feature_branch
GFF.pm
Log Message:
dynamically include/require the module type
Index: GFF.pm
===================================================================
RCS file: /home/repository/bioperl/bioperl-live/Bio/Tools/GFF.pm,v
retrieving revision 1.68.2.2
retrieving revision 1.68.2.3
diff -C2 -d -r1.68.2.2 -r1.68.2.3
*** GFF.pm 6 Nov 2007 21:10:53 -0000 1.68.2.2
--- GFF.pm 7 Nov 2007 19:44:33 -0000 1.68.2.3
***************
*** 138,143 ****
use Bio::Seq::SeqFactory;
use Bio::LocatableSeq;
- use Bio::SeqFeature::Generic;
- use Bio::SeqFeature::Slim;
use constant DEFAULT_FEATURE_TYPE => 'Bio::SeqFeature::Generic';
--- 138,141 ----
***************
*** 1254,1258 ****
sub feature_type{
my $self = shift;
! return $self->{'feature_type'} = shift if @_;
return $self->{'feature_type'};
}
--- 1252,1260 ----
sub feature_type{
my $self = shift;
! $self->{'feature_type'} = shift if @_;
! my $module = $self->{'feature_type'};
! $module =~ s/::/\//g;
! $module .= ".pm";
! require "$module";
return $self->{'feature_type'};
}
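The run-time loading in feature_type can be illustrated in isolation. The helper load_class below is hypothetical and not part of Bio::Tools::GFF, and List::Util is only a core-module stand-in for a feature class. Unlike the committed accessor, the sketch refuses an undefined class name instead of running the substitution on it; that guard is an addition of the sketch.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a class name into a file path and load it.
sub load_class {
    my ($class) = @_;
    die "no class name given\n" unless defined $class && length $class;
    (my $file = "$class.pm") =~ s{::}{/}g;   # Foo::Bar -> Foo/Bar.pm
    require $file;                            # dies if the module cannot be found
    return $class;
}

load_class('List::Util');                     # stand-in for a feature class
print List::Util::max(3, 7, 5), "\n";
# prints "7"
```

Because require records loaded files in %INC, calling such a helper on every access costs little after the first load.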
From jason at dev.open-bio.org Wed Nov 7 14:44:57 2007
From: jason at dev.open-bio.org (Jason Stajich)
Date: Wed, 07 Nov 2007 19:44:57 +0000
Subject: [Bioperl-guts-l] bioperl-live/Bio/SeqFeature Slim.pm, 1.1.2.3,
1.1.2.4
Message-ID: <200711071944.lA7JivjQ015285@dev.open-bio.org>
Update of /home/repository/bioperl/bioperl-live/Bio/SeqFeature
In directory dev.open-bio.org:/tmp/cvs-serv15259/Bio/SeqFeature
Modified Files:
Tag: lightweight_feature_branch
Slim.pm
Log Message:
phase and frame are interchangeable for now
Index: Slim.pm
===================================================================
RCS file: /home/repository/bioperl/bioperl-live/Bio/SeqFeature/Attic/Slim.pm,v
retrieving revision 1.1.2.3
retrieving revision 1.1.2.4
diff -C2 -d -r1.1.2.3 -r1.1.2.4
*** Slim.pm 7 Nov 2007 04:32:18 -0000 1.1.2.3
--- Slim.pm 7 Nov 2007 19:44:54 -0000 1.1.2.4
***************
*** 100,104 ****
my($class) = shift;
my ($start, $end, $strand, $primary_tag, $source_tag, $primary,
! $source, $frame, $score, $tag, $gff_string, $gff1_string,
$seqname, $seqid, $annot, $location,$display_name,$pid,$id,
$parent_id,$parent) =
--- 100,104 ----
my($class) = shift;
my ($start, $end, $strand, $primary_tag, $source_tag, $primary,
! $source, $frame,$phase, $score, $tag, $gff_string, $gff1_string,
$seqname, $seqid, $annot, $location,$display_name,$pid,$id,
$parent_id,$parent) =
***************
*** 112,115 ****
--- 112,116 ----
SOURCE
FRAME
+ PHASE
SCORE
TAG
***************
*** 135,138 ****
--- 136,140 ----
$primary_tag = $primary if defined $primary && ! defined $primary_tag;
$source_tag = $source if defined $source && ! defined $source_tag;
+ $frame = $phase if ! defined $frame && defined $phase;
my $self = bless [$seqid, #0
$source_tag, #1
***************
*** 248,251 ****
--- 250,254 ----
}
+
=head2 frame
***************
*** 260,264 ****
=cut
! sub frame{
my ($self) = shift;
if( @_) {
--- 263,267 ----
=cut
! sub frame {
my ($self) = shift;
if( @_) {
***************
*** 268,271 ****
--- 271,276 ----
}
+ *phase = \&frame;
+
=head2 has_tag
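The typeglob assignment used here installs one subroutine under two names. A minimal sketch follows, with the hypothetical package Sketch::Phased standing in for the feature class. It treats phase and frame as the same stored value, which is the "interchangeable for now" simplification stated in the log message and not a claim about GFF semantics.

```perl
use strict;
use warnings;

package Sketch::Phased;

sub new {
    my ($class, %args) = @_;
    # Prefer frame and fall back to phase, as the constructor change does.
    my $frame = defined $args{frame} ? $args{frame} : $args{phase};
    return bless { frame => $frame }, $class;
}

sub frame {
    my $self = shift;
    $self->{frame} = shift if @_;
    return $self->{frame};
}

{
    no warnings 'once';
    *phase = \&frame;    # same code reference, second name
}

package main;

my $cds = Sketch::Phased->new(phase => 2);
print $cds->frame, "\n";    # prints "2"
$cds->phase(1);
print $cds->frame, "\n";    # prints "1"
```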
From cjfields at dev.open-bio.org Wed Nov 7 21:19:31 2007
From: cjfields at dev.open-bio.org (Christopher John Fields)
Date: Thu, 08 Nov 2007 02:19:31 +0000
Subject: [Bioperl-guts-l] bioperl-live/Bio/DB/SeqFeature Store.pm, 1.32, 1.33
Message-ID: <200711080219.lA82JV9q015886@dev.open-bio.org>
Update of /home/repository/bioperl/bioperl-live/Bio/DB/SeqFeature
In directory dev.open-bio.org:/tmp/cvs-serv15851/Bio/DB/SeqFeature
Modified Files:
Store.pm
Log Message:
* alias delete_features (for GBrowse plugins)
* allow IO::String (for GBrowse plugins)
Index: Store.pm
===================================================================
RCS file: /home/repository/bioperl/bioperl-live/Bio/DB/SeqFeature/Store.pm,v
retrieving revision 1.32
retrieving revision 1.33
diff -C2 -d -r1.32 -r1.33
*** Store.pm 16 Oct 2007 19:28:22 -0000 1.32
--- Store.pm 8 Nov 2007 02:19:29 -0000 1.33
***************
*** 227,230 ****
--- 227,231 ----
*dna = *get_dna = *get_sequence = \&fetch_sequence;
*get_SeqFeatures = \&fetch_SeqFeatures;
+ *delete_SeqFeatures = *delete_features = \&delete;
=head1 Methods for Connecting and Initializating a Database
From cjfields at dev.open-bio.org Wed Nov 7 21:19:31 2007
From: cjfields at dev.open-bio.org (Christopher John Fields)
Date: Thu, 08 Nov 2007 02:19:31 +0000
Subject: [Bioperl-guts-l] bioperl-live/Bio/DB/SeqFeature/Store GFF3Loader.pm,
1.27, 1.28
Message-ID: <200711080219.lA82JV3k015883@dev.open-bio.org>
Update of /home/repository/bioperl/bioperl-live/Bio/DB/SeqFeature/Store
In directory dev.open-bio.org:/tmp/cvs-serv15851/Bio/DB/SeqFeature/Store
Modified Files:
GFF3Loader.pm
Log Message:
* alias delete_features (for GBrowse plugins)
* allow IO::String (for GBrowse plugins)
Index: GFF3Loader.pm
===================================================================
RCS file: /home/repository/bioperl/bioperl-live/Bio/DB/SeqFeature/Store/GFF3Loader.pm,v
retrieving revision 1.27
retrieving revision 1.28
diff -C2 -d -r1.27 -r1.28
*** GFF3Loader.pm 17 Jul 2007 18:37:06 -0000 1.27
--- GFF3Loader.pm 8 Nov 2007 02:19:29 -0000 1.28
***************
*** 918,921 ****
--- 918,922 ----
return IO::File->new("bunzip2 -c $thing |") if $thing =~ /\.bz2$/;
return IO::File->new("GET $thing |") if $thing =~ /^(http|ftp):/;
+ return $thing if ref $thing && $thing->isa('IO::String');
return IO::File->new($thing);
}
From cjfields at dev.open-bio.org Wed Nov 7 21:21:01 2007
From: cjfields at dev.open-bio.org (Christopher John Fields)
Date: Thu, 08 Nov 2007 02:21:01 +0000
Subject: [Bioperl-guts-l] bioperl-live/Bio/Graphics FeatureBase.pm, 1.33,
1.34
Message-ID: <200711080221.lA82L1Ww016003@dev.open-bio.org>
Update of /home/repository/bioperl/bioperl-live/Bio/Graphics
In directory dev.open-bio.org:/tmp/cvs-serv15978/Bio/Graphics
Modified Files:
FeatureBase.pm
Log Message:
Allow phase to be set
Index: FeatureBase.pm
===================================================================
RCS file: /home/repository/bioperl/bioperl-live/Bio/Graphics/FeatureBase.pm,v
retrieving revision 1.33
retrieving revision 1.34
diff -C2 -d -r1.33 -r1.34
*** FeatureBase.pm 16 Oct 2007 19:28:22 -0000 1.33
--- FeatureBase.pm 8 Nov 2007 02:20:59 -0000 1.34
***************
*** 507,511 ****
$str;
}
! sub phase { shift->{phase} }
sub class {
my $self = shift;
--- 507,517 ----
$str;
}
! sub phase {
! my $self = shift;
! my $d = $self->{phase};
! $self->{phase} = shift if @_;
! $d;
! }
!
sub class {
my $self = shift;
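One detail of the new accessor is easy to miss: when called with an argument it stores the new value but returns the previous one. A small sketch of that behaviour, using the hypothetical package Sketch::Glyph:

```perl
use strict;
use warnings;

package Sketch::Glyph;

sub new {
    my ($class, $phase) = @_;
    return bless { phase => $phase }, $class;
}

# Same shape as the accessor in the diff: read first, then set.
sub phase {
    my $self     = shift;
    my $previous = $self->{phase};
    $self->{phase} = shift if @_;
    return $previous;
}

package main;

my $glyph = Sketch::Glyph->new(0);
my $old   = $glyph->phase(2);
print "old=$old new=", $glyph->phase, "\n";
# prints "old=0 new=2"
```

Callers that expect a setter to return the value just assigned would need a second call to read it back.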
From cjfields at dev.open-bio.org Thu Nov 8 00:36:19 2007
From: cjfields at dev.open-bio.org (Christopher John Fields)
Date: Thu, 08 Nov 2007 05:36:19 +0000
Subject: [Bioperl-guts-l] bioperl-live/Bio/Graphics FeatureBase.pm, 1.34,
1.35
Message-ID: <200711080536.lA85aJrC016276@dev.open-bio.org>
Update of /home/repository/bioperl/bioperl-live/Bio/Graphics
In directory dev.open-bio.org:/tmp/cvs-serv16251/Bio/Graphics
Modified Files:
FeatureBase.pm
Log Message:
fix arg list not being passed along if GFF3 is intended
Index: FeatureBase.pm
===================================================================
RCS file: /home/repository/bioperl/bioperl-live/Bio/Graphics/FeatureBase.pm,v
retrieving revision 1.34
retrieving revision 1.35
diff -C2 -d -r1.34 -r1.35
*** FeatureBase.pm 8 Nov 2007 02:20:59 -0000 1.34
--- FeatureBase.pm 8 Nov 2007 05:36:17 -0000 1.35
***************
*** 531,540 ****
sub gff_string {
my $self = shift;
! my $recurse = shift;
!
if ($self->version == 3) {
return $self->gff3_string(@_);
}
!
my $name = $self->name;
my $class = $self->class;
--- 531,540 ----
sub gff_string {
my $self = shift;
!
if ($self->version == 3) {
return $self->gff3_string(@_);
}
!
! my $recurse = shift;
my $name = $self->name;
my $class = $self->class;
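The fix comes down to when the first argument is shifted off @_. A self-contained sketch follows. Sketch::Writer and its return strings are hypothetical, and it assumes that gff3_string takes the recurse flag as its first argument.

```perl
use strict;
use warnings;

package Sketch::Writer;

sub new {
    my ($class, $version) = @_;
    return bless { version => $version }, $class;
}

sub gff3_string {
    my ($self, $recurse) = @_;
    return 'gff3 recurse=' . ($recurse ? 1 : 0);
}

sub gff_string {
    my $self = shift;
    # Delegate before touching @_, so the callee sees every argument.
    return $self->gff3_string(@_) if $self->{version} == 3;
    my $recurse = shift;
    return 'gff2 recurse=' . ($recurse ? 1 : 0);
}

package main;

print Sketch::Writer->new(3)->gff_string(1), "\n";    # prints "gff3 recurse=1"
print Sketch::Writer->new(2)->gff_string(1), "\n";    # prints "gff2 recurse=1"
```

With the earlier ordering, the shift happened first and the version 3 branch forwarded an argument list that no longer contained the recurse flag.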
From cjfields at dev.open-bio.org Fri Nov 9 23:47:08 2007
From: cjfields at dev.open-bio.org (Christopher John Fields)
Date: Sat, 10 Nov 2007 04:47:08 +0000
Subject: [Bioperl-guts-l] bioperl-live/Bio/DB/SeqFeature/Store/DBI mysql.pm,
1.34, 1.35
Message-ID: <200711100447.lAA4l834019927@dev.open-bio.org>
Update of /home/repository/bioperl/bioperl-live/Bio/DB/SeqFeature/Store/DBI
In directory dev.open-bio.org:/tmp/cvs-serv19902/Bio/DB/SeqFeature/Store/DBI
Modified Files:
mysql.pm
Log Message:
Fix for mysql adaptor to recursively remove child features only when all parents are removed
Index: mysql.pm
===================================================================
RCS file: /home/repository/bioperl/bioperl-live/Bio/DB/SeqFeature/Store/DBI/mysql.pm,v
retrieving revision 1.34
retrieving revision 1.35
diff -C2 -d -r1.34 -r1.35
*** mysql.pm 16 Oct 2007 19:28:22 -0000 1.34
--- mysql.pm 10 Nov 2007 04:47:05 -0000 1.35
***************
*** 1118,1124 ****
my $key = shift;
my $dbh = $self->dbh;
my $success = 0;
for my $table ($self->all_tables) {
! $success += $dbh->do("DELETE FROM $table WHERE id=$key");
}
return $success;
--- 1118,1139 ----
my $key = shift;
my $dbh = $self->dbh;
+ my $child_table = $self->_parent2child_table;
+ my $query = "SELECT child FROM $child_table WHERE id=?";
+ my $sth=$self->_prepare($query);
+ $sth->execute($key);
my $success = 0;
+ while (my ($cid) = $sth->fetchrow_array) {
+ # Backcheck looking for multiple parents, delete only if one is present. I'm
+ # sure there is a nice way to left join the parent2child table onto itself
+ # to get this in one query above, just haven't worked it out yet...
+ my $sth2 = $self->_prepare("SELECT count(id) FROM $child_table WHERE child=?");
+ $sth2->execute($cid);
+ my ($count) = $sth2->fetchrow_array;
+ if ($count == 1) {
+ $self->_deleteid($cid) || $self->throw("Couldn't remove subfeature!");
+ }
+ }
for my $table ($self->all_tables) {
! $success += $dbh->do("DELETE FROM $table WHERE id=$key") || 0;
}
return $success;
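The back-check in this change can be modelled without a database. In the sketch below the parent2child table is a hypothetical list of [parent_id, child_id] pairs and children_safe_to_delete is an illustrative name. It mirrors the two queries in the diff: list the children of the key, then keep only those with exactly one parent row.

```perl
use strict;
use warnings;

# Hypothetical parent2child rows as [parent_id, child_id] pairs.
my @rows = ([1, 10], [1, 11], [2, 11]);

sub children_safe_to_delete {
    my ($parent, @pairs) = @_;
    my %parent_count;
    $parent_count{ $_->[1] }++ for @pairs;        # parent rows per child
    return grep { $parent_count{$_} == 1 }        # only one parent overall
           map  { $_->[1] }
           grep { $_->[0] == $parent } @pairs;    # children of this parent
}

print join(',', children_safe_to_delete(1, @rows)), "\n";
# prints "10"
```

Child 11 survives because feature 2 still refers to it, which is the case the commit is meant to protect.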
From cjfields at dev.open-bio.org Tue Nov 13 13:28:27 2007
From: cjfields at dev.open-bio.org (Christopher John Fields)
Date: Tue, 13 Nov 2007 18:28:27 +0000
Subject: [Bioperl-guts-l] bioperl-live/Bio/SearchIO blast.pm,1.120,1.121
Message-ID: <200711131828.lADISRHe003073@dev.open-bio.org>
Update of /home/repository/bioperl/bioperl-live/Bio/SearchIO
In directory dev.open-bio.org:/tmp/cvs-serv3048/Bio/SearchIO
Modified Files:
blast.pm
Log Message:
Now catches primary GI
Index: blast.pm
===================================================================
RCS file: /home/repository/bioperl/bioperl-live/Bio/SearchIO/blast.pm,v
retrieving revision 1.120
retrieving revision 1.121
diff -C2 -d -r1.120 -r1.121
*** blast.pm 4 Nov 2007 00:11:44 -0000 1.120
--- blast.pm 13 Nov 2007 18:28:25 -0000 1.121
***************
*** 184,188 ****
'Hit_def' => 'HIT-description',
'Hit_signif' => 'HIT-significance',
!
# For NCBI blast, the description line contains bits.
# For WU-blast, the