[BioRuby] Parsing a file in Swissprot format
Urban Hafner
urban at bettong.net
Fri Dec 2 08:00:01 EST 2005
Hi everybody,
I'm new to BioRuby and I think I'm doing something wrong while parsing a
file in Swissprot format. What I'm trying to do is to get the sequence
out of it. I do it like this:
sequence = Bio::SPTR.new(File.new(f).read)
p sequence.sq
But that doesn't work; it gives me this error message:
/home/users/hafner/lib/site_ruby/1.8/bio/db/embl/sptr.rb:706:in `sq':
Invalid SQ Line: (RuntimeError)
'AAGCTTAATGTATATAATCTTTTAGAGGTAAAATCTACAGCCAGCAAAAGTCATGGTAAA
TATTCTTTGACTGAACTCTCACTAAACTCCTCTAAATTATATGTCATATTAACTGGTTAA
ATTAATATAAATTTGTGACATGACCTTAACTGGTTAGGTAGGATATTTTTCTTCATGCAA
AAATATGACTAATAATAATTTAGCACAAAAATATTTCCCAATACTTTAATTCTGTGATAG
AAAAATGTTTAACTCAGCTACTATAATCCCATAATTTTGAAAACTATTTATTAGCTTTTG
TGTTTGACCCTTCCCTAGCCAAAGGCAACTATTTAAGGACCCTTTAAAACTCTTGAAACT
ACTTTAGAGTC' from diplomarbeit/tools/smartdb-entries-without-sequence.rb:10
I'm not sure whether this is BioRuby's fault (I'm using the version from CVS)
or whether the input file is faulty.
Does anybody have a clue what I'm doing wrong here?
Cheers, Urban
Here's my input file:
AC SM0000001
XX
DT 1.1.1999 00:00:00 (created); ili
DT 8.12.2004 12:49:00 (updated); ili2
XX
NA MOUSE$kappa-MAR
XX
OS mouse, Mus spec.
OC eukaryota; animalia; metazoa; chordata; vertebrata;
OC tetrapoda; mammalia; eutheria; rodentia; myomorpha; muridae;
OC murinae
XX
HO human, rabbit [2]
XX
SZ 371 bp
XX
DE G000538; immunoglobulin kappa light chain
DP Direction: 3'; Pos 1: ATG
DN Internal: y;
DC between joining and constant regions [1]; ~200 bp
DC upstream of the kappa enhancer [1]
XX
SQ AAGCTTAATGTATATAATCTTTTAGAGGTAAAATCTACAGCCAGCAAAAGTCATGGTAAA
SQ TATTCTTTGACTGAACTCTCACTAAACTCCTCTAAATTATATGTCATATTAACTGGTTAA
SQ ATTAATATAAATTTGTGACATGACCTTAACTGGTTAGGTAGGATATTTTTCTTCATGCAA
SQ AAATATGACTAATAATAATTTAGCACAAAAATATTTCCCAATACTTTAATTCTGTGATAG
SQ AAAAATGTTTAACTCAGCTACTATAATCCCATAATTTTGAAAACTATTTATTAGCTTTTG
SQ TGTTTGACCCTTCCCTAGCCAAAGGCAACTATTTAAGGACCCTTTAAAACTCTTGAAACT
SQ ACTTTAGAGTC
SC [7]
XX
FT 2 - 11: cleavage by topoisomerase II [3]
FT 2 - 15: deleted in plasmacytoma PC 7183 [3]
FT 5 - 14: cleavage by topoisomerase II [3]
FT 5 - 14: 5'-recombination junction [3]
FT 8 - 17: cleavage by topoisomerase II [3]
FT 10 - 19: cleavage by topoisomerase II [3]
FT 32 - 41: cleavage by topoisomerase II [3]
FT 53 - 62: cleavage by Drosophila topoisomerase II only
FT [3]
FT 68 - 77: cleavage by topoisomerase II [3]
FT 69 - 78: cleavage by topoisomerase II [3]
FT 73 - 82: cleavage by topoisomerase II [3]
FT 98 - 107: cleavage by Drosophila topoisomerase II only
FT [3]
FT 147 - 156: cleavage by topoisomerase II [3]
FT 163 - 284: confers MAR-like features upon any DNA when
FT contiguously reiterated in the same molecule
FT [7]
FT 164 - 170: similar motif found in human PARP MAR
FT SM0000116 [8]
FT 182 - 191: cleavage by topoisomerase II [3]
FT 189 - 198: cleavage by topoisomerase II [3]
FT 219 - 228: cleavage by topoisomerase II [3]
FT 242 - 251: cleavage by topoisomerase II [3]
FT 248 - 257: cleavage by topoisomerase II [3]
FT 253 - 253: G in [3]
FT 256 - 265: cleavage by topoisomerase II [3]
XX
SF topoisomerase II sites [1]; AT-rich sites [1];
SF contains a breakpoint for chromosomal translocation [3];
SF several short stretches of homopolymeric adenine or
SF thymine [7]
XX
BP 75% [J. Bode, direct submission]; 20% [7]
TP constitutive [1]
XX
FF prototype of a S/MAR; contributes to maximal expression of
FF the kappa gene [2]; contributes to hypermutation [9];
FF contributes to kappa expression as shown by flow cytometic
FF assay, but has little effect on accumulation of the
FF respective mRNA [9]
XX
CP liver, kidney, spleen, thymus, MPC-11, P-815, L-cell [1]
XX
EV in vitro selection of S/MAR
EC [J. Bode, direct submission]
XX
BF SB000002; lamin A [6]
MM nitrocellulose filter binding;
SO rl; rat
QA 6
BF SB000003; lamin B1 [6]
MM nitrocellulose filter binding;
SO rl; rat
QA 6
BF SB000004; lamin C [6]
MM nitrocellulose filter binding;
SO rl; rat
QA 6
BF SB000018; SP120 [4]
MM nitrocellulose filter binding;
SO brain; rat
QA 6
BF SB000018; SP120 [4]
MM southwestern blotting;
SO brain; rat
QA 6
BF SB000022; topoisomerase II [3]
MM gel retardation;
SO Drosophila; Drosophila melanogaster
QA 6
BF SB000022; topoisomerase II [3]
MM topoisomerase II cleavage assay;
SO Drosophila; Drosophila melanogaster
QA 6
BF SB000043; topoisomerase II [3]
MM topoisomerase II cleavage assay;
SO calf; calf
QA 6
BF SB000045; SMI1 [5]
MM functional analysis;
PR 254 bp fragment
SO yeast, extract; baker's yeast, Saccharomyces cerevisiae
QA 6
BF SB000052; topoisomerase II [3]
MM topoisomerase II cleavage assay;
SO mouse; mouse
QA 6
BF SB000053; topoisomerase II [3]
MM nitrocellulose filter binding;
SO HeLa; human
QA 6
BF SB000067; SMAR1 [10]
MM gel shift competition;
SO rec(mouse-E.coli); mouse
QA 6
BF SB000077; SAF-A [12]
MM supershift (antibody binding);
SO liver; mouse
QA 6
BF SB000077; SAF-A [12]
MM southwestern blotting;
SO liver; mouse
QA 6
XX
RN [1]
RX MEDLINE; 86106203 PubMed; 3002631
RA Cockerill, P. N., Garrard, W. T.
RT Chromosomal loop anchorage of the kappa immunoglobin gene
RT occurs next to the enhancer in a region containing
RT topoisomerase II sites
RL Cell 44:273-282 (1986)
RN [2]
RX MEDLINE; 90078219 PubMed; 2512290
RA Blasquez, V. C., Xu, M., Moses, S. C., Garrard, W. T.
RT Immunoglobulin kappa gene expression after stable
RT integration. I. Role of the intronic MAR and enhancer in
RT plasmacytoma cells
RL J. Biol. Chem. 264:21183-21189 (1989)
RN [3]
RX MEDLINE; 89315824 PubMed; 2546156
RA Sperry, A. O., Blasquez, V. C., Garrard, W. T.
RT Dysfunction of chromosomal llop attachment sites:
RT Illegitimate recombination linked to matrix association
RT regions and topoisomerase II
RL Proc. Natl. Acad. Sci. USA 86:5497-5501 (1989)
RN [4]
RX MEDLINE; 93286136 PubMed; 8509422
RA Tsutsui, K., Tsutsui, K., Okada, S., Watarai, S., Seki, S.,
RA Yasuda, T., Shohmori, T.
RT Identification and characterization of a nuclear scaffold
RT protein that binds the matrix attachment region DNA
RL J. Biol. Chem. 268:12886-12894 (1993)
RN [5]
RX MEDLINE; 93296190 PubMed; 8516310
RA Fishel, B. R., Sperry, A. O., Garrard, W. T.
RT Yeast calmodulin and a conserved nuclear protein
RT participate in the in vivo binding of a matrix associated
RT region
RL Proc. Natl. Acad. Sci. USA 90:5623-5627 (1993)
RN [6]
RX MEDLINE; 94344140 PubMed; 8065361
RA Luderus, M. E. E., den Blaauwen, J. L., de Smit, O. J. B.,
RA Compton, D. A., van Driel, R.
RT Binding of matrix attachment regions to lamin polymers
RT involves single-stranded regions and the minor groove
RL Mol. Cell. Biol. 14:6297-6305 (1994)
RN [7]
RX MEDLINE; 96222527 PubMed; 8670229
RA Okada, S., Tsutsui, K., Tsutsui, K., Seki, S., Shohmori, T.
RT Subdomain structure of the matrix attachment region located
RT within the mouse immunoglobulin kappa gene intron
RL Biochem. Biophys. Res. Commun. 222:472-477 (1996)
RN [8]
RA Boulikas, T., Kong, C. F., Brooks, D., Hsie, L.
RT The 3' untranslated region of the human
RT poly(ADP-ribose)polymerase gene is a nuclear matrix
RT anchoring site
RL Int. J. Oncol. 9:1287-1294 (1996)
RN [9]
RX MEDLINE; 97377037 PubMed; 9233808
RA Goyenechea, B., Klix, N., Williams, G. T., Riddell, A.,
RA Neuberger, M. S., Milstein, C.
RT Cells strongly expressing Ig kappa transgenes show clonal
RT recruitment of hypermutation: a role for both MAR and the
RT enhancers
RL EMBO J. 16:3987-3994 (1997)
RN [10]
RX MEDLINE; 20408892 PubMed; 10950932
RA Chattopadhyay, S., Kaul, R., Charest, A., Housman, D.,
RA Chen, J.
RT SMAR1, a novel, alternatively spliced gene product, binds
RT the scaffold/matrix-associated region at the T cell
RT receptor beta locus
RL Genomics 68:93-96 (2000)
RN [11]
RX MEDLINE; 20496822 PubMed; 11041885
RA Morisawa, G., Han-yama, A., Moda, I., Tamai, A., Iwabuchi,
RA M., Meshi, T.
RT AHM1, a novel type of nuclear matrix-localized, MAR binding
RT protein with a single AT hook and a J
RT domain-homologous region
RL Plant Cell 12:1903-1916 (2000)
RN [12]
RX MEDLINE; 21456956 PubMed; 11573239
RA Lobov, I. B., Tsutsui, K., Mitchell, A. R., Podgornaya, O.
RA I.
RT Specificity of SAF-A and lamin B binding in vitro
RT correlates with the satellite DNA bending state
RL J. Cell. Biochem. 83:218-229 (2001)
//
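A possible workaround, sketched in plain Ruby without any BioRuby calls. It assumes the error arises because the SQ lines in this entry carry the residues directly, whereas a Swissprot entry starts its SQ block with a summary line (length in AA, molecular weight, CRC64 checksum) that the parser appears to expect. The helper name extract_sq is hypothetical, and the short entry below is a trimmed stand-in for the real file.

```ruby
# Hypothetical helper (not part of BioRuby): concatenate the residues found
# on every line whose two-letter tag is "SQ".
def extract_sq(entry_text)
  entry_text.each_line
            .select { |line| line.start_with?("SQ") }      # keep SQ lines only
            .map    { |line| line.sub(/\ASQ\s*/, "").strip } # drop tag and whitespace
            .join
end

# Trimmed stand-in for the entry above (assumption: same tag layout).
entry = <<~ENTRY
  AC SM0000001
  XX
  SZ 371 bp
  XX
  SQ AAGCTTAATG
  SQ TATTCTTTGA
  SC [7]
  //
ENTRY

sequence = extract_sq(entry)
puts sequence         # prints AAGCTTAATGTATTCTTTGA
puts sequence.length  # prints 20
```

With the real file, extract_sq(File.read(f)) should give the full sequence, and its length can be checked against the SZ line of the entry.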