From saccenti at cerm.unifi.it Fri Jul 1 05:38:34 2005
From: saccenti at cerm.unifi.it (Edoardo Saccenti)
Date: Fri Jul 1 05:30:02 2005
Subject: [BioPython] blast output
Message-ID: <200507011138.35631.saccenti@cerm.unifi.it>

Dear all,

I want to BLAST my sequence and get all possible results. If I run BLAST as in the cookbook, I get far fewer results than when I run BLAST manually and set the number of descriptions to 1000. How do I set the number of descriptions in NCBIWWW.qblast?

Another question: I ran BLAST manually and saved the output page in HTML format using my browser. Then I tried to parse it, but I get a lot of errors:

=================================
Traceback (most recent call last):
  File "ParseBlastOutput.py", line 7, in ?
    b_record = b_parser.parse(blast_out)
  File "/usr/lib/python2.3/site-packages/Bio/Blast/NCBIWWW.py", line 47, in parse
    self._scanner.feed(handle, self._consumer)
  File "/usr/lib/python2.3/site-packages/Bio/Blast/NCBIWWW.py", line 100, in feed
    self._scan_header(uhandle, consumer)
  File "/usr/lib/python2.3/site-packages/Bio/Blast/NCBIWWW.py", line 167, in _scan_header
    self._scan_query_info(uhandle, consumer)
  File "/usr/lib/python2.3/site-packages/Bio/Blast/NCBIWWW.py", line 235, in _scan_query_info
    read_and_call_until(uhandle, consumer.query_info, blank=1)
  File "/usr/lib/python2.3/site-packages/Bio/ParserSupport.py", line 340, in read_and_call_until
    method(line)
  File "/usr/lib/python2.3/site-packages/Bio/ParserSupport.py", line 136, in _apply_clean_data
    self._prev_attr(clean)
  File "/usr/lib/python2.3/site-packages/Bio/Blast/NCBIStandalone.py", line 651, in query_info
    "I could not find the number of letters in line\n%s" % line)
  File "/usr/lib/python2.3/site-packages/Bio/Blast/NCBIStandalone.py", line 1618, in _re_search
    raise SyntaxError, error_msg
SyntaxError: I could not find the number of letters in line
27,711,076 sequences; 14,863,934,794 total letters
======================================================

I copied my alignments into an output file obtained from NCBIWWW.qblast, using its header. This time I get a different error:

=============
Traceback (most recent call last):
  File "ParseBlastOutput.py", line 7, in ?
    b_record = b_parser.parse(blast_out)
  File "/usr/lib/python2.3/site-packages/Bio/Blast/NCBIWWW.py", line 47, in parse
    self._scanner.feed(handle, self._consumer)
  File "/usr/lib/python2.3/site-packages/Bio/Blast/NCBIWWW.py", line 102, in feed
    self._scan_database_report(uhandle, consumer)
  File "/usr/lib/python2.3/site-packages/Bio/Blast/NCBIWWW.py", line 539, in _scan_database_report
    read_and_call(uhandle, consumer.noevent, blank=1)
  File "/usr/lib/python2.3/site-packages/Bio/ParserSupport.py", line 300, in read_and_call
    raise SyntaxError, errmsg
SyntaxError: Expected blank line, but got:
Gapped

Any idea?

Thanks,
Edoardo

"Raffiniert ist der Herr Gott, aber boshaft ist Er nicht."
---
Dr. Edoardo Saccenti
CERM Nuclear Magnetic Resonance Research Center
Scientific Pole - University of Florence
Via Luigi Sacconi n° 6
50019 Sesto Fiorentino (FI)
tel: +39 055 4574193
fax: +39 055 4574253
saccenti@cerm.unifi.it
www.cerm.unifi.it

From djkojeti at unity.ncsu.edu Mon Jul 4 16:00:59 2005
From: djkojeti at unity.ncsu.edu (Douglas Kojetin)
Date: Mon Jul 4 15:52:08 2005
Subject: [BioPython] blast output
Message-ID: <4AFACDD2-FB4C-4D89-A3B0-23D5F4EDD319@unity.ncsu.edu>

Hi All-

A few questions ...

(1) RE:

> I want to BLAST my sequence and get all possible results. If I run
> BLAST as in the cookbook, I get far fewer results than when I run
> BLAST manually and set the number of descriptions to 1000.
> How do I set the number of descriptions in NCBIWWW.qblast?

I am also having difficulty setting the number of descriptions. If I use the following command:

bresults = NCBIWWW.qblast('blastp', 'nr', seq, format_type='Text', alignments=0, descriptions=1000)

only 100 results are output. How can I increase the number of hits for a query?

(2) Is it possible to turn off this warning message?
WW.py:1062: UserWarning: qblast works only with blastn and blastp for now.
  warnings.warn("qblast works only with blastn and blastp for now.")

(3) I'd like to search the 'pdb' database for non-redundant protein structures. If I do an 'nr' database search, it appears that non-redundant sequences from all databases (including 'pdb') appear in the output. Furthermore, the database name is the first field of each hit (delimited by the '|' pipe character):

emb|...
pdb|...
gb|...

Is it possible to search the 'nr' database and output only those hits that come from the 'pdb' database? I know it would be easy to filter the results of the query using split('|') or line.find('pdb'), but I'm not getting enough hits from the 'pdb' database during the 'nr' query (related to my question 1 above).

Many thanks in advance for your help,
Doug

From matomatias at hotmail.com Tue Jul 5 01:54:23 2005
From: matomatias at hotmail.com (Matias Saavedra)
Date: Tue Jul 5 01:45:23 2005
Subject: [BioPython] Problems with MMCIFlex module
Message-ID:

Hi,

I have a big problem trying to use Biopython: when I try to import the PDB module, I get the following error message:

>>> from Bio.PDB import *
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/biopython-1.30/Bio/PDB/__init__.py", line 13, in ?
    from MMCIFParser import MMCIFParser
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/biopython-1.30/Bio/PDB/MMCIFParser.py", line 6, in ?
    from MMCIF2Dict import MMCIF2Dict
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/biopython-1.30/Bio/PDB/MMCIF2Dict.py", line 2, in ?
    import Bio.PDB.mmCIF.MMCIFlex
ImportError: No module named MMCIFlex

There is no module named MMCIFlex. I have tried removing the modules that need MMCIFlex, but I always get a new error message. What can I do? I really need to use the PDB module in my work.

Thanks for helping me.

Matias Saavedra
Institut Pasteur
Informatique en biologie

From saccenti at cerm.unifi.it Tue Jul 5 05:54:44 2005
From: saccenti at cerm.unifi.it (Edoardo Saccenti)
Date: Tue Jul 5 05:52:43 2005
Subject: [BioPython] blast output
In-Reply-To: <4AFACDD2-FB4C-4D89-A3B0-23D5F4EDD319@unity.ncsu.edu>
References: <4AFACDD2-FB4C-4D89-A3B0-23D5F4EDD319@unity.ncsu.edu>
Message-ID: <200507051154.44597.saccenti@cerm.unifi.it>

On Monday 04 July 2005 22:00, Douglas Kojetin wrote:
> Hi All-
>
> A few questions ...

I didn't succeed in getting more hits!

From mdehoon at c2b2.columbia.edu Tue Jul 5 11:24:44 2005
From: mdehoon at c2b2.columbia.edu (Michiel De Hoon)
Date: Tue Jul 5 11:16:31 2005
Subject: [BioPython] Problems with MMCIFlex module
Message-ID: <6CA15ADD82E5724F88CB53D50E61C9AE7AC203@cgcmail.cgc.cpmc.columbia.edu>

Have you tried the latest version of Biopython (1.40b)?
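
One way to check which copy of a package Python would actually import is to ask the import machinery for the file behind a module name. The following is only a sketch in present-day Python, not something posted in this thread; `find_module_origin` is a hypothetical helper name, and the module names at the bottom are simply the ones from the traceback above.

```python
import importlib.util
import sys


def find_module_origin(name):
    """Return the file a module would be loaded from, or None if not found."""
    try:
        spec = importlib.util.find_spec(name)
    except ImportError:
        # A parent package (e.g. "Bio") is missing or failed to import.
        return None
    return spec.origin if spec is not None else None


if __name__ == "__main__":
    # sys.path is searched in order, so an old copy earlier in the list wins.
    for entry in sys.path:
        print("search path:", entry)
    for name in ("Bio", "Bio.PDB"):
        print(name, "->", find_module_origin(name))
```

If the printed location contains an old version directory (such as biopython-1.30), that copy is shadowing the newer install.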
Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032

From matomatias at hotmail.com Tue Jul 5 14:38:07 2005
From: matomatias at hotmail.com (Matias Saavedra)
Date: Tue Jul 5 14:29:11 2005
Subject: [BioPython] Problems with MMCIFlex module
In-Reply-To:
Message-ID:

Yes, I used the latest version, biopython-1.40b. But I only decompressed the tar file and placed it in a folder on the Python path. I didn't run the install step... I don't know if that is the problem...

Thanks for helping me

From mdehoon at c2b2.columbia.edu Tue Jul 5 14:40:02 2005
From: mdehoon at c2b2.columbia.edu (Michiel De Hoon)
Date: Tue Jul 5 14:33:37 2005
Subject: [BioPython] Problems with MMCIFlex module
Message-ID: <6CA15ADD82E5724F88CB53D50E61C9AE7AC207@cgcmail.cgc.cpmc.columbia.edu>

Yes, that would be a problem.
From your traceback:

>>> from Bio.PDB import *
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/biopython-1.30/Bio/PDB/__init__.py", line 13, in ?
....

you can see it is using biopython-1.30 instead of version 1.40b. Have a look at http://www.biopython.org/docs/install/Installation.html for the installation instructions, and please let us know if there is still a problem with Biopython 1.40b.

--Michiel.

From matomatias at hotmail.com Tue Jul 5 17:54:35 2005
From: matomatias at hotmail.com (Matias Saavedra)
Date: Tue Jul 5 17:45:37 2005
Subject: [BioPython] Problems with MMCIFlex module
In-Reply-To:
Message-ID:

Thanks for taking care of my problem. At the moment I still can't install Biopython, but now I'm trying to install biopython-1.40b from the .tar file. When I type the install command, python setup.py install, I receive the following message:

[Macintosh:~/biopython-1.40b] mato% python setup.py install
running install
*** mxTextTools *** is either not installed or out of date.
This package is required for many Biopython features. Please install it before you install Biopython.
You can find mxTextTools at http://www.egenix.com/files/python/eGenix-mx-Extensions.html.

So I went to the eGenix web page to get the mxTextTools that Biopython needs. Unfortunately, I can't install the eGenix base package, because I receive the following message when I try to build it:

[Macintosh:~/egenix-mx-base-2.0.6] mato% python setup.py build
running build
running mx_autoconf
gcc -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -fno-strict-aliasing -Wno-long-double -no-cpp-precomp -mno-fused-madd -fno-common -dynamic -D_GNU_SOURCE=1 -I/usr/local/include -I/Library/Frameworks/Python.framework/Versions/2.4/include -c _configtest.c -o _configtest.o
unable to execute gcc: No such file or directory
failure.
removing: _configtest.c _configtest.o
gcc -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -fno-strict-aliasing -Wno-long-double -no-cpp-precomp -mno-fused-madd -fno-common -dynamic -D_GNU_SOURCE=1 -I/Library/Frameworks/Python.framework/Versions/2.4/include/python2.4 -I/usr/local/include -I/Library/Frameworks/Python.framework/Versions/2.4/include -c _configtest.c -o _configtest.o
unable to execute gcc: No such file or directory
failure.
removing: _configtest.c _configtest.o
running build_ext
building 'mx.DateTime.mxDateTime.mxDateTime' extension
creating build
creating build/temp.darwin-8.0.0-Power_Macintosh-2.4
creating build/temp.darwin-8.0.0-Power_Macintosh-2.4/mx
creating build/temp.darwin-8.0.0-Power_Macintosh-2.4/mx/DateTime
creating build/temp.darwin-8.0.0-Power_Macintosh-2.4/mx/DateTime/mxDateTime
creating build/temp.darwin-8.0.0-Power_Macintosh-2.4/mx/DateTime/mxDateTime/mxDateTime
creating build/temp.darwin-8.0.0-Power_Macintosh-2.4/mx/DateTime/mxDateTime/mxDateTime/mx
creating build/temp.darwin-8.0.0-Power_Macintosh-2.4/mx/DateTime/mxDateTime/mxDateTime/mx/DateTime
creating build/temp.darwin-8.0.0-Power_Macintosh-2.4/mx/DateTime/mxDateTime/mxDateTime/mx/DateTime/mxDateTime
gcc -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -fno-strict-aliasing -Wno-long-double -no-cpp-precomp -mno-fused-madd -fno-common -dynamic -DBAD_STATIC_FORWARD=1 -UHAVE_STRPTIME -Imx/DateTime/mxDateTime -I/Library/Frameworks/Python.framework/Versions/2.4/include/python2.4 -I/usr/local/include -I/Library/Frameworks/Python.framework/Versions/2.4/include -c mx/DateTime/mxDateTime/mxDateTime.c -o build/temp.darwin-8.0.0-Power_Macintosh-2.4/mx/DateTime/mxDateTime/mxDateTime/mx/DateTime/mxDateTime/mxDateTime.o
unable to execute gcc: No such file or directory
error: command 'gcc' failed with exit status 1

So I'm stuck at this point: since I can't install mxTextTools, I can't install Biopython either. The latter problem seems to be related to GCC or C compilers, but I don't understand it very well.

Can you help me with that?

Thanks a lot

From mdehoon at c2b2.columbia.edu Tue Jul 5 17:57:18 2005
From: mdehoon at c2b2.columbia.edu (Michiel De Hoon)
Date: Tue Jul 5 17:51:57 2005
Subject: [BioPython] Problems with MMCIFlex module
Message-ID: <6CA15ADD82E5724F88CB53D50E61C9AE7AC20A@cgcmail.cgc.cpmc.columbia.edu>

> unable to execute gcc: No such file or directory

This is your problem: you don't have a C compiler installed. You will need a C compiler for mxTextTools and also for Biopython itself (and for lots of other software packages that don't come precompiled). For Mac OS X, you can download a C compiler from Apple's website. It is called "Developer Tools" or something like that. After you install it, you should be able to run gcc, and "python setup.py build" and "python setup.py install" should work.

--Michiel.

From matomatias at hotmail.com Tue Jul 5 22:54:07 2005
From: matomatias at hotmail.com (Matias Saavedra)
Date: Tue Jul 5 22:45:05 2005
Subject: [BioPython] Problems with MMCIFlex module
In-Reply-To:
Message-ID:

Thanks a lot Michiel, I will do that and reply with the results.

Mathias

From lee.byung-chul at kaist.ac.kr Wed Jul 6 02:26:53 2005
From: lee.byung-chul at kaist.ac.kr (Lee, Byung-chul)
Date: Wed Jul 6 02:20:45 2005
Subject: [BioPython] Problems with MMCIFlex module
Message-ID: <42CB79AD.4090407@kaist.ac.kr>

In the previous mail, I made a mistake.
At that time, I thought the problem came from deleting the old version, but that was not true. In my case, the problem came from 'sys.path'. When I erased my old Biopython version, I still had a directory named "Bio/" under my present working directory. In this case, my default sys.path is set like this:

['/home/myid/workingdirectory', 'other python library path1', 'other python library path2', ...]

so my working directory has the highest priority in the path, and Python picked up that "Bio/" directory and could not find the MMCIFlex module at all.

In conclusion, I also think that if you follow Michiel De Hoon's method, you will be able to solve the problem.

--
--------------------------------------------------------
The important thing is not to stop questioning. : Albert Einstein

Byung chul Lee at Dept. BioSystems
KAIST, Korea
Ph.D candidate
82-42-869-4357
--------------------------------------------------------

From lee.byung-chul at kaist.ac.kr Tue Jul 5 21:35:23 2005
From: lee.byung-chul at kaist.ac.kr (Lee, Byung-chul)
Date: Wed Jul 6 11:45:50 2005
Subject: [BioPython] Problems with MMCIFlex module
In-Reply-To: <6CA15ADD82E5724F88CB53D50E61C9AE7AC20A@cgcmail.cgc.cpmc.columbia.edu>
References: <6CA15ADD82E5724F88CB53D50E61C9AE7AC20A@cgcmail.cgc.cpmc.columbia.edu>
Message-ID: <42CB355B.3020401@kaist.ac.kr>

An HTML attachment was scrubbed...
URL: http://portal.open-bio.org/pipermail/biopython/attachments/20050706/54151c84/attachment-0001.htm

From pwilkinson_m at xbioinformatics.org Wed Jul 6 23:01:09 2005
From: pwilkinson_m at xbioinformatics.org (Peter Wilkinson)
Date: Wed Jul 6 22:53:24 2005
Subject: [BioPython] Parsing Swissprot-Files
Message-ID: <6.1.2.0.2.20050706230105.01c3d060@mail.xbioinformatics.org>

Is it just me, or is the UniGene parser only for HTML output? I would like to parse some UniGene records as published by NCBI in flat-file format. I will write one if someone can confirm there isn't one.
Peter

From F.Zhang at surrey.ac.uk Thu Jul 7 12:10:44 2005
From: F.Zhang at surrey.ac.uk (F.Zhang)
Date: Thu Jul 7 12:16:28 2005
Subject: [BioPython] About features extraction from PDB files
Message-ID:

Skipped content of type multipart/alternative
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: image/jpeg
Size: 7957 bytes
Desc: image001.jpg
Url : http://portal.open-bio.org/pipermail/biopython/attachments/20050707/ae543377/attachment.jpg

From djkojeti at unity.ncsu.edu Mon Jul 11 11:20:01 2005
From: djkojeti at unity.ncsu.edu (Douglas Kojetin)
Date: Mon Jul 11 11:10:59 2005
Subject: [BioPython] Blast class diagram
Message-ID:

Can someone point me towards a web resource that describes each of the variables in the following figure illustrating the Blast class diagram?

http://www.biopython.org/docs/tutorial/Tutorial004.html#fig:blastrecord

I am specifically interested in the HSP variables.

Many thanks,
Doug

From matomatias at hotmail.com Tue Jul 12 14:52:32 2005
From: matomatias at hotmail.com (Matias Saavedra)
Date: Tue Jul 12 14:43:19 2005
Subject: [BioPython] Problems with MMCIFlex module
Message-ID:

Hi, I'm back. I have been trying to install a C compiler, but I realized that I have GCC 4.0 in my /usr/bin directory... I think that means it is already installed? But when I try to install mxTextTools, the error message says that gcc doesn't respond. Could it be a shell path problem, for example? I really don't know what to do, and at the same time I think I'm so close!

Thanks.

Mathias.

From mdehoon at c2b2.columbia.edu Tue Jul 12 15:19:47 2005
From: mdehoon at c2b2.columbia.edu (Michiel De Hoon)
Date: Tue Jul 12 15:13:01 2005
Subject: [BioPython] Problems with MMCIFlex module
Message-ID: <6CA15ADD82E5724F88CB53D50E61C9AE7AC22B@cgcmail.cgc.cpmc.columbia.edu>

-----Original Message-----
From: biopython-bounces@portal.open-bio.org on behalf of Matias Saavedra
Sent: Tue 7/12/2005 2:52 PM
To: biopython@biopython.org
Subject: [BioPython] Problems with MMCIFlex module

> Hi, i'm back. I have been trying to install a C compiler, but i realized
> that i have GCC 4.0 in my usr/bin directory... i think that means it is
> already installed?

Can you run gcc? What happens if you just run gcc from a command window?

> But when i try to install the mxtexttools, the error message says that gcc
> doesn't respond. It is a shell path problem for example?

Please send the exact error message. Otherwise it is hard to understand what is going on.

--Michiel.

From matomatias at hotmail.com Tue Jul 12 15:38:32 2005
From: matomatias at hotmail.com (Matias Saavedra)
Date: Tue Jul 12 15:32:46 2005
Subject: [BioPython] Problems with MMCIFlex module
Message-ID:

OK. If I type gcc at the command line of my shell, I get the following:

[hoffmann:/usr/bin] mato% gcc
csh: gcc: Command not found.

And if I type gcc-4.0 (which is the file present in my /usr/bin directory):

[hoffmann:/usr/bin] mato% gcc-4.0
powerpc-apple-darwin8-gcc-4.0.0: no input files

I should say that I added the location /usr/bin/gcc-4.0 to the shell path.

I don't have any file called just gcc in the /usr/bin directory.

The gcc package came with the computer, and I tried to install it from a gcc4.0.pkg file (ready to install on Mac systems)...

This is the information I have.

What do you think?

Mathias

From bob.chase at gmail.com Tue Jul 12 20:54:16 2005
From: bob.chase at gmail.com (bob.chase@tympanum.org)
Date: Tue Jul 12 20:45:21 2005
Subject: [BioPython] Problems with MMCIFlex module
In-Reply-To:
References:
Message-ID: <2d8f462605071217542eb98f3d@mail.gmail.com>

I have gcc on my Mac (gcc 3.3). The command is /usr/bin/gcc, which is a symbolic link to /usr/bin/gcc-3.3.

You could try making your own link (ln -s /usr/bin/gcc-4.0 /usr/bin/gcc) and see if you can compile and run a simple "hello world". This would at least verify that you have a reasonable install.

From matomatias at hotmail.com Tue Jul 12 21:49:50 2005
From: matomatias at hotmail.com (Matias Saavedra)
Date: Tue Jul 12 21:41:07 2005
Subject: [BioPython] Problems with MMCIFlex module
Message-ID:

Hi, I created a gcc file and added it to the shell path. When I type gcc, gcc-4.0 is run.

I created a "hello" file and ran "gcc hello". This is the result:

[Macintosh:~] mato% gcc hello
collect2: cannot find `ld'

Does it mean something?
Thanks

From djkojeti at unity.ncsu.edu Tue Jul 12 21:51:32 2005
From: djkojeti at unity.ncsu.edu (Douglas Kojetin)
Date: Tue Jul 12 21:42:35 2005
Subject: [BioPython] Problems with MMCIFlex module
In-Reply-To: <2d8f462605071217542eb98f3d@mail.gmail.com>
References: <2d8f462605071217542eb98f3d@mail.gmail.com>
Message-ID: <30C365BA-B7F3-4284-8006-5B333A9409A0@unity.ncsu.edu>

Do you have /usr/sbin/gcc_select? It is on my 10.4 install, and I believe it is present on 10.3 as well. If you have it, you could try 'gcc_select 3.3'. This creates the symbolic link to 3.3 for me on my system.

% which gcc_select
/usr/sbin/gcc_select
% gcc_select -h
usage: gcc_select [-n] [-force] [2 | 3 | 3.x | 4.x ] [-h | --help] [-v | --version] [-l | --list] [-root]
    2          Select gcc 2.95.2 as the default compiler.
    3          Select gcc 3.1 as the default compiler.
    3.x        Select gcc 3.x as the default compiler.
    4.x        Select gcc 4.x as the default compiler.
    -force     Ensure the links are correct for the specified version even if it maches the current default version.
    -h         Display this help info.
    --help     Same as -h.
    -l         List available compiler versions.
    --list     Same as -l.
    -n         Show commands to do selection but do not execute them.
    -root      Skip 'root' check and assume you have root access.
    -v         Display gcc_select version number.
    --version  Same as -v.
% sudo gcc_select 3.3
Default compiler has been set to:
gcc version 3.3 20030304 (Apple Computer, Inc. build 1809)

Doug

From matomatias at hotmail.com Tue Jul 12 21:55:34 2005
From: matomatias at hotmail.com (Matias Saavedra)
Date: Tue Jul 12 21:46:19 2005
Subject: [BioPython] Problems with MMCIFlex module
Message-ID:

With the new gcc file, which is a copy of the gcc-4.0 file (and was added to the shell path), when I run "python setup.py install" in the "egenix-mx-base-2.0.6" directory (to install mxTextTools), I get the following error message at the end of a long list of errors:

error: command 'gcc' failed with exit status 1

What's happening?

Thanks for your help.

From matomatias at hotmail.com Tue Jul 12 22:01:54 2005
From: matomatias at hotmail.com (Matias Saavedra)
Date: Tue Jul 12 21:52:38 2005
Subject: [BioPython] Problems with MMCIFlex module
Message-ID:

Hi Doug.

I was looking for a gcc_select file in the /usr/bin directory... but I don't have it... and I don't know why!
So I can't do what you are suggesting.

I really don't understand what is happening with my system. I have the impression that the solution will be easy... but where is it?

Mathias.

From djkojeti at unity.ncsu.edu Tue Jul 12 22:13:30 2005
From: djkojeti at unity.ncsu.edu (Douglas Kojetin)
Date: Tue Jul 12 22:04:48 2005
Subject: [BioPython] Problems with MMCIFlex module
In-Reply-To:
References:
Message-ID:

Hi Mathias-

Try the /usr/sbin directory.

Doug

From matomatias at hotmail.com Tue Jul 12 22:19:32 2005
From: matomatias at hotmail.com (Matias Saavedra)
Date: Tue Jul 12 22:10:23 2005
Subject: [BioPython] Problems with MMCIFlex module
Message-ID:

Doug, I was wrong in the previous message: I don't have the gcc_select file in the /usr/sbin directory either. I really don't have it... in fact, when I run "which gcc_select" I get "gcc_select: Command not found."

Mathias

From djkojeti at unity.ncsu.edu Tue Jul 12 22:28:21 2005
From: djkojeti at unity.ncsu.edu (Douglas Kojetin)
Date: Tue Jul 12 22:21:08 2005
Subject: [BioPython] Problems with MMCIFlex module
In-Reply-To:
References:
Message-ID: <69500ABA-EA47-4102-8C63-52437F7A7E84@unity.ncsu.edu>

Ah, sorry! You might try updating to the most recent gcc from http://developer.apple.com/; that may install gcc_select.

Doug

From pwilkinson_m at xbioinformatics.org Wed Jul 13 15:08:06 2005
From: pwilkinson_m at xbioinformatics.org (Peter Wilkinson)
Date: Wed Jul 13 14:58:26 2005
Subject: [BioPython] Unigene Flatfile Parser?
Message-ID: <6.2.1.2.0.20050713150758.02e077a0@pop.videotron.ca>

The parser under /Unigene parses HTML. Is there a UniGene flat-file parser under another directory?

From matomatias at hotmail.com Wed Jul 13 16:35:32 2005
From: matomatias at hotmail.com (Matias Saavedra)
Date: Wed Jul 13 16:26:26 2005
Subject: [BioPython] Problems with MMCIFlex module
Message-ID:

Hi.

Thanks to everybody who helped me with my problem. I finally solved it, in a simple manner. There were some libraries that I didn't have, so I installed the whole XcodeTools package that comes with Mac computers. This installed the missing pieces, and gcc started to work properly.
After that i could install mxTextTools and biopython. Thanks again to everybody. Mathias Saavedra _________________________________________________________________ Express yourself instantly with MSN Messenger! Download today it's FREE! http://messenger.msn.click-url.com/go/onm00200471ave/direct/01/ From maximilianh at gmail.com Thu Jul 14 10:25:22 2005 From: maximilianh at gmail.com (Maximilian Haeussler) Date: Thu Jul 14 10:16:14 2005 Subject: [BioPython] Matrix search? Message-ID: <76f031ae05071407253db3c9ee@mail.gmail.com> Hi, I'm searching for a module that can scan DNA-sequences against weight matrices (PWMs) from e.g. Transfac to find putative transcription factor binding sites. I have searched biopython.org and the mailing list but couldn't find anything appropriate. There is the motif from the AlignAce package, but it's too specified and tailored for AlignACE. Is there really no module in Biopython apart from this? Others: There is internal support in BioPerl, with an external module BioPerl called TFBS, in Biojava with its "Distributions", though they don't seem to support loading/saving matrices, and a standalone program which I found via this mailing list called "tacg". Hum...any other ideas? Standalone programs/external modules for searching transcription factor binding sites that I've missed? Max -- Maximilian Haeussler, tel: +49 170 98 39 098 From jleigh at dal.ca Thu Jul 14 11:09:49 2005 From: jleigh at dal.ca (Jessica Leigh) Date: Thu Jul 14 11:00:29 2005 Subject: [BioPython] Trouble parsing BLAST reports Message-ID: <42D6803D.2020300@dal.ca> Has anyone been having trouble parsing BLAST files? I have a script I was using a couple of months ago to BLAST and parse the results, and it doesn't seem to work now. It uses qblast. Anyway, I don't use biopython very often, so I figured it was just me, and went back to the cookbook to go through the process step by step, and I'm having the same problem. 
The actual BLASTing seems to work fine, but when I get to the: b_record = b_parser.parse(blast_out) part, I get an exception that ends with: SyntaxError: Line does not contain 'Database': [this line is then followed with a blank line] I've taken a look at the blast_out stream, and it looks good: it's got lots of results in it, it's not the "This page will not refresh automatically..." page. I'm guessing that NCBI has changed the output format just enough to break the parser. Does anyone else get this? Am I going to have to resort to writing my own parser? Jessica From bartek at rezolwenta.eu.org Thu Jul 14 11:09:17 2005 From: bartek at rezolwenta.eu.org (bartek wilczynski) Date: Thu Jul 14 11:06:28 2005 Subject: [BioPython] Matrix search? In-Reply-To: <76f031ae05071407253db3c9ee@mail.gmail.com> References: <76f031ae05071407253db3c9ee@mail.gmail.com> Message-ID: <1121353757.42d6801dc64fb@imp.rezolwenta.eu.org> Maximilian Haeussler wrote: > Hi, > > I'm searching for a module that can scan DNA-sequences against weight > matrices (PWMs) from e.g. Transfac to find putative transcription > factor binding sites. I have searched biopython.org and the mailing > list but couldn't find anything appropriate. > > There is the motif from the AlignAce package, but it's too specified > and tailored for AlignACE. Is there really no module in Biopython > apart from this? > There is also module called MEME. They are a little bit redundant and both provide a class called Motif. That's more or less something you are looking for. 
You may try to do something like this:

import Bio.AlignAce.Motif as motif
from Bio.Seq import Seq
from Bio.Alphabet import IUPAC

m = motif.Motif()
a = IUPAC.unambiguous_dna
m.add_instance(Seq("ATATAT",a))
m.add_instance(Seq("ATATTT",a))
m.set_mask("******")
print m.__str__()

t = Seq("ATTATTATTATTATTATATATTT",a)
for o in m.search_instances(t): # search for exact matches
    print o
for o in m.search_pwm(t): # scan the whole sequence and score all positions
    print o
for o in m.search_pwm(t,0.5): # select only hits with score above 0.5
    print o

> Others: There is internal support in BioPerl, with an external module
> BioPerl called TFBS, in Biojava with its "Distributions", though they
> don't seem to support loading/saving matrices, and a standalone
> program which I found via this mailing list called "tacg".

TFBS is quite a large program that is also developed and distributed separately. It's written in Perl, so there is no quick way of incorporating it into BioPython.

>
> Hum...any other ideas? Standalone programs/external modules for
> searching transcription factor binding sites that I've missed?
>

The thing is that if you want to scan sequences for motifs, you had better know what you are doing. There is no single scoring function and there are always special cases. The code from AlignAce is very simplistic, but you should have no problems extending it.

If you want to scan for Transfac matrices, you can check out the alibaba website: http://www.alibaba2.com

Speaking of AlignAce, they are also distributing something called ScanAce - a small, quick program written in C.

--
regards
Bartek Wilczynski
--
For every complex problem there is an answer that is clear, simple, and wrong. H. L.
Mencken From loraine at loraine.net Fri Jul 15 08:51:44 2005 From: loraine at loraine.net (Ann Loraine) Date: Fri Jul 15 08:42:32 2005 Subject: [BioPython] question regarding writing SeqRecord objects in Fasta format Message-ID: Hi, I'm trying to create a new fasta file from a larger one by selecting out records that contain specific ids. Reading the records worked fine - I just followed the directions in the tutorial. But now I want to write them out in fasta format to a file handle. I tried using the SeqIO.FASTA.FastaWriter class to do this, but got this error: >>> writer = Bio.SeqIO.FASTA.FastaWriter('test.fa') >>> writer.write(cur_record) Traceback (most recent call last): File "", line 1, in ? File "/usr/local/lib/python2.4/site-packages/Bio/SeqIO/FASTA.py", line 67, in write id = record.id AttributeError: Record instance has no attribute 'id' Looks like FastaWriter expects a different type of object. Printing the record works fine, however: >>> print cur_record >consensus:Rat230_2:1367552_at; gb|M25590; gb:M25590.1 /DB_XREF=gi:202756 /FEA=FLmRNA /CNT=455 /TID=Rn.9942.1 /TIER=FL+Stack /STK=452 /UG=Rn.9942 /LL=24802 /UG_GENE=Svp4 /UG_TITLE=Seminal vesicle protein 4 /DEF=Rat androgen-dependent protein mRNA, complete cds. 
/FL=gb:NM_012662.1 gb:M25590.1 atataaactaagaactcagctcagccttcagtcaagagcttttctggcaagatgaagtct accagcttgttcctctgttctctgctcctccttctagtgacaggagccattgggagaaaa acnaaggaaaaatactcacagtcggaagaagttgtcagtgagagctttgcctcgggccct tcctcgggttcttctgatgatgaattagtgagagacaagccatatggccccaaagtctcg ggcggctcctttggtgaggaagcttctgaggagataagtagcagaaggagcaagcacatc tctaggagttccggtggctccaacatggaaggtgagagctcgtatgccaagaaaaagagg agccggtttgcccaagacgtactcaactgatagtgcatcgggcagctgaacatcttggac caatatgccggagccacattgcctggatgaagcctgtgatgtcttcagcatgcagctccc natgtggtctcagaggcagtccctggatggcatttccttctcatgcttgtttgtcttgag gttcttaaacctaacattcaggaactttctgtccaataaagagataacaatctgcatcnt taaaaaaaaaaaaaaaaaaaaaaaaaannnnnnnnnn But I don't want to print the record to stdout -- I want to write it to a filehandle. I'd like the record to be formatted nicely - same number of characters per line in the sequence part - but I can't figure out how to do it. Is there another 'writer' type object I could use that would accept a SeqRecord? -Ann From cgw501 at york.ac.uk Fri Jul 15 12:16:47 2005 From: cgw501 at york.ac.uk (cgw501@york.ac.uk) Date: Fri Jul 15 12:07:45 2005 Subject: [BioPython] Bio.Nexus Message-ID: Hi everyone, I am using the nexus format to store and process a bunch of mixed data and want to use python to do it. I can't find the Bio.Nexus API documentation, and I know this was a new development for 1.40b, but confusingly I can import the module fine from the interpreter. Anyone have any ideas where I might find some info/howto on the methods availible in this module without going back to the source? Thanks, Chris Williams From fkauff at duke.edu Fri Jul 15 12:31:05 2005 From: fkauff at duke.edu (Frank Kauff) Date: Fri Jul 15 12:21:47 2005 Subject: [BioPython] Bio.Nexus In-Reply-To: References: Message-ID: <1121445065.4583.13.camel@osiris.biology.duke.edu> Hi Chris, unfortunately there's no documentation yet - my bad. 
But in short, read your data like this:

from Bio.Nexus import Nexus
nex=Nexus.Nexus('my_file.nex')

and then access the data like this:

nex.taxlabels
nex.nchar
nex.charsets
nex.trees

etc. I hope most of the methods have a descriptive title and are easy to use. Let me know if I can help further. And I promise to write some documentation, but it won't be before the end of August.

Cheers,
Frank

On Fri, 2005-07-15 at 17:16 +0100, cgw501@york.ac.uk wrote:
> Hi everyone,
>
> I am using the nexus format to store and process a bunch of mixed data and
> want to use python to do it. I can't find the Bio.Nexus API documentation,
> and I know this was a new development for 1.40b, but confusingly I can
> import the module fine from the interpreter. Anyone have any ideas where I
> might find some info/howto on the methods available in this module without
> going back to the source?
>
> Thanks,
>
> Chris Williams

--
Frank Kauff
Dept. of Biology
Duke University
Box 90338
Durham, NC 27708 USA
Phone 919-660-7382
Fax 919-660-7293
Web http://www.lutzonilab.net/members/page225.shtml

From idoerg at burnham.org Fri Jul 15 13:30:18 2005
From: idoerg at burnham.org (Iddo Friedberg)
Date: Fri Jul 15 13:21:52 2005
Subject: [BioPython] question regarding writing SeqRecord objects in Fasta format
In-Reply-To: References: Message-ID: <42D7F2AA.5080608@burnham.org>

Hi Ann,

How did you create the SeqRecord instances in the first place? Did you read in using SeqIO.Fasta.FastaReader, or something else?

Your SeqRecord instance lacks an id attribute, which should have been read in, or inserted some other way.

Let me know.

Cheers,

Iddo

Ann Loraine wrote:

> Hi,
>
> I'm trying to create a new fasta file from a larger one by selecting
> out records that contain specific ids.
>
> Reading the records worked fine - I just followed the directions in
> the tutorial.
> > But now I want to write them out in fasta format to a file handle. > > I tried using the SeqIO.FASTA.FastaWriter class to do this, but got > this error: > > >>> writer = Bio.SeqIO.FASTA.FastaWriter('test.fa') > >>> writer.write(cur_record) > Traceback (most recent call last): > File "", line 1, in ? > File "/usr/local/lib/python2.4/site-packages/Bio/SeqIO/FASTA.py", > line 67, in write > id = record.id > AttributeError: Record instance has no attribute 'id' > > Looks like FastaWriter expects a different type of object. > > Printing the record works fine, however: > > >>> print cur_record > >consensus:Rat230_2:1367552_at; gb|M25590; gb:M25590.1 > /DB_XREF=gi:202756 /FEA=FLmRNA /CNT=455 /TID=Rn.9942.1 /TIER=FL+Stack > /STK=452 /UG=Rn.9942 /LL=24802 /UG_GENE=Svp4 /UG_TITLE=Seminal vesicle > protein 4 /DEF=Rat androgen-dependent protein mRNA, complete cds. > /FL=gb:NM_012662.1 gb:M25590.1 > atataaactaagaactcagctcagccttcagtcaagagcttttctggcaagatgaagtct > accagcttgttcctctgttctctgctcctccttctagtgacaggagccattgggagaaaa > acnaaggaaaaatactcacagtcggaagaagttgtcagtgagagctttgcctcgggccct > tcctcgggttcttctgatgatgaattagtgagagacaagccatatggccccaaagtctcg > ggcggctcctttggtgaggaagcttctgaggagataagtagcagaaggagcaagcacatc > tctaggagttccggtggctccaacatggaaggtgagagctcgtatgccaagaaaaagagg > agccggtttgcccaagacgtactcaactgatagtgcatcgggcagctgaacatcttggac > caatatgccggagccacattgcctggatgaagcctgtgatgtcttcagcatgcagctccc > natgtggtctcagaggcagtccctggatggcatttccttctcatgcttgtttgtcttgag > gttcttaaacctaacattcaggaactttctgtccaataaagagataacaatctgcatcnt > taaaaaaaaaaaaaaaaaaaaaaaaaannnnnnnnnn > > > But I don't want to print the record to stdout -- I want to write it > to a filehandle. I'd like the record to be formatted nicely - same > number of characters per line in the sequence part - but I can't > figure out how to do it. > > Is there another 'writer' type object I could use that would accept a > SeqRecord? 
> > -Ann > > > _______________________________________________ > BioPython mailing list - BioPython@biopython.org > http://biopython.org/mailman/listinfo/biopython > > -- Iddo Friedberg, Ph.D. The Burnham Institute 10901 N. Torrey Pines Rd. La Jolla, CA 92037 USA Tel: +1 (858) 646 3100 x3516 Fax: +1 (858) 713 9949 http://ffas.ljcrf.edu/~iddo From christopher.e.hart at gmail.com Mon Jul 18 00:15:08 2005 From: christopher.e.hart at gmail.com (Christopher Hart) Date: Mon Jul 18 00:06:06 2005 Subject: [BioPython] Problems with BioSQL/postgres/Genbank In-Reply-To: References: Message-ID: <478fc3ec05071721156740bd7d@mail.gmail.com> Hello, when following this tutorial: http://www.biopython.org/docs/biosql/python_biosql_basic.html I'm getting this error: IntegrityError: ERROR: insert or update on table "bioentry" violates foreign key constraint "fktaxon_bioentry" On closer inspection within the pdb, the taxon_id is set to 0, regardless of what it should be from the genbank file. I've tried loading several genbank files, and the taxon_id is always set to 0. Although I can circumvent this problem by adding an taxon_id 0 to the taxon table within the biosqldb, this is obviously a suboptimal solution. Any suggestions? Specifics of my setup are as follows: I was trying to try out the BioSQL module to load some genbank sequences in. I've installed postgres 8.0, and the latest CVS checkouts of both biopython and biosql. I'm attaching to postgres through psycopg. 
And here is the full traceback: ------Begin paste---------- In [8]:db.load(iterator) --------------------------------------------------------------------------- psycopg.IntegrityError Traceback (most recent call last) /home/hart/ /home/hart/python-home-packages/lib/python/BioSQL/BioSeqDatabase.py in load(self, record_iterator) 412 break 413 num_records += 1 --> 414 db_loader.load_seqrecord(cur_record) 415 416 return num_records /home/hart/python-home-packages/lib/python/BioSQL/Loader.py in load_seqrecord(self, record) 35 """Load a Biopython SeqRecord into the database. 36 """ ---> 37 bioentry_id = self._load_bioentry_table(record) 38 self._load_bioentry_date(record, bioentry_id) 39 self._load_biosequence(record, bioentry_id) /home/hart/python-home-packages/lib/python/BioSQL/Loader.py in _load_bioentry_table(self, record) 249 %s, 250 %s)""" --> 251 self.adaptor.execute(sql, (self.dbid, 252 taxon_id, 253 record.name, /home/hart/python-home-packages/lib/python/BioSQL/BioSeqDatabase.py in execute(self, sql, args) 275 """Just execute an sql command. 276 """ --> 277 self.cursor.execute(sql, args or ()) 278 279 def get_subseq_as_string(self, seqid, start, end): IntegrityError: ERROR: insert or update on table "bioentry" violates foreign key constraint "fktaxon_bioentry" DETAIL: Key (taxon_id)=(0) is not present in table "taxon". 
INSERT INTO bioentry ( biodatabase_id, taxon_id, name, accession, identifier, division, description, version) VALUES ( 7, '0', 'ATCOR66M', 'X55053', '16229', 'PLN', 'A.thaliana cor6.6 mRNA.', 1) > /home/hart/python-home-packages/lib/python/BioSQL/BioSeqDatabase.py(277)execute() -> self.cursor.execute(sql, args or ()) ---end paste---- Thanks, Chris From loraine at loraine.net Mon Jul 18 20:04:55 2005 From: loraine at loraine.net (Ann Loraine) Date: Mon Jul 18 20:09:14 2005 Subject: [BioPython] re: question regarding writing SeqRecord objects in Fasta format Message-ID: <3e5a25d017bef800b0b90f5d860ac935@loraine.net> Hello, To answer your question - I read in the fasta records like so: from Bio import Fasta fh = gzip.Gzipfile('seqs.fa.gz').open() parser = Fasta.RecordParser() iterator = Fasta.Iterator(fh,parser) curr_record = iterator.next() I was following the example in this tutorial Web page: http://www.biopython.org/docs/tutorial/Tutorial003.html#toc7 "Let's make all of this talk more concrete by using the Iterator and Record interfaces to do what we did before -- extract a unique list of all species in our FASTA file. First we need to set up our parser and iterator: >>> from Bio import Fasta >>> parser = Fasta.RecordParser() >>> file = open("ls_orchid.fasta") >>> iterator = Fasta.Iterator(file, parser)" Should I be using the SeqIO method instead to read fasta records if I want to write some of them out to a fasta format file? -Ann -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: text/enriched Size: 1032 bytes Desc: not available Url : http://portal.open-bio.org/pipermail/biopython/attachments/20050718/57e79660/attachment.bin From idoerg at burnham.org Mon Jul 18 20:21:04 2005 From: idoerg at burnham.org (Iddo Friedberg) Date: Mon Jul 18 20:11:48 2005 Subject: [BioPython] Re: question regarding writing SeqRecord objects in Fasta format In-Reply-To: <3e5a25d017bef800b0b90f5d860ac935@loraine.net> References: <3e5a25d017bef800b0b90f5d860ac935@loraine.net> Message-ID: <42DC4770.60904@burnham.org> Ann Loraine wrote: > Hello, > > To answer your question - I read in the fasta records like so: > > from Bio import Fasta > fh = gzip.Gzipfile('seqs.fa.gz').open() > parser = Fasta.RecordParser() > iterator = Fasta.Iterator(fh,parser) > curr_record = iterator.next() > > I was following the example in this tutorial Web page: > > http://www.biopython.org/docs/tutorial/Tutorial003.html#toc7 > > > "Let's make all of this talk more concrete by using the Iterator and > Record interfaces to do what we did before -- extract a unique list of > all species in our FASTA file. First we need to set up our parser and > iterator: > >>> from Bio import Fasta > >>> parser = Fasta.RecordParser() > >>> file = open("ls_orchid.fasta") > >>> iterator = Fasta.Iterator(file, parser)" > > Should I be using the SeqIO method instead to read fasta records if I > want to write some of them out to a fasta format file? > > -Ann > > Yes, if you want to use SeqIO for output, use it for input as well. When reading using Bio.Fasta.Iterator, you are creating Bio.Fasta.Record instances, which do not have the 'id' attribute. When reading using Bio.SeqIO.FastaReader, you are creating a Bio.SeqRecord instance, which is a different representation of a sequence. But Bio.SeqIO.FASTA does have a writing method, so you may want to use that. 
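Whichever reader is used, the fixed-width layout asked for above can also be produced by hand. What follows is only a minimal sketch in plain Python 3, assuming the title and sequence are available as strings (for a Bio.Fasta.Record these would presumably be its title and sequence attributes). The name write_fasta_record is a hypothetical helper, not a Biopython function.

```python
import io


def write_fasta_record(handle, title, sequence, width=60):
    """Write one FASTA record to an open handle, wrapping the sequence.

    Hypothetical helper for illustration; not part of Biopython.
    """
    handle.write(">%s\n" % title)
    # Emit the sequence in chunks of at most `width` characters per line.
    for start in range(0, len(sequence), width):
        handle.write(sequence[start:start + width] + "\n")


# Usage with an in-memory handle; any object with a write() method works,
# for example open("test.fa", "w").
out = io.StringIO()
write_fasta_record(out, "example_id some description", "acgt" * 20, width=30)
print(out.getvalue())
```

Passing an open file handle rather than a file name keeps the helper usable for gzip streams or in-memory buffers as well.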
The reason that Biopython has two ways of representing sequences are basically historical: both methods were CVS deposited, approved, and code grew around both. Not exactly optimal I know. HTH< ./I -- Iddo Friedberg, Ph.D. The Burnham Institute 10901 N. Torrey Pines Rd. La Jolla, CA 92037 USA Tel: +1 (858) 646 3100 x3516 Fax: +1 (858) 713 9949 http://ffas.ljcrf.edu/~iddo From aurelie.bornot at free.fr Tue Jul 19 09:08:20 2005 From: aurelie.bornot at free.fr (aurelie.bornot@free.fr) Date: Tue Jul 19 08:58:57 2005 Subject: [BioPython] Changes in NCBI BLAST output format !!?? Message-ID: <1121778500.42dcfb449be14@imp3-q.free.fr> Hi ! I've got the same problem as Jessica Leigh (in the Discussion List) : When I try to parse a BLAST file with a script that worked until the beginning of July, I get this syntax error : Line does not contain 'Database': (Blank line) It seem that the NCBI has made changes : -"Old" blast file :

Query= sequence (569 letters)

Database: All GenBank+EMBL+DDBJ+PDB sequences (but no EST, STS, GSS,environmental samples or phase 0, 1 or 2 HTGS sequences) 3,047,402 sequences; 13,743,552,639 total letters

If you have any problems or questions with the results... -New Blast file : Query= sequence (540 letters) Database: All GenBank+EMBL+DDBJ+PDB sequences (but no EST, STS, GSS,environmental samples or phase 0, 1 or 2 HTGS sequences) 3,312,348 sequences; 14,588,094,788 total letters

If you have any problems or questions with the... The

before Query and Database are missing !!! And the fact is that in Python24\Lib\site-packages\Bio\Blast\NCBIWWW.py, it seems that the code to find "Database" uses the

: def _scan_database_info(self, uhandle, consumer): attempt_read_and_call(uhandle, consumer.noevent, start='

') read_and_call(uhandle, consumer.database_info, contains='Database') .... I'm not sure to have a good understanding of what happens... But could someone help... I don't know what to do. Is it possible to correct the problem easily ? Thanks a lot !! Aurelie -------------- Aurelie BORNOT MNHN Paris From jleigh at dal.ca Tue Jul 19 10:00:42 2005 From: jleigh at dal.ca (Jessica Leigh) Date: Tue Jul 19 09:51:06 2005 Subject: [BioPython] Changes in NCBI BLAST output format !!?? In-Reply-To: <1121778500.42dcfb449be14@imp3-q.free.fr> References: <1121778500.42dcfb449be14@imp3-q.free.fr> Message-ID: <42DD078A.9030401@dal.ca> Hi Aurelie, I ended up just writing my own parser... it wasn't as hard as I thought it would be. The BLAST output is pretty straightforward. What I wanted to do with BLAST was pretty simple, so I don't know if this will help you or not. I wanted to get UIDs for the top hits, then retrieve the sequences. I used the following regular espression to get the information from the top of the BLAST report (the part with the links to lower in the page: alignre = re.compile(r' *(\d+) *(\d*e\-\d+|\d+.\d+)') This regular expression contains 3 groups: the UID, score, and expect value, so I used the RE with: uid, score, expect = alignre.search(line).groups() I used a bit of other code to make sure that the line I'm looking at ('line') contains these items. It's kind of dirty, but it worked for me. Hopefully this will give you ideas as to what you can do to extract the information you need from the BLAST report. Jessica aurelie.bornot@free.fr wrote: >Hi ! > >I've got the same problem as Jessica Leigh (in the Discussion List) : >When I try to parse a BLAST file with a script that worked until the beginning >of July, I get this syntax error : > >Line does not contain 'Database': >(Blank line) > >It seem that the NCBI has made changes : > >-"Old" blast file : >

>Query= sequence > (569 letters) > >

>Database: All GenBank+EMBL+DDBJ+PDB sequences (but no EST, STS, >GSS,environmental samples or phase 0, 1 or 2 HTGS sequences) > 3,047,402 sequences; 13,743,552,639 total letters > >

If you have any problems or questions with the results... > >-New Blast file : > >Query= sequence > (540 letters) > > >Database: All GenBank+EMBL+DDBJ+PDB sequences (but no EST, STS, >GSS,environmental samples or phase 0, 1 or 2 HTGS sequences) > 3,312,348 sequences; 14,588,094,788 total letters > >

If you have any problems or questions with the... > > >The

before Query and Database are missing !!! >And the fact is that in Python24\Lib\site-packages\Bio\Blast\NCBIWWW.py, it >seems that the code to find "Database" uses the

: > >def _scan_database_info(self, uhandle, consumer): > attempt_read_and_call(uhandle, consumer.noevent, start='

') > read_and_call(uhandle, consumer.database_info, contains='Database') > .... > > >I'm not sure to have a good understanding of what happens... >But could someone help... >I don't know what to do. Is it possible to correct the problem easily ? > >Thanks a lot !! >Aurelie > >-------------- >Aurelie BORNOT >MNHN >Paris >_______________________________________________ >BioPython mailing list - BioPython@biopython.org >http://biopython.org/mailman/listinfo/biopython > > From aurelie.bornot at free.fr Tue Jul 19 11:21:47 2005 From: aurelie.bornot at free.fr (aurelie.bornot@free.fr) Date: Tue Jul 19 11:16:51 2005 Subject: [BioPython] Changes in NCBI BLAST output format !!?? Message-ID: <1121786507.42dd1a8b9dee5@imp3-q.free.fr> Thank you very much Jessica !!! Unfortunately, I need a lot of thing in the BLAST reports..... It will be difficult to do the same thing as you did.... I will try to do something in the code of parser of Python. But it will be difficult for me.. so if you or someone has advices !!! Thanks a lot again for your answer Jessica ! Aur?lie -------------- Aurelie BORNOT MNHN Paris From saccenti at cerm.unifi.it Tue Jul 19 13:43:30 2005 From: saccenti at cerm.unifi.it (Edoardo Saccenti) Date: Tue Jul 19 13:34:00 2005 Subject: [BioPython] Changes in NCBI BLAST output format !!?? In-Reply-To: <1121786507.42dd1a8b9dee5@imp3-q.free.fr> References: <1121786507.42dd1a8b9dee5@imp3-q.free.fr> Message-ID: <200507191943.30115.saccenti@cerm.unifi.it> On Tuesday 19 July 2005 17:21, aurelie.bornot@free.fr wrote: > Thank you very much Jessica !!! > I also have problem...so I wrote my own parser to parse plain text output file. 
Can extract lot of info but you have to blast manually Any case it works fine for me > > -------------- > Aurelie BORNOT > MNHN > Paris > > > _______________________________________________ > BioPython mailing list - BioPython@biopython.org > http://biopython.org/mailman/listinfo/biopython -- "Raffiniert ist der Herr Gott, aber boshaft ist Er nicht." --- Dr. Edoardo Saccenti CERM Nuclear Magnetic Resonance Research Center Scientific Pole - University of Florence Via Luigi Sacconi n? 6 50019 Sesto Fiorentino (FI) tel: +39 055 4574193 fax: +39 055 4574253 saccenti@cerm.unifi.it www.cerm.unifi.it From cgw501 at york.ac.uk Thu Jul 21 12:37:42 2005 From: cgw501 at york.ac.uk (cgw501@york.ac.uk) Date: Thu Jul 21 13:06:24 2005 Subject: [BioPython] SeqUtils.antiparallel function Message-ID: Hi, I am parsing whole-chromosome genbank files, and need to get the reverse complement of some of the sequence. Here's what I get trying things out in the interpreter. >>> x= 'atattatatat' >>> from Bio.Seq import Seq >>> a = Seq(x) >>> from Bio.SeqUtils import antiparallel >>> b = antiparallel(a) Traceback (most recent call last): File "", line 1, in ? b = antiparallel(a) File "C:\Python24\Lib\site-packages\Bio\SeqUtils\__init__.py", line 49, in antiparallel s = complement(seq) File "C:\Python24\Lib\site-packages\Bio\SeqUtils\__init__.py", line 38, in complement return seq.translate(_ttable) AttributeError: Seq instance has no attribute 'translate' Not sure what I'm doing wrong. Thanks, Chris From mdehoon at c2b2.columbia.edu Thu Jul 21 14:23:35 2005 From: mdehoon at c2b2.columbia.edu (Michiel De Hoon) Date: Thu Jul 21 14:16:05 2005 Subject: [BioPython] SeqUtils.antiparallel function Message-ID: <6CA15ADD82E5724F88CB53D50E61C9AE7AC258@cgcmail.cgc.cpmc.columbia.edu> The antiparallel function in SeqUtils has been deprecated. 
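For context, the traceback comes from complement calling seq.translate(_ttable), which is a string method that the Seq class in this release does not provide. On plain strings the same idea works directly. A minimal sketch in plain Python 3, where reverse_complement_str is a hypothetical helper name and not part of Biopython:

```python
# Translation table mapping each unambiguous DNA base to its complement
# (both cases); other characters such as 'n' pass through unchanged.
_COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement_str(dna):
    """Return the reverse complement of a DNA string (illustrative helper)."""
    # Complement every base, then reverse the result.
    return dna.translate(_COMPLEMENT)[::-1]


print(reverse_complement_str("atattatatat"))
```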
Please use the function reverse_complement in Bio.Seq instead:

>>> from Bio.Seq import *
>>> x = 'atattatatat'
>>> reverse_complement(x)
'atatataatat'

This function is currently only in CVS, and is not included in Biopython 1.40b. You would need to download Seq.py from CVS and copy it to c:\Python24\Lib\site-packages\Bio (overwriting the Seq.py that lives there).

If you want to use a Seq object instead of a string, use the reverse_complement method on the Seq object:

>>> from Bio.Seq import *
>>> s = Seq('actactacta')
>>> s.reverse_complement()
Seq('tagtagtagt', Alphabet())
>>>

This function *is* in Biopython 1.40b, so you wouldn't need to install anything.

--Michiel.

Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032

-----Original Message-----
From: biopython-bounces@portal.open-bio.org on behalf of cgw501@york.ac.uk
Sent: Thu 7/21/2005 12:37 PM
To: biopython@biopython.org
Subject: [BioPython] SeqUtils.antiparallel function

Hi,

I am parsing whole-chromosome genbank files, and need to get the reverse complement of some of the sequence. Here's what I get trying things out in the interpreter.

>>> x= 'atattatatat'
>>> from Bio.Seq import Seq
>>> a = Seq(x)
>>> from Bio.SeqUtils import antiparallel
>>> b = antiparallel(a)
Traceback (most recent call last):
  File "", line 1, in ?
    b = antiparallel(a)
  File "C:\Python24\Lib\site-packages\Bio\SeqUtils\__init__.py", line 49, in antiparallel
    s = complement(seq)
  File "C:\Python24\Lib\site-packages\Bio\SeqUtils\__init__.py", line 38, in complement
    return seq.translate(_ttable)
AttributeError: Seq instance has no attribute 'translate'

Not sure what I'm doing wrong.
Thanks, Chris _______________________________________________ BioPython mailing list - BioPython@biopython.org http://biopython.org/mailman/listinfo/biopython From loraine at loraine.net Fri Jul 22 19:45:31 2005 From: loraine at loraine.net (Ann Loraine) Date: Fri Jul 22 19:35:58 2005 Subject: [BioPython] question regarding writing SeqRecord objects in Fasta format In-Reply-To: <42D7F2AA.5080608@burnham.org> References: <42D7F2AA.5080608@burnham.org> Message-ID: <97638e3a7812724ba0fcff2103c89fe4@loraine.net> Hi, I made the SeqRecord following an example in one of the tutorials: from: http://www.biopython.org/docs/tutorial/Tutorial003.html#toc7: >>> from Bio import Fasta >>> parser = Fasta.RecordParser() >>> file = open("ls_orchid.fasta") >>> iterator = Fasta.Iterator(file, parser) To get the first record in the file, I did this: >>> cur_record = iterator.next() Probably I should use the SeqIO.Fasta.FastaReader for my application - which is reading in sequences and then selecting a subset of records to write to an output file. -Ann On Jul 15, 2005, at 12:30 PM, Iddo Friedberg wrote: > Hi Ann, > > How did you create the SeqRecord instances in the first place? Did you > read in using SeqIO.Fasta.FastaReader, or something else? > > Your SeqRecord instance lacks an id attribute, which should have been > read in, or inserted some other way. > > Let me know. > > Cheers, > > Iddo > > > > > > Ann Loraine wrote: > >> Hi, >> >> I'm trying to create a new fasta file from a larger one by selecting >> out records that contain specific ids. >> >> Reading the records worked fine - I just followed the directions in >> the tutorial. >> >> But now I want to write them out in fasta format to a file handle. >> >> I tried using the SeqIO.FASTA.FastaWriter class to do this, but got >> this error: >> >> >>> writer = Bio.SeqIO.FASTA.FastaWriter('test.fa') >> >>> writer.write(cur_record) >> Traceback (most recent call last): >> File "", line 1, in ? 
>> File "/usr/local/lib/python2.4/site-packages/Bio/SeqIO/FASTA.py", >> line 67, in write >> id = record.id >> AttributeError: Record instance has no attribute 'id' >> >> Looks like FastaWriter expects a different type of object. >> >> Printing the record works fine, however: >> >> >>> print cur_record >> >consensus:Rat230_2:1367552_at; gb|M25590; gb:M25590.1 >> /DB_XREF=gi:202756 /FEA=FLmRNA /CNT=455 /TID=Rn.9942.1 /TIER=FL+Stack >> /STK=452 /UG=Rn.9942 /LL=24802 /UG_GENE=Svp4 /UG_TITLE=Seminal >> vesicle protein 4 /DEF=Rat androgen-dependent protein mRNA, complete >> cds. /FL=gb:NM_012662.1 gb:M25590.1 >> atataaactaagaactcagctcagccttcagtcaagagcttttctggcaagatgaagtct >> accagcttgttcctctgttctctgctcctccttctagtgacaggagccattgggagaaaa >> acnaaggaaaaatactcacagtcggaagaagttgtcagtgagagctttgcctcgggccct >> tcctcgggttcttctgatgatgaattagtgagagacaagccatatggccccaaagtctcg >> ggcggctcctttggtgaggaagcttctgaggagataagtagcagaaggagcaagcacatc >> tctaggagttccggtggctccaacatggaaggtgagagctcgtatgccaagaaaaagagg >> agccggtttgcccaagacgtactcaactgatagtgcatcgggcagctgaacatcttggac >> caatatgccggagccacattgcctggatgaagcctgtgatgtcttcagcatgcagctccc >> natgtggtctcagaggcagtccctggatggcatttccttctcatgcttgtttgtcttgag >> gttcttaaacctaacattcaggaactttctgtccaataaagagataacaatctgcatcnt >> taaaaaaaaaaaaaaaaaaaaaaaaaannnnnnnnnn >> >> >> But I don't want to print the record to stdout -- I want to write it >> to a filehandle. I'd like the record to be formatted nicely - same >> number of characters per line in the sequence part - but I can't >> figure out how to do it. >> >> Is there another 'writer' type object I could use that would accept a >> SeqRecord? >> >> -Ann >> >> >> _______________________________________________ >> BioPython mailing list - BioPython@biopython.org >> http://biopython.org/mailman/listinfo/biopython >> >> > > > -- > Iddo Friedberg, Ph.D. > The Burnham Institute > 10901 N. Torrey Pines Rd. 
> La Jolla, CA 92037 USA
> Tel: +1 (858) 646 3100 x3516
> Fax: +1 (858) 713 9949
> http://ffas.ljcrf.edu/~iddo
>
>

From rcsqtc at iiqab.csic.es Fri Jul 29 11:59:56 2005
From: rcsqtc at iiqab.csic.es (Ramon Crehuet)
Date: Fri Jul 29 11:50:56 2005
Subject: [BioPython] add models to a structure
Message-ID: <42EA527C.7070401@iiqab.csic.es>

Hi all,

I would like to convert two structures, created from different PDB files through the PDBParser, into two models of a new structure. Or alternatively, add one of the structures as another model (a new child) of the first structure instance.

With s1 and s2 being structures, I tried:

s1.add(s2[0])

but this doesn't work, probably because both model ids are the same (=0). How can this model id be changed?

The question may be very naive, but it is more general. How can I convert a group of chains into a new model or a new structure, or vice versa?

Thanks in advance.

Cheers,

Ramon

From thamelry at binf.ku.dk Sat Jul 30 09:08:07 2005
From: thamelry at binf.ku.dk (thamelry@binf.ku.dk)
Date: Sat Jul 30 10:52:05 2005
Subject: [BioPython] add models to a structure
In-Reply-To: <42EA527C.7070401@iiqab.csic.es>
References: <42EA527C.7070401@iiqab.csic.es>
Message-ID: <32989.83.92.3.59.1122728887.squirrel@www.binf.ku.dk>

Hi Ramon,

> I would like to convert two structures, created from different PDB
> files through the PDBParser, into two models of a new structure. Or
> alternatively, add one of the structures as another model (a new child)
> of the first structure instance.
> With s1 and s2 being structures, I tried:
> s1.add(s2[0])
> but this doesn't work, probably because both model ids are the same
> (=0). How can this model id be changed?

Try model.id=1 for example, before adding the model.

> The question may be very naive, but it is more general. How can I
> convert a group of chains into a new model or a new structure, or
> vice versa?
Well, you can for example do:

s=Structure("Test")
m=Model(0)
s.add(m)
m.add(chain1)
m.add(chain2)

Best regards,

-Thomas
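
Putting both suggestions together, here is a minimal sketch. It assumes a reasonably recent Biopython in which detach_child is available on entities and the id attribute can be reassigned. The structures are empty placeholders built by the hypothetical helper make_structure rather than parsed from real PDB files.

```python
from Bio.PDB.Structure import Structure
from Bio.PDB.Model import Model
from Bio.PDB.Chain import Chain


def make_structure(name):
    """Build a placeholder structure with one model (id 0) and one empty chain.

    Hypothetical helper standing in for a structure read by a PDB parser.
    """
    structure = Structure(name)
    model = Model(0)
    structure.add(model)
    model.add(Chain("A"))
    return structure


s1 = make_structure("first")
s2 = make_structure("second")

# Detach the model from s2, give it an id that is unused in s1, then add it.
model = s2[0]
s2.detach_child(model.id)
model.id = 1
s1.add(model)

print([m.id for m in s1])
```

Detaching first keeps s2 from still listing a model that now belongs to s1; whether that matters depends on whether s2 is used afterwards.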