From mdehoon at ims.u-tokyo.ac.jp  Mon Jul  5 00:40:05 2004
From: mdehoon at ims.u-tokyo.ac.jp (Michiel Jan Laurens de Hoon)
Date: Sat Mar  5 14:43:35 2005
Subject: [Biopython-dev] Bio.Seq and alphabets
Message-ID: <40E8DBA5.6000102@ims.u-tokyo.ac.jp>

I've been working on a complement() and reverse_complement() function for 
Bio.Seq's Seq and MutableSeq classes. Previously, similar functions existed in 
various places in Biopython. I am not sure though how to deal with the alphabet 
associated with a Seq or MutableSeq object. For example, a Seq can be created 
where the sequence is inconsistent with the alphabet:

 >>> from Bio.Alphabet import IUPAC
 >>> from Bio.Seq import Seq
 >>> Seq('GATCGACXYSMDG_or_any_funny_char_u_like_eg_*&$%', IUPAC.unambiguous_dna)
Seq('GATCGACXYSMDG_or_any_funny_char_u_like_eg_*&$%', IUPACUnambiguousDNA())

With a MutableSeq, one can change the sequence regardless of the alphabet:
 >>> from Bio.Seq import MutableSeq
 >>> s = MutableSeq('ACTGCCATCGT', IUPAC.unambiguous_dna)
 >>> s[9] = 'X'
 >>> s
MutableSeq(array('c', 'ACTGCCATCXT'), IUPACUnambiguousDNA())

Anyway, my immediate concern is how to deal with uppercase and lowercase 
characters. The reverse_complement function in Bio.GFF.easy converts lowercase 
characters to uppercase before taking the complement:

def _forward_complement_list_with_table(table, seq):
     return [table[x] for x in seq.tostring().upper()]

However, the complement and antiparallel functions in Bio.SeqUtils are not 
implemented for lowercase sequences:

_before = ''.join(IUPACData.ambiguous_dna_complement.keys())
_after = ''.join(IUPACData.ambiguous_dna_complement.values())
_ttable = maketrans(_before, _after)

def complement(seq):
     """Returns the complementary sequence (NOT antiparallel).

     This works on string sequences, not on Bio.Seq objects.
     """
     #Much faster on really long sequences than the previous loop based one.
     #thx to Michael Palmer, University of Waterloo
     return seq.translate(_ttable)
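
A minimal sketch of how the translate-table approach above could be extended to keep lowercase letters, assuming Python 3's str.maketrans rather than the string.maketrans used in Bio.SeqUtils. The mapping is the standard IUPAC ambiguous DNA complement, written out by hand so the example does not depend on Biopython; the names _COMPLEMENT and reverse_complement are illustrative only.

```python
# Hypothetical sketch: a complement table that covers both cases.
# Standard IUPAC ambiguous DNA complement, spelled out explicitly.
_COMPLEMENT = {
    "A": "T", "C": "G", "G": "C", "T": "A",
    "M": "K", "R": "Y", "W": "W", "S": "S",
    "Y": "R", "K": "M", "V": "B", "H": "D",
    "D": "H", "B": "V", "X": "X", "N": "N",
}
_before = "".join(_COMPLEMENT.keys())
_after = "".join(_COMPLEMENT.values())
# Append the lowercase forms so that case is preserved on translation.
_ttable = str.maketrans(_before + _before.lower(), _after + _after.lower())


def complement(seq):
    """Return the complementary string (NOT antiparallel), preserving case."""
    # Characters that are not in the table are passed through unchanged.
    return seq.translate(_ttable)


def reverse_complement(seq):
    """Return the antiparallel complement of a string sequence."""
    return complement(seq)[::-1]
```

With a table like this, the .upper() call used in Bio.GFF.easy would not be needed, at the cost of accepting mixed-case input.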


So there are two issues we need to decide:

1) Should we modify the Seq and MutableSeq classes such that the sequence is 
always consistent with the alphabet?

2) Should we allow lowercase characters in the sequence?

My own preference at this point is 1) yes 2) no, but I'd like to check what 
y'all think.
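
As a rough illustration of what "1) yes 2) no" could look like, here is a small sketch of a mutable sequence that rejects letters outside its alphabet, including lowercase ones. The class name CheckedMutableSeq and the constant UNAMBIGUOUS_DNA are hypothetical and are not part of Bio.Seq; the letter set "GATC" is assumed for unambiguous DNA.

```python
# Hypothetical sketch, not Biopython's API: a sequence wrapper that refuses
# letters outside its alphabet, both at construction and on assignment.
UNAMBIGUOUS_DNA = "GATC"  # assumed letter set for unambiguous DNA


class CheckedMutableSeq:
    def __init__(self, data, letters=UNAMBIGUOUS_DNA):
        self.letters = letters
        self._check(data)
        self.data = list(data)

    def _check(self, text):
        # Collect every character that the alphabet does not allow.
        bad = sorted(set(text) - set(self.letters))
        if bad:
            raise ValueError("letters not in alphabet: %s" % "".join(bad))

    def __setitem__(self, index, letter):
        # Validate before mutating, so a failed assignment changes nothing.
        self._check(letter)
        self.data[index] = letter

    def __str__(self):
        return "".join(self.data)
```

Under this sketch, s[9] = 'X' on an unambiguous DNA sequence raises ValueError instead of silently succeeding.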

--Michiel.

-- 
Michiel de Hoon, Assistant Professor
University of Tokyo, Institute of Medical Science
Human Genome Center
4-6-1 Shirokane-dai, Minato-ku
Tokyo 108-8639
Japan
http://bonsai.ims.u-tokyo.ac.jp/~mdehoon


From crocha at dc.uba.ar  Tue Jul  6 13:52:11 2004
From: crocha at dc.uba.ar (Cristian S. Rocha)
Date: Sat Mar  5 14:43:35 2005
Subject: [Biopython-dev] hmmpfam parser
In-Reply-To: 40226C51.F70315B7@ebc.uu.se
Message-ID: <1089136330.19621.34.camel@numero2>

Hi,

While searching for an hmmsearch output parser for Biopython, I found a
mail from you to the biopython-dev list that included source code. I'm
interested in using it to parse a lot of HMM results, but I would like to
know whether a mature version exists and whether you could add it to the
Biopython CVS. I would be glad to help with testing and with adding code
to your parser. I really need this code. I was learning Martel in order
to write a parser, but I would rather help you than write one alone.

Thanks,
Cristian.

PS: Sorry about my bad English... :)

-- 
Lic. Cristian S. Rocha.
<crocha@dc.uba.ar>
Departamento de Computación. FCEyN. UBA.
Pabellon I. Cuarto 9.
Ciudad Universitaria.
(1428) Buenos Aires. Argentina.
Tel: +54-11-4576-3390/96 int 714
Tel/Fax: +54-11-4576-3359
Cel: 15-5-607-9192


From tcwilliams79 at verizon.net  Mon Jul  5 21:35:44 2004
From: tcwilliams79 at verizon.net (Thomas C. Williams)
Date: Sat Mar  5 14:43:35 2005
Subject: [Biopython-dev] ReportLab toolkit is now at
	http://www.reportlab.org/rl_toolkit.html
Message-ID: <20040706013548.CKMQ6671.out003.verizon.net@TCWHP>

 
From h.j.tipney at stud.man.ac.uk  Tue Jul 13 09:56:43 2004
From: h.j.tipney at stud.man.ac.uk (h.j.tipney@stud.man.ac.uk)
Date: Sat Mar  5 14:43:35 2005
Subject: [Biopython-dev] python newbies blast problem
Message-ID: <E1BkNm8-000C7r-9n@curlew.cs.man.ac.uk>

Hi
I posted this to the other mailing list and got no response so I'm hoping
you guys can help me. I'm very new to programming and even newer to python,
so I apologise in advance if this is a simple problem with an obvious
solution but there are no python programmers near to help me. Anyway, I
inherited the script below and have been using it on and off as part of a
larger workflow. It has been running fine, but I ran it again last week and
it didn't give the output I expected - it returned the 'your results will be
updated in X seconds' page rather than the actual results. It has been a
while since I had used this program and both blast and biopython had been
updated so I've now got the new biopython release (1.30) but I still get the
'wrong' output. I'm using python 2.3.3 on solaris, if that helps. Any help
would be greatly appreciated! Thank you in advance
Hannah Tipney

    #!/opt/cs/bin/python
    from Bio import Fasta
    from Bio.Blast import NCBIWWW
    import sys
    import getopt

    opts, args = getopt.getopt(sys.argv[1:], "",
                               ['program=', 'database=', 'format=',
                                'entrez_query='])

    print sys.argv
    print opts

    if len(args)==0:
        print "no file given"
        sys.exit(2)

    program = "blastn"
    database = "nr"
    format = "Text"
    #"Homo sapiens [ORGN]"

    short_query=""

    for o,a in opts:
        print o,a
        if o == "--program":
            program = a
        if o == "--database":
            database = a
        if o == "--format":
            format = a
        if o == "--entrez_query":
            short_query = a

    if short_query=="human":
        query="Homo sapiens [ORGN]"
    else:
        query=""

    print "program = %s , database = %s, query = %s" % (
        program, database, query)

    file_for_blast = open(args[0], 'r')
    f_iterator = Fasta.Iterator(file_for_blast)

    f_record = f_iterator.next()
    file_for_blast.close()
    b_results = NCBIWWW.blast(program, database,
    f_record,format_type=format, entrez_query=query,timeout=60)

    blast_results = b_results.read()
    sys.stdout.write(blast_results)

------- End of forwarded message -------
------------------------------------------
Hannah Tipney
Manchester University,
Academic Unit of Medical Genetics,
St Mary's Hospital,
Hathersage Road,
Manchester. M13 0JH. 
UK

tel: +44 (0)161 276 6602
fax: +44 (0)161 276 6606

From jeffrey_chang at stanfordalumni.org  Tue Jul 13 23:47:32 2004
From: jeffrey_chang at stanfordalumni.org (Jeffrey Chang)
Date: Sat Mar  5 14:43:35 2005
Subject: [Biopython-dev] python newbies blast problem
In-Reply-To: <E1BkNm8-000C7r-9n@curlew.cs.man.ac.uk>
References: <E1BkNm8-000C7r-9n@curlew.cs.man.ac.uk>
Message-ID: <893C92DD-D548-11D8-8676-000A956845CE@stanfordalumni.org>

Hello,

This is because the NCBI website is not really meant to be queried by 
computer scripts.  It looks like a recent change has broken the 
NCBIWWW.blast function.  Fortunately, NCBI does have a computer 
friendly BLAST API called QBLAST.  I added an interface to QBLAST into 
biopython called NCBIWWW.qblast.  Please get the updated version of the 
NCBIWWW.py from CVS, and replace NCBIWWW.blast with NCBIWWW.qblast in 
your script, and see if that fixes things.
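
A minimal sketch of what the switch to NCBIWWW.qblast might look like for the script in question, written for Python 3 and a current Biopython. The helper names qblast_arguments and run_qblast are hypothetical, and the format_type and entrez_query keywords are assumed to be accepted by the installed version of qblast, so the signature in your copy of NCBIWWW.py is worth checking.

```python
def qblast_arguments(program="blastn", database="nr", short_query="",
                     format_type="Text"):
    """Build keyword arguments for NCBIWWW.qblast from the script's options."""
    kwargs = {
        "program": program,
        "database": database,
        "format_type": format_type,
    }
    # Same shortcut as the original script: "human" selects Homo sapiens.
    if short_query == "human":
        kwargs["entrez_query"] = "Homo sapiens [ORGN]"
    return kwargs


def run_qblast(fasta_text, **options):
    """Submit one FASTA record to NCBI through QBLAST and return the report."""
    # Imported lazily so the helper above works without Biopython installed;
    # calling this function needs Biopython and network access.
    from Bio.Blast import NCBIWWW

    handle = NCBIWWW.qblast(sequence=fasta_text, **qblast_arguments(**options))
    return handle.read()
```

The result of run_qblast(open(path).read(), short_query="human") could then be written to sys.stdout as in the original script.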

The anonymous CVS is at:
http://cvs.biopython.org/

Jeff




From bugzilla-daemon at portal.open-bio.org  Wed Jul 14 00:35:46 2004
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon@portal.open-bio.org)
Date: Sat Mar  5 14:43:35 2005
Subject: [Biopython-dev] [Bug 1667] New: PUBMED key collision in dbxref table
Message-ID: <200407140435.i6E4ZkSV012149@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=1667

           Summary: PUBMED key collision in dbxref table
           Product: Biopython
           Version: Not Applicable
          Platform: Macintosh
        OS/Version: MacOS X
            Status: NEW
          Severity: normal
          Priority: P2
         Component: BioSQL
        AssignedTo: biopython-dev@biopython.org
        ReportedBy: open-bio@zesty.ca


I am using BioPython 1.30.

While loading records from the human genome into a MySQL database, BioSQL
causes the error: "Duplicate entry PUBMED-0 for key 2".

PUBMED appears in the dbxref table.  I looked at the code that inserts entries
into the dbxref table: the method _add_dbxref at line 97 of BioSQL/Loader.py.

_add_dbxref is called twice, at lines 333 and 336.  I believe the second call
has a bug, since both calls supply "reference.medline_id" as an argument.

            if reference.medline_id:
                dbxref_id = self._add_dbxref("MEDLINE",
                                             reference.medline_id, 0)
            elif reference.pubmed_id:
                dbxref_id = self._add_dbxref("PUBMED",
                                             reference.medline_id, 0)

It seems clear to me that the last line above should say "reference.pubmed_id".

If I make this change in my local copy of BioSQL/Loader.py, the MySQL error
about the duplicate key value indeed goes away.
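
A minimal, self-contained sketch of the corrected branch logic. The function name dbxref_key and the use of SimpleNamespace as a stand-in for a reference object are illustrative assumptions; the real code lives in a method of BioSQL/Loader.py and calls self._add_dbxref.

```python
from types import SimpleNamespace


def dbxref_key(reference):
    """Return the (dbname, accession) pair to store for a reference, or None."""
    if reference.medline_id:
        return ("MEDLINE", reference.medline_id)
    elif reference.pubmed_id:
        # The fix: use pubmed_id here, not medline_id (which is empty
        # whenever this branch is reached).
        return ("PUBMED", reference.pubmed_id)
    return None


# Stand-in objects with made-up placeholder identifiers.
only_pubmed = SimpleNamespace(medline_id="", pubmed_id="PMID-PLACEHOLDER")
only_medline = SimpleNamespace(medline_id="UI-PLACEHOLDER", pubmed_id="")
```

With the original code, every reference that has only a PubMed ID would be keyed on the same empty medline_id, which is consistent with the duplicate-key error reported above.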


------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

From Hegedus.Tamas at mayo.edu  Thu Jul 15 14:58:18 2004
From: Hegedus.Tamas at mayo.edu (Hegedus, Tamas .)
Date: Sat Mar  5 14:43:35 2005
Subject: [Biopython-dev] ModBioSQL release 0.12
Message-ID: <D70AF73EF4BAD811A4020002B3C1E496128A9A@excsrv22.mayo.edu>

Dear All,

Since I used Python and BioPython in my Modular BioSQL package, my site may be of interest to you:
http://www.biomembrane.hu/~hegedus/modbiosql/

Best regards,
Tamas

--
Tamas Hegedus, Research Fellow | phone: 480-301-6041
Mayo Clinic Scottsdale         | fax:   480-301-7017
13000 E. Shea Blvd             | mailto:hegedus.tamas@mayo.edu
Scottsdale, AZ, 85259          | http://www.biomembrane.hu/~hegedus

From bugzilla-daemon at portal.open-bio.org  Sat Jul 17 13:42:13 2004
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon@portal.open-bio.org)
Date: Sat Mar  5 14:43:35 2005
Subject: [Biopython-dev] [Bug 1669] New: SwissProt Parser error - cannot
	read recent SwissProt entries
Message-ID: <200407171742.i6HHgDhF025690@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=1669

           Summary: SwissProt Parser error - cannot read recent SwissProt
                    entries
           Product: Biopython
           Version: 1.24
          Platform: PC
        OS/Version: Linux
            Status: NEW
          Severity: major
          Priority: P2
         Component: Main Distribution
        AssignedTo: biopython-dev@biopython.org
        ReportedBy: kris@math.princeton.edu


In newer records, the RX field of a SwissProt entry can span more than one line,
while SProt.py accepts only a single RX line per reference. See the error message
below. RX holds the database cross-references of the article relevant to the
entry, and SwissProt has recently added DOI references as well. RA should be the
field after RX, but if a reference contains two RX lines, the parser chokes.

Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "find_pdb_orgs.py", line 33, in parse_yeast
    curr=siter.next()
  File "Bio/SwissProt/SProt.py", line 166, in next
    return self._parser.parse(File.StringHandle(data))
  File "Bio/SwissProt/SProt.py", line 290, in parse
    self._scanner.feed(handle, self._consumer)
  File "Bio/SwissProt/SProt.py", line 333, in feed
    self._scan_record(uhandle, consumer)
  File "Bio/SwissProt/SProt.py", line 338, in _scan_record
    fn(self, uhandle, consumer)
  File "Bio/SwissProt/SProt.py", line 414, in _scan_reference
    self._scan_ra(uhandle, consumer)
  File "Bio/SwissProt/SProt.py", line 436, in _scan_ra
    one_or_more=1)
  File "Bio/SwissProt/SProt.py", line 360, in _scan_line
    read_and_call(uhandle, event_fn, start=line_type)
  File "Bio/ParserSupport.py", line 300, in read_and_call
    raise SyntaxError, errmsg
SyntaxError: Line does not start with 'RA':
RX   DOI=10.1128/JB.183.20.5942-5955.2001;
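
A rough sketch of the tolerant behaviour being asked for: read one or more consecutive RX lines before expecting RA. This is standalone Python and not the scanner/consumer machinery of SProt.py; the function name collect_rx is hypothetical, and the first sample line uses placeholder identifiers, while the DOI line is the one from the traceback above.

```python
def collect_rx(lines):
    """Gather (database, identifier) pairs from leading RX lines.

    Stops at the first line that is not an RX line, so the caller can
    continue with RA as before.
    """
    refs = []
    for line in lines:
        if not line.startswith("RX"):
            break
        # Skip the two-letter line code and its padding ("RX   ").
        for item in line[5:].split(";"):
            item = item.strip()
            if item:
                # Split on the first "=" only; identifiers may contain more.
                database, _, identifier = item.partition("=")
                refs.append((database, identifier))
    return refs


sample = [
    "RX   MEDLINE=00000000; PubMed=00000000;",  # placeholder identifiers
    "RX   DOI=10.1128/JB.183.20.5942-5955.2001;",
    "RA   (author line follows)",
]
```

In the real parser, the equivalent change would presumably be to scan RX with "one or more" semantics, as is already done for RA.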



From fsms at users.sourceforge.net  Mon Jul 26 12:45:51 2004
From: fsms at users.sourceforge.net (fsms@users.sourceforge.net)
Date: Sat Mar  5 14:43:35 2005
Subject: [Biopython-dev] Restriction analysis package
Message-ID: <4105353F.3000102@users.sourceforge.net>

Hi,

The restriction analysis package is now ready, complete with a
tutorial/cookbook section in HTML.
If you give me access to the CVS, I can commit it this week.
Alternatively, the files are in CVS at:
http://cvs.sourceforge.net/viewcvs.py/rana/rana/Bio/Restriction/

Fred