[Biopython] python advice needed
Csaba Kiss
csaba.kiss at lanl.gov
Tue Apr 15 15:19:19 EDT 2014
Thanks for the advice, Kevin. If this were a forum, they should make your
post a sticky :). I use PyCharm and really like it. However, using it
efficiently is also challenging.
Csaba
On 4/15/2014 10:27 AM, Kevin Rue wrote:
> Hi Csaba,
>
> Well done! I witness every day in my research group that the transition
> from fundamental biology to bioinformatics is not a straightforward
> process. Congratulations on your first successful experience.
>
> To give some context to my answer, let me tell you that I am a 3rd
> year PhD student trained in bioinformatics for the past 6 years (since
> my Master's Degree). Python is the first programming language I was
> taught during my Master's Degree (a tiny amount of Matlab in
> practicals of math before that), and I was taught the object-oriented
> programming aspect through classes of the Java programming language.
>
> I am glad that you managed to teach yourself how to program in Python
> through online resources. However, I think that going to actual
> classes can ease the learning curve a lot, particularly at the
> beginning, and for new topics such as object-oriented programming. The
> interactive Q&A with the demonstrator, and the questions of other
> classmates can help rapidly come across some common mistakes and
> tricks. For instance, a post-doc in my lab is learning Python just
> like you, and I have seen him rack his brain for hours until I came
> along and pointed him in the right direction (avoid giving a student
> an answer: "give someone food and he'll eat for the day, teach them
> how to cook and they'll eat for the rest of their life").
>
> Meanwhile, it is always useful to have a book around. I have heard a lot of
> good things about the O'Reilly books for that matter. They have Python books
> for beginners, intermediate and high-performance programming
> (http://shop.oreilly.com/category/browse-subjects/programming/python.do).
>
>
>
> Now, if you allow me a few personal pieces of advice about programming
> (valid for Python and most languages):
>
> * "Always write pseudo-code first"
> o Pseudo-code is "an informal high-level
> <http://en.wikipedia.org/wiki/High-level_programming_language> description
> of the operating principle of a computer program or other
> algorithm" (Thanks Wikipedia, you just saved me 10 minutes to
> find my words)
> o In other words, before you even approach your "file.py" script,
> turn off the screen of your computer, take a piece of paper,
> and write down what your script is supposed to do, what input
> it will accept, what outputs it will generate. First in one
> sentence of plain English. Then break the sentence into
> subtasks. Then continue breaking each of these subtasks into
> smaller ones until you recognise small tasks that you feel
> confident to code in a reasonable number of lines.
> o The pseudo-code is extremely valuable for two reasons:
> + Avoid losing focus of what the script was originally
> intended to do. (once coding, it is quite easy to lose
> sight of the greater scheme)
> + It will help document your script, if you write a wiki or
> simply to comment your code (if you share it with someone
> else, they won't need to read the entire code to
> understand its purpose)
> * "Draw your objects/classes"
> o Essentially, an object/class has a number of attributes
> (=variables) and methods (=functions). For each I typically
> draw a box entitled with the name of the class. Then in the
> box, I list the names of the attributes and the names of the
> methods. The names of the attributes and methods should
> clearly represent what they are meant to contain (attributes)
> or do (methods).
> + I still apply a rule that one of my earliest programming
> teachers taught us: "functions are meant to do stuff,
> therefore their name should always start with a verb of
> action"
> * "Google is your friend"
> o That's a tricky one, but every time you know what you want to
> do but you don't know how on earth you can do it: Google your
> problem. You may have to browse a while, or try different
> search words, but in my experience "Any problem you find to
> write working and efficient code, someone else likely had the
> same problem before you". If you can clearly explain your
> problem, StackOverflow and other such websites may have the
> answer.
> * Use a code versioning tool
> o All the changes you have done for the past week have made your
> script worse and you don't have a copy of last week's script?
> Version control tools such as git/GitHub and svn will help you
> keep track of what your code looked like along the way. This
> way, you can edit a script that is working to try and enhance
> it without the fear of messing it up. If it goes sour, you can
> just go back to the working script without having to keep a
> separate backup.
> * Use a friendly (but still powerful) development environment
> o IDEs (integrated development environments) are software tools that
> are meant to make programming easier. A (silly?) example is a
> feature I cannot work without: auto-completion. Tired of
> typing the same long variable name over and over again? Once
> you have defined "variable=5" in your script, a decent IDE
> will allow you to type only "var" and will open a friendly
> pop-up window suggesting all existing variables and
> methods starting with "var". Select the one you need with the
> arrow keys and hit TAB: you don't have to type the rest of the
> variable. An amusing side-effect of this is that your variable
> names will grow longer (and therefore be more explicit about
> what they contain). IDEs come with many more features including
> code checking, spell checking, ...
> o For Python I am very happy with PyCharm
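
As a minimal sketch of the first two pieces of advice above (pseudo-code
first, then classes with clearly named attributes and verb-named methods),
the example below may help. The class names Read and ReadFilter, their
attributes, the quality threshold and the sample reads are all hypothetical;
they are not taken from Csaba's package or from Biopython, and only the
Python standard library is assumed.

```python
# Pseudo-code written first, in plain English, one subtask per line:
#   1. Take a list of reads, each with a name, a sequence and quality scores.
#   2. For each read, compute its mean quality.
#   3. Keep only the reads whose mean quality reaches a threshold.
#   4. Report how many reads were kept.


class Read:
    """Box "Read": attributes are name, sequence and qualities (hypothetical)."""

    def __init__(self, name, sequence, qualities):
        self.name = name            # read identifier
        self.sequence = sequence    # nucleotide string
        self.qualities = qualities  # one integer score per base

    def compute_mean_quality(self):
        """Subtask 2. The method name starts with a verb of action."""
        if not self.qualities:
            return 0.0
        return sum(self.qualities) / len(self.qualities)


class ReadFilter:
    """Box "ReadFilter": one attribute (the threshold) and one method."""

    def __init__(self, minimum_mean_quality):
        self.minimum_mean_quality = minimum_mean_quality

    def select_reads(self, reads):
        """Subtask 3: keep the reads that reach the threshold."""
        return [
            read for read in reads
            if read.compute_mean_quality() >= self.minimum_mean_quality
        ]


# Subtasks 1 and 4, using made-up sample data.
reads = [
    Read("read_1", "ACGT", [30, 32, 35, 31]),
    Read("read_2", "ACGT", [10, 12, 8, 10]),
]
kept_reads = ReadFilter(minimum_mean_quality=20).select_reads(reads)
print(len(kept_reads), "of", len(reads), "reads kept")
```

Each numbered line of the pseudo-code maps onto one method or one short
block, so the comments at the top can double as documentation for the script.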
>
>
>
> This email ended up being much longer than I intended, but I hope
> you will find it useful!
> The learning curve for Python programming can be rough. Learning
> additional tricks like version control, IDE, and object-oriented
> programming can make it even steeper, but the end result is a very
> rewarding skillset that can be helpful in many circumstances and
> appeal to many research group leaders too!
>
> Best of luck in your learning of Python !
>
> Kevin
>
>
>
>
> On 15 April 2014 15:58, Csaba Kiss <csaba.kiss at lanl.gov> wrote:
>
> Hi!
> I need some advice on how to get better at Python. I have written a
> software package to analyze antibody deep sequencing data. This
> was my first experience with python and I am not a programmer. The
> end result works, however, if a professional coder looks at the
> scripts, it is obvious that it was written by an amateur. I am
> planning to re-write the code into a better format that is
> extendable and more user and coder friendly. At the moment the
> script only relies on biopython to get the sequences and quality
> values out of sff and fastq files, the rest is custom written. I
> would like to rely more on biopython and also perhaps extend
> biopython with new features.
> The problem I am having is object oriented python and classes. I
> understand the concept of both, but it's completely different to
> actually use it. I would like to ask for help from scientists who are
> in a similar situation to mine. I am a molecular biologist with
> interest in coding, but little background. Do you have any good
> tutorials or books about Python classes and OOP? For example, when I
> learned Python I found the Google Python class extremely
> valuable. I practically looked at the videos and solved the
> problems and that sent me on my way to python:
> https://developers.google.com/edu/python/?csw=1
>
> Any help would be appreciated:
> Csaba
>
> --
> Best Regards:
> Csaba Kiss PhD, MSc, BSc
> TA-43, HRL-1, MS888
> Los Alamos National Laboratory
> Work: 1-505-667-9898
> Cell: 1-505-920-5774
>
> _______________________________________________
> Biopython mailing list - Biopython at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biopython
>
>
>
>
> --
> Kévin RUE-ALBRECHT
> Wellcome Trust Computational Infection Biology PhD Programme
> University College Dublin
> Ireland
> http://fr.linkedin.com/pub/k%C3%A9vin-rue/28/a45/149/en
--
Best Regards:
Csaba Kiss PhD, MSc, BSc
TA-43, HRL-1, MS888
Los Alamos National Laboratory
Work: 1-505-667-9898
Cell: 1-505-920-5774