[Biopython] New to BP. Looking for closely spaced genes

Dan Tomso dtomso at agbiome.com
Mon Apr 1 15:09:39 EDT 2013


Hi, Mark.

I think BioPython will have the tools you need to do the mechanical handling of sequences.  You might want to contemplate various strategies to do the positional comparisons and data overlays.  For example, if I were approaching this, I would start building position tables for the various content in SQL and then do the set/join/overlap work there.  
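
To make the position-table idea a bit more concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table layout, the column names, the close_pairs helper and the gene rows are all made up for illustration; real coordinates would come from whatever annotation you parse. It assumes half-open coordinates and only pairs non-overlapping features on the same chromosome.

```python
import sqlite3

# Hypothetical position table: one row per gene/ORF.
# Column names and the rows below are illustrative assumptions, not real data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE features ("
    "name TEXT, chrom TEXT, start_pos INTEGER, end_pos INTEGER, strand INTEGER)"
)
conn.executemany(
    "INSERT INTO features VALUES (?, ?, ?, ?, ?)",
    [
        ("geneA", "chr1", 1000, 2000, 1),
        ("geneB", "chr1", 2300, 3100, -1),
        ("geneC", "chr1", 9000, 9500, 1),
        ("geneD", "chr2", 500, 900, -1),
        ("geneE", "chr2", 1000, 1400, -1),
    ],
)


def close_pairs(conn, max_gap):
    """Return (upstream, downstream, gap, orientation) for feature pairs on the
    same chromosome whose gap is at most max_gap bases."""
    query = """
        SELECT a.name, b.name, b.start_pos - a.end_pos AS gap,
               CASE WHEN a.strand = b.strand THEN 'same' ELSE 'opposite' END
        FROM features AS a
        JOIN features AS b
          ON a.chrom = b.chrom
         AND b.start_pos >= a.end_pos
         AND b.start_pos - a.end_pos <= ?
        ORDER BY a.chrom, a.start_pos
    """
    return conn.execute(query, (max_gap,)).fetchall()


for row in close_pairs(conn, 500):
    print(row)
```

Note that the join reports every pair within the gap, not only immediate neighbors, so a small max_gap keeps the result manageable. Expression data could go in a second table keyed on the gene name and be joined in the same way.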

But to re-answer your primary question--yes, you can get the sequence and features parsed in BioPython with reasonable convenience.

Best regards,
Dan Tomso

________________________________________
From: biopython-bounces at lists.open-bio.org on behalf of Mark Budde
Sent: Monday, April 01, 2013 2:41 PM
To: biopython
Subject: [Biopython] New to BP. Looking for closely spaced genes

Hi,
Before I dive too far into BioPython, I'd like to get some input on whether
BioPython is an appropriate tool for my task.

I would like to look at the human genome ORF structure and identify regions
where ORFs are closely spaced but differentially regulated, and also
identify whether the ORFs are facing the same direction or opposing
directions. To do this, I assume I would first download the annotated
genome and write a script in BioPython annotating how far each ORF is from
its neighbors, what the orientation is, and store the result in a
dictionary. Then I would download some expression data sets and add this
data to the dictionary. Then I would write some algorithm comparing
gene distance, orientation and expression correlation to generate a list of
candidate ORF pairs which fit my criteria.

My question is, is BioPython a reasonable tool to accomplish this, or is it
going to be way too slow whereas some alternative package is better suited
for my task?
Thanks,
Mark Budde
_______________________________________________
Biopython mailing list  -  Biopython at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/biopython


