[Bioperl-l] Assignment problem :(

Kurt Gobain enrique_rulz at yahoo.com
Wed Jan 3 14:42:59 UTC 2007


Hi guys, I've got this assignment to submit. I'm not asking anyone to do it
for me, but if anyone can give me some direction I'd be really glad. There are
four .embl files, which I have attached. We need to do the following things:
1) Read 3 arguments from the user at command line.
2) Read in the set of files in a directory.
3) For each sequence, reformat the sequence into FASTA format.
4) For each sequence in FASTA format, search for close homologues using the
local copy of BLAST.
5) For each sequence, print the Swiss-Prot name and code of its close
homologues to an HTML-formatted file that can be viewed in a web browser. Is
it possible to extract the E-values for each hit and record them in the HTML
file?
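
One way step 5 could be approached, as a rough sketch only: it is written in
plain Python rather than BioPerl, and it assumes the BLAST search from step 4
was saved in the 12-column tab-separated tabular format, with the E-value in
column 11 and subject identifiers of the form db|CODE|NAME. The function
names parse_blast_tabular and hits_to_html and the 1e-5 cutoff are
hypothetical choices for illustration, not part of the assignment.

```python
import html


def parse_blast_tabular(text):
    """Turn BLAST tabular output into a list of hit dictionaries."""
    hits = []
    for line in text.splitlines():
        # Skip blank lines and the comment lines some tabular modes emit.
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) < 12:
            continue  # not a complete 12-column hit line
        subject = cols[1]
        # Assumption: Swiss-Prot style identifiers look like db|CODE|NAME.
        parts = subject.split("|")
        if len(parts) >= 3:
            code, name = parts[1], parts[2]
        else:
            code, name = subject, ""
        evalue_text = cols[10].strip()
        # Defensive: treat a bare exponent such as "e-100" as "1e-100".
        if evalue_text.lower().startswith("e"):
            evalue_text = "1" + evalue_text
        hits.append({
            "query": cols[0],
            "code": code,
            "name": name,
            "evalue": float(evalue_text),
        })
    return hits


def hits_to_html(hits, max_evalue=1e-5):
    """Build an HTML table of the hits at or below the E-value cutoff."""
    rows = []
    for hit in sorted(hits, key=lambda h: h["evalue"]):
        if hit["evalue"] > max_evalue:
            continue  # hypothetical definition of a "close" homologue
        cells = (hit["query"], hit["name"], hit["code"],
                 "%.2g" % hit["evalue"])
        rows.append("<tr>" + "".join(
            "<td>%s</td>" % html.escape(cell) for cell in cells) + "</tr>")
    header = ("<tr><th>Query</th><th>Swiss-Prot name</th>"
              "<th>Code</th><th>E-value</th></tr>")
    return ('<html><body><table border="1">\n' + header + "\n"
            + "\n".join(rows) + "\n</table></body></html>\n")
```

Writing the returned string to a file ending in .html is enough for a browser
to display it. What counts as a "close" homologue is a judgment call, so the
cutoff is left as a parameter.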
  
I'm done with the first two, but I'm having problems with 3, 4 and 5. I've
only got basic knowledge of Perl. I've been trying really hard to find a
solution but haven't been able to solve them. Please, if anyone can point me
in the right direction I'd be really grateful.

Thanks in advance.

File 1 :  P41391.embl
=========================================================================================
ID   sp|P41391|RNA1_SCHPOstandard; AA; UNK; 386 BP.
XX
AC   unknown;
XX
DE   Ran GTPase-activating protein 1 (Protein rna1) - Schizosaccharomyces pombe
DE   (Fission yeast).
XX
FH   Key             Location/Qualifiers
FH
XX
SQ   Sequence 386 BP; 31 A; 4 C; 20 G; 18 T; 313 other;
     msrfsiegks lkldaitted eksvfavlle ddsvkeivls gntigteaar wlseniaskk        60
     dleiaefsdi ftgrvkdeip ealrlllqal lkcpklhtvr lsdnafgpta qeplidflsk       120
     htplehlylh nnglgpqaga kiaralqela vnkkaknapp lrsiicgrnr lengsmkewa       180
     ktfqshrllh tvkmvqngir pegiehllle glaycqelkv ldlqdntfth lgssalaial       240
     kswpnlrelg lndcllsarg aaavvdafsk leniglqtlr lqyneielda vrtlktvide       300
     kmpdllflel ngnrfseedd vvdeirevfs trgrgeldel ddmeeltdee eedeeeeaes       360
     qspepetsee ekedkelade lskahi                                            386
//
===========================================================================================
File 2:  P43994.embl
===========================================================================================
ID   sp|P43994|Y395_HAEINstandard; AA; UNK; 102 BP.
XX
AC   unknown;
XX
DE   UPF0125 protein HI0395 - Haemophilus influenzae.
XX
FH   Key             Location/Qualifiers
FH
XX
SQ   Sequence 102 BP; 10 A; 0 C; 5 G; 5 T; 82 other;
     mnqinieiay afperyylks fqvdegitvq taitqsgils qfpeidlstn kigifsrpik        60
     ltdvlkegdr ieiyrpllad pkeirrkraa eqaaakdkek ga                          102
//
========================================================================================== 
File 3: Q9UJ38.embl
==========================================================================================
ID   sp|Q9UJ37|SI7B_HUMANstandard; AA; UNK; 374 BP.
XX
AC   unknown;
XX
DE   Alpha-N-acetylgalactosaminide alpha-2,6-sialyltransferase II (EC 2.4.99.-)
DE   (Gal-beta-1,3-GalNAc alpha-2,6-sialyltransferase) (ST6GalNAc II)
DE   (Sialyltransferase 7B) (SThM) - Homo sapiens (Human).
XX
FH   Key             Location/Qualifiers
FH
XX
SQ   Sequence 374 BP; 30 A; 5 C; 31 G; 18 T; 290 other;
     mglprgsffw llllltaacs gllfalyfsa vqrypgpaag ardttsfeaf fqskasnswt        60
     gkgqacrhll hlaiqrhphf rglfnlsipv llwgdlftpa lwdrlsqhka pygwrglshq       120
     viastlslln gsesaklfap prdtppkcir cavvgnggil ngsrqgpnid ahdyvfrlng       180
     avikgferdv gtktsfygft vntmknslvs ywnlgftsvp qgqdlqyifi psdirdyvml       240
     rsailgvpvp egldkgdrph ayfgpeasas kfkllhpdfi sylterflks klinthfgdl       300
     ympstgalml ltalhtcdqv saygfitsny wkfsdhyfer kmkplifyan hdlsleaalw       360
     rdlhkagilq lyqr                                                         374
//
=========================================================================================== 
File 4:  Q9WZY7.embl
==========================================================================================
ID   tr|Q9WZY7  standard; AA; UNK; 185 BP.
XX
AC   unknown;
XX
DE   Hypothetical protein - Thermotoga maritima.
XX
FH   Key             Location/Qualifiers
FH
XX
SQ   Sequence 185 BP; 17 A; 2 C; 13 G; 12 T; 141 other;
     mvlfekpgke ntrktleiai qkaselsskk lliasatgys armalemipe dmklvvvthh        60
     agfeepdtqe fdeelrkllk ekghdvltat halsagersl rrkfggiypl eiiantlrmf       120
     segvkvgvei tlmaadaglv ktselvvacg gtesgldsai vvkpanspnl fdlkiteilc       180
     kplis                                                                   185
//
===========================================================================================
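
For step 3, a record like the ones pasted above can be turned into FASTA by
taking the identifier from the ID line, the description from the DE lines,
and the residues from everything between SQ and //. The following is a
minimal sketch in plain Python rather than BioPerl. The function name
embl_to_fasta is hypothetical, and it assumes that "standard" fused onto the
end of the identifier should be trimmed and that lines without a two-letter
prefix outside the sequence block can be ignored.

```python
import re


def embl_to_fasta(embl_text, width=60):
    """Convert one or more EMBL-style records into FASTA text."""
    records = []
    seq_id, description, residues = None, [], []
    in_sequence = False
    for line in embl_text.splitlines():
        if line.startswith("//"):
            # End of record: emit the header and the wrapped sequence.
            if seq_id is not None:
                sequence = "".join(residues).upper()
                header = ">" + " ".join([seq_id] + description)
                wrapped = [sequence[i:i + width]
                           for i in range(0, len(sequence), width)]
                records.append("\n".join([header] + wrapped))
            seq_id, description, residues = None, [], []
            in_sequence = False
        elif in_sequence:
            # Keep letters only; this drops spaces and position counts,
            # whether they sit at the end of the line or on their own line.
            residues.append(re.sub(r"[^A-Za-z]", "", line))
        elif line.startswith("ID"):
            seq_id = line[2:].split(";")[0].strip()
            # Assumption: "standard" is the entry class, not part of the ID.
            if seq_id.endswith("standard"):
                seq_id = seq_id[:-len("standard")].strip()
        elif line.startswith("DE"):
            description.append(line[2:].strip())
        elif line.startswith("SQ"):
            in_sequence = True
    return "\n".join(records) + ("\n" if records else "")
```

Reading each .embl file from the directory, passing its contents to this
function, and writing the result out gives BLAST a FASTA query for step 4.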

I've also copied and pasted the files above in case you aren't able to
download them:
http://maxupload.com/0FD1F2F4 <-- P41391.embl
http://maxupload.com/E62811F4 <-- P43994.embl
http://maxupload.com/88F8A43E <-- Q9UJ38.embl
http://maxupload.com/B687CE7D <-- Q9WZY7.embl