[Bioperl-l] A problem about a subroutin in my code

Alex Zhang mayagao1999 at yahoo.com
Fri Jul 29 20:37:35 EDT 2005





Dear all,
 
Sorry to bother you. I need some help on my code. I have an input file named
"origin8.txt" which holds 200 short sequences of width 8. My code is to use each
short sequence from "origin8.txt" as a template to generate 100 short sequences of the same
width and store them in a txt file A. 

Then the code will read 100 short sequences from the txt file A and 100 long sequences of width 200 from a txt file B , and then replaced a substring of each long sequence using each short sequence. This code will lead to two txt files C and D. File C will hold 100 replaced long sequences. 

In other words, I want to input "origin8.txt" to get 200 File D. 

My code can generates 200 File D but each of them holds nothing. So I guess the problem is caused by a failure of passing the data to a subroutine named "make_file". 

Can anybody suggest me how to modify that? Thank you very much in advance!

Sincerely,

     Alex

 

 

 

My code:

 

*******************************************************************

 #!/usr/bin/perl
use strict;
use warnings;
my (@origin, $y);
my $N_Sequences = 100; 
my @Alphabet = split(//,'ACGT');      
my $P_Consensus = 0.85;               # This is the probability of dominant letter
# ====== Globals ==========================
my @Probabilities;                    # Stores the probability of each character


# ====== Program ==========================

open (ORIGIN, "< origin8.txt");       # This file holds 200 sequences used for motif template
chomp (@origin = <ORIGIN>);
close ORIGIN;

for ($y=0; $y<=$#origin; $y++) {
  

    my @Motif = split(//,'$origin[$y]');     # This is a loop to get the motif template from origin8
    open (OUT_NORM, ">short_sequences8_[$y].txt") or die "Unable to open file :$!";
        for (my $i=0; $i < $N_Sequences; $i++) {
            for (my $j=0; $j < scalar(@Motif); $j++) {
                 loadConsensusCharacter($Motif[$j]);    
                 addNoiseToDistribution();             
                 convertToIntervals();
                 print OUT_NORM (getRandomCharacter(rand(1.0)));
                                                     }
            print OUT_NORM "\n";
            make_files();
                                               }
                              }

exit();

# ====== Subroutines =======================
#
sub loadConsensusCharacter {
    my ($char) = @_;
    my $Found = 'FALSE';

    for (my $i=0; $i < scalar(@Alphabet); $i++) {
        if ( $char eq $Alphabet[$i]) {
            $Probabilities[$i] = 1.0;
            $Found = 'TRUE';
        } else {
            $Probabilities[$i] = 0.0;
        }
    }
    if ($Found eq 'FALSE') {
    die("Panic: Motif-Character\"$char\" was not found in Alphabet.
Aborting.\n");
    }

return();
}

# ==========================================
sub addNoiseToDistribution {


    my $P_NonConsensus = ( 1.0-$P_Consensus) / (scalar(@Alphabet) - 1);

    for (my $i=0; $i < scalar(@Probabilities); $i++) {
        if ( $Probabilities[$i] == 1.0 ) {     
            $Probabilities[$i] = $P_Consensus;
        } else {
            $Probabilities[$i] = $P_NonConsensus;
        }
    }

    return();
}

# ==========================================
sub convertToIntervals {

    my $Sum = 0;

    for (my $i=1; $i < scalar(@Probabilities); $i++) {
        $Probabilities[$i] += $Probabilities[$i-1];
    }

    return();
}

# ==========================================
sub getRandomCharacter {

    my ($RandomNumber) = @_;
    my $i=0;
    for ($i=0; $i < scalar(@Probabilities); $i++) {
        if ($Probabilities[$i] > $RandomNumber) { last; }
    }

    return($Alphabet[$i]);
}

# ==========================================
sub make_files {
my (@short, @long,$x,$r, $output_norm);

open (SHORT, "< short_sequences8_[$y].txt");
chomp (@short = <SHORT>);
close SHORT;

open (LONG, "< long_sequences.txt");
chomp (@long = <LONG>);
close LONG;

open (OUT_INITIAL,  "> output8_[$y]1.txt");
open (OUT_REPLACED, "> output8_[$y]2.txt");

for ($x=0; $x<=$#short; $x++) {
  $r=2;
  print OUT_INITIAL ">SeqName$x\n$long[$x]\n";
  print OUT_REPLACED "SeqName$x\n" . substr($long[$x], $r, length $short[$x]) . "\n";}


close OUT_INITIAL;
close OUT_REPLACED;

} 

*******************************************************************

 

 

Input file "origin8.txt" holds 200 sequences as:

 

 TTTATAAT
TGTCAATG
CGTTGATG
CGTCCTAG
GGCTTCCA
ATTAGCCT
GTCCTGAT
TGTAAATC
CGCTTATT
TTGACATA
CCTGATAT
ATGAATCG
CGTCCGAT
TGGCCCAT
ATCCTGAT
TGCCCATT
CCCTAACT
AAAAAAAA
TTTTTTTT
CCCCCCCC
GGGGGGGG
AAAAAAAT
AAAAAAAG
AAAAAAAC
AAAAAACC
AAAAAATT
AAAAAAGG
AAAAAACT
AAAAAACG
AAAAAACA
AAAAACAA
AAAACAAA
AAACAAAA
AACAAAAA
ACAAAAAA
CAAAAAAA
AAAAAATA
AAAAATAA
AAAATAAA
AAATAAAA
AATAAAAA
ATAAAAAA
TAAAAAAA
AAAAAAGA
AAAAAGAA
AAAAGAAA
AAAGAAAA
AAGAAAAA
AGAAAAAA
GAAAAAAA
AAAACCAA
AACCAAAA
CCAAAAAA
AAAATTAA
AATTAAAA
TTAAAAAA
AAAAACCC
AAAACCCA
AAACCCAA
AACCCAAA
ACCCAAAA
CCCAAAAA
AAAAATTT
AAAATTTA
AAATTTAA
AATTTAAA
ATTTAAAA
TTTAAAAA
AAAAAGGG
AAAAGGGA
AAAGGGAA
AAGGGAAA
AGGGAAAA
GGGAAAAA
AAAACCCC
AAACCCCA
AACCCCAA
ACCCCAAA
CCCCAAAA
AAAATTTT
AAATTTTA
AATTTTA A
ATTTTAAA
TTTTAAAA
AAAAGGGG
AAAGGGGA
AAGGGGAA
AGGGGAAA
GGGGAAAA
AAACCCCC
AACCCCCA
ACCCCCAA
CCCCCAAA
AAATTTTT
AATTTTTA
ATTTTTAA
TTTTTAAA
AAAGGGGG
AAGGGGGA
AGGGGGAA
GGGGGAAA
AAGGGGGG
AGGGGGGA
GGGGGGAA
AACCCCCC
ACCCCCCA
CCCCCCAA
AATTTTTT
ATTTTTTA
TTTTTTAA
ATTTTTTT
TTTTTTTA
ACCCCCCC
CCCCCCCA
AGGGGGGG
GGGGGGGA
ATTTTTTT
TTTTTTTA
ATAAAATA
AATAAATA
AAATAATA
AAAATATA
ACAAAACA
AACAAACA
AAACAACA
AAAACACA
AGAAAAGA
AAGAAAGA
AAAGAAGA
AAAAGAGA
ATAAAAGA
ATAAAACA
AGAAAATA
AGAAAACA
ACAAAAGA
ACAAAATA
ATTAAATA
AATTAATA
AAATTATA
ACCAAACA
AACCAACA
AAACCACA
AGGAAAGA
AAGGAAGA
AAAGGAGA
ATTTAATA
AATTTATA
ACCCAACA
AACCCACA
AGGGAAGA
AAGGGAGA
ATTTAACA
ATTTAAGA
AATTTACA
AATTTAGA
ACCCAATA
ACCCAAGA
AACCCATA
AACCCAGA
AGGGAACA
AGGGAATA
AAGGGATA
AAGGGACA
TTGGGACA
CCGGGACA< BR>AGAAGGGA
TGCCCATA
TAAAAAAT
TGCCTATA
CCGTAGTC
ACTTGACT
CTGATCCC
TGTGACTA
CCTGATCC
CCTGAACC
TGATCACG
GGGTAACC
CTTTTGAA
TTGTATGA
CCTGATAA
CTGGTTAG
CCCCGACC
TTGGGGAC
GGTTTGAC
GCTTAGAC
GTTACACC
TTGTACCA
TGGTACCA
CCGTACAT
CCCTTGCC
GTGTTGGT
ATCGATCG
ACGTACGT
TCAGTCAG
GCTATACG
GTCCATAC
CCGTCCGT
ATATATCC
GTGTCCCC 


---------------------------------
Yahoo! Mail for Mobile
Take Yahoo! Mail with you! Check email on your mobile phone.

__________________________________________________
Do You Yahoo!?
Tired of spam?  Yahoo! Mail has the best spam protection around 
http://mail.yahoo.com 


More information about the Bioperl-l mailing list