[Bioperl-l] Re: [Gmod-gbrowse] Adding human chromosomes as reference sequences

Ilari Scheinin ilari.scheinin at helsinki.fi
Wed Jul 20 05:37:36 EDT 2005


On 19.7.2005, at 19:43, Lincoln Stein wrote:
> I'm sorry that the ucsc_genes2gff.pl script isn't loading the chromosome
> extents; We just need a similar script called ucsc_chromosomes2gff.pl or
> something similar. Ilari, since you've already essentially done this, perhaps
> you'd be willing to contribute the script? I'll add it to bioperl.

Actually, I wrote my script in PHP, because I don't really know much 
Perl. I only recently wanted to use gbrowse and installed Bioperl for 
that reason. I have started teaching myself some Perl, but I think I'm 
more at the "Hello world" stage than the chromosomes2gff stage.

Anyway, the chromInfo.txt file from UCSC is just a tab-delimited file 
where the first field is the name of the chromosome and the second 
field contains the number of bases. So it is really simple to write a 
chromosomes2gff script.
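
To illustrate the conversion described above, here is a minimal sketch in 
Python (not the Perl script Lincoln asked for). It assumes the usual 
nine-column GFF layout read by Bio::DB::GFF, with the group column written 
as "class name". The source 'assembly', method 'chromosome' and class 
'chromosome' are the same values my PHP script uses. The function name 
chrominfo_to_gff is made up for this example.

```python
import sys


def chrominfo_to_gff(lines, source="assembly", method="chromosome",
                     group_class="chromosome"):
    """Yield one GFF line per chromosome from chromInfo.txt-style rows.

    Only the first two tab-separated fields (name, length in bases) are
    used; any further fields on a row are ignored.
    """
    for line in lines:
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) < 2 or not fields[0]:
            continue  # skip blank or malformed rows
        name, length = fields[0], int(fields[1])
        # Columns: ref, source, method, start, stop, score, strand, phase, group
        yield "\t".join([name, source, method, "1", str(length),
                         ".", ".", ".", f"{group_class} {name}"])


if __name__ == "__main__":
    if len(sys.argv) != 2:
        sys.exit(f"Usage: {sys.argv[0]} <path to chromInfo.txt>")
    with open(sys.argv[1]) as handle:
        for gff_line in chrominfo_to_gff(handle):
            print(gff_line)
```

Redirecting the output to a file should give something that can be loaded 
together with the gene features, assuming the loader accepts this group 
syntax.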

If someone is interested, here is the PHP script I used. It doesn't 
convert the chromosome info to a GFF file, but loads the data directly 
into a MySQL database. It is a very naive script: it doesn't check 
whether it can actually read the provided file or access the database, 
or whether some of the data already exists. It doesn't touch the fbin 
column of the fdata table, because I have no idea what it is for. It is 
not mentioned in the perldoc for Bio::DB::GFF::Adaptor::dbi::mysql.

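#!/usr/bin/php -f
<?php
    // Connection settings: fill these in before running.
    $host = "";
    $db = "";
    $user = "";
    $pass = "";

    $file = isset($argv[1]) ? $argv[1] : "";
    if (!$file) {
        echo "Usage: $argv[0] <path to chromInfo.txt>\n";
        exit(1);
    }
    $con = mysql_connect($host, $user, $pass);
    mysql_select_db($db, $con);
    // One feature type shared by all chromosomes.
    mysql_query("insert into ftype (fmethod, fsource) values ('chromosome', 'assembly')", $con);
    $ftypeid = mysql_insert_id($con);
    $fp = fopen($file, "r");
    $count = 0;
    while (($line = fgets($fp)) !== false) {
        // Fields: chromosome name, number of bases (anything after that is ignored).
        $fields = explode("\t", rtrim($line, "\r\n"));
        if (count($fields) < 2) {
            continue; // skip blank or malformed lines
        }
        $name = mysql_real_escape_string($fields[0], $con);
        $length = (int) $fields[1];
        // One group and one feature per chromosome, spanning 1..length.
        mysql_query("insert into fgroup (gclass, gname) values ('chromosome', '$name')", $con);
        $gid = mysql_insert_id($con);
        mysql_query("insert into fdata (fref, fstart, fstop, ftypeid, gid) values ('$name', 1, $length, $ftypeid, $gid)", $con);
        $count++;
    }
    fclose($fp);
    mysql_close($con);
    echo "Added $count entries.\n";
?>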

Ilari