This page contains information and downloads for Computational Science 670, Fall 2006.

 

Explore the website that describes BLAST, a software package for comparing biostrings. In particular, try to make sense of this page.


Readings
  1. NRM Viral Metagenomics
  2. Phage Proteomic Tree
  3. Substitution Matrices
  4. A much more readable article on the BLOSUM62 scoring matrix.
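
As a companion to the substitution-matrix readings, here is a minimal sketch of how a scoring matrix can be used to score an ungapped alignment of two equal-length sequences. The three-letter matrix, the names TOY_MATRIX, pair_score and alignment_score, and the example sequences are all hypothetical and chosen only for illustration. The values are intended to mirror the corresponding BLOSUM62 entries, but check them against the published table before relying on them. This is not how BLAST itself is implemented.

```python
# Toy substitution matrix: a hypothetical three-letter subset for illustration.
# The values are meant to mirror BLOSUM62 entries but should be checked
# against the published table before use.
TOY_MATRIX = {
    ("A", "A"): 4, ("A", "R"): -1, ("A", "W"): -3,
    ("R", "R"): 5, ("R", "W"): -3,
    ("W", "W"): 11,
}


def pair_score(a, b, matrix=TOY_MATRIX):
    """Look up the score for one aligned pair; the matrix is symmetric."""
    if (a, b) in matrix:
        return matrix[(a, b)]
    return matrix[(b, a)]


def alignment_score(seq1, seq2, matrix=TOY_MATRIX):
    """Sum pair scores over an ungapped alignment of equal-length sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("ungapped scoring needs sequences of equal length")
    return sum(pair_score(a, b, matrix) for a, b in zip(seq1, seq2))


if __name__ == "__main__":
    print(alignment_score("ARW", "ARW"))  # identical sequences: 4 + 5 + 11
    print(alignment_score("ARW", "RAW"))  # two mismatches: -1 + -1 + 11
```

Identical residues score highest, and rarer residues (W here) score higher than common ones, which is the intuition the readings develop in detail.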

Data

A description of how the distances file was obtained can be found here.

A zip file of MATLAB visualization tools for the data is here. After unzipping, run "GetStarted". The .mat files appear corrupt on some platforms because of a new MATLAB "feature"; if that happens, run SparseVersion_maker and phagetoproteins_maker to recreate the .mat files for the platform on which you are running MATLAB. These two scripts (as well as the 3 text files listed above) are now included in the .zip package.

A zip file of Perl resources is here. It contains two files. The first is a Perl module (Comp670.pm) for reading the protein distance, length, and proteins.txt files. The second is a simple demo program showing how to get data out of the module.

To find the amino acid sequence of any protein in the data, you can go to this link: http://phage.sdsu.edu/~rob/cgi-bin/phage.cgi. Alternatively, you can find it in the fourth data file above.

How to make a treefile