CMPE 280B Spring 2003 Home Page

Bioinformatics Research Seminar


This course is a weekly research seminar that assumes that students have substantial background in biology, chemistry, computer science, or statistics.

Room:
Social Science 2, room 137
Time:
10:00-11:45 Thursdays

The seminar will consist of student presentations, primarily from their own research, though students who wish to take the seminar but have no research of their own to report may present papers from the literature. I would like a title and abstract from each presenter at least a week ahead of time to put on this web page.

Evaluation will be based on the student presentation and on attendance and participation at other students' presentations.


Tentative schedule

3 April 2003 Administrative details, choosing dates
Firas Khatib
Knotfind

We present in this paper Knotfind, a computer program for determining if a given protein has a knot in it. When given a pdb file, Knotfind iteratively steps through each chain in the file and outputs whether or not the chain has a knot in it. For each chain that has a knot, the residue numbers and the corresponding xyz coordinates that form the knot are given. Knots are very rare in proteins: of the 2000+ proteins in the PDB, only 11 are known to have knots. Because of this, protein structure prediction programs such as Rosetta should be expected to rarely produce knots in their predictions. Upon running Knotfind on the CASP 5 predictions for Rosetta, a significant number of Rosetta decoys were determined to have knots. Accordingly, the Knotfind algorithm would be ideal for incorporation into the Rosetta algorithm.
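
As a rough illustration of the first step the abstract describes (stepping through each chain of a PDB file), the following Python sketch collects C-alpha coordinates per chain from the fixed-column ATOM records. The function name `ca_trace_by_chain` is hypothetical, and using only C-alpha atoms and only the first model is an assumption. The knot test itself is not shown, because the abstract does not describe how Knotfind performs it.

```python
from collections import OrderedDict


def ca_trace_by_chain(pdb_text):
    """Return {chain_id: [(residue_number, (x, y, z)), ...]} for C-alpha atoms.

    Hypothetical helper for illustration; not part of Knotfind.
    """
    chains = OrderedDict()
    for line in pdb_text.splitlines():
        if line.startswith("ENDMDL"):
            break  # assumption: keep only the first model of a multi-model file
        if not line.startswith("ATOM"):
            continue  # skip HETATM, TER, and header records
        if line[12:16].strip() != "CA":
            continue  # atom name occupies columns 13-16
        if line[16] not in (" ", "A"):
            continue  # keep only the first alternate location (column 17)
        chain_id = line[21]            # column 22
        res_num = int(line[22:26])     # columns 23-26
        xyz = (
            float(line[30:38]),        # x, columns 31-38
            float(line[38:46]),        # y, columns 39-46
            float(line[46:54]),        # z, columns 47-54
        )
        chains.setdefault(chain_id, []).append((res_num, xyz))
    return chains
```

Each chain's list of coordinates forms the backbone trace on which a knot-detection routine would then operate.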

10 April 2003
Sareina Wu
Prediction of the Function of UNC5H1 at the Synapses: a Bioinformatic Approach

Testing whether a protein is involved in a particular biological task is extremely difficult without prior clues. In vertebrates, the UNC5Hs (H1, H2, and H3) are repulsive receptors for the bi-functional guidance cue Netrin-1, which guides neurons or axons to their final destinations during neuronal development. How the UNC5Hs specify the Netrin signal remains unknown. In my previous studies, two PDZ-domain-containing proteins, PICK1 and GIPC, were identified as UNC5H1 interacting partners. The association of UNC5H1 with PICK1 in the synapses of hippocampal neurons suggests a novel role for UNC5H1 in the synapse. A bioinformatic search for more synaptic proteins that interact with UNC5H1 may make it easier to reveal that role. Toward this final goal, I report here an initial attempt to find potential PDZ-containing proteins that might interact with UNC5H1. By homology modeling, I identified a common motif present in the PDZ domains of both PICK1 and GIPC that might be responsible for interacting with UNC5H1. This motif will provide valuable information for finding more PDZ-containing candidates that bind UNC5H1 at the synapse.

17 April 2003 (Kevin out of town)

24 April 2003
George Shackelford
Using the MDI algorithm for generating a hidden Markov model for secondary protein structure prediction

Hidden Markov models have played an important part in bioinformatics. However, with the exception of HMMSTR, they have not been useful in secondary protein structure prediction. Part of the challenge is in developing a good topology for the HMM.

This is an attempt to build an HMM for such prediction. An algorithm based on the MDI algorithm combines states, both to generate the topology and to set the values of the parameters. The MDI algorithm uses the change in relative entropy that results from combining states to ensure that the combinations reflect the original distribution of the primary model.

Results of tests using a culled Dunbrack set of protein sequences are presented.
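
To make the "change in relative entropy resulting from combining states" concrete, here is a minimal Python sketch of one way such a quantity could be computed for two states' emission distributions. It is not the MDI algorithm itself: the function names, the count-weighted averaging used to merge two states, and the use of a weighted sum of divergences as the merge cost are all assumptions made for illustration.

```python
import math


def kl_divergence(p, q):
    """Relative entropy D(p || q) in bits; assumes q[i] > 0 wherever p[i] > 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def merge_states(p, q, weight_p, weight_q):
    """Assumed merge rule: average the two emission distributions,
    weighted by how often each state was visited (both weights positive)."""
    total = weight_p + weight_q
    return [(weight_p * pi + weight_q * qi) / total for pi, qi in zip(p, q)]


def merge_cost(p, q, weight_p, weight_q):
    """Assumed cost: weighted relative entropy from each original state
    to the merged state. Zero when the two states emit identically."""
    merged = merge_states(p, q, weight_p, weight_q)
    return (weight_p * kl_divergence(p, merged)
            + weight_q * kl_divergence(q, merged))
```

Under these assumptions, a greedy procedure would repeatedly combine the pair of states with the smallest cost, so that states with similar emission distributions are merged first.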

1 May 2003
Newton Der
Computer Graphics and Visualization of Protein Structures

Biological data is often too large and complex for a scientist to read and draw conclusions from. Representing the data graphically lets a person gain insight more quickly and easily, and computer graphics has proven to be a valuable tool in this regard. This talk surveys various computer graphics techniques for visualizing biosequence data and molecular dynamics.

8 May 2003
informal discussion of protein research opportunities

15 May 2003 (Bike to Work Day)
Jonathan Casper

22 May 2003
Matt Weirauch

29 May 2003
Two talks:

Josue Samayoa
Incorporating Different Types of NMR Constraints into Rosetta ab initio Structure Prediction

Leon Xing
Something Interesting About Influenza Virus

Influenza virus has afflicted humans for hundreds of years, if not longer, and it continues to do so year after year. Numerous studies of the virus have been carried out over the years. As a result, a huge collection of information on it is available for us to examine.

In this talk, I'll present some interesting observations on this virus from sequence analysis and structure comparison. As we will see from the discussion, many important biological questions are related to this virus. These questions concern how the virus survives and evolves against the human immune system, and how effective vaccines can be developed against it. After all, the dynamics of influenza invasion and replication are what make flu unique among viruses.

5 June 2003
Jenny Draper
An Investigation of SVM Kernels for Predicting Protein Secondary Structure

Protein secondary structure is an obvious target for prediction by machine learning, as certain residues and patterns of residues in proteins have definite preferences for specific structural states. Currently, the state of the art in such prediction is held by methods that use a multiple alignment of the target sequence as input to a neural network and achieve an accuracy above 80%. As far as I know, single-sequence prediction has not been improved much beyond use of the Chou-Fasman rules, which achieve approximately 60% accuracy.

In this talk, I will review methods for predicting the secondary structure of a single amino acid sequence using a Support Vector Machine (SVM). While these methods do not bring single-sequence prediction accuracy beyond the 60% barrier, the approach is nonetheless interesting, quite extendable, and has potential. I will provide a simple overview of Support Vector Machines and kernels, describe the software (developed by Ryan Weber) and the methodology used for this project, and describe attempts to improve prediction with both old and new kernels, along with their respective performance in accurately predicting secondary structure. I will also discuss possible future extensions and improvements.
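
For readers unfamiliar with kernels, the following Python sketch shows the general idea on a single sequence: each residue is described by a one-hot encoding of the residues in a window around it, and a kernel function scores the similarity of two such windows. This is not the software mentioned in the abstract. The window encoding, the function names, and the default parameter values are all assumptions chosen for illustration.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def encode_window(sequence, center, half_width=3):
    """One-hot encode the residues in a window around `center`.

    Positions that fall off either end of the sequence, and residues
    outside the 20 standard letters, are encoded as all zeros.
    """
    features = []
    for pos in range(center - half_width, center + half_width + 1):
        one_hot = [0.0] * len(AMINO_ACIDS)
        if 0 <= pos < len(sequence):
            idx = AMINO_ACIDS.find(sequence[pos])
            if idx >= 0:
                one_hot[idx] = 1.0
        features.extend(one_hot)
    return features


def linear_kernel(x, y):
    """Dot product: here, the number of window positions with matching residues."""
    return sum(a * b for a, b in zip(x, y))


def polynomial_kernel(x, y, degree=2, coef0=1.0):
    """Polynomial kernel; implicitly includes products of position matches."""
    return (linear_kernel(x, y) + coef0) ** degree


def rbf_kernel(x, y, gamma=0.1):
    """Gaussian (RBF) kernel; equals 1 for identical windows, decays with distance."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)
```

An SVM trained with any of these kernels would assign each window to a structural state; changing the kernel changes which patterns of residues the classifier can separate.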




Questions about page content should be directed to

Kevin Karplus
Biomolecular Engineering
University of California, Santa Cruz
Santa Cruz, CA 95064
USA
karplus@soe.ucsc.edu
1-831-459-4250
318 Physical Sciences Building