I532
Fall 2004
Lecture: 2:30pm-4:20pm M JH248
Lab: 9:30am-10:45am W I109
Office
227 Informatics building
901 10th street
Office Hours (Tentative)
4:30 - 5:30 pm (Monday) 248 Jordan Hall
1:30 - 2:30 pm (Thursday) 227 Informatics
or by appointment
Course Outline
The goal of L529 is to teach issues in
the design and implementation of
bioinformatics tools and systems. The student will pursue
a term project that either explores a new computational
tool or develops an information system for large scale
biological data mining.
The course will be divided into four parts:
study of probabilistic and algorithmic techniques in biological sequence analysis,
study of issues in the design and implementation of information systems for
biological research,
literature survey on recent developments in bioinformatics, and
a term project
Text Books
(REQUIRED)
Biological Sequence Analysis
Durbin, Eddy, Krogh, and Mitchison
0-521-62971-3,
Cambridge University Press
(OPTIONAL)
Mastering Algorithms with Perl
Orwant, Hietaniemi, and MacDonald
1-56592-398-7, O'Reilly
LEDA : A Platform for Combinatorial and Geometric Computing
by Kurt Mehlhorn, Stefan Näher
The Boost Graph Library User Guide and Reference Manual
by Lie-Quan Lee, Andrew Lumsdaine, Jeremy G. Siek
Machine Learning
Thomas M. Mitchell
ISBN: 0070428077
1997
McGraw-Hill Companies,
Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations
Ian H. Witten, Eibe Frank
ISBN: 1558605525
1999 Elsevier Science & Technology
Prerequisites
Bioinformatics
Knowledge equivalent to L519.
here
Computer Science
Knowledge equivalent to C343/A594 (Data Structures)
see here
Biology
Some understanding of biology.
Test yourself on terms such as translation, transcription,
promoters, exon, intron, splicing,
SNP, gene expression profile, transcription factors,
hybridization, codon, centromere,
TATA box, alpha-helix, beta-sheet, etc.
Statistics/Probability
Undergraduate knowledge of Statistics/Probability
Grading
Exam, Quiz (30%): A programming assignment serving as an exam
will be given sometime in October.
Homework (20%):
Presentation of papers (10%):
We will collect a set of recent papers on bioinformatics.
By the end of October, we will discuss which topics
the collection of papers should cover.
Once we finalize the paper collection,
starting in early November, teams of two students will present
papers of their choice, focusing on the technical aspects
of each paper.
You may choose papers closely related to your term
or thesis projects if everyone agrees.
Term project (40%):
The topic for your term project needs to be selected by
the end of October. There will be two presentations and/or written reports,
one in the middle of the semester and another at the end of
the semester. The first should include
the motivation and overview of the project. The second
should describe the design and implementation
of your project and provide some results
with biological interpretation.
Lecture Schedule
We will cover the main textbook,
Biological Sequence Analysis, as much as possible.
Topics not included in the text will be discussed through a collection of papers.
Everyone should read all research papers, including the
papers you are not presenting.
Resources
IBM SP or Solar
Anyone who does not have an account can request one
at
http://www.indiana.edu/~rats/application.shtml
Be sure to mention that you are requesting an account for L529.
Linux machine at Informatics: biokdd and dna
Contact Scott Martin at
sccmarti@indiana.edu
We will support algorithmic packages for Perl, and the LEDA and Boost libraries.
Others, including JDSL, may be supported.
Oncourse for L529
Course homepage at http://bio.informatics.indiana.edu/L529/
Teamwork
This will be a challenging course. Students are encouraged to help each
other, as we did in L519, throughout the semester.
Course Related Links
homework
exams
lab
reading assignment
Useful Links
Microarray page at UIUC
follow "Links" on the left, and then select "All about microarray".
SCOP
LEDA
WEKA
GraphViz
BIO-IT
Perl for Bioinformatics
Dirichlet Mixtures and other Regularizers at UCSC
a link to genetic networks
A course on algorithms in bioinformatics
Statistical Methods in Computational Genetics