I532
Fall 2004

Lecture: 2:30pm-4:20pm M JH248

Lab: 9:30am-10:45am W I109


Instructor: Sun Kim

Office

227 Informatics building
901 10th street

Office Hours (Tentative)

4:30 - 5:30 pm (Monday) 248 Jordan Hall
1:30 - 2:30 pm (Thursday) 227 Informatics
or by appointment

Course Outline

The goal of L529 is to teach issues in the design and implementation of bioinformatics tools and systems. Each student will pursue a term project that either explores a new computational tool or develops an information system for large-scale biological data mining.
The course will be divided into four parts:
  • study of probabilistic and algorithmic techniques in biological sequence analysis,
  • study of issues in the design and implementation of information systems for biological research,
  • literature survey on recent developments in bioinformatics, and
  • a term project

    Text Books

    (REQUIRED)
  • Biological Sequence Analysis
    Durbin, Eddy, Krogh, and Mitchison
    0-521-62971-3, Cambridge University Press

    (OPTIONAL)
  • Mastering Algorithms with Perl
    Orwant, Hietaniemi, and MacDonald
    1-56592-398-7, O'Reilly
  • LEDA: A Platform for Combinatorial and Geometric Computing
    Kurt Mehlhorn and Stefan Näher
  • The Boost Graph Library: User Guide and Reference Manual
    Jeremy G. Siek, Lie-Quan Lee, and Andrew Lumsdaine
  • Machine Learning
    Thomas M. Mitchell
    ISBN: 0070428077, 1997, McGraw-Hill Companies
  • Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations
    Ian H. Witten, Eibe Frank
    ISBN: 1558605525, 1999, Elsevier Science & Technology

    Prerequisites

  • Bioinformatics
    Knowledge equivalent to L519.
  • Computer Science
    Knowledge equivalent to C343/A594 (Data Structures).
  • Biology
    Some understanding of biology. You should be familiar with terms such as translation, transcription, promoter, exon, intron, splicing, SNP, gene expression profile, transcription factor, hybridization, codon, centromere, TATA box, alpha-helix, beta-sheet, etc.
  • Statistics/Probability
    Undergraduate knowledge of Statistics/Probability

    Grading

  • Exam, Quiz (30%): A programming assignment will be given as an exam sometime in October.
  • Homework (20%)
  • Presentation of papers (10%): We will collect a set of recent papers on bioinformatics. By the end of October, we will discuss which topics the collection should cover. Once the paper collection is finalized, teams of two students will present papers of their choice starting in early November, focusing on the technical aspects of each paper. You may choose papers closely related to your term or thesis project if everyone agrees.
  • Term project (40%): The topic for your term project must be selected by the end of October. There will be two presentations and/or written reports, one in the middle of the semester and another at the end of the semester. The first should include the motivation and an overview of the project. The second should describe the design and implementation of your project and provide some results with biological interpretation.


    Lecture Schedule

  • We will cover as much of the main textbook, Biological Sequence Analysis, as possible.
  • Topics not included in the text will be discussed through a collection of papers. Everyone should read all research papers, including the papers you are not presenting.

    Resources

  • IBM SP or Solar
    Anyone who does not have an account can request one at http://www.indiana.edu/~rats/application.shtml
    Be sure to mention that you are requesting an account for L529.
  • Linux machine at Informatics: biokdd and dna
    Contact Scott Martin at
    sccmarti@indiana.edu
  • We will support algorithmic packages for Perl, as well as the LEDA and Boost libraries.
    Others, including JDSL, may be supported.
  • Oncourse for L529
  • Course homepage at http://bio.informatics.indiana.edu/L529/

    Teamwork

    This will be a challenging course. Students are encouraged to help each other, as we did in L519, throughout the semester.

    Course Related Links

  • homework
  • exams
  • lab
  • reading assignment

    Useful Links

  • Microarray page at UIUC
    Follow "Links" on the left, and then select "All about microarray".
  • SCOP
  • LEDA
  • WEKA
  • GraphViz
  • BIO-IT
  • Perl for Bioinformatics
  • Dirichlet Mixtures and other Regularizers at UCSC
  • a link to genetic networks
  • A course on algorithms in bioinformatics
  • Statistical Methods in Computational Genetics