Statistical Learning in Computational Biology


Recent advances in high-throughput technologies have led to an exponential increase in the amount of biological data (such as genomic, epigenomic, and proteomic data). Extracting meaningful insights from such large data collections requires efficient statistical learning methods. The group "Statistical Learning in Computational Biology" was established in January 2013 at the Department of Computational Biology and Applied Algorithmics and is headed by Dr. Nico Pfeifer. We develop and apply new machine learning and statistical learning methods to solve problems in computational biology and to answer new biological questions. Our earlier work focused on proteomic data; our current focus is on epigenomic and genomic data.

Application areas include the study of viruses such as HIV, hepatitis C virus, and influenza, as well as the field of epigenetics. Methodologically, we are interested in:

  • integration of heterogeneous data sets
  • improving interpretability of non-linear estimators
  • efficient learning methods for large data sets

All of these topics fall under the heading "Machine learning for precision medicine".