Z1: Bioinformatics core - semantic knowledge fusion and non-standard optimization - Next-generation sequencing core
The Z1 project, which will continue to offer next-generation sequencing services, will serve as the central bioinformatics support for all participating scientists of the CRC. Besides its function as a core sequencing and bioinformatics unit, the Z1 project aims to use the high-dimensional data sets generated by the CRC projects to develop novel data mining methods, which are paramount for analyzing high-dimensional data.
The aim of this project is the development of parallel algorithms for data mining of next-generation sequencing data. Based on our parallel distributed computing platform, we will develop new methods for clustering, classification, and feature selection. Here, we will focus on the integration of semantic information into knowledge-driven data mining strategies. This semantic integration process will be used to develop interpretable diagnostic models by adapting purely data-driven learning algorithms in order to prevent possible model overfitting. Additionally, the semantic knowledge will increase the interpretability of the chosen models.
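One simple way to picture knowledge-driven feature selection is to score predefined gene sets instead of individual genes, so that the candidate features are limited to annotated, interpretable groups. The following is a minimal sketch under that assumption, not the project's actual method. The function name `gene_set_scores`, the gene and pathway names, and the expression values are all hypothetical.

```python
from statistics import mean


def gene_set_scores(expression, labels, gene_sets):
    """Score each gene set by how strongly its average expression
    differs between two classes (labels 0 and 1).

    expression: dict mapping gene name -> list of values, one per sample
    labels:     list of 0/1 class labels, one per sample
    gene_sets:  dict mapping set name -> list of gene names
                (stands in for semantic annotation, e.g. pathways)
    """
    scores = {}
    n_samples = len(labels)
    for name, genes in gene_sets.items():
        # Only genes that were actually measured can contribute.
        present = [g for g in genes if g in expression]
        if not present:
            continue
        # Collapse the gene set to one value per sample.
        per_sample = [
            mean(expression[g][i] for g in present) for i in range(n_samples)
        ]
        group0 = [v for v, y in zip(per_sample, labels) if y == 0]
        group1 = [v for v, y in zip(per_sample, labels) if y == 1]
        scores[name] = abs(mean(group1) - mean(group0))
    return scores


# Hypothetical toy data: four samples, two per class.
expression = {
    "geneA": [1.0, 1.2, 3.0, 3.1],
    "geneB": [0.9, 1.1, 2.9, 3.2],
    "geneC": [2.0, 2.1, 2.0, 1.9],
    "geneD": [1.5, 1.4, 1.6, 1.5],
}
labels = [0, 0, 1, 1]
gene_sets = {
    "pathway_1": ["geneA", "geneB"],
    "pathway_2": ["geneC", "geneD"],
}

scores = gene_set_scores(expression, labels, gene_sets)
top_set = max(scores, key=scores.get)
```

Because the model can only choose among a small number of annotated sets, it has fewer degrees of freedom than a purely gene-level selection, and each selected feature carries a biological label.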
Furthermore, we will investigate population-based non-standard optimization procedures, such as swarm algorithms, for clustering and feature selection in next-generation sequencing data. By design, these methods offer implicit parallelization on multi-core computers.
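To illustrate the population-based idea, here is a minimal sketch of particle swarm optimization, one common swarm algorithm, applied to clustering. It is an illustration only and not the project's implementation. The function names `sse` and `pso_cluster`, the parameter defaults, and the toy data are assumptions. Each particle encodes a complete set of cluster centroids, and the swarm searches for the set with the smallest sum of squared distances.

```python
import random


def sse(centroids, points):
    """Sum of squared distances from each point to its nearest centroid."""
    return sum(
        min(sum((x - c) ** 2 for x, c in zip(pt, cen)) for cen in centroids)
        for pt in points
    )


def pso_cluster(points, k, n_particles=20, n_iter=100,
                w=0.7, c1=1.5, c2=1.5, seed=0):
    """Search for k centroids with a basic global-best particle swarm."""
    rng = random.Random(seed)
    dim = len(points[0])

    def decode(pos):
        # A particle position is a flat list of k * dim coordinates.
        return [pos[i * dim:(i + 1) * dim] for i in range(k)]

    # Initialise every particle from k randomly chosen data points.
    swarm = []
    for _ in range(n_particles):
        pos = [x for pt in rng.sample(points, k) for x in pt]
        swarm.append({
            "pos": pos,
            "vel": [0.0] * len(pos),
            "best_pos": pos[:],
            "best_fit": sse(decode(pos), points),
        })

    leader = min(swarm, key=lambda p: p["best_fit"])
    g_pos, g_fit = leader["best_pos"][:], leader["best_fit"]
    history = [g_fit]

    for _ in range(n_iter):
        # Move each particle towards its own best and the swarm's best.
        for p in swarm:
            for j in range(len(p["pos"])):
                r1, r2 = rng.random(), rng.random()
                p["vel"][j] = (w * p["vel"][j]
                               + c1 * r1 * (p["best_pos"][j] - p["pos"][j])
                               + c2 * r2 * (g_pos[j] - p["pos"][j]))
                p["pos"][j] += p["vel"][j]

        # Fitness evaluations are independent of each other; this is the
        # step that could be spread across cores (done sequentially here).
        fits = [sse(decode(p["pos"]), points) for p in swarm]

        # Update personal bests and the global best.
        for p, fit in zip(swarm, fits):
            if fit < p["best_fit"]:
                p["best_fit"], p["best_pos"] = fit, p["pos"][:]
            if fit < g_fit:
                g_fit, g_pos = fit, p["pos"][:]
        history.append(g_fit)

    return decode(g_pos), g_fit, history


# Hypothetical toy data: two well-separated groups in two dimensions.
points = [
    (0.0, 0.0), (0.5, 0.2), (-0.3, 0.4), (0.1, -0.6), (-0.4, -0.2),
    (10.0, 10.0), (10.4, 9.7), (9.6, 10.3), (10.2, 10.5), (9.8, 9.5),
]
centroids, best_fit, history = pso_cluster(points, k=2)
```

The returned `history` records the best objective value after each iteration and never increases, because the global best is only replaced by a strictly better solution.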
Visualization of next-generation sequencing data is a key task of applied bioinformatics. It enables researchers to capture significant changes in genomic profiles. We will investigate visualization methods for displaying mutations shared across cohorts and for characterizing specific mutations in personalized medicine.