Dr. rer. nat. Dennis Dobler

Institut für Statistik

»Nonparametric inference procedures for multi-state Markovian models with applications to incomplete life science data«

The dissertation of Dr. Dennis Dobler addresses the development of adequate statistical inference methods in modern survival analysis. Time-to-event data are usually analyzed in classical survival models (such as the Cox model), which completely ignore the different stages (e.g. of a disease) that patients pass through before an absorbing event (e.g. death) occurs. In recent years, however, this practice has begun to change. In particular, the focus has shifted towards specific Markov models for complex time-to-event data, such as competing risks models or more general multi-state models. These models make it possible to model and analyze various disease courses and classifications. Apart from the adequate handling of incomplete observations, a major difficulty is that medical applications typically require information on how the patients’ condition develops over time. Pointwise confidence intervals at a few single points in time may be easy to derive, but they are too short-sighted: covering a longer time span requires an adequate adjustment for the larger estimation uncertainty involved. This observation leads to the time-dynamic estimation of so-called state transition probabilities and hazards, and to related, tailored resampling procedures.

This is exactly the starting point of Dr. Dobler’s dissertation. He developed an extensive theory for the construction of adequate time-dynamic inference procedures (such as confidence bands) for such state transition probabilities. He not only obtained numerous results in this area, all of which involved formidable technical complexity; the newly developed methods also go beyond the results in the state-of-the-art literature and are highly relevant for medical applications.

For instance, the question of the validity of the so-called “weird bootstrap” in survival analysis, open since 1993, has been answered positively, and this procedure has even been embedded in a more general framework of resampling techniques. This validity result is all the more important because the weird bootstrap had already been implemented and used in statistical programming languages. Furthermore, the results explain why, in some sense, the weird bootstrap is more natural than the commonly used standard method.

The general idea of the weird bootstrap is to replace the actually observed state transitions by randomly generated numbers of transitions, where the data are generated using different data-dependent distributions at different points in time. Simulating a large number of such so-called weird bootstrap processes adequately reproduces the asymptotic distributions (i.e. as the sample size increases) of function-valued state transition probability estimators and allows the estimation uncertainty involved to be evaluated simultaneously over time.
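The resampling idea described above can be illustrated with a minimal sketch. It is not the dissertation's implementation. It assumes the commonly described form of the weird bootstrap, in which the number of events at each observed event time is redrawn from a binomial distribution with the number at risk as size and the observed event fraction as success probability. For simplicity it uses a plain survival setting with the Nelson-Aalen cumulative hazard estimator and an unweighted, equal-width band, rather than the transition probability estimators treated in the dissertation. The function names and the toy data are hypothetical.

```python
import math
import random


def nelson_aalen(times, events):
    """Nelson-Aalen estimate from right-censored data.

    times:  observed times; events: 1 = event, 0 = censored.
    Returns (event_times, at_risk, n_events, cum_hazard).
    """
    data = sorted(zip(times, events))
    n = len(data)
    event_times, at_risk, n_events = [], [], []
    i = 0
    while i < n:
        t, d, j = data[i][0], 0, i
        while j < n and data[j][0] == t:  # collect ties at time t
            d += data[j][1]
            j += 1
        if d > 0:
            event_times.append(t)
            at_risk.append(n - i)  # individuals with observed time >= t
            n_events.append(d)
        i = j
    cum_hazard, total = [], 0.0
    for y, d in zip(at_risk, n_events):
        total += d / y
        cum_hazard.append(total)
    return event_times, at_risk, n_events, cum_hazard


def weird_bootstrap_path(at_risk, n_events, rng):
    """One resampled cumulative hazard path.

    Assumption: at each event time the event count is drawn from
    Binomial(at_risk, n_events / at_risk), independently over time.
    """
    path, total = [], 0.0
    for y, d in zip(at_risk, n_events):
        p = d / y
        d_star = sum(rng.random() < p for _ in range(y))  # binomial draw
        total += d_star / y
        path.append(total)
    return path


def sup_band_halfwidth(cum_hazard, at_risk, n_events,
                       n_boot=1000, level=0.95, seed=1):
    """Empirical `level` quantile of sup_t |A*(t) - A(t)| over resamples."""
    rng = random.Random(seed)
    sups = []
    for _ in range(n_boot):
        path = weird_bootstrap_path(at_risk, n_events, rng)
        sups.append(max(abs(a_star - a) for a_star, a in zip(path, cum_hazard)))
    sups.sort()
    return sups[max(0, math.ceil(level * n_boot) - 1)]


# Toy data, invented purely for illustration.
times = [2, 3, 3, 5, 7, 8, 11, 12, 15, 20]
events = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
event_times, at_risk, n_events, cum_hazard = nelson_aalen(times, events)
half_width = sup_band_halfwidth(cum_hazard, at_risk, n_events, n_boot=500)
for t, a in zip(event_times, cum_hazard):
    print(t, round(max(0.0, a - half_width), 3), round(a + half_width, 3))
```

Because the half-width is taken from the distribution of the supremum over all event times, the resulting limits are meant to hold simultaneously over time. In practice, weighted or transformed bands are usually preferred over this equal-width version, and with a sample as small as the toy data the asymptotic justification does not apply.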

Further achievements in the dissertation include the extension of the so-called “wild bootstrap” to general time-inhomogeneous and incompletely observed Markovian multi-state models, the efficient and practice-oriented implementation of moment approximation techniques, and the solution of a previously unsolved problem in the bootstrap treatment of the Kaplan-Meier estimator for survival probabilities.

Figure (left): Simplified illustration of a competing risks model with two absorbing states (e.g. two different causes of death). The quantities a(t) and b(t), plotted next to the arrows, may be interpreted as the instantaneous forces which draw an individual in the direction of each absorbing state.

Figure (right): The solid line in the middle shows the estimator for the cumulative event probability of the first competing event over time. The narrow region between the two dotted lines comprises all pointwise asymptotic 95% confidence intervals for single state transition probabilities (these can only be interpreted along each vertical section, i.e. for each single point in time). The broader region between the two thin straight lines is an asymptotic 95% confidence band for the true underlying cumulative event probability curve, i.e. the whole curve lies within this band simultaneously over time with a probability of approximately 95% if the sample size is sufficiently large. This band has been constructed using the weird bootstrap.
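For readers who prefer formulas, the quantities mentioned in the two figure captions can be written down as follows. This is a sketch under the assumption of continuous event times. The symbols T (event time), D (type of the event), F_1, l, u, L, U and the interval [t_1, t_2] are introduced here for illustration only and do not appear in the original text.

```latex
% Instantaneous forces (cause-specific hazards) towards the two absorbing states
a(t) = \lim_{h \downarrow 0} \frac{1}{h}\, P\big(T \in [t, t+h),\ D = 1 \mid T \ge t\big), \qquad
b(t) = \lim_{h \downarrow 0} \frac{1}{h}\, P\big(T \in [t, t+h),\ D = 2 \mid T \ge t\big)

% Cumulative event probability of the first competing event
F_1(t) = P(T \le t,\ D = 1)
       = \int_0^t \exp\!\Big(-\int_0^u \big(a(s) + b(s)\big)\, ds\Big)\, a(u)\, du

% Pointwise 95% confidence intervals: the statement holds for each fixed t separately
P\big(l(t) \le F_1(t) \le u(t)\big) \approx 0.95 \quad \text{for each fixed } t

% Simultaneous 95% confidence band: one statement for the whole curve
P\big(L(t) \le F_1(t) \le U(t) \ \text{for all } t \in [t_1, t_2]\big) \approx 0.95
```

Since the last statement must hold for all time points at once, the band limits L and U necessarily lie further apart than the pointwise limits l and u, which is why the band in the figure is the broader of the two regions.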