Denis Neumüller
My general research interest lies in the area of software engineering, with a focus on the design and development of software systems. As part of my PhD at the University of Ulm I am conducting research on the (pattern-based) detection of algorithms in source code. This is intended to support, among others, the following use cases:
- Software comprehension:
  - Which problems are solved in the code?
  - How are they solved and which components are involved?
- Software optimization: Detection and optimization of inefficient algorithm implementations.
- Source code search: Search and retrieval of source code examples for reference.
For this purpose, I work on:
- The development of domain-specific languages (to describe search patterns).
- Static code analysis, e.g. data flow analysis, based on the abstract syntax tree.
- Graph or tree search for locating the specified patterns.
- Applying LLMs for algorithm detection.
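To give a rough idea of what pattern search on an abstract syntax tree can look like, here is a minimal sketch using Python's standard `ast` module. The heuristic (a `while` loop that contains a floor division by 2 and a comparison) and the names `looks_like_binary_search` and `find_candidates` are assumptions made for illustration only; they are not the pattern language or tooling developed in the PhD project.

```python
import ast


def looks_like_binary_search(func: ast.FunctionDef) -> bool:
    """Illustrative heuristic: a while loop that halves a range and compares values."""
    for node in ast.walk(func):
        if not isinstance(node, ast.While):
            continue
        # Look for an expression of the form `<expr> // 2` anywhere in the loop.
        has_halving = any(
            isinstance(n, ast.BinOp)
            and isinstance(n.op, ast.FloorDiv)
            and isinstance(n.right, ast.Constant)
            and n.right.value == 2
            for n in ast.walk(node)
        )
        # Look for a comparison in the loop condition or body.
        has_compare = any(isinstance(n, ast.Compare) for n in ast.walk(node))
        if has_halving and has_compare:
            return True
    return False


def find_candidates(source: str) -> list[str]:
    """Return the names of all functions in `source` that match the heuristic."""
    tree = ast.parse(source)
    return [
        f.name
        for f in ast.walk(tree)
        if isinstance(f, ast.FunctionDef) and looks_like_binary_search(f)
    ]


EXAMPLE = '''
def search(xs, target):
    lo, hi = 0, len(xs) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if xs[mid] == target:
            return mid
        if xs[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1

def total(xs):
    s = 0
    for x in xs:
        s += x
    return s
'''

print(find_candidates(EXAMPLE))
```

A purely syntactic heuristic like this one produces false positives and misses variants (for example, a recursive binary search), which is one reason to combine tree patterns with data flow analysis.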
I employ empirical methods to validate and ensure the quality of my research.
Besides my PhD, I am also involved in research projects such as GENIAL! and assist in teaching.
Teaching
Besides my PhD, I assist(ed) in teaching the following courses:
| Lecture | Semesters (WiSe = winter semester, SoSe = summer semester) |
|---|---|
| Empirical Research Methods (Bachelor) | WiSe 2024, WiSe 2025 |
| Software Quality Assurance (Master) | WiSe 2024, WiSe 2025 |
| Model Driven Software Engineering (Master) | SoSe 2023, SoSe 2024 |
| Software-Engineering Projects (Bachelor & Master) | SoSe 2022, SoSe 2023, WiSe 2023 |
| Research Trends in Software Engineering (Master) | WiSe 2021 |
Topics for Theses and Projects
Algorithm Detection
Context
We are convinced that automatically detecting algorithm implementations in a code base helps to reveal which concerns are present in the code base, how they are solved, and which components are involved. This knowledge can then support software comprehension, software architecture recovery, and software maintenance.
Examples of algorithms that could be interesting to detect and some of the insights their detection provides:
- Quicksort -> The application sorts data structure x.
- A* -> The application does path search.
- Raft (Consensus Algorithm) -> The application is a distributed, fault-tolerant system.
Research Question
Large Language Models (LLMs) achieve impressive results on code-related tasks such as code clone detection, code summarization and code generation.
Our first experiments using LLMs for algorithm detection indicate promising performance, with F1 scores of about 77%.
However, our evaluation has so far focused only on smaller algorithms such as Binary Search, Bubble Sort, and Matrix Transposition.
We are now interested in evaluating LLMs on more complex algorithms implemented in real code bases.
Therefore, the main research question of this thesis is:
- How do LLMs perform in recognizing complex algorithms in source code bases?
Tasks
- Create a dataset of more complex algorithms (e.g. Levenshtein distance, Raft), preferably from real open-source projects, using code from GitHub, Maven, or other web resources.
- Adapt the current LLM evaluation code written in Python to support different evaluation strategies, such as file-based and API-based evaluations.
- Evaluate different LLMs (e.g., DeepSeek, Mixtral, Llama 3, ChatGPT) using the dataset and the implemented evaluation strategies.
- Furthermore, we are interested in explaining the predictions of LLMs (especially for failure cases).
No prior knowledge of machine learning (ML) is required for this thesis. However, you should be open to familiarizing yourself with a new subject area.
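As a starting point for the evaluation task, the sketch below shows one way to score a classifier with a macro-averaged F1 score while keeping the evaluation strategy exchangeable. Everything here is an assumption for illustration: the function names, the label strings, and the stub classifier `keyword_stub`, which merely stands in for a real LLM call. It is not the existing evaluation code, and the page does not state which F1 averaging the reported results use.

```python
from collections import Counter


def f1_per_label(gold, predicted):
    """Compute the F1 score for every label in a single-label classification task."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label was wrong
            fn[g] += 1  # gold label was missed
    scores = {}
    for label in set(gold) | set(predicted):
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(gold, predicted):
    """Unweighted mean of the per-label F1 scores."""
    scores = f1_per_label(gold, predicted)
    return sum(scores.values()) / len(scores)


def evaluate(classify, samples):
    """Run any strategy `classify(code) -> label` over (code, label) pairs."""
    gold = [label for _, label in samples]
    predicted = [classify(code) for code, _ in samples]
    return macro_f1(gold, predicted)


def keyword_stub(code):
    """Hypothetical stand-in for an LLM call; a real strategy would query a model."""
    if "//" in code:
        return "binary_search"
    if "swap" in code:
        return "bubble_sort"
    return "matrix_transposition"


SAMPLES = [
    ("mid = (lo + hi) // 2", "binary_search"),
    ("swap(xs, i, i + 1)", "bubble_sort"),
    ("half = n // 2  # used in a sort", "bubble_sort"),
    ("t[j][i] = m[i][j]", "matrix_transposition"),
]

print(round(evaluate(keyword_stub, SAMPLES), 2))
```

Because `evaluate` only expects a callable, a file-based strategy and an API-based strategy can share the same scoring code and differ only in how the prompt is built and sent.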
Related Work
- Publications regarding the (automated) curation of datasets:
- Publications regarding the application of LLMs to (other) code-related tasks:
- Examples of LLM API interaction:
- Publications regarding explanations:
Contact
If you are interested and/or have any questions, feel free to contact me any time.
We can discuss the topic further and try to adapt it to your personal preferences.
Publications
2024
Exploring the Effectiveness of Abstract Syntax Tree Patterns for Algorithm Recognition
4th International Conference on Code Quality (ICCQ)
June 2024
| DOI: | 10.1109/ICCQ60895.2024.10576984 |
| ISBN: | 979-8-3503-6646-4 |
2022
Towards Detecting Algorithm Implementations in Code Bases
24th Workshop Software-Reengineering und -Evolution (WSRE)
May 2022
A Quality Model and Checklists for Reviewing Automotive Test Case Specifications
Software Quality Days (SWQD 2022), Vienna, Austria
Publisher: Springer International Publishing
2022
| DOI: | https://doi.org/10.1007/978-3-031-04115-0_6 |
| ISBN: | 978-3-031-04115-0 |
M.Sc. Denis Neumüller
Institute of Software Engineering and Programming Languages
Albert-Einstein-Allee 11