
<bib>
<comment>
This file was created by the TYPO3 extension publications
--- Timezone: CEST
Creation date: 2026-04-21
Creation time: 20:51:12
--- Number of references
13
</comment>
<reference>
<bibtype>inproceedings</bibtype>
<title>Towards Automatically Inferring Constraints to Identify Implicit Assumptions in Data Analysis</title>
<year>2026</year>
<reviewed>1</reviewed>
<DOI>10.1145/3786582.3786806</DOI>
<booktitle>2026 IEEE/ACM 48th International Conference on Software Engineering (ICSE-NIER ’26)</booktitle>
<authors>
<person>
<fn>Florian</fn>
<sn>Sihler</sn>
</person>
<person>
<fn>Lars</fn>
<sn>Pfrenger</sn>
</person>
<person>
<fn>Oliver</fn>
<sn>Gerstl</sn>
</person>
<person>
<fn>Matthias</fn>
<sn>Tichy</sn>
</person>
</authors>
</reference>
<reference>
<bibtype>inproceedings</bibtype>
<title>Statically Analyzing the Dataflow of R Programs</title>
<abstract>The R programming language is primarily designed for statistical computing and mostly used by researchers without a background in computer science. R provides a wide range of dynamic features and peculiarities that are difficult to analyze statically like dynamic scoping and lazy evaluation with dynamic side effects. At the same time, the R ecosystem lacks sophisticated analysis tools that support researchers in understanding and improving their code. In this paper, we present a novel static dataflow analysis framework for the R programming language that is capable of handling the dynamic nature of R programs and produces the dataflow graph of given R programs. This graph can be essential in a range of analyses, including program slicing, which we implement as a proof of concept. The core analysis works as a stateful fold over a normalized version of the abstract syntax tree of the R program, which tracks (re-)definitions, values, function calls, side effects, external files, and a dynamic control flow to produce one dataflow graph per program. We evaluate the correctness of our analysis using output equivalence testing on a manually curated dataset of 779 sensible slicing points from executable real-world R scripts. Additionally, we use a set of systematic test cases based on the capabilities of the R language and the implementation of the R interpreter and measure the runtimes as well as the memory consumption on a set of 4,230 real-world R scripts and 20,815 packages available on R’s package manager CRAN. Furthermore, we evaluate the recall of our program slicer, its accuracy using shrinking, and its improvement over the state of the art. We correctly analyze almost all programs in our equivalence test suite, preserving the identical output for 99.7% of the manually curated slicing points. On average, we require 576ms to analyze the dataflow and around 213kB to store the graph of a research script.
This shows that our analysis is capable of analyzing real-world sources quickly and correctly. Our slicer achieves an average reduction of 84.8% of tokens indicating its potential to improve program comprehension.</abstract>
<type>Conference paper</type>
<year>2025</year>
<month>10</month>
<DOI>10.1145/3763087</DOI>
<booktitle>Proceedings of the ACM on Programming Languages, OOPSLA 2025</booktitle>
<pages>1034-1062</pages>
<authors>
<person>
<fn>Florian</fn>
<sn>Sihler</sn>
</person>
<person>
<fn>Matthias</fn>
<sn>Tichy</sn>
</person>
</authors>
</reference>
<reference>
<bibtype>conference</bibtype>
<title>Explainability in Self-Adaptive Systems: A Systematic Literature Review</title>
<year>2025</year>
<month>9</month>
<day>9</day>
<reviewed>1</reviewed>
<DOI>10.1007/978-3-032-04200-2_19</DOI>
<booktitle>Euromicro Conference on Software Engineering and Advanced Applications 2025</booktitle>
<authors>
<person>
<fn>Raphael</fn>
<sn>Straub</sn>
</person>
<person>
<fn>Florian</fn>
<sn>Sihler</sn>
</person>
<person>
<fn>Ali</fn>
<sn>Torbati</sn>
</person>
<person>
<fn>Cong</fn>
<sn>Wang</sn>
</person>
<person>
<fn>Raffaela</fn>
<sn>Groner</sn>
</person>
<person>
<fn>Verena</fn>
<sn>Klös</sn>
</person>
<person>
<fn>Matthias</fn>
<sn>Tichy</sn>
</person>
</authors>
</reference>
<reference>
<bibtype>inproceedings</bibtype>
<title>On the Anatomy of Real-World R Code for Static Analysis (Extended Abstract)</title>
<year>2025</year>
<month>2</month>
<issn>2944-7682</issn>
<DOI>10.18420/se2025-27</DOI>
<journal>Software Engineering 2025</journal>
<publisher>Gesellschaft für Informatik, Bonn</publisher>
<authors>
<person>
<fn>Florian</fn>
<sn>Sihler</sn>
</person>
<person>
<fn>Lukas</fn>
<sn>Pietzschmann</sn>
</person>
<person>
<fn>Raphael</fn>
<sn>Straub</sn>
</person>
<person>
<fn>Matthias</fn>
<sn>Tichy</sn>
</person>
<person>
<fn>Andor</fn>
<sn>Diera</sn>
</person>
<person>
<fn>Abdelhalim</fn>
<sn>Dahou</sn>
</person>
</authors>
</reference>
<reference>
<bibtype>inproceedings</bibtype>
<title>flowR: A Static Program Slicer for R</title>
<abstract>Context: Many researchers rely on the R programming language to perform their statistical analyses and visualizations in the form of R scripts. However, recent research and experience show that many of these scripts contain problems, from being hard to comprehend by combining several analyses and plots into a single source file to being non-reproducible, with a lack of analysis tools supporting the writing of correct and maintainable code. Objective: In this work, we address the problem of comprehending and maintaining R scripts by proposing flowR, a program slicer and static dataflow analyzer for the R programming language, which can be integrated directly into Visual Studio Code. Given a set of variables of interest, like the generation of a single figure in a script, flowR automatically reduces the program to the parts relevant for the output of interest, like the value of a variable. Method: First, we use static program analysis to construct a detailed dataflow graph of the R script. The analysis supports loops, function calls, side effects, sourcing external files, and even redefinitions of R's primitive constructs. Subsequently, we calculate the program slice by solving a reachability problem on the graph, collecting all required parts and presenting them to the user. Results: Providing several interactive ways of slicing the program, we require an average of 16 ms to calculate the slice on a given dataflow graph, reducing the code by around 94% of tokens.
The demonstration video is available at https://youtu.be/Zgq6rnbvvhk. For the full source code and extensive documentation, refer to https://github.com/Code-Inspect/flowr.</abstract>
<year>2024</year>
<month>10</month>
<day>27</day>
<reviewed>1</reviewed>
<DOI>10.1145/3691620.3695359</DOI>
<booktitle>ASE '24: Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering (Tool Demonstrations)</booktitle>
<keywords>program analysis
R</keywords>
<web_url>https://github.com/flowr-analysis/flowr</web_url>
<authors>
<person>
<fn>Florian</fn>
<sn>Sihler</sn>
</person>
<person>
<fn>Matthias</fn>
<sn>Tichy</sn>
</person>
</authors>
</reference>
<reference>
<bibtype>inproceedings</bibtype>
<title>Improving the Comprehension of R Programs by Hybrid Dataflow Analysis</title>
<abstract>Context: Comprehending code is crucial in all areas of software development, with many existing supporting tools and techniques for various languages. However, for R, a widely used programming language, especially in the field of statistical computing, the support is limited. R offers a large number of packages as well as dynamic features, which make it challenging to analyze and understand. Objective: We aim to (i) gain a better understanding of how R is used in the real world, (ii) devise better analysis strategies for R, which are able to handle its dynamic nature, and (iii) improve the comprehension of R scripts by using these analyses, providing new methods and procedures applicable to program comprehension in general. Method: In eight contributions, we analyze feature usage in R scripts, develop a new static dataflow analysis intertwining control and dataflow, and more. We enable and propose new techniques for program comprehension using a combination of static and dynamic analysis.</abstract>
<year>2024</year>
<month>10</month>
<day>27</day>
<reviewed>1</reviewed>
<DOI>10.1145/3691620.3695603</DOI>
<booktitle>ASE '24: Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering (Doctoral Symposium)</booktitle>
<keywords>doctoral symposium
program analysis</keywords>
<web_url>https://dl.acm.org/doi/abs/10.1145/3691620.3695603</web_url>
<web_url_date>11.11.2024</web_url_date>
<authors>
<person>
<fn>Florian</fn>
<sn>Sihler</sn>
</person>
</authors>
</reference>
<reference>
<bibtype>inproceedings</bibtype>
<title>Exploring the Effectiveness of Abstract Syntax Tree Patterns for Algorithm Recognition</title>
<year>2024</year>
<month>6</month>
<isbn>979-8-3503-6646-4</isbn>
<DOI>10.1109/ICCQ60895.2024.10576984</DOI>
<booktitle>4th International Conference on Code Quality (ICCQ)</booktitle>
<web_url>https://ieeexplore.ieee.org/document/10576984</web_url>
<authors>
<person>
<fn>Denis</fn>
<sn>Neumüller</sn>
</person>
<person>
<fn>Florian</fn>
<sn>Sihler</sn>
</person>
<person>
<fn>Raphael</fn>
<sn>Straub</sn>
</person>
<person>
<fn>Matthias</fn>
<sn>Tichy</sn>
</person>
</authors>
</reference>
<reference>
<bibtype>inproceedings</bibtype>
<title>On the Anatomy of Real-World R Code for Static Analysis</title>
<status>1</status>
<year>2024</year>
<month>1</month>
<DOI>10.1145/3643991.3644911</DOI>
<booktitle>21st International Conference on Mining Software Repositories (MSR '24)</booktitle>
<web_url>https://arxiv.org/abs/2401.16228</web_url>
<file_url>https://arxiv.org/pdf/2401.16228.pdf</file_url>
<authors>
<person>
<fn>Florian</fn>
<sn>Sihler</sn>
</person>
<person>
<fn>Lukas</fn>
<sn>Pietzschmann</sn>
</person>
<person>
<fn>Raphael</fn>
<sn>Straub</sn>
</person>
<person>
<fn>Matthias</fn>
<sn>Tichy</sn>
</person>
<person>
<fn>Andor</fn>
<sn>Diera</sn>
</person>
<person>
<fn>Abdelhalim</fn>
<sn>Dahou</sn>
</person>
</authors>
</reference>
<reference>
<bibtype>inproceedings</bibtype>
<title>GenCodeSearchNet: A Benchmark Test Suite for Evaluating Generalization in Programming Language Understanding</title>
<status>1</status>
<year>2023</year>
<month>10</month>
<DOI>10.48550/arXiv.2311.09707</DOI>
<booktitle>GenBench 2023 Workshop</booktitle>
<authors>
<person>
<fn>Andor</fn>
<sn>Diera</sn>
</person>
<person>
<fn>Abdelhalim</fn>
<sn>Dahou</sn>
</person>
<person>
<fn>Lukas</fn>
<sn>Galke</sn>
</person>
<person>
<fn>Fabian</fn>
<sn>Karl</sn>
</person>
<person>
<fn>Florian</fn>
<sn>Sihler</sn>
</person>
<person>
<fn>Ansgar</fn>
<sn>Scherp</sn>
</person>
</authors>
</reference>
<reference>
<bibtype>thesis</bibtype>
<title>Constructing a Static Program Slicer Specifically for R Programs</title>
<type>Master's thesis</type>
<year>2023</year>
<month>8</month>
<DOI>10.18725/OPARU-50107</DOI>
<school>University of Ulm, Germany</school>
<editor>Prof. Matthias Tichy</editor>
<authors>
<person>
<fn>Florian</fn>
<sn>Sihler</sn>
</person>
</authors>
</reference>
<reference>
<bibtype>article</bibtype>
<title>One-Way Model Transformations in the Context of the Technology-Roadmapping Tool IRIS</title>
<year>2023</year>
<month>7</month>
<DOI>10.5381/jot.2023.22.2.a2</DOI>
<journal>Journal of Object Technology</journal>
<edition>The 19th European Conference on Modelling Foundations and Applications (ECMFA 2023)</edition>
<authors>
<person>
<fn>Florian</fn>
<sn>Sihler</sn>
</person>
<person>
<fn>Jakob</fn>
<sn>Pietron</sn>
</person>
<person>
<fn>Matthias</fn>
<sn>Tichy</sn>
</person>
</authors>
</reference>
<reference>
<bibtype>article</bibtype>
<title>A domain-specific language for modeling and analyzing solution spaces for technology roadmapping</title>
<year>2022</year>
<month>2</month>
<DOI>10.1016/j.jss.2021.111094</DOI>
<journal>Journal of Systems &amp; Software (JSS)</journal>
<authors>
<person>
<fn>Alexander</fn>
<sn>Breckel</sn>
</person>
<person>
<fn>Jakob</fn>
<sn>Pietron</sn>
</person>
<person>
<fn>Katharina</fn>
<sn>Juhnke</sn>
</person>
<person>
<fn>Florian</fn>
<sn>Sihler</sn>
</person>
<person>
<fn>Matthias</fn>
<sn>Tichy</sn>
</person>
</authors>
</reference>
<reference>
<bibtype>thesis</bibtype>
<title>One-way Model Transformations</title>
<type>Bachelor's thesis</type>
<year>2022</year>
<month>1</month>
<DOI>10.18725/OPARU-47275</DOI>
<school>Universität Ulm</school>
<authors>
<person>
<fn>Florian</fn>
<sn>Sihler</sn>
</person>
</authors>
</reference>
</bib>
