I am currently a PhD candidate at the Technical University of Munich (TUM), where I am supervised by Prof. Jana Giceva, with Prof. Carsten Binnig serving as my second supervisor. In parallel, I collaborate closely with researchers at MIT CSAIL, where I previously held a research stay and worked under the guidance of Dr. Cagatay Demiralp and Turing Award laureate Prof. Michael Stonebraker.
I hold two bachelor’s degrees in Mathematics and Computer Science and a master’s degree in Mathematics and Data Science from TUM. During my final semester, I conducted my master’s thesis as a visiting student at MIT CSAIL, co-advised by Prof. Michael Stonebraker and Prof. Thomas Neumann. During this time, I was part of the LLM-Enterprise group and contributed to the BEAVER and GOBY benchmarks in collaboration with Moe Kayali, Peter Baile Chen, Cagatay Demiralp, Nesime Tatbul, and Michael Stonebraker.
My research focuses on system-oriented approaches to improving the performance of large language models for information retrieval and complex reasoning, with a particular emphasis on machine learning systems and data management. More broadly, I am interested in designing robust, scalable AI-driven systems for enterprise-scale applications.
BEAVER, an enterprise dataset for text-to-SQL, is sourced from enterprise data warehouses and pairs natural language queries with accurate SQL statements. Unlike public datasets, it highlights LLM limitations in complex environments. Future research can leverage this dataset to build advanced text-to-SQL systems.
GOBY is a benchmark dataset designed for evaluating data integration techniques specifically for enterprise data. It was derived from a real-world production workload in the event promotion and marketing domain, compiled around 2017. Unlike public benchmarks, GOBY focuses on private datasets, making it more representative of enterprise challenges.
BENCHPRESS is an interactive annotation system for rapidly creating text-to-SQL benchmarks tailored to enterprise data tasks. It uses a human-in-the-loop approach, where annotators refine or repair LLM-generated SQL-NL pairs. BENCHPRESS was instrumental in creating the BEAVER benchmark and includes plans for scalable query log annotation, semantic context enrichment, and robustness evaluation, addressing the unique challenges of enterprise data.
RUBICON is an agent-centric information system designed to answer complex, cross-source queries in domains with heterogeneous, multimodal, and partially incompatible data. Each information source (such as regulatory documents, technical guidelines, databases, or visual artifacts) is wrapped by a dedicated agent that exposes its capabilities through a unified Agentic Query Language (AQL). Rather than relying on a fully autonomous coordinating agent, RUBICON places the human in the loop as the explicit coordinator, allowing users to decompose queries, orchestrate agents, and iteratively refine intermediate results. This design avoids brittle global planning, increases transparency and controllability, and enables robust reasoning across sources that differ in structure, modality, and interpretability.
Developed and implemented new features and algorithms for Celonis’ query engine in C++, Java, and Python, improving performance and reliability.
Sep 2022 – Mar 2024
Created data visualizations for fund mandates and automated monthly reports using Python and MySQL.
Jul 2021 – Mar 2022
Developed a web crawler for pharmaceutical regulations, an automated email newsletter, and tax-related applications using Python and Power BI.
Nov 2019 – Jul 2020
Implemented optimization algorithms for generating realistic load cases and automated scripts for strain case calculations in Lua and Python.
Mar 2018 – Sep 2018
Served as a mathematics tutor and exam corrector, and scripted LaTeX documents.
Feb 2019 – Jul 2019