PHD RESEARCHER · INSIGHT RETRIEVAL FROM STRUCTURED DATA
Hello! I'm Cornelius Wolff, a PhD researcher at the TRL Lab at the Centrum Wiskunde & Informatica (CWI) and the University of Amsterdam, supervised by Madelon Hulsebos and Maarten de Rijke. My work centers on Insight Retrieval from Structured Data: I investigate AI models and pipelines that can handle structured data such as databases and CSV files. I'm also deeply interested in AI in education, in-context learning, and efficient machine learning.
SQALE: A Large Text-to-SQL Corpus Grounded in Real Schemas — AI for Tabular Data Workshop @ EurIPS 2025
Are We Asking the Right Questions? On Ambiguity in Natural Language Queries for Tabular Data Analysis — AI for Tabular Data Workshop @ EurIPS 2025
MinervAI: Using Generative AI to Assist, Not Replace Humans in Peer Review — AI in Science (AIS) 2025 – Copenhagen, Denmark
Centrum Wiskunde & Informatica & University of Amsterdam
Research focus: Insight Retrieval from Structured Data. Supervised by Madelon Hulsebos and Maarten de Rijke at the TRL Lab.
Osnabrück University, Germany — GPA: 1.0 (A+)
Focus: Machine Learning, Computational Neuroscience, Ethics of AI. Thesis: Emergence of language in situated environments (Grade: 1.0)
Osnabrück University, Germany — GPA: 2.1
Thesis: Simulation environment for analyzing the spread of SARS-CoV-2 using ML agents (Grade: 1.0)
Investigating AI models and pipelines that extract meaningful insights from structured data sources such as databases, CSV files, and tabular datasets.
Developing compact, interpretable language models that achieve high performance while remaining transparent and computationally efficient.
Applying in-context learning (ICL) principles to image classification, tabular data, and other structured domains; studying emergent communication in multi-agent RL systems and situated environments.
AI for Tabular Data Workshop @ EurIPS 2025 — 2025
Introduction of SQALE, a large-scale semi-synthetic text-to-SQL dataset grounded in real relational schemas, supporting research on scalable and generalizable NL2SQL models.
AI for Tabular Data Workshop @ EurIPS 2025 — 2025
A conceptual framework for characterising ambiguity in natural-language queries over tabular data, arguing for cooperative query specification and analysing 15 common tabular-data benchmarks.
AI in Science (AIS) 2025 – Copenhagen, Denmark — 2025
A position paper arguing for constrained, verifiable uses of LLMs in peer review and introducing MinervAI—an open-source tool that supports citation verification and argumentation mapping while preserving human judgment.
4th Table Representation Workshop - ACL 2025 — 2025
Analysis of the reasoning capabilities of LLMs over tabular data, presented at the 4th Table Representation Workshop at ACL 2025.
arXiv preprint arXiv:2506.14842 — 2025
PictSure: demonstrating the importance of pretraining embeddings for in-context learning image classifiers.
arXiv preprint arXiv:2408.14649 — 2024
Research on bidirectional emergent language in situated environments, exploring how agents develop communication systems in interactive settings.
Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, IJCAI-24 — 2024
GRASP benchmark for evaluating language grounding and situated physics understanding in multimodal language models, presented at IJCAI-24.
arXiv preprint arXiv:2407.08590 — 2024
A comprehensive review of nine physics engines for reinforcement learning research, analyzing their capabilities and suitability for RL applications.
Mobile Communication-Technologies and Applications; 27th ITG-Symposium — 2023
A novel approach for estimating mobile broadband coverage in Germany using crowdsourced data, presented at the 27th ITG-Symposium on Mobile Communication.
Master's Study Project — Osnabrück University · Osnabrück, Germany · 2024
Managed and organized the study project “Supporting the Review Process with AI”, in which master’s students developed an AI-driven system to assist with the academic reviewing process.
Master's Study Project — Osnabrück University · Osnabrück, Germany · 2022
Started and organized a study project that provided a platform and course setting for cutting-edge research on reinforcement learning and emergent behavior.
Seminar Talk — TRL Seminar, CWI · Amsterdam, Netherlands · January 23, 2026
YouTube Recording
In this talk at the TRL Seminar Series at CWI, I presented SQALE, a large-scale dataset aimed at advancing text-to-SQL systems through more realistic training and evaluation data. Natural language interfaces for databases rely on text-to-SQL models to translate user questions into executable SQL quer...
Online Talk — AlphaXiv · Amsterdam, Netherlands · September 3, 2025
YouTube Recording
In this presentation at AlphaXiv, I talked about my paper “How well do LLMs reason over tabular data, really?”, which asks whether general-purpose Large Language Models can effectively reason over tabular data. We identified flaws in current evaluation methods and proposed an LLM-as-a-judge ...
Podcast Interview — Kaleidoscience: Conversations on Cognitive Science · Osnabrück, Germany · December 19, 2024
Podcast Episode
In the 12th episode of the podcast “Kaleidoscience: Conversations on Cognitive Science,” I talked with Sönke Lülf and Elisa Palme about the sustainable use of Artificial Intelligence and how we can create green AI.
Topics: the energy footprint of AI models; sustainable data management in companies ...
Amsterdam, The Netherlands
Centrum Wiskunde & Informatica,
Science Park 123,
1098 XG Amsterdam,
The Netherlands