Cornelius Wolff

PHD RESEARCHER · INSIGHT RETRIEVAL FROM STRUCTURED DATA

BSky

Hello! I'm Cornelius Wolff, a PhD researcher at the TRL Lab at the Centrum Wiskunde & Informatica (CWI) and the University of Amsterdam, under the supervision of Madelon Hulsebos and Maarten de Rijke. My work centers on Insight Retrieval from Structured Data: I investigate AI models and pipelines that can handle many kinds of structured data, such as databases and CSV files. I am also deeply interested in AI in education, In-Context Learning, and efficient machine learning.

Latest publications:

SQALE: A Large Text-to-SQL Corpus Grounded in Real Schemas — AI for Tabular Data Workshop @ EuRIPS 2025

Are We Asking the Right Questions? On Ambiguity in Natural Language Queries for Tabular Data Analysis — AI for Tabular Data Workshop @ EuRIPS 2025

MinervAI: Using Generative AI to Assist, Not Replace Humans in Peer Review — AI in Science (AIS) 2025 – Copenhagen, Denmark

Education.

2025 – Present

Ph.D. in Computer Science

Centrum Wiskunde & Informatica & University of Amsterdam

Research focus: Insight Retrieval from Structured Data. Supervised by Madelon Hulsebos and Maarten de Rijke at the TRL Lab.

2021 – 2025

M.Sc. in Cognitive Science

Osnabrück University, Germany — GPA: 1.0 (A+)

Focus: Machine Learning, Computational Neuroscience, Ethics of AI. Thesis: Emergence of language in situated environments (Grade: 1.0)

2018 – 2021

B.Sc. in Information Systems

Osnabrück University, Germany — GPA: 2.1

Thesis: Simulation environment for analyzing the spread of SARS-CoV-2 using ML agents (Grade: 1.0)

Work Experience

PhD Student, CWI & University of Amsterdam (Mar 2025 – Present)

Junior Researcher, German Research Center for AI (DFKI), Osnabrück (Mar 2022 – Mar 2025)

Research Assistant, Elia Bruni's Lab, Osnabrück University (Apr 2023 – Mar 2025)

Research Assistant, Distributed Systems Lab, Osnabrück University (Jun 2021 – Mar 2023)

Technical Skills

Python, PyTorch, TensorFlow, SQL, TypeScript, LLMs, RAG, Reinforcement Learning, Vector Databases, Git, LaTeX

Publications.

SQALE: A Large Text-to-SQL Corpus Grounded in Real Schemas

AI for Tabular Data Workshop @ EuRIPS 2025 — 2025

Introduction of SQALE, a large-scale semi-synthetic text-to-SQL dataset grounded in real relational schemas, supporting research on scalable and generalizable NL2SQL models.

READ THE PAPER ↗

Are We Asking the Right Questions? On Ambiguity in Natural Language Queries for Tabular Data Analysis

AI for Tabular Data Workshop @ EuRIPS 2025 — 2025

A conceptual framework for characterising ambiguity in natural-language queries over tabular data, arguing for cooperative query specification and analysing 15 common tabular-data benchmarks.

READ THE PAPER ↗

MinervAI: Using Generative AI to Assist, Not Replace Humans in Peer Review

AI in Science (AIS) 2025 – Copenhagen, Denmark — 2025

A position paper arguing for constrained, verifiable uses of LLMs in peer review and introducing MinervAI—an open-source tool that supports citation verification and argumentation mapping while preserving human judgment.

READ THE PAPER ↗

How well do LLMs reason over tabular data, really?

4th Table Representation Learning Workshop @ ACL 2025 — 2025

An analysis of how well LLMs reason over tabular data, presented at the 4th Table Representation Learning Workshop at ACL 2025.

READ THE PAPER ↗

PictSure: Pretraining Embeddings Matters for In-Context Learning Image Classifiers

arXiv preprint arXiv:2506.14842 — 2025

PictSure: demonstrating the importance of pretraining embeddings for in-context learning image classifiers.

READ THE PAPER ↗

Bidirectional Emergent Language in Situated Environments

arXiv preprint arXiv:2408.14649 — 2024

Research on bidirectional emergent language in situated environments, exploring how agents develop communication systems in interactive settings.

READ THE PAPER ↗

GRASP: A Novel Benchmark for Evaluating Language Grounding and Situated Physics Understanding in Multimodal Language Models

Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, IJCAI-24 — 2024

GRASP benchmark for evaluating language grounding and situated physics understanding in multimodal language models, presented at IJCAI-24.

READ THE PAPER ↗

A Review of Nine Physics Engines for Reinforcement Learning Research

arXiv preprint arXiv:2407.08590 — 2024

A comprehensive review of nine physics engines for reinforcement learning research, analyzing their capabilities and suitability for RL applications.

READ THE PAPER ↗

A New Approach on Estimating Germany's Mobile Broadband Coverage based on Crowdsourced Data

Mobile Communication-Technologies and Applications; 27th ITG-Symposium — 2023

A novel approach for estimating mobile broadband coverage in Germany using crowdsourced data, presented at the 27th ITG-Symposium on Mobile Communication.

READ THE PAPER ↗

Teaching & Talks.

Study Project: LLMs for Science

Master's Study Project — Osnabrück University · Osnabrück, Germany · 2024

Managed and organized a study project on “Supporting the Review Process with AI”, in which master’s students developed an AI-driven system to assist with the academic reviewing process.

VIEW DETAILS ↗

Emergent Behavior in Multi-Agent Systems (EBIMAS)

Master's Study Project — Osnabrück University · Osnabrück, Germany · 2022

Initiated and organized a study project that provided a platform and course setting for cutting-edge research on Reinforcement Learning and emergent behavior.

VIEW DETAILS ↗

Computer Networks (Rechnernetze)

Teaching Assistant — Osnabrück University · Osnabrück, Germany · 2022

Served as a Teaching Assistant for the undergraduate Computer Networks course (6 ECTS) in the Summer Semester 2022, supporting both the lecture and the tutorials.

VIEW DETAILS ↗

SQALE: A large Text-to-SQL dataset with Realistic Database Schemas

Seminar Talk — TRL Seminar, CWI · Amsterdam, Netherlands · January 23, 2026

YouTube Recording

In this talk at the TRL Seminar Series at CWI, I presented SQALE, a large-scale dataset aimed at advancing text-to-SQL systems through more realistic training and evaluation data. Natural language interfaces for databases rely on text-to-SQL models to translate user questions into executable SQL quer...

WATCH RECORDING ↗

How well do LLMs reason over tabular data?

Online Talk — AlphaXiv · Amsterdam, Netherlands · September 3, 2025

YouTube Recording

In this presentation at AlphaXiv, I talked about my paper “How well do LLMs reason over tabular data, really?”, which examines whether general-purpose Large Language Models can effectively reason over tabular data. We identified flaws in current evaluation methods and proposed an LLM-as-a-judge ...

WATCH RECORDING ↗

Can We Create Green AI? - Podcast Interview

Podcast Interview — Kaleidoscience: Conversations on Cognitive Science · Osnabrück, Germany · December 19, 2024

Podcast Episode

In the 12th episode of the podcast “Kaleidoscience: Conversations on Cognitive Science,” I spoke with Sönke Lülf and Elisa Palme about the sustainable use of Artificial Intelligence and how we can create green AI.

Topics:

The energy footprint of AI models

Sustainable data management in companies ...

WATCH RECORDING ↗

Retrieval-Augmented Generation: Survey on improving the reliability of large language models in industrial settings

Edgar Melcher — Bachelor Thesis · Osnabrück University & DFKI · 2024

Under Prof. Oliver Thomas & Moritz-André Weiher

VIEW DETAILS ↗

Using Diffusion Models to improve the process of floor plan drafting

Maria Oprea — Bachelor Thesis · Osnabrück University & DFKI · 2024

Under Prof. Oliver Thomas & Dr. Simon Pukrop

VIEW DETAILS ↗

AR model of Neural Networks for a collaborative approach to explainable AI

Hendrik Kremer — Bachelor Thesis · Osnabrück University & DFKI · 2024

Under Prof. Oliver Thomas & Enrico Kochon

VIEW DETAILS ↗

Natural Language Instruction-Following in a Simulated 3D World using microcosm.ai

Kamran Vatankhah-Barazandeh — Bachelor Thesis · Osnabrück University · 2024

Under Prof. Elia Bruni & Julius Mayer

VIEW DETAILS ↗

Visualizing Models of Artificial Neural Networks in Virtual Reality

Leonie Grafweg — Bachelor Thesis · Osnabrück University & DFKI · 2023

Under Prof. Oliver Thomas & Enrico Kochon

VIEW DETAILS ↗

Research.

Insight Retrieval from Structured Data

Small & Interpretable Language Models

In-Context Learning & Reinforcement Learning

Contact.