GRASP: A Novel Benchmark for Evaluating Language Grounding and Situated Physics Understanding in Multimodal Language Models
Published in Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, IJCAI-24, 2024
This paper introduces GRASP, a benchmark for evaluating language grounding and situated physics understanding in multimodal language models. It provides a unified framework for assessing how well such models understand and reason about physical interactions in visual contexts.
Recommended citation: Jassim, S., Holubar, M., Richter, A., Wolff, C., Ohmer, X., & Bruni, E. (2024). "GRASP: A Novel Benchmark for Evaluating Language Grounding and Situated Physics Understanding in Multimodal Language Models." Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, IJCAI-24, 6297-6305.
Download Paper