News
Lute is attending CVPR 2026
We are at CVPR 2026 in Denver, presenting "Beyond Recognition: Evaluating Visual Perspective Taking in Vision Language Models," which was accepted to the CVPR 2026 Findings track.
The paper asks a plain question: can vision-language models tell how a scene looks from where someone else is standing?
Recognizing what is in an image is the easy part. Asking where those things are trips the models up, and asking how the scene looks to someone else trips them up more. Knowing what is in a picture is not the same as understanding the space it shows.
Who is attending