Benjamin Schneider

Graduate Student at UWaterloo, affiliated with the Vector Institute, Benjamin.Schneider@uwaterloo.ca

Hi! I’m Ben, a CS master’s student at the University of Waterloo, advised by Wenhu Chen (TIGER-Lab) and Florian Kerschbaum.

Currently, I’m working on methods for training open-ended embodied agents. The analogy I always draw on is that if you sit a kid in front of a computer and get them to play an open-world game like Minecraft, they quickly become generalist experts (able to accomplish arbitrary tasks) without needing specific objectives. I am interested in machine learning algorithms that can emulate that learning process. I am usually working on a combination of the following problems:

  • Embodied learning in open-ended environments without explicit objectives.
  • Continual/Lifelong learning for embodied agents.
  • Unified methods for representation learning across modalities.

I’m pretty terrible about keeping my website updated. 😅
So, for an up-to-date list of publications, please check my Scholar profile. My code and projects are hosted on GitHub:
Toolbox-HQ (Embodied Agents work) and TIGER-Lab (Multimodal Learning projects).

Fun fact about me: I try to sneak an image of my cat (pictured right) into my papers.

News

Sep 03, 2025 I have been battling to teach an agent to play Pokémon Emerald for a few months! Check out our work (in progress) here.
May 15, 2025 First public release of QuickVideo, our library for efficient (long) VideoLLM inference. QuickVideo is an ongoing project focused on improving systems and models for VideoLLMs. Please provide feedback if there are features you want implemented!
Mar 04, 2025 We release ABC, a model for fine-grained multimodal retrieval.

Publications

  1. QuickVideo: Real-Time Long Video Understanding with System Algorithm Co-Design
    Benjamin Schneider, Dongfu Jiang, Chao Du, and 2 more authors
    2025
  2. StructEval: Benchmarking LLMs’ Capabilities to Generate Structural Outputs
    Jialin Yang, Dongfu Jiang, Lipeng He, and 17 more authors
    2025
  3. ScholarCopilot: Training Large Language Models for Academic Writing with Accurate Citations
    Yubo Wang, Xueguang Ma, Ping Nie, and 7 more authors
    2025
  4. ABC: Achieving Better Control of Multimodal Embeddings using VLMs
    Benjamin Schneider, Florian Kerschbaum, and Wenhu Chen
    2025
  5. Universal Backdoor Attacks
    Benjamin Schneider, Nils Lukas, and Florian Kerschbaum
    In The Twelfth International Conference on Learning Representations (ICLR 2024), 2024