
Problem Description
Conversational agents are becoming a key means of accessing information online. However, evaluating them remains a challenge: releasing a flawed conversational agent to the public can hurt users' trust and developers' reputation. One solution for offline evaluation is user simulation, in which synthetic users interact with the conversational agent to test it and identify potential flaws before public release. The objective of this project is to develop user simulators that produce realistic interactions with conversational agents in the domain of dataset search.
Project Definition
The project is aligned with the U-Sim track @ TREC 2026, in which participants are tasked with developing user simulators that interact with conversational systems to retrieve datasets in the scholarly domain. A critical question is whether the developed user simulators are good enough to substitute for real users and to support the training and evaluation of conversational agents. The project has two main objectives:
- Conceptualizing and developing a user simulator
- Performing a qualitative analysis of the user simulator to assess its value for the evaluation or training of conversational agents
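To make the simulation setup concrete, the interaction loop between a user simulator and an agent under test can be sketched as below. This is a minimal, illustrative sketch only: the class names, the rule-based user policy, and the stub agent are assumptions for demonstration, not part of the U-Sim track specification. In the actual project, the scripted policy would typically be replaced by an LLM-driven one.

```python
from typing import Optional

class StubSearchAgent:
    """Stand-in for the conversational dataset-search agent under test."""
    def respond(self, utterance: str) -> str:
        if "climate" in utterance.lower():
            return "I found 3 datasets on climate. Do you need a specific time range?"
        return "Could you tell me more about the topic you are looking for?"

class ScriptedUserSimulator:
    """Simulated user driven by a fixed information need and simple rules."""
    def __init__(self, information_need: str):
        self.information_need = information_need
        self.turns_taken = 0

    def next_utterance(self, agent_reply: Optional[str]) -> Optional[str]:
        self.turns_taken += 1
        if agent_reply is None:                     # opening turn
            return f"I am looking for datasets about {self.information_need}."
        if "time range" in agent_reply:             # answer a clarification question
            return "Yes, from 2000 onwards, please."
        if self.turns_taken >= 4:                   # simple stop criterion
            return None
        return "It is for my research on climate trends."

def run_dialogue(agent, user, max_turns: int = 6):
    """Alternate user and agent turns until the simulator stops."""
    transcript, agent_reply = [], None
    for _ in range(max_turns):
        utterance = user.next_utterance(agent_reply)
        if utterance is None:
            break
        agent_reply = agent.respond(utterance)
        transcript.append((utterance, agent_reply))
    return transcript

log = run_dialogue(StubSearchAgent(), ScriptedUserSimulator("climate change"))
for user_turn, agent_turn in log:
    print("USER :", user_turn)
    print("AGENT:", agent_turn)
```

The resulting transcripts are the kind of synthetic interaction logs that the project's qualitative analysis would then examine for realism.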
The timeline for the project is as follows:
- Kick-off workshop (mandatory): 1st or 2nd week of March
- Implementation phase: March-June
- Evaluation phase: July
If there is interest, the Cologne Information Retrieval group is happy to support student groups in submitting their user simulator along with a lab report to the U-Sim track in September 2026.
Links and Resources
- U-Sim track @ TREC: https://trec.usersim.ai/
- Sim4IA - Simulations for Information Access: http://sim4ia.org/
- Cologne Information Retrieval: https://ir.web.th-koeln.de/
- Related literature: https://arxiv.org/pdf/2405.14249, https://arxiv.org/pdf/2406.19007
Learning Outcome
- Practical implementation of a user simulator using cutting-edge, timely technologies (including large language models and conversational agents)
- Project management and software development skills from concept to implementation
- Experiments in an active field of research with the possibility to participate in an international benchmarking campaign
Participation Requirements
- Strong coding skills, preferably in Python
- Interest in natural language processing and information retrieval
- Willingness to familiarize yourself with user simulation practices
External Partner