Guided Project SS26_13 »The Synthetic Researcher - Simulation of Dataset Search«
Organizational Details
- Supervisor(s): Nolwenn Bernard, Timo Breuer, Philipp Schaer
- Team size: 3-5
- Language: English
- Start: March 2026
- Offered as: GP-GAK (12 ECTS)

Conversational agents are becoming an integral means of accessing information online. However, their evaluation remains a challenge, as releasing a poorly performing conversational agent to the public can hurt users’ trust and developers’ reputation. One solution for offline evaluation is user simulation: synthetic users interact with the conversational agent to test it and identify potential flaws before public release. The objective of this project is to develop user simulators that produce realistic interactions with conversational agents in the domain of dataset search.
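To make the idea of user simulation concrete, the following is a minimal sketch of a simulated user talking to a toy dataset search agent. Everything in it is an illustrative assumption: the class names `ToyDatasetAgent` and `SimulatedUser`, the keyword-matching agent, and the rule that the user reveals one keyword per turn are hypothetical and do not reflect the U-Sim track setup or any existing system.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyDatasetAgent:
    """Hypothetical stand-in for a conversational dataset search system."""
    catalog: dict  # dataset name -> set of descriptive keywords

    def respond(self, utterance: str) -> list:
        # Return datasets whose keywords cover every word of the utterance.
        words = set(utterance.lower().split())
        return sorted(
            name for name, keywords in self.catalog.items() if words <= keywords
        )


@dataclass
class SimulatedUser:
    """Hypothetical rule-based user with a fixed information need."""
    need: list      # keywords, revealed one more per turn
    target: str     # the dataset this user would accept
    revealed: int = 0

    def next_utterance(self) -> Optional[str]:
        # Stop talking once every keyword has been revealed.
        if self.revealed >= len(self.need):
            return None
        self.revealed += 1
        return " ".join(self.need[: self.revealed])

    def is_satisfied(self, results: list) -> bool:
        # Assumption: the user is satisfied only by an unambiguous answer.
        return results == [self.target]


def simulate(user: SimulatedUser, agent: ToyDatasetAgent, max_turns: int = 5) -> list:
    """Run one simulated conversation and return (utterance, results) pairs."""
    transcript = []
    for _ in range(max_turns):
        utterance = user.next_utterance()
        if utterance is None:
            break
        results = agent.respond(utterance)
        transcript.append((utterance, results))
        if user.is_satisfied(results):
            break
    return transcript


if __name__ == "__main__":
    agent = ToyDatasetAgent({
        "eu-temperature": {"climate", "europe", "temperature"},
        "us-rainfall": {"climate", "usa", "rainfall"},
        "eu-census": {"europe", "population"},
    })
    user = SimulatedUser(need=["climate", "europe"], target="eu-temperature")
    for utterance, results in simulate(user, agent):
        print(f"user: {utterance!r} -> agent: {results}")
```

The resulting transcripts are what an offline evaluation would inspect, for example by counting the turns needed until the user is satisfied. A realistic simulator would replace the fixed keyword rule with a richer model of user behavior.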
The project is aligned with the U-Sim track @ TREC 2026, where participants are tasked with developing user simulators that interact with conversational systems to retrieve datasets in the scholarly domain. A critical question is whether the user simulators developed are good enough to substitute for real users and to support the training and evaluation of conversational agents.
The project has two main objectives:
The timeline for the project is as follows:
If there is interest, the Cologne Information Retrieval group is happy to support student groups in submitting their user simulator, along with a lab report, to the U-Sim track in September 2026.
Links and Resources:
-