Multi-User LLM Agents: Formulation & Challenges
We study a setting where a single LLM-based agent interacts with a set of users \( \mathcal{U} = \{u_1, \ldots, u_N\} \). Each user \( u_i \) acts as an independent principal, characterized by an authority persona \( p_i \), a private context \( C_i \), and a user-specific utility function \( U_i \). The agent observes a selectively shared context \( C^{\mathrm{share}} \) and outputs an action \( a \).
Unlike single-user interaction, the agent's decisions jointly affect multiple users. We model the interaction as a multi-objective decision problem, scalarized as a weighted sum of user utilities:
\[ \max_{a \in \mathcal{A}} \;\sum_{i=1}^{N} w_i \, U_i(a;\, C_i,\, p_i), \]
where \( \mathcal{A} \) is the set of actions available to the agent and \( w_i \geq 0 \) is an externally specified priority weight based on each user's role or authority level (e.g., assigning higher weight to a CEO than to an intern).
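As a minimal sketch of the weighted objective above, the following toy example picks the action with the highest weighted sum of utilities. The user names, actions, weights, and utility values are all hypothetical, and each user's private context \( C_i \) and persona \( p_i \) are assumed to be folded into that user's utility function.

```python
from typing import Callable, Dict, List

# Hypothetical toy setup: each utility already encodes the user's C_i and p_i.
Utility = Callable[[str], float]

def select_action(actions: List[str],
                  utilities: Dict[str, Utility],
                  weights: Dict[str, float]) -> str:
    """Return the action a maximizing sum_i w_i * U_i(a)."""
    def score(action: str) -> float:
        return sum(weights[user] * utilities[user](action) for user in utilities)
    return max(actions, key=score)

# Illustrative scheduling conflict between two users.
actions = ["meet_mon", "meet_tue"]
utilities = {
    "ceo": lambda a: 1.0 if a == "meet_tue" else 0.0,
    "intern": lambda a: 1.0 if a == "meet_mon" else 0.0,
}
weights = {"ceo": 3.0, "intern": 1.0}  # externally specified priorities

best = select_action(actions, utilities, weights)
```

With these illustrative weights the CEO's preference dominates. Different weights can change the selected action, which is why the weights are treated as an external input rather than something the agent infers.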
Why is this hard? Single-principal training assumptions
Modern LLMs are trained under a single-user assumption. Instruction tuning minimizes the negative log-likelihood of a reference response for a single user:
\[ \min_\theta \;\mathbb{E}_{(x,y)\sim\mathcal{D}_{\mathrm{SFT}}} \left[ -\sum_{t=1}^{|y|} \log p_\theta(y_t \mid x, y_{<t}) \right]. \]
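To make the per-example term of this objective concrete, here is a small sketch that computes the negative log-likelihood of one reference response from its per-token probabilities. The probabilities are made-up numbers, not model outputs.

```python
import math
from typing import List

def sft_nll(token_probs: List[float]) -> float:
    """Negative log-likelihood of one reference response y given prompt x.

    token_probs[t] stands for p_theta(y_t | x, y_<t); the values are assumed inputs.
    """
    return -sum(math.log(p) for p in token_probs)

# Hypothetical two-token response.
loss = sft_nll([0.5, 0.25])
```

Note that the loss depends only on the pair \( (x, y) \): nothing in it identifies which user the response is for.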
RLHF further reinforces this single-principal assumption by learning a scalar reward model from pairwise preferences:
\[ \max_\phi \;\mathbb{E}_{(x,y^+,y^-)\sim\mathcal{D}_{\mathrm{pref}}} \left[ \log \sigma\!\left(r_\phi(x, y^+) - r_\phi(x, y^-)\right) \right], \]
yielding a single scalar preference signal that conflates user-specific desiderata into one shared objective, making it difficult for the agent to represent multiple principals or reason about cross-user trade-offs.
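The pairwise objective can be sketched for a single preference pair as follows. The reward values are hypothetical scalars standing in for \( r_\phi(x, y^+) \) and \( r_\phi(x, y^-) \).

```python
import math

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

def pairwise_log_likelihood(r_pos: float, r_neg: float) -> float:
    """log sigma(r(x, y+) - r(x, y-)) for one preference pair."""
    return math.log(sigmoid(r_pos - r_neg))

# Hypothetical reward values for illustration.
ll_equal = pairwise_log_likelihood(0.0, 0.0)      # rewards tie
ll_separated = pairwise_log_likelihood(2.0, 0.0)  # preferred response scored higher
```

Each response receives one scalar score regardless of who expressed the preference, so two users with opposite preferences over the same pair would pull that score in opposite directions.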
As shown in Table 1, even in multi-user settings, existing LLM interfaces serialize inputs from different users into a single user role, which prevents explicit modeling of user identities, roles, and authority information. A native multi-user schema would instead give each user a distinct role.
| Template | Message Schema |
| --- | --- |
| Single-user | {"messages": [{"role": "system", "content": "..."}, {"role": "user", "content": "..."}, {"role": "assistant", "content": "..."}]} |
| Multi-user (serialized) | {"messages": [{"role": "system", "content": "..."}, {"role": "user", "content": "userA says:... userB says:..."}, {"role": "assistant", "content": "..."}]} |
| Multi-user (native) | {"messages": [{"role": "system", "content": "..."}, {"role": "userA", "content": "..."}, {"role": "userB", "content": "..."}, {"role": "assistant", "content": "..."}]} |
Table 1. Chat templates under the single-user assumption. Even in multi-user settings, existing LLM interfaces serialize inputs from different users into a single user role.
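The difference between the serialized and native templates can be sketched in a few lines. Both helper functions are hypothetical, and the native layout mirrors the table row rather than any existing chat API.

```python
from typing import Dict, List, Tuple

Message = Dict[str, str]

def serialize_turns(system: str, turns: List[Tuple[str, str]]) -> List[Message]:
    """Serialized template: collapse all users' turns into one 'user' message."""
    merged = " ".join(f"{name} says: {text}" for name, text in turns)
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": merged},
    ]

def native_turns(system: str, turns: List[Tuple[str, str]]) -> List[Message]:
    """Hypothetical native template: one role per user, identity kept in the schema."""
    return [{"role": "system", "content": system}] + [
        {"role": name, "content": text} for name, text in turns
    ]

turns = [("userA", "Book the room for Monday."), ("userB", "Book it for Tuesday.")]
serialized = serialize_turns("You are a scheduling agent.", turns)
native = native_turns("You are a scheduling agent.", turns)
```

In the serialized form, user identity survives only as free text inside the message content, so the model has to recover it by parsing. In the native form it is a structured field.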
Core Challenges
- User Role and Preference Modeling: The agent must reliably identify distinct users and model their individualized objectives and preferences.
- Information Asymmetry and Selective Visibility: Each user maintains a permission-scoped private context \( C_i \). The agent must manage information access and sharing: deciding which parts of each \( C_i \) can be used, what can be revealed, and to whom.
- Conflict Resolution: Different users may pursue conflicting objectives. The agent must make principled trade-offs when a solution cannot satisfy everyone.
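As one illustration of the selective-visibility challenge, the sketch below filters context items by recipient before anything is revealed. The `ContextItem` structure and its `visible_to` permission set are assumptions made for this example, not a mechanism described above.

```python
from dataclasses import dataclass, field
from typing import List, Set

@dataclass
class ContextItem:
    owner: str                # user whose private context C_i holds this item
    text: str
    visible_to: Set[str] = field(default_factory=set)  # other users allowed to see it

def visible_context(items: List[ContextItem], recipient: str) -> List[str]:
    """Return the texts the recipient may see: their own items plus items shared with them."""
    return [
        item.text
        for item in items
        if item.owner == recipient or recipient in item.visible_to
    ]

# Hypothetical contexts for two users.
items = [
    ContextItem("userA", "A's salary details"),
    ContextItem("userA", "A is free on Tuesday", {"userB"}),
    ContextItem("userB", "B's travel budget"),
]
for_b = visible_context(items, "userB")
for_a = visible_context(items, "userA")
```

A filter like this covers only what may be revealed to whom. Deciding which private items the agent may use internally when reasoning about a trade-off is a separate question that this sketch does not address.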