UX for Agents, Part 1: Chat — AI reading summary by 包阅AI

包阅AI Reading Digest

1. Keywords: Agent UX, Chat, Streaming Chat, Non-streaming Chat, Human-Agent Interaction

2. Summary: This article explores UX for agents, focusing on chat as the dominant UX paradigm. It covers streaming and non-streaming chat, weighs their pros and cons, notes that UX patterns may shift as trust in agentic systems grows, and stresses that chat is a widely used and very good UX.

3. Main points:

– In a previous talk the author named three limitations for agents; this series dives deeper into agent UX, and this is part one.

– Human-Agent Interaction will become a key research area; agentic systems face new challenges and will need new UI/UX paradigms.

– The most dominant UX paradigm today is chat, in both streaming and non-streaming forms.

– Streaming chat: the user interacts with the LLM directly through natural language, can see what is happening under the hood, and can correct it easily; but existing chat platforms don't have it built in, it is awkward for longer-running tasks, and it must be human-triggered.

– Non-streaming chat: responses come back in completed batches and the user can't see under the hood; once trust is established, tasks can be delegated, it integrates more easily into existing workflows, and longer AI response times are acceptable; but issues such as "double-texting" arise.

– UX patterns may shift as trust in agentic systems grows; chat remains a very good UX.


Article URL: https://blog.langchain.dev/ux-for-agents-part-1-chat-2/

Source: blog.langchain.dev

Author: Harrison Chase

Published: 2024/8/2 0:03

Language: English

Word count: 1,005

Estimated reading time: 5 minutes

Score: 90

Tags: Human-Computer Interaction, Chat UX, Agent UX, AI Systems, LLM Interaction


The original article follows.


At Sequoia’s AI Ascent conference in March, I talked about three limitations for agents: planning, UX, and memory. Check out that talk here. In this post I will dive deeper into UX for agents. Thanks to Nuno Campos, LangChain founding engineer, for many of the original thoughts and analogies here.

Because there are so many different aspects of UX for agents, this topic will be split into three separate blogs. This is the first in the series.

Human-Computer Interaction has been a well-studied area for years. I believe that in the coming years, Human-Agent Interaction will also become a key area of research.

Agentic systems differ from traditional computer systems of the past due to new challenges stemming from latency, unreliability, and natural language interfaces. As such, I strongly believe that new UI/UX paradigms for interacting with these agentic applications will emerge.

While it’s still early days for agentic systems, I think there are multiple emerging UX paradigms. In this blog we will discuss perhaps the most dominant UX so far: chat.

Streaming Chat

The “streaming chat” UX is the most dominant UX so far. This quite simply is an agentic system that streams back its thoughts and actions in a chat format — ChatGPT is the most popular example. This interaction pattern seems basic, but is actually quite good for a few reasons.

The main way to “program” an LLM is with natural language. In chat, you interact directly with the LLM through natural language. This means there is little to no barrier between you and the LLM.

💡

In some ways, streaming chat is the “terminal” of early computers.

A terminal (especially in early computers) provides lower-level and more direct access to the underlying OS. But over time, computers have shifted to more UI-based interactions. Streaming chat may be similar – it’s the first way we built to interact with LLMs, and it provides pretty direct access to the underlying LLM. Over time, other UXs may emerge (just as computers became more UI-based) – but low-level access has significant benefits, especially at the start!

One of the reasons that streaming chat is great is that LLMs can take a while to work. Streaming enables the user to understand exactly what is happening under the hood. You can stream back both the intermediate actions the LLM takes (which actions it takes, and what their results are) and the tokens as the LLM “thinks”.
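A minimal sketch of this pattern, assuming an invented `agent_events()` generator standing in for a real agent loop (these event types are illustrative, not LangChain APIs): the UI consumes a stream of events and renders tool actions and answer tokens the moment they arrive.

```python
from dataclasses import dataclass
from typing import Iterator, Union

@dataclass
class ToolEvent:
    """An intermediate action: which tool ran and what it returned."""
    tool: str
    result: str

@dataclass
class TokenEvent:
    """One token of the model's final answer."""
    token: str

def agent_events(question: str) -> Iterator[Union[ToolEvent, TokenEvent]]:
    """Stand-in for a real agent loop: one tool call, then streamed tokens."""
    yield ToolEvent(tool="search", result=f"(stub results for {question!r})")
    for token in ["The ", "answer ", "is ", "4."]:
        yield TokenEvent(token=token)

def render_stream(question: str) -> str:
    """Print events as they arrive so the user sees work in progress."""
    answer: list[str] = []
    for event in agent_events(question):
        if isinstance(event, ToolEvent):
            print(f"[tool: {event.tool}] {event.result}")
        else:
            print(event.token, end="", flush=True)
            answer.append(event.token)
    print()
    return "".join(answer)
```

The point of the design is that the tool events and the token events share one stream, so the user watches the agent's intermediate work rather than staring at a spinner.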

Another benefit of streaming chat is that LLMs can mess up often. Chat provides a great interface to naturally correct and guide it! We’re very used to having follow-up conversations and discussing things iteratively over chat already.

Still, streaming chat has its drawbacks. First, streaming chat is a relatively new UX, so our existing chat platforms (iMessage, Facebook Messenger, Slack, etc.) don’t have this mode built in. Second, it’s a bit awkward for longer-running tasks — am I just going to sit there and watch the agent work? Third, streaming chat generally needs to be triggered by a human, which means the human is still very much in the loop.

Non-streaming Chat

It feels odd to call it “non-streaming” chat, since we would have just called this “chat” up until two years ago — but here we are. Non-streaming chat has many of the same properties as streaming chat: it exposes the LLM pretty directly to the user, and it allows for very natural corrections.

The big difference for non-streaming chat is that responses come back in completed batches, which has its pros and cons. The main con is that you can’t see what’s going on under the hood, leaving you in the dark.

But… is that actually okay?

Linus Lee had some great thoughts on “delegation” recently that I really liked. A snippet just to illustrate:

I intentionally built the interface to be as opaque as possible.

He argues that an opaque interface requires a certain amount of trust, but once established, allows you to just delegate tasks to the agent without micro-managing. This async nature also lends itself to longer-running tasks – which means agents doing more work for you.

Assuming trust is established, this seems good. But it also opens up other issues. For example, how do you handle “double-texting” — where the user messages once, the agent starts doing something, and then the user messages again with a different (and sometimes unrelated) thought before the agent finishes its task. With streaming chat, you generally don’t have this issue because the streaming of the agent blocks the user from typing new input.
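One way to think about double-texting is to pick an explicit policy per incoming message: interrupt the running task, or queue the message until the current task finishes. The sketch below implements the simplest "queue" policy; the `ChatSession` class and its behavior are illustrative assumptions, not from the post.

```python
import asyncio

class ChatSession:
    """Toy non-streaming session that queues messages arriving mid-task."""

    def __init__(self) -> None:
        self.inbox: asyncio.Queue[str] = asyncio.Queue()
        self.replies: list[str] = []

    async def handle(self, message: str) -> None:
        # Simulate a long-running agent task for each message.
        await asyncio.sleep(0.01)
        self.replies.append(f"done: {message}")

    async def run(self) -> None:
        # Process messages strictly one at a time: a second message sent
        # while the agent is busy simply waits its turn ("queue" policy).
        while not self.inbox.empty():
            message = await self.inbox.get()
            await self.handle(message)

async def demo() -> list[str]:
    session = ChatSession()
    # The user "double-texts": a second message lands before any reply.
    session.inbox.put_nowait("summarize this doc")
    session.inbox.put_nowait("actually, translate it instead")
    await session.run()
    return session.replies
```

Running `asyncio.run(demo())` yields one reply per message, in order. Other policies — cancelling the in-flight task, or merging the second message into the first's context — trade responsiveness against wasted work, and a real system has to choose deliberately.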

Another benefit of the non-streaming chat UX is that it feels much more native to us, which means it may be easier to integrate into existing workflows. People are used to texting humans – why shouldn’t they easily adapt to texting with an AI?

💡

Another large benefit of non-streaming chat is that it’s often acceptable for the AI to take longer to respond.

This is often due to non-streaming chat being integrated more natively into our existing workflows. We don’t expect our friends to text us back instantaneously – why should we expect an AI to? This makes it easier to interact with more complex agentic systems – these systems often take a while, and if there is the expectation of an instantaneous response that could be frustrating. Non-streaming chat often removes that expectation, making it easier to do more complex things.

It may initially seem that streaming is newer, flashier, and more futuristic than standard chat… but as we trust our agentic systems more, will this reverse?

Is there more than just chat?

As this is just part one of a three-part series, we believe there are more UXs to consider beyond chat. Still, it is worth remembering that chat is a very good UX, and that there’s a reason it’s so widely used.

Benefits of chat:

  • Allows user to interact directly with the model
  • Allows for easy follow up questions and/or corrections

Pros/Cons of streaming vs. non-streaming chat:

  • Streaming: exposes intermediate actions and tokens as they happen, and makes mid-conversation corrections natural; but it isn’t built into existing chat platforms, is awkward for longer-running tasks, and generally needs a human trigger.
  • Non-streaming: opaque, but once trust is established it supports delegation and longer-running tasks, fits existing messaging workflows, and relaxes the expectation of an instant response; but “double-texting” must be handled explicitly.
