(openai responses): add websocket connection pool #4985
Conversation
```python
self._pool = utils.ConnectionPool[aiohttp.ClientWebSocketResponse](
    connect_cb=self._create_ws_conn,
    close_cb=self._close_ws,
    max_session_duration=3600,
```
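For context, a minimal sketch of the pooling pattern used here. This is a hypothetical stand-in for livekit's `utils.ConnectionPool` (the real implementation differs); it only illustrates the `connect_cb`/`close_cb`/`max_session_duration` contract visible in the diff:

```python
import time
from typing import Awaitable, Callable, Generic, TypeVar

T = TypeVar("T")


class ConnectionPool(Generic[T]):
    """Sketch: hand out pooled connections, retiring any past max age."""

    def __init__(
        self,
        *,
        connect_cb: Callable[[], Awaitable[T]],
        close_cb: Callable[[T], Awaitable[None]],
        max_session_duration: float = 3600.0,
    ) -> None:
        self._connect_cb = connect_cb
        self._close_cb = close_cb
        self._max_session_duration = max_session_duration
        self._idle: list[T] = []
        self._created_at: dict[int, float] = {}  # id(conn) -> creation time

    async def get(self) -> T:
        # Reuse an idle connection if it is still young enough.
        while self._idle:
            conn = self._idle.pop()
            age = time.monotonic() - self._created_at[id(conn)]
            if age < self._max_session_duration:
                return conn
            del self._created_at[id(conn)]
            await self._close_cb(conn)  # expired: close it and dial fresh
        conn = await self._connect_cb()
        self._created_at[id(conn)] = time.monotonic()
        return conn

    def put(self, conn: T) -> None:
        # Return a healthy connection to the idle set for later reuse.
        self._idle.append(conn)
```

The point of `max_session_duration` is that long-lived websocket sessions are eventually retired and re-dialed instead of being reused forever.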
But we can't reuse the same WS connection for different conversations, right? Even with store=True, the server has to rehydrate the chat history if the ws connection expects a different previous_response_id.
I could have worded my comment better, but each websocket connection is independent of previous_response_id
But the server-side in-memory cache for each connection seems very dependent on it:
On an active WebSocket connection, the service keeps one previous-response state in a connection-local in-memory cache (the most recent response). Continuing from that most recent response is fast because the service can reuse connection-local state. Because the previous-response state is retained only in memory and is not written to disk, you can use WebSocket mode in a way that is compatible with store=false and Zero Data Retention (ZDR).
If a previous_response_id is not in the in-memory cache, behavior depends on whether you store responses:
With store=true, the service may hydrate older response IDs from persisted state when available. Continuation can still work, but it usually loses the in-memory latency benefit.
With store=false (including ZDR), there is no persisted fallback. If the ID is uncached, the request returns previous_response_not_found.
Do we see previous_response_not_found when store=false if you reuse the same connection for two conversations?
Ah, we don't use `previous_response_id` when `store=False`; we just send the entire context in that case (relevant line)
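A sketch of that branching, with hypothetical function and parameter names (not the actual plugin code): when `store=False` the client never relies on `previous_response_id` and instead resends the full conversation each turn, which sidesteps the connection-local cache miss entirely:

```python
from typing import Optional


def build_request(
    store: bool,
    previous_response_id: Optional[str],
    full_context: list[dict],
    new_items: list[dict],
) -> dict:
    """Hypothetical sketch of the store=True/False branching discussed above."""
    if store and previous_response_id is not None:
        # Server can hit its connection-local cache or rehydrate from
        # persisted state, so send only the new items plus the pointer.
        return {
            "previous_response_id": previous_response_id,
            "input": new_items,
            "store": True,
        }
    # store=False (incl. ZDR): there is no persisted fallback, so never
    # depend on previous_response_id -- resend the entire context instead.
    return {"input": full_context + new_items, "store": False}
```

Under this scheme, reusing a pooled websocket across conversations is safe with `store=False` because no request ever references an ID the connection's cache might not hold.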
chenghao-mou
left a comment
mostly lgtm. Tested it locally and it worked well.
livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/responses/llm.py
```python
if isinstance(o, openai.BaseModel):
    return o.model_dump()
raise TypeError(f"unexpected type {type(o)}")

async def send_request(self, msg: dict) -> AsyncGenerator[dict, None]:
```
nit: the name seems misleading in the sense that it not only sends the request but also receives the responses. Maybe `process_request`?
how about generate_response?
chenghao-mou
left a comment
lgtm. One nit about APIStatusError construction.
```python
        f"OpenAI Responses WebSocket closed: {close_reason}",
        status_code=close_code,
        retryable=False,
    )
```
`raw_msg` has both `.data` and `.extra` that we can leverage:

```python
APIStatusError(
    "AssemblyAI connection closed unexpectedly",
    status_code=ws.close_code or -1,
    body=f"{msg.data=} {msg.extra=}",
)
```
when there are two parallel streams, restart and send the full context on the next request
each request will use its own websocket connection (edit: and each WS connection is independent of response IDs)