[TaskProcessing] Add audio-to-audio chat task type #53759
Conversation
'history' => new ShapeDescriptor(
	$this->l->t('Chat history'),
	$this->l->t('The history of chat messages before the current message, starting with a message by the user'),
	EShapeType::ListOfTexts
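For reference, the excerpt above is only the 'history' entry. Below is a hedged sketch of what the full input shape of such a task type class could look like; apart from 'history', the class name, field names and descriptions are assumptions modeled on the existing TextToTextChat task type, not the actual code of this PR.

use OCP\IL10N;
use OCP\TaskProcessing\EShapeType;
use OCP\TaskProcessing\ShapeDescriptor;

// Hypothetical task type class, names are illustrative
class AudioToAudioChat {
	public function __construct(
		private IL10N $l,
	) {
	}

	public function getInputShape(): array {
		return [
			'system_prompt' => new ShapeDescriptor(
				$this->l->t('System prompt'),
				$this->l->t('Define rules and assumptions that the assistant should follow during the conversation.'),
				EShapeType::Text
			),
			'input' => new ShapeDescriptor(
				$this->l->t('Chat voice message'),
				$this->l->t('Describe a task that you want the assistant to do or ask a question'),
				EShapeType::Audio
			),
			'history' => new ShapeDescriptor(
				$this->l->t('Chat history'),
				$this->l->t('The history of chat messages before the current message, starting with a message by the user'),
				EShapeType::ListOfTexts
			),
		];
	}
}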
Where do we get the texts from?
From the chat conversation history in the chat UI.
Ideally we would need a mixed list of text and audio, but since we will get the transcription of audio responses (as an optional output), my plan is to store it in a chat message so the history can stay text-only for now. Wdyt?
I see. But if the transcription is an optional output, we may not get it when other providers implement this task type. Do we schedule another transcription task then?
I think so, yes.
I even think the text output could be mandatory and part of the task type output shape (because we need it for the history). Wdyt?
Yeah, I was thinking that as well, let's make it mandatory
The input transcription is optional though. Or is it? We need it too for the history.
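To illustrate where this thread is heading, here is a hedged sketch of an output shape in which both transcripts are mandatory alongside the audio answer; the field names are assumptions, not necessarily the shape merged in this PR.

	public function getOutputShape(): array {
		return [
			'output' => new ShapeDescriptor(
				$this->l->t('Response voice message'),
				$this->l->t('The generated audio response of the assistant'),
				EShapeType::Audio
			),
			'output_transcript' => new ShapeDescriptor(
				$this->l->t('Response transcript'),
				$this->l->t('Transcription of the audio response, needed to keep a text-only chat history'),
				EShapeType::Text
			),
			'input_transcript' => new ShapeDescriptor(
				$this->l->t('Input transcript'),
				$this->l->t('Transcription of the audio input, also needed for the chat history'),
				EShapeType::Text
			),
		];
	}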
Force-pushed from a549444 to b9c1fc0
Force-pushed from b9c1fc0 to cce12a9
kyteinsky left a comment
🚀
Signed-off-by: Julien Veyssier <[email protected]>
Force-pushed from cce12a9 to af059cb
Rebased on master.
This is going towards making it possible to have a voice interaction with the Assistant.
The Assistant chat interface will schedule such audio-to-audio chat tasks if Agency is not available.
This does not cover Agency for now.
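As a rough illustration (not code from this PR), scheduling such a task from the Assistant backend could look like the sketch below; the task type ID, input keys and app ID are assumptions.

use OCP\TaskProcessing\IManager;
use OCP\TaskProcessing\Task;

// $manager is an injected OCP\TaskProcessing\IManager instance.
// 'core:audio2audio:chat' is a hypothetical ID for the new task type.
$task = new Task(
	'core:audio2audio:chat',
	[
		'system_prompt' => $systemPrompt,
		'input' => $audioFileId,   // audio input referenced as a file ID
		'history' => $history,     // text-only history, see discussion above
	],
	'assistant',                   // requesting app
	$userId,
);
$manager->scheduleTask($task);
// The Assistant can then pick up the audio answer and the transcripts
// from the finished task, e.g. via IManager::getTask() or task events.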
Do we make a ContextAgentVoiceInteraction task type that takes voice as input?
We just need to support voice input and voice output. In between, the interactions between the agent and the model are text-based. So the Context Agent app would most likely still schedule TextToTextChatWithTools tasks.
We can also take the easy road and do three steps when Agency is involved: STT -> Context Agent -> TTS. That way we would need no change in Context Agent and no new task type.
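A hedged sketch of that three-step chain using existing task types; the Context Agent and text-to-speech task type IDs and input keys are assumptions, and in practice each step would only be scheduled once the previous task has finished.

use OCP\TaskProcessing\Task;
use OCP\TaskProcessing\TaskTypes\AudioToText;

// 1. Speech-to-text on the user's voice message
$stt = new Task(AudioToText::ID, ['input' => $audioFileId], 'assistant', $userId);
$manager->scheduleTask($stt);

// 2. Hand the transcript to the Context Agent once the STT task is done
//    (the 'core:contextagent:interaction' ID and its input keys are assumptions)
$agent = new Task('core:contextagent:interaction', ['input' => $transcript], 'assistant', $userId);
$manager->scheduleTask($agent);

// 3. Text-to-speech on the agent's text answer
$tts = new Task('core:text2speech', ['input' => $agentAnswer], 'assistant', $userId);
$manager->scheduleTask($tts);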
To be discussed.