Member

@julien-nc julien-nc commented Jul 3, 2025

Add the ability to record and submit audio input in the chat UI, run an "Audio Chat" task, and display the audio output and its transcription.

  • Generic audio chat provider that schedules 3 sub tasks: STT, Chat, TTS
  • UI in the chatty UI
  • Chatty UI Backend: schedule the Audio chat task and create the messages accordingly
  • Support Agency
  • Prevent cleaning up chat attachments
  • Answer with audio even after an agency confirmation
  • Add user setting to toggle autoplay
  • Store the audio_id optional output of the AudioToAudioChat tasks and use it to build the history instead of only using the transcripts
  • Store and use audio_expires_at optional output of the AudioToAudioChat tasks
  • Delete related tasks when deleting a session or a message
  • Fix a security issue that let a user delete any message (check message.sessionId in ChattyLLMController::deleteMessage)
  • Fix a race condition in ChattyLLMController::checkMessageGenerationTask, which failed if the message generation task finished successfully but no message had been created yet
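
The three-step flow of the generic audio chat provider (STT, then Chat, then TTS) could be sketched roughly as follows. This is a minimal illustration, not the provider's actual code: `runAudioChat` and the three callables are hypothetical stand-ins for the sub tasks, file IDs are assumed to be integers, and the `input_transcript` key is an assumption (only `output` and `output_transcript` appear in this PR's code).

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch: chain speech-to-text, chat and text-to-speech steps.
 * The callables stand in for the real sub tasks; none of these names are
 * taken from the Nextcloud API.
 *
 * @param callable(int): string           $speechToText audio file ID -> transcript
 * @param callable(string, array): string $chat         prompt + history -> answer text
 * @param callable(string): int           $textToSpeech answer text -> audio file ID
 */
function runAudioChat(
    int $inputAudioFileId,
    array $history,
    callable $speechToText,
    callable $chat,
    callable $textToSpeech
): array {
    $inputTranscript = $speechToText($inputAudioFileId); // 1. STT
    $answer = $chat($inputTranscript, $history);         // 2. Chat
    $outputAudioFileId = $textToSpeech($answer);         // 3. TTS

    return [
        'input_transcript' => $inputTranscript,  // assumed key
        'output' => $outputAudioFileId,          // audio file ID
        'output_transcript' => $answer,          // text of the spoken answer
    ];
}

// Usage with stubbed steps instead of real providers.
$result = runAudioChat(
    7,
    [],
    fn (int $fileId): string => 'hello',
    fn (string $prompt, array $history): string => 'You said: ' . $prompt,
    fn (string $text): int => 42
);
```

Passing the steps in as callables keeps the sketch independent of any concrete STT, chat or TTS provider.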

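The message.sessionId check mentioned in the list above amounts to something like the sketch below. The `Message` class, the in-memory array and `deleteMessageInSession` are hypothetical; the real controller works with mappers and the database, and is assumed to have already verified that the session belongs to the current user.

```php
<?php
declare(strict_types=1);

// Hypothetical, simplified message record.
final class Message {
    public function __construct(
        public int $id,
        public int $sessionId
    ) {
    }
}

/**
 * Delete a message only if it belongs to the given session.
 *
 * @param array<int, Message> $messagesById in-memory stand-in for the message table
 * @return bool true if the message was deleted, false if it was refused
 */
function deleteMessageInSession(array &$messagesById, int $sessionId, int $messageId): bool {
    $message = $messagesById[$messageId] ?? null;
    // Refuse unknown messages and messages that belong to another session.
    if ($message === null || $message->sessionId !== $sessionId) {
        return false;
    }
    unset($messagesById[$messageId]);
    return true;
}

// Usage: message 2 lives in session 20, so deleting it via session 10 is refused.
$messages = [1 => new Message(1, 10), 2 => new Message(2, 20)];
$refused = deleteMessageInSession($messages, 10, 2);
$deleted = deleteMessageInSession($messages, 10, 1);
```

Without the session comparison, knowing a message ID alone would be enough to delete it.
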
Requirements to test this

  • Use server's up-to-date master
  • Make sure there's a provider for audio chat or agency audio chat
    • integration_openai is an AudioToAudioChat provider
    • assistant contains providers for audio chat and agency audio chat; they need providers for STT, TTS and TextToTextChat

@julien-nc julien-nc added the "enhancement" (New feature or request) and "3. to review" labels Jul 3, 2025
@julien-nc julien-nc force-pushed the enh/noid/audio-chat branch 2 times, most recently from c546092 to 641cb96 Compare July 3, 2025 11:49
@julien-nc julien-nc force-pushed the enh/noid/audio-chat branch 3 times, most recently from ecdbccb to 922f04a Compare July 4, 2025 15:54
if (class_exists('OCP\\TaskProcessing\\TaskTypes\\AudioToAudioChat')
    && $taskTypeId === \OCP\TaskProcessing\TaskTypes\AudioToAudioChat::ID) {
    $message->setContent(trim($task->getOutput()['output_transcript'] ?? ''));
    $message->setAttachments('[{"type":"Audio","fileId":' . $task->getOutput()['output'] . '}]');
Member

The output file is deleted by the task processing cleanup job after a while. Do we want to make sure that it lives longer than that?

Member Author

Yes, we might wanna avoid cleaning up the chat-related data.
But then we should correctly clean up when a chat session is deleted.

@julien-nc julien-nc force-pushed the enh/noid/ui-adjustments branch from 1f6e0aa to fd6004f Compare July 8, 2025 08:24
Base automatically changed from enh/noid/ui-adjustments to main July 8, 2025 08:28
@julien-nc julien-nc force-pushed the enh/noid/audio-chat branch from 16bfe1f to 495ce20 Compare July 8, 2025 08:28
@julien-nc julien-nc force-pushed the enh/noid/audio-chat branch 3 times, most recently from 9b3da26 to 403400b Compare July 9, 2025 00:31
@julien-nc julien-nc force-pushed the enh/noid/audio-chat branch from c575a42 to 3bde1cf Compare July 10, 2025 10:16
@julien-nc julien-nc requested a review from kyteinsky July 10, 2025 10:19
Contributor

@kyteinsky kyteinsky left a comment

sweet, works seamlessly!

  1. Not changed in this PR, but the audio output in Chrome-family browsers looks a bit odd. It's not a problem in Firefox. To unify the look, border-radius: 100px; can be used for all the audio elements.

before:
image
image

after:
image
image

  2. It would be nice to update the title of the chat after the transcript of the first audio comes in.

Comment on lines 229 to 236
try {
    $task = $this->taskProcessingManager->getTask($ocpTaskId);
    $this->taskProcessingManager->deleteTask($task);
} catch (\OCP\TaskProcessing\Exception\Exception) {
}
Contributor

Let's not silence the exception; let the user know that the task deletion was unsuccessful. It will most likely be a DB failure, which would throw in the next call, $this->sessionMapper->deleteSession, anyway.

Member Author

We don't care that much if the task has not been deleted. It can be caused by either:

  • the task is not found: all good nothing to delete
  • the task couldn't be deleted for some reason: all good it will be cleaned up later by the task processing cleanup bg job

I think the user does not need to know there are tasks behind the scenes.
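
A best-effort deletion along these lines, with the two cases written down as comments, might look like the sketch below. It is only an illustration: `TaskException` stands in for \OCP\TaskProcessing\Exception\Exception, and `deleteTaskBestEffort` and its callables are hypothetical replacements for the task processing manager calls.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for \OCP\TaskProcessing\Exception\Exception.
class TaskException extends \Exception {
}

/**
 * Try to delete a task and deliberately ignore failures.
 *
 * @param callable(int): object  $getTask    looks a task up, may throw TaskException
 * @param callable(object): void $deleteTask deletes a task, may throw TaskException
 * @return bool true if the task was deleted, false if the failure was ignored
 */
function deleteTaskBestEffort(callable $getTask, callable $deleteTask, int $taskId): bool {
    try {
        $deleteTask($getTask($taskId));
        return true;
    } catch (TaskException) {
        // Ignored on purpose:
        // - task not found: nothing to delete
        // - task could not be deleted: the cleanup background job removes it later
        return false;
    }
}

// Usage: a missing task does not interrupt the caller.
$missing = deleteTaskBestEffort(
    function (int $id): object {
        throw new TaskException('task not found');
    },
    function (object $task): void {
    },
    123
);

// Usage: an existing task is deleted normally.
$removed = deleteTaskBestEffort(
    fn (int $id): object => new \stdClass(),
    function (object $task): void {
    },
    124
);
```

Only the task-specific exception type is caught, so unrelated errors still propagate to the caller.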

Contributor

makes sense.
Would you mind adding these two points as comments there too?

Member Author

Done

@julien-nc julien-nc requested a review from kyteinsky July 11, 2025 14:54
Member Author

@kyteinsky Thank you for the review! I think I addressed everything.

julien-nc added 20 commits July 15, 2025 10:24
…s in the backend, schedule proper audioChat task

Signed-off-by: Julien Veyssier <[email protected]>
…f the server task type exists

Signed-off-by: Julien Veyssier <[email protected]>
…t /task-types endpoint

Signed-off-by: Julien Veyssier <[email protected]>
… the logic to schedule audio tasks for agency

Signed-off-by: Julien Veyssier <[email protected]>
…al settings task type availability values

Signed-off-by: Julien Veyssier <[email protected]>
…an use in the history instead of the transcripts

Signed-off-by: Julien Veyssier <[email protected]>
… prevent deleting a message from another session

Signed-off-by: Julien Veyssier <[email protected]>
…t ContextAgentAudioInteraction in the assistant task type list, change the audio html tag border radius

Signed-off-by: Julien Veyssier <[email protected]>
…sing phpDoc, rename autoplay_audio_chat and audio_chat_available initial states

Signed-off-by: Julien Veyssier <[email protected]>
Signed-off-by: Julien Veyssier <[email protected]>
… do it when we get its transcript

Signed-off-by: Julien Veyssier <[email protected]>
@julien-nc julien-nc force-pushed the enh/noid/audio-chat branch from f76d909 to 1bec314 Compare July 15, 2025 08:52
Member

@marcelklehr marcelklehr left a comment

  • Prevent cleaning up chat attachments (Or just copy them to user directory so we can manage them ourselves)
  • lint:fix

@julien-nc julien-nc force-pushed the enh/noid/audio-chat branch from 1bec314 to cf6648a Compare July 15, 2025 08:59
@julien-nc julien-nc requested a review from marcelklehr July 15, 2025 08:59
Contributor

@kyteinsky kyteinsky left a comment

🚀

@julien-nc julien-nc merged commit ec6f230 into main Jul 15, 2025
16 checks passed
@julien-nc julien-nc deleted the enh/noid/audio-chat branch July 15, 2025 09:39
@janepie janepie mentioned this pull request Aug 7, 2025

5 participants