
Add the API documentation for streaming TTS (Text-to-Speech) (#6382)

chenxu9741 10 months ago
parent
commit
a6dbd26f75

+ 18 - 6
web/app/components/develop/template/template.en.mdx

@@ -103,6 +103,16 @@ The text generation application offers non-session support and is ideal for tran
       - `metadata` (object) Metadata
         - `usage` (Usage) Model usage information
         - `retriever_resources` (array[RetrieverResource]) Citation and Attribution List
+    - `event: tts_message` TTS audio stream event carrying speech synthesis output. The content is an MP3 audio chunk encoded as a base64 string; to play it, decode the base64 and feed the bytes into the player. (This event is sent only when auto-play is enabled.)
+      - `task_id` (string) Task ID, used for request tracking and the stop response interface below
+      - `message_id` (string) Unique message ID
+      - `audio` (string) The synthesized audio chunk, encoded as base64 text. To play it, decode the base64 and feed the bytes into the player
+      - `created_at` (int) Creation timestamp, e.g.: 1705395332
+    - `event: tts_message_end` TTS audio stream end event. Receiving this event indicates the end of the audio stream.
+      - `task_id` (string) Task ID, used for request tracking and the stop response interface below
+      - `message_id` (string) Unique message ID
+      - `audio` (string) The end event has no audio, so this is an empty string
+      - `created_at` (int) Creation timestamp, e.g.: 1705395332
     - `event: message_replace` Message content replacement event.
       When output content moderation is enabled, if the content is flagged, then the message content will be replaced with a preset reply through this event.
       - `task_id` (string) Task ID, used for request tracking and the below Stop Generate API
@@ -185,6 +195,8 @@ The text generation application offers non-session support and is ideal for tran
       data: {"event": "message", "message_id": "5ad4cb98-f0c7-4085-b384-88c403be6290", "answer": " meet", "created_at": 1679586595}
       data: {"event": "message", "message_id": "5ad4cb98-f0c7-4085-b384-88c403be6290", "answer": " you", "created_at": 1679586595}
       data: {"event": "message_end", "id": "5e52ce04-874b-4d27-9045-b3bc80def685", "metadata": {"usage": {"prompt_tokens": 1033, "prompt_unit_price": "0.001", "prompt_price_unit": "0.001", "prompt_price": "0.0010330", "completion_tokens": 135, "completion_unit_price": "0.002", "completion_price_unit": "0.001", "completion_price": "0.0002700", "total_tokens": 1168, "total_price": "0.0013030", "currency": "USD", "latency": 1.381760165997548}}}
+      data: {"event": "tts_message", "conversation_id": "23dd85f3-1a41-4ea0-b7a9-062734ccfaf9", "message_id": "a8bdc41c-13b2-4c18-bfd9-054b9803038c", "created_at": 1721205487, "task_id": "3bf8a0bb-e73b-4690-9e66-4e429bad8ee7", "audio": "qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq"}
+      data: {"event": "tts_message_end", "conversation_id": "23dd85f3-1a41-4ea0-b7a9-062734ccfaf9", "message_id": "a8bdc41c-13b2-4c18-bfd9-054b9803038c", "created_at": 1721205487, "task_id": "3bf8a0bb-e73b-4690-9e66-4e429bad8ee7", "audio": ""}
     ```
     </CodeGroup>
   </Col>
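
The `tts_message` handling described in this hunk can be sketched in Python as follows. This is a minimal illustration, not part of the commit: the function name `collect_tts_audio` is hypothetical, the input is assumed to be the response body already split into text lines, and it is an assumption that concatenating the decoded chunks in arrival order yields a playable MP3 stream.

```python
import base64
import json
from typing import Iterable


def collect_tts_audio(sse_lines: Iterable[str]) -> bytes:
    """Decode and concatenate the audio of tts_message events.

    Stops at the first tts_message_end event. Events of any other
    type (message, message_end, ...) are ignored.
    """
    chunks = []
    for raw in sse_lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # blank separators and non-data fields
        try:
            event = json.loads(line[len("data:"):].strip())
        except json.JSONDecodeError:
            continue  # skip payloads that are not valid JSON
        kind = event.get("event")
        if kind == "tts_message":
            # Each chunk is base64-encoded on its own, so decode per event.
            chunks.append(base64.b64decode(event.get("audio", "")))
        elif kind == "tts_message_end":
            break  # the end event carries an empty audio string
    return b"".join(chunks)
```

A player that supports streaming could consume each decoded chunk as it arrives instead of waiting for `tts_message_end`; collecting into one `bytes` object just keeps the sketch short.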
@@ -495,29 +507,29 @@ The text generation application offers non-session support and is ideal for tran
     ### Request Body
 
     <Properties>
+      <Property name='message_id' type='str' key='message_id'>
+        For text messages generated by Dify, pass the generated message_id. The backend uses it to look up the message content and synthesize speech from it. If both message_id and text are provided, message_id takes priority.
+      </Property>
       <Property name='text' type='str' key='text'>
         The text content to synthesize into speech.
       </Property>
       <Property name='user' type='string' key='user'>
         The user identifier, defined by the developer, must ensure uniqueness within the app.
       </Property>
-      <Property name='streaming' type='bool' key='streaming'>
-        Whether to enable streaming output, true、false。
-      </Property>
     </Properties>
   </Col>
   <Col sticky>
 
-    <CodeGroup title="Request" tag="POST" label="/text-to-audio" targetCode={`curl -o text-to-audio.mp3 -X POST '${props.appDetail.api_base_url}/text-to-audio' \\\n--header 'Authorization: Bearer {api_key}' \\\n--header 'Content-Type: application/json' \\\n--data-raw '{\n    "text": "Hello Dify",\n    "user": "abc-123",\n    "streaming": false\n}'`}>
+    <CodeGroup title="Request" tag="POST" label="/text-to-audio" targetCode={`curl -o text-to-audio.mp3 -X POST '${props.appDetail.api_base_url}/text-to-audio' \\\n--header 'Authorization: Bearer {api_key}' \\\n--header 'Content-Type: application/json' \\\n--data-raw '{\n    "message_id": "5ad4cb98-f0c7-4085-b384-88c403be6290",\n    "text": "Hello Dify",\n    "user": "abc-123"\n}'`}>
 
     ```bash {{ title: 'cURL' }}
     curl -o text-to-audio.mp3 -X POST '${props.appDetail.api_base_url}/text-to-audio' \
     --header 'Authorization: Bearer {api_key}' \
     --header 'Content-Type: application/json' \
     --data-raw '{
+        "message_id": "5ad4cb98-f0c7-4085-b384-88c403be6290",
         "text": "Hello Dify",
-        "user": "abc-123",
-        "streaming": false
+        "user": "abc-123"
     }'
     ```
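
For readers calling the endpoint from code rather than cURL, the same request can be sketched in Python with only the standard library. The helper name `build_text_to_audio_request` is hypothetical, and rejecting a call that supplies neither `message_id` nor `text` is an assumption of this sketch; the documentation above only states that `message_id` takes priority when both are given.

```python
import json
import urllib.request
from typing import Optional


def build_text_to_audio_request(
    api_base_url: str,
    api_key: str,
    user: str,
    message_id: Optional[str] = None,
    text: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) a POST request for /text-to-audio."""
    if not message_id and not text:
        # Assumption: the server needs at least one source of content.
        raise ValueError("provide message_id or text")
    payload = {"user": user}
    if message_id:
        payload["message_id"] = message_id
    if text:
        payload["text"] = text
    return urllib.request.Request(
        url=f"{api_base_url.rstrip('/')}/text-to-audio",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` and writing the response body to a file would mirror the `curl -o text-to-audio.mp3` call above.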
     

+ 19 - 6
web/app/components/develop/template/template.zh.mdx

@@ -97,12 +97,22 @@ import { Row, Col, Properties, Property, Heading, SubProperty } from '../md.tsx'
       - `message_id` (string) 消息唯一 ID
       - `answer` (string) LLM 返回文本块内容
       - `created_at` (int) 创建时间戳,如:1705395332
-    - `event: message_end` 消息结束事件,收到此事件则代表流式返回结束。
+    - `event: message_end` 消息结束事件,收到此事件则代表文本流式返回结束。
       - `task_id` (string) 任务 ID,用于请求跟踪和下方的停止响应接口
       - `message_id` (string) 消息唯一 ID
       - `metadata` (object) 元数据
         - `usage` (Usage) 模型用量信息
         - `retriever_resources` (array[RetrieverResource]) 引用和归属分段列表
+    - `event: tts_message` TTS 音频流事件,即:语音合成输出。内容是Mp3格式的音频块,使用 base64 编码后的字符串,播放的时候直接解码即可。(开启自动播放才有此消息)
+      - `task_id` (string) 任务 ID,用于请求跟踪和下方的停止响应接口
+      - `message_id` (string) 消息唯一 ID
+      - `audio` (string) 语音合成之后的音频块使用 Base64 编码之后的文本内容,播放的时候直接 base64 解码送入播放器即可
+      - `created_at` (int) 创建时间戳,如:1705395332
+    - `event: tts_message_end` TTS 音频流结束事件,收到这个事件表示音频流返回结束。
+      - `task_id` (string) 任务 ID,用于请求跟踪和下方的停止响应接口
+      - `message_id` (string) 消息唯一 ID
+      - `audio` (string) 结束事件是没有音频的,所以这里是空字符串
+      - `created_at` (int) 创建时间戳,如:1705395332
     - `event: message_replace` 消息内容替换事件。
       开启内容审查和审查输出内容时,若命中了审查条件,则会通过此事件替换消息内容为预设回复。
       - `task_id` (string) 任务 ID,用于请求跟踪和下方的停止响应接口
@@ -162,6 +172,8 @@ import { Row, Col, Properties, Property, Heading, SubProperty } from '../md.tsx'
     ```streaming {{ title: 'Response' }}
       data: {"id": "5ad4cb98-f0c7-4085-b384-88c403be6290", "answer": " I", "created_at": 1679586595}
       data: {"id": "5ad4cb98-f0c7-4085-b384-88c403be6290", "answer": " I", "created_at": 1679586595}
+      data: {"event": "tts_message", "conversation_id": "23dd85f3-1a41-4ea0-b7a9-062734ccfaf9", "message_id": "a8bdc41c-13b2-4c18-bfd9-054b9803038c", "created_at": 1721205487, "task_id": "3bf8a0bb-e73b-4690-9e66-4e429bad8ee7", "audio": "qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq"}
+      data: {"event": "tts_message_end", "conversation_id": "23dd85f3-1a41-4ea0-b7a9-062734ccfaf9", "message_id": "a8bdc41c-13b2-4c18-bfd9-054b9803038c", "created_at": 1721205487, "task_id": "3bf8a0bb-e73b-4690-9e66-4e429bad8ee7", "audio": ""}
     ```
     </CodeGroup>
   </Col>
@@ -456,26 +468,27 @@ import { Row, Col, Properties, Property, Heading, SubProperty } from '../md.tsx'
     ### Request Body
 
     <Properties>
+      <Property name='message_id' type='str' key='text'>
+        Dify 生成的文本消息,那么直接传递生成的message-id 即可,后台会通过 message_id 查找相应的内容直接合成语音信息。如果同时传 message_id 和 text,优先使用 message_id。
+      </Property>
       <Property name='text' type='str' key='text'>
-        语音生成内容。
+        语音生成内容。如果没有传 message-id的话,则会使用这个字段的内容
       </Property>
       <Property name='user' type='string' key='user'>
         用户标识,由开发者定义规则,需保证用户标识在应用内唯一。
       </Property>
-      <Property name='streaming' type='bool' key='streaming'>
-        是否启用流式输出true、false。
-      </Property>
     </Properties>
   </Col>
   <Col sticky>
 
-    <CodeGroup title="Request" tag="POST" label="/text-to-audio" targetCode={`curl -o text-to-audio.mp3 -X POST '${props.appDetail.api_base_url}/text-to-audio' \\\n--header 'Authorization: Bearer {api_key}' \\\n--header 'Content-Type: application/json' \\\n--data-raw '{\n    "text": "你好Dify",\n    "user": "abc-123",\n    "streaming": false\n}'`}>
+    <CodeGroup title="Request" tag="POST" label="/text-to-audio" targetCode={`curl -o text-to-audio.mp3 -X POST '${props.appDetail.api_base_url}/text-to-audio' \\\n--header 'Authorization: Bearer {api_key}' \\\n--header 'Content-Type: application/json' \\\n--data-raw '{\n    "message_id": "5ad4cb98-f0c7-4085-b384-88c403be6290",\n    "text": "你好Dify",\n    "user": "abc-123"\n}'`}>
 
     ```bash {{ title: 'cURL' }}
     curl -o text-to-audio.mp3 -X POST '${props.appDetail.api_base_url}/text-to-audio' \
     --header 'Authorization: Bearer {api_key}' \
     --header 'Content-Type: application/json' \
     --data-raw '{
+        "message_id": "5ad4cb98-f0c7-4085-b384-88c403be6290",
         "text": "你好Dify",
         "user": "abc-123",
         "streaming": false

File diff suppressed because the file is too large
+ 17 - 4
web/app/components/develop/template/template_advanced_chat.en.mdx


File diff suppressed because the file is too large
+ 19 - 7
web/app/components/develop/template/template_advanced_chat.zh.mdx


File diff suppressed because the file is too large
+ 20 - 6
web/app/components/develop/template/template_chat.en.mdx


File diff suppressed because the file is too large
+ 20 - 6
web/app/components/develop/template/template_chat.zh.mdx


File diff suppressed because the file is too large
+ 12 - 0
web/app/components/develop/template/template_workflow.en.mdx


File diff suppressed because the file is too large
+ 12 - 0
web/app/components/develop/template/template_workflow.zh.mdx