Microsoft Speech Recognition with Custom Speech Protocol (Xamarin Android, Websocket)
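I am trying to build continuous speech recognition from the microphone using Microsoft Cognitive Speech on Xamarin Android. I don't think there is a library for Xamarin. The documentation is: https://docs.microsoft.com/en-us/azure/cognitive-services/speech/api-reference-rest/websocketprotocol
I have the WebSocket connection working, but now I am struggling to send messages to the WebSocket server. I noticed this in the documentation: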
We have to send Headers on a specific Path everytime we send a Message
For example, these headers are used to send the first configuration message of the speech protocol:
Path : speech.config
X-Timestamp : Client UTC clock time stamp in ISO 8601 format
Content-Type : application/json; charset=utf-8
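I am using WebSocketClient, but I can't find any way to set headers or change the path. Is there a way to set the headers and/or change the path so that I can send messages to the server correctly? Or have I misunderstood something?
My second problem is that WebSocketClient doesn't have any event handler for receiving messages. This is what I did: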
private static async Task DataReceiving(ClientWebSocket ws)
{
    while (true)
    {
        ArraySegment<byte> bytesReceived = new ArraySegment<byte>(new byte[1024]);
        WebSocketReceiveResult result = await ws.ReceiveAsync(
            bytesReceived, CancellationToken.None);
        Log.Info("SOCKETRECEIVED", Encoding.UTF8.GetString(bytesReceived.Array, 0, result.Count));
        if (ws.State != WebSocketState.Open)
        {
            Log.Info("SOCKETCLOSED", "CLOSED");
            break;
        }
    }
}
But I don't receive any messages at all.
Edit:
This is my code for the headers:
// List<Tuple<string, string>> Headers  <-- contains [Title] and [Content]
foreach (var item in Headers)
{
    message += item.Item1 + " : " + item.Item2 + Environment.NewLine;
}
message += Environment.NewLine; // ensure double carriage return
Edit:
This is my code for sending the WAV header:
using (MemoryStream stream = new MemoryStream())
{
    short channelCount = 1;
    int sampleRate = 1024;
    int bitsPerSample = 16;
    using (var writer = new BinaryWriter(stream, Encoding.UTF8))
    {
        writer.Write("Path: audio" + Environment.NewLine);
        writer.Write("X-Timestamp: " + DateTime.UtcNow.ToString("yyyy-MM-ddTHH:mm:ss.fffffffZ" + Environment.NewLine));
        writer.Write("Content-Type : audio/x-wav" + Environment.NewLine);
        writer.Write("X-RequestId: " + Guid.NewGuid().ToString().Replace("-", string.Empty) + Environment.NewLine);
        writer.Write(Environment.NewLine);
        // chunk ID
        writer.Write('R');
        writer.Write('I');
        writer.Write('F');
        writer.Write('F');
        writer.Write(-1); // -1 - Unknown size
        // format
        writer.Write('W');
        writer.Write('A');
        writer.Write('V');
        writer.Write('E');
        // subchunk 1 ID
        writer.Write('f');
        writer.Write('m');
        writer.Write('t');
        writer.Write(' ');
        writer.Write(16); // subchunk 1 (fmt) size
        writer.Write((short)1); // PCM audio format
        writer.Write((short)channelCount);
        writer.Write(sampleRate);
        writer.Write(sampleRate * 2);
        writer.Write((short)2); // block align
        writer.Write((short)bitsPerSample);
        // subchunk 2 ID
        writer.Write('d');
        writer.Write('a');
        writer.Write('t');
        writer.Write('a');
        // subchunk 2 (data) size
        writer.Write(-1); // -1 - Unknown size
    }
    byte[] result;
    //using (MemoryStream ms = new MemoryStream())
    //{
    //    stream.CopyTo(ms);
    //    result = ms.ToArray();
    //}
    result = stream.ToArray();
    ArraySegment<byte> byteresult = new ArraySegment<byte>(result);
    await _socketclient.SendAsync(byteresult, WebSocketMessageType.Binary, false, CancellationToken.None);
    Log.Info("SENDINGWAV", System.Text.Encoding.UTF8.GetString(result));
}
This is my code for sending the data bytes:
public async Task SendByteHeader(byte[] data)
{
    string s = "";
    s += ("Path: audio" + Environment.NewLine);
    s += ("X-Timestamp: " + DateTime.UtcNow.ToString("yyyy-MM-ddTHH:mm:ss.fffffffZ" + Environment.NewLine));
    s += ("Content-Type : audio/x-wav" + Environment.NewLine);
    s += ("X-RequestId: " + Guid.NewGuid().ToString().Replace("-", string.Empty) + Environment.NewLine);
    s += (Environment.NewLine);
    byte[] array = Encoding.UTF8.GetBytes(s);
    List<byte> endres = new List<byte>(array);
    endres.AddRange(data);
    ArraySegment<byte> byteresult = new ArraySegment<byte>(endres.ToArray());
    await _socketclient.SendAsync(byteresult, WebSocketMessageType.Binary, false, CancellationToken.None);
    Log.Info("SENDINGBYTE", Encoding.UTF8.GetString(data));
}
I run this when the connection starts:
Task.Run(()=>DataReceiving(_socketclient));
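So I send the WAV header first, and then start sending the audio bytes from the recording (I am using Plugin.AudioRecording).
I still haven't received any message or response.
Edit:
I send some data to the server every 200 ms to make it "real time", but I noticed that after 5-6 sends, all my SendAsync calls crash on this line: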
await _socketclient.SendAsync(byteresult, WebSocketMessageType.Binary, false, CancellationToken.None);
The error is "Cannot access disposable object (the websocket)". It seems the WebSocket has been disposed? Or was the connection terminated?
I am using WebSocketClient but I don't find any way to set up headers or change path. Is there any way to set up the headers and/or changing path so I can send message properly to the server? Or do I have a wrong perception?
If you refer to the Text WebSocket messages section of the documentation you posted, you will find the following statement:
Text WebSocket messages carry a payload of textual information that consists of a section of headers and a body separated by the familiar double-carriage-return newline pair used for HTTP messages.
This means that a message you send to the service with client.SendAsync() consists of two parts, a header section and a body section, separated by \r\n\r\n.
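The layout described above can be sketched in C# roughly as follows. This is only a minimal illustration, not code from the documentation: the class name SpeechMessage, its method names, and the millisecond timestamp format are assumptions, and the JSON body is whatever configuration payload you intend to send.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text;

// Hypothetical helper (not part of any Microsoft library).
public static class SpeechMessage
{
    // Joins "Name: value" header lines with CRLF, adds an empty line,
    // then appends the body, as the quoted documentation describes.
    public static string BuildText(IEnumerable<KeyValuePair<string, string>> headers, string body)
    {
        var sb = new StringBuilder();
        foreach (var header in headers)
        {
            sb.Append(header.Key).Append(": ").Append(header.Value).Append("\r\n");
        }
        sb.Append("\r\n"); // blank line separating headers from body
        sb.Append(body);
        return sb.ToString();
    }

    // Builds a speech.config message using the three headers from the question.
    public static string BuildSpeechConfig(string jsonBody, DateTime utcNow)
    {
        // Assumed ISO 8601 layout with millisecond precision.
        string timestamp = utcNow.ToString(
            "yyyy-MM-dd'T'HH:mm:ss.fff'Z'", CultureInfo.InvariantCulture);
        var headers = new List<KeyValuePair<string, string>>
        {
            new KeyValuePair<string, string>("Path", "speech.config"),
            new KeyValuePair<string, string>("X-Timestamp", timestamp),
            new KeyValuePair<string, string>("Content-Type", "application/json; charset=utf-8"),
        };
        return BuildText(headers, jsonBody);
    }
}
```

The resulting string would then be encoded with Encoding.UTF8.GetBytes and passed to SendAsync with WebSocketMessageType.Text. Note that the third argument of SendAsync is endOfMessage; passing true marks the frame as the end of a complete message, whereas the calls in the question pass false.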
My second problem is WebSocketClient doesnt have any event handler to receive message
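As for this problem, what you are doing is right; try again once your messages are sent in the correct format. The service will then send back text messages containing what it recognized.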
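Since text messages coming back from the service use the same header/body layout, the string logged in your receive loop could be split along these lines. This is a sketch under assumptions: SpeechMessageParser is a hypothetical name, and the code assumes the whole message has already been read into one string.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of any Microsoft library).
public static class SpeechMessageParser
{
    // Splits a text message into its headers and body at the first
    // empty line (CRLF CRLF). Header names are matched case-insensitively.
    public static (Dictionary<string, string> Headers, string Body) Parse(string message)
    {
        const string separator = "\r\n\r\n";
        int split = message.IndexOf(separator, StringComparison.Ordinal);
        string headerPart = split >= 0 ? message.Substring(0, split) : message;
        string body = split >= 0 ? message.Substring(split + separator.Length) : string.Empty;

        var headers = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        foreach (string line in headerPart.Split(new[] { "\r\n" }, StringSplitOptions.RemoveEmptyEntries))
        {
            int colon = line.IndexOf(':');
            if (colon <= 0) continue; // skip lines that are not "Name: value"
            headers[line.Substring(0, colon).Trim()] = line.Substring(colon + 1).Trim();
        }
        return (headers, body);
    }
}
```

For example, Parse("Path: example\r\nX-RequestId: abc\r\n\r\n{\"x\":1}") (a made-up message) yields a Path header of "example" and the JSON text as the body. A single ReceiveAsync call may return only part of a message, so it is safer to keep reading until result.EndOfMessage is true before parsing.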