A simple example of using SFSpeechRecognizer to control a character with voice commands. This sample uses the en_US locale but can easily be updated to support a different language.
SFSpeechRecognizer cannot easily recognize single words, so the voice commands included are all two words long.
Here’s the list of voice commands included with the sample:
I have a small problem: how can I make it display text and, at the same time, produce sound to say hello?
I tried to modify the following code, but it didn’t work. It looks as if I’m not allowed to produce sound with speech.say() while speech recognition is running.
@jfperusse - very neat, is the intention to include this facility within Codea or do we have to use these external C libraries?
@John - just a quick note on something I have seen intermittently in V4: the stop-execution button in the controls at the bottom right-hand side of the screen. I ran this demo, tapped the screen a few times, and then tried to close the project - it was very slow to respond after several attempts.
Closing the project, restarting it, and immediately closing it again worked quickly with one tap. But on a repeat run, after several on-screen taps before tapping the close control again, it took ages and initially seemed not to respond.
I don’t think this is limited to this project.
Feels like the parsing of the controls is not in the main touch loop.
Edit: weird - I exited Codea after posting this, restarted it, loaded this project, and ran it. This time I noticed the speech bubble appeared (it wasn’t there before) and the touch response was excellent. It looks like it may be one of those cases where something left in memory interferes with a newly loaded project - or, since I downloaded this project and ran it directly, some interference from the downloaded project.
On that topic - it seems a bit odd that you need to set up a new project to access the forum website from the Codea Project Editor menu. Would it be better to access the forum website via the project files window, and to download and store projects before running them?
Thanks! It’s quite possible the audio engine used for speech recognition is preventing speech audio from playing. I will have to investigate whether that’s an issue with how it’s configured or whether the two simply cannot work at the same time. If it’s the latter, one approach that should work would be to enable/disable speech recognition based on inputs. For example, you could have a “Talk” button which you hold to use speech recognition, release when you’re done talking to process the command, and then play the audio using the speech API.
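A minimal hold-to-talk sketch of that idea in Codea Lua, assuming the sample’s startListening() function is available and that a matching stopListening() exists (the button rectangle, inButton helper, and the spoken greeting are placeholders):

```lua
-- Hypothetical hold-to-talk sketch (assumes startListening()/stopListening()
-- from the sample and Codea's speech API).
local talkButton = { x = 100, y = 100, w = 200, h = 80 }
local listening = false

local function inButton(t)
    return t.x >= talkButton.x and t.x <= talkButton.x + talkButton.w
       and t.y >= talkButton.y and t.y <= talkButton.y + talkButton.h
end

function touched(touch)
    if touch.state == BEGAN and inButton(touch) and not listening then
        listening = true
        startListening()         -- recognition runs only while the button is held
    elseif touch.state == ENDED and listening then
        listening = false
        stopListening()          -- release: process the command, then...
        speech.say("Hello!")     -- ...speech playback is free to run again
    end
end
```

The point of the design is that recognition and playback never overlap, so the two audio engines never fight over the session.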
At the moment, the main intention of this sample is to show how the objc feature can be used to access speech recognition, but it should be easy to hide most of the low-level objc functionality behind a higher-level Lua library using the code I shared.
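For example, such a wrapper might expose nothing but a table of phrases and handlers; this is only a sketch, and the module name, function names, and callback shape are made up, with the bodies delegating to the objc-based functions from the sample:

```lua
-- Hypothetical Lua-facing API hiding the objc details (names are illustrative).
VoiceCommands = {}

-- commandTable maps recognized phrases to handler functions,
-- e.g. { ["go up"] = function() jump() end }
function VoiceCommands.start(commandTable)
    VoiceCommands.commands = commandTable
    startListening()   -- the objc-based function from the sample
end

function VoiceCommands.stop()
    stopListening()
end
```

A caller would then never touch SFSpeechAudioBufferRecognitionRequest or the audio engine directly.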
I’ve learned that, as in some social apps, there should be a mode switch: turn on speech recognition and disable voice playback while you speak, then turn speech recognition off and allow playback when voice input stops. Thank you for the explanation.
@jfperusse This is great! I’ve fixed a few issues with objc syntax regressions in new betas (objc.<class> vs. objc.cls.<class> ) for the WebRepo version. @UberGoober I’ve added the project to WebRepo.
Forgive me for messing with the version numbers. I’ve hooked the backend up to the forum, so new projects are announced automatically in the WebRepo thread.
Interesting… after making the change I was able to use the commands, but maybe there’s something else going on.
One thing you could try is adding a print(message) around line 51, before looping over the possible commands. Then, when you see the “?”, check the console to see what the device actually recognized. That might give us a hint as to what’s going on.
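Concretely, the suggested print would sit in the result handler just before the command loop; this excerpt reuses the sample’s own variables, with only the one print line added:

```lua
-- Inside the recognition result handler, before matching commands:
local message = oResult.bestTranscription.formattedString
print(message)  -- log exactly what the recognizer heard
for k, v in pairs(commands) do
    if message == k then
        v()
        restartListening()
        break
    end
end
```

If the console shows phrases close to, but not exactly matching, your command keys, the fix is in the matching rather than in the recognition.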
@jfperusse I’m using the zip file at the very beginning of this post. When I run the code, I get
objc.cls.ClassName is deprecated. Use objc.ClassName instead.
I don’t know whether that causes a problem, or whether the zip at the beginning is the correct version to use.
When I run the code, my volume goes to max and I can’t reduce it by trying to slide the volume down (control panel) or using the volume down button. I have to exit Codea before I can change the volume.
Here’s startListening() that I added print statements to.
When I run the code, it prints a1, a2, and a5. When I speak a command, nothing happens. I don’t get the a3 or a4 print statements.
function startListening()
    print("a1")
    recognitionRequest = objc.cls.SFSpeechAudioBufferRecognitionRequest()
    recognitionRequest.shouldReportPartialResults = true
    recognitionRequest.requiresOnDeviceRecognition = true
    recognitionTask = speechRecognizer:recognitionTaskWithRequest_resultHandler_(recognitionRequest,
        function (oResult, oError)
            print("a2")
            if oResult ~= nil and oResult.bestTranscription.formattedString ~= nil then
                print("a3")
                if messageStart == -1 then
                    messageStart = ElapsedTime
                end
                local message = oResult.bestTranscription.formattedString
                local foundCommand = false
                print("a4")
                for k, v in pairs(commands) do
                    if message == k then
                        v()
                        foundCommand = true
                        restartListening()
                        break
                    end
                end
            end
            print("a5")
        end)
    inputNode = audioEngine.inputNode
    recordingFormat = inputNode:outputFormatForBus_(0)
    inputNode:installTapOnBus_bufferSize_format_block_(0, 1024, recordingFormat,
        function(oBuffer, oTime) recognitionRequest:appendAudioPCMBuffer_(oBuffer) end)
    audioEngine:prepare()
    audioEngine:startAndReturnError_(nil)
end
@jfperusse Loaded from WebRepo. I added the print statements in the same spots as before. When I run, it prints a1, a5, a2. The volume goes to max and won’t change until I exit Codea. I say the Go Up command and nothing happens.