Create an Amazon Alexa custom skill to interact with your PC

I have been developing custom Alexa skills lately and want to show you how fun and easy it is to build one from scratch.

Why build a custom skill?

Amazon Echo and Echo Dot are smart speakers developed by Amazon; the Echo was first released to end users at the end of 2014. These devices are powered by Alexa, Amazon’s intelligent personal assistant service. The most exciting thing about Alexa, compared to its (very few) competitors, is that it can be extended with custom skills. We can develop custom skills for Alexa and publish them to Amazon’s skill repository. People can then install these skills on their Alexa-powered devices, such as the Amazon Echo, and use them. Exciting, isn’t it?

I got an Echo Dot from the UK for testing through our office, Gyanmatrix (the Echo Dot is not available in India yet. Amazon, please make it fast!). All thanks to Gyanmatrix!
I have developed a simple Alexa skill that interacts with my PC and opens an application for me. How cool is that! Are you ready? Let’s dive in and get our hands dirty…

Our skill recipe

The first step is to get an AWS account, because we are going to use AWS Lambda as the endpoint Alexa communicates with. Go ahead and create one if you haven’t already.
Then follow these steps to create the skill:

  1. Go to developer.amazon.com
  2. Sign up / Sign in
  3. After that, click the Alexa tab.
  4. Then go to the Alexa Skills Kit.
  5. Click “Get started” to enter the skills section.
  6. Unleash the power of a new skill by clicking “Add new skill”.

Name (display name): PC Interaction Skill
Invocation name: ‘app automator’
This lets us say “Alexa, ask app automator to do something for me”.
Click Next.

Intent Schema:

{
	"intents" : [
		{
			"intent": "ComputerAction",
			"slots": [
				{
					"name": "app",
					"type": "COMPUTER_APPS"
				}
			]
		}
	]
}

Add custom slot type:
Type: COMPUTER_APPS
Values: the application names you want to open, one per line (for example, Google Chrome)

Sample Utterances:

ComputerAction open {app}
ComputerAction go to {app}
ComputerAction start {app}
ComputerAction open {app} in my computer
ComputerAction go to {app} in my computer
ComputerAction start {app} in my computer
ComputerAction open {app} in my pc
ComputerAction go to {app} in my pc
ComputerAction start {app} in my pc

Utterances are the different phrases a user might say to the device to trigger the intent. We can click Next now.
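
To make the link between the intent schema, the utterances and the Lambda code a bit more concrete, here is a rough sketch of the request our function could receive when someone says “open Google Chrome”. The event is trimmed down to the fields the handler reads (a real request carries more), and getAppName is a hypothetical helper of my own, not something provided by Alexa.

```javascript
// Trimmed-down IntentRequest; a real event contains more fields
// (session details, request IDs, timestamps and so on).
const sampleEvent = {
  session: { new: true },
  request: {
    type: 'IntentRequest',
    intent: {
      name: 'ComputerAction',
      slots: { app: { name: 'app', value: 'Google Chrome' } }
    }
  }
};

// Hypothetical helper: returns the spoken app name,
// or null when the {app} slot was not filled.
function getAppName(event) {
  const intent = event && event.request && event.request.intent;
  const slot = intent && intent.slots && intent.slots.app;
  if (!slot || typeof slot.value !== 'string') return null;
  const name = slot.value.trim();
  return name === '' ? null : name;
}

console.log(getAppName(sampleEvent)); // Google Chrome
```

Checking for a missing slot value is worth the extra lines, because Alexa may match the intent even when it did not catch the app name.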

Select Endpoint
Let us select Lambda ARN:

Oh, wait! We haven’t created a Lambda function yet. To get a Lambda ARN, we need to create a Lambda function first.
To do that, go to console.aws.amazon.com

Let’s go serverless!

  1. From the “Services” link in the top menu, click Lambda and start creating a new function.
  2. Go to “Configure triggers”.
  3. There should be a blank dashed box; click on it.
  4. Choose Alexa Skills Kit and click Next.

Name: ComputerActionAPI
Description: Computer action API for Alexa skill
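Runtime: Node.js (latest available version)
Code: use the following as index.js. The sync-request module is not part of the Lambda runtime, so install it with npm and upload it together with the code as a zip package.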

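// Synchronous HTTP client; must be bundled with the function package.
const request = require('sync-request');
// Base URL of the server that forwards actions to the PC.
const wsBaseUrl = 'https://my-server.herokuapp.com';

// POST the action to the server. Returns true when the server
// answers with a truthy JSON body.
const postBrowserAction = (input, url) => {
  const res = request('POST', url, { json: input });
  // getBody() throws for HTTP status codes >= 300.
  const data = JSON.parse(res.getBody('utf8'));

  return Boolean(data);
};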

exports.handler = (event, context) => {

  try {

    if (event.session.new) {
      // New Session
      console.log("NEW SESSION")
    }

    switch (event.request.type) {

      case "LaunchRequest":
        // Launch Request
        console.log(`LAUNCH REQUEST`)
        context.succeed(
          generateResponse(
            buildSpeechletResponse("Welcome to Computer interaction Alexa Skill API, this is running on our lambda function", true),
            {}
          )
        )
        break;

      case "IntentRequest":
        // Intent Request
        console.log(`INTENT REQUEST`);

        switch(event.request.intent.name) {
          case "ComputerAction":
            var output = '';
            // Get app name variable from the utterance slot.
            var app = event.request.intent.slots.app.value;
            var res = postBrowserAction({computer_actions: {open: app}}, wsBaseUrl+'/computer-actions');

            if (res) output = 'Success';
            else output = 'Sorry. Failed';

            context.succeed(
              generateResponse(
                buildSpeechletResponse(output, true),
                {}
              )
            );
            break;

          default:
            throw "Invalid intent"
        }

        break;

      case "SessionEndedRequest":
        // Session Ended Request
        console.log(`SESSION ENDED REQUEST`)
        break;

      default:
        context.fail(`INVALID REQUEST TYPE: ${event.request.type}`)

    }

  } catch(error) { context.fail(`Exception: ${error}`) }

}

// Helpers
const buildSpeechletResponse = (outputText, shouldEndSession) => {

  return {
    outputSpeech: {
      type: "PlainText",
      text: outputText
    },
    shouldEndSession: shouldEndSession
  }

}

const generateResponse = (speechletResponse, sessionAttributes) => {

  return {
    version: "1.0",
    sessionAttributes: sessionAttributes,
    response: speechletResponse
  }

}

Handler: index.handler
Role: This is the annoying thing AWS always asks for. Either create a custom role (lambda_basic_execution) or choose an existing role.
After that, click Next and then click the blue button to create the Lambda function.

Once the Lambda function has been created, we will get an ARN.
Copy this Amazon Resource Name.
Now go back to developer.amazon.com -> skill.
Paste the ARN we copied earlier and click Next.
This links our Lambda function to our custom skill, and through it to our Alexa-powered device.
We should now be able to see the skill in the Alexa app (or alexa.amazon.com) -> skills -> Your skills.

But wait… We defined wsBaseUrl as ‘https://my-server.herokuapp.com’, and didn’t create it yet!

WebSockets are superfast

How do we establish a connection to our computer from the Lambda function? Simple. Let us create a Node.js server that exposes an API and keeps a WebSocket connection open to our computer. Don’t worry about the code; I have done this job for you. The Heroku app, though, you need to create yourself. Sorry… But again, it’s just a signup and ‘Create new app’. I am assuming your new app name here is “my-server” (change things accordingly if yours is different).

git clone https://github.com/karthikax/computer-action-server.git server
cd server
git remote add heroku git@heroku.com:my-server.git
git push heroku master

As simple as that! Didn’t I tell you?
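
If you are curious what such a relay could look like, here is a minimal sketch of the idea. It is not the code from the repository above. I am assuming that every connected PC is represented by an object with a send(string) method (which is what a WebSocket library such as ws gives you), and the names broadcastAction and createRelayServer are made up for illustration. Accepting the WebSocket connections and adding them to the clients collection is left out.

```javascript
const http = require('http');

// Forward one action to every connected client.
// Returns the number of clients that accepted the message.
function broadcastAction(targets, action) {
  const message = JSON.stringify(action);
  let delivered = 0;
  for (const socket of targets) {
    try {
      socket.send(message);
      delivered += 1;
    } catch (err) {
      // A dead socket should not stop delivery to the others.
      console.error('send failed:', err.message);
    }
  }
  return delivered;
}

// HTTP API that the Lambda function posts to.
// `clients` is any iterable of socket-like objects (assumption).
function createRelayServer(clients) {
  return http.createServer((req, res) => {
    const reply = (status, body) => {
      res.writeHead(status, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify(body));
    };
    if (req.method !== 'POST' || req.url !== '/computer-actions') {
      return reply(404, { ok: false });
    }
    let raw = '';
    req.on('data', (chunk) => { raw += chunk; });
    req.on('end', () => {
      let payload;
      try {
        payload = JSON.parse(raw);
      } catch (err) {
        return reply(400, { ok: false });
      }
      if (!payload || !payload.computer_actions) {
        return reply(400, { ok: false });
      }
      const delivered = broadcastAction(clients, payload.computer_actions);
      reply(200, { ok: true, delivered });
    });
  });
}

// To run it: createRelayServer(clients).listen(process.env.PORT || 3000);
```

The path /computer-actions and the computer_actions key match what our Lambda function sends; everything else here is my own guess at a reasonable shape.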

Now our skill is linked to Lambda, and Lambda calls the API on our server. But how can the server open an app on our machine? This is simple as a breeze too!

git clone https://github.com/karthikax/computer-action-client.git client
cd client

Open index.js, enter your Heroku app URL on line 2, and then start the Node client script.

npm start

Bingo! Our client is ready and listening to the server.
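
For the curious, this is roughly how a client could turn a message from the server into a launched application. Again, this is only a sketch and not the repository code. I am assuming the message looks like {"open": "Google Chrome"}, the allow-list and the function names are hypothetical, and the launch command for each platform is a guess that you will likely need to adapt (the name an app is launched by differs between systems).

```javascript
const { execFile } = require('child_process');

// Assumption: only apps on this list may be launched.
// Without it, anything Alexa hears would end up in a command.
const ALLOWED_APPS = new Set(['google chrome', 'firefox']);

// Build the launch command for a platform, or null if the app is not allowed.
function launchCommand(appName, platform = process.platform) {
  const name = String(appName || '').trim();
  if (!ALLOWED_APPS.has(name.toLowerCase())) return null;
  if (platform === 'darwin') {
    return { cmd: 'open', args: ['-a', name] };
  }
  if (platform === 'win32') {
    return { cmd: 'cmd', args: ['/c', 'start', '', name] };
  }
  // Elsewhere, assume an executable like "google-chrome" is on the PATH.
  return { cmd: name.toLowerCase().replace(/\s+/g, '-'), args: [] };
}

// Handle one raw message from the server.
// Returns true if a launch was attempted.
function handleMessage(raw, run = execFile) {
  let action;
  try {
    action = JSON.parse(raw);
  } catch (err) {
    return false;
  }
  const command = action && launchCommand(action.open);
  if (!command) return false;
  run(command.cmd, command.args, (err) => {
    if (err) console.error('launch failed:', err.message);
  });
  return true;
}
```

The second argument of handleMessage exists only so the launcher can be swapped out when trying the code without actually starting anything.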

CALL OUT THE DEVICE AND TEST IT…
“Alexa, ask app automator to open Google Chrome”

We can test it using Lambda as well:
Console -> Lambda -> ComputerActionAPI -> Action -> Configure Test Event
Sample Event Template: Alexa Start Session
Save and Test

We can also test from developer.amazon.com -> skill -> Test
Enter Utterance: ‘open Google Chrome’
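
A third option, if you prefer to poke at the function on your own machine first, is to call the handler with a fake context object. The sketch below is an assumption on my part rather than an official tool: invokeLocally is a made-up helper, and it only works for handlers like ours that call context.succeed or context.fail before returning.

```javascript
// Call a context-style Lambda handler and capture what it reports.
// Only suitable for handlers that finish synchronously.
function invokeLocally(handler, event) {
  const outcome = { succeeded: undefined, failed: undefined };
  const context = {
    succeed: (result) => { outcome.succeeded = result; },
    fail: (error) => { outcome.failed = error; }
  };
  handler(event, context);
  return outcome;
}

// Usage, assuming the function code is saved as index.js next to this script:
//   const { handler } = require('./index');
//   const event = { session: { new: true }, request: { type: 'LaunchRequest' } };
//   console.log(JSON.stringify(invokeLocally(handler, event), null, 2));
```

A LaunchRequest is a good first event to try because it does not touch the Heroku server at all.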

That concludes our tutorial. But it’s up to you to explore what this intelligent lady can do for you!!
