Input, control and automation

Sunday 4 September 2016

Building a Home AI, Part 5: Fracted dates

01:43 Posted by Roxton , No comments
If you want occupation for an hour, drink. If you want occupation for a day, read. If you want occupation for a month, marry. If you want occupation for a lifetime, teach an AI to recognize every combination of dates and times in the English language.
—Ancient Chinese Proverb, trans. Roxton.

March and April have been slow months for Mervyn. This is partly because I took a holiday in Cornwall and partly because I have started a new and more demanding job that doesn't leave me as much time and energy as I was hitherto accustomed to, but mostly because I have been working solely and solidly on date and time recognition. Though Roxton's mills grind slowly, they grind exceeding small: Mervyn now recognizes almost any date and time range you care to mention.

For example, here's me asking Mervyn what events I have from tomorrow to the Thursday after next (today is the 19th):
Yes, I have no social life. This is because I spend every waking moment feeding my brain to C#.
And here's me asking Google the same question:
It hasn't a clue what I'm on about. Of course, this is hardly a fair comparison - Google does much more than date recognition, and they could surpass Mervyn in an afternoon if they wanted to - but beating Google at anything is a nice achievement, nevertheless.
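
To give a feel for what resolving a phrase like "from tomorrow to the Thursday after next" involves, here is a minimal sketch in Python (not Mervyn's C#, and not its actual code). It handles that one phrase only, and it assumes that "the Thursday after next" means the second Thursday strictly after today; the function names and the reference date of 19 April 2016 are illustrative assumptions.

```python
from datetime import date, timedelta
from typing import Tuple

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]


def next_weekday(today: date, weekday: str, skip: int = 0) -> date:
    """Return the first `weekday` strictly after `today`, plus `skip` extra weeks."""
    target = WEEKDAYS.index(weekday.lower())
    # (target - today - 1) % 7 + 1 is always in 1..7, so "today" never matches.
    days_ahead = (target - today.weekday() - 1) % 7 + 1
    return today + timedelta(days=days_ahead + 7 * skip)


def tomorrow_to_thursday_after_next(today: date) -> Tuple[date, date]:
    """Resolve 'from tomorrow to the Thursday after next' relative to `today`."""
    start = today + timedelta(days=1)
    # Assumption: "after next" skips one Thursday and lands on the second.
    end = next_weekday(today, "thursday", skip=1)
    return start, end


if __name__ == "__main__":
    # Hypothetical reference date: Tuesday 19 April 2016.
    print(tomorrow_to_thursday_after_next(date(2016, 4, 19)))
```

With that reference date the range runs from 20 April to 28 April. The hard part, of course, is not the arithmetic but recognizing the many ways the same range can be phrased.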

I think, and hope, that the date recognition is now solid enough that I can leave it alone for the present and return to work on actual functionality. Next on the list is finishing the "new event" procedure, and then building on that to produce similar behaviour for creating "to-do" items and reminders.

Caps Off

01:42 Posted by Roxton , No comments
All quiet here, I know. The new job is pretty time consuming, and most of what I'm writing for it falls under the IP clause in my contract and so can't be shared. However, I've been going through my code libraries and am trying to get all the non-protected things up on GitHub.

A very small and mostly useless tool, then: an app that automatically turns off Caps Lock whenever you lock your PC. Useful for me and people like me who rely on muscle memory to get through the day and haven't the patience to check whether Caps Lock is on every time we start typing our password. It's an actual app instead of a service (which would have been more useful) because we don't have local admin rights at work.

Monday 7 March 2016

Building a Home AI, Part 4: Calendars

07:01 Posted by Roxton , No comments
Another month, another update. That up above is the very basic Mervyn UI. User input is shown on the right; Mervyn's response is on the left. The speech recognition is still push-to-talk as I haven't got keyword spotting worked out yet (I'm hoping to start looking seriously at hardware in the next month or so).

Implementing calendars has taken longer than I would have liked. I decided that Mervyn ought to support iCalendar because it's the closest thing there is to a universal digital calendar format, and, while user calendars are stored locally by default, I need to be able to pull data from Outlook/Gmail/Apple Calendar and so on, and all of these use iCalendar.

Frustratingly, the only C# library for iCalendar I could find is DDay.iCal. It's nicely made but incomplete and untested, with more bugs than I'm comfortable including in Mervyn (knowingly including, that is). Consequently, most of February has been spent writing an iCalendar library from scratch. It's been fun but time-consuming; I've been writing the library with a mind to making it open source at a later date, and as I've never written anything under that level of (perceived) scrutiny before I've spent a lot more time on it than I do on my normal code.
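
For readers who haven't met iCalendar before, the sketch below shows roughly what the format looks like and what the very first step of a parser has to do. It is written in Python rather than C#, is not the library described above, and covers only a sliver of the format: it unfolds continuation lines, collects the properties of each VEVENT, and parses UTC timestamps. The sample data and function names are made up for illustration, and quoted parameter values containing colons are not handled.

```python
from datetime import datetime

# Hypothetical sample calendar; iCalendar uses CRLF line endings.
SAMPLE = (
    "BEGIN:VCALENDAR\r\n"
    "VERSION:2.0\r\n"
    "BEGIN:VEVENT\r\n"
    "UID:example-1@example.invalid\r\n"
    "DTSTART:20160308T190000Z\r\n"
    "DTEND:20160308T210000Z\r\n"
    "SUMMARY:Dinner with a very long description that has been\r\n"
    "  folded onto a second line\r\n"
    "END:VEVENT\r\n"
    "END:VCALENDAR\r\n"
)


def unfold(text):
    """Join folded lines: a line starting with a space or tab continues the previous one."""
    lines = []
    for raw in text.splitlines():
        if raw[:1] in (" ", "\t") and lines:
            lines[-1] += raw[1:]  # drop only the single folding character
        else:
            lines.append(raw)
    return lines


def parse_events(text):
    """Return one dict of property name -> raw value per VEVENT block."""
    events, current = [], None
    for line in unfold(text):
        name, _, value = line.partition(":")
        name = name.split(";")[0].upper()  # ignore parameters such as ;TZID=...
        if name == "BEGIN" and value.upper() == "VEVENT":
            current = {}
        elif name == "END" and value.upper() == "VEVENT" and current is not None:
            events.append(current)
            current = None
        elif current is not None:
            current[name] = value
    return events


def parse_utc(value):
    """Parse a UTC date-time such as 20160308T190000Z (other forms are not handled)."""
    return datetime.strptime(value, "%Y%m%dT%H%M%SZ")
```

Everything a real library has to cope with (time zones, recurrence rules, escaping, alarms) sits on top of this, which is where the month went.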

All of this back-room work hasn't really translated into much progress for Mervyn - the only calendar functionality it has at the moment is to retrieve event information from specified times. I am quite pleased at how responsive it is to natural language requests, though - you can ask it pretty much any grammatically correct variant of "what am I doing tomorrow/next month/in the three weeks after Christmas" and it will handle it.
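
Once the natural-language request has been turned into a start and end time, retrieving events is essentially an interval-overlap check. The following Python sketch illustrates the idea under the assumption that the range is half-open (start inclusive, end exclusive); the `Event` class and `events_between` function are hypothetical names, not Mervyn's.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Event:
    summary: str
    start: datetime
    end: datetime


def events_between(events, start, end):
    """Return events overlapping the half-open range [start, end), earliest first."""
    overlapping = [e for e in events if e.start < end and e.end > start]
    return sorted(overlapping, key=lambda e: e.start)
```

Note that an event which began before the range but is still running during it (a week-long holiday, say) is included, which is usually what "what am I doing tomorrow" should mean.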

So, what next? There are a few tweaks I still need to make to the event retrieval code, but once that's done I can get started on creating events. After that, I think I want to do the same for to-do lists (iCalendar supports them so I'll probably just use that again). At this point, Mervyn should be helpful enough that I'll actually want to use it, so I'll have to start work on hardware. Using a Raspberry Pi (as have Jasper and Mycroft) is definitely the path of least resistance here, and I'll admit that I can't easily think of a better alternative.

On the subject of hardware, it's interesting to note that none of the big players in this field - not even Amazon Echo - have managed a solution that covers multiple rooms without requiring about a hundred dollars of hardware per room (though this may change if Amazon cuts the cost of the Dot). This may well be intentional, but as I'm not particularly interested in turning Mervyn into a business I need to find a less expensive way of doing it. A Bluetooth mic/speaker combo like this could be an answer, if the range/mic quality is up to scratch. The hope is that if the software is good enough you'd be able to use pretty much any hardware you happen to have, but this is a point against the Pi, which isn't particularly well-known for its audio compatibility. Who knows - maybe there'll be some use for Windows 10 on the Pi after all...

Wednesday 3 February 2016

Building a Home AI, Part 3: The basics

03:30 Posted by Roxton , No comments
A month into my New Year's resolution, it's probably time to report on my progress so far. After all, Zuck has.

One month in, I have an application that resembles nothing so much as an extremely rudimentary chatbot. Mervyn can tell the time on request, and learn a few facts about you... and that's about it. Not terribly exciting, but the foundations have now been laid for more interesting things. Let's have a look at how it currently works.

First, the user types or speaks something - right now there's speech recognition, but no grammars or keyword spotting or any of the clever stuff that would be necessary for it to work in the home (as opposed to at the desk). Then, that input is examined to determine which task to run. There are lots of ways to do this, ranging from a simple switch over the input to the kind of deep learning kit that Google is reportedly embedding in its new handsets. For the kind of instruction-driven set-up I want, I can probably get by with normalization and basic pattern-matching; there are some very nice open-source tools available for more advanced NLP and sentiment analysis and so on, but for Mervyn (as it is currently envisaged), they're overkill.

What I need is a system that provides normalization, advanced pattern-matching (including sets, variables, etc.), and support for conversation trees and recursion. For now I'm using SIML, a variant of AIML (the Artificial Intelligence Markup Language developed for Alicebot). It's not without its drawbacks - it carries a number of features I don't need or want, such as EmotionML, and its logic can be frustratingly limited (text comparison, for example) - but it meets the current requirements, is easy to edit, and has a really nice little prototyping tool for rapid development. At some point I'm going to have to abandon it and roll my own solutions, but it does the trick for now.

Anyway, SIML takes the input and passes back a command to Mervyn to run a particular task with certain arguments. Mervyn then runs that task, which may include passing some kind of output to the user.
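
The flow described above (normalize the input, match it against patterns, hand a task name and arguments back to be run) can be sketched without SIML at all. The Python below uses plain regular expressions as a stand-in; it is not SIML syntax and not Mervyn's code, and the patterns, task names and replies are all invented for illustration.

```python
import re
from datetime import datetime

FACTS = {}  # stand-in for Mervyn's store of things it has learned


def normalize(text):
    """Lower-case, strip punctuation (keeping apostrophes) and collapse whitespace."""
    text = text.lower().strip()
    text = re.sub(r"[^\w\s']", "", text)
    return re.sub(r"\s+", " ", text)


# Each entry maps a pattern to a task name; named groups become task arguments.
PATTERNS = [
    (re.compile(r"^what(?:'s| is) the time$|^what time is it$"), "tell_time"),
    (re.compile(r"^my name is (?P<name>.+)$"), "remember_name"),
]


def match(text):
    """Return (task_name, arguments) for the first pattern that matches."""
    cleaned = normalize(text)
    for pattern, task in PATTERNS:
        found = pattern.match(cleaned)
        if found:
            return task, found.groupdict()
    return "unknown", {}


def tell_time():
    return datetime.now().strftime("It is %H:%M.")


def remember_name(name):
    FACTS["name"] = name
    return f"Nice to meet you, {name.title()}."


def unknown():
    return "Sorry, I didn't understand that."


TASKS = {"tell_time": tell_time, "remember_name": remember_name, "unknown": unknown}


def respond(text):
    """Match the input to a task, run it, and return its output for the user."""
    task, args = match(text)
    return TASKS[task](**args)
```

Keeping the matcher and the tasks separate is the useful part: the matcher can later be swapped for something cleverer without touching the tasks themselves.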

For really basic things such as telling the time, that's all there is to it. However, for anything more complicated, Mervyn will have to refer to its store of information. For example, if I ask Mervyn whether I'll need an umbrella today, it needs to know:
  • Who I am
  • My schedule for today
  • The physical locations (if only approximate) of each item in the schedule
  • Weather forecasts for each of those locations.
That last item will be pulled from the internet, as might be parts of the penultimate one, but the first two are things I don't really want to ever touch the internet's cloudy appendages. We'll need some kind of database, then.
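
The umbrella question above boils down to joining three pieces of data: the user, their schedule, and a forecast per location. Here is a minimal Python sketch of that join, with everything hard-coded. The data, the 50% rain threshold and the names (`Appointment`, `needs_umbrella`) are assumptions for illustration; in practice the forecasts would come from a weather service and the schedule from the local store.

```python
from dataclasses import dataclass


@dataclass
class Appointment:
    title: str
    location: str
    hour: int  # hour of day, 0-23


# Hypothetical local data: who the user is and what they are doing today.
SCHEDULES = {
    "roxton": [
        Appointment("Work", "office", 9),
        Appointment("Pub quiz", "pub", 19),
    ],
}

# Hypothetical forecast: (location, hour) -> probability of rain.
FORECASTS = {("office", 9): 0.1, ("pub", 19): 0.8}


def needs_umbrella(user, schedules, forecasts, threshold=0.5):
    """True if any of the user's appointments has a rain chance at or above threshold."""
    for appointment in schedules.get(user, []):
        chance = forecasts.get((appointment.location, appointment.hour), 0.0)
        if chance >= threshold:
            return True
    return False
```

Only the forecast lookup needs the internet; the user and schedule lookups can stay entirely local.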

As a rule I'm more pro- than anti-SQL; I find all the MongoDB-is-webscale nonsense a bit tiresome, but I think this is one of those cases where SQL is not really the right answer. Mervyn needs to store lots of different bits of data, much of it structured differently or even indifferently. What I think I really want is almost an OO-type model, where I can add properties and relationships quickly and on the fly. I haven't worked out the best way of doing this so far - certainly I could hammer it all into SQL, and maybe that will be what happens, but I'd rather not if I can possibly avoid it. I'm looking into the various NoSQL database options at the moment, but for now I've just been using a very loose XML file. I'm still working this out and changing my mind every few minutes, so I shan't say more until I have something more definite. This is probably going to become the single most important aspect of Mervyn - being able to properly handle user data is key to providing a good experience.
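
As an illustration of the "OO-type model" idea, where properties and relationships can be added on the fly, here is a small in-memory sketch in Python. It is not the XML file mentioned above nor any particular NoSQL product; the `Store` class, its method names (loosely echoing getVar and setVar) and the JSON dump are assumptions made for the example.

```python
import json


class Store:
    """A schemaless store: each entity has free-form properties and named links."""

    def __init__(self):
        self._entities = {}

    def _entity(self, name):
        # Create the entity on first use so nothing has to be declared up front.
        return self._entities.setdefault(name, {"props": {}, "links": {}})

    def set_var(self, entity, prop, value):
        self._entity(entity)["props"][prop] = value

    def get_var(self, entity, prop, default=None):
        return self._entities.get(entity, {}).get("props", {}).get(prop, default)

    def relate(self, source, relation, target):
        """Record that `source` is linked to `target` by `relation` (no duplicates)."""
        targets = self._entity(source)["links"].setdefault(relation, [])
        if target not in targets:
            targets.append(target)

    def related(self, source, relation):
        return list(self._entities.get(source, {}).get("links", {}).get(relation, []))

    def to_json(self):
        """Serialize everything, e.g. for saving to a local file."""
        return json.dumps(self._entities, sort_keys=True)
```

The attraction is that a new kind of fact costs nothing to add; the price is that nothing stops the data from drifting into inconsistency, which is the trade-off being weighed here.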

So, what next? Here's my to-do list:
  • Settle on a database design so that I don't have to re-write the getVar and setVar calls every day,
  • Continue to add functions
    • The next one should probably be appointments and scheduling
    • I have a horrible feeling that this will involve writing an entire calendar app to avoid storing everything with Google/Facebook etc.
    • Obviously this means that there will have to be some Google or Facebook integration so that events can be pulled or pushed as the user wants, but I might postpone that for now.
    • All this is really so that I can ask Mervyn "what have I got on this weekend?" and have it know. I'm terrible at that stuff and having an external brain that doesn't involve bothering my wife would be very welcome for both of us.
  • Investigate keyword spotting and microphone technologies for home integration (I've heard good things about CMU Sphinx for the former; I'm not so sure about the latter).
  • Try and find a voice synth package that's not completely foul (Google had quite a nice British male voice a few years ago, but it's since disappeared).
  • Write my own version of SIML to parse user input and match it to the requested task.

A fairly chunky list, that. Wish me luck.



Saturday 9 January 2016

Building a Home AI, Part 2: Specifications

02:05 Posted by Roxton , No comments
Having decided to build a cyber-valet, it's probably a good idea to think about what such an assistant would be - which, in the case of something that is entirely servile, is probably synonymous with what it does. Zuck's Facebook post mentions a few things:
  • Controlling the lights and temperature in his house
  • Watching his daughter’s room for hazards
  • Admitting guests based on facial recognition.
The first and last of these are relatively simple goals; so much so that there are already off-the-shelf products that have achieved them. The second is rather more challenging – object recognition is a really tough problem, especially when you’re trying to look for lots of different objects at once – and, as I don’t currently have any children, is something I might put aside for the moment.

Let’s add a few more things our valet should be able to do:
  • Run a bath
  • Make a pot of tea
  • Create, manage and remind me of appointments
  • Know my schedule and make recommendations for clothing, travel routes, etc.
  • Handle quotidian correspondence
  • Play and discover music, videos, etc.
  • Draw the curtains
  • Water the plants
  • Track groceries/household goods
  • Know when you’ve put something in the oven and when it’s done
  • Answer trivia questions and source quotations
  • Resolve intricate domestic dramas
  • Read Spinoza.
Maybe not those last two. Not yet, anyway.

What you’ll notice from that little list, which is really just an expanded version of Zuck’s, is that there are lots of different problems to be solved. I think the best way to group these problems is as follows:
  • Input
    • Verbal commands
    • Internet data
    • Sensor readings (light, heat, moisture, etc.)
  • Processing
  • Output
    • Software
      • Booking an appointment online
      • Sending an email
    • Hardware
      • Running a bath
      • Making toast
Input problems tend to be either trivial (sensor readings) or fantastically difficult (speech recognition).  In both cases I’m happy to let someone else solve the problem for now. In fact, because wiring my entire house with microphones and/or buying and hacking a dozen Amazon Echos is a pretty extreme starting step, I’m going to pass all input to Mervyn (oh, I’m calling it Mervyn for now, because I’m terrible at names) in text form through a console, with various tags (sensor, speech, etc.) to simulate different types of input. The point is that input solutions should really be plug-and-play, and aren’t particularly relevant to the ‘core’ problem of having an AI that can actually understand what it’s receiving.
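
A tagged console input could be as simple as a prefix in square brackets. The Python sketch below shows one possible convention; the `[tag] text` syntax, the default tag and the names used are assumptions, since the exact format isn't specified above.

```python
import re
from typing import NamedTuple


class TaggedInput(NamedTuple):
    source: str  # e.g. "speech", "sensor", or the default "text"
    text: str


# Hypothetical convention: an optional [tag] at the start of the line.
TAGGED = re.compile(r"^\[(?P<source>\w+)\]\s*(?P<text>.*)$")


def parse_line(line, default_source="text"):
    """Split a console line into its simulated source and its content."""
    line = line.strip()
    found = TAGGED.match(line)
    if found:
        return TaggedInput(found.group("source").lower(), found.group("text"))
    return TaggedInput(default_source, line)
```

For example, `parse_line("[sensor] hallway_temp=18.5")` yields a sensor reading, while an untagged line is treated as typed text. When real microphones or sensors arrive, they only need to produce the same `TaggedInput` values.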

The same can be said for many of the output problems, particularly those that rely more heavily on hardware. It shouldn’t matter exactly which brand of computer-linked thermostat I have, or whether I’ve built my own; Mervyn should be able to handle it. If the other man can do it better, let him.
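
One common way to get this kind of brand-independence is to have the core talk only to a small interface and let each device supply its own implementation. The Python sketch below illustrates the idea for a thermostat; the interface, the fake device and the 19 degree minimum are all hypothetical.

```python
from abc import ABC, abstractmethod


class Thermostat(ABC):
    """What the core needs from any thermostat, whatever the brand."""

    @abstractmethod
    def read_celsius(self) -> float:
        ...

    @abstractmethod
    def set_target_celsius(self, target: float) -> None:
        ...


class FakeThermostat(Thermostat):
    """A stand-in device for testing; a real driver would talk to hardware."""

    def __init__(self, current: float):
        self.current = current
        self.target = current

    def read_celsius(self) -> float:
        return self.current

    def set_target_celsius(self, target: float) -> None:
        self.target = target


def warm_up(thermostat: Thermostat, minimum: float = 19.0) -> bool:
    """Raise the target to `minimum` if the room is colder; report whether it acted."""
    if thermostat.read_celsius() < minimum:
        thermostat.set_target_celsius(minimum)
        return True
    return False
```
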
The place to start, then, is on that un-expanded ‘Processing’ step in the middle, which is rather obvious. Some of this can be outsourced as well: there’s no need to build a trivia engine when Google search will do it for you, and a combination of TensorFlow and Google Inbox can be used to generate email replies. Mervyn doesn’t need to have a massively deep neural network; it just needs to access one from time to time.

So far I’ve mostly talked about what Mervyn won’t be. It will, of course, change over time, but for now I’m anticipating two main components. The first will be a rather messy flowchart – basically a collection of rules and procedures for handling various types of input. For example, if I ask whether or not I need an umbrella, Mervyn should look up where I’m going in the near future, check that against weather reports, and reply.

The second component will be a large, loosely structured database containing all the information that Mervyn needs to do its job. This should include big chains of relationships between items, as well as previous instructions, modifications to those instructions and so on. This database would be the thing that really made Mervyn a personal assistant, and would be different for each installation. Because it would inevitably include a fair amount of personal information, it’s important to state straight away that I would never, ever want this database to be accessible over the internet or shared with anyone. If you want to take Mervyn with you, you can clone it to a memory stick or load it onto your phone. Maybe, if I were absolutely sure I had the encryption nailed down properly, you could have remote access. But there’d never be any kind of corporate exploitation of that data – it would be entirely local and under the control of the user.

So, what next? Let’s start with the little things:
  • Build the interface (a basic command line will do for now).
  • Access, store and interpret weather data.
  • Use that data to respond to weather queries.
This shouldn’t be particularly difficult (famous last words), and opens up nice avenues for further development – location tracking, scheduling, and so on.

Tuesday 5 January 2016

Building a home AI: Introduction

00:14 Posted by Roxton , No comments
Another year, another press release from Mark Zuckerberg's PR team, this time listing his New Year's resolutions. One of these is to build "a simple AI" to run his home and help with his work, à la Jarvis from Iron Man. Being fantastically lazy myself, I have often wished for such an assistant, digital or otherwise, and I thought that a pleasant project for this year might be to attempt its creation, mirroring Zuck's progress.

Of course, I'm going to be facing slightly different challenges. Zuckerberg has several billion dollars; I have a student loan. He has hundreds of the brightest minds in AI research; I have a cat. On the other hand, I don't need to 'visualise data in VR' or find new ways to extort value from the harvested data of a billion users.

Next time, I'll run through some of the specifications and requirements of the proposed system, but until then I'll be working on the hardest part of any new project: naming it. In keeping with the robo-butler theme, here are some ideas:
  • Jarvis (getting the obvious out of the way)
  • Jeeves (actually a valet, and also terribly obvious)
  • Beach
  • Bunter
    • Mervyn
  • Kyrano
  • Merridew
  • Seppings
  • Butterfield
  • Oakshott
  • Wodehouse.
What's that, Verity? Using the names of fictional servants is forbidden (and possibly in violation of copyright law)? Bah. If anyone reading this is capable of original thought (I'm not, clearly), shove your ideas below.

Tuesday 22 December 2015