Lockdown Day 15: Let's code something dumb with JavaScript AI (TensorFlow.JS)


Lockdown day 15, I'm going nuts. So I figured I'm gonna do some machine learning with my face. A genius idea that turned out to be psychologically helpful. Today we're going to play with APIs, libraries and my face. Fun ahead.



Lockdown day 15

I had a lot of time, and I spent it doing weird stuff. I found an old webcam in a drawer of my desk and started using it during the lockdown. My co-workers had to put up with my face every morning at the sync meeting. It was horrible for everyone.

Anyway, I've always wanted to have fun with TensorFlow.JS. I knew it could be fun, but I never had time to try it. So I started sniffing around demos. You can find a lot of already-working experiments built with it.





So, what is TensorFlow again? Well, as is often the case, it all starts at the Google offices. They obviously have a project about artificial intelligence. It's called Google Brain and it's been running since 2011.



What’s TensorFlow.JS?

TensorFlow.JS is an open source library that allows you to define, train and/or use machine learning models directly in JavaScript. All the complex parts of machine learning are abstracted away. You just use a high-level API.
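
To give you an idea of what that high-level API feels like, here's a minimal sketch of my own (it's not part of today's demo) that trains a tiny model to learn y = 2x - 1:

import * as tf from "@tensorflow/tfjs";

async function trainTinyModel() {
  // A single dense layer is enough for a linear relation
  const model = tf.sequential();
  model.add(tf.layers.dense({ units: 1, inputShape: [1] }));
  model.compile({ loss: "meanSquaredError", optimizer: "sgd" });

  // Six training examples of y = 2x - 1
  const xs = tf.tensor2d([-1, 0, 1, 2, 3, 4], [6, 1]);
  const ys = tf.tensor2d([-3, -1, 1, 3, 5, 7], [6, 1]);

  await model.fit(xs, ys, { epochs: 200 });

  // Should print something close to 19
  model.predict(tf.tensor2d([10], [1, 1])).print();
}

trainTinyModel();

No gradient math, no graph plumbing: a few calls and you have a model running in the browser.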

So, if you're not sure you understand what AI, machine learning or models are: I'm not going to cover that today. We've discussed it in the past when we understood artificial intelligence in just 10 minutes (French version).

Anyway, I continued to test all the official Google models. In fact, I spent a lot of time just messing around with them. Don’t judge me.





And then, I had a stupid idea! I needed a model that could recognize facial expressions. Instead of reinventing the wheel, I decided to find out if anyone had ever trained a similar model before. Bingo!



The fun part

I had everything I needed to build my idea. The fun coding part was about to start! First, I had to familiarize myself with the library and its API. So I decided to start by exploring everything it had to offer. I wasn't disappointed.





Now that I had this base, I could implement my dumb idea. This idea came to me after spending time in conference calls during this lockdown. Whether professional or personal, I spent a lot of time on video calls with a lot of people. The screen is split to show everyone's faces, and most of the time it's awkward.

The idea was simple. I'm gonna create a fake video call. Have the artificial intelligence monitor my emotions using the webcam, and fill the other spots in the call with gifs that match my emotion. If I smile, I'll only get smiling gifs. If I'm surprised, I'll only get surprised gifs. Etc etc…

It's useless, but it's very funny to do. And in a few hours, it was done! Look at this!





Show me the code

Everything works in the browser. Any browser. I made it responsive too, so it's usable on any device with a camera. Your phone, your iPad, your PC, it'll work. Yeah, you need a camera, otherwise nothing's gonna happen. And your face. You have to use your face, otherwise it won't work either.
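
For example, a tiny guard like this (my own sketch, it's not in the demo code) tells you right away whether the browser can even ask for a camera:

// Bail out early if the browser can't give us a camera stream
if (!navigator.mediaDevices || !navigator.mediaDevices.getUserMedia) {
  console.error("No camera access in this browser, nothing's gonna happen.");
}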

If you want to try it now, it’s hosted on this static page. I put the link to the codesandbox too so you can get your hands on the code. And it’s embedded just here.



Oh yeah, here's the main code I used!



let expressionGifs = {};
const expressionSpots = {
  topRight: null,
  bottomRight: null,
  bottomLeft: null
};

const video = document.getElementById("video");
video.addEventListener("play", refreshState);

/**
 * Launch the whole process following these steps
 * - Preload models from faceapi
 * - Fetch the data JSON for the gifs
 * - Preload all the gifs
 * - Ask user for the video stream and setup the video
 * @async
 */
async function launch() {
  await faceapi.nets.tinyFaceDetector.loadFromUri(
    "https://1rz6q.sse.codesandbox.io/dist/models"
  );
  await faceapi.nets.faceExpressionNet.loadFromUri(
    "https://1rz6q.sse.codesandbox.io/dist/models"
  );

  expressionGifs = await faceapi.fetchJson(
    "https://1rz6q.sse.codesandbox.io/dist/data/expressions.json"
  );

  preloadImages();
  setupVideo();
}

/**
 * Iterate over all the gifs fetched before and preload them
 * using the Image constructor.
 */
function preloadImages() {
  let fullListOfGifs = [];
  const images = [];
  const expressionGifsKeys = Object.keys(expressionGifs);

  for (const expressionGifsKey of expressionGifsKeys) {
    fullListOfGifs = fullListOfGifs.concat(expressionGifs[expressionGifsKey]);
  }

  for (let i = 0; i < fullListOfGifs.length; i++) {
    images[i] = new Image();
    images[i].src = fullListOfGifs[i];
  }
}

/**
 * Setup the video stream for the user.
 * On success, the stream of the video is set to the source of the HTML5 video.
 * On error, the error is logged and the process continues.
 */
function setupVideo() {
  navigator.mediaDevices
    .getUserMedia({ video: true, audio: false })
    .then(stream => {
      video.srcObject = stream;
    })
    .catch(err => console.error("can't find your camera :(", err));
}

/**
 * Get the most likely current expression using the faceapi detection object.
 * Build an array to iterate over each possibility and pick the most likely.
 * @param {Object} expressions object of expressions
 * @return {String}
 */
function getCurrentExpression(expressions) {
  const maxValue = Math.max(
    ...Object.values(expressions).filter(value => value <= 1)
  );
  const expressionsKeys = Object.keys(expressions);
  const mostLikely = expressionsKeys.filter(
    expression => expressions[expression] === maxValue
  );

  return mostLikely[0] ? mostLikely[0] : "Neutral";
}

/**
 * Set the background emotion gif on each div related to the current expression.
 * @param {String} expression current expression
 */
function spreadByExpression(expression) {
  const expressionSpotsKeys = Object.keys(expressionSpots);
  const randomGifsByExpression = getRandomGifsByExpression(expression);

  for (const expressionSpotKey of expressionSpotsKeys) {
    if (expressionSpots[expressionSpotKey] !== expression) {
      expressionSpots[expressionSpotKey] = expression;

      const randomGif = randomGifsByExpression.shift();
      document.getElementById(
        expressionSpotKey
      ).style.backgroundImage = `url('${randomGif}')`;
    }
  }
}

/**
 * Get three random gifs from the JSON data for the given expression and return them as an array
 * @param {String} expression current expression
 * @return {Array}
 */
function getRandomGifsByExpression(expression) {
  const randomGifs = [];
  const poolOfGifs = JSON.parse(JSON.stringify(expressionGifs[expression]));

  for (let i = 0; i <= 2; i++) {
    const randomNumber = Math.floor(Math.random() * poolOfGifs.length);
    // splice returns a one-element array, destructure to get the gif URL itself
    const [randomGif] = poolOfGifs.splice(randomNumber, 1);

    randomGifs.push(randomGif);
  }

  return randomGifs;
}

/**
 * Set a refresh interval where faceapi will scan the face of the subject
 * and return an object of the most likely expressions.
 * Use this detection data to pick an expression and spread background gifs on divs.
 * @async
 */
async function refreshState() {
  setInterval(async () => {
    const detections = await faceapi
      .detectAllFaces(video, new faceapi.TinyFaceDetectorOptions())
      .withFaceExpressions();

    if (detections && detections[0] && detections[0].expressions) {
      spreadByExpression(getCurrentExpression(detections[0].expressions));
    }
  }, 500);
}

launch();


I’m not going to go through the whole code to explain it. A quick overview will be enough.

First we preload the machine learning model files and then we set up the video. Then, using the faceapi API, we scan the user's face twice a second. Faceapi will kindly return an expressions object with a probability for each one.
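
For reference, the expressions object coming back from face-api.js looks roughly like this (the values below are made up for illustration):

// Illustrative detections[0].expressions, one probability per emotion
const expressions = {
  neutral: 0.02,
  happy: 0.95,
  sad: 0.001,
  angry: 0.002,
  fearful: 0.001,
  disgusted: 0.001,
  surprised: 0.025
};

// getCurrentExpression(expressions) would return "happy" here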

Then we iterate over each emotion and pick the most likely one: the one closest to 1 without exceeding it. Once we have the right emotion, we spread it around. We take three random gifs and distribute them over each of the divs. Every time the user's emotion changes, we refresh the state!





Epilogue

It kept my head busy. If you are interested in the AI topic, don't hesitate to tell me in the comments. We could train a model ourselves next time! And if you liked seeing me code useless stuff, you'll be happy: I might do it again pretty soon.
