April 17, 2017

Episerver Automatic Image Tagging with Microsoft Cognitive Services

I must admit I'm enjoying messing around with Microsoft's Cognitive Services, as is probably evidenced by the SMS App I made for my Dad. I also figured I could turn this into a practical application for work.

I'm willing to bet that most organizations are still a bit behind on the content strategy curve at this point and aren't adequately tagging their content and images. So I decided to build out a demo-ware / proof of concept for auto-tagging images, which will soon grow into processing content text, possibly as an add-on for the community. We'll see.

It's worth noting that the results that come back are suspect at best. Machine-based tagging should be looked at much like we look at machine translation: it'll get close, but if you use it, there should be some human moderation in play. Whether that happens pre-publish or post-publish doesn't matter; just know that you're not going to get 100% accuracy. Expect some tags you'll want to remove, and expect to step in and add your own as well.

Still, I find that the tagger inserts values that are likely helpful as well as likely un-thought-of by some authors.

All of the code below goes into an Episerver IInitializableModule with the following includes (because I dislike when people leave these out of their instructions/samples...):

using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using BrightfindAlloyDemo.Models.Media;
using EPiServer;
using EPiServer.Core;
using EPiServer.Framework;
using EPiServer.Framework.Initialization;
using EPiServer.ServiceLocation;
using Microsoft.ProjectOxford.Vision;
using Microsoft.ProjectOxford.Vision.Contract;

Microsoft's package for working with these services is pretty good. Very easy to implement to make quick calls out. Setting up a client is a one-line task, which I put inside a Lazy initializer in order to prevent it being initialized until needed. Probably not necessary, but I've gotten into that habit...

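// A readonly field (assigned with "=") rather than an expression-bodied property ("=>"),
// so the same Lazy instance, and therefore the same client, is reused on every access.
private static readonly Lazy<VisionServiceClient> _visionService = new Lazy<VisionServiceClient>(() => new VisionServiceClient(Global.MicrosoftDemoValues.CognitiveServicesApiKey));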

The method I wanted to use to auto-tag content, however, only existed in an Async form. I quickly found out that this didn't work well when used somewhat directly with Episerver's synchronous content event handlers.

Effectively, you can't add an async / await method as part of the handler. I started getting errors in the event log when I tried this. However, if I took those out, then the method would close out before the asynchronous request to Microsoft's services would return any data, which meant it did nothing but waste cycles.

Thanks to a little guidance from a fellow developer and a bit of Googling, I came upon something that works - using Tasks to run an Async method within one that isn't.

Without further ado, here's the code with comments to explain things.

public void Initialize(InitializationEngine context)
{
    var contentEvents = ServiceLocator.Current.GetInstance<IContentEvents>();
    contentEvents.SavingContent += AddImageAnalysis;
}

public void Uninitialize(InitializationEngine context)
{
    var contentEvents = ServiceLocator.Current.GetInstance<IContentEvents>();
    contentEvents.SavingContent -= AddImageAnalysis;
}

// This is the non-async method I want to add to the event handler so it doesn't go fubar on me.
private static void AddImageAnalysis(object content, ContentEventArgs e)
{
    // Exit if the content is not an image or if the image content already has data in the fields (only want to add if empty, don't want to override user-entered data).
    var imageContent = e.Content as ImageFile;
    if (imageContent == null) return;
    // if either is empty, we want to call the MS service, only exit if both have a value
    if (!string.IsNullOrEmpty(imageContent.Tags) && !string.IsNullOrEmpty(imageContent.Description)) return;

    // Task code to run an async method. Provides a token so we could cancel the task if we wanted to. You also could establish a hard timeout in ms in this area.
    var tokenSource = new CancellationTokenSource();
    var token = tokenSource.Token;
    // sets up the task to run via delegate
    var t = Task.Run(async () =>
    {
        await DoAnalysisAsync(imageContent);
    }, token);
    try
    {
        // wait for the task to complete
        t.Wait();
    }
    catch (AggregateException)
    {
        // TODO Log exceptions
    }
    finally
    {
        // probably could use a "using" statement instead of try/catch/finally, but you'd still want the try/catch portion anyway... so little difference.
        tokenSource.Dispose();
    }
}

// The async method being run by the task above. This is where I call the MS service.
private static async Task DoAnalysisAsync(ImageFile image)
{
    // MS return type
    AnalysisResult analysis = null;

    // passing the image as binary data - alternatively could pass a publicly accessible URL, but this is pre-publish so that won't work.
    using (var stream = image.BinaryData.OpenRead())
    {
        // calling MS services, second argument specifies the results in which I'm interested - not doing anything with categories yet
        analysis = await _visionService.Value.AnalyzeImageAsync(stream, new List<VisualFeature> { VisualFeature.Description, VisualFeature.Categories, VisualFeature.Tags });
    }
    // return if we got nothing back
    if (analysis == null) return;

    // only add/update if the field is empty
    if (string.IsNullOrEmpty(image.Tags))
    {
        // only get the tags that are above a confidence threshold
        var confidentTags = analysis.Tags?.Where(t => t.Confidence >= 0.25).Select(t => t.Name) ?? new string[] { };
        // of note - this format is used by Geta.Tags add-on, which will push these into the tags data store on publish :)
        image.Tags = string.Join(",", confidentTags);
    }

    if (string.IsNullOrEmpty(image.Description))
    {
        // only get descriptions/captions above a confidence threshold (null-safe in case no description comes back)
        var confidentDescriptions = analysis.Description?.Captions?.Where(c => c.Confidence > 0.25).Select(c => c.Text) ?? new string[] { };
        // usually just one, but in case there are multiple, set them up in sentence format for screen-readers to add appropriate pauses
        image.Description = string.Join(". ", confidentDescriptions);
    }
}
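
The comment in AddImageAnalysis mentions that a hard timeout could be established where the task is awaited. A minimal sketch of that idea follows, assuming nothing about Episerver: the SyncOverAsync class and RunWithTimeout method are hypothetical names made up for illustration, and only standard System.Threading.Tasks calls are used. It is one possible way to do it, not necessarily how the handler above should be written.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper (not part of the original post): runs async work from
// synchronous code and gives up after a fixed amount of time.
public static class SyncOverAsync
{
    // Returns true if the work finished within the timeout,
    // false if it timed out or threw.
    public static bool RunWithTimeout(Func<CancellationToken, Task> work, TimeSpan timeout)
    {
        using (var tokenSource = new CancellationTokenSource())
        {
            // Task.Run moves the async work onto the thread pool so that
            // blocking on it here does not deadlock the calling thread.
            var task = Task.Run(() => work(tokenSource.Token), tokenSource.Token);
            try
            {
                // Wait(TimeSpan) returns false when the timeout elapses first.
                if (task.Wait(timeout)) return true;

                // Ask the work to stop; it only stops if it observes the token.
                tokenSource.Cancel();
                return false;
            }
            catch (AggregateException)
            {
                // TODO Log exceptions
                return false;
            }
        }
    }
}
```

In the event handler, the call might then look like `SyncOverAsync.RunWithTimeout(ct => DoAnalysisAsync(imageContent), TimeSpan.FromSeconds(10));`, where the ten-second limit is an arbitrary value chosen for the example. The trade-off is that a slow response from the service means the image is saved without generated tags rather than holding up the save.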

Sample image:

Sample output (using Geta Tags):

April 13, 2017

200 Lines or Less: Combining Twilio, AWS Lambda, and MS Cognitive Services into an SMS Image Analyzer

(Cool story. Show me the code!)

I've been doing a bit in the way of personal projects and a major source of inspiration for them is my Dad, who has been slowly but surely losing his eyesight for the past decade or so. I really want to help him navigate an intensely, naturally visual world as his affliction progresses. Through hours... and hours... of searching Google, I've found very few resources that are practical enough for every-day use.

That's not to say that there's nothing out there. Apple has been doing a fairly great job making iOS accessible for him - he zooms and has text read to him all the time on his phone. He uses his phone camera to snap pictures of items he wants to zoom in on to read. Not to mention Siri and the help she/it has provided so he can stay in touch with family and friends. Siri also helps him find how to get home, which is pretty critical to a man who walks nearly everywhere and can't read road signs. If he takes a wrong turn, it's easy for him to ask "Where am I?" to regain his bearings.

I've introduced my Dad, an avid reader all his life, to the wonder of audiobooks. He now has multiple Alexa-enabled devices around his condo that he uses to control lights, set timers, manage lists, and get tide schedules (he lives by the beach).

So there's not nothing for him to use, but there are a lot of things that are not as good as they could be for him. And some of what I've found online, such as desktop magnifiers, is simply absurdly priced as medical equipment rather than as the convenient household electronics it could or should be. The cheap ones I've seen hover around the $1,500 mark. For a camera and a screen. I'm considering building one out of an old monitor and a Raspberry Pi with a camera. Maybe even a Pi Zero. I figure it'll set me back about $200.

The point is that the resources are few, expensive, or only moderately effective.

What's a developer to do? Why, build something of course!

So I spent about 3 days of my time researching and building out a service that will take the "take a picture and zoom in" process a step further. At least that's my hope and that's what he's testing out right now.

It's an SMS/MMS-based app or service. You send it a message with an image; it analyzes the image, performs a bit of OCR, and sends back a text or two containing, respectively, a brief description of the image and any text found in the photograph.

To build this out, I used Twilio to handle SMS/MMS and pass the received message to a webhook. The webhook is programmed in Node.js and hosted in an AWS Lambda Function with an AWS API Gateway endpoint sitting in front providing pass-through for the function. All image processing is handled by Microsoft Cognitive Services.

As near as I can tell, the only charges I'll incur are for the SMS/MMS, which range from 1 cent for an MMS message to some fraction of that for the SMS replies.

AWS Lambda Functions fall under a free tier that isn't subject to the 12-month restriction. In order to max out that tier, I'd have to exceed 1 million requests (or 400,000 GB-seconds of compute) per month - my Twilio costs would skyrocket before I ever came close to that.

Microsoft Cognitive Services - specifically the Computer Vision API - allow for 5,000 requests per month. I'm making two requests per image, so that becomes 2,500 texts I can handle per month for free using their API. If my dad is sending that many... he's got some sort of addiction.
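
The arithmetic above can be written down as a small sketch. The FreeTierMath class and its method names are hypothetical, and the only figures used are the ones quoted in this post (5,000 free requests, two requests per image, 1 cent per MMS); the SMS reply price is left out because it isn't stated.

```csharp
// Hypothetical helper for the back-of-the-envelope numbers in this post.
public static class FreeTierMath
{
    // How many images fit in the free quota when each image costs several API requests.
    public static int ImagesPerMonth(int freeRequestsPerMonth, int requestsPerImage)
    {
        return freeRequestsPerMonth / requestsPerImage;
    }

    // Cost of the inbound MMS messages only; outbound SMS replies are not included.
    public static decimal InboundMmsCost(int images, decimal pricePerMms)
    {
        return images * pricePerMms;
    }
}
```

With the numbers from this post, `FreeTierMath.ImagesPerMonth(5000, 2)` gives 2,500 images, and `FreeTierMath.InboundMmsCost(2500, 0.01m)` puts the inbound MMS bill for that volume at $25 before SMS replies.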

Really, it's amazing what these companies give away for free / cheap. What they're doing is fueling innovation and exploration of technology by smaller companies and solo developers, and that's just plain awesome of them.

In the interest of brevity, I opted to post all of the code, hopefully well-commented, to GitHub.

March 20, 2017

Quick tip: Checking the version of your Episerver Database

Just a quick post on this one. A friend was recently having problems trying to make updates to his local Episerver site via NuGet. Even after running updates to the DB via the package manager console (update-epidatabase), he was getting errors, and he was concerned that there was a DB version mismatch between what he had and what the code was expecting.

(Note: he was trying to use the auto-create schema function in a version prior to when it was implemented.)

If, for whatever reason, you want to verify your DB version against what you should have for your code to work correctly, check the version of your DLL first, then go to Episerver's NuGet site, which has a handy Compare Database tool. Plug in from/to values that at least encompass your DLL version number. This will give you a list of DLL versions paired with DB versions, so you can see where changes to the database have been made.

Next, open up SQL Server Management Studio and connect to your DB server. Navigate the tree to [YourDB] > Programmability > Stored Procedures and execute dbo.sp_DatabaseVersion. The return value of this stored procedure will be the version for your DB.

A screenshot of Sql Server Management Studio running the Database Version stored procedure.

Compare and proceed accordingly.

March 6, 2017

Face-Based Login with Episerver and Microsoft Cognitive Services

Last week was Episerver Ascend 2017 and as part of the last day a good number of us participated in a Microsoft-sponsored Code Bash - a miniaturized hackathon, if you will. While the event itself was not without flaw (we could all have used more time and there were scheduling conflicts with other sessions), the technology employed was exciting and fun to work with.

October 13, 2016

A Comprehensive Guide to using Generated .NET Content Types with Ektron CMS 8.5+

The example I'm creating below is for output of images in a list, as though I were going to make a gallery. Keep in mind that there are other things I would do when actually creating a gallery, so this is by no means completed code in that regard. It is, however, a very good and simple illustration for how to retrieve and handle content from Ektron in the most .NET way possible (without a lot more separation of concerns).