April 17, 2017

Episerver Automatic Image Tagging with Microsoft Cognitive Services

I must admit I'm enjoying messing around with Microsoft's Cognitive Services, as is probably evidenced by the SMS app I made for my Dad. I figured I could turn this into a practical application for work as well.

I'm willing to bet that most organizations are still a bit behind on the content strategy curve at this point and aren't adequately tagging their content and images. So I decided to build out a demo-ware / proof of concept for auto-tagging images, which will soon grow into processing content text, possibly as an add-on for the community. We'll see.

It's worth noting that the results that come back are suspect at best. Machine-based tagging should be looked at much like we look at machine-translation. It'll get close, but if you use it then there should be some human moderation put into play. Either pre-publish or post-publish doesn't matter, just know that you're not going to get 100% accuracy. You should expect some tags you'll want to remove and that you'll want to step in and add your own as well.

Still, I find that the tagger inserts values that are likely helpful as well as likely un-thought-of by some authors.



All of the code below goes into an Episerver IInitializableModule with the following includes (because I dislike when people leave these out of their instructions/samples...):

using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using BrightfindAlloyDemo.Models.Media;
using EPiServer;
using EPiServer.Core;
using EPiServer.Framework;
using EPiServer.Framework.Initialization;
using EPiServer.ServiceLocation;
using Microsoft.ProjectOxford.Vision;
using Microsoft.ProjectOxford.Vision.Contract;

Microsoft's package for working with these services is pretty good. It's very easy to set up and start making quick calls. Creating a client is a one-line task, which I put inside a Lazy initializer so the client isn't created until it's needed. Probably not necessary, but I've gotten into that habit...

private static readonly Lazy<VisionServiceClient> _visionService = new Lazy<VisionServiceClient>(() => new VisionServiceClient(Global.MicrosoftDemoValues.CognitiveServicesApiKey));
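
A Lazy only defers (and caches) the work if the same Lazy instance is reused on every access, which is why it belongs in a field assigned with `=` rather than in an expression-bodied property using `=>`. Here is a minimal sketch of the difference. `LazyDemo` and its method names are made up for illustration, and a plain `object` stands in for the real client:

```csharp
using System;

public static class LazyDemo
{
    // One Lazy<T> instance reused for every access: the factory runs once.
    public static int ConstructionsWhenShared()
    {
        int built = 0;
        var shared = new Lazy<object>(() => { built++; return new object(); });

        object first = shared.Value;   // factory runs here
        object second = shared.Value;  // cached value, factory does not run again

        return built;
    }

    // A new Lazy<T> created on each access (what a "=>" property does):
    // the factory runs every time, so nothing is cached.
    public static int ConstructionsWhenRecreated()
    {
        int built = 0;
        Func<Lazy<object>> perAccess =
            () => new Lazy<object>(() => { built++; return new object(); });

        object first = perAccess().Value;   // factory runs
        object second = perAccess().Value;  // different Lazy, factory runs again

        return built;
    }
}
```

With the shared instance the factory should run once for the two reads; with the recreated one it should run twice.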

The method I wanted to use to auto-tag content, however, only existed in an Async form. I quickly found out that this didn't work well when used somewhat directly with Episerver's synchronous content event handlers.

Effectively, you can't add an async / await method as part of the handler. I started getting errors in the event log when I tried this. However, if I took those out, then the method would close out before the asynchronous request to Microsoft's services would return any data, which meant it did nothing but waste cycles.

Thanks to a little guidance from a fellow developer and a bit of Googling, I came upon something that works - using Tasks to run an Async method within one that isn't.

Without further ado, here's the code with comments to explain things.

public void Initialize(InitializationEngine context)
{
    var contentEvents = ServiceLocator.Current.GetInstance<IContentEvents>();
    contentEvents.SavingContent += AddImageAnalysis;
}

public void Uninitialize(InitializationEngine context)
{
    var contentEvents = ServiceLocator.Current.GetInstance<IContentEvents>();
    contentEvents.SavingContent -= AddImageAnalysis;
}

// This is the non-async method I want to add to the event handler so it doesn't go fubar on me.
private static void AddImageAnalysis(object content, ContentEventArgs e)
{
    // Exit if the content is not an image.
    var imageContent = e.Content as ImageFile;
    if (imageContent == null) return;

    // Only call the MS service if Tags or Description is empty. Exit if both already have a value (don't want to override user-entered data).
    if (!string.IsNullOrEmpty(imageContent.Tags) && !string.IsNullOrEmpty(imageContent.Description)) return;

    // Task code to run an async method. Provides a token so we could cancel the task if we wanted to. You also could establish a hard timeout in ms in this area.
    var tokenSource = new CancellationTokenSource();
    var token = tokenSource.Token;
            
    // sets up the task to run via delegate
    var t = Task.Run(async () =>
    {
        await DoAnalysisAsync(imageContent);
    }, token);
    try
    {
        // wait for the task to complete
        t.Wait(token);
    } catch(AggregateException ex)
    {
        // TODO Log exceptions 
    } finally
    {
        // probably could use a "using" statement instead of try/catch/finally, but you'd still want the try/catch portion anyway... so little difference.
        tokenSource.Dispose();
    }
}

// The async method being run by the task above. This is where I call the MS service.
private static async Task DoAnalysisAsync(ImageFile image)
{
    // MS return type
    AnalysisResult analysis = null;

    // passing the image as binary data - alternatively could pass a publicly accessible URL, but this is pre-publish so that won't work.
    using (var stream = image.BinaryData.OpenRead())
    {
        // calling MS services, second argument specifies the results in which I'm interested - not doing anything with categories yet
        analysis = await _visionService.Value.AnalyzeImageAsync(stream, new List<VisualFeature>() { VisualFeature.Description, VisualFeature.Categories, VisualFeature.Tags });
    }
    // return if we got nothing back
    if (analysis == null) return;

    // only add/update if the field is empty  
    if (string.IsNullOrEmpty(image.Tags))
    {
        // only get the tags that are above a confidence threshold
        var confidentTags = analysis.Tags?.Where(t => t.Confidence >= 0.25).Select(t => t.Name) ?? new string[] { };
        // of note - this format is used by Geta.Tags add-on, which will push these into the tags data store on publish :)
        image.Tags = string.Join(",", confidentTags);
    }

    if (string.IsNullOrEmpty(image.Description))
    {
        // only get descriptions/captions above a confidence threshold 
        var confidentDescriptions = analysis.Description?.Captions?.Where(c => c.Confidence > 0.25).Select(c => c.Text) ?? Enumerable.Empty<string>();
        // usually just one, but in case there are multiple, set them up in sentence format for screen-readers to add appropriate pauses
        image.Description = string.Join(". ", confidentDescriptions);
    }
}
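
The comment in AddImageAnalysis mentions that a hard timeout could be set up around the task. Here's a rough sketch of one way to do that, not something taken from the module above. `TimeoutDemo`, `SlowCallAsync`, and `RunWithTimeout` are hypothetical names, and `Task.Delay` stands in for the call to Microsoft's service:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class TimeoutDemo
{
    // Hypothetical stand-in for the remote call: finishes after delayMs unless cancelled.
    private static async Task<string> SlowCallAsync(int delayMs, CancellationToken token)
    {
        await Task.Delay(delayMs, token);
        return "done";
    }

    // Runs the async call from synchronous code and gives up after timeoutMs.
    // Returns null if the call timed out or failed.
    public static string RunWithTimeout(int delayMs, int timeoutMs)
    {
        // This constructor cancels the token automatically after timeoutMs.
        using (var tokenSource = new CancellationTokenSource(timeoutMs))
        {
            var token = tokenSource.Token;
            var task = Task.Run(async () => await SlowCallAsync(delayMs, token), token);
            try
            {
                // Blocks until the task finishes or the token is cancelled.
                task.Wait(token);
                return task.Result;
            }
            catch (OperationCanceledException)
            {
                // The timeout fired while we were waiting.
                return null;
            }
            catch (AggregateException)
            {
                // The task itself faulted or was cancelled.
                return null;
            }
        }
    }
}
```

In the handler, a timeout like this would put an upper bound on how long a save can be held up by a slow response, at the cost of occasionally saving an image with no tags.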

Sample image:

Sample output (using Geta Tags):


2 comments:

Anonymous said...

I see you had the same idea as me :)

The problem I ran into with awaiting the execution is that Episerver gives an error message that the upload failed, although it didn't. So I chose to not await the analysis.

eGandalf said...

I was getting the same error for all prior attempts that were even close to successful. In this case, though, it takes a couple of seconds longer for the upload to complete, but it doesn't show a failure. It does the tagging at the time of the upload, too, giving authors the option to change the values after uploading.