How does a Machine Learn?
Talk to a machine learning expert about implementing anything and they will say, “We need to train the system first.” What they usually mean is that the system will be fed known inputs paired with their known outputs. Once it has seen enough of these examples, the model should be able to predict an output for a given input, even if that input is not an exact match for anything it has seen before.
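To make the idea concrete, here is a minimal sketch of one of the simplest learning techniques, a nearest-neighbor classifier, written in plain Python. The animal measurements, labels, and function names are made up for illustration. "Training" here is nothing more than remembering labeled examples, and prediction returns the label of the closest remembered example.

```python
import math

def train(examples):
    """For a nearest-neighbor model, training just stores the labeled examples."""
    return list(examples)

def predict(model, features):
    """Return the label of the stored example closest to the new input."""
    _, nearest_label = min(
        model, key=lambda pair: math.dist(pair[0], features)
    )
    return nearest_label

# Known inputs (height in cm, weight in kg) paired with known outputs (labels).
# These numbers are invented for the example.
training_data = [
    ((20.0, 4.0), "cat"),
    ((25.0, 5.0), "cat"),
    ((60.0, 30.0), "dog"),
    ((70.0, 35.0), "dog"),
]

model = train(training_data)

# Neither of these inputs appears in the training data.
small_animal = predict(model, (23.0, 4.5))
large_animal = predict(model, (65.0, 32.0))
print(small_animal, large_animal)
```

Real systems use far more examples and far more sophisticated models, but the shape is the same: labeled examples go in, and the model generalizes to inputs it has not seen exactly.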
A classic example of this is image recognition. If you have an iPhone, it may have asked you to tag pictures of people with names. When you do, you are essentially “training” the model, which internally correlates the facial features of the people in your pictures with the names you supply. Once you have tagged a few pictures of your friends, it can label other pictures of those friends with their names. Humans are very good at recognizing images and making these correlations, and we spend the first few years of our lives building them.
Facebook and Apple made you work for them and you didn’t even realize it!
An Ideal Use Case
How do you train your algorithm if you are a company that has taken pictures of every highway and street across a third of the world and is trying to classify what appears in those images as storefronts vs. street signs? You acquire a CAPTCHA company. Yes, the annoying widget at the bottom of forms that asks you to identify all the street signs in a set of pictures when you are trying to buy tickets to your favorite show.
Google acquired reCAPTCHA in 2009, in part to power the training of its image recognition and OCR (Optical Character Recognition) algorithms. reCAPTCHA is reported to be used hundreds of millions of times a day by users all over the world. It was also used to help train and augment Google’s OCR engines in their effort to digitize older books and newspaper articles whose print is too distorted or faded to be recognized automatically.
RPA and ML/AI
Robotic Process Automation is often augmented with such AI and Machine Learning algorithms to process natural language or identify documents. The usual technique for creating a training data set for these types of implementations is to use historical data. When historical data is not available, humans initially seed the algorithm with the variations of documents or phrases that can be expected. When the system then encounters a case that it cannot classify with a high level of confidence, the case is rerouted to a human for confirmation. Once the human resolves the ambiguity, the new information is added to the training data set, and the system becomes more “intelligent”.
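The feedback loop described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how any particular RPA product works: the word-overlap classifier, the 0.5 confidence threshold, the document labels, and the `ask_human` stand-in are all hypothetical.

```python
CONFIDENCE_THRESHOLD = 0.5  # hypothetical cutoff; a real system would tune this

def tokenize(text):
    return set(text.lower().split())

def classify(training_data, text):
    """Return (label, confidence) for the most similar stored example.

    Confidence is the Jaccard similarity (shared words / all words), a
    deliberately simple stand-in for a real model's confidence score.
    """
    words = tokenize(text)
    best_label, best_score = None, 0.0
    for example_text, label in training_data:
        example_words = tokenize(example_text)
        union = words | example_words
        score = len(words & example_words) / len(union) if union else 0.0
        if score > best_score:
            best_label, best_score = label, score
    return best_label, best_score

def process(training_data, text, ask_human):
    """Classify text, rerouting low-confidence cases to a human."""
    label, confidence = classify(training_data, text)
    if label is None or confidence < CONFIDENCE_THRESHOLD:
        label = ask_human(text)              # human resolves the ambiguity
        training_data.append((text, label))  # and the answer becomes training data
    return label

# Seed examples that a human might provide when no historical data exists.
training_data = [
    ("invoice number total amount due", "invoice"),
    ("purchase order ship to quantity", "purchase_order"),
]

human_queue = []  # records what was rerouted to a person

def ask_human(text):
    # Stand-in for a real review step; this reviewer always answers "receipt".
    human_queue.append(text)
    return "receipt"

first = process(training_data, "invoice total amount due", ask_human)
second = process(training_data, "receipt paid by card thank you", ask_human)
third = process(training_data, "receipt paid by card", ask_human)
print(first, second, third, len(human_queue))
```

In this run only the second document is rerouted to the human. The third is similar enough to the newly added example that the system classifies it on its own, which is the sense in which the system has become more “intelligent”.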