A Look at the 3 Most Common Machine Learning Obstacles

September 23, 2019
Machine Learning

 

When we talk about artificial intelligence (AI), the research and its findings have far surpassed our modest expectations. Some experts believe this is the golden age of AI and machine learning (ML) projects, a time when the human mind is still surprised by all the possibilities they bring to the table. However, it is only when you start working on a project involving these technologies that you realize there are a few obstacles to address before you can start throwing a party.

Predictive assembly line maintenance, image recognition, and automatic disease detection are some of the biggest applications of ML-driven automation. But what hurdles do data scientists need to clear if they want to put these applications into practice and get the desired outcome?

This article will give you an overview of the three common obstacles involved in machine learning models.

Common Machine Learning Obstacles

In theory, machine learning evangelists tend to liken the technology to magic. People scrolling through Facebook watch videos with buzzwords in their captions and come to believe that AI can do wonders. Of course it can, but in practice it is not as easy as it sounds. Commercial use of machine learning still has a long way to go, because the reference dataset that any such model depends on has to be cleaned and organized at such a fine level of detail that the work becomes tedious.

Ask any data scientist who has worked on a deep learning project and she will tell you all about it: the time, the resources, and the particular skills needed to create the dataset, often called a training set. But these challenges are found in any project. Machine learning also comes with a few peculiar ones of its own.

Let’s dig deeper into these three common obstacles and find out why they are so integral to the larger machine learning problem.

The Black Box Problem

Imagine a machine learning program developed to predict whether a given object is a red apple or not. In the early days of machine learning research, this meant writing a simple program with rules about the color red and the shape of an apple. Such a program was fully understood by the people who wrote it. The problem with modern programs is that although humans develop them, their developers often cannot explain how the program actually detects the object, even when it does so with unprecedented accuracy. This is also one of the issues hampering the wide application of data classification models.
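To make the contrast concrete, here is a minimal sketch of what such an early, hand-written program might look like. The Fruit fields and every threshold below are hypothetical, chosen only for illustration; the point is that each decision is a rule a person wrote and can read back.

```python
from dataclasses import dataclass


@dataclass
class Fruit:
    # Hypothetical features: average color channels (0-255) and a
    # roundness score from 0.0 (elongated) to 1.0 (perfect circle).
    red: int
    green: int
    blue: int
    roundness: float


def is_red_apple(fruit: Fruit) -> bool:
    """Hand-written rules: every threshold is an illustrative assumption."""
    is_red = fruit.red > 150 and fruit.green < 100 and fruit.blue < 100
    is_apple_shaped = 0.75 <= fruit.roundness <= 0.95
    return is_red and is_apple_shaped


print(is_red_apple(Fruit(200, 30, 40, 0.85)))   # red and roundish -> True
print(is_red_apple(Fruit(230, 220, 50, 0.30)))  # yellow and elongated -> False
```

Anyone can trace why the second fruit was rejected: its green channel is too high and its shape is too elongated. A trained model offers no such readable trail.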

Experts have studied this problem and tried to crack it, but a complete solution remains elusive because it is very hard to inspect a model's reasoning in terms a human can follow. The program gives fabulous results, results that are much needed to tell whether a given fruit is a red apple or not among a wide range of fruits that includes non-apples, but the lack of knowledge about how it works makes the whole science behind it feel like alchemy.

If you have been following world news about AI-enabled products, this is arguably a major reason why ‘accidents’ are so hard to explain. That self-driving car that hit a divider when there was no apparent reason for it to? Working out why is the black box problem right there.

What Classification Model to Choose?

This is another common obstacle that comes in the way of data scientists and successful AI tools. As you might know, each classification model has its own set of pros and cons. There is also the issue of the unique set of data that has been fed to it and the unique outcome that is desired.

For example, a program that only has to decide whether a fruit is a red apple or not (a two-class problem) is quite different from a program that must sort each observation into one of many possible categories. This puts the scientists behind the program in a difficult situation.

Although there are ways to simplify this to an extent, it often ends up as a process of trial and error. What needs to be accomplished, what type and volume of data is available, and what characteristics are involved are some of the common questions that need to be asked. The answers will help a team of engineers and data scientists select an appropriate model. Some popular models are k-nearest neighbors (kNN), decision trees, and logistic regression.
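The trial-and-error process can be sketched in a few lines: train each candidate on the same data, score each on held-out data, and compare. The example below is a rough illustration, not a recommended workflow. It assumes synthetic two-cluster data, uses a hand-rolled kNN, and stands in a one-level decision tree (a "stump") for a full decision tree; logistic regression is left out to keep the sketch short. All function names are made up for this example.

```python
import random
from collections import Counter


def make_data(n, rng):
    """Synthetic data: label 0 clusters around (0, 0), label 1 around (3, 3)."""
    rows = []
    for _ in range(n):
        label = rng.randint(0, 1)
        centre = 3.0 * label
        rows.append(((rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)), label))
    return rows


def knn_predict(train, point, k=5):
    """Majority vote among the k training points closest to `point`."""
    def sq_dist(row):
        (x, y), _ = row
        return (x - point[0]) ** 2 + (y - point[1]) ** 2
    nearest = sorted(train, key=sq_dist)[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def stump_predict(stump, point):
    """Apply a one-level decision tree: a single threshold on one feature."""
    feature, threshold, label_above = stump
    return label_above if point[feature] > threshold else 1 - label_above


def fit_stump(train):
    """Try every (feature, threshold, direction) and keep the best on train."""
    best_correct, best_stump = -1, None
    for feature in (0, 1):
        for point, _ in train:
            for label_above in (0, 1):
                stump = (feature, point[feature], label_above)
                correct = sum(stump_predict(stump, p) == y for p, y in train)
                if correct > best_correct:
                    best_correct, best_stump = correct, stump
    return best_stump


def accuracy(predict, rows):
    """Fraction of rows whose predicted label matches the true label."""
    return sum(predict(x) == y for x, y in rows) / len(rows)


rng = random.Random(0)          # fixed seed so the run is repeatable
data = make_data(200, rng)
train, test = data[:150], data[150:]

stump = fit_stump(train)
scores = {
    "kNN (k=5)": accuracy(lambda p: knn_predict(train, p), test),
    "decision stump": accuracy(lambda p: stump_predict(stump, p), test),
}
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: held-out accuracy {score:.2f}")
```

Which candidate comes out ahead depends on the data, which is exactly why the questions about the goal, the data type and volume, and the characteristics involved matter before settling on a model.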

Data Overfitting

This one is easier to understand through an example. Take, for instance, a robot that has been fed the floor plan of a Walmart store. That is the only thing it has been given, and the expected outcome is that the robot can follow a set of directions and reach any given point in the store. What happens if the robot is instead brought to a Costco store that is laid out entirely differently? It is reasonable to assume that it won’t get beyond the first few steps, because the floor plan in its memory does not match the floor plan of the new store.

This kind of failure is what is known as overfitting in machine learning: the model fits its training data so closely that it is unable to generalize well to new data. There are remedies, but experts suggest prevention rather than cure. Regularization is one such prevention mechanism: it penalizes unnecessary model complexity so that the model learns the general pattern instead of memorizing its training examples. Training on data that is varied enough to represent the requests the model will actually handle helps as well.
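A common way to spot this problem is to compare accuracy on the training data with accuracy on data the model has never seen. The sketch below is only an illustration under stated assumptions: the data is synthetic, 20% of the labels are deliberately wrong, and the model is a hand-rolled nearest-neighbor classifier. With k=1 the model memorizes every training point, much like the robot memorizing one floor plan; a larger k forces it to average over neighbors.

```python
import random
from collections import Counter


def make_noisy_data(n, rng, flip=0.2):
    """1-D points: the true label is 1 when x > 0, but a fraction `flip`
    of the labels is deliberately wrong to simulate noisy data."""
    rows = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        label = int(x > 0)
        if rng.random() < flip:
            label = 1 - label
        rows.append((x, label))
    return rows


def knn_predict(train, x, k):
    """Majority vote among the k training points closest to x."""
    nearest = sorted(train, key=lambda row: abs(row[0] - x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def accuracy(train, rows, k):
    """Fraction of `rows` classified correctly by kNN fitted on `train`."""
    return sum(knn_predict(train, x, k) == y for x, y in rows) / len(rows)


rng = random.Random(42)         # fixed seed so the run is repeatable
train = make_noisy_data(200, rng)
test = make_noisy_data(200, rng)

for k in (1, 25):
    train_acc = accuracy(train, train, k)
    test_acc = accuracy(train, test, k)
    print(f"k={k:>2}  train accuracy={train_acc:.2f}  test accuracy={test_acc:.2f}")
```

With k=1 the training accuracy is perfect, because every point is its own nearest neighbor, wrong labels included, while accuracy on the unseen points is lower. A large gap between the two numbers is the usual warning sign of overfitting.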

The three obstacles above are among the most common, but there are many more, such as a talent deficit, the unavailability of free data, and insufficient research and development in the field. In that vein, it is not fair to demand much more of the technology when it is relatively new compared with technologies that took years or decades to evolve and are now part of our daily routine (the Internet Protocol, hard disks, and GPS are some examples).

If you are an aspiring data scientist, one thing you can do is contribute to the research and development of machine learning and engage in more discussion, both online and offline.
