At a recent NYC Meetup that we co-hosted, it was evident that the attendees were confused about the difference between Artificial Intelligence (AI), Machine Learning, Predictive Analytics and Natural Language Processing. Whilst AI is used as a generic term to encompass all four of these processes, in reality they can be considered separate entities. AI itself comes in two flavors: strong and weak. Strong AI is the ability of machines to think like human beings, like HAL 9000 in 2001: A Space Odyssey. Weak AI has the appearance of AI, but underneath is an algorithm or process custom-built for a specific task. What we are seeing with weak AI is the ability to process large amounts of data in a reasonable time, which gives the appearance of AI, but it isn’t AI in the general sense.
In this post, we detail what we think are the differences between the types of “AI”, and how they interrelate. We give some example use cases we have experienced in our careers. This is by no means the final word on the subject, but is intended to be the starting point for a discussion. Please feel welcome to comment and share.
A Brief History into the Internet & Today’s Data World
Before we take a closer look at AI, let’s briefly review what many consider the fundamental building blocks of today’s data world.
When the internet first became public, there were initial obstacles: how to interconnect everything at a macro scale (Wide Area Networks, WANs), handle high data-transfer capacity (DS-x/OC-x circuits), index it all (search engines), create Local Area Networks (LANs), secure it (firewalls, encryption, etc.), enable high-capacity processing (mainframes/DBs), and, last but not least, compliantly store it (tape to NAS/SAN).
From there we were catapulted into an era with far more data, bringing similar concepts to distributed systems. The diagram below highlights this data evolution to where we are today (the small, big, and fast data era).
Really? Data Science, Data Engineering and AI are related?
Now we are in the thick of the data era, where the decreasing value density of data, increasing data variety and complexity, and the exponential daily generation of new data all set the stage for the importance of organizing it.
Data has evolved into two very distinct areas: data engineering and data science. These major areas are further broken down into specific components we hear a lot about in our everyday business dealings and technology articles.
The way that we see these components is as follows:
Major components of Data Science:
1. Business Intelligence
2. Data Visualization
Major components of Data Engineering:
1. DB Administration
2. Data Storage
3. Systems Implementation
Okay, so now that we have a better idea of the data relationships, let’s get into the advanced data components and their relationship to AI (diagram below):
As this diagram illustrates, the two major areas of data science and data engineering encompass many more specific and advanced concepts, like data mining and machine learning, which are related to neural networks and deep learning respectively.
Now Let’s Get Down to AI in Retail
We hope you now have a good understanding of 1) how we got to this stage of data evolution and, most importantly, 2) how we got to this state of data convergence, where true (strong) AI is becoming more feasible. We can now define the different AI techniques, starting with the pinnacle and most advanced level of AI: Strong AI.
Strong AI – Artificial Intelligence – The ability of machines to think like humans: machines that know they can think and are self-aware. Characterized by tests such as the Turing Test and Steve Wozniak’s “ability to make a cup of coffee” test. Applications include general robotics (the kind of robot that you will eventually get in the home), driverless cars, etc. Humans are generally frightened by this type of AI, forewarned by sci-fi stories such as HAL 9000 in 2001: A Space Odyssey, which more often than not show AI exceeding and then subjugating humans.
The following diagram shows the relationship between the different AI techniques:
Weak AI – Formally, weak AI is AI that cannot determine whether it can think; there is no “mind” here. This is AI concerned with specific problems that have the appearance of intelligence because of their complexity, but that cannot be applied to a general problem. Such systems have become more and more prevalent because of the expansion of the volume of data and the availability of processing power.
The three areas below are generally regarded as being Weak AI.
NLP – Natural Language Processing – Extraction, matching, unstructured → structured. This type of processing, enabled by the advancement of processing power, is where free-form text is scanned and processed to extract information relevant to the use case. An example of this is the scanning of medical records to extract ICD-9 codes.
Generically, NLP is the process of taking unstructured data and turning it into structured data. In the majority of cases this can be accomplished by parsing and matching individual words or phrases. In some cases, though, it is necessary to add further machine learning-based processing on the back end to provide further insight.
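A minimal sketch of this kind of phrase matching, in Python. The phrase-to-code table here is invented purely for illustration; a real system would use a licensed, complete ICD-9 code set:

```python
import re

# Hypothetical mapping of diagnosis phrases to ICD-9 codes (illustrative only).
PHRASE_TO_ICD9 = {
    "type 2 diabetes": "250.00",
    "essential hypertension": "401.9",
    "acute bronchitis": "466.0",
}

def extract_icd9(note: str) -> dict:
    """Turn free-form clinical text into structured (phrase -> code) matches."""
    found = {}
    lowered = note.lower()
    for phrase, code in PHRASE_TO_ICD9.items():
        # \b word boundaries keep phrases from matching inside longer words.
        if re.search(r"\b" + re.escape(phrase) + r"\b", lowered):
            found[phrase] = code
    return found

record = "Patient presents with essential hypertension and type 2 diabetes."
print(extract_icd9(record))
# {'type 2 diabetes': '250.00', 'essential hypertension': '401.9'}
```

The output is structured data (a dictionary of codes) extracted from unstructured text, which is exactly the unstructured → structured step described above.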
ML – Machine Learning – Identification and classification. Processes can learn, based on previous history, and use that to identify, classify and determine actions for new inputs. There are generally two different forms: supervised learning, where a human being tags results; and unsupervised learning, where the result is determined automatically.
If we use the NLP use case, ML can use past inputs to characterize newly entered information. It may take a human to examine the output and categorize it, and then use that to automatically categorize new inputs. This is an example of supervised learning.
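As a toy sketch of supervised learning under these assumptions (the labelled notes and category names below are invented for illustration), a human assigns categories to past texts, and a simple word-overlap model then categorizes new inputs:

```python
from collections import Counter, defaultdict

# Hypothetical human-labelled training notes (the "supervised" part:
# a person assigned each category by hand).
training = [
    ("chest pain and shortness of breath", "cardiology"),
    ("irregular heartbeat detected", "cardiology"),
    ("persistent cough and wheezing", "pulmonology"),
    ("shortness of breath with wheezing", "pulmonology"),
]

def train(examples):
    """Count how often each word appears under each human-assigned label."""
    counts = defaultdict(Counter)
    for text, label in examples:
        counts[label].update(text.lower().split())
    return counts

def classify(counts, text):
    """Pick the label whose past vocabulary best overlaps the new text."""
    words = text.lower().split()
    scores = {label: sum(c[w] for w in words) for label, c in counts.items()}
    return max(scores, key=scores.get)

model = train(training)
print(classify(model, "patient reports wheezing and cough"))  # pulmonology
```

A production system would use a proper statistical model, but the shape is the same: human-tagged history in, automatic categorization of new inputs out.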
If we use online advertising as an example, banner ads that are served to a requestor can be automatically categorized as either unsuccessful or successful depending on whether the banner ad is clicked, although there could be a large time difference between serving the ad and the click. Previous performance (CTR – Click-Through Rate) can be used as the basis for handling subsequent ad requests. This is an example of unsupervised learning, in the sense that no human has to tag the outcomes.
PA – Predictive Analytics – Making predictions. Once we have characterized and learned from past history, we want to be able to predict future behavior. The prediction of future behavior can be used to evaluate and give the probability of many alternatives.
As a simple example, examining past weather patterns and current conditions can give the percentage chance of it raining today. PA does not say, “There will be rain today”, but says instead “There is a 45% chance of rain today”.
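A minimal sketch of that idea, with an invented weather history: the prediction is a relative frequency drawn from past outcomes, so the answer comes back as a probability rather than a yes/no:

```python
# Hypothetical history of past days: (sky condition, did it rain?).
history = [
    ("overcast", True), ("overcast", True), ("overcast", False),
    ("clear", False), ("clear", False), ("clear", True),
    ("overcast", True), ("clear", False),
]

def rain_probability(history, condition):
    """Estimate P(rain | condition) as a relative frequency in past data."""
    matching = [rained for cond, rained in history if cond == condition]
    return sum(matching) / len(matching)

# PA gives a probability, not a verdict:
print(f"{rain_probability(history, 'overcast'):.0%} chance of rain")  # 75%
```

Real predictive models condition on many variables at once, but the output has the same character: a probability for each alternative.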
If we extend the online advertising example given above, after the incoming request has been passed through a machine learning process to identify and classify the request, it can then be pushed through a predictive analytics stage that will predict the probability of a response for all available banner ads. The banner ad most likely to get a response can then be chosen and returned to the requestor.
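The prediction-and-selection stage can be sketched as follows, assuming invented ad names and serving counts. Each ad's click probability is estimated from its history (with simple Laplace smoothing so a brand-new ad isn't stuck at zero), and the ad with the highest predicted probability is returned:

```python
# Hypothetical serving history per banner ad: (impressions, clicks).
ad_history = {
    "ad_sneakers": (1000, 32),
    "ad_jackets":  (800, 40),
    "ad_watches":  (500, 9),
}

def predicted_ctr(impressions, clicks):
    """Laplace-smoothed click-through probability for one ad."""
    return (clicks + 1) / (impressions + 2)

def choose_ad(history):
    """Return the ad with the highest predicted probability of a click."""
    return max(history, key=lambda ad: predicted_ctr(*history[ad]))

print(choose_ad(ad_history))  # ad_jackets
```

In a real pipeline the probability would also be conditioned on the classified request from the machine-learning stage, not just on each ad's overall history.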
Please keep in mind that many of these techniques and terms may change over time, as many of these concepts are still evolving as I type this post.
In closing, we feel that the gradual data explosion, and the work of organizing and governing it with different technologies, has also opened the door to the possibility of AI beyond conceptual thinking.
We hope you’ve found this post informative, so the next time you hear the term “AI,” you have a really good idea of what it entails and some of the technologies involved in making Strong AI, something like our famous Star Wars friend C-3PO, come to “life.”
Interested in learning more? Join our mailing list!