Session 2: The Basics of AI – The ORBIT (Object Recognition for Blind Image Training) Dataset

Session overview

In the second session, we focus on the basics of machine learning, data and algorithms. We give you information and examples that you can read or listen to. You can also complete quizzes and questions to think more deeply about what you have learned. Let’s get started!

Reminder: What is AI?

In the previous session we defined AI as a process that consists of three parts:

a dataset
an algorithm
a prediction

The data is used to learn a pattern that can predict something about the future. In this session, we are going to talk a little bit more about these three parts. In particular, we are going to talk about one common form of artificial intelligence called machine learning.

What is machine learning?

As it currently stands, the vast majority of the AI advancements and applications you hear about, including facial, voice and object recognition, rely on a category of algorithms known as machine learning. It is used in many of the services we use today: recommendation systems like those of Spotify or YouTube, social media like Facebook, and services that are important for blind and low vision people, such as voice assistants like Siri and Alexa or object recognition applications such as SeeingAI and TapTapSee.

Machine learning uses statistics to find patterns in massive amounts of data. The pattern it finds is expressed in an algorithm that can be applied to new data the system receives to produce a prediction. Machine learning is rarely 100% correct because its predictions are educated guesses.

What is the difference between AI and machine learning?

Big companies that want to announce their latest innovation very often use the two terms interchangeably. However, AI and machine learning are not the same thing. As we mentioned in the first session, the goal of AI is to create machines that perform some sort of behaviour that a human would do. You can have an AI that is based on rules, without any machine learning. For example, early chatbots answered questions using pattern-matching rules. Other systems contain more complicated reasoning, such as rules for matching symptoms to diseases that were developed by talking to lots of doctors.
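To make the idea of a rule-based AI more concrete, here is a minimal sketch of a pattern-matching chatbot written in Python. The rules, the replies and the function name reply are all made up for illustration; they do not come from any real chatbot. Notice that a person wrote every rule by hand and nothing is learned from data.

```python
import re

# Hypothetical hand-written rules: each pairs a pattern with a fixed reply.
RULES = [
    (re.compile(r"\b(hello|hi)\b", re.IGNORECASE), "Hello! How can I help you?"),
    (re.compile(r"\bopening hours\b", re.IGNORECASE), "We are open from 9am to 5pm."),
    (re.compile(r"\b(bye|goodbye)\b", re.IGNORECASE), "Goodbye!"),
]

FALLBACK = "Sorry, I do not understand."


def reply(message: str) -> str:
    """Return the reply of the first rule whose pattern matches the message."""
    for pattern, answer in RULES:
        if pattern.search(message):
            return answer
    # No rule matched, so the chatbot has nothing to say.
    return FALLBACK
```

For example, reply("Hi there") gives the greeting, while a message that matches none of the rules gets the fallback answer. The chatbot can only ever do what its rules say.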

Machine learning is a distinct subset of AI that learns how to make a prediction by itself. The important thing about machine learning is that the algorithm does not need to be explicitly programmed by people. You give the computer some data, and the computer learns the pattern and develops the algorithm so that it can figure out what to do when it sees a new example.

What is data?

Data is really important for machine learning. Data here can be anything that can be digitally stored and then fed into a machine learning algorithm. It can be numbers, words, images or even the clicks of your mouse on something.

A dataset is a collection of data. A dataset can be collected automatically: for example, the words that people type into a search box, along with the web pages that they visit, can become a new dataset. Datasets can also be collected by inviting people to contribute to them. For example, ImageNet is the database that most existing apps like TapTapSee or SeeingAI use. ImageNet is an open-source repository of images consisting of 1000 classes and over 1.5 million images taken by people. All images are hand-annotated by the project team to indicate what objects are pictured.

Different kinds of data in machine learning

Datasets that are used for machine learning are usually divided into two categories: the training data and the testing data. Let’s stick to the previous example of ImageNet and TapTapSee to explain what that means. TapTapSee should predict what object is in the image taken. This means we need to develop an algorithm with machine learning, and for that we use a training dataset to teach the algorithm how to recognise the things in the image. We then use another dataset of images, which we call a testing dataset, to test whether the algorithm can reliably predict what object is in the picture. This is important so that we can make sure the app works in real life and measure how well it works. It is also really important that the two datasets are different. Imagine that you finish some homework and, instead of having someone else mark it, you mark it yourself. This won’t be very reliable!
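The split into training and testing data can be sketched in a few lines of Python. This is only an illustration: the file names, the labels and the helper names split_dataset and accuracy are invented here, and the choice to keep 20% of the examples for testing is an assumption, not a rule.

```python
import random

# Hypothetical labelled examples: (image file name, object in the image).
examples = [(f"photo_{i}.jpg", "mug" if i % 2 == 0 else "glass") for i in range(10)]


def split_dataset(data, test_fraction=0.2, seed=0):
    """Shuffle a copy of the data and split it into training and testing parts."""
    shuffled = list(data)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    # The first n_test items are kept aside for testing; the rest are for training.
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(predict, test_data):
    """Fraction of test examples for which predict(item) equals the true label."""
    correct = sum(1 for item, label in test_data if predict(item) == label)
    return correct / len(test_data)


train_set, test_set = split_dataset(examples)
```

With ten examples, eight end up in train_set and two in test_set, and no example appears in both. The accuracy function is the “someone else marking the homework”: it only looks at examples that were never used for training.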

Example: Pattern Radio – Whale Songs

Let’s look at a different example! Researchers at Google, in collaboration with oceanographers at the National Oceanic and Atmospheric Administration (NOAA) Pacific Islands Fisheries Science Center (PIFSC), are working on a research project that helps to protect whales by identifying them from their voices.

Like us, whales sing but their songs can travel hundreds of miles underwater. Those songs potentially help them find a partner, communicate, and migrate around the world. Whales suffer from threats like entanglement in fishing gear and collisions with vessels, which are among the leading causes of non-natural deaths for whales.

The research team says that to be able to protect those animals, the first step is to know where they are and when, so that we can control the risks they face, whether that is putting the right marine protected areas in place or giving warnings to vessels. Since most whales spend very little time at the surface of the water, visually finding and counting them is very difficult. This is why they decided to monitor populations of whales in U.S. Pacific waters using underwater audio recordings. In other words, they collected a large dataset of recordings of whales’ voices so that they could feed it into a learning algorithm which, when it hears a new sound, can predict whether it belongs to a whale or not. They collected 170,000 hours of underwater audio recordings at 12 different sites in the Pacific Ocean.

Explore the whale songs

Let’s explore the whale songs together. The research team created an interactive website with all the audio recordings from the whales. The website lets anyone explore thousands of hours of whale songs.

Pattern Radio: https://patternradio.withgoogle.com/

If you click on the link, it takes you to the homepage, where you have to click on the ‘start exploring’ button to go to the main page with the whale songs. At the top centre of the screen, click on the drop-down menu that says ‘select tour’ and select the name Annie Lewandowski, a composer and whale song researcher. This opens a pop-up window with a short bio of Annie. Click the ‘begin tour’ button to go to Annie’s tour and then the play button to enjoy her recordings. Listen to the sounds for about 5 minutes.

Question: Briefly describe your experience

What did the sounds you heard sound like?
Could you tell when there was a whale and when there was not? How?
What other sounds did you hear? Did you recognise any of them?


What is an algorithm?

An algorithm is a process that takes some input data and follows specific steps to give us a desired output. Computers use algorithms, but so do we. For example, a recipe is a kind of algorithm: the input is the ingredients, the steps are the mixing and cooking of those ingredients, and the output is the cooked food. So algorithms are everywhere, but machine learning algorithms are a bit more complicated because they also involve a learning process.
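The recipe idea can be written down as a small Python function to show the three parts of an algorithm: input, steps and output. The function name cook_pancake, the ingredients and the steps are invented for this sketch.

```python
def cook_pancake(ingredients):
    """Input: a list of ingredient names. Output: the name of the cooked food."""
    required = {"flour", "egg", "milk"}
    missing = required - set(ingredients)
    if missing:
        # Without the right input, the steps cannot be followed.
        return "cannot cook: missing " + ", ".join(sorted(missing))

    # Steps: the fixed sequence that the recipe follows every time.
    steps = ["mix flour, egg and milk", "pour into a hot pan", "flip after one minute"]
    for number, step in enumerate(steps, start=1):
        print(f"Step {number}: {step}")

    # Output: the desired result.
    return "pancake"
```

Calling cook_pancake(["flour", "egg", "milk"]) prints the three steps and returns "pancake". Every step here was written by a person. In machine learning, the steps are worked out from data instead.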

How does machine learning develop algorithms from data?

Now that we know what an algorithm is, let’s explore together how machine learning learns from a set of data. We already know it tries to learn a pattern. Let’s take a simple example. Collect together three different mugs and three different glasses. By touching these items, how are mugs different from glasses, and how are they similar?
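Here is one simple way a computer could learn such a pattern, sketched in Python. The measurements are made up, and the two features (whether the object has a handle, and how thick its wall is) are assumptions chosen for illustration. The method shown, averaging the examples of each kind and picking the closest average, is just one of many possible learning methods.

```python
# Hypothetical measurements for three mugs and three glasses:
# (has_handle: 1 or 0, wall thickness in millimetres).
training_data = [
    ((1, 5.0), "mug"),
    ((1, 6.0), "mug"),
    ((1, 4.5), "mug"),
    ((0, 2.0), "glass"),
    ((0, 1.5), "glass"),
    ((0, 2.5), "glass"),
]


def learn_averages(data):
    """Learn one 'typical' example per label by averaging its measurements."""
    sums, counts = {}, {}
    for features, label in data:
        totals = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            totals[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [t / counts[label] for t in totals] for label, totals in sums.items()}


def predict(averages, features):
    """Return the label whose typical example is closest to the new object."""
    def distance(label):
        return sum((a - f) ** 2 for a, f in zip(averages[label], features))
    return min(averages, key=distance)


# The learning step: nobody wrote a rule about handles, it comes from the data.
averages = learn_averages(training_data)
```

After learning, predict(averages, (1, 5.5)) returns "mug" and predict(averages, (0, 1.8)) returns "glass", because each new object is closest to the typical example of that kind.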

Quiz: Is it a mug or a glass?

Take our quiz to find out what the algorithm needs to learn to be able to tell whether an object is a mug and not a glass.

How does prediction work?

The last part of our definition is prediction. I think you all know by now what a prediction is: it is the guess the algorithm makes based on what it has learned from the training dataset. A prediction can be anything, such as which YouTube video you might like or what object is in an image that you have taken with your phone. To make a prediction, the system takes some input, i.e. new data, runs it through the algorithm, and then produces an output, the prediction.

A prediction might not always be accurate. One reason is that the algorithm may be given an input that it has not been trained on. For example, think about an algorithm that is very good at telling mugs apart from glasses. Now imagine giving it a vase, or a handleless mug. In both cases it might predict the wrong thing.
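The short Python sketch below shows why this happens. The predict function is a stand-in for a pattern that an algorithm might have learned (“objects with a handle are mugs”); it is an assumption for illustration, not how any real app works. Because the algorithm only knows two answers, it has to pick one of them even for an object it has never seen.

```python
KNOWN_LABELS = ("mug", "glass")


def predict(has_handle: bool) -> str:
    """Stand-in for a learned pattern: 'objects with a handle are mugs'."""
    return "mug" if has_handle else "glass"


# New inputs: (description, has_handle, what the object really is).
new_objects = [
    ("ordinary mug", True, "mug"),
    ("ordinary glass", False, "glass"),
    ("handleless mug", False, "mug"),
    ("vase", False, "vase"),
]

# For each object, store the algorithm's guess next to the true answer.
results = {name: (predict(has_handle), truth) for name, has_handle, truth in new_objects}

# Collect the objects where the guess and the truth disagree.
wrong = [name for name, (guess, truth) in results.items() if guess != truth]
```

Here wrong contains the handleless mug and the vase. Both are predicted to be a glass, because nothing in the learned pattern covers them.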

What’s in the next session?

In the next session we will discuss ethics in AI. This will include thinking about how AI can cause problems for people or society and how to address that.

Additional resources

https://www.guru99.com/artificial-intelligence-tutorial.html

https://www.teradata.com/Blogs/AI-without-machine-learning

https://becominghuman.ai/machine-learning-for-dummies-explained-in-2-mins-e83fbc55ac6d

Whale songs project: ‘How does it work’ video