

Deciphering AI



Written by: Aarno Lind - from team Crevio.

Essay type: Individual essay / 2 essay points.
The essay’s estimated reading time is 3 minutes.

Introduction

Artificial intelligence (AI) is a topic I am very fascinated by, but it is not an easy one to grasp. Nowadays the term AI is thrown around in every direction possible, and people can’t seem to get enough of it.

To start working on AI projects, one has to have a deep understanding of what AI means and how it is built, which is why I am writing this essay. The essay focuses on my learnings and what I have researched so far. For you, most of this might be common knowledge.

History of AI, oversimplified

1956

The term “artificial intelligence” was coined by John McCarthy.
The first AI conference, the Dartmouth workshop, was held.

1969

‘Shakey’ was built. Shakey was the first general-purpose mobile robot, which was able to perform simple tasks such as switching lights on and off.

1997

The supercomputer ‘Deep Blue’, developed by IBM, defeated the world chess champion Garry Kasparov in a six-game chess match.

2002

‘Roomba’, the first commercially viable robotic vacuum cleaner, was produced.

What is defined as AI

AI, or Artificial Intelligence, refers to computer systems or machines that simulate human intelligence processes. These processes include the ability to learn, reason, problem-solve and interact using natural language. It encompasses a range of technologies typically built for the purpose of performing tasks that in the past have required human intelligence.

There are multiple ways AI can be categorised, which is why it might seem so confusing to many. The most common ways to classify AI are by capability, functionality and technology.

I find the classification by functionality to be the best when trying to understand different AI types.

Types of AI based on functionality

Reactive

  • Usually specializes in one field of work only.
  • Has little or no prior data and gathers data from what is in front of it. Deep Blue was a reactive AI.

Limited memory

  • Stores previous data and uses it to make better decisions.

Theory of mind

  • Can understand human emotions and thoughts, and can interact socially.
  • Still mostly fiction; a fully-fledged model has not been created yet.

Self-aware

  • Often described as conscious or sentient. This type remains purely hypothetical.

AI can also be classified by technology, but these classifications keep changing as new advancements are made and technology improves. In many cases, multiple technologies are also implemented in the same AI.

The main technologies used today:

Machine learning

  • AI systems with the ability to learn from data and experience and improve over time.
  • (Deep learning and neural networks are subfields of this.)
  • Example: ChatGPT

Natural language processing

  • AI that can understand and respond to human language
  • Example: ChatGPT

Robotics

  • AI that is encased in a physical form, usually a robot.
  • Example: Spot, Boston Dynamics

Computer vision

  • AI with the ability to interpret and understand digital images or videos.
  • Example: DALL-E

How are AIs built

Many of the AI tools you see popping up today are built on pre-trained models. In most cases, this is a better approach since training and building an AI from the ground up is extremely expensive. 

In this essay I will not dive into building an AI from scratch; instead, I will focus on using a pre-trained model.

Steps of building an AI tool using a pre-trained model:

Define and plan

  • Plan out what your AI will do. This is crucial, since the pre-trained model you choose has to be suitable for your purpose.
  • Research pre-trained models and choose an appropriate one for your project. Models trained on large datasets are generally recommended (see the sketch below).
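
As a concrete illustration, here is a minimal sketch of browsing candidate models, assuming the Hugging Face Hub as the model source and its huggingface_hub Python package (both are my assumptions, not the only option):

    # List popular pre-trained models for a task on the Hugging Face Hub.
    # Assumes: pip install huggingface_hub
    from huggingface_hub import list_models

    # Compare the five most-downloaded text-classification models.
    for model in list_models(task="text-classification", sort="downloads", limit=5):
        print(model.id)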

Setup the environment

  • Install the necessary tools into your development environment.
  • Install the pre-trained model (a quick check is sketched below).
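
For example, assuming PyTorch and the transformers library as the stack (one common choice among several), verifying that the environment is ready could look like this:

    # Verify the development environment after installing the tools,
    # e.g. with: pip install torch transformers
    import torch
    import transformers

    print("torch", torch.__version__)
    print("transformers", transformers.__version__)
    print("GPU available:", torch.cuda.is_available())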

Prepare data

  • Collect data if you need to fine-tune the model.
  • Preprocess the data if necessary, formatting it the way the pre-trained model expects. This might involve normalisation, resizing images, or tokenizing text (see the sketch below).
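
Here is a minimal sketch of the text case, assuming a BERT-style pre-trained model: the tokenizer has to match the chosen checkpoint so the data ends up in exactly the format the model expects.

    # Tokenize raw text into the format the pre-trained model expects.
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

    texts = ["This product is great!", "Terrible experience, would not recommend."]
    # Pad to equal length, truncate long inputs, return PyTorch tensors.
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    print(batch["input_ids"].shape)  # (2, sequence_length)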

Integration

  • Use the framework’s API to load the pre-trained model into your application (see the sketch below).
  • Understand the model architecture. Familiarize yourself with it to know where and how to fine-tune, if necessary.
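
Continuing with the same assumed stack, loading the pre-trained model through the framework’s API is typically a single call, and printing the model is a quick way to inspect its architecture:

    # Load the pre-trained model with a fresh classification head on top;
    # the head is the part that fine-tuning will mainly train.
    from transformers import AutoModelForSequenceClassification

    model = AutoModelForSequenceClassification.from_pretrained(
        "bert-base-uncased",
        num_labels=2,  # e.g. positive / negative
    )
    print(model)  # inspect the architecture to decide what to freeze or tune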

Fine-tuning and training

  • Prepare for training. If your task requires it, you may need to retrain the model on your own data. Set aside a validation set to monitor performance.
  • Train the model. Use an optimizer and loss function suitable for your data and objective, and monitor the loss and accuracy metrics during training (see the sketch below).
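
A minimal PyTorch fine-tuning loop might look like the following (a sketch; train_loader is a hypothetical DataLoader that yields tokenized batches containing a "labels" key):

    import torch
    from torch.optim import AdamW

    optimizer = AdamW(model.parameters(), lr=2e-5)  # small learning rate for fine-tuning
    model.train()

    for epoch in range(3):
        for batch in train_loader:  # hypothetical DataLoader of tokenized batches
            optimizer.zero_grad()
            outputs = model(**batch)  # the model returns a loss when labels are given
            outputs.loss.backward()   # backpropagate
            optimizer.step()
        print(f"epoch {epoch}: last batch loss {outputs.loss.item():.4f}")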

Evaluate and validate

  • Use the validation set to evaluate the model’s performance, and adjust parameters or the training process based on this feedback (see the sketch below).
  • Once satisfied with the validation results, perform a final test on a separate test dataset to ensure the model generalizes.
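
Under the same assumptions, evaluating on the validation set can be sketched like this (val_loader is again a hypothetical DataLoader with the same batch format as in training):

    import torch

    model.eval()
    correct, total = 0, 0
    with torch.no_grad():  # no gradients needed during evaluation
        for batch in val_loader:
            labels = batch.pop("labels")
            preds = model(**batch).logits.argmax(dim=-1)
            correct += (preds == labels).sum().item()
            total += labels.size(0)
    print(f"validation accuracy: {correct / total:.2%}")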

Implementation

  • Embed the model into your application’s backend or frontend, depending on the use case.
  • If you’re developing a service, wrap your model in an API to handle requests and responses (see the sketch below).
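
As one possible way to do this, here is a sketch using FastAPI (my choice purely for illustration), reusing the tokenizer and model from the earlier steps:

    # Minimal prediction API; run with: uvicorn app:app (if saved as app.py)
    import torch
    from fastapi import FastAPI

    app = FastAPI()

    @app.post("/predict")
    def predict(text: str):
        batch = tokenizer([text], truncation=True, return_tensors="pt")
        with torch.no_grad():
            logits = model(**batch).logits
        return {"label": int(logits.argmax(dim=-1))}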

Deployment

  • Decide whether you’ll deploy on-premises, in the cloud, or use serverless options.
  • Ensure the model can handle the number of requests you expect. This might involve optimizing the model itself or the infrastructure it runs on (one example is sketched below).
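
For the “optimizing the model itself” part, one common technique is dynamic quantization, sketched here for PyTorch (support varies by model and hardware, so treat it as illustrative):

    import torch

    # Convert the linear layers to 8-bit integers: a smaller model and
    # faster CPU inference, usually at a small cost in accuracy.
    quantized_model = torch.quantization.quantize_dynamic(
        model, {torch.nn.Linear}, dtype=torch.qint8
    )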

Afterword

I tried to avoid using too many technical terms and to keep this essay as simple as possible. I will also update it in the future, when I am further along in the process.

Sources

https://www.analyticsvidhya.com/blog/2017/06/transfer-learning-the-art-of-fine-tuning-a-pre-trained-model/

https://www.businessinsider.com/garry-kasparov-talks-about-artificial-intelligence-2017-12?r=US&IR=T#the-territory-of-games-where-the-machine-prevails-because-humans-make-mistakes-1
