Question

In this project, you are given a dataset of American Sign Language sign images representing the 26 letters of the alphabet and 3 other popular signs. Your job is to create an AI that can take an image as an input and predict what letter is represented in that image.

Setup
The dataset contains 29 classes: the letters A-Z, plus delete, nothing, and space. The setup code imports the images into a training directory (train_dir) and a validation directory (validation_dir) for you to use. The individual images are in sub-directories of those two directories.

A list called class_names has been created and points to the names of the different categories. The index value for this list will be the same as the output from the model prediction. For example, if the model predicts the image to be 2, that would correspond to the letter "C", which is at index 2 in the list.
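As a minimal sketch of the index-to-label lookup described above: the list below is a stand-in built for illustration and is assumed to hold the same 29 entries, in the same order, as the class_names list from the setup code. The helper name `label_for` is hypothetical.

```python
import string

# Stand-in for the setup code's class_names (assumed: 26 letters + 3 extra signs)
class_names = list(string.ascii_uppercase) + ["del", "nothing", "space"]

def label_for(index):
    """Translate a predicted class index into its sign name."""
    return class_names[index]

print(label_for(2))   # C
print(label_for(28))  # space
```

Checking that the list length equals the number of model outputs is a cheap way to catch a missing entry, which would shift every later label by one.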

import os
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.utils import image_dataset_from_directory

# Note: This data is already split into a train and validation set, so you do
# not need to split it.
_URL =
zip_dir = tf.keras.utils.get_file('asl.zip', origin=_URL, extract=True)

base_dir = os.path.join(os.path.dirname(zip_dir), 'asl')
train_dir = os.path.join(base_dir, 'asl_alphabet_train')
validation_dir = os.path.join(base_dir, 'asl_alphabet_validate')

class_names = ["A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O",
               "P", "Q", "R", "S", "T", "U", "V", "W", "X", "Y", "Z", "del", "nothing", "space"]

Answers

GPT-4o mini
To create an AI model that can recognize the American Sign Language (ASL) signs from the provided dataset, you can follow these steps:

1. **Data Loading and Preprocessing**: Load the dataset and prepare it for training the model.
2. **Model Creation**: Design a convolutional neural network (CNN) that can learn from the images.
3. **Training the Model**: Train the model using the training dataset.
4. **Evaluation**: Evaluate the model's performance on the validation dataset.
5. **Prediction**: Create a function to make predictions on new images.

Here’s how you can implement each of these steps in Python using TensorFlow and Keras:

### 1. Data Loading and Preprocessing

First, you need to load the images and preprocess them for the model. This includes normalizing the image data and augmenting the training dataset to improve the model's generalization capabilities.

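```python
# Set image parameters
img_width, img_height = 150, 150  # You may choose a size suitable for your dataset
batch_size = 32

# Load training data; labels are integer indices assigned in
# alphanumeric order of the sub-directory names
train_dataset = image_dataset_from_directory(
    train_dir,
    image_size=(img_height, img_width),
    batch_size=batch_size,
    seed=42
)

# Load validation data
validation_dataset = image_dataset_from_directory(
    validation_dir,
    image_size=(img_height, img_width),
    batch_size=batch_size,
    seed=42
)

# Scale pixel values from [0, 255] to [0, 1]
rescale = tf.keras.layers.Rescaling(1./255)

# Random augmentation, applied to training batches only
augment = tf.keras.Sequential([
    tf.keras.layers.RandomRotation(20 / 360),  # up to about 20 degrees
    tf.keras.layers.RandomTranslation(0.2, 0.2),
    tf.keras.layers.RandomZoom(0.2),
    tf.keras.layers.RandomFlip("horizontal"),
])

train_dataset = train_dataset.map(
    lambda x, y: (augment(rescale(x), training=True), y),
    num_parallel_calls=tf.data.AUTOTUNE
).prefetch(tf.data.AUTOTUNE)

validation_dataset = validation_dataset.map(
    lambda x, y: (rescale(x), y)
).prefetch(tf.data.AUTOTUNE)
```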
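Before training, it can help to confirm the directory layout that the loader relies on. The following is a minimal sketch, not part of the original setup: the helper name `count_images_per_class` and the set of accepted file extensions are assumptions made for illustration.

```python
from pathlib import Path

# Assumed image file types; adjust to match the actual dataset
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}

def count_images_per_class(root):
    """Map each class sub-directory name to its number of image files."""
    counts = {}
    # Sort by name so the order matches plain string sorting
    for class_dir in sorted(Path(root).iterdir(), key=lambda p: p.name):
        if class_dir.is_dir():
            counts[class_dir.name] = sum(
                1 for f in class_dir.iterdir()
                if f.suffix.lower() in IMAGE_SUFFIXES
            )
    return counts
```

Calling `count_images_per_class(train_dir)` should return one key per class. If the keys do not line up with class_names, or one class has far fewer images than the others, that is worth resolving before training.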

### 2. Model Creation

Next, you can create a CNN model using Keras.

```python
from tensorflow.keras import layers, models

# Define the model
model = models.Sequential([
    # Three convolution + pooling stages with a growing number of filters
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(img_height, img_width, 3)),
    layers.MaxPooling2D(pool_size=(2, 2)),

    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D(pool_size=(2, 2)),

    layers.Conv2D(128, (3, 3), activation='relu'),
    layers.MaxPooling2D(pool_size=(2, 2)),

    # Classifier head
    layers.Flatten(),
    layers.Dense(128, activation='relu'),
    layers.Dense(len(class_names), activation='softmax')  # One output per class
])

# Compile the model; the sparse loss expects integer class labels
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
```
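To see why the first dense layer dominates the model size, the spatial dimensions can be traced through the network by hand. This is a sketch that assumes the Keras defaults used above ('valid' padding and stride 1 for Conv2D, stride equal to the pool size for MaxPooling2D) and the 150 x 150 input; the helper names are hypothetical.

```python
def conv_output(size, kernel=3):
    """Spatial size after a 'valid' convolution with stride 1."""
    return size - kernel + 1

def pool_output(size, pool=2):
    """Spatial size after max pooling with stride equal to the pool size."""
    return size // pool

size = 150   # input height and width
sizes = []   # spatial size after each conv + pool stage
for filters in (32, 64, 128):
    size = pool_output(conv_output(size))
    sizes.append(size)

flattened = size * size * 128              # features entering Dense(128)
dense_params = flattened * 128 + 128       # weights plus biases

print(sizes)         # [74, 36, 17]
print(flattened)     # 36992
print(dense_params)  # 4735104
```

Under these assumptions, roughly 4.7 million parameters sit in the first dense layer, so a smaller input size or an extra pooling stage would shrink the model considerably.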

### 3. Training the Model

Now you can train the model using the fit method.

```python
# Train the model
epochs = 10 # You may increase this number for better performance

history = model.fit(train_dataset,
                    validation_data=validation_dataset,
                    epochs=epochs)
```
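Once training finishes, the per-epoch metrics can be inspected to decide whether more or fewer epochs are needed. The sketch below assumes a dictionary shaped like `history.history` as returned by `model.fit`; the helper name `best_epoch` and the numbers in `example` are made up for illustration.

```python
def best_epoch(history_dict, metric="val_loss"):
    """Return (1-based epoch, value) at which the metric is lowest."""
    values = history_dict[metric]
    idx = min(range(len(values)), key=values.__getitem__)
    return idx + 1, values[idx]

# Hypothetical numbers, shaped like history.history
example = {
    "loss": [1.9, 1.1, 0.7, 0.5],
    "val_loss": [1.6, 1.0, 0.9, 1.2],
}

epoch, value = best_epoch(example)
print(f"Lowest val_loss {value} at epoch {epoch}")  # Lowest val_loss 0.9 at epoch 3
```

In this made-up run the validation loss rises after epoch 3 while the training loss keeps falling, which is the usual sign that further epochs are overfitting.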

### 4. Evaluation

After training the model, it is good practice to evaluate its performance on the validation dataset.

```python
# Evaluate the model
val_loss, val_accuracy = model.evaluate(validation_dataset)
print(f"Validation accuracy: {val_accuracy * 100:.2f}%")
```
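A single accuracy number can hide classes that the model confuses, so a per-class breakdown is often useful. This is a minimal sketch that assumes the true and predicted class indices have already been collected into two lists; the helper name `per_class_accuracy` and the sample data are hypothetical.

```python
from collections import Counter

def per_class_accuracy(true_indices, predicted_indices, names):
    """Return {class name: fraction of its samples predicted correctly}."""
    total = Counter(true_indices)
    correct = Counter(
        t for t, p in zip(true_indices, predicted_indices) if t == p
    )
    return {names[i]: correct[i] / total[i] for i in sorted(total)}

# Hypothetical labels for a three-class toy problem
names = ["A", "B", "C"]
true_labels = [0, 0, 1, 1, 2]
predicted = [0, 1, 1, 1, 0]

print(per_class_accuracy(true_labels, predicted, names))
# {'A': 0.5, 'B': 1.0, 'C': 0.0}
```

With the real model, the predicted indices would come from taking the argmax of each row of `model.predict` output, and `names` would be class_names.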

### 5. Prediction

Finally, to make predictions on new ASL sign images, you can use the following function:

```python
from tensorflow.keras.preprocessing import image

def predict_image(img_path):
    """Return the class name predicted for the image at img_path."""
    # Resize to the size the model was trained on
    img = image.load_img(img_path, target_size=(img_height, img_width))
    img_array = image.img_to_array(img) / 255.0  # Normalize
    img_array = np.expand_dims(img_array, axis=0)  # Add batch dimension

    predictions = model.predict(img_array)
    predicted_index = np.argmax(predictions[0])

    return class_names[predicted_index]

# Example of using the predict function
predicted_sign = predict_image('path_to_your_image.jpg')
print(f'The predicted sign is: {predicted_sign}')
```
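Since the softmax output is a probability for every class, it can be informative to look at the few most likely signs rather than only the top one. The sketch below assumes a probability vector with one entry per class name; the helper name `top_k` and the sample values are hypothetical.

```python
def top_k(probabilities, names, k=3):
    """Return the k (name, probability) pairs with the highest probability."""
    ranked = sorted(
        zip(names, probabilities), key=lambda pair: pair[1], reverse=True
    )
    return ranked[:k]

# Hypothetical four-class output
sample_names = ["A", "B", "C", "D"]
sample_probs = [0.1, 0.6, 0.05, 0.25]

print(top_k(sample_probs, sample_names))
# [('B', 0.6), ('D', 0.25), ('A', 0.1)]
```

Inside `predict_image`, the same helper could be called as `top_k(predictions[0], class_names)`. A small gap between the first and second entries suggests the model is unsure about that image.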

### Summary

You now have a complete pipeline for training a model to recognize ASL signs. Its performance can be improved further by tuning hyperparameters (for example, the learning rate, batch size, and number of epochs), experimenting with different architectures, and adjusting the data augmentation. Compare the training and validation loss and accuracy after each run: a validation loss that rises while the training loss keeps falling indicates overfitting.
