How to Use Ruby Functions for Image Recognition and Classification

As the volume of visual data grows, image recognition and classification have become vital tools in domains such as healthcare, e-commerce, and autonomous vehicles. Ruby, a versatile and elegant programming language, might not be the first choice for image processing tasks, but it can be a surprisingly effective one.

In this comprehensive guide, we will explore how to use Ruby functions for image recognition and classification. We’ll cover everything from setting up your development environment to building a basic image classifier. So, let’s embark on this journey to leverage Ruby’s potential for image analysis.

1. Getting Started with Ruby

Before we dive into image recognition, it’s essential to have a basic understanding of Ruby. If you’re new to Ruby, you can easily install it on your system by following the official installation guide (https://www.ruby-lang.org/en/documentation/installation/).

Once Ruby is installed, open your terminal and type irb to launch the interactive Ruby environment. This is an excellent way to experiment with Ruby code snippets and get comfortable with the language.

2. Setting Up Your Environment

To work with images in Ruby, you’ll need a few additional libraries. The most essential one is MiniMagick, a Ruby wrapper for the ImageMagick command-line tools, so ImageMagick itself must also be installed on your system. MiniMagick allows you to manipulate and process images with ease.

Install MiniMagick by running the following command in your terminal:

bash
gem install mini_magick

With MiniMagick in place, you’re ready to start working with images. Let’s begin by loading an image and performing some basic operations.

3. Loading and Displaying an Image

To load and display an image, you can use MiniMagick. First, make sure you have an image file in your working directory. Let’s assume you have an image named example.jpg.

ruby
require 'mini_magick'

# Open an image
image = MiniMagick::Image.open('example.jpg')

# Display image information
puts "Image width: #{image.width}"
puts "Image height: #{image.height}"
puts "Image format: #{image.type}"

In this code snippet, we first require the mini_magick gem. Then, we open an image named example.jpg using MiniMagick::Image.open. Afterward, we print some basic information about the image, such as its width, height, and format. The format is read from image.type, because image.format is MiniMagick’s method for converting an image to another format and expects the target format as an argument.

4. Image Preprocessing

Image preprocessing is often a crucial step in image recognition and classification. It involves operations like resizing, cropping, and converting to grayscale. Let’s explore some of these preprocessing techniques using MiniMagick.

4.1. Resizing an Image

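Resizing an image is a common preprocessing step to ensure all input images have the same dimensions. To resize an image using MiniMagick, you can use the resize method. By default, a geometry such as '300x300' keeps the aspect ratio and fits the image inside a 300×300 box, so a non-square image does not come out square. Appending ! forces the exact size:

ruby
# Resize the image to exactly 300x300 pixels ("!" ignores the aspect ratio)
image.resize '300x300!'
image.write 'resized_example.jpg'

In this code, we resize the image to a width and height of exactly 300 pixels and save it as resized_example.jpg. Drop the ! if you would rather keep the proportions and accept differing dimensions.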
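
To see what a plain '300x300' geometry (without a trailing !) does to a non-square image, the fit-inside-the-box calculation can be sketched in pure Ruby. The method name fitted_size is hypothetical, and the rounding only approximates ImageMagick's behavior, so treat the result as an estimate.

```ruby
# Hypothetical helper: estimate the size an image gets when it is resized
# to fit inside a bounding box while keeping its aspect ratio.
def fitted_size(width, height, box_width, box_height)
  # The smaller of the two scale factors is the one that still fits the box
  scale = [box_width.to_f / width, box_height.to_f / height].min
  [(width * scale).round, (height * scale).round]
end

p fitted_size(600, 400, 300, 300)   # => [300, 200]
p fitted_size(300, 300, 300, 300)   # => [300, 300]
```

A 600×400 photo therefore ends up 300×200, which matters when a later step expects every image to have identical dimensions.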

4.2. Cropping an Image

Cropping allows you to focus on a specific region of interest within an image. You can achieve this with the crop method:

ruby
# Crop a 100x100-pixel region starting at (50, 50)
image.crop '100x100+50+50'
image.write 'cropped_example.jpg'

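Here, we crop a 100×100-pixel region whose top-left corner is at coordinates (50, 50) and save it as cropped_example.jpg. MiniMagick modifies the image object in place, so if you run these snippets in sequence, the crop is applied to the already resized image. Open example.jpg again to start from the original.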
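
The geometry string follows the pattern WIDTHxHEIGHT+X+Y. As a small sketch, the hypothetical helper below builds such a string and shrinks the region when it would extend past the image edge. The helper name and its clamping rule are assumptions for illustration, not part of MiniMagick.

```ruby
# Hypothetical helper: build a crop geometry string, shrinking the region
# so that it stays inside the image bounds.
def crop_geometry(image_width, image_height, crop_width, crop_height, x, y)
  if x.negative? || y.negative? || x >= image_width || y >= image_height
    raise ArgumentError, 'offset lies outside the image'
  end

  width  = [crop_width,  image_width - x].min
  height = [crop_height, image_height - y].min
  "#{width}x#{height}+#{x}+#{y}"
end

puts crop_geometry(300, 300, 100, 100, 50, 50)   # => 100x100+50+50
puts crop_geometry(120, 300, 100, 100, 50, 50)   # => 70x100+50+50
```

The resulting string could then be passed to image.crop in place of a hard-coded value.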

4.3. Converting to Grayscale

Converting an image to grayscale simplifies the image by removing color information, making it easier to work with for certain tasks. To convert an image to grayscale, you can use the colorspace method:

ruby
# Convert the image to grayscale
image.colorspace 'Gray'
image.write 'grayscale_example.jpg'

This code converts the image to grayscale and saves it as grayscale_example.jpg.
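
To make "removing color information" concrete, here is a minimal pure-Ruby sketch that collapses RGB triples into single gray values. It assumes the Rec. 601 luma weights (0.299, 0.587, 0.114), which is one common choice. ImageMagick may apply different coefficients, so the numbers will not necessarily match its output exactly.

```ruby
# Weighted sum of the three channels, using Rec. 601 luma weights
# (an assumption for illustration; ImageMagick may use other coefficients).
def to_gray(red, green, blue)
  (0.299 * red + 0.587 * green + 0.114 * blue).round
end

# Convert rows of [r, g, b] triples into rows of single gray values
def grayscale(pixels)
  pixels.map { |row| row.map { |red, green, blue| to_gray(red, green, blue) } }
end

tiny_image = [
  [[255, 255, 255], [0, 0, 0]],
  [[255, 0, 0],     [0, 0, 255]]
]

p grayscale(tiny_image)   # => [[255, 0], [76, 29]]
```

Each pixel shrinks from three values to one, which is why grayscale input reduces the amount of data a model has to process.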

5. Building an Image Classifier

Now that we’ve covered the basics of image manipulation in Ruby, let’s take it a step further and build a basic image classifier. We’ll use a popular machine learning library called TensorFlow, through the Ruby bindings developed in the tensorflow-ruby project, for this task.

5.1. Installing TensorFlow Ruby

Before you can use TensorFlow in Ruby, you need to install the gem. The tensorflow-ruby project publishes it under the name tensorflow, which matches the require 'tensorflow' line used in the code below. Check the project’s README if the installation fails. Note that the TensorFlow shared library itself should also be installed on your system.

bash
gem install tensorflow

5.2. Loading and Preprocessing Images

To build an image classifier, you’ll need a dataset of labeled images. For this example, let’s assume you have a folder structure where each subfolder represents a class, and the images within those folders are labeled accordingly.

Here’s a simplified directory structure:

bash
dataset/
  ├── cat/
  │   ├── cat1.jpg
  │   ├── cat2.jpg
  │   └── ...
  ├── dog/
  │   ├── dog1.jpg
  │   ├── dog2.jpg
  │   └── ...

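You can load and preprocess these images using MiniMagick, just like we did earlier. Then, convert them into a format suitable for TensorFlow.

ruby
require 'mini_magick'
require 'tensorflow'

# Collect the image paths; sorting keeps the order stable between runs
image_files = Dir.glob('dataset/**/*.jpg').sort

# Convert images to TensorFlow tensors
images = image_files.map do |file|
  image = MiniMagick::Image.open(file)
  image.resize '100x100!'  # "!" forces exactly 100x100 pixels
  image.write 'temp.jpg'
  TensorFlow::Tensor.from_file('temp.jpg')
end

In this code, we use Dir.glob to obtain a list of image files within the dataset directory. We then load each image, resize it to a uniform size (here exactly 100×100 pixels, so that it matches the model’s input shape), and convert it into a TensorFlow tensor. If your installed gem version does not provide TensorFlow::Tensor.from_file, MiniMagick’s get_pixels method is an alternative: it returns the image as nested arrays of RGB values, which can be passed to the tensor constructor your binding offers.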
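
Because each subfolder represents a class, the labels can be read from the file paths rather than typed by hand. The sketch below uses only Ruby's File methods. The helper name label_for and the sample paths are hypothetical and only mirror the directory layout shown earlier.

```ruby
# Hypothetical helper: use the name of the parent folder as the class label
def label_for(path)
  File.basename(File.dirname(path))
end

paths = ['dataset/cat/cat1.jpg', 'dataset/dog/dog1.jpg', 'dataset/cat/cat2.jpg']

# Sorted, de-duplicated folder names define the class order
class_names = paths.map { |path| label_for(path) }.uniq.sort

# Each image gets the index of its class as a numeric label
labels = paths.map { |path| class_names.index(label_for(path)) }

p class_names   # => ["cat", "dog"]
p labels        # => [0, 1, 0]
```

Deriving labels this way keeps them aligned with the file list even when images are added or removed.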

5.3. Creating a TensorFlow Model

Next, let’s create a simple convolutional neural network (CNN) model using TensorFlow. This model will serve as our image classifier.

ruby
model = TensorFlow::Keras::Sequential.new

# Add a convolutional layer
model.add(
  TensorFlow::Keras::Layers::Conv2D.new(
    filters: 32,
    kernel_size: [3, 3],
    activation: 'relu',
    input_shape: [100, 100, 3]
  )
)

# Add a max-pooling layer
model.add(
  TensorFlow::Keras::Layers::MaxPooling2D.new(
    pool_size: [2, 2]
  )
)

# Add a flattening layer
model.add(TensorFlow::Keras::Layers::Flatten.new)

# Add a dense layer
model.add(
  TensorFlow::Keras::Layers::Dense.new(
    units: 2,
    activation: 'softmax'
  )
)

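In this code, we create a sequential model and add layers to it. The architecture includes a convolutional layer with 32 filters of size 3×3, a 2×2 max-pooling layer, a flattening layer, and a dense layer. The dense layer has two units, one per class (cat and dog), and its softmax activation turns the outputs into class probabilities. This is a simple CNN architecture, and you can adjust it based on your specific image classification task. Ruby bindings for TensorFlow cover only part of the Keras API available in Python, so check that these layer classes exist in the gem version you installed.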
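
It can help to trace how the data shape changes from layer to layer. The following sketch does the arithmetic in plain Ruby, assuming no padding, a convolution stride of 1, and a pooling stride equal to the pool size (the usual Keras defaults). The helper names are hypothetical.

```ruby
# Output size of a convolution or pooling window along one dimension,
# assuming no padding
def window_output(size, window, stride)
  (size - window) / stride + 1
end

# Shapes and parameter counts for: Conv2D(3x3) -> MaxPooling2D(2x2) -> Flatten -> Dense
def cnn_summary(side: 100, channels: 3, filters: 32, classes: 2)
  conv_side = window_output(side, 3, 1)        # 3x3 kernel, stride 1
  pool_side = window_output(conv_side, 2, 2)   # 2x2 pool, stride 2
  flat_size = pool_side * pool_side * filters

  {
    conv_output: [conv_side, conv_side, filters],
    pool_output: [pool_side, pool_side, filters],
    flat_size: flat_size,
    conv_params: (3 * 3 * channels + 1) * filters,   # weights plus one bias per filter
    dense_params: (flat_size + 1) * classes          # weights plus one bias per unit
  }
end

cnn_summary.each { |name, value| puts "#{name}: #{value.inspect}" }
# conv_output: [98, 98, 32]
# pool_output: [49, 49, 32]
# flat_size: 76832
# conv_params: 896
# dense_params: 153666
```

Almost all trainable parameters sit in the dense layer, which is one reason deeper networks add more convolution and pooling stages before flattening.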

5.4. Compiling and Training the Model

Before we can use the model, we need to compile it with a loss function and an optimizer.

ruby
model.compile(
  loss: 'categorical_crossentropy',
  optimizer: 'adam',
  metrics: ['accuracy']
)
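
For intuition about what the categorical cross-entropy loss measures, here is a minimal pure-Ruby sketch for a single sample. It is an illustration of the formula only, not how TensorFlow computes it internally. The epsilon clamp is an assumption added to avoid taking the logarithm of zero.

```ruby
# Categorical cross-entropy for one sample:
# minus the sum of target * log(predicted probability)
def categorical_crossentropy(target, predicted, epsilon: 1e-7)
  target.zip(predicted).sum do |expected, probability|
    -expected * Math.log([probability, epsilon].max)   # clamp to avoid log(0)
  end
end

# One-hot target [1, 0] means "cat"
puts categorical_crossentropy([1, 0], [0.9, 0.1])   # confident and right: about 0.105
puts categorical_crossentropy([1, 0], [0.1, 0.9])   # confident and wrong: about 2.303
```

The loss grows sharply when the model assigns a low probability to the correct class, and the optimizer adjusts the weights to reduce it.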

Now, it’s time to train the model using our preprocessed image data and labels.

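ruby
# Derive a label from each file's parent folder (cat = 0, dog = 1)
class_names = %w[cat dog]
labels = image_files.map { |file| class_names.index(File.basename(File.dirname(file))) }

# Convert labels to one-hot encoding, e.g. 1 becomes [0, 1]
one_hot_labels = labels.map do |label|
  Array.new(class_names.size) { |index| index == label ? 1 : 0 }
end

# Shuffle images and labels together so both classes appear in each split
pairs = images.zip(one_hot_labels).shuffle(random: Random.new(42))

# Split the dataset into training (80%) and validation (20%) sets
split_index = (pairs.length * 0.8).to_i
train_images, train_labels = pairs[0...split_index].transpose
val_images, val_labels = pairs[split_index..-1].transpose

# Train the model
model.fit(train_images, train_labels, epochs: 10, validation_data: [val_images, val_labels])

In this code, we first derive a numeric label for each image from the name of its parent folder and convert the labels into one-hot encoding. Then, we shuffle the images together with their labels, because the file list is grouped by class and an unshuffled split would leave only one class in the validation set, and split the dataset into training and validation sets. Finally, we use the fit method to train the model for a specified number of epochs.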
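
On a small dataset, a split by position or by a single shuffle can still leave the validation set with very few samples of one class. One option is a stratified split, which divides each class separately. The sketch below is pure Ruby. The method name stratified_split and the sample data are hypothetical.

```ruby
# Hypothetical helper: split samples into training and validation sets
# while keeping each class's share roughly the same in both.
def stratified_split(samples, labels, ratio: 0.8, seed: 42)
  rng = Random.new(seed)
  train = []
  validation = []

  # Group [sample, label] pairs by label and split every group on its own
  samples.zip(labels).group_by { |_sample, label| label }.each_value do |pairs|
    shuffled = pairs.shuffle(random: rng)
    cut = (shuffled.length * ratio).round
    train.concat(shuffled[0...cut])
    validation.concat(shuffled[cut..-1])
  end

  # Shuffle again so the classes are interleaved within each set
  [train.shuffle(random: rng), validation.shuffle(random: rng)]
end

samples = (1..10).map { |i| "img#{i}" }
labels  = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

train, validation = stratified_split(samples, labels)
puts "train: #{train.length}, validation: #{validation.length}"   # => train: 8, validation: 2
```

With five samples per class and a ratio of 0.8, each class contributes four training samples and one validation sample.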

5.5. Making Predictions

Once the model is trained, you can use it to make predictions on new images.

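ruby
# Load and preprocess a new image the same way as the training images
new_image = MiniMagick::Image.open('new_image.jpg')
new_image.resize '100x100!'  # "!" forces exactly 100x100 pixels
new_image.write 'temp.jpg'

# Convert the new image to a TensorFlow tensor
input_tensor = TensorFlow::Tensor.from_file('temp.jpg')

# Make predictions
predictions = model.predict(input_tensor)

In this code, we load a new image, preprocess it the same way as the training images, and convert it into a TensorFlow tensor. Then, we use the model’s predict method to obtain predictions. Because the last layer uses softmax, the result contains one probability per class, and the class with the highest probability is the model’s prediction.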
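
Turning the raw output into a readable answer usually means picking the class with the highest probability. The sketch below assumes the prediction for one image has already been converted to a plain Ruby array of probabilities, ordered like the class names. The helper name predicted_class and the sample values are hypothetical.

```ruby
# Hypothetical helper: pick the class with the highest predicted probability
def predicted_class(probabilities, class_names)
  best_index = probabilities.each_with_index.max_by { |probability, _index| probability }.last
  { label: class_names[best_index], confidence: probabilities[best_index] }
end

# Example values standing in for the model output for a single image
result = predicted_class([0.12, 0.88], %w[cat dog])
puts "#{result[:label]} (#{(result[:confidence] * 100).round(1)}%)"
```

The order of class_names has to match the label indices used during training (cat = 0, dog = 1 in this guide).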

Conclusion

In this guide, we’ve explored how to use Ruby functions for image recognition and classification. We started by setting up the development environment, loading and preprocessing images using MiniMagick, and then built a basic image classifier using TensorFlow in Ruby.

Keep in mind that this is just the tip of the iceberg. Image recognition and classification are complex tasks, and you can explore more advanced techniques, such as transfer learning and fine-tuning pre-trained models, to improve the accuracy of your classifiers.

By combining the power of Ruby with the capabilities of machine learning libraries like TensorFlow, you can create robust image recognition solutions for a wide range of applications. So, go ahead and start building your own image classifiers in Ruby, and unlock the potential of visual data analysis. Happy coding!

About

Caio

Senior Ruby on Rails Developer Ex-Reply

Brazil

GMT-3

Senior Software Engineer with a focus on remote work. Proficient in Ruby on Rails. Expertise spans years of Ruby on Rails development, contributing to B2C financial solutions and data engineering.
