
Image Caption Generator Using Deep Learning: Guided by Dr. Ch. Bindu Madhuri, M Tech, PH.D

This document describes an image caption generator project that uses deep learning techniques of convolutional neural networks (CNN) and recurrent neural networks (LSTM). CNN is used for feature extraction from images, while LSTM is used for sentence generation. The goal of the project is to generate captions for given images by recognizing context using computer vision and describing it with natural language. It involves both computer vision to understand image content and natural language processing to describe the image. The document discusses CNNs, LSTMs and how they are combined in a CNN-RNN model for the image caption generator, with CNN extracting image features and LSTM using that information to generate captions.

IMAGE CAPTION GENERATOR USING DEEP LEARNING
Guided By:- Dr. Ch. BINDU MADHURI, M Tech, Ph.D
Assistant Professor
Dept. of Information Technology
JNTU K - UCEV

By
P.RAMESH
18VV1F0008
Image Caption Generator
Abstract
• An image caption generator produces a caption for a given image.
• When you see an image, your brain can easily tell what it is about, but can a computer tell what the image represents?
• Computer vision researchers worked on this problem for a long time, and it was long considered impossible. With advances in deep learning techniques, the availability of huge datasets, and greater computing power, we can now build models that generate captions and descriptions for an image.
• This is what we implement in this Python-based project, using a Convolutional Neural Network (CNN) together with a type of Recurrent Neural Network (LSTM).
• In this project, the CNN is used for feature extraction from the image and the RNN (LSTM) is used for sentence generation.
• The project aims to generate captions for a given image using convolutional and recurrent neural networks.
• It is a task that combines computer vision and natural language processing to recognize the context of an image and describe it in natural language.

Introduction
• Image caption generation is an interesting artificial intelligence problem in which a descriptive sentence is generated for a given image.
• It combines techniques from computer vision, to understand the content of the image, with a language model from natural language processing, to turn that understanding into words in the right order.
• Image captioning has various applications, such as recommendations in editing applications, virtual assistants, image indexing, assistance for visually impaired people, social media, and several other natural language processing applications.
• Recently, deep learning methods have made strong progress on this problem. Deep learning models have been shown to achieve state-of-the-art results in caption generation.
[Figure: example image with two possible captions]
1. A boy is playing cricket.
2. A boy holding the cricket bat.
Deep Learning
• Deep learning is an artificial intelligence (AI) technique that loosely imitates the way the human brain processes data and forms patterns for use in decision making. It is also known as deep neural learning or deep neural networks.
• Deep learning is a subfield of machine learning concerned with algorithms, called artificial neural networks, that are inspired by the structure and function of the brain.
• In this Python project, we implement the caption generator using a CNN (Convolutional Neural Network) and an LSTM (Long Short-Term Memory).

• In this project we will use the following deep learning techniques:
1. Convolutional Neural Network
2. Recurrent Neural Network (LSTM)
CNN (Convolutional Neural Network)
• A Convolutional Neural Network (ConvNet/CNN) is a Deep Learning algorithm
which can take in an input image, assign importance (learnable weights and biases)
to various aspects/objects in the image and be able to differentiate one from the
other.
• A convolutional neural network (CNN, or ConvNet) is a class of deep neural networks most commonly applied to the analysis of visual imagery.
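To make the idea of "learnable weights" sliding over an image concrete, here is a minimal sketch of the core operation inside a convolutional layer, written in plain Python. It is an illustration only, not the project's code: the function name `conv2d`, the toy 4x4 image, and the hand-picked kernel are all assumptions. In a real CNN the kernel values are learned during training rather than chosen by hand.

```python
def conv2d(image, kernel):
    """Slide a 2D kernel over a 2D image (stride 1, no padding).

    Like most deep learning libraries, this computes a cross-correlation:
    the kernel is not flipped before it is applied.
    """
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for r in range(ih - kh + 1):
        row = []
        for c in range(iw - kw + 1):
            # Weighted sum of the image patch under the kernel.
            total = 0
            for i in range(kh):
                for j in range(kw):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        out.append(row)
    return out


# Toy 4x4 image (assumed): dark left half (0), bright right half (1).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hand-picked vertical-edge kernel (assumed); a CNN would learn these weights.
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = conv2d(image, kernel)
print(feature_map)  # [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
```

The feature map responds only in the middle column, where the image changes from dark to bright. A CNN stacks many such filters so that later layers can respond to larger structures and objects.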
LSTM (Long Short-Term Memory)
• LSTM stands for Long Short-Term Memory. LSTMs are a type of RNN (recurrent neural network) that is well suited to sequence prediction problems.
• Long short-term memory (LSTM) is an artificial Recurrent Neural Network (RNN)
architecture used in the field of deep learning.
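As a rough illustration of why an LSTM suits sequence prediction, the sketch below runs one time step of an LSTM cell using the standard gate equations. It is simplified to scalar inputs and states so that it needs only the standard library. The function names `sigmoid` and `lstm_step`, the `params` layout, and the weight values are assumptions made for this example. Real layers use weight matrices learned during training.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, params):
    """One time step of a scalar LSTM cell.

    params maps each gate name to a tuple (w_x, w_h, b): the weight on the
    current input, the weight on the previous hidden state, and the bias.
    """
    def pre_activation(name):
        w_x, w_h, b = params[name]
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre_activation("forget"))       # how much old memory to keep
    i = sigmoid(pre_activation("input"))        # how much new information to write
    o = sigmoid(pre_activation("output"))       # how much memory to expose
    g = math.tanh(pre_activation("candidate"))  # proposed new memory content

    c = f * c_prev + i * g    # updated cell state (long-term memory)
    h = o * math.tanh(c)      # updated hidden state (short-term output)
    return h, c


# Illustrative weights only (assumed); a trained model would learn these.
params = {
    "forget": (0.5, 0.1, 0.0),
    "input": (0.6, 0.2, 0.0),
    "output": (0.4, 0.3, 0.0),
    "candidate": (0.9, 0.5, 0.0),
}

# Feed a short sequence through the cell, carrying h and c forward.
h, c = 0.0, 0.0
for x in [1.0, 0.5, -0.5]:
    h, c = lstm_step(x, h, c, params)
    print(round(h, 4), round(c, 4))
```

The cell state `c` is what lets the network carry information across many time steps: the forget gate decides how much of it survives each step, and the input gate decides how much new content is added.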
To build our image caption generator model, we merge these two architectures. The result is also called a CNN-RNN model.

CNN – RNN MODEL


• The CNN extracts features from the image. We will use the pre-trained Xception model.
• The LSTM uses the information from the CNN to help generate a description of the image.
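The way the two parts cooperate at caption time can be sketched as a simple word-by-word loop. The code below is a hedged illustration, not the project's implementation: `generate_caption`, `toy_predict_next_word`, the `startseq`/`endseq` marker tokens, and the toy feature vector are all hypothetical. In the real model, the features would come from Xception and the next-word prediction from the trained LSTM.

```python
def generate_caption(features, predict_next_word, max_length=10):
    """Greedy decoding: repeatedly ask for the next word until the end marker.

    features          -- image feature vector (stands in for the CNN output)
    predict_next_word -- callable(features, words_so_far) -> next word
                         (stands in for the trained LSTM decoder)
    """
    words = ["startseq"]  # assumed start-of-caption marker
    for _ in range(max_length):
        next_word = predict_next_word(features, words)
        if next_word == "endseq":  # assumed end-of-caption marker
            break
        words.append(next_word)
    return " ".join(words[1:])  # drop the start marker


def toy_predict_next_word(features, words_so_far):
    """Hypothetical stand-in for the LSTM: replays a fixed caption."""
    script = ["a", "boy", "is", "playing", "cricket", "endseq"]
    position = min(len(words_so_far) - 1, len(script) - 1)
    return script[position]


toy_features = [0.1, 0.7, 0.2]  # placeholder for real CNN features
caption = generate_caption(toy_features, toy_predict_next_word)
print(caption)  # a boy is playing cricket
```

Each call sees both the image features and the words generated so far, which is how the language model keeps the caption tied to the image. The `max_length` cap is a safety limit in case the end marker is never predicted.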

THANK YOU
