IJARESM

Volume 13, Issue 1, January 2025

Optimizing Image Labeling Efficiency through Active and Transfer Learning Integration

Author Name : Noonety Mani Mokshith, Amit Singh, Suraj Anand, Yash Gupta, Prasanna Gopalrao Shinde, Addanki Yutesh Vishnu, Jitendra Suwalka, Yash Shindey

ABSTRACT Automatic image captioning lies at the core of mimicking the human visual system's ability to interpret scenes. An application that generates textual descriptions of images has diverse uses, particularly in helping visually impaired individuals comprehend their surroundings. In this project, we present a deep learning model that uses a CNN-LSTM (Convolutional Neural Network - Long Short-Term Memory) architecture to detect objects in images and generate corresponding captions. The system performs two key operations: a CNN extracts image features and detects objects, and a Recurrent Neural Network (RNN) with LSTM units generates the descriptive text. Pre-trained object detection models are integrated through transfer learning, which improves detection accuracy across a diverse range of datasets. The model interface was developed with Flask, a Python web framework, to support deployment and interaction. With a CNN trained with extensive hyperparameter tuning on large-scale datasets such as ImageNet, and its outputs fed to the LSTM network, the model achieved a BLEU (Bilingual Evaluation Understudy) score of 56 on the Flickr8k dataset. These results demonstrate the potential of combining CNNs and RNNs for real-world applications such as assistive tools for visually impaired users.
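
For readers unfamiliar with the reported metric, the following is a minimal sketch of how a sentence-level BLEU score can be computed. It is not the authors' evaluation code. The abstract does not state which BLEU variant was used, so the sketch assumes uniform weights over 1-grams to 4-grams, the standard brevity penalty, and no smoothing. The function names `ngrams` and `sentence_bleu` and the example sentences are illustrative only.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(candidate, references, max_n=4):
    """Sentence-level BLEU in [0, 1]; uniform n-gram weights, no smoothing (assumed variant)."""
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(candidate, n)
        total = sum(cand_counts.values())
        if total == 0:
            return 0.0  # candidate is shorter than n tokens
        # Clip each candidate n-gram count by its maximum count in any reference.
        max_ref = Counter()
        for ref in references:
            for gram, count in ngrams(ref, n).items():
                max_ref[gram] = max(max_ref[gram], count)
        clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
        if clipped == 0:
            return 0.0  # without smoothing, one zero precision zeroes the score
        log_precisions.append(math.log(clipped / total))

    # Brevity penalty: compare against the reference length closest to the candidate.
    c = len(candidate)
    r = min((len(ref) for ref in references), key=lambda length: (abs(length - c), length))
    bp = 1.0 if c > r else math.exp(1 - r / c)
    return bp * math.exp(sum(log_precisions) / max_n)


# Hypothetical caption pair, not taken from Flickr8k.
reference = "a dog runs on the green grass".split()
candidate = "a dog runs on the grass".split()
score = sentence_bleu(candidate, [reference])
print(f"BLEU: {100 * score:.1f}")
```

The function returns a value between 0 and 1. Multiplying by 100 gives a figure comparable to the reported 56 only if that score is on a 0 to 100 scale, which the abstract does not specify. Captioning results are usually reported as a corpus-level score over the whole test set, so this per-sentence version is only an approximation of that procedure.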