Optimizing Image Labeling Efficiency through Active and Transfer Learning Integration
Authors: Noonety Mani Mokshith, Amit Singh, Suraj Anand, Yash Gupta, Prasanna Gopalrao Shinde, Addanki Yutesh Vishnu, Jitendra Suwalka, Yash Shindey
ABSTRACT Automatic image captioning is central to emulating the human visual system's ability to interpret scenes. A system that generates textual descriptions of images has diverse applications, particularly in helping visually impaired individuals comprehend their surroundings. In this project, we present a deep learning model that uses a CNN-LSTM (Convolutional Neural Network - Long Short-Term Memory) architecture to detect objects in images and generate corresponding captions. The system performs two key operations: object detection, for which pre-trained Convolutional Neural Networks (CNNs) are integrated through Transfer Learning to improve detection accuracy across a diverse range of datasets, and caption generation, in which a Recurrent Neural Network (RNN) with LSTM units turns the CNN-extracted features into descriptive text. The model interface was developed using Flask, a Python web development framework, for deployment and interaction. With the CNN trained with extensive hyperparameter tuning on large-scale datasets such as ImageNet and its outputs fed to the RNN-based LSTM network, the model achieved a BLEU (BiLingual Evaluation Understudy) score of 56 on the Flickr8k dataset. This project demonstrates the potential of combining CNNs and RNNs for real-world applications, such as aiding visually impaired individuals in understanding their environment.
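To make the reported metric concrete, the following is a minimal sketch of how a sentence-level BLEU score can be computed: clipped n-gram precisions for n = 1 to 4, combined by a geometric mean and multiplied by a brevity penalty. The abstract does not say which BLEU variant was used (n-gram order, smoothing, corpus or sentence level), so the uniform 4-gram weighting, the absence of smoothing, and the function names `ngrams`, `modified_precision`, and `bleu` are all assumptions made for illustration, as are the example captions. The sketch returns a value in [0, 1]; if the reported 56 is on the common 0 to 100 scale, it would correspond to 0.56 here.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, references, n):
    """Clipped n-gram precision: a candidate n-gram is credited at most as
    many times as it occurs in the reference where it is most frequent."""
    cand_counts = ngrams(candidate, n)
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    max_ref = Counter()
    for ref in references:
        for gram, count in ngrams(ref, n).items():
            max_ref[gram] = max(max_ref[gram], count)
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    return clipped / total


def bleu(candidate, references, max_n=4):
    """Sentence-level BLEU in [0, 1], uniform n-gram weights, no smoothing
    (assumed variant; the source does not specify one)."""
    precisions = [modified_precision(candidate, references, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0  # without smoothing, any zero precision zeroes the score
    log_mean = sum(math.log(p) for p in precisions) / max_n
    # Brevity penalty: compare against the reference length closest to the
    # candidate length (ties resolved toward the shorter reference).
    c = len(candidate)
    r = min((abs(len(ref) - c), len(ref)) for ref in references)[1]
    bp = 1.0 if c > r else math.exp(1 - r / c)
    return bp * math.exp(log_mean)


# Hypothetical captions, not taken from Flickr8k.
candidate = "a dog runs across the grass".split()
references = [
    "a dog runs across the green grass".split(),
    "a brown dog is running on the grass".split(),
]
print(round(bleu(candidate, references), 4))
```

In practice a dataset-level score is usually computed over all test captions at once rather than by averaging per-sentence values, and library implementations offer smoothing options that this sketch leaves out.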