🎬 IMDB Movie Review Sentiment Analysis using Deep Learning (Optimized ANN)
📘 Project Overview
This project applies Deep Learning techniques to perform Sentiment Analysis on the IMDB Movie Review Dataset using an Artificial Neural Network (ANN). The model determines whether a movie review expresses a positive or negative sentiment.
It uses TF-IDF for text feature extraction and a carefully optimized neural network with:
Batch Normalization
Leaky ReLU Activation
Dropout Regularization
L2 Weight Regularization
Early Stopping
The architecture is tuned to reach about 87% accuracy on the held-out reviews while keeping overfitting under control.
🎯 Objective
To develop a robust text classification model that accurately predicts the sentiment polarity of IMDB movie reviews, using an optimized neural network architecture and TF-IDF-based text preprocessing.
📂 Dataset Information
Dataset Name: IMDB Movie Review Dataset
Size: 50,000 reviews (Balanced)
Label Distribution:

| Sentiment | Label | Count |
|-----------|-------|--------|
| Positive | 1 | 25,000 |
| Negative | 0 | 25,000 |

Each review is a textual paragraph expressing a user’s opinion about a movie.
⚙️ Project Workflow
1️⃣ Data Preprocessing
2️⃣ Data Splitting
Random seed 42 for reproducibility
3️⃣ Feature Extraction
max_features = 15000
ngram_range = (1, 2)
stop_words = 'english'
This converts each review into a 15,000-dimensional numeric vector of TF-IDF weights over unigrams and bigrams, representing term importance.
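The splitting and feature-extraction steps can be sketched roughly as follows. This is a minimal illustration, not the project's actual script: the toy `reviews` list is hypothetical, and `test_size=0.2` and `stratify` are assumptions (a 20% split would match the 10,000-review support shown in the classification report). Only the vectorizer settings and the seed of 42 come from this README.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split

# Hypothetical toy corpus standing in for the 50,000 IMDB reviews.
reviews = [
    "a wonderful film with great acting",
    "terrible plot and awful pacing",
    "great story and a wonderful cast",
    "awful acting and a terrible script",
    "a great film that I loved",
    "a terrible movie that I hated",
    "wonderful direction and a great score",
    "awful dialogue and terrible editing",
    "loved the cast and the great ending",
    "hated the plot and the awful ending",
]
labels = [1, 0, 1, 0, 1, 0, 1, 0, 1, 0]  # 1 = positive, 0 = negative

# Assumed 80/20 split; the seed of 42 is the one stated above.
X_train_text, X_test_text, y_train, y_test = train_test_split(
    reviews, labels, test_size=0.2, random_state=42, stratify=labels
)

# Vectorizer settings as listed in this section.
vectorizer = TfidfVectorizer(
    max_features=15000, ngram_range=(1, 2), stop_words="english"
)

# Fit on the training text only, then reuse the vocabulary for the test text.
X_train = vectorizer.fit_transform(X_train_text)
X_test = vectorizer.transform(X_test_text)

print(X_train.shape, X_test.shape)
```

Fitting the vectorizer on the training split alone keeps test-set vocabulary and document frequencies out of the features, which is one common way to avoid leakage.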
4️⃣ Model Architecture
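As a rough sketch, a network combining the components named in the overview (Dense layers with L2 regularization, Batch Normalization, Leaky ReLU, Dropout) could look like the block below. The layer widths (256 and 64), the dropout rate, the L2 strength, and the helper name `build_model` are illustrative assumptions; this README does not state the actual values.

```python
from tensorflow import keras
from tensorflow.keras import layers, regularizers


def build_model(input_dim=15000, l2_strength=1e-3, dropout_rate=0.5):
    """Build a small MLP; all sizes and rates here are assumed, not the author's."""

    def block(units):
        # Dense -> BatchNorm -> LeakyReLU -> Dropout, with L2 on the Dense weights.
        return [
            layers.Dense(units, kernel_regularizer=regularizers.l2(l2_strength)),
            layers.BatchNormalization(),
            layers.LeakyReLU(),
            layers.Dropout(dropout_rate),
        ]

    return keras.Sequential(
        [keras.Input(shape=(input_dim,))]
        + block(256)
        + block(64)
        # Single sigmoid unit for binary positive/negative output.
        + [layers.Dense(1, activation="sigmoid")]
    )


model = build_model()
model.summary()
```

The input dimension of 15,000 matches the TF-IDF `max_features` setting, so the vectorizer output can be fed to the network after conversion to a dense array or in batches.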
Additional Enhancements:
🧠 Model Compilation and Training
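A minimal sketch of compiling and training with the Adam optimizer and an `EarlyStopping` callback that monitors `val_loss` is shown below. The synthetic data, the tiny model, the learning rate, `patience=3`, the batch size, and the epoch count are all assumptions made so the example runs on its own; they are not the project's settings.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Synthetic stand-in for TF-IDF features and binary sentiment labels.
rng = np.random.default_rng(42)
X = rng.random((200, 20)).astype("float32")
y = (X[:, 0] > 0.5).astype("float32")

# Deliberately tiny model so the sketch trains in seconds.
model = keras.Sequential([
    keras.Input(shape=(20,)),
    layers.Dense(16),
    layers.LeakyReLU(),
    layers.Dense(1, activation="sigmoid"),
])

model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-3),  # assumed learning rate
    loss="binary_crossentropy",
    metrics=["accuracy"],
)

# Stop once val_loss has not improved for `patience` epochs and
# roll back to the weights from the best epoch.
early_stop = keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=3, restore_best_weights=True
)

history = model.fit(
    X, y,
    validation_split=0.2,
    epochs=20,
    batch_size=32,
    callbacks=[early_stop],
    verbose=0,
)

print("epochs run:", len(history.history["val_loss"]))
```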
Early Stopping: Stops training when val_loss no longer improves, ensuring the best weights are restored.
📊 Results and Analysis
✅ Final Evaluation Metrics
🔍 Classification Report
accuracy 0.87 10000
🧩 Confusion Matrix
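For reference, the accuracy, classification report, and confusion matrix can be computed with scikit-learn's metric helpers roughly as follows. The labels and probabilities here are a made-up toy example, not the project's results, and the 0.5 decision threshold is an assumption.

```python
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

# Hypothetical ground truth and predicted probabilities (not real results).
y_true = [0, 0, 0, 1, 1, 1, 1, 0]
y_prob = [0.1, 0.4, 0.7, 0.9, 0.6, 0.3, 0.8, 0.2]

# Assumed threshold: probabilities of 0.5 or more count as positive.
y_pred = [int(p >= 0.5) for p in y_prob]

acc = accuracy_score(y_true, y_pred)
cm = confusion_matrix(y_true, y_pred)  # rows = true class, columns = predicted class

print("accuracy:", acc)
print(cm)
print(classification_report(y_true, y_pred, target_names=["Negative", "Positive"]))
```

In the matrix, row 0 holds the negative reviews (true negatives, then false positives) and row 1 holds the positive reviews (false negatives, then true positives).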
Interpretation:
📈 Training Behavior
🧰 Technologies Used
🔹 Programming Language
🔹 Data Handling & Preprocessing
🔹 Natural Language Processing
TfidfVectorizer → Text vectorization
train_test_split → Data partitioning
accuracy_score, classification_report, confusion_matrix → Performance metrics
🔹 Deep Learning Framework
Sequential, Dense, Dropout, BatchNormalization, LeakyReLU → Neural network architecture
Adam Optimizer → Adaptive gradient optimization
EarlyStopping → Regularization and convergence control
l2 → Weight regularization
🔹 Optional Visualization (Recommended)
🚀 Future Improvements
👨💻 Author
Ali Khan, AI Engineer
📧 alikhan132311@gmail.com
💡 Passionate about Deep Learning, NLP, and Model Optimization.
⭐ If you find this project helpful, please give it a star on GitHub!