Welcome to the Reddit Data Cleanup & Sentiment Analysis repository!
This tool analyzes the sentiment of Reddit submissions provided in JSON format and writes a single CSV file summarizing the results, all with a beginner-friendly setup.
Reddit-Sentiment-Analysis/
├── input/ # Folder containing input .json files (sample files included)
│ ├── sample1.json
│ └── sample2.json
├── sentimentAnalysis.py # Python script to perform sentiment analysis
├── output.csv # Output CSV (auto-generated after running the script)
└── README.md # This file
- Accepts .json files of Reddit submissions placed inside the input/ folder.
- Performs sentiment analysis using NLTK's SentimentIntensityAnalyzer.
- Outputs a single output.csv summarizing sentiment scores per day.
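To give a feel for what the analyzer returns, here is a minimal sketch of scoring one piece of text with NLTK's SentimentIntensityAnalyzer. The helper names score_text and label_from_compound are hypothetical and are not taken from sentimentAnalysis.py. The ±0.05 cut-off is the convention suggested by VADER's authors, and the script may or may not apply it.

```python
def label_from_compound(compound, threshold=0.05):
    """Map a VADER compound score in [-1, 1] to a coarse label.

    The default threshold is an assumption (the usual VADER convention),
    not something confirmed by this repository.
    """
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


def score_text(text):
    """Return VADER's neg/neu/pos/compound scores for one piece of text."""
    # Imported lazily so label_from_compound works without NLTK installed.
    # Requires the vader_lexicon resource to have been downloaded.
    from nltk.sentiment.vader import SentimentIntensityAnalyzer

    return SentimentIntensityAnalyzer().polarity_scores(text)
```

Calling score_text("some submission text") returns a dictionary with the keys neg, neu, pos and compound. Passing its compound value to label_from_compound turns the number into a readable label.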
No prior experience? No worries — follow these steps to set up everything from scratch!
git clone https://github.com/VISHNUDAS-tunerlabs/Reddit-Sentiment-Analysis
cd Reddit-Sentiment-Analysis
It's recommended to use a virtual environment to manage dependencies:
# Create a virtual environment
python3 -m venv venv
# Activate the virtual environment
# On macOS/Linux:
source venv/bin/activate
# On Windows:
venv\Scripts\activate
Then install the dependencies:
pip install -r requirements.txt
If there's no requirements.txt, just run:
pip install nltk pandas
And don't forget to download NLTK resources:
# Run this in a Python shell
import nltk
nltk.download('vader_lexicon')
Once everything is set up, simply run:
python sentimentAnalysis.py
This will:
- Read all .json files in the input/ folder.
- Perform sentiment analysis on each submission.
- Generate a single output.csv file in the root directory.
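The three steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual contents of sentimentAnalysis.py: the function names, the CSV column names (date, submissions, mean_compound), the use of UTC dates, and averaging the compound score per day are all guesses. It uses the standard-library csv module to stay self-contained, although the real script may well use pandas.

```python
import csv
import json
from collections import defaultdict
from datetime import datetime, timezone
from functools import lru_cache
from pathlib import Path


@lru_cache(maxsize=1)
def _analyzer():
    # Built once and reused; needs nltk plus the vader_lexicon resource.
    from nltk.sentiment.vader import SentimentIntensityAnalyzer

    return SentimentIntensityAnalyzer()


def vader_compound(text):
    """Default scorer: VADER's compound score for one text."""
    return _analyzer().polarity_scores(text)["compound"]


def run(input_dir="input", output_path="output.csv", scorer=vader_compound):
    """Score every submission under input_dir and write one CSV row per day."""
    per_day = defaultdict(list)
    for path in sorted(Path(input_dir).glob("*.json")):
        with open(path, encoding="utf-8") as handle:
            data = json.load(handle)
        for item in data.get("submissions", []):
            text = item.get("body") or item.get("text") or ""
            if not text or "created_utc" not in item:
                continue  # nothing to score, or no date to file it under
            # Assumption: days are taken in UTC.
            day = datetime.fromtimestamp(
                float(item["created_utc"]), tz=timezone.utc
            ).date().isoformat()
            per_day[day].append(scorer(text))

    with open(output_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["date", "submissions", "mean_compound"])
        for day in sorted(per_day):
            scores = per_day[day]
            writer.writerow([day, len(scores), round(sum(scores) / len(scores), 4)])
    return output_path
```

Passing the scorer as a parameter is a design choice made for this sketch: it lets you try the file handling and aggregation with a stand-in function before NLTK is installed.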
We've included a couple of sample .json files inside the input/ folder so you can test the script right away.
- Make sure your .json files contain a top-level field named submissions, where each item has a created_utc timestamp and a body or text field for sentiment analysis.
- The script assumes the timestamp is in UNIX format and converts it to a calendar date for daily aggregation.
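If you are preparing your own files, a quick check like the one below can catch format problems before you run the script. It is only a sketch based on the notes above. check_submissions is a hypothetical helper that is not part of this repository, and the sample content is made up for illustration.

```python
import json


def check_submissions(data):
    """Return a list of human-readable problems; an empty list means OK."""
    submissions = data.get("submissions") if isinstance(data, dict) else None
    if not isinstance(submissions, list):
        return ["top-level 'submissions' list is missing"]
    problems = []
    for index, item in enumerate(submissions):
        if not isinstance(item, dict):
            problems.append(f"item {index}: not a JSON object")
            continue
        if not isinstance(item.get("created_utc"), (int, float)):
            problems.append(f"item {index}: 'created_utc' is not a UNIX timestamp")
        if not (item.get("body") or item.get("text")):
            problems.append(f"item {index}: neither 'body' nor 'text' has content")
    return problems


# A made-up file in the expected shape: one item uses body, the other text.
sample = json.loads("""
{"submissions": [
  {"created_utc": 1700000000, "body": "Great write-up, thanks!"},
  {"created_utc": 1700086400, "text": "This did not work for me."}
]}
""")
print(check_submissions(sample))  # prints []
```

To check a real file, load it with json.load and pass the result to check_submissions. Any messages it returns point at the items that need fixing.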
Feel free to fork the repo and contribute via pull requests. Suggestions and improvements are always welcome!
Have questions or need help? Open an issue or reach out!