Data were downloaded from Mengting Wan.
The datasets were collected in late 2017 from Goodreads. Details of the datasets are described on the dataset website.
- Explore high dimensional user data from Goodreads.
- Learn new-to-me approaches in NLP and implicit collaborative filtering/recommender systems.
Used for my analyses so far:
- makeDBs_complete.ipynb and makeDBs_romance.ipynb: Create the SQLite databases and tables used for data organization/storage.
- statistics.ipynb: EDA notebook to compute basic statistics of data and understand rating/review trends across genres.
- reviews_sentiment.ipynb: Sentiment analysis on >3.4 million romance book reviews.
- romance_recommender.ipynb: Implicit collaborative filtering on >16 million romance book ratings.
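
As a rough illustration of the storage and EDA steps in the notebooks listed above, here is a minimal sketch that loads a few ratings into SQLite and computes per-genre counts and mean ratings. The table name `ratings`, its columns, and the toy rows are assumptions made up for this example; the actual notebooks may organize the data differently.

```python
import sqlite3

# Hypothetical schema and toy rows, for illustration only.
# The real notebooks may use different tables and column names.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ratings (user_id TEXT, book_id TEXT, genre TEXT, rating INTEGER)"
)

rows = [
    ("u1", "b1", "romance", 5),
    ("u1", "b2", "romance", 3),
    ("u2", "b1", "romance", 4),
    ("u2", "b3", "fantasy", 2),
]
# Parameterized inserts avoid building SQL strings by hand.
conn.executemany("INSERT INTO ratings VALUES (?, ?, ?, ?)", rows)
conn.commit()

# Basic statistics: number of ratings and mean rating per genre.
genre_stats = conn.execute(
    "SELECT genre, COUNT(*), AVG(rating) FROM ratings GROUP BY genre ORDER BY genre"
).fetchall()

for genre, n_ratings, mean_rating in genre_stats:
    print(f"{genre}: {n_ratings} ratings, mean {mean_rating:.2f}")
```

Pushing the aggregation into SQL keeps memory use low, which is likely to matter at the scale of millions of ratings; swapping `":memory:"` for a file path would persist the database on disk.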
- Mengting Wan, Julian McAuley, "Item Recommendation on Monotonic Behavior Chains", in RecSys'18.
- Mengting Wan, Rishabh Misra, Ndapa Nakashole, Julian McAuley, "Fine-Grained Spoiler Detection from Large-Scale Review Corpora", in ACL'19.