Methodology
data analysis using online comments
For this project, I looked at three major luxury brands whose consumer perceptions have varied over the past few years. I then scraped YouTube comment data to understand sentiment toward the brands and the words associated with them.
the brands
I looked at three distinct luxury brands: Burberry, Louis Vuitton, and Dolce & Gabbana.
DATA COLLECTION
youtube
I looked at the official YouTube channels of Burberry, Louis Vuitton, and Dolce & Gabbana, each of which had a playlist of all its fashion shows. The playlists often broke a single fashion show up into multiple videos, so each brand typically had about 100 videos per playlist. I also used the YouTube API (mentioned below).
web scraper
I used the Chrome Web Scraper extension to go through each playlist and scrape each video's title, link, and publication date. As a result, I ended up with three data files, each with over a hundred rows.
excel
I used Excel for basic cleaning of the scraped data. I formatted the dates to prepare them for R, and I split each video link to extract the video's unique ID.
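As an illustration of the link-splitting step, here is a minimal Python sketch (the actual cleaning was done in Excel, so this is not the method used in the project). It assumes the scraped links are standard `watch?v=...` or `youtu.be/...` URLs; the function name `extract_video_id` and the sample link are hypothetical.

```python
from urllib.parse import urlparse, parse_qs


def extract_video_id(url):
    """Return the video ID from a YouTube watch or short link, or None."""
    parsed = urlparse(url)
    host = parsed.netloc.lower()
    if host.endswith("youtu.be"):
        # Short links carry the ID as the path: https://youtu.be/<id>
        return parsed.path.lstrip("/") or None
    # Watch links (including ones opened from a playlist) carry the ID
    # in the "v" query parameter.
    values = parse_qs(parsed.query).get("v")
    return values[0] if values else None


# Made-up example link, not one of the scraped rows
print(extract_video_id("https://www.youtube.com/watch?v=abc123XYZ_-&list=PL123&index=2"))
```

Parsing the query string rather than cutting at a fixed character position keeps the ID intact when a playlist link has extra parameters such as `list` or `index`.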
r studio
To both fetch the comments and analyze all the data, I used R in the RStudio environment. For packages, I used tuber to fetch comments, syuzhet to analyze sentiment, wordcloud to create word clouds, and rdrobust to conduct the discontinuity analysis.
r packages
TUBER
Tuber helped me fetch all the comments for each of the video IDs that I had extracted using Web Scraper and Excel. After connecting to the YouTube API and authorizing myself as a user, I was able to extract the comments on every video.
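To show what "all the comments" involves, here is a rough Python sketch of paging through comment results. It is not the tuber code used in the project. It assumes responses shaped like the YouTube Data API's `commentThreads.list` output (an `items` list plus an optional `nextPageToken`). The `fetch_page` callable and the fake data are hypothetical stand-ins so the sketch runs offline.

```python
def collect_comments(fetch_page, video_id):
    """Gather every top-level comment text for one video.

    fetch_page is a hypothetical callable (video_id, page_token) -> dict
    shaped like a commentThreads.list response.
    """
    texts = []
    page_token = None
    while True:
        response = fetch_page(video_id, page_token)
        for item in response.get("items", []):
            snippet = item["snippet"]["topLevelComment"]["snippet"]
            texts.append(snippet["textDisplay"])
        # Keep requesting pages until the response stops offering a token
        page_token = response.get("nextPageToken")
        if not page_token:
            break
    return texts


def _thread(text):
    """Build one made-up comment thread in the assumed response shape."""
    return {"snippet": {"topLevelComment": {"snippet": {"textDisplay": text}}}}


def fake_fetch_page(video_id, page_token):
    """Offline stand-in for the real API call (made-up data)."""
    pages = {
        None: {"items": [_thread("Love this show")], "nextPageToken": "p2"},
        "p2": {"items": [_thread("Not my style")]},
    }
    return pages[page_token]


print(collect_comments(fake_fetch_page, "some_video_id"))
```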
SYUZHET
Afterward, I used syuzhet to score the comments on each video and averaged the scores to get an average viewer sentiment rating for that video. I then plotted these ratings over time for each brand to see how viewer sentiment changed.
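The per-video averaging can be sketched in Python as follows. This is only an illustration of lexicon-based scoring, not syuzhet itself: the tiny `LEXICON`, its weights, and the sample comments are all made up for the example.

```python
from statistics import mean

# Toy lexicon with made-up weights, for illustration only
LEXICON = {"love": 1.0, "beautiful": 0.75, "boring": -0.5, "ugly": -1.0}


def score_comment(text):
    """Sum the lexicon weights of the words in one comment."""
    words = text.lower().split()
    return sum(LEXICON.get(w.strip(".,!?"), 0.0) for w in words)


def average_sentiment_by_video(comments):
    """comments: iterable of (video_id, text). Returns {video_id: mean score}."""
    by_video = {}
    for video_id, text in comments:
        by_video.setdefault(video_id, []).append(score_comment(text))
    return {vid: mean(scores) for vid, scores in by_video.items()}


sample = [
    ("video_a", "Love it, beautiful!"),
    ("video_a", "So boring"),
    ("video_b", "ugly"),
]
print(average_sentiment_by_video(sample))
```

Joining these averages to each video's publication date gives the series that can be plotted over time for each brand.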
WORD CLOUDS
I also created word clouds using the word cloud package. For the most recent video, I created three word clouds: one of the most frequent words, one of the most positive words, and one of the most negative words. These indicate what the most popular opinions are, as well as what the brand's strongest proponents and opponents think.
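The counts behind the three word clouds can be sketched in Python like this. It is an assumption-laden illustration rather than the R code used: the stopword list, the `POSITIVE` and `NEGATIVE` word sets, and the sample comments are made up, and the drawing step is left out.

```python
import re
from collections import Counter

# Made-up word lists, for illustration only
STOPWORDS = {"the", "a", "is", "and", "this", "so", "it"}
POSITIVE = {"love", "beautiful", "stunning"}
NEGATIVE = {"boring", "ugly", "awful"}


def word_counts(comments):
    """Count non-stopword words across all comments on a video."""
    counts = Counter()
    for text in comments:
        for word in re.findall(r"[a-z']+", text.lower()):
            if word not in STOPWORDS:
                counts[word] += 1
    return counts


def split_by_polarity(counts):
    """Return separate counters for the positive and negative words."""
    positive = Counter({w: n for w, n in counts.items() if w in POSITIVE})
    negative = Counter({w: n for w, n in counts.items() if w in NEGATIVE})
    return positive, negative


sample = [
    "Love the show, love the music",
    "This is so boring",
    "Beautiful and stunning, love it",
]
counts = word_counts(sample)
positive, negative = split_by_polarity(counts)
print(counts.most_common(3), positive, negative)
```

Each of the three counters maps a word to a frequency, which is the kind of input a word cloud uses to size its words.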
RDROBUST
Finally, I ran a regression discontinuity analysis using rdrobust. This approach made the most sense because I wanted to understand how large events and scandals affected each of the companies. By running a discontinuity analysis around the date of each event, I could see whether sentiment on videos changed from before the event to after it.
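The idea behind the discontinuity estimate can be sketched in Python. This is a deliberately simplified stand-in, not rdrobust: it fits one straight line on each side of the event date and compares where the two lines meet the cutoff. rdrobust uses local polynomial fits with robust inference, which this sketch does not attempt. The function names and the numbers are made up for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def discontinuity_jump(days, sentiment, cutoff=0.0):
    """Estimate the jump in sentiment at the cutoff (the event date).

    Each side needs at least two videos on different days,
    otherwise the line fit is undefined.
    """
    left = [(d - cutoff, s) for d, s in zip(days, sentiment) if d < cutoff]
    right = [(d - cutoff, s) for d, s in zip(days, sentiment) if d >= cutoff]
    left_intercept, _ = fit_line(*zip(*left))
    right_intercept, _ = fit_line(*zip(*right))
    # Centering on the cutoff makes each intercept the fitted value at the event
    return right_intercept - left_intercept


# Made-up data: days relative to an event and average sentiment per video
days = [-4, -3, -2, -1, 1, 2, 3, 4]
sentiment = [0.40, 0.45, 0.50, 0.55, 0.15, 0.20, 0.25, 0.30]
print(discontinuity_jump(days, sentiment))
```

A negative result means fitted sentiment just after the event is lower than just before it. In this toy data the drop is about 0.5.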