
Methodology

data analysis using online comments

For this project, I looked at three major luxury brands whose consumer perceptions have varied over the past few years. I then scraped YouTube comment data to understand sentiment toward each brand and the words associated with it.


the brands

I looked at three distinct luxury brands: Burberry, Louis Vuitton, and Dolce & Gabbana.

DATA COLLECTION


youtube

I looked at the official YouTube channels of Burberry, Louis Vuitton, and Dolce & Gabbana; each had a playlist of all its fashion shows. The playlists often broke a single fashion show up into multiple videos, so each brand's playlist typically had about 100 videos. I also used the YouTube API (described below).


web scraper

I used the Chrome Web Scraper extension to go through each playlist and scrape each video's title, link, and publication date. As a result, I ended up with three data files, each with over a hundred rows.


excel

I used Excel for basic cleaning of the scraped data. I formatted the dates to prepare them for R. I also split each video link to extract the video's unique ID.
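The cleaning itself was done in Excel, but the same two steps can be sketched in code. The following is a minimal Python illustration, not the workflow actually used: the function names, the example link, and the assumed date format ("Sep 17, 2019") are all hypothetical, and the real scraped data may look different.

```python
from datetime import datetime
from urllib.parse import urlparse, parse_qs


def extract_video_id(url):
    """Return the YouTube video ID from a watch or youtu.be link, else None."""
    parsed = urlparse(url)
    if parsed.netloc.endswith("youtu.be"):
        # Short links carry the ID as the path: https://youtu.be/<id>
        return parsed.path.lstrip("/") or None
    # Playlist links look like /watch?v=<id>&list=<playlist>&index=<n>
    ids = parse_qs(parsed.query).get("v")
    return ids[0] if ids else None


def normalise_date(text, source_format="%b %d, %Y"):
    """Convert a scraped date string to ISO 8601 (YYYY-MM-DD).

    The default source_format is an assumption about how the dates were scraped.
    """
    return datetime.strptime(text, source_format).date().isoformat()


# Hypothetical row from one of the scraped files.
link = "https://www.youtube.com/watch?v=abc123XYZ_-&list=PLexample&index=4"
print(extract_video_id(link), normalise_date("Sep 17, 2019"))
```

Parsing the query string rather than cutting the link at a fixed character position keeps the ID intact even when the playlist and index parameters appear in a different order.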


r studio

To both fetch the comments and analyze all the data, I used R in the RStudio environment. For packages, I used tuber to fetch comments, syuzhet to analyze sentiment, wordcloud to create word clouds, and rdrobust to conduct discontinuity analysis.

r packages

TUBER

Tuber helped me fetch all the comments for each of the video IDs that I had extracted using Web Scraper and Excel. After connecting to the YouTube API, I authorized myself as a user and was able to extract the comments on every video.

SYUZHET

Afterward, I used syuzhet to score the comments on each video and averaged the scores to get a viewer sentiment rating for the video. I then plotted these ratings over time for each brand to see how viewer sentiment changed.
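To make the averaging step concrete, here is a small Python sketch of per-video aggregation. It is only an illustration: the scoring function uses a tiny made-up lexicon as a stand-in for syuzhet's dictionaries, and the names `TOY_LEXICON`, `score_comment`, and `average_sentiment_by_video` are hypothetical rather than part of any package.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical mini-lexicon; a real sentiment dictionary is far larger.
TOY_LEXICON = {"love": 1.0, "beautiful": 0.75, "boring": -0.5, "ugly": -1.0}


def score_comment(text, lexicon=TOY_LEXICON):
    """Sum the lexicon values of the words in one comment (unknown words score 0)."""
    words = text.lower().split()
    return sum(lexicon.get(word.strip(".,!?"), 0.0) for word in words)


def average_sentiment_by_video(comments):
    """Map each video ID to the mean score of its comments.

    `comments` is an iterable of (video_id, comment_text) pairs.
    """
    scores = defaultdict(list)
    for video_id, text in comments:
        scores[video_id].append(score_comment(text))
    return {video_id: mean(values) for video_id, values in scores.items()}


# Made-up comments on two hypothetical videos.
sample = [
    ("video_a", "I love it, beautiful!"),
    ("video_a", "so boring"),
    ("video_b", "ugly"),
]
print(average_sentiment_by_video(sample))
```

Averaging per video, rather than pooling every comment, keeps a heavily commented video from dominating the trend line for its brand.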

WORD CLOUDS

I also created word clouds using the wordcloud package. I created three word clouds for the most recent video: one of the most frequent words, one of the most positive words, and one of the most negative words. These indicate what the most popular opinions are, as well as what the brand's strongest proponents and opponents think.
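A word cloud is drawn from a table of word counts, so the three clouds come down to three such tables. The Python sketch below shows one way those tables could be built. It rests on assumptions: the stopword list is a hypothetical short one, and treating a word as "positive" or "negative" by looking it up in a sentiment lexicon is a guess at the approach, not a confirmed detail of the project.

```python
import re
from collections import Counter

# Hypothetical short stopword list; real lists are much longer.
STOPWORDS = {"the", "a", "and", "is", "this", "it", "so", "i"}


def word_frequencies(comments, stopwords=STOPWORDS):
    """Count how often each non-stopword appears across all comments."""
    counts = Counter()
    for text in comments:
        for word in re.findall(r"[a-z']+", text.lower()):
            if word not in stopwords:
                counts[word] += 1
    return counts


def split_by_polarity(counts, lexicon):
    """Split word counts into positive and negative words using `lexicon`.

    `lexicon` maps a word to a signed score; words it does not contain are dropped.
    """
    positive = Counter({w: n for w, n in counts.items() if lexicon.get(w, 0) > 0})
    negative = Counter({w: n for w, n in counts.items() if lexicon.get(w, 0) < 0})
    return positive, negative


# Made-up comments and a made-up two-word lexicon.
comments = ["I love this collection", "Love the music", "the music is boring"]
counts = word_frequencies(comments)
positive, negative = split_by_polarity(counts, {"love": 1, "boring": -1})
print(counts.most_common(3), positive, negative)
```

Each of the three counters can then be handed to a plotting tool, with word size scaled by count.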

RDROBUST

Finally, I ran a discontinuity analysis using rdrobust. Discontinuity analysis made the most sense because I wanted to understand how large events and scandals affected each of the companies. By running it around the date of an event, I could see whether sentiment on videos changed from before the event to after it.
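The idea behind a discontinuity estimate can be shown with a simplified Python sketch: fit one straight line to the videos before the event and another to the videos after it, then compare where the two lines land at the event date. This is deliberately not what rdrobust does internally (rdrobust uses kernel-weighted local polynomial fits with data-driven bandwidths and robust inference); the function names and the toy numbers are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def discontinuity_estimate(days, sentiment, cutoff=0.0, bandwidth=None):
    """Estimate the jump in sentiment at `cutoff`.

    `days` is each video's publish date in days relative to the event.
    Each side needs at least two videos with different dates.
    """
    # Centre the running variable so each intercept is the fitted value at the cutoff.
    left = [(d - cutoff, s) for d, s in zip(days, sentiment)
            if d < cutoff and (bandwidth is None or cutoff - d <= bandwidth)]
    right = [(d - cutoff, s) for d, s in zip(days, sentiment)
             if d >= cutoff and (bandwidth is None or d - cutoff <= bandwidth)]
    left_intercept, _ = fit_line(*zip(*left))
    right_intercept, _ = fit_line(*zip(*right))
    return right_intercept - left_intercept


# Toy data: average sentiment drops from 0.5 to 0.25 after a hypothetical event at day 0.
days = [-3, -2, -1, 1, 2, 3]
sentiment = [0.5, 0.5, 0.5, 0.25, 0.25, 0.25]
print(discontinuity_estimate(days, sentiment))
```

A negative estimate means sentiment was lower just after the event than just before it; the optional `bandwidth` argument restricts the fit to videos published close to the event.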
