[Miscellaneous] Find my lipstick shade! - Sephora.com web scraping & EDA

When choosing color makeup such as lipstick, a person's natural colors (skin tone, hair color, and eye color) all play a role. In some cases, hair and eye colors characterize an individual's dominant color palette due to genetics. For instance, people with low melanin production tend to have blue eyes and blonde hair with relatively fair skin. However, where hair and eye colors are both shades of brown due to higher melanin production, skin "undertone" may contribute more in determining the dominant "wearable" colors. Sephora pays some attention to the lightness or darkness of skin tone by offering reviewers a skin tone category input (9 levels from porcelain to ebony), but it is up to customers to determine which hue (i.e., "yellow", "pink", or "olive" undertone) the best-matching foundation shade has. It becomes even more complicated when choosing lip or eye color makeup because no two pink lipsticks are the same! Wouldn't it be wonderful if you could determine which shade of foundation (depending on the brand) is the best fit, as well as whether a particular lipstick color would look good on you, based on your purchase and review history?

As a first step, the goal of this project was to explore the color spectrum of foundations and lipsticks given the reviewers' dominant colors (self-reported hair color, eye color, and skin tone from Sephora's reviewer inputs) to see whether particular features are strongly correlated with the colors of the foundations and lipsticks they purchased and liked.

Data Collection

Since sephora.com loads most of its content dynamically through APIs, product URLs were first collected from Sephora using Selenium. Next, the product information needed to pull review JSON from Bazaarvoice was collected from each product page, also using Selenium. Finally, RGB values were extracted from the color swatches provided in the reviews.
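As a rough illustration of the last step, here is a minimal sketch of reducing a swatch to a single RGB value by averaging its pixels. The function name `mean_rgb` and the tiny swatch are made up for illustration; the actual extraction code used in this project is not shown here, and in practice the pixels would come from an image library.

```python
from typing import Iterable, Tuple

Pixel = Tuple[int, int, int]


def mean_rgb(pixels: Iterable[Pixel]) -> Tuple[float, float, float]:
    """Average the R, G, and B channels over all pixels of a swatch."""
    total_r = total_g = total_b = 0
    count = 0
    for r, g, b in pixels:
        total_r += r
        total_g += g
        total_b += b
        count += 1
    if count == 0:
        raise ValueError("swatch has no pixels")
    return (total_r / count, total_g / count, total_b / count)


# Hypothetical 2x2 swatch, flattened to a list of (R, G, B) pixels.
swatch = [(200, 100, 90), (210, 110, 100), (190, 90, 80), (200, 100, 90)]
swatch_rgb = mean_rgb(swatch)
print(swatch_rgb)  # (200.0, 100.0, 90.0)
```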

The following table shows the web scraped data organized into the data frame that was used in the analysis herein:

Total number of scraped reviews with ratings of 4 or 5: 176,958

Total number of unique author IDs: 127859

Total number of unique reviewer Nicknames: 125759

Total number of unique foundation products: 191

Total number of unique lipstick products: 230

Total number of unique foundation shades: 3689

Total number of unique lipstick shades: 3083

Exploratory Data Analysis

The pie charts below show the breakdown of the self-reported personal colors in both foundation and lipstick reviews. It's worth noting that customers with the 3 darkest skin tones (deep, dark, and ebony) account for only 9.7% of reviewers.

To see whether this underrepresentation of dark skin tones was specific to foundation, where there are relatively fewer dark shades to choose from, foundation and lipstick reviews were analyzed separately.

It turned out that product category didn’t really matter. For both lipstick and foundation products, customers with deep/dark/ebony skin tones made up only about 10% of the reviews. The scatter plots (right column in the figure above) show that lipstick products come in a wider range of colors, yet customers with darker skin tones didn’t write more reviews even with that wider range on offer. It is hard to be sure without the actual sales records, because reviews don’t require a login, but this underrepresentation likely reflects a lower preference for shopping online among customers with darker skin tones. If so, it points to potential online market growth from reaching out to this group.

Now, let’s look at how foundation color varies across the categories of self-reported color traits. RGB is 3 dimensional, but it can be reduced to a single numerical value by combining the red, green, and blue values into luminosity (L):

L = 0.2126 * R + 0.7152 * G + 0.0722 * B

As the word luminosity suggests, it is a measure of brightness.
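The formula is easy to check in a few lines of Python. This is only a sketch: the function name and the two example shades are made up, and I'm assuming 0 to 255 channel values here.

```python
def luminosity(r: float, g: float, b: float) -> float:
    """Weighted sum of the R, G, B channels using the coefficients above."""
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


# Two hypothetical foundation shades (not taken from the scraped data).
light_shade = (240, 210, 190)
deep_shade = (120, 80, 60)

light_l = luminosity(*light_shade)
deep_l = luminosity(*deep_shade)
print(light_l > deep_l)  # True: the lighter shade has the higher luminosity
```

Note that green carries most of the weight, so two shades with the same red value can still differ a lot in luminosity.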

As expected, the skin tone category was reflected in the foundation luminosity: there was a monotonic trend of decreasing foundation luminosity with darker skin tone categories. However, tan and olive are almost identical in terms of luminosity (see the pair plot analysis later on). Categorical linear model analysis (foundation L ~ skin tone (C)* + hair color (C) + eye color (C)) indicated that skin tone had a strong and significant effect on foundation luminosity (|β|> 0.019, P<0.0001) while eye color had a weak but significant effect (|β|< 0.014, P<0.005). For hair color, on the other hand, only red and black hair had a weak but significant effect (|β|< 0.007, P<0.005) on foundation luminosity, while the other hair colors had no significant effect. Overall, the model explained 45.8% of the variance in foundation luminosity (R2=0.458, P<0.0001).

*(C) indicates categorical variable.

In contrast, the same categorical linear model analysis for lipstick products (lipstick L ~ skin tone (C) + hair color (C) + eye color (C)) yielded results that were quite different from the foundation. Because of the large sample size, the effects of skin tone, hair color, and eye color on lipstick luminosity were statistically significant, but the variance explained was so small (R2 = 0.038, P<0.0001) that, in practice, these color traits have a negligible effect on lipstick luminosity.
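To make the contrast between a high and a low R2 concrete, here is a minimal sketch of variance explained by a single categorical factor. It is a one-factor simplification (skin tone only), not the three-factor model above, and the luminosity values are toy numbers made up for illustration.

```python
from collections import defaultdict
from statistics import fmean


def variance_explained(values, groups):
    """R^2 of a one-factor categorical model: 1 - SS_within / SS_total."""
    grand_mean = fmean(values)
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    by_group = defaultdict(list)
    for value, group in zip(values, groups):
        by_group[group].append(value)
    ss_within = 0.0
    for members in by_group.values():
        group_mean = fmean(members)
        ss_within += sum((v - group_mean) ** 2 for v in members)
    return 1.0 - ss_within / ss_total


# Toy data: luminosity per review and the reviewer's skin tone.
tones = ["fair", "fair", "fair", "deep", "deep", "deep"]
foundation_l = [200, 210, 190, 100, 110, 90]  # well separated by tone
lipstick_l = [80, 120, 100, 90, 110, 100]     # same mean in both tones

r2_foundation = variance_explained(foundation_l, tones)
r2_lipstick = variance_explained(lipstick_l, tones)
print(round(r2_foundation, 3), round(r2_lipstick, 3))
```

In the toy foundation data almost all of the spread sits between the two tone groups, so R2 is close to 1. In the toy lipstick data the group means coincide, so knowing the tone explains nothing.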

Since luminosity is a 1D measurement derived from R, G, B, as the next step, I investigated how these 3D values are related to the self-reported color traits. For foundation products, RGB color distribution (diagonal plots below) showed distinct patterns for each of the self-reported skin tones except for tan and olive, which were indistinguishable.

In the pair plots above, the RGB histograms (diagonal plots) for tan and olive overlapped. This indicates that even though tan and olive are supposed to reflect different skin undertones, or hues, people are not necessarily able to distinguish between the two. This suggests that Sephora's skin tone self-reporting may need only 8 categories instead of 9. Pair plots of R, G, B for each self-reported category (skin tone, hair color, eye color) showed systematic shifts in the RGB distribution depending on skin tone, which become less visible when the data are grouped by hair color or eye color.

Categorical linear model analysis was performed to support the interpretations above: R | G | B ~ skin tone (C) + hair color (C) + eye color (C). All three of R, G, and B showed moderate variance explained (R: R2 = 0.362, P<0.001; G: R2 = 0.468, P<0.001; B: R2 = 0.467, P<0.001).

Pair plots were obtained for lipstick RGBs as well:

In terms of lipstick RGB color distribution, skin tones can be divided into two groups (Group 1: ebony, dark, deep; Group 2: porcelain, fair, light, medium, tan, olive). Group 1 purchased red-shifted colors with less blue and green, while Group 2 purchased lighter and more neutral colors (stronger green and blue components). This suggests that people with deep/dark/ebony skin tones prefer lipstick colors that are distinct from those preferred by the other skin tone groups.

Categorical linear model analysis was performed to support the interpretations above: R | G | B ~ skin tone (C) + hair color (C) + eye color (C). None of R, G, or B produced a meaningful variance explained (less than 5%), although the models were significant (R: R2 = 0.037, P<0.001; G: R2 = 0.028, P<0.001; B: R2 = 0.027, P<0.001). Individual skin tone coefficients were larger for Group 1 (ebony, dark, deep): approximately 3 times greater in magnitude than those of Group 2 for all of R, G, and B (P<0.05).

Modeling

Now that several interesting trends in the foundation and lipstick reviews have been found, what can I do with them to help me find a lipstick shade that matches my complexion? When I started this project, I hoped that there would be several lipstick and foundation reviews written by the same customer using the same nickname. Unfortunately, ZERO reviewer nicknames appeared in both the lipstick and the foundation reviews, out of 176,958 reviews! Is this the end of the journey? No, and this is where all the hard work in EDA and statistical analysis pays off.
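The overlap check itself boils down to a set intersection. Below is a minimal sketch, assuming each review is a dict with a "nickname" key. That field name and the sample reviews are hypothetical, not the actual Bazaarvoice schema.

```python
def shared_nicknames(foundation_reviews, lipstick_reviews):
    """Return the nicknames that appear in both review collections."""
    foundation_names = {review["nickname"] for review in foundation_reviews}
    lipstick_names = {review["nickname"] for review in lipstick_reviews}
    return foundation_names & lipstick_names


# Made-up reviews; only the (hypothetical) nickname field matters here.
foundation_reviews = [{"nickname": "glow_getter"}, {"nickname": "matte_fan"}]
lipstick_reviews = [{"nickname": "berry_kiss"}, {"nickname": "bold_lips"}]

overlap = shared_nicknames(foundation_reviews, lipstick_reviews)
print(len(overlap))  # 0
```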

EDA and statistical analyses showed that skin tone may mediate the relationship between foundation and lipstick colors. This means that linear models can be developed using aggregates grouped by self-reported skin tone. Before jumping to the models, it is worth going over the assumptions made in these models:

Ratings of 4 and 5 indicate that the reviewers liked the shade as well as the other qualities of the product.
Although only the reviews with color swatches were selected, this selection is assumed to be a random sample of the whole review data.
Swatches represent the true color, and the averaged RGB of a swatch represents the whole image.

Linear models between the medians of the R, G, and B values of foundation and lipstick, grouped by skin tone, showed strong correlations (left plots: R2 = 0.898, 0.929, 0.912 for R, G, B, respectively) that were significant (P < 0.0001). For hair and eye colors, the linear models showed moderate to strong correlations but did not reach statistical significance. Specifically, for hair color, the linear model for R values showed a positive relationship with a strong correlation (R2 = 0.614, P = 0.0402), but the model did not survive Bonferroni correction. For eye color, the linear models for G and B values showed positive relationships with strong correlations (R2 = 0.713, 0.656 for G and B, respectively), but the models did not achieve statistical significance (P = 0.456 and 0.0607 for G and B, respectively) after Bonferroni correction.
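The grouped-median approach can be sketched as follows for a single channel. This is an illustration under assumptions, not the code behind the numbers above: the helper names are mine, the R values are toy numbers, and the fit is a plain least-squares line through the per-skin-tone medians.

```python
from collections import defaultdict
from statistics import fmean, median


def median_by_group(values, groups):
    """Median of the values within each group label."""
    by_group = defaultdict(list)
    for value, group in zip(values, groups):
        by_group[group].append(value)
    return {group: median(members) for group, members in by_group.items()}


def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x; returns (intercept, slope, r2)."""
    mean_x, mean_y = fmean(xs), fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope, sxy ** 2 / (sxx * syy)


# Toy R values; foundation and lipstick reviews come from different reviewers,
# so they can only be linked through the shared skin tone label.
tones = ["fair"] * 3 + ["medium"] * 3 + ["deep"] * 3
foundation_r = [240, 236, 232, 200, 204, 196, 150, 160, 140]
lipstick_r = [180, 186, 190, 165, 171, 180, 145, 151, 160]

foundation_med = median_by_group(foundation_r, tones)
lipstick_med = median_by_group(lipstick_r, tones)

common = sorted(set(foundation_med) & set(lipstick_med))
xs = [foundation_med[tone] for tone in common]
ys = [lipstick_med[tone] for tone in common]
intercept, slope, r2 = fit_line(xs, ys)
print(round(intercept, 1), round(slope, 2), round(r2, 3))
```

Because each skin tone contributes a single point, the fit rests on only as many points as there are categories, which is why the hair and eye color models can show a high R2 and still fail to reach significance.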

From the models, it can be concluded that, on average, the lipstick shade that doesn't clash with the foundation shade can be described as follows:

Lipstick (R, G, B) = (91 + 0.40r, 19 + 0.36g, 53 + 0.26b), where (r, g, b) is the foundation color.
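Reading the equation above as one line per channel (lipstick R = 91 + 0.40r, and so on, with r, g, b taken from the foundation), it can be applied like this. The function name and the example foundation shade are made up, and rounding to whole channel values is my own choice.

```python
def recommend_lipstick(foundation_rgb):
    """Apply the fitted per-channel lines to a foundation (r, g, b) color."""
    r, g, b = foundation_rgb
    raw = (91 + 0.40 * r, 19 + 0.36 * g, 53 + 0.26 * b)
    # Round to whole channel values so the result can be used as a color.
    return tuple(round(channel) for channel in raw)


# Hypothetical foundation shade.
foundation = (200, 150, 100)
lipstick = recommend_lipstick(foundation)
print(lipstick)  # (171, 73, 79)
```

For foundation channels between 0 and 255 the outputs stay well inside the 0 to 255 range, so no clipping is needed.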

Conclusion

This project started out with a simple question that everyone asks when buying a lipstick: what color looks good on me that I like enough to spend money on? The linear models gave us an equation. How can we use it? Can we build a recommender system on it to help online customers? To test it, I took an image from Google with a "less natural" pink lipstick and gave it a "more natural" one using the lipstick formula from this project.

After all, makeup is subject to personal preference, so it's difficult to "judge" which is better, but the data suggest that there are average trends in the lipstick colors that people prefer (rate highly) based on their skin color (inferred from foundation colors). In that sense, this formula can provide guidelines for recommending "safe" colors as well as for assessing the "risk" of choosing colors that are far from the safe recommendation (e.g. by measuring the distance between the risky color and the model's prediction).
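One way to turn that "risk" idea into a number is a plain Euclidean distance in RGB space between a candidate lipstick and the model's recommendation. This is only a sketch of the idea: the choice of Euclidean distance, the function names, and the example colors are all my assumptions.

```python
import math


def safe_lipstick(foundation_rgb):
    """Model recommendation for a foundation (r, g, b), left unrounded."""
    r, g, b = foundation_rgb
    return (91 + 0.40 * r, 19 + 0.36 * g, 53 + 0.26 * b)


def color_risk(foundation_rgb, lipstick_rgb):
    """Euclidean RGB distance between a lipstick and the 'safe' recommendation."""
    return math.dist(safe_lipstick(foundation_rgb), lipstick_rgb)


# Hypothetical foundation shade and two candidate lipsticks.
foundation = (200, 150, 100)
close_candidate = (174, 77, 79)   # near the recommendation
bold_candidate = (90, 20, 140)    # far from it

close_risk = color_risk(foundation, close_candidate)
bold_risk = color_risk(foundation, bold_candidate)
print(close_risk < bold_risk)  # True
```

RGB distance is not perceptually uniform, so a perceptual color space might be a better fit for a real recommender.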

In addition, contrary to the general belief that categorized eye color and hair color help in finding matching lipstick shades, on average these two color features did not have any predictive power for the purchased and liked lipstick shades. These results may help in structuring reviewers' responses so that they give better information to new customers, or in generating suggestions for future purchases.

Future directions include running a paired analysis (i.e. pairing lip and foundation colors) and looking at the similarity between those two colors and at people's preferences in relation to such an index.