Exploratory Data Analysis on Beers & Breweries Datasets
Personal Projects #Data Science #Python

Overview#

An exploratory data analysis (EDA) project examining the relationship between beer characteristics (ABV, alcohol by volume; IBU, International Bitterness Units) and brewery data.

Key Achievements#

  • Discovered a moderate positive correlation (0.670) between IBU and ABV values
  • Identified that ounces correlates only weakly with the other features (0.054 to 0.172)
  • Found 41.7% missing values in the IBU column
  • Determined that American IPA is the most common style (424 occurrences; 99 distinct styles in total)
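
The missing-value share and the most common style can be derived with a few pandas calls. The following is a minimal sketch, not the project's actual code: the `beers` frame holds hypothetical toy rows, and the column names `ibu` and `style` are assumptions about how the dataset is labeled.

```python
import numpy as np
import pandas as pd

# Hypothetical toy rows standing in for the beers table; not the real data.
beers = pd.DataFrame({
    "ibu": [60.0, np.nan, 35.0, np.nan, 20.0],
    "style": [
        "American IPA",
        "American IPA",
        "Saison",
        "American Pale Ale (APA)",
        "American IPA",
    ],
})

# Share of missing values in one column, expressed as a percentage.
missing_pct = beers["ibu"].isna().mean() * 100

# Frequency of each style; value_counts() sorts most common first.
style_counts = beers["style"].value_counts()
top_style = style_counts.idxmax()
top_count = int(style_counts.max())
n_styles = beers["style"].nunique()

print(f"ibu missing: {missing_pct:.1f}%")
print(f"most common style: {top_style} ({top_count} rows, {n_styles} distinct styles)")
```

On the full dataset the same calls would be expected to reproduce figures such as the 41.7% missing share reported above, provided the column names match.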

Descriptive Statistics of Numerical Values#

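| Column | Non Null | Missing % | Min | Max | Mean | Median | Mode | Std Dev | Quartile 0.25 | Quartile 0.5 | Quartile 0.75 |
|--------|----------|-----------|-----|-----|------|--------|------|---------|---------------|--------------|---------------|
| IBU | 1405 | 41.7% | 4.0 | 20.0 | 42.713 | 35.0 | 20.0 | 25.954 | 21.0 | 35.0 | 64.0 |
| ABV | 2348 | 2.6% | 0.001 | 0.128 | 0.0598 | 0.056 | 0.05 | 0.0135 | 0.05 | 0.056 | 0.067 |
| Ounces | 2410 | 0% | 8.4 | 32.0 | 13.592 | 12.0 | 12.0 | 2.352 | 12.0 | 12.0 | 16.0 |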
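
A table with these columns could be assembled roughly as follows. This is a sketch under assumptions: `summarize` is a hypothetical helper, the toy frame is illustrative rather than the real data, and pandas' default sample standard deviation (`ddof=1`) is used because the project does not state which variant it reports.

```python
import numpy as np
import pandas as pd


def summarize(df: pd.DataFrame, columns: list[str]) -> pd.DataFrame:
    """Build one row of descriptive statistics per numerical column."""
    rows = {}
    for col in columns:
        s = df[col]
        rows[col] = {
            "Non Null": int(s.count()),              # count() ignores NaN
            "Missing %": round(s.isna().mean() * 100, 1),
            "Min": s.min(),
            "Max": s.max(),
            "Mean": s.mean(),
            "Median": s.median(),
            "Mode": s.mode().iloc[0],                # first mode if several tie
            "Std Dev": s.std(),                      # sample std (ddof=1), an assumption
            "Quartile 0.25": s.quantile(0.25),
            "Quartile 0.5": s.quantile(0.5),
            "Quartile 0.75": s.quantile(0.75),
        }
    return pd.DataFrame.from_dict(rows, orient="index")


# Hypothetical toy data, only to show the shape of the result.
toy = pd.DataFrame({
    "ibu": [20.0, 20.0, 40.0, np.nan],
    "ounces": [12.0, 12.0, 16.0, 12.0],
})
summary = summarize(toy, ["ibu", "ounces"])
print(summary)
```

Note that `s.mode().iloc[0]` would fail on a column with no non-null values, so a real pipeline may want to guard against that case.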

Correlation Matrix#

|        | abv   | ibu   | ounces |
|--------|-------|-------|--------|
| abv    | 1.0   | 0.670 | 0.172  |
| ibu    | 0.670 | 1.0   | 0.054  |
| ounces | 0.172 | 0.054 | 1.0    |
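
A Pearson correlation matrix of this shape can be obtained with `DataFrame.corr`. The sketch below uses hypothetical toy values rather than the real dataset. By default pandas drops missing values pairwise, so each coefficient uses only the rows where both columns are present; whether the project handled missing IBU values the same way is an assumption.

```python
import numpy as np
import pandas as pd

# Hypothetical toy values; the real beers table is not reproduced here.
toy = pd.DataFrame({
    "abv": [0.045, 0.050, 0.062, 0.070, 0.085],
    "ibu": [18.0, np.nan, 45.0, 60.0, 90.0],
    "ounces": [12.0, 12.0, 16.0, 12.0, 16.0],
})

# Pearson is the default method; it is named explicitly for clarity.
corr = toy[["abv", "ibu", "ounces"]].corr(method="pearson")
print(corr.round(3))
```

The result is symmetric with ones on the diagonal, which is why the matrix above lists each coefficient twice.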

Frequency Distribution Plots#

Distribution plots for each numerical column, created with seaborn and pyplot after dropping missing values.

ibu Distribution#

abv Distribution#

ounces Distribution#

Technologies#

Python, Pandas, NumPy, Seaborn, Matplotlib, Pearson Correlation
