
R Project Presentation

On Beer Data Sample

Presented by
Group 4
Amandeep Singh Chawala
Dushyant Sharma
Intekhab Aslam
Pallavi Bothra
Pranav Rampal
Varun Bajaj
Data snapshot

         beer_brewerid  review_time  review_overall  review_aroma  review_appearance  review_profilename  beer_style                       review_palate  review_taste  beer_name                               beer_abv  beer_beerid
140783   17981          1301176680   3               3             4                  Gueuzedude          English Brown Ale                3.5            3.5           Bolita Double Nut Brown Ale             9         46368
84325    140            1194833634   4.5             3.5           5                  HombreWing          American IPA                     4.5            5             Sierra Nevada Celebration Ale           6.8       1904
135323   5318           1289727085   4.5             4.5           4                  juhl31              American Double / Imperial Stout 4.5            5             Infamous Chocolate W/                   NA        63521
86688    140            1288122242   3.5             3.5           4                  Bendurgin           American Brown Ale               3.5            3.5           Sierra Nevada Tumbler Autumn Brown Ale  5.5       60420
445966   590            1310181648   4               4             4                  mooselvlt           Hefeweizen                       4.5            4             Dancing Man Wheat                       7.2       37586
473320   392            1193633294   4               4             4                  crookedhalo         American Amber / Red Ale         4              3.5           House                                   4.5       31337
338367   16353          1287721706   4.5             4             4                  djbreezy            American Blonde Ale              4.5            4.5           Fresh Hop Goldilocks Golden Ale         4.2       62806
707231   1471           1240372433   4.5             4.5           3.5                rayjaybeer          American Stout                   4              4.5           Dark Horse Tres Blueberry Stout         7.5       6227
356581   1177           1295228598   4.5             4.5           4.5                oglmcdgl            American IPA                     4.5            4.5           Masala Mama India Pale Ale              5.9       6368
1294635  1422           1302569529   3               2.5           2.5                RonaldTheriot       Light Lager                      3              2.5           Beer 30 Light                           4         32918

review_text (truncated excerpts, by row)

140783   A soft pour into my Lost Abbey Teku glass produces an almost five finger thick, lightly browned, darkish head that leaves a fair amount of lacing on the sides of my glass. ...
84325    2007 Eyes: Pours frothy, dark orange, nice little bubbles, bright ...
135323   -black as can be with a straight up thin brown head. ...
86688    Poured into a pint glass. The beer is transparent nut brown with a mammoth tan head that leaves tons of soapy lace. ...
445966   12 oz bottle into a stange A - Pours a hazy yellow-orange with a huge pillowy head of off white foam. Minimal lacing ...
473320   Pours a light copper color with a fluffy light tan head. Good stickage and decent lacing. ...
338367   appearance - straw yellow, foamy head, good retention and lacing ...
707231   Got this one in a BIF and I'm going in blind; never heard of this beer but I'm familiar with the name of the brewery. ...
356581   From a growler given to be by a killer trader brewnic. ...
1294635  Beer 30 Light has a thick, egg-shell colored head and a clear, golden appearance with some bubbles streaming up and little lacing left. ...
Three Most Famous Picks out of the Beer Data

Looking at the dataset, the 'review_overall' input is the most suitable single metric for ranking the beers. While other features judge individual facets of a beer's quality, the reviewer takes all of these into consideration when choosing the overall score.

For a more robust recommendation, beers with too few reviews will be excluded. The review count is first added to the data frame, and visualized to find a good threshold.
The average number of reviews per beer is about 11.

        Beer_beerid    Beer_abv       Review_overall  Review_count
Count   28724.000000   28724.000000   28724.00000     28724.000000
Mean    21366.908369   7.052047       3.81780         11.354477
Std     21760.066995   2.359091       0.72546         12.246361
Min     5.000000       0.300000       1.00000         1.000000
25%     1670.00000     5.200000       3.50000         2.000000
50%     12719.000000   6.500000       4.000000        7.000000
75%     39043.000000   8.500000       4.500000        16.000000
max     77162.000000   41.000000      5.000000        60.000000


With this knowledge, 10 reviews seems like a sensible threshold for the minimum count.
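The counting and filtering step described here can be sketched roughly as follows. This is a minimal illustration in plain Python rather than the project's actual code; the sample rows, the function name `rank_beers`, and the inclusive `min_count` comparison are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical mini-sample of (beer_name, review_overall) pairs, not the real data.
reviews = [
    ("Beer A", 4.5), ("Beer A", 4.0), ("Beer A", 5.0),
    ("Beer B", 5.0),
    ("Beer C", 3.0), ("Beer C", 3.5),
]

def rank_beers(reviews, min_count):
    """Return (name, review_count, mean_overall) sorted by mean score, best first.

    Beers with fewer than min_count reviews are dropped (assumed inclusive cut-off).
    """
    scores = defaultdict(list)
    for name, overall in reviews:
        scores[name].append(overall)
    ranked = [
        (name, len(vals), mean(vals))
        for name, vals in scores.items()
        if len(vals) >= min_count
    ]
    ranked.sort(key=lambda row: row[2], reverse=True)
    return ranked

# With the real data the slides suggest a threshold of 10; 2 is used for the toy sample.
top = rank_beers(reviews, min_count=2)
```

In the toy sample "Beer B" has a perfect score but only one review, so it is excluded, which is the effect the threshold is meant to have.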
Top 3 Beers with >10 reviews

       Beer_name                   Beer_beerid   Beer_abv   Review_count   Review_overall
3947   Pliny The Younger           21690         11         11             4.636364
1747   Trappist Westvleteren 12    1545          10.2       22             4.590909
817    Pliny The Elder             7971          8          46             4.586957


More Insights here

Relationship between review count and overall score.

The top right data point with ~60 reviews (the maximum review count) looks interesting too. Maybe it can be sourced locally.
The recommended beers, based on overall review and popularity

       Beer_name                   Review_overall
3947   Pliny The Elder             4.586957
1747   Trappist Westvleteren 12    4.536585
817    Pliny The Younger           4.590909
926    The Abyss                   4.550000
Which of the factors (aroma, taste, appearance, palate) are most important in determining the overall quality of a beer?

• The correlation between columns needs to be investigated. Below, the Pearson product-moment correlation is calculated and plotted on a heatmap.

Here we can see that the most influential factor in the overall score is ABV (0.35), followed by taste and appearance (0.3).
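For reference, the Pearson product-moment correlation behind such a heatmap can be computed as in the sketch below. It is a plain-Python illustration with a handful of made-up scores, not the project's code; the helper `pearson` and the `sample` dictionary are assumptions for the example, and the resulting numbers say nothing about the full dataset.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length numeric lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Hypothetical illustrative scores, one list per column.
sample = {
    "review_overall": [3.0, 4.5, 4.5, 3.5, 4.0],
    "review_taste":   [3.5, 5.0, 5.0, 3.5, 4.0],
    "review_aroma":   [3.0, 3.5, 4.5, 3.5, 4.0],
}

# Correlation of every other column with the overall score.
corr_with_overall = {
    col: pearson(vals, sample["review_overall"])
    for col, vals in sample.items()
    if col != "review_overall"
}
```

Each value lies between -1 and 1; a heatmap simply colours the full column-by-column matrix of these values.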
Which brewery produces the strongest beers by ABV%?

Grouping the reviews by brewery, then beer, gives a list of each brewery's beer collection. The [ size ] column shows how many reviews are logged for each beer.

Beer_style            Beer_name                     Beer_abv, 'size'   Beer_abv, 'mean'
Altbier               512 Alt                       1                  6
                      84/09 Double Alt              1                  9.8
                      AlTerior Motive               1                  6.3
                      Alaskan Amber                 8                  5.3
                      Alt                           8                  8.5375
American Blonde Ale   Oval Beach Blonde Ale         1                  5
                      Paienne                       2                  5
                      Peregrine Golden Ale          1                  4
                      Pete's Wicked Rally Cap Ale   6                  5
                      Premium Blonde                1                  5
                      Rapscallion Premier           1                  7

The top breweries by average ABV are revealed. Plotting the results to compare the highest ABV breweries with the rest:

Visualizing the data shows that most breweries brew around 6-8% ABV.
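A rough sketch of the grouping behind this ranking is shown below, in plain Python. The rows, the function name `mean_abv_by_brewery`, and the choice to skip reviews with a missing ABV are assumptions made for illustration, not details confirmed by the slides.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows of (beer_brewerid, beer_name, beer_abv); None marks a missing ABV.
rows = [
    (140, "Beer A", 6.8),
    (140, "Beer B", 5.5),
    (590, "Beer C", 7.2),
    (5318, "Beer D", None),
]

def mean_abv_by_brewery(rows):
    """Return (brewery_id, mean_abv) pairs sorted from strongest to weakest."""
    abv_by_brewery = defaultdict(list)
    for brewery_id, _name, abv in rows:
        if abv is not None:  # assumption: rows without an ABV are ignored
            abv_by_brewery[brewery_id].append(abv)
    ranking = [(brewery, mean(vals)) for brewery, vals in abv_by_brewery.items()]
    ranking.sort(key=lambda row: row[1], reverse=True)
    return ranking

strongest = mean_abv_by_brewery(rows)
```

A brewery whose only reviews lack an ABV value drops out of the ranking entirely, which may or may not match how the original analysis handled missing values.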
If you typically enjoy a beer due to its aroma and appearance, which beer style should you try?

Beer_style                         Review_appearance, Size   Agg_mean
American Double/ Imperial Stout    1033.0                    4.169242
Russian Imperial Stout             1044.0                    4.157807
Quadrupel (Quad)                   367.0                     4.115123
Flanders Red Ale                   110.0                     4.111364
Gueuze                             102.0                     4.075980
To answer this, we need to group the data frame by beer style, then sort by an aggregated mean of the
aroma and appearance score. A scatter plot would be a good way to visualize the relationship between
aroma and appearance.
Creating an aggregate mean of appearance and aroma, and then sorting by this value, gives the highest
rated beer type for the two variables.
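One way to sketch this aggregation in plain Python is shown below. The slides do not spell out how the aggregate mean is defined, so averaging the aroma and appearance scores per review and then per style is an assumption, as are the sample rows and the name `rank_styles`.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows of (beer_style, review_aroma, review_appearance).
rows = [
    ("Stout", 4.5, 4.0),
    ("Stout", 4.5, 3.5),
    ("Light Lager", 2.5, 2.5),
    ("Hefeweizen", 4.0, 4.0),
]

def rank_styles(rows):
    """Return (style, size, agg_mean) sorted by the combined aroma/appearance mean."""
    combined = defaultdict(list)
    for style, aroma, appearance in rows:
        # Assumed definition: per-review average of the two scores.
        combined[style].append((aroma + appearance) / 2)
    ranking = [(style, len(vals), mean(vals)) for style, vals in combined.items()]
    ranking.sort(key=lambda row: row[2], reverse=True)
    return ranking

style_ranking = rank_styles(rows)
```

The `size` value plays the same role as the Size column in the table above: it shows how many reviews back each style's score.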
Here a linear relationship is shown between the appearance and aroma scores. This is evidence that a beer's look and smell are closely linked.

There is a significant linear relationship between the two features, with one or two obvious outliers.

With this information we can conclude that the appearance, aroma, and overall score of a beer are strongly correlated. Recommending beers using only the overall score is a safe and sensible option.
Text Analytics

Steps
1. Data Extraction from source
2. Parsing comments into sentences
3. Tokenization
4. Term Frequency Analysis
5. Word Clouds
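Steps 2 to 4 above can be sketched with the Python standard library as follows. This is an illustrative outline only: the regular expressions, the tiny `STOPWORDS` set, and the function names are assumptions, and the project itself may have used dedicated text-mining tools.

```python
import re
from collections import Counter

# Small illustrative stop-word list (an assumption, not the one used in the project).
STOPWORDS = {"a", "the", "and", "of", "with", "is", "to", "in", "into", "that"}

def split_sentences(text):
    """Step 2: split a review into sentences at ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def tokenize(sentence):
    """Step 3: lowercase the sentence and keep alphabetic tokens."""
    return re.findall(r"[a-z']+", sentence.lower())

def term_frequencies(texts):
    """Step 4: relative frequency of each non-stop-word term across all reviews."""
    counts = Counter()
    for text in texts:
        for sentence in split_sentences(text):
            counts.update(t for t in tokenize(sentence) if t not in STOPWORDS)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {term: n / total for term, n in counts.most_common()}

# Two short review excerpts used as sample input.
sample_reviews = [
    "Poured into a pint glass. The beer is transparent nut brown.",
    "Pours a light copper color with a fluffy light tan head.",
]
frequencies = term_frequencies(sample_reviews)
```

The resulting term-to-frequency mapping is the kind of input a word cloud is drawn from, with more frequent terms rendered larger.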
Relative Frequencies of different words
Word Cloud
