Results for individual bakers across all Great British Bake Off (GBBO) series. The data have been lightly processed to make them suitable for a "bakeoff".

Usage

bakeoff_train

bakeoff_test

Format

A data frame with 71 rows representing individual bakers and 16 variables:

winners

was the baker in the final episode?

series

an integer denoting UK series (1-8)

age

an integer denoting the baker's age in years at their first episode

occupation

a character string giving occupation

hometown

a character string giving hometown

percent_star

the percentage of episodes in which the baker was named star baker (the printed values lie between 0 and 1)

percent_technical_wins

percent of episodes in which the baker won the technical challenge

percent_technical_bottom3

percent of times a given baker was in the bottom 3 on the technical challenge

percent_technical_top3

percent of times a given baker was in the top 3 (1st, 2nd, or 3rd) on the technical challenge

technical_highest

an integer denoting the best technical rank earned by a given baker across all episodes in which they appeared (higher is better)

technical_lowest

an integer denoting the worst technical rank earned by a given baker across all episodes in which they appeared (higher is better)

technical_median

an integer denoting the median technical rank earned by a given baker across all episodes in which they appeared (higher is better)

judge1

the name of one of the judges

judge2

the name of the other judge

viewers_7day

number of viewers in millions within a 7-day window from airdate

viewers_28day

number of viewers in millions within a 28-day window from airdate

An object of class tbl_df (inherits from tbl, data.frame) with 71 rows and 16 columns.

An object of class grouped_df (inherits from tbl_df, tbl, data.frame) with 30 rows and 15 columns.

Source

This is a combination of two datasets in Alison Hill's bakeoff package.

Details

bakeoff_train is the training set for use on the homework.

bakeoff_test is the test set for use on the homework. It contains a held-out set of 30 bakers and omits the winners column.
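
A minimal sketch of how the two sets might be used together: fit on the training set, then predict the missing winners column for the test set. The small train and test data frames below are hand-copied from a few rows and columns of the printed output in the Examples section so that the snippet runs on its own; with the package attached you would use bakeoff_train and bakeoff_test directly. The choice of logistic regression on age alone is an assumption made for illustration, not the method the homework prescribes.

```r
# Hand-copied subset of the rows printed in Examples (illustrative only).
train <- data.frame(
  winners = c(FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, TRUE, FALSE, TRUE),
  series  = c(1, 1, 1, 1, 1, 1, 1, 2, 2, 2),
  age     = c(30, 45, 25, 51, 44, 48, 31, 31, 40, 41),
  percent_star           = c(0, 0, 0, 0, 0, 0, 0, 0.25, 0, 0.125),
  percent_technical_wins = c(0, 0, 0.333, 0, 0, 0, 0, 0.25, 0, 0.375)
)
test <- data.frame(
  series = c(1, 1, 1, 2, 2),
  age    = c(24, 37, 31, 19, 31),
  percent_star           = c(0, 0, 0, 0.4, 0),
  percent_technical_wins = c(0.333, 0.333, 0, 0.2, 0.25)
)

# The outcome column is the only one absent from the test set.
missing_cols <- setdiff(names(train), names(test))

# Assumed model for illustration: logistic regression of winners on age.
fit <- glm(as.integer(winners) ~ age, data = train, family = binomial())

# Predicted probability of being a finalist for each held-out baker.
pred <- predict(fit, newdata = test, type = "response")
```

Here pred holds one probability per row of test; thresholding it (for example at 0.5) would give predicted values for the omitted winners column.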

Examples

bakeoff_train
#> # A tibble: 71 × 16
#>    winners series   age occupation  hometown percent_star percent_technical_wins
#>    <lgl>    <dbl> <dbl> <chr>       <chr>           <dbl>                  <dbl>
#>  1 FALSE        1    30 CEO of the… Northam…        0                      0    
#>  2 FALSE        1    45 Advertisin… Brackne…        0                      0    
#>  3 FALSE        1    25 Council Wo… London          0                      0.333
#>  4 FALSE        1    51 Student     Teignmo…        0                      0    
#>  5 FALSE        1    44 Student su… Sneinto…        0                      0    
#>  6 FALSE        1    48 IT program… Poynton…        0                      0    
#>  7 TRUE         1    31 Charity wo… Barton-…        0                      0    
#>  8 TRUE         2    31 Pastor      Essex           0.25                   0.25 
#>  9 FALSE        2    40 Charity wo… West Ki…        0                      0    
#> 10 TRUE         2    41 Former sch… Woodfor…        0.125                  0.375
#> # ℹ 61 more rows
#> # ℹ 9 more variables: percent_technical_bottom3 <dbl>,
#> #   percent_technical_top3 <dbl>, technical_highest <dbl>,
#> #   technical_lowest <dbl>, technical_median <dbl>, judge1 <chr>, judge2 <chr>,
#> #   viewers_7day <dbl>, viewers_28day <dbl>

bakeoff_test
#> # A tibble: 30 × 15
#>    series   age occupation          hometown percent_star percent_technical_wins
#>     <dbl> <dbl> <chr>               <chr>           <dbl>                  <dbl>
#>  1      1    24 Assistant Credit C… Leicest…        0                      0.333
#>  2      1    37 PA and administrat… Saltley…        0                      0.333
#>  3      1    31 Student             Swansea…        0                      0    
#>  4      2    19 Photographer        Midhurs…        0.4                    0.2  
#>  5      2    31 IT programme manag… Radlett…        0                      0.25 
#>  6      2    63 Project engagement… Erith           0.143                  0.143
#>  7      3    28 Dentist             North L…        0                      0.143
#>  8      3    36 Graphic designer    Manches…        0                      0    
#>  9      3    23 Trainee anaestheti… Merseys…        0.1                    0.1  
#> 10      4    31 Intensive care con… Yeovil          0.1                    0.1  
#> # ℹ 20 more rows
#> # ℹ 9 more variables: percent_technical_bottom3 <dbl>,
#> #   percent_technical_top3 <dbl>, technical_highest <dbl>,
#> #   technical_lowest <dbl>, technical_median <dbl>, judge1 <chr>, judge2 <chr>,
#> #   viewers_7day <dbl>, viewers_28day <dbl>