Reproducing: Introduction to Association Rules in R by Yosuke Yasuda


options(width = 150)
options(max.print = 30)
.libPaths("/Users/mertnuhoglu/.exploratory/R/3.4")
library(dplyr, warn.conflicts = FALSE)

This document reproduces the code explained in "Introduction to Association Rules in R" by Yosuke Yasuda.

The CSV files are available at http://github.com/mertnuhoglu/study_data

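# Read the baskets: one line per transaction, no header row.
readr::read_csv(
  "/Users/mertnuhoglu/projects/study_data/ds/article_introduction_to_association_rules_in_r_by_yosuke_yasuda_groceries.csv",
  quote = "\"", skip = 0, col_names = FALSE, na = c("", "NA")
) %>%
  # Number the transactions, then reshape the wide columns into one row per product.
  dplyr::mutate(transaction_id = row_number()) %>%
  tidyr::gather(key, product, -transaction_id, na.rm = TRUE, convert = TRUE) %>%
  dplyr::arrange(transaction_id) %>%
  # Mine association rules, drop rules at or below the support threshold,
  # and keep the 3 highest-lift rules for each right-hand side.
  exploratory::do_apriori(product, transaction_id, min_support = 0.0001) %>%
  dplyr::filter(support > 0.0004) %>%
  dplyr::group_by(rhs) %>%
  dplyr::top_n(3, lift)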
## Parsed with column specification:
## cols(
##   X1 = col_character(),
##   X2 = col_character(),
##   X3 = col_character(),
##   X4 = col_character()
## )
## Warning in rbind(names(probs), probs_f): number of columns of result is not a multiple of vector length (arg 1)
## Warning: 8830 parsing failures.
## # A tibble: 5 x 5
##     row   col  expected    actual
##   <int> <chr>     <chr>     <chr>
## 1     2  <NA> 4 columns 3 columns
## 2     3  <NA> 4 columns 1 columns
## 3     6  <NA> 4 columns 5 columns
## 4     7  <NA> 4 columns 1 columns
## 5     8  <NA> 4 columns 5 columns
## # ... with 1 more variables: file <chr>
## See problems(...) for more details.
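
The parsing failures above suggest that the file is ragged: each line holds one basket, and baskets differ in length, while the column count (4 here) appears to be fixed from the first line. Fields beyond the fourth are then likely dropped from longer rows. Below is a minimal sketch of one way to read such data straight into the long (transaction_id, product) shape without a fixed column count, using only base R. The three sample baskets and the names `lines`, `baskets` and `long` are made up for illustration and are not taken from the groceries file.

```r
# Hypothetical sample: three baskets of different lengths, one per line.
# With a real file this vector would come from readLines(path).
lines <- c(
  "citrus fruit,margarine,ready soups",
  "whole milk",
  "tropical fruit,yogurt,coffee,whole milk,butter"
)

# Split each line on commas. The result is a list with one character
# vector per basket, so no column count has to be guessed.
baskets <- strsplit(lines, ",", fixed = TRUE)

# Stack the baskets into long form: one row per (transaction, product).
long <- data.frame(
  transaction_id = rep(seq_along(baskets), lengths(baskets)),
  product = unlist(baskets),
  stringsAsFactors = FALSE
)

print(long)
```

This sketch assumes that no product name contains a comma or surrounding quotes. If that assumption fails, a proper CSV parser is still needed.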
## # A tibble: 30 x 5
## # Groups:   rhs [15]
##                                lhs          rhs      support confidence      lift
##                              <chr>        <chr>        <dbl>      <dbl>     <dbl>
##  1 canned beer, liquor (appetizer)         soda 0.0004067107  0.5714286  5.526057
##  2  frozen potato products, yogurt   whole milk 0.0004067107  0.8000000  3.589416
##  3         frankfurter, liver loaf      sausage 0.0005083884  0.7142857  7.602814
##  4             liver loaf, sausage  frankfurter 0.0005083884  0.5000000  8.478448
##  5         flour, other vegetables   whole milk 0.0005083884  0.8333333  3.738975
##  6        canned beer, canned fish   rolls/buns 0.0004067107  0.8000000  6.319679
##  7          liquor, red/blush wine bottled beer 0.0016268429  0.9411765 19.528419
##  8    bottled beer, red/blush wine       liquor 0.0016268429  0.6666667 81.958333
##  9            red/blush wine, soda       liquor 0.0006100661  0.5454545 67.056818
## 10                    liquor, soda bottled beer 0.0010167768  0.7692308 15.960727
## # ... with 20 more rows
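
For reference, the three metric columns can be computed by hand. For a rule lhs => rhs, support is the share of transactions that contain every item of lhs and rhs, confidence is that support divided by the support of lhs, and lift is confidence divided by the support of rhs. The following is a minimal base R sketch on a made-up set of five baskets. The data and the helper name `support_of` are hypothetical, and the resulting numbers are unrelated to the table above.

```r
# Hypothetical toy data: five baskets, each a character vector of products.
baskets <- list(
  c("liquor", "bottled beer", "soda"),
  c("liquor", "bottled beer"),
  c("bottled beer", "whole milk"),
  c("whole milk", "soda"),
  c("whole milk")
)

# Share of baskets that contain every item in `items`.
support_of <- function(items, baskets) {
  mean(vapply(baskets, function(b) all(items %in% b), logical(1)))
}

lhs <- "liquor"
rhs <- "bottled beer"

support    <- support_of(c(lhs, rhs), baskets)       # P(lhs and rhs)
confidence <- support / support_of(lhs, baskets)     # P(rhs | lhs)
lift       <- confidence / support_of(rhs, baskets)  # P(rhs | lhs) / P(rhs)

print(c(support = support, confidence = confidence, lift = lift))
```

A lift above 1 means that rhs appears more often in baskets containing lhs than in baskets overall, which is presumably why the pipeline ranks rules by lift within each rhs.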

 Tech    12 Jan, 2018

Any work (images, writings, presentations, ideas or whatever) which I own is always provided under the
Creative Commons Attribution-Share Alike 3.0 License.

Mert Nuhoglu is a Trabzon-born programmer and data scientist.
