Market Basket Analysis using Association Rule Mining

Problem Statement 

By this point, we are sure that you are clear about what Market Basket Analysis is, how the Apriori algorithm works, and the statistical concepts related to association rules.

Process

Step 1 : Download the famous grocery dataset, available here : Click Here 

Step 2 : Perform the following exploratory data analysis on the above dataset:
  •     Read the CSV file into a dataframe.
  •     Get the shape of the dataframe.
  •     Find the top 20 "sold items" that occur in the dataset.
  •     Find how much of the total sales they account for.
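The exploratory steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: it assumes the file holds one transaction per line with comma-separated items and rows of different lengths, and it uses a small in-memory sample (raw_file) in place of the downloaded file. The names raw_file, grocery_df, item_counts, top_items and share are chosen here for illustration.

```python
import csv
import io

import pandas as pd

# Hypothetical stand-in for the downloaded file (assumed layout:
# one transaction per line, comma-separated items, ragged rows).
raw_file = io.StringIO(
    "bread,milk,butter\n"
    "bread,milk,eggs\n"
    "bread,milk,butter\n"
    "bread,milk\n"
    "bread,butter,eggs,jam\n"
)

# Read the rows and load them into a dataframe; shorter rows are
# padded with missing values.
rows = [row for row in csv.reader(raw_file) if row]
grocery_df = pd.DataFrame(rows)

# Shape: (number of transactions, largest number of items in a transaction)
print(grocery_df.shape)

# Flatten all cells into one column and count each item;
# value_counts() ignores the missing cells used as padding.
item_counts = grocery_df.melt()["value"].value_counts()

# Top 20 sold items (the toy sample has fewer than 20 distinct items,
# so head(20) simply returns all of them here).
top_items = item_counts.head(20)
print(top_items)

# Fraction of all item sales that the top items account for.
share = top_items.sum() / item_counts.sum()
print(f"Top items account for {share:.1%} of total sales")
```

With the real file, replacing raw_file with an opened file handle should be enough, provided the layout assumption holds.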
              


Step 3 : Create a function prune_dataset, which will help us reduce the size of our dataset based on our requirements. The function should perform pruning based on the percentage of total sales. For example, the function call would look like this:

              output_df, item_counts = prune_dataset(input_df=grocery_df, length_trans=2, total_sales_perc=0.4)
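One possible shape for such a function is sketched below. It is not the reference solution, and it rests on several assumptions: input_df is taken to be one-hot encoded (one row per transaction, one column per item, values 0 or 1), total_sales_perc is read as "keep the best-selling items until their cumulative share of sales reaches this fraction", and length_trans is read as the minimum number of kept items a transaction must contain. The toy grocery_df is made up for illustration.

```python
import pandas as pd


def prune_dataset(input_df, length_trans=2, total_sales_perc=0.5):
    """Sketch: keep top-selling items and sufficiently long transactions.

    Assumes input_df is one-hot encoded (rows = transactions,
    columns = items, values 0/1).
    """
    # Sales per item, best sellers first, and their cumulative share.
    counts = input_df.sum().sort_values(ascending=False)
    cum_share = counts.cumsum() / counts.sum()

    # Keep items until the cumulative share first reaches the target.
    n_keep = min(int((cum_share < total_sales_perc).sum()) + 1, len(counts))
    selected = counts.index[:n_keep]

    # Keep transactions containing at least length_trans selected items.
    pruned = input_df[selected]
    pruned = pruned[pruned.sum(axis=1) >= length_trans]

    # Summary of the kept items.
    item_counts = counts.iloc[:n_keep].to_frame("item_count")
    item_counts["cum_share"] = cum_share.iloc[:n_keep]
    return pruned, item_counts


# Toy one-hot data, made up for illustration.
transactions = [
    ["bread", "milk", "butter"],
    ["bread", "milk", "eggs"],
    ["bread", "milk", "butter"],
    ["bread", "milk"],
    ["bread", "butter", "eggs", "jam"],
]
items = sorted({item for t in transactions for item in t})
grocery_df = pd.DataFrame(
    [[int(item in t) for item in items] for t in transactions], columns=items
)

output_df, item_counts = prune_dataset(
    input_df=grocery_df, length_trans=2, total_sales_perc=0.4
)
print(output_df)
print(item_counts)
```

On the toy data, bread and milk together cover 60% of sales, so they are the only items kept, and the one transaction without milk is dropped.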

              


Step 4 : We need to specify two pieces of information for generating our rules: support and confidence. We have already defined both of them conceptually earlier (on this web page itself), so we will not define them again. An important tip is to start with a higher support, as a lower support means a higher number of frequent itemsets and hence a longer execution time.
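To make the effect of the two thresholds concrete, here is a minimal pure-Python sketch of an Apriori-style search followed by rule generation. It is a teaching sketch on made-up data, not the library routine the notebook may use; the function names frequent_itemsets and association_rules are chosen here for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Level-wise (Apriori-style) search. Returns {itemset: support}."""
    tsets = [frozenset(t) for t in transactions]
    n = len(tsets)
    candidates = {frozenset([item]) for t in tsets for item in t}
    result = {}
    k = 1
    while candidates:
        # Support = fraction of transactions that contain the candidate.
        level = {}
        for cand in candidates:
            support = sum(cand <= t for t in tsets) / n
            if support >= min_support:
                level[cand] = support
        result.update(level)
        # Join frequent k-itemsets into (k+1)-candidates and keep only
        # those whose every k-subset is frequent (Apriori pruning).
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return result


def association_rules(itemsets, min_confidence):
    """Returns (antecedent, consequent, support, confidence) tuples."""
    rules = []
    for itemset, support in itemsets.items():
        for r in range(1, len(itemset)):
            for lhs in combinations(itemset, r):
                lhs = frozenset(lhs)
                # confidence(lhs -> rhs) = support(itemset) / support(lhs)
                confidence = support / itemsets[lhs]
                if confidence >= min_confidence:
                    rules.append(
                        (set(lhs), set(itemset - lhs), support, confidence)
                    )
    return rules


# Toy transactions, made up for illustration.
transactions = [
    ["bread", "milk", "butter"],
    ["bread", "milk", "eggs"],
    ["bread", "milk", "butter"],
    ["bread", "milk"],
    ["bread", "butter", "eggs", "jam"],
]

high = frequent_itemsets(transactions, min_support=0.6)
low = frequent_itemsets(transactions, min_support=0.4)
print(len(high), len(low))  # lowering support yields more itemsets

for rule in association_rules(high, min_confidence=0.8):
    print(rule)
```

Even on five transactions, lowering the support from 0.6 to 0.4 nearly doubles the number of frequent itemsets, which is why starting with a higher support keeps execution time manageable on the full dataset.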

             


jupyter notebook link  

