Association Analysis in a Nutshell

In this post, I will explain what association analysis is used for and how it can be applied to real-life problems. Let’s begin by defining association analysis.

What is Association analysis?

In short, association analysis is used to discover how input variables are associated with outputs, that is, the relationships between them. Inputs are termed “antecedents” and outputs are termed “consequents”.

A common application of association analysis is market basket analysis, which is used to determine which items consumers frequently purchase together. Using the results, a store can create new discounts or bundles, or even change its layout, to increase sales of targeted products.

An example of market basket transactions is as follows:

Itemset 1: {jam,ham,bread}

Itemset 2: {jam,milk,bread}

Itemset 3: {milk,rice,jam}

By running an Association analysis on the Market basket transactions, the analyst can obtain various relationships between the items a customer buys. For example, jam -> bread (If a customer buys jam, he/she may buy bread).

One of the most commonly used algorithms for Association analysis is the Apriori algorithm. The Apriori algorithm generates association rules in the form of antecedents and consequents, as mentioned above.

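A rule is written X -> Y, where X is the antecedent and Y is the consequent. Two measures describe a rule: the “support”, which is how often the items occur in the data, and the “confidence”, which is how often Y occurs in the transactions where X occurs.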

However, unlike usual logical rules, association rules involve some level of uncertainty. To quantify this uncertainty, we can apply the Support and Confidence Framework.

The framework combines the rule support, which is the proportion of transactions in which X and Y appear together, with the confidence, which is the proportion of transactions containing X that also contain Y.

Rule Support = P(X and Y occurring together)

Confidence = P(X and Y) / P(X) = P(Y | X)
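
To make the two formulas concrete, here is a minimal Python sketch that computes them for the rule jam -> bread on the three example baskets from earlier in this post. The function names `support` and `confidence` are my own for illustration and do not come from any particular library.

```python
# The three example baskets from this post, one set per transaction.
transactions = [
    {"jam", "ham", "bread"},
    {"jam", "milk", "bread"},
    {"milk", "rice", "jam"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """P(X and Y) / P(X): how often Y appears among transactions with X."""
    joint = support(set(antecedent) | set(consequent), transactions)
    return joint / support(antecedent, transactions)


rule_support = support({"jam", "bread"}, transactions)
rule_confidence = confidence({"jam"}, {"bread"}, transactions)
print(f"jam -> bread: support={rule_support:.2f}, confidence={rule_confidence:.2f}")
```

Jam and bread appear together in two of the three baskets, and jam appears in all three, so both the rule support and the confidence come out at about 0.67.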

Additionally, the Apriori algorithm works best with categorical data in a tabular or transactional format. It does not work well with numeric data. For that, we would have to bin the numeric data or otherwise convert it into categories, which I will not cover in detail in this post.
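
Just to give a flavour of what binning might look like, here is a small sketch in plain Python. The spend amounts, the bin edges (20 and 50) and the labels are all made up for illustration; in practice the edges would be chosen to suit the data.

```python
def bin_value(value, edges, labels):
    """Return the label of the first bin whose upper edge is >= value.

    `labels` must have one more entry than `edges`; values above the
    last edge fall into the final bin.
    """
    for upper, label in zip(edges, labels):
        if value <= upper:
            return label
    return labels[-1]


# Hypothetical spend per customer, turned into categorical "items".
spend = [12.5, 35.0, 80.0]
edges = [20, 50]
labels = ["low", "medium", "high"]

binned = [f"spend={bin_value(v, edges, labels)}" for v in spend]
print(binned)
```

Each numeric value becomes a category such as `spend=low`, which can then be treated like any other item in a basket.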

Tabular data format, also known as truth-table or basket data, uses one flag field per item to indicate whether that item is present (T) or absent (F), as seen in the table below.

ID      Item 1  Item 2  Item 3
Cust 1  T       T       T
Cust 2  F       F       T
Cust 3  T       F       F

Unlike the tabular data format, the transactional format has a separate record for each transaction or item as seen in the table below.

ID      Items
Cust 1  A
Cust 2  B
Cust 3  C
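
The two formats hold the same information, so one can be converted into the other. The sketch below assumes transactional records arrive as (customer, item) pairs and pivots them into tabular flags. The records reuse the jam/ham/bread baskets from earlier, and `to_tabular` is a hypothetical helper name, not a library function.

```python
# Transactional format: one (customer, item) record per purchased item.
records = [
    ("Cust 1", "jam"), ("Cust 1", "ham"), ("Cust 1", "bread"),
    ("Cust 2", "jam"), ("Cust 2", "milk"), ("Cust 2", "bread"),
    ("Cust 3", "milk"), ("Cust 3", "rice"), ("Cust 3", "jam"),
]


def to_tabular(records):
    """Pivot (customer, item) records into one True/False flag per item."""
    items = sorted({item for _, item in records})
    baskets = {}
    for cust, item in records:
        baskets.setdefault(cust, set()).add(item)
    table = {
        cust: {item: item in basket for item in items}
        for cust, basket in baskets.items()
    }
    return items, table


items, table = to_tabular(records)

# Print the result using the same T/F flags as the tabular example.
print("ID".ljust(8) + "".join(item.ljust(7) for item in items))
for cust, flags in table.items():
    row = "".join(("T" if flags[item] else "F").ljust(7) for item in items)
    print(cust.ljust(8) + row)
```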

Thus, by applying the Apriori algorithm, we can generate rules based on user-specified minimum support and confidence percentages. These act as thresholds: only rules that meet both are created.
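
As a rough illustration of how the thresholds drive rule generation, here is a compact, unoptimised sketch of the Apriori idea in plain Python. It is a teaching sketch under my own naming (`apriori`, `generate_rules`, `min_support`, `min_confidence`), not the implementation of any specific tool, and the threshold values are chosen only for the example.

```python
from itertools import combinations

transactions = [
    {"jam", "ham", "bread"},
    {"jam", "milk", "bread"},
    {"milk", "rice", "jam"},
]


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    n = len(transactions)

    def supp(itemset):
        return sum(1 for t in transactions if itemset <= t) / n

    # Level 1: single items that are frequent on their own.
    all_items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in all_items}
    current = {c for c in current if supp(c) >= min_support}

    frequent = {}
    k = 1
    while current:
        for c in current:
            frequent[c] = supp(c)
        k += 1
        # Join: combine frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune: a candidate survives only if all its (k-1)-subsets are frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in frequent for s in combinations(c, k - 1))
        }
        current = {c for c in candidates if supp(c) >= min_support}
    return frequent


def generate_rules(frequent, min_confidence):
    """Split each frequent itemset into X -> Y and keep confident rules."""
    rules = []
    for itemset, s in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in combinations(itemset, r):
                antecedent = frozenset(antecedent)
                conf = s / frequent[antecedent]
                if conf >= min_confidence:
                    rules.append((set(antecedent), set(itemset - antecedent), s, conf))
    return rules


frequent = apriori(transactions, min_support=0.6)
rules = generate_rules(frequent, min_confidence=0.8)
for x, y, s, c in rules:
    print(f"{sorted(x)} -> {sorted(y)}  support={s:.2f}  confidence={c:.2f}")
```

With these thresholds, only bread -> jam and milk -> jam pass. Lowering `min_confidence` lets more rules through, for example jam -> bread, which has a confidence of about 0.67.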

However, not all rules with high support and confidence values are useful. For example, if nearly all customers buy jam and almost all customers buy bread, the confidence will be high regardless of whether there is any real association between the two items.

There are also alternative measures that can be used to evaluate association rules. These include:

  • Confidence Difference
  • Confidence Ratio
  • Information Difference
  • Normalised Chi-Square
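
To show why such measures help, here is a small sketch of the first two. It assumes the definitions as I understand them from IBM SPSS Modeler's Apriori node, where the "prior confidence" is simply the support of the consequent: confidence difference is the absolute gap between the rule's confidence and the prior, and confidence ratio is 1 minus the ratio of the two (inverted if it exceeds 1). Other tools may define these differently, and the probabilities below are made-up numbers, not real data.

```python
def confidence_difference(rule_confidence, prior_confidence):
    """Absolute gap between P(Y | X) and P(Y). Assumed definition."""
    return abs(rule_confidence - prior_confidence)


def confidence_ratio(rule_confidence, prior_confidence):
    """1 minus the confidence/prior ratio, inverted if above 1. Assumed definition.

    The prior is the support of the consequent, which is non-zero for
    any rule that was actually found in the data.
    """
    ratio = rule_confidence / prior_confidence
    if ratio > 1:
        ratio = 1 / ratio
    return 1 - ratio


# Made-up case: 90% buy jam, 90% buy bread, and the two are unrelated,
# so 81% buy both. Confidence looks high, but the rule tells us nothing new.
p_jam, p_bread, p_both = 0.9, 0.9, 0.81
conf_unrelated = p_both / p_jam
diff_unrelated = confidence_difference(conf_unrelated, p_bread)
ratio_unrelated = confidence_ratio(conf_unrelated, p_bread)

# Made-up contrast: the rule's confidence (0.8) is twice the prior (0.4).
diff_related = confidence_difference(0.8, 0.4)
ratio_related = confidence_ratio(0.8, 0.4)

print(f"unrelated: confidence={conf_unrelated:.2f}, "
      f"difference={diff_unrelated:.2f}, ratio={ratio_unrelated:.2f}")
print(f"related:   confidence=0.80, "
      f"difference={diff_related:.2f}, ratio={ratio_related:.2f}")
```

Both measures come out near zero for the unrelated case despite its high confidence, while the second rule scores well on both. Information Difference and Normalised Chi-Square are left out of this sketch.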

Well, that pretty much sums up association analysis. What would you apply it to?
