Unit 5: Cluster Analysis

Cluster analysis A cluster is a collection of data objects that are similar to one another within the same cluster and are dissimilar to the objects in other clusters. The process of grouping a set of physical or abstract objects into classes of similar objects is called clustering. Clustering techniques apply when there is no […]

Unit 4: Association Analysis

Introduction Many business enterprises accumulate large quantities of data from their day-to-day operations. For example, huge amounts of customer purchase data are collected daily at the checkout counters of grocery stores. Table 1 illustrates an example of such data, commonly known as market basket transactions. Each row in this table corresponds to a transaction, which […]

Unit 3: Classification

3.1 Introduction Classification is the process where a model or classifier is constructed to predict categorical labels of unknown data. Classification problems aim to identify the characteristics that indicate the group to which each case belongs. This pattern can be used both to understand the existing data and to predict how new instances will behave. […]