Unit 5: Cluster Analysis

Cluster analysis: A cluster is a collection of data objects that are similar to one another and dissimilar to the objects in other clusters. The process of grouping a set of physical or abstract objects into classes of similar objects is called clustering. Clustering techniques apply when there is no […]

Unit 4: Association Analysis

Introduction: Many business enterprises accumulate large quantities of data from their day-to-day operations. For example, huge amounts of customer purchase data are collected daily at the checkout counters of grocery stores. Table 1 illustrates an example of such data, commonly known as market basket transactions. Each row in this table corresponds to a transaction, which […]

Unit 3: Classification

3.1 Introduction: Classification is the process of constructing a model, or classifier, to predict the categorical labels of unknown data. Classification problems aim to identify the characteristics that indicate the group to which each case belongs. The resulting pattern can be used both to understand the existing data and to predict how new instances will behave. […]

Unit 2: Data Preprocessing

2.1 Data Types and Attributes: Data are known facts that can be recorded and that have implicit (embedded) meaning. Examples include the names, eye colors, telephone numbers, and addresses of the students in a class. A data set is therefore a collection of data objects and their attributes. A collection of attributes describes an object. Attribute […]

Unit 1: Introduction | Data Mining and Data Warehousing

Data Mining: Databases today can range in size into the terabytes (more than 1,000,000,000,000 bytes of data). Within these masses of data lies hidden information of strategic importance. But when there is such an enormous amount of data, how can we draw any meaningful conclusions? We are data rich, but information poor. The abundance of […]

IT 305: Object-Oriented Database Management | BIM | Solution

Very Short Questions: 1. Define Relational Data Model. The relational data model (RDM) represents the database as a collection of relations, or tables, that deal with data, not objects. 2. Define Parameter and Methods. Parameters are the passed values that are reused, and methods define functionality that acts on the parameters when they are available. 3. […]