COURSE OUTLINE
1. Introduction
1.1. Motivation
1.2. Data mining – applicability
1.3. Patterns that can be mined
1.4. Interestingness of patterns
1.5. Classification of data mining systems
1.6. Major issues in data mining
2. Preliminaries
2.1. Fourier analysis
2.2. Wavelets
2.3. Karhunen-Loève transform and Singular Value Decomposition
3. Data preprocessing
3.1. Necessity of data preprocessing
3.2. Data cleaning
3.3. Data integration and transformation
3.4. Data reduction
3.4.1. Dimensionality reduction
3.4.2. Data compression
3.5. Feature extraction
3.6. Data discretization and concept hierarchy generation
4. Data warehouse and OLAP technology
4.1. A multidimensional data model
4.2. Data cubes
5. Characterization and comparison
5.1. Data generalization and summarization-based characterization
5.2. Discriminating between different classes
5.3. Mining class comparisons
5.4. Mining descriptive statistical measures in large databases
5.4.1. Central tendency
5.4.2. Dispersion of data
5.5. Statistical tests of significance
5.6. Hypothesis evaluation
5.7. Correlation analysis
6. Mining frequent patterns, associations and correlations
6.1. Association rule mining
6.1.1. Apriori algorithm and its extensions
6.2. Single-dimensional and multi-dimensional association rules
6.3. From association analysis to correlation analysis
6.4. Constraint-based association mining
7. Classification and prediction
7.1. Definitions
7.2. Classification and Regression Trees (CART)
7.3. Bayesian classification and Bayesian belief networks
7.4. Neural networks
7.5. Other classification methods
7.5.1. Support Vector Machines
7.5.2. K-nearest neighbor classifiers
7.5.3. Genetic algorithms
7.6. Prediction
8. Cluster analysis
8.1. Partitioning methods (K-means, K-medoids)
8.2. Hierarchical methods (BIRCH, CURE, ROCK)
8.3. Density-based methods (DBSCAN)
8.4. Grid-based methods (STING)
8.5. Model-based clustering
8.6. Clustering high-dimensional data
8.7. Outlier analysis
9. Mining time series, streams and sequence data
9.1. Mining time series
9.2. Mining data streams
9.3. Mining sequence patterns in biological data
10. Searching by content
10.1. Approximate nearest-neighbor queries
10.2. Database indexing methods
10.2.1. Primary key access methods
10.2.2. Secondary key access methods
10.2.3. Spatial access methods
10.2.4. Access methods for text
10.2.5. Indexing signals
10.2.6. Fractals in databases
11. Mining complex types of data
11.1. Multidimensional analysis
11.2. Spatial data mining
11.3. Multimedia data mining
11.4. Text mining
11.5. Graph mining
11.5.1. Mining frequent subgraphs, similarity searches, clustering
11.5.2. Social network analysis
11.6. Mining the World Wide Web
11.7. Visualization