Xiaojing Liao
  • Home
  • Publications
  • Teaching
  • Student
  • Contact
CSCI-B 365 Data Analysis and Mining

This course serves as an introduction to Data Analysis and Data Mining, in which we extract knowledge and understanding from data in algorithmic and visual ways. We will learn about probability as a language that supports and unifies our understanding of data. We will build models using probability and turn these models into algorithms for data analysis. The aim is for students to master some basic probabilistic grounding, learn several popular algorithms, and develop experience thinking critically about the overall process of understanding and interpreting data.
Course information
Instructor: Xiaojing Liao (xliao@indiana.edu)
Time: Monday, Wednesday 4:00 pm - 5:15 pm
Place: Info East 150
Office hours: Tuesday, Thursday 4:00 pm - 5:00 pm
Textbook
<Principles of Data Mining> by Max Bramer (e-copy available in the IU Library)

The following books are recommended reading:
[1] <Introduction to Data Mining> by Pang-Ning Tan, Michael Steinbach, and Vipin Kumar
[2] <Data Mining: Concepts and Techniques> by Jiawei Han and Micheline Kamber
| Week | Date | Agenda | Reading | HW |
|---|---|---|---|---|
| Week 1 | 1/7 | Course overview / Data representation | Syllabus / Textbook chapter 2 | |
| | 1/9 | Data representation / Probability | An Introduction to R / Probability terminology | |
| Week 2 | 1/14 | Probability | Conditional probability, chapter 2 | |
| | 1/16 | Probability | Confidence Interval / Conditional probability, chapter 3 | HW1 |
| Week 3 | 1/21 | No class (MLK Jr. Day) | | |
| | 1/23 | Probability | Conditional probability, chapters 3 & 6 | |
| Week 4 | 1/28 | Probability | Conditional probability, chapter 7 | HW2 |
| | 1/30 | No class (severe winter weather) | | |
| Week 5 | 2/4 | Probability | Conditional probability, chapter 7 | |
| | 2/6 | Classification | Textbook chapter 3.1 | |
| Week 6 | 2/11 | Classification | Textbook chapter 3.2 | HW3 |
| | 2/13 | Classification | Textbook chapter 3.2 | |
| Week 7 | 2/18 | Classification | Textbook chapter 3.3 | |
| | 2/20 | Classification | Textbook chapter 3.3/7.2 | HW4 |
| Week 8 | 2/25 | No class (travel for conference) | | |
| | 2/27 | No class (travel for conference) | | |
| Week 9 | 3/4 | Classification | Textbook chapter 4.1/5.3; Intro to Tree Classifier | |
| | 3/6 | Classification | Textbook chapter 9.4; Intro to Tree Classifier | |
| Week 10 | 3/11 | Spring Break | | |
| | 3/13 | Spring Break | | |
| Week 11 | 3/18 | Classification | Textbook chapter 9.2/9.4 | |
| | 3/20 | Regression | Linear regression, pages 3-4 | |
| Week 12 | 3/25 | Regression | Linear regression, pages 3-4 | HW5 |
| | 3/27 | Regression & Linear Algebra | Linear regression, pages 4-7 | |
| Week 13 | 4/1 | Regression & Linear Algebra | Linear regression, pages 4-7 | |
| | 4/3 | Regression | Linear regression, pages 7-11 | HW6 |
| Week 14 | 4/8 | Guest Lecture by Wen Chen | | |
| | 4/10 | Regression | Linear regression, pages 14-15 | |
| Week 15 | 4/15 | Regression | | HW7 |
| | 4/17 | Regression | | |
| Week 16 | 4/22 | Clustering | Textbook chapter 19.1/19.2 | |
| | 4/24 | Final exam review and wrap up | | |
| | 4/29 | Final Exam (2:45 pm - 4:45 pm) | | |

Prerequisites
Students should have basic programming skills such as would be acquired through CSCI-C 200, C-211 or INFO-I 210.
Grading
Homework: 40%
In-class midterm exams: 25%
Final exam: 35%
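
As an illustration only, the weights above can be combined into a single weighted score. The sketch below assumes each component is already averaged to a 0-100 scale; the function name `course_score` and the example scores are hypothetical, and the syllabus does not state how letter grades are derived from the result.

```python
# Weights taken from the grading breakdown in this syllabus.
WEIGHTS = {"homework": 0.40, "midterm": 0.25, "final": 0.35}


def course_score(scores):
    """Return the weighted course score.

    `scores` maps each component name to a score on a 0-100 scale
    (an assumption made for this sketch, not a stated course rule).
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical component scores, for illustration only.
example = {"homework": 90.0, "midterm": 80.0, "final": 85.0}
print(round(course_score(example), 2))  # 0.40*90 + 0.25*80 + 0.35*85 = 85.75
```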
Others
Laptop policy: Laptops and tablets may be used in class only for taking notes.
Academic accommodation: It is the policy of Indiana University Bloomington to accommodate students with disabilities and qualifying diagnosed conditions in accordance with federal and state laws. Any student who feels s/he may need an accommodation based on the impact of a learning, psychiatric, physical, or chronic health diagnosis should contact the Office of Disability Services for Students (DSS) to determine if accommodations are warranted and to obtain an official letter of accommodation.
Honor code: Students are required to follow the Honor System of Indiana University Bloomington.