Lectures
The readings consist mainly of chapters from the textbook, supplemented with sections from other reference books (cited by the first author's last name) and papers.
Date |
Topics and Slides/Notes |
Readings/Hw |
01/17 |
Snow day |
01/22 |
Introduction |
Ch 1 |
01/24 |
Data exploration |
Ch 2 |
01/29 |
Dimension reduction - PCA and SVD |
MMDS Ch 11 |
01/31 |
Frequent Itemset Mining |
Ch 6
Hw1 |
02/05 |
02/07 |
Truth discovery using crowdsourced data (Daniel Garcia-Ulloa) |
VLDB '17 |
02/12 |
Classification: basic concepts |
Ch 8
Hw2 |
02/14 |
02/19 |
02/21 |
Classification: advanced methods |
Ch 9
Ng Lecture Notes
Linear Regression Notes |
02/26 |
02/28 |
03/05 |
Clustering: concepts and methods |
Ch 10
Hw3 |
03/07 |
03/19 |
03/21 |
Midterm exam (review) |
|
03/26 |
Clustering: concepts and methods |
Ch 11 |
03/28 |
04/02 |
Similarity search |
MMDS Ch 3 |
04/04 |
04/09 |
Recommender Systems |
MMDS Ch 9 |
04/11 |
04/16 |
Privacy Preserving Data Mining |
|
04/18 |
Making sense of high-dimensional health data (Prof. Joyce Ho) |
|
04/23 |
Project workshop 1: Culture, Social, Sports
1. What's cooking, Shen Gao, Tongjia Mou, Zhengzhe Yang
2. What's cooking, Yusi (Heather) Zhou and Zhu (Bambi) Zhang
3. World happiness report, Tru Powell, Phil Edwards
4. Titanic survival, Phillip Gorbunov
5. Titanic survival, Claudia Cooperstein & Julie Wiegel
6. Energy “Forecasting” for Individual Buildings via Black-Box SVM Approach, Asher Mouat
7. Pump it up: data mining the water table, Kate Li, Joshua McEnroe, Weixing Tang
8. Predictive Analysis of NCAA March Madness Results, Christopher Tseng and Yibo Wang
9. Kaggle March madness competition, Tim Davidson, Eli Zelle, Aditya Maheshwari
10. MLB win prediction, Sean Jones
11. Optimizing Baseball Performance and Player Salary
|
|
04/25 |
Project workshop 2: Politics, Business
12. Predicting bike usage using neural networks, Patrick Wang
13. Airline delays, Christopher Dale
14. Otto group product classification challenge, Yicheng (Jason) Wang, Chenxiao Wang, Axel Chauvin
15. Book genre classification, Ramzi Daswani
16. Game sales prediction, Ningyuan Jiang
17. Movie based recommender systems, Mia Schoening
18. Movie Ratings with Genre and Profiles, Mickeal Prince, Connie Song, Liv Wang
19. Spool: discover and listen to music everyone likes, Tyler Angert, Angela He, Rahul Nair
20. Using Behavioral and Transactional Data to Optimize Two-sided Online Marketplace Offerings, Adam Sanders
21. TalkingData AdTracking Fraud Detection Challenge, Colin Jiang, Justin Luo, Michael Ly
22. Stock price prediction using neural networks and sentiment analysis, Rohan Bansal, Alvin Choi
23. 2016 voter data analysis, Jamelia Mays, Morgan Roberts, Priya Elangovan
|
|
04/30 |
Project workshop 3: Foundations, medicine, multimedia
24. Ensemble learning, Yuzhang Guo, Zhangyi Pan, Zigeng Zhu
25. Effects of learning rates on convergence of gradient descent, Yijun Dong
26. Mortality prediction using WHO dataset, Alex Smadja
27. Cancer type prediction using sequencing data (cBioPortal), Donald Li, Peter Zheng and Yifeng Xu
28. Prediction of Dengue fever epidemics, Xinyi Jiang, Wenying Zhu
29. Sentence clustering using structural features, James Martin, Michelle Menzies, Quintin Kauchick
30. Handwriting recognition and font changer, Tony Dongwoo Kang
31. Face classification, Almas Myrzatay
32. Speaker recognition, Tammany Grant, Lindsay Hexter, Jake Cronin
|
|
05/07 |
Final exam (8am) (review) |
|
|