The readings consist mainly of chapters from the textbook, supplemented with sections from other reference books (cited by the last name of the first author) and papers.

Date Topics and Slides/Notes Readings/Hw
01/17 Snow day
01/22 Introduction Ch 1
01/24 Data exploration Ch 2
01/29 Dimension reduction - PCA and SVD MMDS Ch 11
01/31 Frequent Itemset Mining Ch 6
02/07 Truth discovery using crowdsourced data (Daniel Garcia-Ulloa) VLDB '17
02/12 Classification: basic concepts Ch 8
02/21 Classification: advanced methods Ch 9, Ng Lecture Notes, Linear Regression Notes
03/05 Clustering: concepts and methods Ch 10
03/21 Midterm exam (review)
03/26 Clustering: concepts and methods Ch 11
04/02 Similarity search MMDS Ch 3
04/09 Recommender Systems MMDS Ch 9
04/16 Privacy Preserving Data Mining
04/18 Making Sense of High-dimensional health data (Prof. Joyce Ho)  
04/23 Project workshop 1: Culture, Social, Sports

1. What's cooking, Shen Gao, Tongjia Mou, Zhengzhe Yang

2. What's cooking, Yusi (Heather) Zhou and Zhu (Bambi) Zhang

3. World happiness report, Tru Powell, Phil Edwards 

4. Titanic survival, Phillip Gorbunov

5. Titanic survival, Claudia Cooperstein & Julie Wiegel

6. Energy “Forecasting” for Individual Buildings via Black-Box SVM Approach, Asher Mouat

7. Pump it up: data mining the water table, Kate Li, Joshua McEnroe, Weixing Tang

8. Predictive Analysis of NCAA March Madness Results, Christopher Tseng and Yibo Wang

9. Kaggle March madness competition, Tim Davidson, Eli Zelle, Aditya Maheshwari

10. MLB win prediction, Sean Jones

11. Optimizing Baseball Performance and Player Salary

04/25 Project workshop 2: Politics, Business

12. Predicting bike usage using neural networks, Patrick Wang

13. Airline delays, Christopher Dale

14. Otto group product classification challenge, Yicheng (Jason) Wang, Chenxiao Wang, Axel Chauvin

15. Book genre classification, Ramzi Daswani

16. Game sales prediction, Ningyuan Jiang

17. Movie based recommender systems, Mia Schoening

18. Movie Ratings with Genre and Profiles, Mickeal Prince, Connie Song, Liv Wang

19. Spool: discover and listen to music everyone likes, Tyler Angert, Angela He, Rahul Nair

20. Using Behavioral and Transactional Data to Optimize Two-sided Online Marketplace Offerings, Adam Sanders

21. TalkingData AdTracking Fraud Detection Challenge, Colin Jiang, Justin Luo, Michael Ly

22. Stock price prediction using neural networks and sentiment analysis, Rohan Bansal, Alvin Choi

23. 2016 voter data analysis, Jamelia Mays, Morgan Roberts, Priya Elangovan
04/30 Project workshop 3: Foundations, medicine, multimedia

24. Ensemble learning, Yuzhang Guo, Zhangyi Pan, Zigeng Zhu

25. Effects of learning rates on convergence of gradient descent, Yijun Dong

26. Mortality prediction using WHO dataset, Alex Smadja

27. Cancer type prediction using sequencing data (cBioPortal), Donald Li, Peter Zheng and Yifeng Xu

28. Prediction of Dengue fever epidemics, Xinyi Jiang, Wenying Zhu

29. Sentence clustering using structural features, James Martin, Michelle Menzies, Quintin Kauchick

30. Handwriting recognition and font changer, Tony Dongwoo Kang

31. Face classification, Almas Myrzatay

32. Speaker recognition, Tammany Grant, Lindsay Hexter, Jake Cronin
05/07 Final exam (8am) (review)