Business Analytics 2 is a follow-on to Business Analytics that I designed and taught for the first time in the Fall of 2018; I designed XLKitLearn in parallel, and use it extensively in the class. Within 6 months, this class grew to become a "staple elective" - I teach 6-7 sections of the class per year, and demand far outstrips supply.
In my time as a consultant, I noticed that the strongest indicator that a data-science project would succeed was strong involvement by managers, often non-technical, who were deeply embedded in the business. These subject-matter experts were an essential bridge between the analytics we did and the realities of the business. Others have made similar observations, most notably McKinsey in this 2018 article. My aim in this class is to equip our students to become these analytics translators, and to empower them to function effectively in an increasingly data-driven world.
The syllabus for the class can be found here, and this page contains a more detailed outline listing the topics covered in the class. The class comprises four modules - all cases described below are listed on the research and cases page.
- Introduction reviews linear regression (including interaction variables, dummy variables, and omitted variable bias), followed by an introduction to the bias-variance tradeoff and K-fold cross-validation.
- Powerful Predictions covers the basics of decision trees (CART in particular), boosting, bagging, boosted trees, and random forests. I teach these concepts in the context of two cases. The LendingClub case guides students through using random forests to predict which loans will produce maximum returns, and the USPS case introduces students to computer vision using random forests.
- Data Visualization in Tableau introduces Tableau using data made available by Citibike on bikeshare trips taken in New York City. I am grateful to Tableau for making a student version of their software available for this part of the course.
- Text Analytics covers the fundamentals of text analytics, including the bag-of-words representation, sentiment analysis, and Latent Dirichlet Allocation. I illustrate these concepts in the context of the Evisort case.
I set myself the challenge of writing the entire class using only real datasets - no synthetic or generated data.
As well as XLKitLearn, I use "reversed classroom videos" in this class. Specifically, any time I do a practical exercise with data in class, I upload a corresponding pre-recorded video guiding students through the same exercise. This avoids the all-too-common problem that arises when covering this kind of technical material live, in which different people in the class move at different speeds. With the videos, students can each follow the exercise in their own time and at their own pace.