"Data is the new oil..."
You might have heard this saying or a similar phrase before. Big Data, Analytics, Data Science, Artificial Intelligence, Machine Learning: many 'colorful' terms refer to the increasing use of analytical models that aim to extract insight from the vast amounts of data that the digital society is producing.
The module Business Analytics and Data Science (BADS) is concerned with theories, concepts, and practices to inform and support decision-making by means of formal, data-driven methods. We will revisit different forms of model-based decision support, examine the standard workflow of modern data analysis, and discuss a broad set of models for descriptive and predictive analytics. Predictive modeling is the main focus of the course. Many corporate use cases of analytics and data science involve the prediction of some future state or behavior, for example, the way in which customers will respond to certain marketing stimuli. We will introduce statistical principles of learning from data and cover several common prediction methods, ranging from established industry workhorses like logistic regression to state-of-the-art machine learning algorithms such as gradient boosting and heterogeneous ensembles. Subsequently, we will dive into specific tasks in the predictive modeling pipeline, such as feature selection or remedies to the class imbalance problem.
Given the variety of specialized modeling tasks and challenges, we will focus on topics with high relevance to the managerial decision problem. Cost sensitivity is a good example. Predictions may, and typically will, be inaccurate. When building a predictive model to guide managerial decision-making, different types of errors are often associated with different costs. How can we make our analytical models aware of error costs? Beyond error costs, what is a good approach to judging the adequacy of an analytical model in business applications?
The second half of the course will concentrate on specific management problems in marketing and credit risk analytics to equip students with a solid understanding of the interdependencies between methods (e.g., a machine learning algorithm) and their applications in business.
The module consists of a lecture and a tutorial session. The lecture introduces relevant concepts and provides room for discussion. The goal of the tutorial is to empower students to develop state-of-the-art analytical models using contemporary programming libraries for data science. Specifically, we will use the Python programming language. Students receive demos on how to implement specific algorithms from scratch and work with real-world data to solve common modeling tasks themselves. While the final grade for the module will depend on a written exam, students will be required to pass a programming assignment before taking the exam. The tutorial will prepare students for this assignment.
Students do not strictly need prior experience in computer programming to join the course. We reserve the first two weeks of the tutorial to introduce programming principles and the Python programming language. That said, we expect high and continuous engagement with the module in general and the tutorial in particular, including ample time for self-study, to ensure completion of our ambitious learning program. Students who wish to prepare for the course are invited to complete some of the many excellent tutorials on Python programming. A simple web search for "Python programming introduction" produces plenty of results; the corresponding resources on Python.org are another good starting point.
Looking forward to seeing you in BADS.
- Course coordinator: Anna-Lena Bujarek
- Course coordinator: Vincent Bogdan Gurgul
- Course coordinator: Stefan Lessmann