This Basic Machine Learning training is a great fit if you want to learn the basics needed to work as a Data/Machine Learning Engineer or Data Scientist. Participants get the most value out of the training when they have a background in analytics, mathematics and statistics.
During this training you will learn:
After the training you receive a Certificate of Completion.
The Basic Machine Learning training consists of 7 classes, which are spread over a couple of months so that you can get the most out of each one. The content of the classes is connected, so in general we advise attending all of them. If you would like to attend one or more individual classes, contact us so we can advise you on a tailored training course.
Click below to open a detailed description of the class:
Regardless of your OS of choice, knowing how to work with Linux through the command line is a valuable skill for any engineer or scientist. Whether you want to look under the hood of the application you deployed, debug that job you had running on one of those nodes that keep crashing, or simply prepare a dataset that would take longer to download, transform and upload again, being able to use the power of Bash will not only often save you, it will actually speed up your work! As with any power tool, it is of course also very easy to cut off your own foot, so join us on this journey towards getting to know Bash and unlocking its power.
The training includes theory, demos, and hands-on exercises. After this training you will have gained knowledge about:
As Data Scientists and Machine Learning experts spend a considerable amount of time preprocessing data, this topic is a necessary part of their toolkit. In this training we specifically focus on the pandas library, which has grown into one of the main tools for data preprocessing and exploration in Python.
We start off with an introduction to preprocessing, the concept of tidy data and some useful techniques such as pivoting and missing value imputation. Then, we go into the pandas library, its background, data structures, and basic features. In a demo we get to see concrete ways to handle data sets, from loading, subsetting and merging to (re)sampling, applying grouped transformations and saving results.
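To give an impression of these techniques, here is a minimal sketch in pandas. The tables, column names and values are invented for illustration and are not taken from the training material.

```python
import numpy as np
import pandas as pd

# Hypothetical tidy ("long") data: one row per city and month.
readings = pd.DataFrame({
    "city": ["Amsterdam", "Amsterdam", "Amsterdam",
             "Utrecht", "Utrecht", "Utrecht"],
    "month": ["jan", "feb", "mar", "jan", "feb", "mar"],
    "temp": [3.0, np.nan, 7.0, 2.0, 4.0, np.nan],
})

# Missing value imputation: fill each gap with the mean of its own city.
# transform("mean") returns one value per row, aligned with the original index.
city_mean = readings.groupby("city")["temp"].transform("mean")
readings["temp_filled"] = readings["temp"].fillna(city_mean)

# Pivoting: reshape from long to wide, one row per city, one column per month.
wide = readings.pivot(index="city", columns="month", values="temp_filled")

# Merging: attach a second (also hypothetical) table on the shared key.
regions = pd.DataFrame({
    "city": ["Amsterdam", "Utrecht"],
    "province": ["Noord-Holland", "Utrecht"],
})
merged = readings.merge(regions, on="city", how="left")

# Grouped aggregation: summary statistics per province.
summary = merged.groupby("province")["temp_filled"].agg(["mean", "max"])

print(wide)
print(summary)
```

Imputing with the group mean is only one of several possible strategies; which one is appropriate depends on why the values are missing.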
The training includes theory, demos, and hands-on exercises. After this training you will have gained knowledge about:
This training provides a theoretical introduction to the basics of Machine Learning and its different sub-fields, as well as a hands-on way of seeing how it is applied in practice. At the core of this training is the scikit-learn library, one of the most powerful and versatile tools for Machine Learning in Python.
The training includes theory, demos, and hands-on exercises. After this training you will have gained knowledge about:
In this training, we build upon what we have learned previously, and expand our workflow by showing how to optimize prediction models using Parameter Tuning. We discuss how and why to perform Cross-Validation and how to prevent Information Leakage. Bringing everything together, we finally show how to combine multiple steps of a machine learning workflow into Pipelines, thereby making the process more organized, efficient and less error-prone.
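As a rough illustration of how these pieces fit together in scikit-learn, the sketch below wraps scaling and a classifier in a Pipeline and tunes one parameter with cross-validated grid search. The iris dataset, the choice of logistic regression and the grid of `C` values are assumptions made for this example, not necessarily what is used in the class.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Example data (an assumption for this sketch): the built-in iris dataset.
X, y = load_iris(return_X_y=True)

# Hold out a test set that is never seen during tuning.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Putting the scaler inside the pipeline means it is re-fitted on the
# training part of every cross-validation fold, so no statistics from the
# validation part leak into preprocessing.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("model", LogisticRegression(max_iter=1000)),
])

# Parameter tuning: try several regularization strengths with 5-fold CV.
# The "model__" prefix routes the parameter to the pipeline step "model".
param_grid = {"model__C": [0.01, 0.1, 1.0, 10.0]}
search = GridSearchCV(pipeline, param_grid, cv=5)
search.fit(X_train, y_train)

best_C = search.best_params_["model__C"]
test_accuracy = search.score(X_test, y_test)

print("best C:", best_C)
print("held-out accuracy:", round(test_accuracy, 3))
```

Fitting the scaler on the full training set before cross-validating would be a typical example of information leakage; keeping every step inside the pipeline avoids it.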
The training includes theory, demos, and hands-on exercises. After this training you will have gained knowledge about:
In this training, we build upon what we have learned previously and expand our knowledge of how to score machine learning models, discuss common pitfalls and show how to deal with them. We first examine the concepts of bias, variance, overfitting and underfitting. We then dive into important performance metrics for classification problems, such as accuracy, precision, recall, F1 scores and ROC curves, and elaborate on commonly used metrics for regression. This last part of our basic toolkit allows us to properly assess a prediction model that we train to recognize images of handwritten digits during the hands-on lab session.
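To make the classification metrics concrete, the following sketch computes them by hand from a confusion matrix and compares the result with scikit-learn's own functions. The ten labels and predictions are made up for illustration.

```python
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    f1_score,
    precision_score,
    recall_score,
)

# Made-up binary labels and predictions (1 = positive class).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]

# For binary labels, ravel() yields the counts in the order tn, fp, fn, tp.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

# The definitions, written out by hand.
accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)   # of all predicted positives, how many are right
recall = tp / (tp + fn)      # of all actual positives, how many are found
f1 = 2 * precision * recall / (precision + recall)

# The library functions should agree with the manual calculation.
sk_accuracy = accuracy_score(y_true, y_pred)
sk_precision = precision_score(y_true, y_pred)
sk_recall = recall_score(y_true, y_pred)
sk_f1 = f1_score(y_true, y_pred)

print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f}")
```

Note that precision and recall differ here even though accuracy looks reasonable, which is why a single metric is rarely enough to assess a classifier.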
The training includes theory, demos, and hands-on exercises. After this training you will have gained knowledge about:
R has grown into a well-developed ecosystem with powerful packages for data analysis, data visualization, in-depth statistics, time series forecasting and machine learning, to mention a few. This training aims to give a fast-paced introduction to R, its most relevant features and basic workflow, including how to apply them.
We start the training by discussing the basics of the R programming language and the RStudio IDE, to understand its logical operations, data structures, workflow, etc. We then delve into a number of powerful packages such as dplyr, ggplot2, readr and other tidyverse packages and show how they are used for data preprocessing, analysis and visualization.
Finally, we apply these concepts and tools in practice during a hands-on lab session. We implement a complete data analysis workflow in R, from retrieving real-time earthquake data from a web service to preprocessing, analyzing and eventually visualizing this data on an interactive map.
The training includes theory, demos, and hands-on exercises. After this training you will have gained knowledge about:
This training serves as a basic introduction to statistics.
We will first discuss a number of core concepts of statistics, from random variables, probabilities and distributions to expectation values, variance and conditional probabilities. We will show a couple of common distributions and examples to clarify these concepts. Then, we will go into statistical modelling with a focus on linear regression. We conclude with some common metrics for regression and by talking about uncertainties in estimates.
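As a small illustration of linear regression, regression metrics and uncertainty in estimates, the sketch below fits a straight line to synthetic data with NumPy. The true intercept and slope, the noise level and the sample size are all made up for this example.

```python
import numpy as np

# Synthetic data (all numbers are assumptions for this sketch):
# y = 2.0 + 0.5 * x + Gaussian noise with standard deviation 1.
rng = np.random.default_rng(42)
n = 200
x = rng.uniform(0.0, 10.0, size=n)
y = 2.0 + 0.5 * x + rng.normal(0.0, 1.0, size=n)

# Design matrix with a column of ones for the intercept.
A = np.column_stack([np.ones(n), x])

# Ordinary least squares: minimize the sum of squared residuals.
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
intercept, slope = coef

# Common regression metrics.
residuals = y - A @ coef
rmse = np.sqrt(np.mean(residuals ** 2))
r_squared = 1.0 - np.sum(residuals ** 2) / np.sum((y - y.mean()) ** 2)

# Uncertainty of the estimates: the coefficient covariance matrix is
# sigma^2 * (A^T A)^-1, with sigma^2 estimated from the residuals
# using n - 2 degrees of freedom (two fitted parameters).
sigma2 = np.sum(residuals ** 2) / (n - 2)
cov = sigma2 * np.linalg.inv(A.T @ A)
intercept_se, slope_se = np.sqrt(np.diag(cov))

print(f"intercept = {intercept:.3f} +/- {intercept_se:.3f}")
print(f"slope     = {slope:.3f} +/- {slope_se:.3f}")
print(f"RMSE = {rmse:.3f}, R^2 = {r_squared:.3f}")
```

The standard errors shrink as the sample size grows, which is one way to see that more data leads to more certain estimates.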
In the lab exercises we then get to apply these concepts and do some modelling ourselves. The training includes theory, demos, and hands-on exercises. After this training you will have gained knowledge about: