A one-week course in Data Science at Aalborg University, held from August 20 to August 24, 2018, in Aalborg, Denmark. The course is offered by the Department of Mathematical Sciences, Aalborg University.
The software used in this course is R and RStudio. Both are open source and freely available.
The course is intended for individuals with experience in
Note that the course uses R as a programming language, and data science tasks are performed by issuing commands and programming routines. Hence, participants should be willing to program and interested in programming for data science.
R is a programming language available on most operating systems (OS X, Windows and Linux). Data analysis and visualisations in R can be embedded in several data analytical tools (e.g. Microsoft Power BI) and database systems (e.g. Microsoft SQL Server R Services). RStudio is the state-of-the-art IDE for R that integrates scripting, syntax completion, visualisation, support for version control and much more.
After the completion of the course, the participants
tidyverse
ggplot2
lm (also in penalised form using the LASSO, glmnet), support vector machines (svm), naive Bayes classifier (naiveBayes), classification and regression trees (CART, rpart), etc.
princomp for visualisation and modelling
kmeans and hierarchical clustering (hclust), used to produce dendrograms and in heatmap graphics
rmarkdown and knitr
shiny applications and dashboards for interactive demonstrations of data and models through the use of responsive graphics and tables

These skills will enable you to efficiently wrangle your data into a desired format for further analysis. This includes the ability to aggregate, summarise and visualise the data at various steps in the data analysis. As R is a scripting language, you are free from the constraints of a typical spreadsheet program like Excel. The scripts also serve as a transparent and reproducible framework for redoing your analysis over and over again, as well as for reusing essential parts in other analyses. The graphics produced by R, and in particular ggplot2, are used professionally by academics, data visualisation communities and data scientists. Since an intelligent use of graphics can say more than a thousand words, bringing your data, models and insight into a visual format is a key point in data analysis. RStudio makes it easy to integrate your scripts, tables and graphics into an interactive output using either Shiny or dashboards that can be shared with your organisation.
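To give a flavour of what aggregating, summarising and visualising look like in practice, here is a minimal sketch. It is not taken from the course material: it assumes the dplyr and ggplot2 packages are installed, and it uses the built-in mtcars data set as a stand-in for your own data. The names by_cyl and p are chosen purely for illustration.

```r
# Assumption: dplyr and ggplot2 (both part of the tidyverse) are installed.
library(dplyr)
library(ggplot2)

# mtcars ships with R and stands in for "your data" here.
# Aggregate: one row per number of cylinders, with a count and a mean.
by_cyl <- mtcars %>%
  group_by(cyl) %>%
  summarise(
    n_cars   = n(),        # number of cars in each group
    mean_mpg = mean(mpg)   # average fuel economy in each group
  )

print(by_cyl)

# Visualise: fuel economy against weight, coloured by number of cylinders.
p <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point() +
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon", colour = "Cylinders")

print(p)
```

Because the whole analysis is a script, rerunning it on updated data only requires changing the input data frame.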
The topics will provide an essential toolbox for data science and data analytics. Furthermore, R provides a rich ecosystem that eases the workflow for the data scientist, where reproducibility and communication of the analysis are supported through R Markdown, Shiny and interactive dashboards.
The course will be taught in English but with the possibility of getting help and asking questions in Danish. Each day will be divided into lectures and hands-on sessions, where the participants will solve relevant tasks related to the specific topic in R.
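As a rough illustration of a few of the modelling functions named among the course topics (lm, princomp, kmeans and hclust), the sketch below uses only base R and the built-in mtcars and iris data sets. The choice of data, the number of clusters and all object names are assumptions made for this example, not part of the course programme.

```r
# Base R only: lm, princomp, kmeans, hclust and dist come with the stats package.
set.seed(1)  # kmeans picks random starting centres; fix the seed for repeatability

# Linear regression: fuel economy as a function of weight.
fit <- lm(mpg ~ wt, data = mtcars)
print(coef(fit))

# Principal component analysis on the four numeric iris measurements.
num    <- iris[, 1:4]
pca    <- princomp(num)
scores <- pca$scores[, 1:2]  # first two components, e.g. for a 2D plot

# k-means with three clusters (three is an assumption made for this example).
km <- kmeans(num, centers = 3, nstart = 10)
print(table(cluster = km$cluster, species = iris$Species))

# Hierarchical clustering on Euclidean distances (hclust defaults to complete linkage).
hc <- hclust(dist(num))
plot(hc, labels = FALSE, main = "Dendrogram of iris measurements")
```

The same pattern of fitting a model object and then inspecting or plotting it carries over to the other methods on the list, although those require additional packages.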
Solutions to the exercises and other scripts will be made available to the participants for review and inspiration after each session.
The course will be based on the book “R for Data Science” by Garrett Grolemund and Hadley Wickham. This book covers the fundamental parts of data manipulation and data science. Additional course topics will be covered through the use of free online materials and course notes.
We also invite the participants to bring their own project challenges and data. When time permits, we will provide specific guidance in solving challenges related to these projects and directions for further work.
Associate Professors Mikkel Meyer Andersen, Torben Tvedebrink and Søren Højsgaard.
The course fee is 19,000 DKK (plus VAT) for participants from industry; participants from academia pay half price.
A group discount (3 for 4) is given to registrations from the same organisation (provided that the billing information is the same). Please register individually and we will ensure that the group discount is given.
The fee covers teaching material, food and drinks during the course, and a course dinner on Wednesday evening.
You can register here.
The precise venue and itinerary will be announced soon.
Contact course director Torben Tvedebrink: tvede@math.aau.dk