Course Overview
Thank you for your interest in this course. Your course instructors: Lars Schöbitz & Mian Zhong & Sophia Skorik are looking forward to meet you.
We will meet on Zoom for 10 modules over 17 weeks (see Course Calendar below) at the following times:
- Start: 31st October 2023 - 2 pm to 4:30 pm CET
- End: 20th February 2024 - 2 pm to 4:30 pm CET
We will use Posit Cloud infrastructure, so you do not need to install any software. You will hear from us about a week before the course start.
The registration window for this course has closed, but you can fill out the following form to sign up for the next time we host this course: https://forms.gle/MP5rNYZagBdfG2ZRA
Who can participate?
To participate in this course, you need:
- to have a stable internet connection
- to somewhat be connected to the greater Water, Sanitation and Hygiene (WASH) sector (yes, public health, solid waste management, global health engineering, and related topics also count)
- to commit 10 * 2.5 hours to participate in Zoom calls
- to commit another 2 hours/week for readings and additional exercises for practice
- to identify a dataset of your own or your organisation that you are interested to share with the public
- an openess to new ideas and workflows that disrupt current practice
What do we offer?
This course is:
- free
- provides you with a certificate for successful completion
- using exclusively tools that are free and open source
- offers 1:1 coding support between lectures and beyond the course
Course Information
This course provides learners with skills in using the collection of R tidyverse packages as a tool for data analysis, reproducible research and communication. Lectures will be delivered through participatory live coding for students to learn how to write code in code-along exercises. We will use publicly available data related to waste management, air quality, and sanitation. Students will learn how to help themselves using large language models (LLMs) and AI tools Perplexity and build upon the obtained skills to apply them to their data analysis projects.
Topics include:
- The data science life-cycle
- Data organization in spreadsheets
- Exploratory data analysis using visualization
- Using AI for software development in R
- Concept of tidy data and data tidying
- Data transformation and descriptive statistics
- Data communication using the Quarto open-source scientific and technical publishing system
Learning Goals
Be able to use a common set of data science tools (R, RStudio IDE, Git, GitHub, tidyverse, Quarto) to illustrate and communicate the results of data analysis projects.
Learn to use the Quarto file format and the RStudio IDE visual editing mode to produce scholarly documents with citations, footnotes, cross-references, figures, and tables.
Textbooks and Materials
We will rely entirely on open source and open access material for this course. We will use “R for Data Science” by Hadley Wickham, and “Tidyverse Skills for Data Science” by Carrie Wright, Shannon E. Ellis, Stephanie C. Hicks and Roger D. Peng, as complementary reading and learning material for this course. Additional readings will consist of blog posts, journal articles, and reports. All required readings and class material will be provided through this website.
Course Calendar
Weekly Structure
Monday | |
Tuesday | Module from 2 pm to 4:30 pm CET |
Wednesday | |
Thursday | Office hours on Zoom (2 pm to 3:30 pm CET) |
Friday |
Assignments
Homework assignments: Each week will have at least one homework assignment. All assignments, but those for Week 1 are delivered as Quarto documents with instructions and some sample code. Students are required to submit their work through GitHub.
Readings: Every week, additional readings will be provided that support students in learning the underlying concept that are taught during the class.
Capstone Project: A final capstone project provides students with an opportunity to apply their skills and techniques to real-world data sets. Detailed instructions for the capstone project will be provided. The project will be delivered as Quarto documents and students are asked to submit their work through GitHub.
Attendance
We hope you can participate in all classes. Class participation is an important component for successful completion of this course.
Code of Conduct
This course and the openwashdata community follows a code of conduct. Please ensure that you have read it.