Capstone Project

The Capstone Project report is the final assignment for this course. Completing it is required to receive the course certificate of successful participation.

Learning Objectives

  1. Learners can apply the skills obtained during the course to write a short data analysis project report.

GitHub repository

Establishing the GitHub repository with self-identified data was part of the homework assignments of modules 5, 6, and 7.

It is important that these assignments are completed before continuing with the write-up of the report as outlined on this page.

GitHub issue tracker

The GitHub issue tracker of each student’s capstone project repository is used to communicate and ask questions about the Capstone Project report. Each course participant is assigned to one of the course instructors.

Submission due date

The due date for submission of the report is Tuesday, 13th February 2024.

Required items

Table 1 is a detailed list of items that need to be included for a complete submission of the capstone project report. Items are categorized into technical, data, and intellectual tasks. If any item is unclear, please reach out to the course instructors.

Table 1: List of items to be completed for the capstone project report.
No. Category Item
1 technical The report renders to HTML format without errors and contains at least five level 1 headings named: Introduction, Methods, Results, Conclusions, References.
2 technical YAML header of report has title, author, date, and table of contents that are correctly displayed in the compiled HTML output.
3 technical Warnings are hidden from the compiled output, but code is shown in the compiled output.
4 technical The report has at least two data visualisations.
5 technical Each data visualisation has edited human-readable labels (e.g. axis labels, legend title).
6 technical Each data visualisation applies at least one scaling function (e.g. color/fill, axes).
7 technical Each data visualisation has a label defined in the code-chunk options.
8 technical Each data visualisation has a caption defined in the code-chunk options.
9 technical Each data visualisation is cross-referenced in the narrative using the defined label from the code-chunk options.
10 technical The report has at least one table with summary statistics (e.g. count, mean, median, standard deviation, etc.).
11 technical Each table is formatted in the rendered output using a function taught during the course (e.g. kable() function or gt() function).
12 technical Each table has a label defined in the code-chunk options.
13 technical Each table has a caption defined in the code-chunk options.
14 technical Each table is cross-referenced in the narrative using the defined label from the code-chunk options.
15 technical The report includes at least 3 citations using a bibliography.bib file created via RStudio Visual Editor.
16 technical References are listed automatically in the References section via a YAML entry that points to the bibliography.bib file.
17 data Data from data/raw folder was imported, cleaned, and stored as analysis-ready processed data in data/processed folder.
18 data The data/processed folder contains a data dictionary.csv file with two columns (variable_name, description) which document each variable of the data in the same folder.
19 data The data/processed folder contains a README.md file based on the provided template, with the documentation completed for the data in the same folder.
20 intellectual Introduction section with 3 to 5 sentences introduces the context within which the data was created.
21 intellectual Methods section describes in 3 to 5 sentences how the data was obtained.
22 intellectual Figures and tables in Results section are interpreted with 2 to 3 sentences each.
23 intellectual Conclusions concisely summarize findings in a bullet point format.
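
As an illustration of items 2, 3, and 16, a YAML header could look like the sketch below. It assumes the report is a Quarto document (.qmd) and that bibliography.bib is stored in the same folder as the report. The title and author values are placeholders, and the option names may differ for other formats such as R Markdown.

```yaml
---
title: "Capstone Project Report"   # placeholder, replace with your own title
author: "Your Name"                # placeholder
date: today                        # Quarto keyword that inserts the render date
format:
  html:
    toc: true                      # table of contents in the HTML output (item 2)
execute:
  echo: true                       # show code in the compiled output (item 3)
  warning: false                   # hide warnings in the compiled output (item 3)
bibliography: bibliography.bib     # assumed path, relative to the report file (item 16)
---
```

With these options set once in the header, they apply to every code chunk in the report, so individual chunks only need their own label and caption options.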