ds4owd - data science for openwashdata
Oct 31, 2023
While we are getting ready, please check for this email from GitHub and accept the invitation to join the GitHub organisation for the course. Used Gmail to sign up? Check the folders that aren’t your primary inbox (e.g. Updates).
Mian Zhong
Sophia Skorik
Be able to use a common set of data science tools (R, RStudio IDE, Git, GitHub, tidyverse, Quarto) to illustrate and communicate the results of data analysis projects.
Learn to use the Quarto file format and the RStudio IDE visual editing mode to produce documents with citations, footnotes, cross-references, figures, and tables.
Pick an item and take notes for 1 minute:
What does the item you have picked have to do with the reason for you being here?
Take 2 minutes each to share with your room partner:
What does the item you have picked have to do with the reason for you being here?
date | week | topic | module |
---|---|---|---|
31 October 2023 | 1 | Welcome & get ready for the course | module 1 |
07 November 2023 | 2 | Data science lifecycle & Exploratory data analysis using visualization | module 2 |
14 November 2023 | 3 | Data transformation with dplyr | module 3 |
21 November 2023 | 4 | Data organization in spreadsheets | module 4 |
28 November 2023 | 5 | Descriptive statistics and tables with gt | module 5 |
05 December 2023 | 6 | Concept of tidy data & Vectors in R | module 6 |
12 December 2023 | 7 | Joining data & writing functions | module 7 |
19 December 2023 | 8 | Using AI for software development in R | module 8 |
26 December 2023 | 9 | Break | NA |
02 January 2024 | 10 | Break | NA |
09 January 2024 | 11 | Break | NA |
16 January 2024 | 12 | Personal website development with Quarto and publication of capstone project | module 9 |
23 January 2024 | 13 | Work on Capstone project | NA |
30 January 2024 | 14 | Final submission date of Capstone project | module 10 |
06 February 2024 | 15 | Graduation party of openwashdata academy | NA |
During my turn and our turn segments: Please keep your microphone on mute. Send a message to the Zoom chat, and Mian and Sophia will support you.
During your turn segments: Due to the large number of participants, it will not be feasible to join individual break-out rooms, but you will always be working in pairs.
One screen
Two screens or more
I’ll assume you
do not have R or git experience
have not worked in an IDE before (e.g. RStudio IDE)
want to learn about R
want to learn about Quarto and publishing
want to learn about project management with GitHub
I’ll teach you
R
Quarto syntax and formats
Markdown
Git via RStudio GUI
GitHub issues, project management, and publishing
Sit back and enjoy!
GitHub Authorisation
https://posit.cloud/spaces/426916/join?access_code=BcLC_jGc-2UB6QDLuV09M8zCyaT6xvY2HjM6CNs3
Please get up and move! Let your emails rest in peace.
Find the hello-quarto.qmd file and click on it to open it in the top left window.
Add author: to the YAML header and add your name.
Quarto comes “batteries included” straight out of the box
revealjs
Feature | R Markdown | Quarto |
---|---|---|
Basic Formats | ||
Beamer | beamer_presentation | beamer |
PowerPoint | powerpoint_presentation | pptx |
HTML Slides | revealjs | |
Advanced Layout | | Quarto Article Layout |
Feature | R Markdown | Quarto |
---|---|---|
Cross References | | Quarto Crossrefs |
Websites & Blogs | ||
Books | bookdown | Quarto Books |
Interactivity | Shiny Documents | Quarto Interactive Documents |
Journal Articles | rticles | Journal Articles |
Dashboards | flexdashboard | Quarto Dashboards |
In your exercises project in RStudio on Posit Cloud, go to File > New File > Quarto document to create a Quarto document with HTML output.
Save it as my-first-document.qmd.
Use the visual editor for the next steps.
Add a title and your name as the author.
Create four sections with headings of level 2 (Introduction, Methods, Results, Conclusions).
Stretch goal: Add a table of contents. Note: Watch out for the indentation.
Stretch goal: Change the HTML theme to sketchy. Tip: Check quarto.org and use the search function with “HTML theming”.
A way to share files with others, so they can:
You can view the history of files, and jump back in time to any point.
GitHub is a hosting platform for version control using Git
Launched in 2008, acquired by Microsoft in 2018 for US$ 7.5 billion
100 million users (20.5 million new users in 2022 alone) (as of October 2023)
Social media for software developers
Sit back and enjoy!
Currently, you receive emails when someone mentions you in a comment on GitHub. Let’s change the settings to receive notifications On GitHub.
Please get up and move! Let your emails rest in peace.
Metadata: YAML
Text: Markdown
Code: Executed via knitr or jupyter
Weave it all together, and you have beautiful, powerful, and useful outputs!
Literate programming is writing out the program logic in a human language with included (separated by a primitive markup) code snippets and macros.
“Yet Another Markup Language” or “YAML Ain’t Markup Language” is used to provide document level metadata.
Indentation matters!
There are multiple ways of formatting valid YAML:
format: html with selections made with proper indentation
Lint, or a linter, is a static code analysis tool used to flag programming errors, bugs, stylistic errors and suspicious constructs.
RStudio + VSCode provide rich tab-completion: start a word and press Tab to complete it, or press Ctrl + Space to see all available options.
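As an illustration of the point about indentation, here is a minimal sketch of a Quarto YAML header in which format: html carries a nested option. The title value is hypothetical, and toc was chosen only as an example of a nested selection; both are assumptions, not taken from the course files.

```yaml
---
title: "hello-quarto"   # hypothetical title, for illustration only
format:                 # left-aligned key with a trailing colon
  html:                 # indented by 2 spaces, again with a trailing colon
    toc: true           # nested option, indented by 2 more spaces
---
```

If the indentation of html or toc is off by even one space, the header will likely fail to parse or the option will be ignored, which is where the linter and tab-completion help.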
filter()
.data =
year == 2007: what to do with the data
gapminder_yr_2007
<-
|>
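The pieces above can be put together in a short sketch. It assumes the dplyr package is installed, and it uses a small made-up data frame (gapminder_small, a hypothetical stand-in with illustrative values) rather than the real gapminder data, so that it runs on its own.

```r
library(dplyr)

# Hypothetical stand-in for the gapminder data; the values are made up
gapminder_small <- data.frame(
  country = c("A", "B", "A", "B"),
  year    = c(2002, 2002, 2007, 2007),
  pop     = c(10, 20, 11, 22)
)

# Function: filter(); argument: .data = ...; condition: year == 2007
filter(.data = gapminder_small, year == 2007)

# Store the result in an object with the assignment operator <-
gapminder_yr_2007 <- filter(.data = gapminder_small, year == 2007)

# The same step written with the pipe |>, which passes the data
# on the left as the first argument of the function on the right
gapminder_yr_2007_piped <- gapminder_small |>
  filter(year == 2007)
```

Both stored objects hold the same rows; the pipe version simply reads left to right, which tends to be easier to follow once several steps are chained.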
Rules of dplyr functions:
Monday | |
Tuesday | Module from 2 pm to 4:30 pm CET |
Wednesday | |
Thursday | Office hours on Zoom (2 pm to 3:30 pm CET) |
Friday |
Slides created via revealjs and Quarto: https://quarto.org/docs/presentations/revealjs/
Access slides as PDF on GitHub
All material is licensed under Creative Commons Attribution Share Alike 4.0 International.