Data Visualization

This course is a three-day intensive rundown of data visualization principles, techniques, and common frameworks. It also covers how to assemble the material developed during the course into a gh-pages website or an HTML-based presentation written in markdown.


Introduction

Throughout these three days, we will work through a number of exercises. Most of them were developed in R because of its flexibility and community support. Some were coded in Python and Mathematica so that different approaches to data visualization and different frameworks can be compared. The full list of exercises, along with the language each was developed in, is:

  1. Box Whisker (R)
  2. Bubble Chart (Python)
  3. Chord Diagram (R)
  4. Dygraph (R)
  5. Factorial (Mathematica)
  6. FFmpeg (Bash)
  7. Globe (R)
  8. Maps (R/Python)
  9. Network (R)
  10. Parallel Plot (Python)
  11. Random Networks (Mathematica)
  12. Remark (Markdown/HTML)
  13. Revealjs (HTML)
  14. Scatter Plot with Histograms (Python)
  15. Scatter Plot (Mathematica)
  16. Stacked Area Plot (R/Python)
  17. Streamgraph (R)
  18. Transitions (Mathematica)
  19. Time Series (R/Mathematica)
  20. Tree Map (R)
  21. Violin Plots (R)
  22. Word Cloud (R)

Some of the examples were developed from scratch, while others are variations of online sources (in which case the sources are cited within the code snippets). The datasets used are either in the public domain or were developed by my research group, and we are allowed to share them.

Objectives

  • To show some good practices in data visualization
  • To showcase some of the newer frameworks used to generate scientific and engineering plots
  • To teach, through examples, how to choose the right plot type for our data and for the message we want to convey
  • To teach which graphics formats are the best for engineering/scientific purposes
  • To show how our visualizations can be embedded into presentations and shown online using markdown and HTML
  • To gain some exposure to alternatives that make data visualization more attractive/innovative

My background

Click on the GIF to take a look at my personal website!


Course Structure and Setting up the Required Software

This course was designed to accommodate the widest possible audience across engineering and scientific disciplines. As such, it provides guidelines that are common to most disciplines. For the same reason, it makes use of the three programming platforms mentioned above (R, Mathematica and Python), although no deep knowledge of any of them is required.

Setting up Mathematica

To install Mathematica, download it from the institute's server and follow the instructions in the installer wizard.

Setting up R and RStudio

Install R by downloading and running its installer package, and do the same for RStudio.

Setting up Python, Anaconda and the Required Libraries

The most complicated of the three tools to set up is Python with its required libraries. However, this can be done easily by installing Anaconda and then following the instructions detailed in the Python introduction (easy) and/or the dataViz conda environment readme (advanced).
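
Once the environment is in place, a quick check that the libraries can be found by Python may save time later. The following is a minimal sketch, not part of the course material: the package names in `ASSUMED_PACKAGES` and the helper `missing_packages` are hypothetical, and the actual requirements are the ones listed in the dataViz conda environment readme.

```python
import importlib.util

# Hypothetical list for illustration only; replace it with the packages
# named in the dataViz conda environment readme.
ASSUMED_PACKAGES = ["numpy", "pandas", "matplotlib"]


def missing_packages(names):
    """Return the top-level package names that Python cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(ASSUMED_PACKAGES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All assumed packages are importable.")
```

Running the script from the activated environment prints any package that still needs to be installed. `find_spec` only looks a package up without importing it, so the check stays fast even for large libraries.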

Forking the Repository

Install the GitHub Desktop app. Click on the Fork button on the repository's main page.

Once forked, click on the "Clone or download" button, followed by "Open in Desktop".