This repository contains the materials for "Data Visualization", a three-day intensive CADi ("Cursos de Actualización en las Disciplinas") course taught to faculty members at the "Tecnológico de Monterrey" institute. It hosts a compendium of materials and activities designed to develop and improve the skills needed to make charts and plots more interactive and appealing, and to show some of the tools we can use to achieve better exposure of our work in scientific and engineering applications.
For other data-analysis-related topics, please take a look at the dataPy_CADi repository, which contains exercises on data wrangling in Python.
Although not strictly required, having some knowledge of one of the following programming languages is suggested (as we'll be using them throughout the course):
- Mathematica: Most of the graphics showcased on the website were developed in this platform due to its flexibility in terms of graphical capabilities (as well as the teacher's personal preference).
- Python: One of the most popular programming languages. Some of the more versatile data visualization frameworks are compatible with it.
- R: A popular statistical framework with lots of community support.
There are examples developed in each of the platforms, chosen according to the application and the availability of frameworks for specific tasks.
Additionally, it is suggested to have the Atom text editor for the Markdown and Python examples. For a useful guide on how to install R and Python kernels in Atom, follow this link. Some other useful packages for development in Atom are:
- Markdown Preview Enhanced: Allows live updates of Markdown document previews.
- Hydrogen: Package that allows running Python code Jupyter-style from within Atom.
- Preview HTML: Allows us to view the results of our HTML within Atom.
- Platformio-ide-terminal: Launches a terminal instance from within Atom (the session starts at our current directory).
It is also suggested to install the RStudio IDE for R development and the GitHub Desktop app to work with the repository.
Given that the course is intended to be useful for several disciplines, the workshop was created with flexibility in mind. As such, modules are fairly independent and can be taken in a different order. Alternatively, take a look at the sitemap for the full tree of contents and exercises contained in this repository.
Goal: To describe the basic principles of data visualization and the types of plots that best describe certain datasets, and to work through data visualization examples that are common across different fields.
- Introduction: Objectives, Scope, My background, Software Installation
- Data Visualization Primer: Data visualization workflow
- Mathematica/R/Python Primer: Brief introduction to programming languages
- Media Formats: Raster-based, Vector-based
- Plot Types (first part with exercises): Counts, Scatter, Time Series
- Data Handling/Data Sources: Data Formats, Data Handling Frameworks
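As a rough illustration of the raster-based versus vector-based media formats listed above, the following sketch draws the same three points both ways using only the Python standard library. The function names, sizes, and sample points are hypothetical and are not taken from the course materials.

```python
# Illustrative sketch only: names and sizes are hypothetical,
# not part of the course materials.


def scatter_svg(points, size=100, radius=2):
    """Vector version: one <circle> element per point (coordinates in 0..1)."""
    circles = "".join(
        f'<circle cx="{x * size:.1f}" cy="{(1 - y) * size:.1f}" r="{radius}"/>'
        for x, y in points
    )
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 {size} {size}">'
        f"{circles}</svg>"
    )


def scatter_raster(points, width, height):
    """Raster version: a height x width grid of 0/1 pixels."""
    grid = [[0] * width for _ in range(height)]
    for x, y in points:
        col = min(int(x * width), width - 1)
        row = min(int((1 - y) * height), height - 1)  # flip y: row 0 is the top
        grid[row][col] = 1
    return grid


def raster_to_pbm(grid):
    """Serialize the grid as a plain-text PBM image (a simple raster format)."""
    rows = ["".join(str(pixel) for pixel in row) for row in grid]
    return f"P1\n{len(grid[0])} {len(grid)}\n" + "\n".join(rows) + "\n"


points = [(0.1, 0.2), (0.5, 0.5), (0.9, 0.8)]
svg = scatter_svg(points)
small = raster_to_pbm(scatter_raster(points, 10, 10))
large = raster_to_pbm(scatter_raster(points, 50, 50))

# The SVG text describes the same three shapes at any display size,
# while the raster text grows with the pixel resolution.
print(len(svg), len(small), len(large))
```

The vector description stays the same length no matter how large it is displayed, whereas the raster output must store every pixel, which is why vector formats tend to suit line plots and raster formats suit photographs or dense images.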
Goal: To describe and run through some examples of popular data visualization frameworks in R and Python.
- Working with Python and Anaconda: Setting up, Basics, VirtualEnv, Anaconda, Jupyter, Spyder, Atom
- Colors: Color Palettes
- Plot Types (second part with exercises): Time Series, Transitions, Clustering, Factorial, Multidimensional, Geographic
- Good Practices: Suggestions to make data visualization clearer
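To make the color palette topic above a bit more concrete, here is a minimal sketch that builds a categorical palette by spacing hues evenly around the color wheel with the standard-library `colorsys` module. The function name and default lightness/saturation values are assumptions made for illustration; the course itself may use other tools (such as RColorBrewer or the color pickers linked further down).

```python
import colorsys


def evenly_spaced_palette(n, lightness=0.5, saturation=1.0):
    """Return n hex colors whose hues are evenly spaced around the color wheel."""
    colors = []
    for i in range(n):
        # colorsys works with floats in the 0..1 range for every channel.
        r, g, b = colorsys.hls_to_rgb(i / n, lightness, saturation)
        colors.append(
            "#{:02x}{:02x}{:02x}".format(
                round(r * 255), round(g * 255), round(b * 255)
            )
        )
    return colors


palette = evenly_spaced_palette(5)
print(palette)
```

Evenly spaced hues are a reasonable starting point for unordered categories; ordered or diverging data usually calls for palettes that vary lightness instead.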
Goal: To be able to put together a project website and host some of the examples created throughout the course for better exposure of our work.
- GitHub Introduction: Introduction to GitHub, setting up an account, and our first repository
- Markdown + HTML Primer: Introduction to MD and HTML for GitHub and presentations
- gh-pages: GitHub Pages, "Docs" folder, "gh-pages" branch
- Remark: One of the frameworks to create simple markdown presentations
- Revealjs: MathJax-supported JavaScript framework for HTML presentations
- ffmpeg: Stop-motion animations, Further video editing
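For the stop-motion animations mentioned in the ffmpeg item above, one possible workflow is to save numbered plot frames and then stitch them into a video. The sketch below only builds and prints the command; the frame pattern, frame rate, and output name are hypothetical, and it assumes an ffmpeg build that includes the `libx264` encoder.

```python
import shutil


def stop_motion_command(frame_pattern, fps, output):
    """Build an ffmpeg argument list that turns numbered frames into a video."""
    return [
        "ffmpeg",
        "-framerate", str(fps),  # input frame rate (frames shown per second)
        "-i", frame_pattern,     # e.g. frame_001.png, frame_002.png, ...
        "-c:v", "libx264",       # H.264 encoder (assumes the build includes it)
        "-pix_fmt", "yuv420p",   # pixel format widely supported by players
        output,
    ]


# Hypothetical paths: adjust them to wherever the frames were exported.
command = stop_motion_command("frames/frame_%03d.png", 12, "animation.mp4")
print(" ".join(command))

if shutil.which("ffmpeg") is None:
    print("ffmpeg not found on PATH; install it before running the command")
```

Once ffmpeg is installed, the list can be passed to `subprocess.run(command, check=True)` to actually render the video.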
This is a list of complementary sources and tools that are useful in data visualization applications.
- anaconda: DataScience/Package manager platform for python and R
- atom: Versatile IDE for R, Python, Markdown, Javascript, amongst others
- ffmpeg: Video Manipulation command line tool (can be used to create "stop-motion" animations)
- ggplot2: Plotting in R
- gimp: Free "photoshop" alternative
- github pages: Github pages
- irkernel: R kernel for jupyter
- jekyll: Blog-like templates for github pages (Ruby)
- jupyter: Jupyter project
- leaflet: Open-source JavaScript library for interactive maps
- matplotlib: Python plotting framework
- mathjax: Use latex in html documents through javascript
- networkD3: R Network Plotting
- plotly: Interactive plots (both in R, and Python)
- python: General-purpose programming language
- R: Statistical computing programming language
- RColorBrewer: Color palettes for R
- remark: Markdown presentations
- replit: Online python environments project
- revealjs: Javascript presentations
- rStudio: R IDE
- sciweavers: Latex to image converter to embed them into markdown
- shiny: Interactive web development through R
- slides: GUI for revealjs
- spyder: "RStudio"-like IDE for Python
- tidyverse: Collection of R packages designed for data science
- Coolors Color Palettes: https://coolors.co
- Colorpicker for Data: http://tristen.ca/hcl-picker/#/hlc/6/1/657D9F/E2E062
- Data Visualization Catalogue: https://datavizcatalogue.com/index.html
- Data Visualization Ted Talks: https://www.ted.com/playlists/201/art_from_data
- Flowing Data: https://flowingdata.com/
- Google Charts: https://developers.google.com/chart/interactive/docs/
- Markdown Cheatsheet: https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet
- Markdown Specs: https://github.github.com/gfm/
- Observable: https://beta.observablehq.com/
- OpenVis Conf 2018 Keynote: https://www.youtube.com/watch?time_continue=298&v=yeC9v-iHJu0
- Plots and Charts Galleries: https://datavizproject.com/
- Plot Type Selector: http://chartmaker.visualisingdata.com/
- The R Graph Gallery: https://www.r-graph-gallery.com
- The Python Graph Gallery: https://python-graph-gallery.com/
- The Truthful Art: https://truth-and-beauty.net/
- Adams, S., & Helfand, J. (2017). The designer’s dictionary of color. ISBN-13: 978-1419723919
- Barabasi, Albert (2016). Network Science.
- Cairo, Alberto (2016). The truthful art: data, charts, and maps for communication. ISBN-13: 978-0321934079
- Foster Provost, Tom Fawcett. Data science for business.
- Géron, Aurélien. Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems
- Hastie, Trevor. An Introduction to Statistical Learning with Applications in R
- Kabacoff, Robert. R in Action: Data Analysis and Graphics with R.
- Kirk, A. (2016). Data Visualisation: A Handbook for Data Driven Design. ISBN-13: 978-1473912144
- McKinney, W. (2018). Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython. ISBN-10: 1491957662
- Newman, Mark (2010). Networks: An Introduction. ISBN-13: 978-0199206650.
- Wellin, Paul (2016). Essentials of Programming in Mathematica. ISBN-13: 978-1107116665
- Yau, N. (2011). Visualize this : the FlowingData guide to design, visualization, and statistics. Wiley Pub. ISBN-13: 978-0470944882
- Yau, N. (2013). Data points: visualization that means something. ISBN-13: 978-1118462195
Camilo René Duque Becerra • Carlos Daniel Prado Pérez • Donovan Manuel Esqueda Merino • Edgar Emmanuel Vallejo Clemente • Faustino Yescas Martinez • Francisco Javier Delgado Cepeda • Guillermo Sandoval Benítez • Ivonne Abud Urbiola • José Luis Gómez Muñoz • Juan Carlos del Valle Sotelo • Lizethe Pérez Fuertes • Luis Miguel Méndez Díaz • María de Lourdes Quezada Batalla • Miguel Rocha Romero • Rafael Benitez Medina • Ramón Marín Solís • Raúl Gómez Castillo • Raúl Martinez Rosado • Salvador Elías Venegas Andraca • Saul Juarez Ordoñez • Sergio Santiago Rentería
Hugo I. Velasco • Martín Molinero • Myriam Elizabeth • Rodrigo Careaga
Contact: [ [email protected] | [email protected] ]
My main projects: [ MGDrivE & MoNeT ]
My personal website: [ chipdelmal.github.io ]