R for Data Analysis

"There is synthesis when, in combining therein judgments that are made known to us from simpler relations, one deduces judgments from them relative to more complicated relations. There is analysis when from a complicated truth one deduces more simple truths."

-André-Marie Ampère [@Hofmann96]

Everyone is a data analyst. The purpose of this book is to inspire and enable readers to reconsider the methods they currently use to analyse data. This is not to suggest that the methodologies outlined will be useful or sufficient for every reader. Some analyses can be performed quickly without additional computation, while others will require advanced analytics techniques not covered in this book; however, the aspiration is that all readers will come away equipped with new tools and ideas for approaching data analysis.

Prerequisites

No prior knowledge is required to begin this book. The content starts at the very beginning by showing you how to set up your R environment and the basics of programming in R. By the end of the book, you will be able to apply intermediate analytics techniques such as linear regression and automatic report generation.

You will need an environment in which to run your code. It is recommended that you install R and RStudio locally for this purpose. This book will walk you through how to do that and will offer alternatives if a local installation is not an option for you.

Structure of the Book

  • Part I (Fundamentals) will introduce you to the basics of programming in the context of R.
  • Part II (Data Acquisition) will teach you how to create, import, and access data.
  • Part III (Data Preparation) will show you how to begin preparing your data for analysis.
  • Part IV (Developing Insights) will walk you through the process of searching for and extracting insights from your data.
  • Part V (Reporting) will demonstrate how to wrap up your analysis by developing and automating reports.

Each part contains several chapters, each covering a specific idea related to the part's overarching topic. At the end of each chapter you will find additional resources for exploring that idea in more depth. Each part concludes with practical exercises so that you can test your skills.

While sections of this book could be used to supplement formal education programs, it was initially designed to be used for independent study.

Statement of Need

In the article titled "An empirical study of the rise of big data in business scholarship", the authors suggest that the amount of data that exists in our current society creates a "constant flow of potential new insights for business, government, education and social initiatives" (Frizzo-Barker et al., 2016). This presents an opportunity to educate practitioners in both industry and academia on programmatic data analysis techniques. These practitioners may have historically relied on specialists and/or methodologists to perform analyses, but it is important to ensure that analysis tools are as accessible as the data has become.

There are plenty of resources aimed at teaching specialists how to apply advanced analytics techniques to their chosen discipline; however, there is a notable lack of resources which aim to educate the general public on programmatic data analysis. This gap was noted in an article titled "What is Statistics?", in which the authors state that "statistical education has not been sufficiently accessible" (Brown et al., 2009). Furthermore, the contents of R for Data Analysis (French, 2022) are centered around the idea of the "process of data analysis", broadly applied to any discipline. This differs from other high-quality resources, such as "R for Reproducible Scientific Analysis" (Zimmerman et al., 2019), which teaches similar topics in the context of the scientific process.

Contribution Guide

  • You can fix typos, spelling mistakes, or grammatical errors in the documentation directly using the GitHub web interface. Make sure to include a brief description of the changes you are proposing.
  • For other suggestions or larger problems, you can create an "issue" here.
  • Alternatively, you can create a pull request; however, it is generally a good idea to start a conversation about large changes by creating an issue before proposing them. If you have never created a pull request before, you can read more about them here.
  • If you have no changes to propose but still want to contribute, feel free to search the issue board and ask to be "assigned" to an issue.
  • Ensure that any changes in .qmd files are rendered via the quarto preview command.
  • If changes were made to the content of the book, make sure to re-render the pdf as well.

License

This work is free to use, and is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Physical copies of this book are not currently available; however, you can download a pdf in the top left corner of this site. Feel free to contribute by reporting a typo or leaving a pull request at https://github.com/TrevorFrench/R-for-Data-Analysis.

About Me

I have an M.S. in Data Analytics, a B.S. in Business Analytics, and currently work in industry as an Analytics Manager for a software company. I began my journey into analytics by working as a Data Analyst for the university I was attending. This role allowed me to automate processes, build dashboards, deliver reports to executive stakeholders, and provide insight on how operations might be improved. I performed this role until I was promoted to lead the team. Later, I worked for a major CPG company driving pricing and promotion strategy for a large piece of the business.

Despite my education, most of my basic analytics knowledge was hard-won through self-study. I created this resource to be what I wish I had when I started my journey into the analytics domain. Additionally, I don't believe that one must be a domain expert to be effective at analyzing data. In fact, I think most people can quickly learn the skills necessary to be very effective at it.
