Datapolitan-Training/intro-stats

Introduction to Statistical Analysis

Summary

A one-day course covering the basics of descriptive statistics with open data, including basic statistical measures such as mean, median, standard deviation, and variance. The course also covers correlation and linear regression, and introduces decision modeling using open data.

Target Audience

Employees of all levels who perform data analysis and communicate analytical findings in support of city operations.

Course Overview

This course introduces participants to the use of statistics for understanding and communicating city data. Using Excel, participants will learn how to use measures like mean, median, mode, standard deviation, and variance to understand the content of city data and make operational decisions. Participants will also learn how to display statistical information in meaningful ways.

Goals

  • Learn common statistical measures, including mean, median, mode, standard deviation, and variance
  • Calculate correlation coefficients for bivariate data and apply the technique of simple regression analysis
  • Demonstrate techniques used for forecasting
  • Communicate data meaningfully to a broad audience using charts and graphs in Microsoft Excel

Key Takeaways

  • Participants will be familiar with common statistical measures
  • Participants will be able to calculate correlation coefficients for bivariate data and perform simple linear regression analysis
  • Participants will be familiar with the basic techniques of forecasting
  • Participants will be better able to communicate analysis using charts and graphs in Microsoft Excel
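The simple linear regression and basic forecasting named in the goals and takeaways above can be sketched outside Excel as well. The following is a minimal illustration in Python, not part of the course materials: the function names `fit_line` and `forecast` and the sample data are hypothetical. It fits a least-squares line and extends it one step ahead, which is roughly what Excel's `SLOPE`, `INTERCEPT`, and `FORECAST` functions do.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations divided by sum of squared x-deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def forecast(x_new, slope, intercept):
    """Predict y for a new x using the fitted line."""
    return intercept + slope * x_new


# Hypothetical data for illustration only: period number and observed value.
periods = [1, 2, 3, 4, 5]
values = [3, 5, 7, 9, 11]

slope, intercept = fit_line(periods, values)
next_value = forecast(6, slope, intercept)
print(f"slope={slope}, intercept={intercept}, forecast for period 6={next_value}")
```

A straight-line forecast like this only makes sense when the relationship looks roughly linear, so plotting the data first is a sensible check.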

Schedule

Exercise Descriptions

Exercise 1: Calculate simple descriptive statistics (measures of central tendency)

  • Task to participants
    • In a small group, calculate the mean and median for your group and compare with the class as a whole
    • Report your findings to the class
  • Desired outcomes
    • Participants become familiar with calculating mean and median in Excel
    • Participants understand the value of statistics for comparison
    • Participants practice communicating statistics
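For readers who want to check their spreadsheet results another way, the group-versus-class comparison in Exercise 1 can be sketched in Python with the standard `statistics` module. The numbers below are hypothetical and stand in for whatever the class collects. In Excel, the equivalent functions are `AVERAGE` and `MEDIAN`.

```python
from statistics import mean, median

# Hypothetical values for illustration only (for example, commute times in minutes).
class_values = [12, 25, 30, 18, 45, 22, 60, 15, 28, 35]
group_values = class_values[:4]  # treat the first four people as one small group

group_mean = mean(group_values)
group_median = median(group_values)
class_mean = mean(class_values)
class_median = median(class_values)

print(f"Group: mean={group_mean}, median={group_median}")
print(f"Class: mean={class_mean}, median={class_median}")
```

In this made-up data the class mean sits above the class median because one large value (60) pulls the mean upward, which is the kind of observation groups can report back.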

Exercise 2: Calculate measures of variability in small groups and compare to class as a whole

  • Task to participants
    • In a small group, calculate the measures of variability for your group and compare with the class as a whole
    • Report your findings to the class
  • Desired outcomes
    • Participants become familiar with calculating measures of variability in Excel
    • Participants understand the value of statistics for comparison
    • Participants practice communicating statistics
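The measures of variability in Exercise 2 can be illustrated with a short Python sketch. The data are hypothetical, and the choice of which measures to show (range, variance, standard deviation) is an assumption based on the course goals. Note the difference between the population and sample versions: Excel's `VAR.P` and `STDEV.P` correspond to `pvariance` and `pstdev`, while `VAR.S` and `STDEV.S` correspond to `variance` and `stdev`.

```python
from statistics import pstdev, pvariance, stdev, variance

# Hypothetical values for illustration only.
values = [2, 4, 4, 4, 5, 5, 7, 9]

value_range = max(values) - min(values)

# Population versions: divide by n (the data are the whole population).
population_variance = pvariance(values)
population_stdev = pstdev(values)

# Sample versions: divide by n - 1 (the data are a sample of a larger group).
sample_variance = variance(values)
sample_stdev = stdev(values)

print(f"range={value_range}")
print(f"population variance={population_variance}, stdev={population_stdev}")
print(f"sample variance={sample_variance:.3f}, stdev={sample_stdev:.3f}")
```

When comparing a small group with the whole class, the sample versions are usually the appropriate choice for the group, since the group is a subset of the class.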

Exercise 3: How long do noise complaints stay open in New York City?

  • Task to participants
    • Prepare data in a guided exercise to calculate how long noise-related 311 Service Requests remain open
  • Desired outcomes
    • Participants are guided through the steps necessary to calculate the time a service request remains open
    • Participants learn Excel functions and formulas if they have no previous experience
    • Participants practice calculating statistics on a larger dataset than in the previous exercises
    • Participants communicate findings from statistical analysis
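The core calculation in Exercise 3, subtracting the date a request was created from the date it was closed, can be sketched in Python as follows. This is an illustration under stated assumptions rather than the course's method: the field names `created` and `closed`, the date format, and the three sample rows are all hypothetical, and the real 311 export may use different column names and formats.

```python
from datetime import datetime
from statistics import mean, median

DATE_FORMAT = "%m/%d/%Y %H:%M"  # assumed format; adjust to match the actual export

# Hypothetical service requests for illustration only.
requests = [
    {"created": "01/01/2015 22:00", "closed": "01/02/2015 01:00"},
    {"created": "01/03/2015 23:30", "closed": "01/04/2015 00:30"},
    {"created": "01/05/2015 20:00", "closed": "01/06/2015 06:00"},
]


def hours_open(row):
    """Return how many hours a request stayed open."""
    created = datetime.strptime(row["created"], DATE_FORMAT)
    closed = datetime.strptime(row["closed"], DATE_FORMAT)
    return (closed - created).total_seconds() / 3600


durations = [hours_open(row) for row in requests]
mean_hours = mean(durations)
median_hours = median(durations)

print(f"hours open: {durations}")
print(f"mean={mean_hours:.2f} hours, median={median_hours:.2f} hours")
```

In Excel the same idea is a subtraction of two date cells, which yields a duration in days that can be multiplied by 24 to get hours.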

Exercise 4: How long do pothole complaints stay open in New York City?

  • Task to participants
    • Prepare the data in another guided exercise to calculate how long pothole-related 311 Service Requests remain open
    • Filter and clean the data as necessary to obtain reliable results
    • Compare the results of this analysis with the results from the previous exercise
  • Desired outcomes
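The filter-and-clean step in the Exercise 4 task can be sketched in Python. The two cleaning rules shown here (skip requests with no closed date, skip requests closed before they were created) are assumptions about what "as necessary" might involve, not rules taken from the course. The field names, date format, and rows are hypothetical.

```python
from datetime import datetime

DATE_FORMAT = "%m/%d/%Y %H:%M"  # assumed format; adjust to match the actual export

# Hypothetical pothole requests for illustration only.
rows = [
    {"created": "03/01/2015 08:00", "closed": "03/03/2015 08:00"},  # valid
    {"created": "03/02/2015 09:00", "closed": ""},                  # still open
    {"created": "03/05/2015 10:00", "closed": "03/04/2015 10:00"},  # closed before created
    {"created": "03/06/2015 12:00", "closed": "03/07/2015 00:00"},  # valid
]


def clean_hours_open(rows):
    """Return hours open for each row that passes the two assumed cleaning rules."""
    hours = []
    for row in rows:
        if not row["closed"]:
            continue  # no closed date, so no duration can be computed
        created = datetime.strptime(row["created"], DATE_FORMAT)
        closed = datetime.strptime(row["closed"], DATE_FORMAT)
        elapsed = (closed - created).total_seconds() / 3600
        if elapsed < 0:
            continue  # a negative duration signals a data-entry problem
        hours.append(elapsed)
    return hours


pothole_hours = clean_hours_open(rows)
print(f"kept {len(pothole_hours)} of {len(rows)} rows: {pothole_hours}")
```

Reporting how many rows were dropped, and why, helps the audience judge how reliable the comparison with the noise results is.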

Exercise 5: Calculate the correlation between median income and recycling rate in New York City

  • Task to participants
    • Calculate the correlation between median income and the recycling rate in New York City Community Districts
    • Interpret the result based on the calculated coefficient of correlation
  • Desired outcomes
    • Participants practice calculating the coefficient of correlation on NYC data
    • Participants practice communicating statistics
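To make the coefficient of correlation in Exercise 5 concrete, here is a minimal Python sketch of the Pearson correlation coefficient written out step by step. The function name `pearson_r` and the five data points are hypothetical and are not real Community District figures. Excel's `CORREL` function computes the same quantity.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations, then the two sums of squared deviations.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    squares_x = sum((x - mean_x) ** 2 for x in xs)
    squares_y = sum((y - mean_y) ** 2 for y in ys)
    # Note: this divides by zero if either series is constant.
    return cross / sqrt(squares_x * squares_y)


# Hypothetical values for illustration only.
median_income = [30, 45, 60, 75, 90]   # thousands of dollars
recycling_rate = [10, 14, 15, 21, 25]  # percent

r = pearson_r(median_income, recycling_rate)
print(f"r = {r:.3f}")
```

A value near +1 or -1 indicates a strong linear relationship and a value near 0 indicates a weak one. Correlation alone does not show that income causes higher recycling rates.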

Exercise 6: Build a basic decision model in Excel to maximize parking ticket revenue in NYC

  • Task to participants
    • Follow along in a guided exercise to maximize parking ticket revenue in NYC by varying the assignment of parking ticket officers across the five boroughs of New York City using the Excel Solver add-in
  • Desired outcomes
    • Participants are familiar with creating decision models in Excel
    • Participants are able to communicate the outcome of decision models
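The structure of a decision model like the one in Exercise 6 (decision variables, constraints, and an objective to maximize) can be sketched in Python. This is not how the Excel Solver works internally: the sketch swaps in a brute-force search over every possible assignment, which is only practical for small problems. All figures are hypothetical, including the revenue per officer, the per-borough caps, and the total number of officers, as is the assumption that revenue grows linearly with the number of officers.

```python
from itertools import product

# Hypothetical inputs for illustration only.
revenue_per_officer = {
    "Manhattan": 500,
    "Brooklyn": 400,
    "Queens": 350,
    "Bronx": 300,
    "Staten Island": 200,
}
max_officers = {borough: 3 for borough in revenue_per_officer}  # cap per borough
total_officers = 8  # officers that must all be assigned


def best_assignment(revenue, caps, total):
    """Try every assignment within the caps and keep the highest-revenue one.

    Returns (None, -1) if no assignment uses exactly `total` officers.
    """
    boroughs = list(revenue)
    best_plan, best_revenue = None, -1
    for counts in product(*(range(caps[b] + 1) for b in boroughs)):
        if sum(counts) != total:
            continue  # constraint: every officer is assigned exactly once
        value = sum(n * revenue[b] for n, b in zip(counts, boroughs))
        if value > best_revenue:
            best_plan, best_revenue = dict(zip(boroughs, counts)), value
    return best_plan, best_revenue


plan, revenue = best_assignment(revenue_per_officer, max_officers, total_officers)
print(plan)
print(f"total revenue = {revenue}")
```

In Solver terms, the officer counts are the changing cells, the total and the caps are the constraints, and the revenue total is the objective cell.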
