International Powerlifting

This week's data is from Open Powerlifting.

Wikipedia has many details about the sport itself, as well as about the three lifts (squat, bench, and deadlift).

Credit to Nichole Monhait for sharing this fantastic open dataset. Please note that this is a small subset of the data, limited to IPF (International Powerlifting Federation) events. The full dataset, with many more columns and alternative events, can be found as a .csv at https://openpowerlifting.org/data. It covers many more federations, ages, and meet types but is >250 MB.

A nice analysis of age effects in this dataset, done in R, can be found on Elias Oziolor's blog.

Get the data!

ipf_lifts <- readr::read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2019/2019-10-08/ipf_lifts.csv")

Data Dictionary

ipf_lifts.csv

variable class description
name character Individual lifter name
sex character Binary gender (M/F)
event character The type of competition that the lifter entered.

Values are as follows:
- SBD: Squat-Bench-Deadlift, also commonly called "Full Power".
- BD: Bench-Deadlift, also commonly called "Ironman" or "Push-Pull".
- SD: Squat-Deadlift, very uncommon.
- SB: Squat-Bench, very uncommon.
- S: Squat-only.
- B: Bench-only.
- D: Deadlift-only.
equipment character The equipment category under which the lifts were performed.

Values are as follows:
- Raw: Bare knees or knee sleeves.
- Wraps: Knee wraps were allowed.
- Single-ply: Equipped, single-ply suits.
- Multi-ply: Equipped, multi-ply suits (includes Double-ply).
- Straps: Allowed straps on the deadlift (used mostly for exhibitions, not real meets).
age double The age of the lifter on the start date of the meet, if known.
age_class character The age class in which the lifter falls, for example 40-45.
division character Free-form UTF-8 text describing the division of competition, like Open or Juniors 20-23 or Professional.
bodyweight_kg double The recorded bodyweight of the lifter at the time of competition, to two decimal places.
weight_class_kg character The weight class in which the lifter competed, to two decimal places.
Weight classes can be specified as a maximum or as a minimum. Maximums are specified by just the number, for example 90 means "up to (and including) 90kg." Minimums are specified by a + to the right of the number, for example 90+ means "above (and excluding) 90kg."
best3squat_kg double Maximum of the first three successful attempts for the lift.
May rarely be negative: some federations use a negative value to report the lowest weight the lifter attempted and failed.
best3bench_kg double Maximum of the first three successful attempts for the lift.
May rarely be negative: some federations use a negative value to report the lowest weight the lifter attempted and failed.
best3deadlift_kg double Maximum of the first three successful attempts for the lift.
May rarely be negative: some federations use a negative value to report the lowest weight the lifter attempted and failed.
place character The recorded place of the lifter in the given division at the end of the meet.

Values are as follows:
- Positive number: the place the lifter came in.
- G: Guest lifter. The lifter succeeded, but wasn't eligible for awards.
- DQ: Disqualified. Note that DQ could be for procedural reasons, not just failed attempts.
- DD: Doping Disqualification. The lifter failed a drug test.
- NS: No-Show. The lifter did not show up on the meet day.
date double ISO 8601 Date of the event
federation character The federation that hosted the meet. (limited to IPF for this data subset)
meet_name character The name of the meet.
The name is defined to never include the year or the federation. For example, the meet officially called 2019 USAPL Raw National Championships would have the MeetName Raw National Championships.
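
Several columns in the dictionary above pack more than one piece of information into a single value: weight_class_kg mixes maximums ("90") and minimums ("90+"), the best3*_kg columns may hold negative values for failed attempts, and place mixes finishing positions with status codes. The following is a minimal sketch of how these could be unpacked in base R. The helper names (parse_weight_class, drop_failed_attempts, parse_place), the output column names, and the toy rows are all hypothetical and not part of the original dataset or cleaning script. Treating negative lifts as missing is one possible choice, not a rule from the data source.

```r
# Hypothetical helpers for illustration only; names are assumptions.

# Split a weight class such as "90" or "90+" into a numeric limit and a
# flag saying whether that limit is a minimum ("+" suffix) or a maximum.
parse_weight_class <- function(x) {
  is_min <- !is.na(x) & grepl("\\+$", x)
  data.frame(
    weight_class_limit_kg = as.numeric(sub("\\+$", "", x)),
    weight_class_is_min   = is_min
  )
}

# Negative values record a failed attempt rather than a successful lift,
# so this sketch replaces them with NA before any analysis of best lifts.
drop_failed_attempts <- function(kg) {
  ifelse(!is.na(kg) & kg < 0, NA_real_, kg)
}

# Split `place` into a numeric finishing position and a status label.
parse_place <- function(place) {
  codes <- c(
    G  = "guest",
    DQ = "disqualified",
    DD = "doping disqualification",
    NS = "no-show"
  )
  is_num <- !is.na(place) & grepl("^[0-9]+$", place)
  position <- rep(NA_integer_, length(place))
  position[is_num] <- as.integer(place[is_num])
  status <- unname(codes[place])   # NA for anything that is not a known code
  status[is_num] <- "placed"
  data.frame(position = position, status = status, stringsAsFactors = FALSE)
}

# Made-up rows that mimic the column formats described in the dictionary.
toy <- data.frame(
  weight_class_kg  = c("90", "90+", "52"),
  best3squat_kg    = c(200, -180, NA),
  place            = c("1", "DQ", "G"),
  stringsAsFactors = FALSE
)

wc <- parse_weight_class(toy$weight_class_kg)
pl <- parse_place(toy$place)
toy$squat_clean <- drop_failed_attempts(toy$best3squat_kg)

print(cbind(toy, wc, pl))
```

The same helpers could be applied to the real columns of ipf_lifts, for example inside dplyr::mutate(), once the data has been read in.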

Cleaning Script

library(tidyverse)

# Read the full OpenPowerlifting export (downloaded separately)
df <- read_csv(here::here("openpowerlifting-2019-09-20", "openpowerlifting-2019-09-20.csv"))

# Standardise the column names to snake_case
df_clean <- df %>% 
  janitor::clean_names()

# Count rows per federation, largest first
df_clean %>% 
  count(federation, sort = TRUE)

# Keep the core columns, drop rows without a date, and limit to IPF meets
ipf_data <- df_clean %>% 
  select(name:weight_class_kg, starts_with("best"), place, date, federation, meet_name) %>% 
  filter(!is.na(date), federation == "IPF")

# Check the in-memory size of the subset
size_df <- object.size(ipf_data)
print(size_df, units = "MB")

# Write the subset out for this week's folder
ipf_data %>% 
  write_csv(here::here("2019", "2019-10-08", "ipf_lifts.csv"))