Skip to content

Latest commit

 

History

History
979 lines (882 loc) · 15.1 KB

readme.md

File metadata and controls

979 lines (882 loc) · 15.1 KB

Animal crossing villagers on a bridge with Tom Nook

Animal Crossing - New Horizons

The data this week comes from the VillagerDB and Metacritic. VillagerDB brings info about villagers, items, crafting, accessories, including links to their images. Metacritic brings user and critic reviews of the game (scores and raw text).

Per Wikipedia:

Animal Crossing: New Horizons is a 2020 life simulation video game developed and published by Nintendo for the Nintendo Switch. It is the fifth main series title in the Animal Crossing series. New Horizons was released in all regions on March 20, 2020.

New Horizons sees the player assuming the role of a customizable character who moves to a deserted island after purchasing a package from Tom Nook, a tanuki character who has appeared in every entry in the Animal Crossing series. Taking place in real-time, the player can explore the island in a nonlinear fashion, gathering and crafting items, catching insects and fish, and developing the island into a community of anthropomorphic animals.

Animal Crossing as explained by a Polygon opinion piece.

With just a few design twists, the work behind collecting hundreds or even thousands of items over weeks anpd months becomes an exercise of mindfulness, predictability, and agency that many players find soothing instead of annoying.

Games that feature gentle progression give us a sense of progress and achievability, teaching us that putting in a little work consistently while taking things one step at a time can give us some fantastic results. It's a good life lesson, as well as a way to calm yourself and others, and it's all achieved through game design.

Potential Analyses:

  • Reviews: Sentiment analysis, text analysis, scores, date effect
  • Villagers/Items: Gender, species, sayings, personality, price, recipe, what about a star sign based off the birthday column?

Some potential context for user_reviews.tsv from 538 and a point of potential strife via Animal Crossing World, and lastly a spoiler article analyzing the reviews in R by Boon Tan.

PS there is an easter egg somewhere in the readme - something to do with... turnips.

Get the data here

# Get the Data

critic <- readr::read_tsv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-05-05/critic.tsv')
user_reviews <- readr::read_tsv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-05-05/user_reviews.tsv')
items <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-05-05/items.csv')
villagers <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-05-05/villagers.csv')

# Or read in with tidytuesdayR package (https://github.com/thebioengineer/tidytuesdayR)
# PLEASE NOTE TO USE 2020 DATA YOU NEED TO USE tidytuesdayR version ? from GitHub

# Either ISO-8601 date or year/week works!

# Install via devtools::install_github("thebioengineer/tidytuesdayR")

tuesdata <- tidytuesdayR::tt_load('2020-05-05')
tuesdata <- tidytuesdayR::tt_load(2020, week = 19)


critic <- tuesdata$critic

Data Dictionary

critic.tsv

Source

variable class description
grade integer 0-100 score given by the critic (missing for some) where higher score = better.
publication character The source of the review
text character Raw text describing the review.
date double Date review published

user_reviews.tsv

variable class description
grade integer Raw score (0-10) where higher score = better.
user_name character User name of reviewer
text character Raw text of the review
date double Date review published.

villagers.csv

variable class description
row_n integer row_n is a numerical ID
id character id is a short text identifier
name character name of the villager
gender character gender of the villager
species character species of the villager
birthday character birthday of the villager (month-day)
personality character Personality
song character Song associated with the villager
phrase character Catchphraase of the villager
full_id character Full text id of villager
url character Link to image of the villager

items.csv

variable class description
num_id integer Numerical id - note that some items have multiple rows as they have multiple recipe items
id character Character id
name character Name of the item
category character Category of item (eg furniture, clothing, etc
orderable logical Orderable from catalogue
sell_value integer sell value
sell_currency character sell currency
buy_value integer buy value
buy_currency character buy currency
sources character way to acquire or person/place to acquire from
customizable character Is it customizable?
recipe integer Recipe number
recipe_id character Recipe ID
games_id character game id
id_full character Full character id
image_url character Link to image of item

Cleaning Script

library(rvest)
library(tidyverse)
library(jsonlite)
library(listviewer)

url <- "https://github.com/jefflomacy/villagerdb/tree/master/data/items"

all_villagers <- list.files("villagerdb-master/data/villagers")

village_read <- function(file_name){
  fromJSON(here::here("villagerdb-master/data/villagers", file_name))
}

item_read <- function(file_name){
  fromJSON(here::here("villagerdb-master/data/items", file_name))
}


json_list <- all_villagers %>% 
  map(village_read)

listviewer::jsonedit(json_com)

clean_villagers <- json_list %>% 
  enframe() %>% 
  rename(row_n = name) %>% 
  unnest_wider(value) %>% 
  unnest_longer(games) %>% 
  unnest_wider(games) %>% 
  unnest_wider(coffee) %>% 
  select(-...1) %>% 
  rename(coffee_beans = beans, coffee_milk = milk, coffee_sugar = sugar) %>% janitor::clean_names() %>% 
  filter(games_id == "nh") %>% 
  select(row_n, id, name, gender:personality, song, phrase)

final_villagers <- left_join(clean_villagers, villager_db_villagers_images %>% 
  mutate(full_id = id, 
         id = str_remove(id, "villager-")) %>% 
  select(full_id, id, name, url),
  by = c("name", "id"))

final_villagers %>% 
  write_csv("2020/2020-05-05/villagers.csv")

# Read and clean item JSON ------------------------------------------------



all_items <- list.files("villagerdb-master/data/items")

items_list <- all_items %>% 
  map(item_read)

jsonedit(items_list)

items_nh <- items_list %>% 
  enframe() %>% 
  rename(row_n = name) %>% 
  unnest_wider(value) 

items_price <- items_nh %>% 
  unnest_longer(games) %>% 
  unnest_wider(games) %>% 
  unnest_wider(sellPrice) %>% 
  rename(sell_value = value, sell_currency = currency) %>% 
  select(-...1) %>% 
  unnest_wider(buyPrices) %>% 
  select(-...1)

items_long <- items_price %>% 
  unnest_longer(recipe) %>% 
  mutate(customizable = unlist(customizable)) %>% 
  unnest_longer(sources) %>% 
  unnest_longer(interiorThemes)

buy_long <- items_long %>% 
  unnest_wider(currency) %>% 
  rename(buy_price_1 = ...1,
         buy_price_2 = ...2)

buy_df_wide <- buy_long %>% 
  unnest_wider(value) %>% 
  rename(buy_currency_1 = ...1,
         buy_currency_2 = ...2)

currency_2 <- buy_df_wide %>% 
  filter(!is.na(buy_currency_2)) %>% 
  select(-buy_price_1, -buy_currency_1) %>% 
  rename(buy_value = buy_price_2, buy_currency = buy_currency_2)

item_df_final <- buy_df_wide %>% 
  select(-buy_currency_2, -buy_price_2) %>%
  rename(buy_value = buy_price_1, buy_currency = buy_currency_1) %>% 
  bind_rows(currency_2) %>% 
  arrange(row_n, id) %>% 
  rename(buy_cur = buy_currency, buy_val = buy_value) %>% 
  rename(buy_value = buy_cur, buy_currency = buy_val) %>% 
  unnest_longer(rvs) %>% 
  filter(games_id == "nh")

item_df_final

joined_img_df <- item_df_final %>% 
  left_join(all_items, by = c("id", "name")) %>% 
  select(num_id = row_n, id:orderable, sell_value, sell_currency, buy_value, buy_currency, sources, customizable, recipe:id_full, image_url = url, -xSize, -ySize)

joined_img_df %>% 
  write_csv("2020/2020-05-05/items.csv")






























































































































































































































































































BONUS SUPER SECRET DATASET

Keep going if you wanna learn about the turnip market.





























































































































































tornps - image of Tom Nook imitating the stonks meme

WARNING

Please note that this may be bordering on making the game a type of "work" - so feel free to skip if you don't want to think about the game THIS hard.





























































































































































SERIOUSLY ENJOY THE GAME HOWEVER YOU WANT

If you want to continue please see the below for context and some scraping code for an example plot in R.















































































Ok here is the easter egg

This is an example dataset from GameWith of example turnip price graphs and additional info from Polygon. Lastly - The Verge also dives into Turnip price watch groups - links to The Stalk Market.

There appear to be 3-4 types of turnip price trends.

  • Random: Price fluctuates without clear pattern
  • Spike: Price declines for a few days and then jumps up 3x before quickly declining
  • Crash: Price increases early and then crashes
  • Decline: Price constantly decreases across week
# Turnip price graphs examples

raw_turnip <- read_html(turnip)

cooked_turnips <- raw_turnip %>% 
  html_nodes("div.acnh_kabu > table") %>% 
  html_table() %>% 
  bind_rows() %>% 
  as_tibble() %>% 
  rename("time" = ...1) %>% 
  slice(3:10) %>% 
  group_by(time) %>% 
  mutate(week = row_number()) %>% 
  ungroup() %>% 
  pivot_longer(cols = Mon:Sat, names_to = "day", values_to = "turnip_price")


turnip_levels <- cooked_turnips %>% 
  distinct(day) %>% 
  pull()

cooked_turnips %>% 
  mutate(day_time = paste(day, time, sep = "-"),
         day_time = factor(day_time, 
                           levels = c("Mon-AM", "Mon-PM", "Tue-AM","Tue-PM", 
                                      "Wed-AM", "Wed-PM", "Thu-AM", "Thu-PM", 
                                      "Fri-AM", "Fri-PM", "Sat-AM" , "Sat-PM")),
         week = factor(week, labels = c("Random", "Spike", "Crash", "Declining"))
  ) %>% 
  ggplot(aes(x = day_time, y = turnip_price, color = week, group = week)) +
  geom_line()