NewYorkCityCouncil/vacant_storefronts

Storefronts Reported Vacant or Not

Analyzing Storefront Vacancies

New York City can be a challenging place for small businesses to operate. In addition to adhering to occasionally complex regulatory schemes enforced by multiple City agencies, business owners confront hurdles including rising rents, taxation, competition from chain stores and e-commerce retailers, and various zoning restrictions. Over the course of the last year, a slew of beloved local establishments across the city closed amid skyrocketing costs.

Local Law 157 of 2019 seeks to gather data about the state of vacant storefronts to conduct the sort of studies needed to understand the full scope of storefront vacancies in New York City. The bill requires the Department of Finance to collect data and establish a public dataset of commercial properties in the City.

The data team analyzed Local Law 157 data in order to:

  • Assess the severity of the problem
  • Identify vacant storefront hotspots
  • Release recommendations for better reporting

You can read our takeaways and recommendations on the council website. The initial analysis was performed for the NYC Council's 6.9.22 'Oversight - Combatting Commercial Vacancies' hearing.

FAQ

Where can I find the storefront vacancy data?

You can find the Storefronts Reported Vacant or Not dataset on the NYC Open Data portal.

Who maintains the dataset and what is Local Law 157?

In 2019, the New York City Council passed a bill requiring owners of ground-floor and second-floor commercial premises to submit registration statements to the Department of Finance (DOF). From those statements, the DOF collects data on commercial properties, including whether or not those properties are vacant. The DOF is required to publish this data to the NYC Open Data portal every year, or every six months if owners file a supplemental registration statement.

You can find more details about the bill here.

What is the storefront vacancy data about?

As described on the Open Data Portal, “Each row shows a storefront that was registered with the department as of December 31 of the reporting year and legally required updates provided as of June 30 (or date sold if earlier) of the calendar year immediately following the reporting year. Each row contains the property's borough, block and lot number and the storefront's street address (and zip code), either field can be used to search for individual storefronts.”

How often is the data updated?

The data is updated every year. You can report an error or ask questions about the dataset on the Open Data portal by filling out the form here.

How is vacancy defined?

Where a commercial storefront "in a designated class one property that has not been leased to a tenant for any time period during the twelve months preceding the January 1st of the current calendar year" or "if the premises becomes vacant at any time during the period from January 1 through June 30 of the current calendar year or the ownership of the premises has changed during that period." - Local Law 157

What are the important nuances of the dataset?

Data is self-reported: All data is based on the information registered with the DOF. Data about the occupancy or vacancy status for each storefront rely solely on the information provided in the registration.

There are two periods of time for which owners of commercial storefronts must report vacancies:

  1. Owners report either 'yes', the storefront was vacant on 12/31 of the reporting year, or 'no', the storefront was owner-occupied or leased on 12/31 of the reporting year.
  2. After the 12/31 reporting date, owners can report 'yes' if the storefront was vacant on 6/30, or on the date the property was sold if earlier than 6/30. This update is required for Class 2 and Class 4 properties and optional for Class 1 properties. A blank value means no information was reported.
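The two reporting fields can be read together roughly as follows. This is a minimal sketch in Python: the function name and the 'YES'/'NO' string encoding are assumptions for illustration, so check the data dictionary for the actual column names and values.

```python
from typing import Optional


def vacancy_status(vacant_1231: Optional[str], vacant_0630: Optional[str]) -> dict:
    """Interpret the two self-reported vacancy fields for one storefront.

    Assumes each value is 'YES', 'NO', or blank/None (hypothetical encoding).
    """
    def parse(value: Optional[str]) -> Optional[bool]:
        # A blank means nothing was reported; it does not mean "occupied".
        if value is None or value.strip() == "":
            return None
        return value.strip().upper() == "YES"

    return {
        "vacant_on_1231": parse(vacant_1231),
        "vacant_on_0630": parse(vacant_0630),
    }
```

Keeping a blank as `None` rather than `False` avoids counting unreported storefronts as occupied.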

For more dataset definitions, view the data dictionary here.

How clean is the data?

There is missing and incorrectly labeled data.

The Business Type, Vacant 6/30, Sold Date, and Construction Reported columns have missing and/or unreported information. For example, of storefronts reported as vacant on 12/31, nearly all had the primary business activity listed as ‘no business activity identified.’

In addition, for our cluster work, where the unit of analysis is the census tract, the census tract field was difficult to use. The values in this column appear to be census tract labels, but the decimal values are missing. Perhaps the decimals were dropped, or only whole integers were accepted, when the dataset was compiled. The decimals are needed to correctly match a tract to its corresponding spatial boundary.

According to the Census Bureau, “when new census tracts (splits) occur within an established set of census tracts, the Census Bureau recommends retaining the original four-digit census tract number and adding a two-digit decimal suffix. As a result, Census Tract 101 may be split into Census Tracts 101.01, 101.02, and so forth, depending upon how many new census tracts are created” (Geographic Areas Reference Manual, https://www2.census.gov/geo/pdfs/reference/GARM/Ch10GARM.pdf).

Therefore, there are incorrectly labeled tracts in the column, and it is difficult to know which ones were split or have decimal values.
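To illustrate why the dropped decimals matter, the sketch below converts a tract label into the six-digit tract code (four-digit base number plus two-digit suffix) commonly used to join against tract boundaries. The function name is hypothetical and this is not code from the repository.

```python
from decimal import Decimal


def tract_label_to_code(label: str) -> str:
    """Convert a tract label such as '101.01' to a six-digit code.

    The code is the zero-padded four-digit base number followed by the
    two-digit suffix ('00' when the label has no decimal part).
    """
    value = Decimal(label)
    base = int(value)                    # whole-number part, e.g. 101
    suffix = int((value - base) * 100)   # decimal suffix, e.g. 1 for '.01'
    return f"{base:04d}{suffix:02d}"


print(tract_label_to_code("101"))     # 010100
print(tract_label_to_code("101.01"))  # 010101
print(tract_label_to_code("101.02"))  # 010102
```

A record that should read 101.01 or 101.02 but is stored as 101 maps to 010100, so it either joins to the wrong boundary or to none at all.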

How does your methodology deal with missing data?

For our cluster analysis, we grouped census tracts with similar vacancy rates and total numbers of storefronts. Clustering is an analysis that groups data into previously undefined categories based on similarity. We identified areas in NYC that have higher than average vacancy rates as well as a large number of total storefronts.
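The inputs to that grouping can be sketched as a per-tract summary. The Python below is an illustration only, not the clustering step itself: the function names, the input shape, and the `min_storefronts` threshold are all assumptions.

```python
from collections import defaultdict


def tract_summary(rows):
    """Aggregate (tract_id, is_vacant) pairs into per-tract totals and rates."""
    totals = defaultdict(int)
    vacant = defaultdict(int)
    for tract_id, is_vacant in rows:
        totals[tract_id] += 1
        if is_vacant:
            vacant[tract_id] += 1
    return {
        t: {"total": totals[t], "vacancy_rate": vacant[t] / totals[t]}
        for t in totals
    }


def above_average_tracts(summary, min_storefronts):
    """Return tracts whose vacancy rate exceeds the mean tract rate and that
    have at least `min_storefronts` storefronts (hypothetical threshold)."""
    mean_rate = sum(s["vacancy_rate"] for s in summary.values()) / len(summary)
    return sorted(
        t for t, s in summary.items()
        if s["vacancy_rate"] > mean_rate and s["total"] >= min_storefronts
    )
```

For example, with tract A at 3 of 4 vacant, tract B at 0 of 2, and tract C at 1 of 1, `above_average_tracts(summary, 2)` keeps only A: C has a high rate but too few storefronts to be meaningful.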

To get the correct census tract for each storefront in the dataset, we used the ‘latitude’/‘longitude’ and ‘bbl’ columns. We matched BBLs to the PLUTO file (https://www.nyc.gov/assets/planning/download/zip/data-maps/open-data/nyc_pluto_21v2_arc_csv.zip) to get the census tract ID. For storefronts with non-matching BBLs, we geocoded the addresses using tidygeocoder.
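The match-then-fallback logic can be sketched as follows. This is an illustrative Python outline under stated assumptions, not the repository's scripts: the function names, the dictionary keys, and the shape of the PLUTO lookup are hypothetical, and the geocoding step itself is left out.

```python
def make_bbl(borough: int, block: int, lot: int) -> str:
    """Build a 10-digit BBL: 1-digit borough, 5-digit block, 4-digit lot."""
    return f"{borough:d}{block:05d}{lot:04d}"


def assign_tracts(storefronts, pluto_tracts):
    """Split storefronts into those matched by BBL and those needing geocoding.

    `storefronts` is a list of dicts with 'bbl' and 'address' keys, and
    `pluto_tracts` maps BBL -> census tract id (both shapes are assumed).
    """
    matched, to_geocode = {}, []
    for s in storefronts:
        tract = pluto_tracts.get(s["bbl"])
        if tract is not None:
            matched[s["bbl"]] = tract
        else:
            # No PLUTO match: queue the address for geocoding instead.
            to_geocode.append(s["address"])
    return matched, to_geocode
```

Addresses returned in `to_geocode` would then be passed to a geocoder and assigned a tract from the resulting coordinates.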

Can I use the maps & findings in my own work?

Yes! Please attribute credit to the New York City Council Data Team.

Where can I download a high res version of the maps?

You can access and download the interactive version of the neighborhood & cluster map [here] and the static version [here].

Data

Implementation (Methodology & Scripts)

Method 1

Method 2