diff --git a/README.md b/README.md
index 17f0f28..f44b001 100644
--- a/README.md
+++ b/README.md
@@ -1,6 +1,8 @@
 # China Census
 
-Most countries rely on census data to seek a comprehensive probe into population dynamics. However, China has made substantial changes in administrative levels and boundaries during the past decade, while these adjustments were not well documented in the released census dataset, making it difficult to compare data between census years at the city/county level. Additionally, the delays in releasing census data are frequent. To date, the National Bureau of Statistics of China still has not released the detailed city/county-level data from the 2020 Census. To overcome such data challenges, we manually collected and digitized 2,317 gazettes, which include over 10,000 PDFs and HTML web pages, from local official sources for the 2020 Census and merged them with 2010 Census data. To consider intercensal changes in county boundaries, we adjusted the data for counties with changes in boundaries based on the released administrative zone changes and the 2010 Census population, covering more than 770 administrative changes. Finally, we built the population panel data from 2010 to 2020 at both the county and prefectural-city levels, covering 2,666 counties in 356 prefectural-cities.
+## Background
+
+Most countries rely on census data to seek a comprehensive probe into population dynamics. However, the records between census years at the city/county level in China are not comparable as the country has made substantial changes to administrative levels and boundaries over the past decade. Furthermore, these adjustments were not well documented in the released census datasets. Additionally, the delays in releasing census data are frequent. To date, the National Bureau of Statistics of China still has not released the detailed city/county-level data from the 2020 Census. To overcome such data challenges, we manually collected and digitized 2,317 gazettes (~10,000 PDFs and web pages in total) from local official sources for the 2020 Census and merged them with 2010 Census data. To consider intercensal changes in county boundaries, we matched county-level data with 770 released administrative changes, e.g., changes in the status of the administrative level, city/county consolidation or disintegration, and land reconfigurations. Finally, we built a population panel from 2010 to 2020 at both the county and prefectural-city levels, which covers 2,666 counties in 356 prefectural-cities. The dataset has been made publicly available in this repo.
 
 ### Data contributors: