
Welcome to uszipcode Documentation

If you are on www.pypi.org or www.github.com, this is not the complete document. Here is the Complete Document.

If you are looking for technical support, click the badge below to join the Gitter chat room and ask the author your questions.

uszipcode is a powerful, easy-to-use programmable zipcode database for Python. It comes with a rich feature set and an easy-to-use zipcode search engine, and the search behavior is easy to customize.

Data Points

Starting from version 0.2.0, uszipcode uses a more up-to-date database, with a crawler running every week to collect data points from multiple data sources. The API in 0.2.X is NOT COMPATIBLE with 0.1.X; please read the documentation for more information.

Address, Postal

  • zipcode
  • zipcode_type
  • major_city
  • post_office_city
  • common_city_list
  • county
  • state
  • area_code_list

Geography

  • lat
  • lng
  • timezone
  • radius_in_miles
  • land_area_in_sqmi
  • water_area_in_sqmi
  • bounds_west
  • bounds_east
  • bounds_north
  • bounds_south
  • border polygon

Stats and Demographics

  • population
  • population_density
  • population_by_year
  • population_by_age
  • population_by_gender
  • population_by_race
  • head_of_household_by_age
  • families_vs_singles
  • households_with_kids
  • children_by_age

Real Estate and Housing

  • housing_units
  • occupied_housing_units
  • median_home_value
  • median_household_income
  • housing_type
  • year_housing_was_built
  • housing_occupancy
  • vancancy_reason
  • owner_occupied_home_values
  • rental_properties_by_number_of_rooms
  • monthly_rent_including_utilities_studio_apt
  • monthly_rent_including_utilities_1_b
  • monthly_rent_including_utilities_2_b
  • monthly_rent_including_utilities_3plus_b

Employment, Income, Earnings, and Work

  • employment_status
  • average_household_income_over_time
  • household_income
  • annual_individual_earnings
  • sources_of_household_income____percent_of_households_receiving_income
  • sources_of_household_income____average_income_per_household_by_income_source
  • household_investment_income____percent_of_households_receiving_investment_income
  • household_investment_income____average_income_per_household_by_income_source
  • household_retirement_income____percent_of_households_receiving_retirement_incom
  • household_retirement_income____average_income_per_household_by_income_source
  • source_of_earnings
  • means_of_transportation_to_work_for_workers_16_and_over
  • travel_time_to_work_in_minutes

Education

  • educational_attainment_for_population_25_and_over
  • school_enrollment_age_3_to_17

Example Usage

NOTE:

uszipcode has two backend databases, SimpleZipcode and Zipcode. Zipcode has more info, but its database file is 450MB (it takes more time to download). SimpleZipcode doesn't have all the data points listed above, but its database file is smaller (9MB). By default uszipcode uses SimpleZipcode. To use the rich-info Zipcode database instead:

>>> from uszipcode import SearchEngine
>>> search = SearchEngine(simple_zipcode=False)

From 0.2.4, uszipcode lets you choose the directory the database file is downloaded to. By default it is $HOME/.uszipcode, but you can easily change it:

>>> search = SearchEngine(db_file_dir="/tmp")

For example, AWS Lambda doesn't allow downloading files to the $HOME directory, but does allow downloading to the /tmp folder.
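As a minimal sketch of that idea, the helper below picks a writable directory before the search engine is created. The function name `pick_db_dir` is hypothetical and not part of uszipcode; it only uses the standard library and falls back to the system temp directory when the preferred location cannot be created or written to.

```python
import os
import tempfile


def pick_db_dir(preferred=None):
    """Return `preferred` if it can be created and written to, else the temp dir.

    Hypothetical helper, not part of uszipcode.
    """
    if preferred is None:
        # Mirrors the default location described above: $HOME/.uszipcode
        preferred = os.path.join(os.path.expanduser("~"), ".uszipcode")
    try:
        os.makedirs(preferred, exist_ok=True)
        if os.access(preferred, os.W_OK):
            return preferred
    except OSError:
        # Read-only or otherwise unusable location
        pass
    return tempfile.gettempdir()


# With uszipcode installed, the result can be passed straight through:
# search = SearchEngine(db_file_dir=pick_db_dir())
```

On a platform with a read-only home directory this should resolve to the temp folder, which is the same choice as passing `db_file_dir="/tmp"` by hand.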

Examples:

>>> from uszipcode import SearchEngine
>>> search = SearchEngine(simple_zipcode=True) # set simple_zipcode=False to use rich info database
>>> zipcode = search.by_zipcode("10001")
>>> zipcode
SimpleZipcode(zipcode=u'10001', zipcode_type=u'Standard', major_city=u'New York', post_office_city=u'New York, NY', common_city_list=[u'New York'], county=u'New York County', state=u'NY', lat=40.75, lng=-73.99, timezone=u'Eastern', radius_in_miles=0.9090909090909091, area_code_list=[u'718', u'917', u'347', u'646'], population=21102, population_density=33959.0, land_area_in_sqmi=0.62, water_area_in_sqmi=0.0, housing_units=12476, occupied_housing_units=11031, median_home_value=650200, median_household_income=81671, bounds_west=-74.008621, bounds_east=-73.984076, bounds_north=40.759731, bounds_south=40.743451)

>>> zipcode.values() # to list
[u'10001', u'Standard', u'New York', u'New York, NY', [u'New York'], u'New York County', u'NY', 40.75, -73.99, u'Eastern', 0.9090909090909091, [u'718', u'917', u'347', u'646'], 21102, 33959.0, 0.62, 0.0, 12476, 11031, 650200, 81671, -74.008621, -73.984076, 40.759731, 40.743451]

>>> zipcode.to_dict() # to dict
{'housing_units': 12476, 'post_office_city': u'New York, NY', 'bounds_east': -73.984076, 'county': u'New York County', 'population_density': 33959.0, 'radius_in_miles': 0.9090909090909091, 'timezone': u'Eastern', 'lng': -73.99, 'common_city_list': [u'New York'], 'zipcode_type': u'Standard', 'zipcode': u'10001', 'state': u'NY', 'major_city': u'New York', 'population': 21102, 'bounds_west': -74.008621, 'land_area_in_sqmi': 0.62, 'lat': 40.75, 'median_household_income': 81671, 'occupied_housing_units': 11031, 'bounds_north': 40.759731, 'bounds_south': 40.743451, 'area_code_list': [u'718', u'917', u'347', u'646'], 'median_home_value': 650200, 'water_area_in_sqmi': 0.0}

>>> zipcode.to_json() # to json
{
    "zipcode": "10001",
    "zipcode_type": "Standard",
    "major_city": "New York",
    "post_office_city": "New York, NY",
    "common_city_list": [
        "New York"
    ],
    "county": "New York County",
    "state": "NY",
    "lat": 40.75,
    "lng": -73.99,
    "timezone": "Eastern",
    "radius_in_miles": 0.9090909090909091,
    "area_code_list": [
        "718",
        "917",
        "347",
        "646"
    ],
    "population": 21102,
    "population_density": 33959.0,
    "land_area_in_sqmi": 0.62,
    "water_area_in_sqmi": 0.0,
    "housing_units": 12476,
    "occupied_housing_units": 11031,
    "median_home_value": 650200,
    "median_household_income": 81671,
    "bounds_west": -74.008621,
    "bounds_east": -73.984076,
    "bounds_north": 40.759731,
    "bounds_south": 40.743451
}
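If only a few fields are needed downstream, the dictionary form can be trimmed before serializing. The sketch below works on a plain dict copied from the `to_dict()` output above, so it runs without uszipcode; `select_fields` is a hypothetical helper name.

```python
import json

# A subset of the to_dict() output shown above for zipcode 10001.
record = {
    "zipcode": "10001",
    "major_city": "New York",
    "state": "NY",
    "lat": 40.75,
    "lng": -73.99,
    "population": 21102,
}


def select_fields(rec, fields):
    """Keep only the requested keys, in the requested order."""
    return {name: rec[name] for name in fields if name in rec}


payload = json.dumps(select_fields(record, ["zipcode", "state", "population"]), indent=4)
print(payload)
```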

Rich search methods are provided for getting zipcodes the way you want.

>>> from uszipcode import Zipcode
# Search zipcode within 30 miles, ordered from closest to farthest
>>> result = search.by_coordinates(39.122229, -77.133578, radius=30, returns=5)
>>> len(result) # by default 5 results returned
5
>>> for zipcode in result:
...     # do whatever you want...

# Find top 10 population zipcode
>>> result = search.by_population(lower=0, upper=999999999,
... sort_by=Zipcode.population, ascending=False, returns=10)

# Find top 10 largest land area zipcode
>>> res = search.by_landarea(lower=0, upper=999999999,
... sort_by=Zipcode.land_area_in_sqmi, ascending=False, returns=10)
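To make the "within a radius, closest first" idea from the coordinate search above concrete, here is a standalone sketch using the haversine great-circle formula. It is an illustration only and is not claimed to be how uszipcode computes distances; the function names and the "near" and "far" sample points are made up for the example.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lng1, lat2, lng2):
    """Great-circle distance in miles between two (lat, lng) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def nearest(points, lat, lng, radius, returns=5):
    """Return labels of points within `radius` miles, closest first.

    `points` is an iterable of (label, lat, lng) tuples.
    """
    scored = [(haversine_miles(lat, lng, p[1], p[2]), p[0]) for p in points]
    within = sorted(item for item in scored if item[0] <= radius)
    return [label for _, label in within[:returns]]


# Sample centroids: 10001 uses the lat/lng printed earlier; the others are made up.
points = [("10001", 40.75, -73.99), ("near", 39.12, -77.13), ("far", 34.05, -118.24)]
print(nearest(points, 39.122229, -77.133578, radius=30))
```

Only the "near" point falls inside the 30 mile radius here; the New York centroid is roughly 200 miles away.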

Fuzzy city name and state name search does not require you to know the exact spelling of the city or state. It is case and space insensitive and has a high tolerance for typos. This is very helpful if you need to build a web app with it.

# Looking for Chicago and IL, but entered wrong spelling.
>>> res = search.by_city_and_state("cicago", "il", returns=999) # only returns first 999 results
>>> len(res) # 56 zipcodes in Chicago
56
>>> zipcode = res[0]
>>> zipcode.major_city
'Chicago'
>>> zipcode.state_abbr
'IL'
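The general idea behind typo-tolerant matching can be sketched with the standard library's `difflib`. This is not necessarily the matching algorithm uszipcode uses internally; `fuzzy_match` and the short city list are assumptions made for illustration.

```python
import difflib


def fuzzy_match(query, choices, cutoff=0.6):
    """Return the closest choice, ignoring case and spaces, or None if nothing is close."""
    # Normalize both sides so "new  york" and "New York" compare equal.
    normalized = {c.lower().replace(" ", ""): c for c in choices}
    key = query.lower().replace(" ", "")
    hits = difflib.get_close_matches(key, list(normalized), n=1, cutoff=cutoff)
    return normalized[hits[0]] if hits else None


cities = ["Chicago", "New York", "Los Angeles"]
print(fuzzy_match("cicago", cities))  # Chicago
```

Raising `cutoff` makes the match stricter, which may be preferable when the list of candidate names is long and contains near-duplicates.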

You can easily sort your results by any field, or by distance from a coordinate if you query by location.

# Find top 10 population zipcode
>>> res = search.by_population(lower=0, upper=999999999,
... sort_by=Zipcode.population, ascending=False, returns=10)
>>> for zipcode in res:
...     # do whatever you want...

Deploy Uszipcode as a Web Service

I have collected lots of feedback from organization users who want to host the database file privately, and some would rather use a different RDBMS backend such as MySQL or PostgreSQL. From 0.2.6, this is easy.

Host the database file privately:

  1. Download the db file from https://github.com/MacHu-GWU/uszipcode-project/releases/tag/0.2.6-db-file
  2. Re-upload it to your private storage.
  3. Use the download_url parameter:
search = SearchEngine(download_url="your-private-host")

Use different RDBMS backend:

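  1. Let's use MySQL as an example.
  2. Download the db file.
  3. Use DBeaver to connect to both SQLite and MySQL.
  4. Dump the SQLite tables as CSV and load them into MySQL.
  5. Use the engine parameter (the snippet below connects to PostgreSQL; build the engine for your own database in the same way):
from uszipcode import SearchEngine
from uszipcode.pkg.sqlalchemy_mate import engine_creator

# username, password, host, port and database are your own connection settings
engine = engine_creator.create_postgresql(username, password, host, port, database)
search = SearchEngine(engine=engine)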
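The "dump SQLite as CSV" step can also be scripted instead of done by hand in DBeaver. The sketch below uses only the standard library and an in-memory stand-in database; the table name `simple_zipcode` and its two columns are assumptions for illustration, so check the actual schema of the downloaded db file before relying on them.

```python
import csv
import io
import sqlite3


def dump_table_to_csv(conn, table, out):
    """Write every row of `table` to the file-like object `out`, header row first."""
    # Table names cannot be bound as SQL parameters, so quote the identifier.
    quoted = '"' + table.replace('"', '""') + '"'
    cursor = conn.execute("SELECT * FROM " + quoted)
    writer = csv.writer(out)
    writer.writerow([col[0] for col in cursor.description])
    writer.writerows(cursor)


# Stand-in database with a hypothetical two-column table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE simple_zipcode (zipcode TEXT, state TEXT)")
conn.executemany(
    "INSERT INTO simple_zipcode VALUES (?, ?)",
    [("10001", "NY"), ("60601", "IL")],
)

buffer = io.StringIO()
dump_table_to_csv(conn, "simple_zipcode", buffer)
print(buffer.getvalue())
```

For the real file, replace the in-memory connection with `sqlite3.connect(path_to_db_file)` and write to a file opened with `newline=""`.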

Deploy uszipcode as Web API:

  1. Use a VM such as an EC2 machine, and deploy a web API server on it.
  2. (RECOMMENDED) Dump the SQLite database to a relational database such as Postgres or MySQL, and inject the database connection info into your application server.
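A very small web API along these lines can be sketched with the standard library alone. Everything here is an assumption made for illustration: the `/zipcode/<code>` route, the `handle_lookup` and `serve` names, and the `FAKE_DB` dict that stands in for a real lookup such as `search.by_zipcode(code).to_dict()`.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Stand-in for a real lookup; in a deployment this would query uszipcode.
FAKE_DB = {"10001": {"zipcode": "10001", "major_city": "New York", "state": "NY"}}


def handle_lookup(path, lookup=FAKE_DB.get):
    """Map a request path like /zipcode/10001 to a (status, JSON body) pair."""
    parts = path.strip("/").split("/")
    if len(parts) != 2 or parts[0] != "zipcode":
        return 404, json.dumps({"error": "unknown route"})
    record = lookup(parts[1])
    if record is None:
        return 404, json.dumps({"error": "zipcode not found"})
    return 200, json.dumps(record)


class ZipcodeHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, body = handle_lookup(self.path)
        payload = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(host="127.0.0.1", port=8000):
    """Start the server; blocks until interrupted."""
    HTTPServer((host, port), ZipcodeHandler).serve_forever()
```

Keeping the routing in a plain function separate from the handler class makes it easy to test without opening a socket, and `serve()` is only called when you actually want to run the server.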

Install

uszipcode is released on PyPI, so all you need is:

$ pip install uszipcode

To upgrade to the latest version:

$ pip install --upgrade uszipcode