ZackKanter/a16z-library

The Andreessen Horowitz Library

This is the output of an attempt to "open source" the Andreessen Horowitz Library. The inspiration for using high-res photos from a Wired article to catalog all 1,087 books in the lobby of a16z is documented on Medium.

Data Files

The library is available in three files:

  • books.md: This is the original markdown list of books and the source for the two data files that follow. Changes and/or corrections to the book data captured in the photos should be made here.
  • books_ratings.md: This is an extension of the original list. Where possible, the Goodreads API was used to append additional attributes of each book to the original records in the books.md file. The books in the library are sorted from highest to lowest rating. This file can be (re)generated from the books.md file using the book_ratings.py Python script described below.
  • books_ratings.csv: This is a CSV version of the expanded books_ratings.md, suitable for further analysis in Excel, Python or R. This file can be (re)generated from the books.md file using the book_ratings.py Python script described below.

Fields

The books.md file has the following fields:

  • Book ID: Describes the location of the book in the library. The first part is the bookcase number (clockwise around the room), the second is the vertical shelf (‘A’ is the top shelf), and the third is the book’s position on that shelf (‘1’ is the far-left book).
  • Title: The title of the book.
  • Author: The author of the book.
  • Links: Where possible, links to the books on Amazon and/or Goodreads.
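The Book ID scheme described above can be split into its three parts in code. The following is a minimal sketch, not part of the repository: the `ShelfLocation` type and `parse_book_id` function are hypothetical names, and the exact ID layout (digits, one letter, digits, with optional separators such as `-` or `.`) is an assumption that should be checked against the IDs in books.md.

```python
import re
from typing import NamedTuple


class ShelfLocation(NamedTuple):
    bookcase: int   # bookcase number, counted clockwise around the room
    shelf: str      # shelf letter, 'A' is the top shelf
    position: int   # position on the shelf, 1 is the far-left book


# Assumed ID shape: digits, one letter, digits, with optional separators
# such as "-" or "." between the parts (for example "3B12" or "3-B-12").
_BOOK_ID = re.compile(r"^\s*(\d+)[\s.\-]*([A-Za-z])[\s.\-]*(\d+)\s*$")


def parse_book_id(book_id: str) -> ShelfLocation:
    """Split a Book ID into bookcase, shelf and position."""
    match = _BOOK_ID.match(book_id)
    if match is None:
        raise ValueError(f"unrecognized Book ID: {book_id!r}")
    bookcase, shelf, position = match.groups()
    return ShelfLocation(int(bookcase), shelf.upper(), int(position))
```

With a parser like this, records could be grouped by bookcase or sorted top-to-bottom and left-to-right within one.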

books_ratings.md and books_ratings.csv have the following fields in addition to those listed above. Where possible, these fields are derived through the Goodreads API, using the books.md file as input to the book_ratings.py script (explained in more detail below). Not all books have a record in Goodreads, and the Goodreads API does not necessarily return complete data for every book. In both cases, default values are written to the additional fields. Knowing the default values makes it possible to filter such records out of any subsequent analysis.

  • Book Title: This is the title of the book on Goodreads. Should be more or less the same as the Title field from the original list. The default value is an empty character string.
  • Average Rating: This is the average rating given by reviewers on Goodreads for the book. The default value is '0.0'.
  • Ratings Count: This is the number of ratings given for a book. The default value is '0'.
  • Number Pages: This is the number of pages in the book. The default value is '0'.
  • Publication Date: This is the publication date in the format Month/Day/Year. The default value is None/None/None. If only the year of publication is known, then only that portion of the date will be set, e.g. None/None/<Year>.
  • Publisher: The publisher of the book. The default value is an empty character string.
  • ISBN: The ISBN number of the book. The default value is an empty character string.
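The default values above can be used to separate books with real Goodreads data from those without. The sketch below uses only the standard library and assumes that the CSV column names match the field names listed above; the helper names and the sample rows are hypothetical and are not taken from the real file.

```python
import csv
import io
from typing import Optional, Tuple

# Assumed column names, taken from the field names listed above.
RATING_COL = "Average Rating"
COUNT_COL = "Ratings Count"
DATE_COL = "Publication Date"


def has_goodreads_data(row: dict) -> bool:
    """True when the row carries real ratings rather than the defaults."""
    rating = float(row.get(RATING_COL) or 0)
    count = float(row.get(COUNT_COL) or 0)
    return rating > 0 and count > 0


def parse_publication_date(
    text: str,
) -> Tuple[Optional[int], Optional[int], Optional[int]]:
    """Split 'Month/Day/Year' into ints, mapping the literal 'None' to None."""
    parts = text.strip().split("/")
    if len(parts) != 3:
        raise ValueError(f"expected Month/Day/Year, got {text!r}")
    month, day, year = (None if p == "None" else int(p) for p in parts)
    return month, day, year


if __name__ == "__main__":
    # Hypothetical rows for illustration only, not taken from the real file.
    sample = io.StringIO(
        "Title,Average Rating,Ratings Count,Publication Date\n"
        "Example Book A,4.1,1200,None/None/2005\n"
        "Example Book B,0.0,0,None/None/None\n"
    )
    for row in csv.DictReader(sample):
        if has_goodreads_data(row):
            print(row["Title"], parse_publication_date(row[DATE_COL]))
```

To run this against the real data, replace the `io.StringIO` sample with an open handle on books_ratings.csv, after confirming the header names in that file.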

How Additional Attributes Are Generated

The books_ratings.md and books_ratings.csv files are generated from the books.md file by the book_ratings.py Python script. This script uses the Goodreads API to get additional attributes of each book.

Prerequisites

In order to use the script to (re)generate the additional attributes for the books in the library, you will need to do the following:

  • Download and install Python. This script was developed using Python 3.5.1.
  • Install the goodreads Python package. Ensure you follow the instructions for installing the package on the project site. Note that this package is no longer maintained. It has been forked here in case you need another repository from which to download it.
  • Install the pandas Python library. This can be done using pip:
    pip install pandas

More information on installing pandas can be found here.

  • Request an API key from Goodreads here. The script does not require you to authenticate, so the API key and secret are sufficient.

Running the Script

The script can be run from the command line. It expects the books.md file to be in the same directory as the script. The script also expects two command-line arguments: the API key and API secret you requested above:

    python book_ratings.py <api_key> <api_secret>

The script will generate two files: books_ratings.md and books_ratings.csv.
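The two requirements above (exactly two arguments, and books.md next to the script) can be checked before any API call is made. The following is a sketch of such a check, not the logic actually used in book_ratings.py; the `check_invocation` function is a hypothetical helper.

```python
import sys
from pathlib import Path

USAGE = "usage: python book_ratings.py <api_key> <api_secret>"


def check_invocation(argv, script_dir):
    """Return (api_key, api_secret), or exit with a message if a
    precondition described above is not met."""
    # argv[0] is the script name, so two arguments means three entries.
    if len(argv) != 3:
        raise SystemExit(USAGE)
    if not (Path(script_dir) / "books.md").is_file():
        raise SystemExit("books.md must be in the same directory as the script")
    return argv[1], argv[2]


if __name__ == "__main__":
    key, secret = check_invocation(sys.argv, Path(sys.argv[0]).resolve().parent)
    print("arguments look fine; books.md found")
```

Failing early like this avoids spending Goodreads API requests on a run that cannot produce the output files.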
