Bus Timetables scraped from New Lantao Bus' website

pkboy/nlb-scrape

nlb-scrape

New Lantao Bus Company provides an API through data.gov.hk. The API exposes a route list, the stops for each route, and ETAs for each stop, but not a timetable for each route. The timetables are available on the company's website, so I created this project to scrape them from there.

The route list is obtained from the API. The scraper iterates through the route IDs, visits the relevant website URL for each route, and scrapes the timetable from that page.
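The table-scraping step can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the sample HTML, the assumption that each timetable is a `<table>` preceded by an `<h4>` heading holding the service-days string, and the name `parse_timetables` are all hypothetical. The real page markup may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for a route's timetable page.
SAMPLE_HTML = """
<h4>Monday to Friday</h4>
<table>
  <tr><td>06:00</td><td>06:30</td></tr>
  <tr><td>07:00</td><td>07:30</td></tr>
</table>
<h4>Saturday</h4>
<table>
  <tr><td>08:00</td></tr>
</table>
"""


def parse_timetables(html):
    """Return one dict per table: its heading text and its cell texts."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for table in soup.find_all("table"):
        # Assumption: the nearest preceding <h4> is the table's heading.
        heading = table.find_previous("h4")
        service_days = heading.get_text(strip=True) if heading else ""
        departures = [td.get_text(strip=True) for td in table.find_all("td")]
        results.append(
            {"serviceDaysString": service_days, "departures": departures}
        )
    return results


timetables = parse_timetables(SAMPLE_HTML)
```

In a real run the HTML would come from the route's page on the website instead of `SAMPLE_HTML`.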

Usage

Install the required packages: pip install beautifulsoup4

Run

python daysDictionary.py

daysDictionary.py builds the daysDictionary JSON file.
Each entry in the array has a serviceDaysString, which is taken from the heading of each table in the HTML.
You're free to define the other key-value pairs for each string however you like. I start the week on Monday, so for a service string of:

"Monday to Friday", I define dayStart as 0 and dayEnd as 4.
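As a rough sketch of how such entries might look and be used: only the keys serviceDaysString, dayStart and dayEnd come from the description above. The "Saturday" entry and the helper `runs_on` are hypothetical, and the real file may contain other keys. With Monday as 0, the numbering lines up with Python's `date.weekday()`.

```python
import json
from datetime import date

# Hypothetical entries; the "Saturday" values are an assumption.
days_dictionary = [
    {"serviceDaysString": "Monday to Friday", "dayStart": 0, "dayEnd": 4},
    {"serviceDaysString": "Saturday", "dayStart": 5, "dayEnd": 5},
]


def runs_on(entry, weekday):
    """True if weekday (Monday = 0) falls within the entry's day range."""
    return entry["dayStart"] <= weekday <= entry["dayEnd"]


# Serialise the entries as a JSON array of objects.
as_json = json.dumps(days_dictionary, indent=2)

# 2024-01-03 was a Wednesday, so date.weekday() returns 2.
wednesday = date(2024, 1, 3).weekday()
weekday_entries = [
    e["serviceDaysString"] for e in days_dictionary if runs_on(e, wednesday)
]
```

Service strings that do not map to a contiguous range of days (public holidays, for instance) would need extra keys, which this sketch does not attempt.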

python scrapeNLB.py

JSON

The JSON structure matches the one used in my https://github.com/pkboy/sunferryhktimetable project, which converts that project's data from CSV to JSON.

Issues

  • Timetables where stop information is not in a standard format will produce erroneous data, for example circular routes where departures are presented as a time period with the route's headway time.
  • Data is scraped in English but can be switched to Traditional or Simplified Chinese; see the comments in the code.
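One possible way to guard against the first issue is to separate cells that look like a plain HH:MM time from everything else, so non-standard entries can be reviewed by hand. This is only a sketch and not something the scraper is known to do: the sample cell text, the regular expression, and the name `split_cells` are assumptions.

```python
import re

# Matches a single 24-hour time such as "06:00" or "23:15" and nothing else.
TIME_RE = re.compile(r"^([01]?\d|2[0-3]):[0-5]\d$")


def split_cells(cells):
    """Split cell texts into plain times and everything else."""
    standard = [c.strip() for c in cells if TIME_RE.match(c.strip())]
    nonstandard = [c.strip() for c in cells if not TIME_RE.match(c.strip())]
    return standard, nonstandard


# Hypothetical cell texts; the headway wording on the real site may differ.
cells = ["06:00", "07:30 - 19:30 (every 20 mins)", " 23:15 "]
standard, nonstandard = split_cells(cells)
```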

Contributors

pkboy
