GLAM Logit: A random-utility-consistent method to estimate non-parametric coefficients from ubiquitous datasets
Group Level Agent-based Mixed (GLAM) Logit is a variation of the mixed logit (MXL) model that provides deterministic, agent-specific estimates which can be efficiently integrated into optimization models. It extends the agent-based mixed logit (AMXL) model of Ren and Chow (2022). Consider within each agent
where
where
where
For more details, please refer to the following paper:
The NYU NON-COMMERCIAL RESEARCH LICENSE is applied to EVQUARIUM (attached in the repository). Please contact Joseph Chow ([email protected]) for commercial use.
For questions about the code, please contact: Xiyuan Ren ([email protected]).
We built a simple example of mode choice to illustrate how the GLAM logit works. In this example, each agent refers to trips belonging to an OD pair. Only two modes, taxi and transit, are considered for simplicity. Each row of the sample data contains the ID of the agent, travel time and cost of taxi, travel time and cost of transit, and mode share of the two modes.
Note that we added two “fake” agents (agents 7 and 8) to the dataset. Their mode shares are unreasonable because the mode with the longer travel time and the higher cost has the higher market share. The sample data contain aggregate-level mode choice information for 8 agents:
The derived utilities of the two modes are defined as:
where
We ran GLAM Logit with latent class
The estimated market shares E_Taxi (%) and E_Transit (%) are quite close to the input data. Moreover, the results reflect diverse tastes at the agent level through the three latent classes: (1) agents 1-3 have negative
For the detailed code, see Illustrative_sample.py.
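To make the link between utilities and market shares concrete, the sketch below computes taxi and transit shares for one agent under a binary logit. It assumes a linear-in-parameters utility with a time coefficient, a cost coefficient, and a taxi-specific constant. The function name `binary_logit_shares`, the coefficient values, and the trip attributes are all hypothetical; none of them are taken from Illustrative_sample.py or from the sample data.

```python
import math

def binary_logit_shares(theta_time, theta_cost, asc_taxi,
                        taxi_time, taxi_cost, transit_time, transit_cost):
    """Return (taxi_share, transit_share) under a binary logit.

    Assumed utility form (hypothetical, for illustration only):
        V_taxi    = theta_time * taxi_time    + theta_cost * taxi_cost + asc_taxi
        V_transit = theta_time * transit_time + theta_cost * transit_cost
    """
    v_taxi = theta_time * taxi_time + theta_cost * taxi_cost + asc_taxi
    v_transit = theta_time * transit_time + theta_cost * transit_cost
    # With two options, the logit share is the logistic function
    # of the utility difference.
    diff = v_taxi - v_transit
    taxi_share = 1.0 / (1.0 + math.exp(-diff))
    return taxi_share, 1.0 - taxi_share

# Made-up agent: taxi is faster but more expensive than transit.
taxi, transit = binary_logit_shares(
    theta_time=-2.0, theta_cost=-0.1, asc_taxi=0.0,
    taxi_time=0.5, taxi_cost=20.0, transit_time=1.0, transit_cost=3.0)
print(f"taxi {taxi:.3f}, transit {transit:.3f}")
```

Under this assumed form, with negative time and cost coefficients and no mode constant, a mode that is both slower and more expensive always receives less than half of the market. This is why the observed shares of agents 7 and 8 cannot be reproduced with negative coefficients.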
In a real case study, a New York statewide mode choice model was developed using GLAM Logit. Synthetic trips on a typical weekday were used to calibrate the model. We considered six modes enabled by Replica’s datasets: private auto, public transit, on-demand auto, biking, walking, and carpool. The GLAM Logit model with 120,740 agents took 2.79 hours to converge at the 26th iteration, with a rho value of 0.6197.
Coefficient distribution:
Value of time (VOT) distribution in NY state and NYC:
Deliverables:
NYC and NYS mode choice coefficients, as well as census block group-level mode shares estimated by GLAM Logit, are available on the Zenodo platform.
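Given agent-level time and cost coefficients such as those described above, a value of time can be derived per agent. The sketch below assumes the conventional definition, the ratio of the time coefficient to the cost coefficient. Whether the repository computes VOT exactly this way is not stated here. The function name `value_of_time`, the agent labels, and the coefficient values are hypothetical and are not drawn from the published files.

```python
def value_of_time(theta_time, theta_cost):
    """Value of time as the ratio of the time and cost coefficients.

    Units follow the inputs: if time is in hours and cost in dollars,
    the result is in dollars per hour. Returns None when the cost
    coefficient is zero, because the ratio is undefined.
    """
    if theta_cost == 0:
        return None
    return theta_time / theta_cost

# Hypothetical agent-level coefficients (not from the Zenodo files).
agents = {
    "agent_a": {"theta_time": -1.8, "theta_cost": -0.09},
    "agent_b": {"theta_time": -0.6, "theta_cost": -0.05},
}
vot = {name: value_of_time(c["theta_time"], c["theta_cost"])
       for name, c in agents.items()}
for name, v in vot.items():
    print(name, round(v, 2))
```

Because each agent has its own coefficients, the resulting VOT values form an empirical distribution across agents rather than a single population-wide number.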
To run GLAM Logit model:
Please follow these steps: 1) define the utility function; 2) prepare the group-level choice observation datasets (see OD_level_RP_processing.py); 3) run the inverse optimization algorithm for a single agent (see Group_level_IO.py); and 4) run the whole estimation algorithm (see Model_building.py).
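To give a feel for what a single-agent inverse optimization step can look like, the sketch below finds the coefficients closest to a prior that exactly reproduce one agent's observed two-mode shares. This is a simplified stand-in and not the algorithm in Group_level_IO.py, whose formulation (tolerances, bounds, solver) may differ. The function name `fit_agent_coefficients`, the prior, and the attribute values are all hypothetical.

```python
import math

def fit_agent_coefficients(prior, attr_diff, share_a, share_b):
    """Project `prior` onto the coefficients that reproduce the observed shares.

    Solves:  minimize ||theta - prior||^2
             subject to  attr_diff . theta = ln(share_a / share_b)
    A single linear equality constraint has a closed-form projection.
    """
    target = math.log(share_a / share_b)  # observed log-odds of mode a vs. mode b
    current = sum(a * p for a, p in zip(attr_diff, prior))
    norm_sq = sum(a * a for a in attr_diff)
    if norm_sq == 0:
        raise ValueError("attribute differences are all zero; constraint is degenerate")
    step = (target - current) / norm_sq
    return [p + step * a for p, a in zip(prior, attr_diff)]

# Hypothetical agent: attribute differences (taxi minus transit)
# in travel time [hours] and cost [dollars].
prior = [-1.0, -0.1]
attr_diff = [-0.5, 17.0]
theta = fit_agent_coefficients(prior, attr_diff, share_a=0.3, share_b=0.7)

# Check: the fitted coefficients reproduce the observed taxi share.
diff = sum(a * t for a, t in zip(attr_diff, theta))
print(theta, 1.0 / (1.0 + math.exp(-diff)))
```

The closed form works here only because the example has two modes and therefore one constraint. With more modes or with inequality tolerances, a quadratic programming solver would be the natural replacement.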
For further questions, please contact: Xiyuan Ren ([email protected]).
Compared with conventional logit models (e.g., MNL, NL, MXL), the significance of the GLAM Logit model is threefold:
- GLAM Logit takes OD-level (instead of individual-level) data as inputs, which makes it efficient in dealing with ubiquitous datasets containing millions of observations.
- Preference heterogeneities are based on non-parametric aggregation of coefficients per agent instead of having to assume a distributional fit. The spatial distribution of agent-level coefficients is infeasible for conventional logit models to capture.
- GLAM Logit can be directly integrated into optimization models as constraints instead of dealing with simulation-based approaches required by mixed logit (MXL) models. For instance, multi-service region assortment can be formulated as a quadratic programming (QP) problem.
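The first point above relies on collapsing individual trip records into one observation per OD pair. The sketch below shows one way such an aggregation could be done using only the Python standard library. It is an illustration under assumed inputs, not the logic in OD_level_RP_processing.py; the function name `od_mode_shares`, the tuple layout, and the trip records are hypothetical.

```python
from collections import Counter, defaultdict

def od_mode_shares(trips):
    """Aggregate individual trips into mode shares per OD pair.

    `trips` is an iterable of (origin, destination, mode) tuples.
    Returns {(origin, destination): {mode: share}}.
    """
    counts = defaultdict(Counter)
    for origin, destination, mode in trips:
        counts[(origin, destination)][mode] += 1
    shares = {}
    for od, mode_counts in counts.items():
        total = sum(mode_counts.values())
        shares[od] = {mode: n / total for mode, n in mode_counts.items()}
    return shares

# Made-up trip records for illustration.
trips = [
    ("A", "B", "taxi"), ("A", "B", "transit"), ("A", "B", "transit"),
    ("A", "C", "taxi"),
]
shares = od_mode_shares(trips)
print(shares[("A", "B")])
```

After aggregation, the number of observations the estimator sees equals the number of OD pairs rather than the number of trips, which is where the efficiency gain on very large datasets comes from.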