Smart Public Transit - Transit Hub
This project addresses urban transportation and congestion by building analytical tools that help customers and transit agencies reduce uncertainty and optimize transit operations. We address this problem on three fronts: data analytics; planning and analysis tools for understanding and projecting the impact of transportation choices; and scalable data stores that enable cities to operate their own data lakes and analytics engines.
We focus on data analytics to understand bottlenecks and improve operational reliability. We start by collecting multimodal data about transit operations, traffic, public events, and congestion from the cities of Nashville and Chattanooga. We then analyze this data to understand the causes of transit delays and provide tools that help the community and transit operators deal with both long-term planning and short-term delays.
Some results from this work are below.
Deep Learning Based Anomaly Detection
Non-recurring traffic congestion is caused by temporary disruptions such as accidents, sports games, and adverse weather. We train a multi-layered deep neural network on historical traffic speed, jam factor (a traffic congestion indicator), and event data collected over a year in Nashville, TN. The traffic dataset contains over 900 million records. The trained network is then used to classify real-time data and identify anomalous operations. Evaluated against traditional statistical and machine learning approaches, our model reaches an accuracy of 98.73 percent when identifying traffic congestion caused by football games. Our approach first encodes the traffic across a region as an image. The image data from different timestamps is then fused with event- and time-related data. Next, a crossover operator is used as a data augmentation method to generate training datasets with more balanced classes. Finally, we use receiver operating characteristic (ROC) analysis to tune the sensitivity of the classifier.
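The final ROC step can be illustrated with a small sketch. It is not the project's code: the function names are hypothetical, and choosing the threshold that maximizes Youden's J statistic (true positive rate minus false positive rate) is an assumption, since the text above does not say which criterion was used.

```python
from typing import List, Sequence, Tuple


def roc_points(scores: Sequence[float], labels: Sequence[int]) -> List[Tuple[float, float, float]]:
    """Return (threshold, false positive rate, true positive rate) triples.

    Each distinct classifier score is tried as a decision threshold;
    a sample is flagged as anomalous when its score is >= the threshold.
    Assumes both classes are present in `labels`.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((threshold, fp / negatives, tp / positives))
    return points


def pick_threshold(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Pick the threshold with the largest TPR - FPR (Youden's J, an assumed criterion)."""
    best = max(roc_points(scores, labels), key=lambda p: p[2] - p[1])
    return best[0]


# Toy scores: 1 = congestion caused by an event, 0 = normal traffic.
scores = [0.1, 0.2, 0.7, 0.9]
labels = [0, 0, 1, 1]
threshold = pick_threshold(scores, labels)
```

Lowering the threshold makes the classifier more sensitive at the cost of more false alarms; the ROC curve makes that trade-off explicit.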
Surrogate Data Sensing
Data generated by GPS-equipped transit vehicles can provide surrogate sensing of traffic conditions in the city. We propose a multivariate predictive multi-model approach called SpeedPro. It identifies clusters of similar operation from historical data that includes the real-time position of a probe vehicle, weather, and driver identifier, and then employs different models to estimate the traffic speed in real time as a function of current weather and transit vehicle speed. The work was published at the SmartSys 2017 workshop. See the following slides: SpeedPro: A Predictive Multi-modal Approach for Urban Traffic Speed Estimation
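The cluster-then-model idea can be sketched as follows. This is only an illustration under strong assumptions: the `ClusterModel` type, the two features (bus speed and precipitation), the nearest-centroid assignment, and the linear per-cluster models are all hypothetical stand-ins, not SpeedPro's actual design.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass
class ClusterModel:
    """One operating regime: a centroid plus a linear model fitted for that regime."""
    centroid: Tuple[float, ...]   # e.g. (bus_speed_kmh, precipitation_mm_per_h)
    weights: Tuple[float, ...]    # linear coefficients, one per feature
    bias: float


def nearest_cluster(models: Sequence[ClusterModel], features: Sequence[float]) -> ClusterModel:
    """Assign the observation to the cluster with the closest centroid (squared Euclidean distance)."""
    return min(
        models,
        key=lambda m: sum((c - f) ** 2 for c, f in zip(m.centroid, features)),
    )


def estimate_traffic_speed(models: Sequence[ClusterModel], features: Sequence[float]) -> float:
    """Estimate traffic speed with the model that belongs to the matched cluster."""
    model = nearest_cluster(models, features)
    return model.bias + sum(w * f for w, f in zip(model.weights, features))


# Hypothetical regimes: free-flowing dry conditions vs. slow rainy conditions.
models = [
    ClusterModel(centroid=(40.0, 0.0), weights=(1.2, -0.5), bias=5.0),
    ClusterModel(centroid=(15.0, 5.0), weights=(1.5, -1.0), bias=2.0),
]
speed = estimate_traffic_speed(models, (12.0, 4.0))  # matched to the rainy regime
```

Using a separate model per cluster lets each regime have its own relationship between bus speed and general traffic speed, instead of forcing one global fit.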
Understanding Delays and Optimizing Schedule
The on-time arrival performance of buses at stops is a critical metric for both riders and city planners when evaluating the reliability of a transit system. Identifying the bottlenecks in the transit network, where abnormal delays are frequent, is the first step toward schedule optimization. We built a prescriptive analytics mechanism that identifies historical bus delay patterns and locates the bottlenecks in the transit network by measuring transit performance.
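As a rough illustration of measuring on-time performance per stop, consider the sketch below. The record layout, the on-time window (up to 60 seconds early and 300 seconds late), and the 0.8 bottleneck cutoff are assumptions chosen for the example, not values taken from the project.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# A record is (stop_id, delay in seconds); negative delay means an early arrival.
Record = Tuple[str, float]


def on_time_rate_by_stop(
    records: Iterable[Record], early_s: float = -60.0, late_s: float = 300.0
) -> Dict[str, float]:
    """Fraction of arrivals at each stop that fall inside the on-time window."""
    totals: Dict[str, int] = defaultdict(int)
    on_time: Dict[str, int] = defaultdict(int)
    for stop_id, delay_s in records:
        totals[stop_id] += 1
        if early_s <= delay_s <= late_s:
            on_time[stop_id] += 1
    return {stop: on_time[stop] / totals[stop] for stop in totals}


def bottlenecks(records: Iterable[Record], max_rate: float = 0.8) -> List[str]:
    """Stops whose on-time rate is below `max_rate`, worst first."""
    rates = on_time_rate_by_stop(records)
    return sorted((s for s, r in rates.items() if r < max_rate), key=lambda s: rates[s])


records = [("A", 30), ("A", 400), ("B", 10), ("B", 20), ("C", 900), ("C", 700)]
worst_stops = bottlenecks(records)
```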
Transit performance is affected by various factors, such as travel demand, traffic conditions, and weather. These stochastic factors make it very difficult to optimize timetables to match actual transit operation. To better understand the factors affecting delays, we built a system called Delay Radar. It uses multivariate linear regression and random forest models to analyze traffic and weather data and predict transit travel time. Further, we created a robust delay prediction algorithm that uses multiple data sources and combines clustering analysis with Kalman filters. Additionally, we developed a novel route segmentation mechanism that handles the issue of data sparsity. You can read more about it in these slides.
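To give a sense of how a Kalman filter can track delay, here is a minimal one-dimensional sketch. It assumes a random-walk model in which the true delay on a segment drifts slowly between observations; the function names and noise variances are hypothetical, and the project's algorithm (which also involves clustering and multiple data sources) is not reproduced here.

```python
from typing import Iterable, List, Tuple


def kalman_step(
    estimate: float,
    variance: float,
    measurement: float,
    process_var: float,
    measurement_var: float,
) -> Tuple[float, float]:
    """One predict/update cycle of a scalar Kalman filter with a random-walk state model."""
    # Predict: the state is assumed unchanged, but uncertainty grows by the process noise.
    prior_var = variance + process_var
    # Update: blend the prediction and the new measurement according to their uncertainties.
    gain = prior_var / (prior_var + measurement_var)
    new_estimate = estimate + gain * (measurement - estimate)
    new_variance = (1.0 - gain) * prior_var
    return new_estimate, new_variance


def filter_delays(
    measurements: Iterable[float],
    initial_estimate: float = 0.0,
    initial_variance: float = 1.0,
    process_var: float = 25.0,       # assumed drift of the true delay (s^2)
    measurement_var: float = 100.0,  # assumed noise of one observed delay (s^2)
) -> List[float]:
    """Return the filtered delay estimate after each observed delay (in seconds)."""
    estimate, variance = initial_estimate, initial_variance
    estimates = []
    for z in measurements:
        estimate, variance = kalman_step(estimate, variance, z, process_var, measurement_var)
        estimates.append(estimate)
    return estimates


smoothed = filter_delays([120.0, 90.0, 150.0, 130.0])
```

A larger `measurement_var` makes the filter trust individual observations less, which smooths out noisy single-trip delays.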
To understand the impact of events, we built multi-task deep neural networks that use contextual features (e.g., scheduled sports games and forecasted weather conditions) to make context-aware predictions of the expected travel delay, as well as the likelihood of accidents on the bus routes. Compared to existing models that rely solely on static and historical data, using scheduled and predicted contextual information can provide a better estimate of transit system performance. Furthermore, the multi-task deep neural network architecture allows faster training and prediction and reduces the possibility of overfitting, which improves the prediction accuracy. To learn more, read the SmartComp2018 paper and the following poster.
Finally, we use the long-term delay models to formulate an optimization problem for improving the fixed-line transit schedule. See the following slides.
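The multi-task structure can be pictured as one shared layer feeding two output heads. The forward pass below is a toy sketch in plain Python: the layer sizes, weights, and feature meanings are invented for illustration, and the published architecture is not reproduced.

```python
import math
from typing import List, Sequence, Tuple


def dense(weights: Sequence[Sequence[float]], bias: Sequence[float], inputs: Sequence[float]) -> List[float]:
    """Fully connected layer: one output per weight row."""
    return [sum(w * x for w, x in zip(row, inputs)) + b for row, b in zip(weights, bias)]


def relu(values: Sequence[float]) -> List[float]:
    return [max(0.0, v) for v in values]


def multi_task_forward(
    features: Sequence[float],
    shared_w: Sequence[Sequence[float]],
    shared_b: Sequence[float],
    delay_w: Sequence[float],
    delay_b: float,
    accident_w: Sequence[float],
    accident_b: float,
) -> Tuple[float, float]:
    """Return (expected delay, accident probability) from a shared hidden representation."""
    # Shared layer: both tasks learn from the same contextual features.
    hidden = relu(dense(shared_w, shared_b, features))
    # Task head 1: regression output for travel delay.
    delay = dense([delay_w], [delay_b], hidden)[0]
    # Task head 2: sigmoid output for accident likelihood.
    logit = dense([accident_w], [accident_b], hidden)[0]
    accident_prob = 1.0 / (1.0 + math.exp(-logit))
    return delay, accident_prob


# Hypothetical features: [game_scheduled, rain_forecast]
delay, accident_prob = multi_task_forward(
    [1.0, 0.0],
    shared_w=[[2.0, 1.0], [-1.0, 3.0]],
    shared_b=[0.0, 0.0],
    delay_w=[1.5, 0.5],
    delay_b=1.0,
    accident_w=[0.0, 1.0],
    accident_b=0.0,
)
```

Because the hidden layer is shared, a single pass produces both predictions, and during training both tasks constrain the same weights.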
As part of the work to improve the efficiency of public transit and urban transportation in general, we also build solutions that educate the community on the benefits of public transit. To this end, we built a simulation framework that evaluates the effect of personal transportation choices and helps cities evaluate the impact of incentive policies in nudging commuters towards alternate modes of travel, such as bike and car-share options. We leverage MATSim, an agent-based simulation framework, to integrate agent preference models that capture the altruistic behavior of an agent in addition to their disutility proportional to travel time and cost. These models are learned with a data-driven approach and can be used to evaluate the sensitivity of an agent to system-level disutility and to monetary incentives given, e.g., by the transportation authority. This framework provides a standardized environment for evaluating the effectiveness of any particular incentive policy of a city in nudging its residents towards alternate modes of transportation. We show the effectiveness of the approach and provide analysis using a case study from the Metropolitan Nashville area.
Read more about it in this paper.
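The agent preference model described above can be illustrated with a small sketch. The linear form, the parameter names, and all numbers are assumptions made for the example; this is not the MATSim scoring API or the learned model from the paper.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Mode:
    travel_time_min: float
    cost_usd: float
    system_disutility: float  # burden the trip places on everyone else (e.g. congestion)


def disutility(mode: Mode, beta_time: float, beta_cost: float, altruism: float, incentive_usd: float = 0.0) -> float:
    """Personal disutility (time and net cost) plus an altruistic term weighted by `altruism`."""
    personal = beta_time * mode.travel_time_min + beta_cost * (mode.cost_usd - incentive_usd)
    return personal + altruism * mode.system_disutility


def choose_mode(
    modes: Dict[str, Mode],
    beta_time: float,
    beta_cost: float,
    altruism: float,
    incentives: Optional[Dict[str, float]] = None,
) -> str:
    """Pick the mode with the lowest total disutility under the given incentive policy."""
    incentives = incentives or {}
    return min(
        modes,
        key=lambda name: disutility(modes[name], beta_time, beta_cost, altruism, incentives.get(name, 0.0)),
    )


modes = {
    "car": Mode(travel_time_min=20.0, cost_usd=5.0, system_disutility=10.0),
    "bus": Mode(travel_time_min=35.0, cost_usd=2.0, system_disutility=2.0),
}
selfish_choice = choose_mode(modes, beta_time=1.0, beta_cost=2.0, altruism=0.0)
nudged_choice = choose_mode(modes, beta_time=1.0, beta_cost=2.0, altruism=0.0, incentives={"bus": 5.0})
```

Sweeping `altruism` or the incentive amount shows how sensitive a given agent's choice is to system-level disutility and to monetary incentives.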
Scalable Data Stores
We are also building resilient data stores. Read more about it on the data store project page.