Predicting Bus Travel Time with Hybrid Incomplete Data – A Deep Learning Approach

The authors propose a hybrid deep learning model that combines a Long Short-Term Memory (LSTM) network with Genetic Algorithm (GA) optimization. LSTMs capture temporal dependencies and sequential patterns, which makes them well suited to the time-varying nature of transit data. Their performance, however, depends on hyperparameters such as the learning rate, the number of neurons, and the number of training epochs. The GA automates this tuning by evolving candidate parameter sets and scoring each one with a fitness metric such as Root Mean Square Error (RMSE), until a well-performing configuration is found.

This combination allows the framework to handle hybrid, partially missing GPS and ESC data more adaptively than deterministic grid search or manual tuning approaches. The result is a system that is both robust to missing information and computationally efficient, capable of scaling across datasets from different bus routes or cities.
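To make the GA-driven tuning described above more concrete, here is a minimal sketch in Python. It is not the authors' implementation: the search ranges, population size, operators (tournament selection, uniform crossover, random-reset mutation, two-individual elitism), and the `surrogate_rmse` function are all illustrative assumptions. In a real setup, the fitness function would train an LSTM with the candidate hyperparameters and return its validation RMSE.

```python
import math
import random

# Hypothetical search space; these ranges are assumptions, not values from the paper.
SEARCH_SPACE = {
    "learning_rate": (1e-4, 1e-1),  # continuous gene
    "neurons": (8, 128),            # integer gene
    "epochs": (10, 200),            # integer gene
}
INT_GENES = {"neurons", "epochs"}


def rmse(y_true, y_pred):
    """Root Mean Square Error between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))


def random_gene(name, rng):
    lo, hi = SEARCH_SPACE[name]
    return rng.randint(lo, hi) if name in INT_GENES else rng.uniform(lo, hi)


def random_individual(rng):
    return {name: random_gene(name, rng) for name in SEARCH_SPACE}


def crossover(a, b, rng):
    # Uniform crossover: each gene is copied from one parent at random.
    return {k: (a[k] if rng.random() < 0.5 else b[k]) for k in SEARCH_SPACE}


def mutate(ind, rng, rate=0.2):
    # Random-reset mutation: occasionally resample a gene from its range.
    return {k: (random_gene(k, rng) if rng.random() < rate else v) for k, v in ind.items()}


def tournament(scored, rng, k=3):
    # Pick k candidates at random and keep the one with the lowest error.
    return min(rng.sample(scored, k), key=lambda s: s[0])[1]


def run_ga(fitness, pop_size=20, generations=15, seed=0):
    """Minimize fitness(individual); returns (best_score, best_individual)."""
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(pop_size)]
    best = None
    for _ in range(generations):
        scored = sorted(((fitness(ind), ind) for ind in population), key=lambda s: s[0])
        if best is None or scored[0][0] < best[0]:
            best = scored[0]
        next_pop = [scored[0][1], scored[1][1]]  # elitism: carry over the top two
        while len(next_pop) < pop_size:
            child = crossover(tournament(scored, rng), tournament(scored, rng), rng)
            next_pop.append(mutate(child, rng))
        population = next_pop
    return best


def surrogate_rmse(ind):
    # ASSUMPTION: a cheap stand-in for "train the LSTM, return validation RMSE".
    # Its minimum is placed arbitrarily at lr=0.01, neurons=64, epochs=100.
    return math.sqrt(
        (math.log10(ind["learning_rate"]) + 2.0) ** 2
        + ((ind["neurons"] - 64) / 64) ** 2
        + ((ind["epochs"] - 100) / 100) ** 2
    )


best_score, best_params = run_ga(surrogate_rmse)
print(best_score, best_params)
```

Swapping `surrogate_rmse` for a function that trains a model and returns `rmse(y_val, predictions)` would turn this into an actual tuning loop, at the cost of one training run per fitness evaluation.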

The study tested its approach using both simulated and real-world data from a Chinese urban bus corridor. The data included GPS readings (timestamp, latitude, longitude, and speed at 15-second intervals) and ESC records capturing electronic fare transactions. Because both streams are inevitably incomplete, the authors constructed a hybrid dataset integrating the two sources. Combining heterogeneous streams lets the model infer missing values more effectively, since GPS and ESC capture complementary aspects of bus movement: spatial trajectories and boarding events, respectively.
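As a rough illustration of what building such a hybrid dataset can involve, the sketch below fills gaps in a GPS speed series by linear interpolation and attaches each ESC transaction to its nearest GPS fix. This summary does not describe the paper's actual fusion procedure, so the record layout (`t` in seconds, `speed`), the 15-second matching tolerance, and the function names `fill_speed_gaps` and `attach_esc` are assumptions made for the example. Latitude and longitude are omitted for brevity.

```python
from bisect import bisect_left


def fill_speed_gaps(gps):
    """Linearly interpolate missing (None) speeds between known neighbours.

    Gaps at the very start or end of the series are left as None.
    """
    filled = [dict(r) for r in gps]
    known = [i for i, r in enumerate(filled) if r["speed"] is not None]
    for a, b in zip(known, known[1:]):
        t0, t1 = filled[a]["t"], filled[b]["t"]
        s0, s1 = filled[a]["speed"], filled[b]["speed"]
        for i in range(a + 1, b):
            w = (filled[i]["t"] - t0) / (t1 - t0)
            filled[i]["speed"] = s0 + w * (s1 - s0)
    return filled


def attach_esc(gps, esc_times, tolerance=15):
    """Count ESC transactions per GPS fix, matching each to the nearest fix.

    Transactions farther than `tolerance` seconds from any fix are returned
    separately so they can be handled downstream instead of being dropped.
    """
    times = [r["t"] for r in gps]  # assumed sorted by time
    fused = [dict(r, boardings=0) for r in gps]
    unmatched = []
    for t in esc_times:
        j = bisect_left(times, t)
        candidates = [k for k in (j - 1, j) if 0 <= k < len(times)]
        if not candidates:
            unmatched.append(t)
            continue
        k = min(candidates, key=lambda c: abs(times[c] - t))
        if abs(times[k] - t) <= tolerance:
            fused[k]["boardings"] += 1
        else:
            unmatched.append(t)
    return fused, unmatched


# Toy data: one missing speed value, and the fix at t=45 is absent entirely.
gps = [
    {"t": 0, "speed": 8.0},
    {"t": 15, "speed": None},
    {"t": 30, "speed": 4.0},
    {"t": 60, "speed": 6.0},
]
esc_times = [14, 16, 44, 120]

fused, unmatched = attach_esc(fill_speed_gaps(gps), esc_times)
print(fused)
print(unmatched)
```

Keeping unmatched transactions visible, rather than silently discarding them, preserves the signal that a bus was in service even where GPS coverage dropped out.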

The study concludes that hybrid data-driven models supported by evolutionary optimization techniques such as GA can substantially improve bus travel time prediction accuracy, even with incomplete data. The approach offers a scalable methodological framework for smart city transport analytics: capturing dynamic traffic behavior, enabling more precise passenger information systems, and optimizing operational planning.

Furthermore, the integration of ESC and GPS data opens new avenues for hybrid transport data fusion, especially for cities transitioning from legacy systems with inconsistent monitoring infrastructure. The method’s ability to generalize across missing-data scenarios makes it adaptable to broader intelligent transportation and logistics applications amid real-world data imperfections.

Overall, Jiang et al.’s contribution represents a robust step toward resilient AI-enabled transit systems capable of learning complex mobility patterns and adapting to data uncertainty inherent in large-scale transportation networks.
