Decision Companion for Traffic Safety Improvement

Summary

Predicting expected traffic crashes and designing targeted interventions are highly challenging because crash data are inherently complex and concerns about the trustworthiness of predictions persist. We introduce SafeTraffic Copilot, which adapts Large Language Models (LLMs) to perform expected crash prediction as a text-reasoning task and then attributes the prediction to critical features to guide targeted safety interventions. Within the Copilot, SafeTraffic LLM is customized and then fine-tuned on the textualized SafeTraffic Event dataset, which consists of 66,205 real-world crash cases, totaling 14.5 million words, from five U.S. states. Across multiple prediction tasks, including crash type, severity, and number of injuries, SafeTraffic LLM demonstrates a 33.3% to 45.8% improvement in average F1-score over existing methods. To interpret these results and inform safety interventions, we introduce SafeTraffic Attribution, a sentence-level feature-attribution framework that enables conditional “what-if” risk analysis.
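To make the idea of sentence-level attribution over textualized crash records concrete, the following is a minimal illustrative sketch. It uses a generic leave-one-sentence-out (occlusion) scheme, which is an assumption chosen for simplicity and not necessarily how SafeTraffic Attribution is computed. The function names, the sentence template, and the toy scoring function are all hypothetical; in practice the score would come from a fine-tuned model rather than keyword rules.

```python
from typing import Callable, Dict, List, Tuple


def textualize(record: Dict[str, str]) -> List[str]:
    """Turn a tabular crash record into one sentence per field.

    The template is hypothetical, not the dataset's actual wording.
    """
    return [f"The {key.replace('_', ' ')} is {value}." for key, value in record.items()]


def sentence_attribution(
    sentences: List[str], score: Callable[[List[str]], float]
) -> List[Tuple[str, float]]:
    """Leave-one-sentence-out attribution.

    Each sentence's attribution is the drop in the score when that
    sentence is removed from the description.
    """
    base = score(sentences)
    result = []
    for i, sentence in enumerate(sentences):
        reduced = sentences[:i] + sentences[i + 1:]
        result.append((sentence, base - score(reduced)))
    return result


def toy_score(sentences: List[str]) -> float:
    """Stand-in for a model's predicted risk (NOT a real model)."""
    text = " ".join(sentences).lower()
    risk = 0.1
    if "wet" in text:
        risk += 0.2
    if "dark" in text:
        risk += 0.3
    return risk


# Hypothetical record with made-up field names and values.
record = {"road_surface": "wet", "lighting": "dark", "speed_limit": "45 mph"}
sentences = textualize(record)
attributions = sentence_attribution(sentences, toy_score)

for sentence, delta in attributions:
    print(f"{delta:+.2f}  {sentence}")
```

A "what-if" query fits the same pattern: replace one sentence (for example, change the lighting condition) and compare the score before and after the change.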

Related Publications

Zhao, Y., Wang, P., Zhao, Y., Du, H., and Yang, H.F. SafeTraffic Copilot: adapting large language models for trustworthy traffic safety assessments and decision interventions. Nature Communications 16, 8846 (2025). https://doi.org/10.1038/s41467-025-64574-w

Zhao, Y., Wang, P., and Yang, H.F. How to Auto-optimize Prompts for Domain Tasks? Adaptive Prompting and Reasoning through Evolutionary Domain Knowledge Adaptation. In The Thirty-ninth Annual Conference on Neural Information Processing Systems (NeurIPS 2025). https://openreview.net/forum?id=59n2g6RqjT

Fan, Z., Wang, P., Zhao, Y., Zhao, Y., Ivanovic, B., Wang, Z., Pavone, M., and Yang, H.F. Learning traffic crashes as language: Datasets, benchmarks, and what-if causal analyses. arXiv preprint arXiv:2406.10789 (2024).