How we transformed California’s transit data infrastructure
Building the nation’s first statewide open-source transit data platform, serving 200+ agencies and processing millions of files daily
California now has the most comprehensive transit data infrastructure in the nation, built entirely on open-source technology that agencies own and control.
California's 200+ transit agencies operated in complete data silos. Each agency managed its own systems, standards, and tools, making statewide planning, coordination, and improvement nearly impossible.
Caltrans and Cal-ITP leadership imagined a different future: a unified, open-source platform that would democratize access to data and tools, enabling every agency, regardless of size, to make data-driven decisions.
Implemented Google BigQuery, dbt, and Airflow to create a scalable, cloud-native data platform
Every component built on open-source tools, with all code publicly available on GitHub
Trained Caltrans staff to own and operate the platform, ensuring long-term sustainability