Optimizing the bike infrastructure system in SF using Machine Learning

City Bikes

Bike sharing systems let people rent bikes through an automated process of membership, rental, and return at kiosk locations across San Francisco. The primary issue is the expensive repair and maintenance of these stations. As a result, the service hubs are often not used to their full potential. Among the major inefficiencies of urban bike systems is the inability to satisfy increasing demand or to capitalize on the stations' maximum capacity.
Our project deals with optimizing the bike sharing system in the city of SF to help distribute resources effectively and efficiently. My role on the project was to clearly define the problem space, help build the ML model, and create the data visualizations in a Streamlit app.
Oct-Dec 2021
Samarth Gowda
Kevin Chian
Toolkit & techniques
Final Movie
Github Repo

Bike service stations are not used to their full potential

Our online research showed that demand varies widely across bike stations in SF, with some stations far more popular than others, which makes the system inefficient. We wanted to address this problem by examining the hidden layers of data in cyclists' daily travel. To do so, we identified relationships in the data and performed exploratory data analysis (EDA) to understand urban cycling patterns.
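As a rough illustration of this kind of EDA, the sketch below counts departures per station and flags stations that fall well below the average. The trip records, station names, and the 75% threshold are made up for illustration; they are assumptions, not the project's data or its actual method.

```python
from collections import Counter

# Hypothetical trip records: (start_station, end_station, duration_seconds).
trips = [
    ("Market St", "Ferry Building", 540),
    ("Market St", "Caltrain", 720),
    ("Caltrain", "Market St", 690),
    ("Market St", "Embarcadero", 300),
    ("Embarcadero", "Market St", 320),
    ("Ferry Building", "Market St", 560),
    ("Caltrain", "Embarcadero", 480),
]

def departures_per_station(trips):
    """Count how many trips start at each station."""
    return Counter(start for start, _end, _duration in trips)

def underused_stations(counts, fraction=0.75):
    """Return stations whose departures fall below `fraction` of the mean."""
    mean = sum(counts.values()) / len(counts)
    return sorted(s for s, n in counts.items() if n < fraction * mean)

counts = departures_per_station(trips)
print(counts.most_common())        # stations ordered from most to least used
print(underused_stations(counts))  # candidates for a closer look
```

The threshold is arbitrary; in practice it would be tuned against station capacity rather than the mean alone.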

Bike station - Image credits - Flickr.com

Defining the problem space

How can we improve the efficiency of the existing bike sharing system in San Francisco to decongest the city?

One of the biggest inefficiencies of the city bike system is either not being able to meet the growing demand or not utilizing the stations to their full potential. This demand is highly variable, and it is almost impossible to know in advance what the demand will be on a particular day.


Predict the usage and analyze the underutilized stations using Regression

Bike sharing systems function as a sensor network that can be used to study mobility in a city. Beyond their real-world applications, the characteristics of the data these systems generate make them attractive for research. Unlike other transport services such as bus or subway, these systems explicitly record the duration of travel and the departure and arrival positions of every trip. This turns the bike sharing system into a virtual sensor network for sensing mobility in the city, so most important events in the city can be expected to show up in this data.
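To make the regression idea concrete, here is a minimal sketch that aggregates individual trip records into daily counts and fits a one-feature least-squares line. The trip log, the choice of temperature as the feature, and all values are hypothetical; the project's actual features and model are not described here.

```python
from collections import Counter
from datetime import datetime

# Hypothetical trip log: (start_time, start_station, end_station).
trip_log = [
    ("2014-06-01T08:05:00", "A", "B"),
    ("2014-06-01T17:40:00", "B", "A"),
    ("2014-06-02T08:10:00", "A", "B"),
    ("2014-06-02T08:30:00", "A", "C"),
    ("2014-06-02T18:00:00", "C", "A"),
    ("2014-06-03T09:00:00", "A", "B"),
    ("2014-06-03T09:15:00", "B", "C"),
    ("2014-06-03T12:00:00", "C", "A"),
    ("2014-06-03T18:20:00", "B", "A"),
]

# Hypothetical daily mean temperature (deg C), assumed as the single feature.
temperature = {"2014-06-01": 14.0, "2014-06-02": 16.0, "2014-06-03": 18.0}

def daily_trip_counts(log):
    """Aggregate individual trips into trips per calendar day."""
    return Counter(datetime.fromisoformat(ts).date().isoformat() for ts, _s, _e in log)

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

daily = daily_trip_counts(trip_log)
days = sorted(daily)
slope, intercept = fit_line([temperature[d] for d in days], [daily[d] for d in days])
print(f"predicted trips at 20 deg C: {slope * 20 + intercept:.1f}")
```

A real model would use many more days and features (weekday, weather, station), but the aggregation step from raw trips to daily usage stays the same.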
Final Dashboard on Streamlit showcasing live data visualizations


Dashboards & Visualizations

Visualizations showcasing the most popular stations
Regression model error
Average usage in all stations over a year
Result - Average prediction per day showcasing the performance of the model on test data
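For readers who want to see how a regression error like the one charted above is typically computed, here is a small sketch of two common metrics on held-out days. The actual and predicted values are invented for illustration, and which metric the project reported is an assumption left open here.

```python
import math

def mean_absolute_error(actual, predicted):
    """Average size of the prediction error, in trips per day."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def root_mean_squared_error(actual, predicted):
    """Like MAE, but penalizes large misses more heavily."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))

# Hypothetical daily trip counts on test days, and the model's predictions.
actual = [120, 135, 98, 150]
predicted = [110, 140, 100, 145]

mae = mean_absolute_error(actual, predicted)
rmse = root_mean_squared_error(actual, predicted)
print(f"MAE: {mae:.2f} trips/day, RMSE: {rmse:.2f} trips/day")
```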


We started with a dataset covering 2013 to 2015. SF has since changed its bike sharing system, obtaining corporate sponsorship and rebranding it as "Bay Wheels". However, many cities around the world run similarly structured systems, and the model can be adapted for predictive analysis to improve them.