For this project, I used the popular data visualization tool Tableau to create and present a business proposal for a bike-sharing company. I created worksheets, dashboards, and stories to visualize key data from a New York Citi Bike dataset. I plan to present these visualizations and the accompanying proposal to an angel investor in hopes of securing funds to start a bike-sharing business in my hometown of Des Moines, Iowa.
- Data Source: CSV file containing August 2019 data from the NYC Citi Bike website
- Software and Tools: Tableau, Python, Pandas, Jupyter Notebook & Git Bash
Using Python and Pandas functions, I converted the `tripduration` column from an integer to a datetime datatype to get the time in hours, minutes, and seconds (00:00:00). Then, I exported the DataFrame as a CSV file to use for the trip analysis in Deliverable 2.
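As a rough sketch of this conversion (not necessarily the exact code in my notebook), the snippet below assumes the raw `tripduration` values are whole seconds; the `unit` argument is there in case the file stores minutes instead. The helper name `convert_tripduration` and the sample values are made up for illustration.

```python
import pandas as pd


def convert_tripduration(df: pd.DataFrame, unit: str = "s") -> pd.DataFrame:
    """Return a copy of df with tripduration converted from integers to datetimes.

    Assumption: the integers count `unit`s (seconds by default).
    """
    out = df.copy()
    out["tripduration"] = pd.to_datetime(out["tripduration"], unit=unit)
    return out


# Made-up sample durations, standing in for the real CSV.
sample = pd.DataFrame({"tripduration": [393, 627, 3661]})
converted = convert_tripduration(sample)

# Show the time-of-day part in the 00:00:00 layout.
clock_times = converted["tripduration"].dt.strftime("%H:%M:%S").tolist()
print(clock_times)  # ['00:06:33', '00:10:27', '01:01:01']

# Passing a file path to to_csv writes the file; without one it returns the CSV text.
csv_text = converted.to_csv(index=False)
```

One caveat with this approach: the datetime is anchored to 1970-01-01, so the clock portion wraps around for any trip longer than 24 hours.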
Using Tableau, I created visualizations that answer the following 3 questions:
First, I created a line graph displaying the number of bikes checked out by duration for all users, and the graph can be filtered by the hour. This visualization shows that the majority of bike trips last for 20 minutes or less.
Next, I created a line graph displaying the number of bikes checked out by duration for each gender; the graph can be filtered by hour and gender. This visualization provides more granular detail than the previous graph and shows that males make up the large majority of bike-share customers.
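The "20 minutes or less" observation could also be cross-checked outside Tableau. This is a minimal sketch that assumes the raw, unconverted durations are in seconds; the function name `share_within` and the sample numbers are hypothetical.

```python
import pandas as pd


def share_within(durations: pd.Series, minutes: int = 20, seconds_per_unit: int = 1) -> float:
    """Fraction of trips lasting `minutes` or less.

    Assumption: durations are integers in seconds (seconds_per_unit=1);
    pass seconds_per_unit=60 if the column stores minutes.
    """
    in_seconds = durations * seconds_per_unit
    return float((in_seconds <= minutes * 60).mean())


# Made-up durations in seconds: three trips within 20 minutes, one longer.
sample_durations = pd.Series([300, 900, 1200, 2400])
print(share_within(sample_durations))  # 0.75
```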
First, I created a heatmap showing the number of bike trips for each hour of each day of the week. This visualization shows that the peak times for bike trips are Monday through Friday between 7:00-9:00am and 5:00-6:00pm, which suggests that many bike-share customers use the bikes for their workday commutes. There is also a consistent number of bike trips between 10:00am-6:00pm on Saturdays and Sundays, which is consistent with NYC's large tourist presence.
Next, I created a heatmap showing the number of bike trips by gender for each hour of each day of the week; the heatmap can be filtered by gender. Like the second line graph above, this visualization provides more granular detail than the previous heatmap and confirms that males are the predominant bike-share users.
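The weekday-by-hour counts behind heatmaps like these can be reproduced in Pandas as a sanity check. The sketch below assumes the trip start timestamp lives in a column named `starttime`; the function name and the sample rows are made up.

```python
import pandas as pd

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]


def trips_by_weekday_hour(df: pd.DataFrame, time_col: str = "starttime") -> pd.DataFrame:
    """Count trips per (weekday, hour); rows are weekdays, columns are hours 0-23."""
    start = pd.to_datetime(df[time_col])
    counts = (
        pd.DataFrame({"weekday": start.dt.day_name(), "hour": start.dt.hour})
        .groupby(["weekday", "hour"])
        .size()
        .unstack(fill_value=0)
    )
    # Put weekdays in calendar order and include hours with no trips.
    return counts.reindex(index=WEEKDAYS, columns=range(24), fill_value=0)


# Made-up sample: two Thursday-morning trips and one Saturday-afternoon trip.
sample_trips = pd.DataFrame(
    {"starttime": ["2019-08-01 08:15:00", "2019-08-01 08:45:00", "2019-08-03 14:00:00"]}
)
heatmap_counts = trips_by_weekday_hour(sample_trips)
print(heatmap_counts.loc["Thursday", 8])  # 2
```

Filtering the DataFrame by gender before calling the function would give the per-gender version of the same table.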
3. What days of the week might a user be more likely to check out a bike, by type of user and gender?
I created a heatmap showing the number of bike trips for each type of user and gender for each day of the week; the heatmap can be filtered by user type and gender. This visualization clearly shows that most bike-share trips are made by male subscribers, with the most trips occurring on Thursdays.
For this part of the Challenge, I created a story in Tableau. To view and interact with my Tableau presentation, click here.
In summary, the visualizations and analysis I performed indicate that a bike-sharing program in Des Moines, Iowa can be successful. Since the hourly trip data in NYC suggests that daily commuters, not tourists, make up the majority of customers, a similar bike-sharing venture in Des Moines has strong potential to thrive. The data also shows that men are the predominant customer base, so advertising and marketing dollars should be spent to attract male subscribers. Finally, the placement of bike stations and the distance between them will be key to the success of the program, so scouting potential locations should be given special consideration.
Two additional visualizations that I would create with the given dataset are:
- a visualization that plots the start and end stations and determines the average distance of bike trips.
- visualizations that show user type by birth year to better understand the age demographics of customers.
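For the first suggestion, one possible way to estimate trip distance is the haversine (great-circle) formula applied to the start and end station coordinates. This is only a sketch: the column names below are assumptions about the CSV layout, the coordinates are made up, and a straight-line distance understates the route actually ridden.

```python
import numpy as np
import pandas as pd

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between points given in degrees."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (
        np.sin((lat2 - lat1) / 2) ** 2
        + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))


def average_trip_distance_km(df: pd.DataFrame) -> float:
    """Mean straight-line distance between start and end stations.

    Assumption: the coordinate columns carry the names used below.
    """
    distances = haversine_km(
        df["start station latitude"],
        df["start station longitude"],
        df["end station latitude"],
        df["end station longitude"],
    )
    return float(distances.mean())


# Made-up coordinates for two illustrative trips.
sample_stations = pd.DataFrame(
    {
        "start station latitude": [40.7546, 40.7300],
        "start station longitude": [-73.9911, -73.9900],
        "end station latitude": [40.7484, 40.7400],
        "end station longitude": [-73.9857, -73.9800],
    }
)
avg_km = average_trip_distance_km(sample_stations)
```

Round trips that start and end at the same station would come out as 0 km here, so they may need to be excluded or handled separately.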