Hubway Class of 2017: Final Project

By Meghan Kokoski, Mikayla Murphy, Kimberly Yu, and Margaret Yu

Methodology

The main data source we used was the Hubway trips dataset, which contains the start station, end station, timings, and other information for every Hubway ride ever taken.

Early on, we identified our primary audience as MIT students without a Hubway membership, and our goal as encouraging these students to purchase annual Hubway memberships. To achieve this goal, we created a set of posters for the Infinite Corridor. Each poster focuses on a specific Hubway bike that plays a character in the “Hubway Class of 2017”. The characters were carefully chosen to emphasize unique parts of MIT culture (so as to be relatable to MIT students) while also having a strong data-driven story. We chose the class of 2017 because the MIT class of 2017 is just about to graduate, making it the most current and relatable class. Accordingly, we used Hubway data from September 2016 through February 2017 (the latest Hubway data available at the time of writing).

Our first character was the Hacker bike, which is the Hubway bike that had the most trips at night. Using the dataset, we found that Bike 1738 had the most trips that started between 10 pm and 6 am, with 54 trips.
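A count like this can be sketched in a few lines of Python. The sketch below is illustrative only and is not the code used for the project: the field names `bikeid` and `starttime`, the timestamp format, and the sample rows are all assumptions made for the example.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample rows; a real trips file would have one row per ride.
trips = [
    {"bikeid": 1738, "starttime": "2016-09-03 23:15:00"},
    {"bikeid": 1738, "starttime": "2016-09-04 02:40:00"},
    {"bikeid": 1382, "starttime": "2016-09-04 05:59:00"},
    {"bikeid": 1382, "starttime": "2016-09-04 14:05:00"},
    {"bikeid": 640, "starttime": "2016-09-05 06:00:00"},
]

def is_night(timestamp):
    """Return True if the trip started at or after 10 pm or before 6 am."""
    hour = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S").hour
    return hour >= 22 or hour < 6

# Count night-time trip starts per bike, then take the bike with the most.
night_counts = Counter(t["bikeid"] for t in trips if is_night(t["starttime"]))
hacker_bike, hacker_trips = night_counts.most_common(1)[0]
print(hacker_bike, hacker_trips)
```

Because the night window wraps past midnight, the filter needs an "or" rather than a single range check. A trip starting at exactly 6:00 am is treated as a daytime trip in this sketch.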

The second character was the Course 6 bike, which was the bike that had the most trips to or from the Hubway station at the Stata Center. Stata houses most of the Course 6 classes and professors, and Course 6 is the largest major at MIT, so we thought this character would be especially relatable to most MIT students. Bike 1382 had the most trips to or from the Stata Center station, with 49 trips (24 starting at Stata and 25 ending at Stata).
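One way to get a start/end breakdown like this is to count the two directions separately and then add them. The following is a minimal sketch under assumed names: the station label `"Stata Center"`, the `start` and `end` fields, and the sample rows are hypothetical and not taken from the dataset.

```python
from collections import Counter

STATA = "Stata Center"  # hypothetical station label

# Hypothetical sample rows for illustration only.
trips = [
    {"bikeid": 1382, "start": "Stata Center", "end": "Kendall T"},
    {"bikeid": 1382, "start": "Central Square", "end": "Stata Center"},
    {"bikeid": 1382, "start": "Kendall T", "end": "Central Square"},
    {"bikeid": 640, "start": "Stata Center", "end": "Kenmore Square"},
]

# Trips leaving the station and trips arriving at it, counted per bike.
starts = Counter(t["bikeid"] for t in trips if t["start"] == STATA)
ends = Counter(t["bikeid"] for t in trips if t["end"] == STATA)

# Adding the two counters gives trips "to or from" the station.
# Note: a round trip that starts and ends at the station counts twice here.
totals = starts + ends
course6_bike, course6_trips = totals.most_common(1)[0]
print(course6_bike, course6_trips, starts[course6_bike], ends[course6_bike])
```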

The third bike was the Greek bike, which was the bike that made the most trips from the main MIT campus to the two stations closest to the MIT Greek houses across the river. The MIT stations were defined as the Stata Center station and the Mass Ave/Amherst St station, and the Greek stations were defined as the Kenmore Square station and the Beacon St/Mass Ave station. Bike 640 had the most trips between these stations, with 55 trips.
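A station-group filter of this kind can be sketched as below. This is an illustration under stated assumptions, not the project's code: the station labels are shortened stand-ins, the sample rows are made up, and the sketch assumes that trips in either direction between the two groups count.

```python
from collections import Counter

# Hypothetical labels for the two station groups described above.
MIT_STATIONS = {"Stata Center", "Mass Ave/Amherst St"}
GREEK_STATIONS = {"Kenmore Square", "Beacon St/Mass Ave"}

def links_campus_and_greek(start, end):
    """True if one end is an MIT station and the other a Greek station."""
    return (start in MIT_STATIONS and end in GREEK_STATIONS) or (
        start in GREEK_STATIONS and end in MIT_STATIONS
    )

# Hypothetical sample rows: (bike id, start station, end station).
trips = [
    (640, "Stata Center", "Kenmore Square"),
    (640, "Beacon St/Mass Ave", "Mass Ave/Amherst St"),
    (640, "Stata Center", "Mass Ave/Amherst St"),  # campus to campus, ignored
    (1395, "Mass Ave/Amherst St", "Beacon St/Mass Ave"),
]

greek_counts = Counter(
    bike for bike, start, end in trips if links_campus_and_greek(start, end)
)
greek_bike, greek_trips = greek_counts.most_common(1)[0]
print(greek_bike, greek_trips)
```

To count only trips leaving campus, the second half of the condition in `links_campus_and_greek` would be dropped.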

The last bike was the Firehosed bike, which was defined as the busiest bike (that is, the bike that took the most trips). Being hosed (MIT slang for being overwhelmed with work) is very relatable to MIT students, so we hoped they’d empathize with Bike 1395, which took 701 total trips between September 2016 and February 2017.

To support each of the stories, we added a second layer of information about biking and Hubway linked to each character. For example, for the Firehosed bike, we added facts about how biking and being outdoors decrease stress, because hosed students are often stressed. For the Course 6 bike, we talked about the environmental impact of biking, as Course 6’s are trying to change the world technologically, so why not change the world environmentally too? For the Hacker bike, we added facts about safety, as safety is an important part of the Hacker Code of Ethics, and for the Greek bike, we focused on time savings, as students who live across the river often complain about the commute time required.

Impact

The four Hubway Class of 2017 posters would be displayed on bulletin boards in the Infinite. Our audience is MIT students who are potential annual Hubway subscribers. Our goals are to raise awareness of the Hubway service, increase Hubway annual memberships, and lower CO2 levels.

We interviewed twelve MIT students who do not have an annual Hubway membership. Before showing them the posters, we asked them the following pre-questions:

  1. If you bike, do you own a bike?
  2. Have you heard of Hubway?
  3. Have you ever used Hubway before?
  4. Do you have a Hubway annual membership?
  5. If not, how likely are you to get a Hubway annual membership? (1 – not likely to 5 – very likely)
  6. If not, why do you not have a Hubway membership? Why do you not already have a bike?
  7. If you had to go to Harvard Square, how would you usually get there? (bike, bus, T, car, walk, etc.)
  8. Which of these methods of exercising are you most likely to do this weekend: jogging, walking, biking, or something else?
  9. How useful do you think biking is? (1 – not useful to 5 – most useful)

We then showed them the posters and asked them to imagine them displayed in the Infinite. After they examined the posters, we asked the following post-questions:

  1. (If they didn’t already have Hubway) How likely are you now to get a Hubway annual membership? (1 – not likely to 5 – very likely)
  2. After seeing these posters, are you more likely to bike to Harvard Square? (1 – less likely to 5 – more likely)
  3. Are you more likely to go cycling as a form of exercise this weekend? (1 – less likely to 5 – more likely)
  4. How useful do you think biking is? (1 – not useful to 5 – most useful)
  5. What do you like about the posters? What do you think is most effective?
  6. What do you not like about the posters? What do you think is least effective?
  7. Did we address your concerns about using Hubway?

From the pre-questions, we found that most MIT students have heard of Hubway but are not very likely to get a Hubway membership. The primary reason they do not have a membership is that they seldom go off-campus. We also found that MIT students would most likely take the bus or walk to Harvard Square from MIT, and would most likely walk to exercise over the weekend. They all agreed that biking is very useful. After viewing the posters, they were slightly more likely to get a Hubway annual membership. The posters did not affect the likelihood of biking to Harvard Square or cycling to exercise over the weekend. However, their perception of the usefulness of biking increased slightly.
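A pre/post comparison on a 1 to 5 scale like this one comes down to per-respondent differences. The sketch below shows the arithmetic only; the ratings in it are placeholder values invented for illustration and are not the survey responses, and the variable names are hypothetical.

```python
from statistics import mean

# Placeholder 1-5 ratings for illustration only; NOT the actual responses.
# Position i in each list belongs to the same (hypothetical) respondent.
pre_likelihood = [1, 2, 1, 3, 2, 1]
post_likelihood = [2, 2, 1, 3, 3, 2]

# Per-respondent change in the membership-likelihood rating.
changes = [post - pre for pre, post in zip(pre_likelihood, post_likelihood)]
mean_change = mean(changes)

print(f"mean before: {mean(pre_likelihood):.2f}")
print(f"mean after:  {mean(post_likelihood):.2f}")
print(f"mean change: {mean_change:+.2f}")
```

Pairing each respondent's two answers, rather than only comparing the two group means, also shows how many people actually moved and in which direction.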

From the feedback on our posters, we learned that MIT students felt more connected to the Hubway service because of the MIT-affiliated bike names and the cute designs. The Course 6 bike and the Firehosed bike appealed to the most people because they directly addressed the impact of Hubway use on climate change and health. MIT students liked the color scheme and found the layout of the information easy to follow and the quick facts easy to learn. However, they thought some posters had a lot of text, and they probably wouldn’t stop to read the posters in the Infinite. Although the time comparison was helpful for understanding the usefulness of biking, some people thought it would not be worth arriving at an event sweaty from biking to save only 2 minutes. MIT students who were not very familiar with Hubway had difficulty understanding the bike ID numbers. They were also concerned about the station locations, and one student mentioned that, according to a friend’s experience, the Hubway bike pedals don’t accommodate short people. For the most part, our posters were able to address MIT students’ concerns, particularly in terms of saving time and improving health, and made people seriously consider why they do not have a Hubway annual membership.

Data Log- Sunday 2/12

  • Google Maps- Tracked route driven and how long it took us. We also used it to look up nearby restaurants (that we didn’t end up going to) so it also collected data on which restaurants we were considering and the fact that we didn’t go to any of them.
  • Ski Rental Shop– Collected name, height, weight, and skiing experience level when renting skis.
  • Google Search History- Recorded time, search query, and any links followed from search.
  • Bank of America- Recorded every credit card transaction (time, amount spent, retailer, location).
  • Waterville Valley Ski Resort- Recorded the time we picked up our lift passes and scanned our lift passes’ barcodes before we got on each lift, meaning they know exactly when we rode each lift. Also recorded our names, addresses, emails, and birthdates.
  • Whenisgood- Recorded my schedule for the next week.
  • Gmail- Recorded what emails I got, when I opened them, when I responded, what I gave labels or marked as spam, etc.
  • Facebook- Tracked my location and what content I viewed on my newsfeed. Recorded when I liked a photo and whose photo it was.
  • Snapchat- Recorded my location and what snaps I sent and received. Also recorded temperature and altitude data. It also noted which Snapchat users were nearby.
  • Android- Recorded my phone’s battery use and network connectivity status. Also recorded which apps were opened, how long they were opened for, and how much storage/battery they used, as well as my location.
  • Google Photos- Uploaded photos taken on my phone and their associated metadata to the cloud.
  • Messenger- Recorded what messages I sent and received, their contents, and their times. It also tracked photos taken on my phone and my location.
  • Chrome- Recorded what websites I visited, how long I spent on them, and what links I clicked on them.
  • MIT RFID readers– Recorded what time I tapped into my dorm and my club’s office on campus.
  • Dorm Desk Package Tracking- Recorded what time I picked up my package from my dorm’s front desk.
  • MIT Campus Security Cameras– Cameras likely tracked a majority of my movements around campus, including entering my dorm.
  • New Hampshire Highway System– Collected pictures of our license plates when we passed through the toll booths.
  • Shaw’s Parking Garage- Recorded what time it issued our ticket and thus what time we entered the parking garage.
  • Rental Car- Tracked and recorded gas status and mileage, among other variables.

Visualizing 1.1 Billion NYC Taxi and Uber Trips

I recently came across a detailed analysis by Todd W. Schneider of 1.1 billion New York City cab rides that occurred from 2009–2015. In addition to this massive dataset provided by the New York City Taxi & Limousine Commission, Schneider also incorporated a public dataset of 19 million Uber rides from April–September 2014 and January–June 2015.

One of the most striking images from his analysis, showing a heatmap of taxi pickups and dropoffs in NYC.

This analysis appears to be intended for New Yorkers, as many of the visualizations assume the audience has basic geographic familiarity with NYC and an understanding of New York lingo and culture. While this presentation would definitely still be interesting for those interested in transportation and unfamiliar with NYC, Schneider focuses less on general transportation analysis (e.g. average fare or trip time) and more on New York-specific analysis (e.g. which neighborhoods are up late and taxi trips taken from Goldman Sachs).

Schneider appears to have multiple goals in this presentation. One is to comprehensively explore a wide variety of questions in this dataset. He includes numerous graphs and figures, each addressing a different aspect of the dataset, but it’s almost overwhelming how many figures are presented. Though data junkies would enjoy the comprehensive nature of this presentation, I think most readers will get lost in the sheer number of graphs. In addition, in my opinion, the large number of figures buries some of the most interesting aspects of the data, reducing the efficacy of his analysis. For instance, about halfway down his (long) post, Schneider has a simple bar graph showing that rainstorms don’t appear to affect daily ridership. This was the most surprising conclusion for me personally, as the common thought is that taxis are impossible to get during rainstorms, so the fact that it’s hidden halfway down his post is disappointing.

A simple graph showing that rainstorms don’t appear to affect daily NYC taxi ridership.

Intentionally or not, he also appears to be advocating for usage of public transit over taxis in parts of his analysis. In the section dedicated to airports, he concludes that “depending on the time of day and how close you are to a subway stop, your expected travel time might be better on public transit than in a cab, and you could save a bunch of money.” As New Yorkers can then customize the visualizations to show expected travel time to the airports from their own neighborhood, I think this part of the presentation is very effective. It’s much more powerful and relatable to show viewers average taxi trip times from their own neighborhood rather than averages across the whole city, making this section one of the strongest in the whole presentation.