Erik Bernhardsson has been studying NYC’s subway system, analyzing data readily available from the local transit authority’s real-time API. In his latest post he examines the notion of variance. Simply put, if a train only runs every hour, then waiting an hour doesn’t seem like a bother, but if a train runs every 5 minutes, then waiting 10 minutes somehow is a bother, even though 10 minutes isn’t a very long wait.
Why does it suck to wait for things? In a previous post I analyzed a NYC subway dataset and found that at some point, quite early, it’s worth just giving up.
This isn’t proof that the subway doesn’t run on time; in fact, it might actually prove that the subway runs really well. The numbers indicate that it’s not worth waiting after 10 minutes, but a wait that long is a rare event and usually involves something extraordinary like a multi-hour delay. You should give up at roughly some point related to the normal train frequency, and 10 minutes is not a lot at all. Conversely, if the trains ran hourly, it probably would have been worth waiting an hour or more. My analysis gave me a lot of respect for the job the MTA is doing.
But there’s another effect that greatly impacts waiting time: the variance. It turns out that the statistics of waiting make it very sensitive to variance.
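To make the sensitivity concrete, here is a minimal sketch. It assumes a rider who shows up at a uniformly random moment and boards the next train. The `expected_wait` helper and the headway numbers are hypothetical illustrations, not taken from the MTA data. Under that assumption, the average wait is the sum of squared gaps divided by twice the sum of gaps, which works out to mean/2 + variance/(2 × mean).

```python
def expected_wait(headways):
    """Average wait for a rider arriving at a uniformly random time.

    `headways` are the gaps between consecutive trains, in minutes.
    A rider is more likely to land inside a long gap than a short one,
    so each gap is weighted by its own length:
    sum(h^2) / (2 * sum(h)).
    """
    total = sum(headways)
    return sum(h * h for h in headways) / (2 * total)


# Hypothetical schedules: both average one train every 10 minutes.
regular = [10, 10, 10, 10]    # perfectly even spacing
irregular = [2, 18, 2, 18]    # same mean gap, high variance

print(expected_wait(regular))
print(expected_wait(irregular))
```

Both schedules run the same number of trains per hour, yet the evenly spaced one gives an average wait of 5 minutes while the bunched one gives about 8.2 minutes. The extra wait comes entirely from the variance term.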