The United Kingdom's local public transport network is likely to become part of Google Transit. Technically that should be far easier in the UK than in North America, where Google Transit was first developed: The UK has a decade's bitter experience putting all the data together. In practice it is raising wider issues over data control and availability that the public sector is somewhat reluctant to tackle.
This article describes how the UK's public transport data is being integrated into Google. It questions why data is being made available based solely on the business model adopted. It explores the real value of this information, and presents a case for the liberalisation of data.
Readers unfamiliar with the topic area should read my earlier Introduction to UK Local Public Transport Data, which contains non-technical background information, and defines many of the terms used (such as "local"). The original research for this was done in June/July 2007, so may now be slightly out of date.
The illustration on the right is the Google part of a visual representation of web trends, based on the Tokyo metro map, by Information Architects Japan.
On this page:
- Information contained within Google
- Who pays?
- There's no such thing as a free lunch
- Free data = free promotion
- Impartiality and accuracy
- Thinking outside the box
- Google Transit?
- Further Reading
Information contained within Google
There is currently stop-based scheduled departure information in Google Maps for bus services in Buckinghamshire. Zoom right in on Aylesbury town centre, and you will see bus stop icons. Click one to see the next departure times. A version of the full Google Transit for most of the South East region (21/22 authorities) is on trial, but not yet public. (The regional definition of "South East" excludes London.) This will allow point-to-point journeys to be planned and visualised on the map. Both of those are run by Google themselves, who are determining the pace of implementation.
Traveline South East is the only regional consortium currently involved. However, there is growing interest amongst the wider professional community - for example Peter Stoner's (Traveline Coordinator) What impact will Google Transit have for UK journey planners? (PDF) in ATCO News (the magazine of the association representing local authority public transport coordination officers).
Data converters exist from ATCO-CIF and TransXChange (see the Introduction to UK Local Public Transport Data) into a Google format. The format is essentially sets of CSV files - making it terribly hard to extend. Inevitably the format will keep on changing, as new networks are added, all with their odd little quirks.
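To make "sets of CSV files" concrete, here is a minimal sketch of the last step of such a converter: writing the text of a stop_times.txt file. The column names are those published in the Google Transit Feed Specification. Everything else is an assumption made for illustration: the input rows, the identifiers, and the function name are hypothetical, and a real converter would first have to parse ATCO-CIF or TransXChange and produce the feed's other files as well.

```python
import csv
import io

# Hypothetical, already-parsed timetable rows: (trip id, stop id, "HH:MM" time).
# A real converter would extract these from ATCO-CIF or TransXChange.
CALLS = [
    ("trip_1", "stop_A", "08:00"),
    ("trip_1", "stop_B", "08:07"),
    ("trip_1", "stop_C", "08:15"),
]

# Column names as published in the Google Transit Feed Specification.
STOP_TIMES_COLUMNS = [
    "trip_id", "arrival_time", "departure_time", "stop_id", "stop_sequence",
]


def to_stop_times_csv(calls):
    """Render timetable calls as the text of a stop_times.txt file."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(STOP_TIMES_COLUMNS)
    sequence = {}  # running stop count per trip
    for trip_id, stop_id, hhmm in calls:
        sequence[trip_id] = sequence.get(trip_id, 0) + 1
        time = hhmm + ":00"  # the feed expects HH:MM:SS
        # Assumption: the vehicle arrives and departs in the same minute.
        writer.writerow([trip_id, time, time, stop_id, sequence[trip_id]])
    return out.getvalue()


stop_times_text = to_stop_times_csv(CALLS)
print(stop_times_text)
```

Each local quirk (a stop that is set-down only, say) needs either an extra column or an extra file, which is why a flat CSV format is awkward to extend.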
There is no railway information, since there is no rail industry agreement. Trains are a significant local public transport mode within South East England, so excluding them greatly weakens the usefulness of the information.
There are other uses of the Google Maps Application Programming Interface (API) - for example, Kizoom has used it in Leicestershire. I'll discuss that "mashup" approach later. I mention it here because it is easy to confuse a "system run by Google" with a "system that uses Google technology".
The current approach should be regarded as an experiment. But it may accidentally set a dangerous precedent.
Who pays?
There is no exchange of money: Google get to use the data for free, but don't charge for running the service. It is regarded as a complementary distribution channel - an extension of a search engine, which directs people on to services like TransportDirect for more detailed information. There is a contract, but it is not terribly binding on either party. Essentially, Traveline could pull the data, Google could stop offering the service.
This isn't a free-for-all: Anyone else that wants to follow has to negotiate with each regional Traveline consortium separately. The terms provisionally adopted in the South East are that the service must be free to the end-user, and the data represented faithfully.
Since summer 2007 it has been possible to license Traveline data for commercial use, for a price. Availability of Traveline data hasn't yet been widely publicised.
Spot the problem? Two different commercial licensing models are emerging for the same data: One for commercial applications that are free to the end-user, and one for applications that adopt a different business model.
There's no such thing as a free lunch
While Google's service may be free to the end-user, that doesn't make Google a non-profit or charity. It shouldn't be given preferential treatment, simply because of its choice of business model.
Companies like Google and Yahoo are trying to find out as much about their users as possible [2]. Queries about public transport journeys are like "manna from heaven": they reveal time and place to ridiculous levels of detail. Fares will start to indicate who. And you can take a good guess at why, if you know enough about the time and place.
What's the commercial value of this? Right now, advertising. Target people very precisely. Someone just asked how to get to Leicester Square (London) in the evening? Start selling them last-minute theatre tickets, or suggestions for restaurants or nightclubs. It's like having someone hand out your fliers at the station entrance, but a lot cheaper. It's a logical part of a wider trend towards behavioural advertising.
But that's just the start. Some other uses for this data are discussed below.
So there is a solid commercial basis to Google's "free" service, for those far-sighted enough to look.
But Google is far from the only commercial organisation here:
- Search competitors Yahoo and Microsoft both have mapping projects.
- The UK has plenty of businesses with experience of building/operating journey planners: Xephos and Journeyplan already offer the public such services themselves. But there are several others who might evolve from the current business model based on bidding for contracts.
- The US has independents like HopStop, who provide journey planning information in many US cities, and might be expected to expand.
The market is theoretically highly contestable, particularly once it is viewed as part of a wider market for information, beyond simple "bus times". Unfortunately, giving data freely to one business, while charging another, is not a good way to foster competition.
Free data = free promotion
So information about public transport services might have commercial value. Does that mean it should be sold?
It seems we have forgotten that this information is gathered to inform intending travellers. The easier it is for people to use this information - in whatever form - the more likely it is that they will travel using public transport. That contributes to policy objectives, profits, long-term sustainability. Why do I feel like I need to explain this?
Airlines pay to have their data used by others. This extract is from CheapFlights.co.uk, the UK's leading air travel website:
"Do the agents and airlines on your site pay to keep their prices there? Yes. Apart from the joy of receiving their money, we do find it works much better. This is not as hypocritical as it sounds: in our first year (1996) we had a mixture of paying and non-paying contributors, and we invariably found that the paying ones took far more trouble to give us a fuller range of prices, and to keep them more up-to-date. With so many advertisers, we are able to carry a fully comprehensive range of cheap prices and so are able to fulfill our mission to be the No.1 guide to travel deals."
Why such a difference in approaches? Local public transport has little motivation to sell itself. And when it does, the decision making process is cluttered by many organisations with slightly different aims. In contrast aviation needs to sell itself to survive, reaps benefits from doing so, and has a more streamlined decision making process.
Aviation does use a commission model to sell travel, which local public transport does not. But that is not important to CheapFlights.co.uk: They aren't actually selling air travel themselves.
Impartiality and accuracy
The belief that only the public sector can be impartial in the provision of information is common within government.
For local bus services, it seems (there are no doubt other interpretations) to stem from a misguided expectation that "head-to-head" competition would become common following deregulation (in 1986). Head-to-head means two or more operators serving the same route, at about the same time, offering customers a genuine choice for their journey. In reality, very few routes sustain sufficient volumes of passengers to financially support more than one operator in the long run. In the short run, such competition does break out. It is rarely profitable, so does not last. Across the whole of the UK, there are only a handful of routes where aggressive head-to-head competition has persisted, south Manchester being the best example. In most cases where there is a choice of operators, competition has matured: As far as competition legislation allows, operators co-operate with one another by providing information about one another's services.
So while it is possible for a commercial business to seek an advantage by misrepresenting (or arguably, merely ignoring) another commercial business, the scope for such tactics is small.
Accuracy is more of a problem. If one were to adopt an open, free-for-all approach, inevitably some of the services offered would be biased or wrong. Inaccurate information here isn't just annoying. If you miss the last bus home because the time you were given was 5 minutes too late, that's a serious problem. Mostly for you; but also for the commercial operator, for your local politician, for wider transport policy. It is not just a problem for the organisation that gave the bad information.
Internet history is fascinating in this area. 7 years ago most major search engines delivered paid listings. The results of a search were biased based on how much money businesses (mostly) had paid the search engine. Not as adverts, as results. Yet over time people opted instead to use search engines that biased results by the quality of the information: People are, incredibly, not stupid, and will eventually tend to seek out the best source of information. Services that transpire to be hopelessly wrong or biased fall out of use by the majority, so long as there are adequate competitors to turn to.
If there is choice in the market for information, the risk of bad information being used should be reduced: Uncertainty and distrust will naturally lead users to check several sources until they become confident in their preferred source.
But that isn't the only reason choice is important.
Chris Anderson popularised the "long tail". A tiny proportion of internet services will attract most of the traffic, yet the aggregate of all the "unpopular" services may still be very important. The implication is that the more choice you offer, the greater total usage you will have: Aiming one product to be "the best" will always miss opportunities, even if it is successful. And sometimes approaches/products that you didn't expect to be popular will become so, particularly if you guide users between the different options. The internet changes the cost model - additional choices that would be prohibitively expensive as physical products can be delivered at very marginal cost electronically.
So choice is cheap; the more choice, the more use; the more use, the more paying punters.
All this is counter-intuitive for those (I include myself here) who would rather specify perfection, and then spend all our resources trying to attain it.
Without wishing to re-open old wounds, can monolithic projects like TransportDirect really be judged as a success? After 7 years the quality of the information is good, but not without "teething problems". The official Transport Direct Evaluation shows that a plethora of other travel information services attract more users than TransportDirect (even Transport for London, which only serves about 15% of the UK population). Since that evaluation makes scant mention of the cost, I've been forced to summarise it in the box below:
Superficial TransportDirect cost analysis
Every user session (basically, an enquiry) costs £0.20 (40 cents) to answer [3]. Crudely discounting the £40+ million ($80 million) development costs over an operational life of ten years (this is the internet, right?) raises the total cost per enquiry to £0.60 ($1.20). To put those figures in perspective, the total revenue from each of those enquiries is likely to be of the order of £2 ($4) [4]. The vast majority of that revenue is paying for core operating costs - like drivers, vehicles, fuel and terminals. A commercial operator spending 30% of their revenue communicating basic information, like when their service departs, will be out of business rather quickly, unless they are attracting very high proportions of genuinely new customers [5].
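The arithmetic behind those figures can be laid out as a short script. It is only a sketch that reuses the rounded numbers given in this article and its notes (a £2 million annual budget, about 10 million sessions a year, £40 million of development spread over ten years with no growth, and roughly £2 of revenue per enquiry). The constant and function names are mine, chosen for illustration.

```python
# Rounded figures quoted in this article and its notes, not official accounts.
ANNUAL_RUNNING_COST_GBP = 2_000_000   # "non-capital funding" budget, 2005/06
SESSIONS_PER_YEAR = 10_000_000        # just under 200,000 sessions a week, rounded
DEVELOPMENT_COST_GBP = 40_000_000     # likely an underestimate of the final cost
OPERATIONAL_LIFE_YEARS = 10           # assumed life, with no growth in usage
REVENUE_PER_ENQUIRY_GBP = 2.00        # very, very approximate


def cost_per_enquiry():
    """Return (running, development, total) cost per enquiry in pounds."""
    running = ANNUAL_RUNNING_COST_GBP / SESSIONS_PER_YEAR
    lifetime_enquiries = SESSIONS_PER_YEAR * OPERATIONAL_LIFE_YEARS
    development = DEVELOPMENT_COST_GBP / lifetime_enquiries
    return running, development, running + development


running, development, total = cost_per_enquiry()
share_of_revenue = total / REVENUE_PER_ENQUIRY_GBP

print(f"Running cost per enquiry:     GBP {running:.2f}")
print(f"Development cost per enquiry: GBP {development:.2f}")
print(f"Total cost per enquiry:       GBP {total:.2f}")
print(f"Share of revenue per enquiry: {share_of_revenue:.0%}")
```

Changing any one constant shows how sensitive the 30% figure is: doubling usage over the ten years, for example, halves both cost components.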
The case for government commissioned super-portals does not sound terribly convincing.
Thinking outside the box
A key benefit of choice and a more open approach, is that it fosters creativity and innovation in how the information is presented.
Let's look at a few examples.
I mentioned the "mashup" (colloquial term) earlier. These combine data from multiple sources and in turn create a unique service or visual output. TubeJP is a simple example (extract shown right). It was built as a non-commercial project by Dean Larman-Moore. It uses elements from Google Maps, some live data from Transport for London (TfL), journey and location point data from timetables and maps, and bespoke journey planner code. Nothing especially revolutionary, but one can see how different data sources can be wrapped together. Critically, this isn't something TfL are offering from their popular journey planner. TubeJP is different. For better or worse, its existence offers users a choice - in this case, an independent service that maps their route, or an official one that does not.
Choice is something TfL is trying to offer: Its journey planner offers almost 30 different user options. For example, the user can opt to select certain modes, or indicate their willingness to walk. TransportDirect's "door-to-door" journey planner provides ample evidence of the futility of trying to cover all the possibilities - almost 60 choices are presented, yet one can easily conceive many more rational user preferences. There are strong parallels to the development of search engines, shown in the box below. In both cases, it would be easy to operate a series of websites, each of which presents the same information differently. For example, the difference between a kid's "Buz4u" service and a "Granny Planner" (the names need work, forgive me) is technically a domain name, an algorithm tweak, and a style sheet. But each would pick up users alienated by the "ultimate" (that is, compromise-based design) travel planner. As TubeJP demonstrates, the motivation doesn't have to come from government or operators - even without access to data sources, fresh ideas will emerge.
Online journey planner and search engine design parallels
The design of online journey planners strongly parallels early internet search engines. These typically presented the user with options or complex methods of writing queries - which most users completely ignored. However, different search engines appeal to different individuals, in part because of the differences in results they deliver. For example, Google now dominates searches by affluent working-age adults, but holds much lower market share among those aged over 70. Ultimately the solution is a better search. But to quote Larry Page, co-founder of Google, "The ultimate search engine would basically understand everything in the world, and it would always give you the right thing. And we're a long, long ways from that." In the meantime, a choice of search engines serves a useful purpose.
The proliferation of social networking sites is hard to ignore: MySpace, Facebook, Bebo, and others. Around 20% of the UK population visit a social networking site each month [6]. That use is likely heavily skewed towards the young, who are among the greatest users of public transport. These sites are developing such that third-party services can be delivered to people within these networks. The best example is Facebook's application interface. An example in theory: a group could decide to visit a location, and a journey planning application could automatically calculate personalised journeys for everyone in the group, such that they arrive at the same time from wherever they are now. The delivery of information becomes a natural part of the group's interactions.
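The core of that group example is simple enough to sketch. The code below assumes a journey planner has already returned a travel time in minutes for each person (the names and durations are made up), finds the earliest moment everyone can be there, and works back to a departure time for each. It ignores timetables entirely, so a real application would need to snap each departure to an actual service.

```python
from datetime import datetime, timedelta


def plan_group_departures(now, travel_minutes):
    """Return (common arrival time, {person: departure time}).

    travel_minutes maps each person to their journey time in minutes,
    as a hypothetical journey planner might report it.
    """
    # Nobody can arrive before the person with the longest journey.
    longest = max(travel_minutes.values())
    arrival = now + timedelta(minutes=longest)
    # Everyone else leaves later, so that all arrive together.
    departures = {
        person: arrival - timedelta(minutes=minutes)
        for person, minutes in travel_minutes.items()
    }
    return arrival, departures


# Hypothetical group and journey times.
now = datetime(2007, 7, 1, 18, 0)
arrival, departures = plan_group_departures(now, {"Ann": 45, "Bob": 20, "Cat": 30})

for person, leave_at in sorted(departures.items()):
    print(f"{person}: leave at {leave_at:%H:%M}, arrive at {arrival:%H:%M}")
```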
Already Facebook applications exist for organising activities like car-sharing (ride sharing). For some large organisations, these social networking sites could become a replacement for the corporate intranet. That trend would create huge potential for services like automated company travel plans (advising employees on the best option for travel to work).
There is much debate as to whether Facebook-like services are likely to develop as platforms in their own right (much like DOS in the 1980s), or whether "walled gardens" (one provider distributing content) will always fail because the wider internet will always offer the most choice (as has been the case thus far in internet history).
I think they will raise the bar of expectations, such that people will be less likely to expect to go to specific websites each time they want certain information.
Perhaps Tim Berners-Lee's Semantic Web will finally emerge? Each time you combine information sources, you have the potential to create some new online service - a new way of communicating information. So perhaps the internet (and related services) will blossom like a fractal. Some branches will be attractive, others less so. Some won't get explored at all. But if we acknowledge the long tail, the fractal is genius: Somewhere in there will be the best solution for any one individual, in any one set of circumstances.
As internet-type information increasingly flows through mainstream personal mobile communication devices (it has been about to happen for the last decade, so it might start soon), the uses of this data will become progressively harder to predict. If the augmented reality concepts of the foresight projects like the Metaverse transpire to be correct, intending travellers will learn about local public transport networks by simply glancing at the street. Concepts like visiting a website or "texting" for information (using SMS to query times at a location), will become "so 2000..."
Google Transit?
This isn't about Google at all. The issues Google Transit raises merely go to the core of what we do with local public transport data. This is all part of a much wider data liberalisation/control agenda. In the case of public transport information, there are real benefits to data liberalisation, even if that means there is no way to recover the costs of gathering the data:
- The more choice of information, the more usage, the more patronage. The cost of providing that choice may be very marginal.
- If data integration is the future, then potential travellers will expect it. Ultimately, if you want to keep and build patronage, you don't have a choice.
In addition, freely available information is likely to get built in while the technology is developing and maturing, rather than wait for the public mainstream to adopt a new technology and then build information into it.
There's a risk that:
- The process stalls because it unravels too many "difficult areas",
- The public sector becomes complicit in the creation of a private monopoly, because the approach is inequitable, or
- The whole exercise might be viewed "a bit like commissioning a travel planner", and all the other potential benefits lost.
The local public transport sector has a really great opportunity to be right at the centre of an information-driven location-orientated hyper-communicating society. Let's hope it can see it as an opportunity, and not a threat.
There are many aspects of the topic that this article does not address, such as alternative approaches to gathering and distributing data, and the role of specific organisations within the process: Why should local public transport information be a public sector function, when most operations are commercial? Rather than provide "the solution", the article aims to feed debate on the reasons why we do what we do.
Further Reading
- Introduction to UK Local Public Transport Data - a non-technical introduction to the United Kingdom's electronic local public transport data.
- Google Transit Feed Specification - information for developers at Google Code.
- Google Transit Trip Planner Group - discussion, and some documentation.
- JourneyPlan: Delivering the data - Mac Logan (Journeyplan) demonstrates a role for private developers.
- Transport Direct Evaluation - official evaluation project, including background information about competitors to TransportDirect.
- [1] Based on the comments of Roger Slevin (Traveline South East). This may not necessarily reflect the views of all those involved.
- [2] Bradley Horowitz's The Tech Lab essay explores some of Yahoo's approaches to this topic.
- [3] "Non-capital funding" - i.e. revenue cost - has a budget of £2 million in 2005/06. While the service may be growing in popularity, so far during 2007 the number of user sessions per week has been hovering just below 200,000. That's about 10 million user sessions per year. Hence a revenue cost of £0.20 per session. Assuming 100 million enquiries over 10 years (which assumes no growth, so is likely a slight underestimate, but this calculation is crude anyway), and a development cost of £40 million (likely an underestimate of the final cost, since TransportDirect is still being enhanced), that's an additional £0.40 per enquiry, or £0.60 total.
- [4] The £2 figure is very, very, approximate. I don't know the balance of modes (bus/rail/car), the extent to which one-way, multiple-journey or long-distance journeys are being enquired about, the proportion of single travellers vs groups, or the extent to which TransportDirect is disproportionately used by those paying reduced fares.
- [5] The Transport Direct Evaluation cites a net mode shift due to using TransportDirect of 5% of surveyed users (that's individual people, not enquiries). The net total clouds the patterns within: For every three users attracted to public transport, one is actually lost. Unfortunately, assessing the motivations behind travel behaviour change in one simple survey question is rather simplistic. For example, someone might seek travel information because they have already subconsciously decided to investigate alternatives. The sample is likely to be heavily biased.
- [6] Comscore reports 10 million visitors to each of the two most popular sites (Bebo and MySpace) during July 2007. That's a third of the online population, a sixth of the total UK resident population. A proportion of users will visit more than one site, so the overall proportion of people using social networking sites in general is higher. The proportions making significant use of these services will be lower. Nielsen//NetRatings (PDF) notes that in May 2007, over half (53%) of Facebook's UK visitors also visited MySpace, as did 43% of Bebo's visitors. 26% of UK MySpace visitors also visited Bebo, 26% also visited Facebook.
The author is grateful to Roger Slevin (as project manager for Traveline South East) for some of the factual information about Google Transit in the UK. The opinions expressed here are entirely those of the author, Tim Howgego.