Open Source Software and Open Innovation communities dream of government data freely moving as Open Data through Smart City APIs. In 2017 Xapix and FIWARE collaborated on Open Data projects, and we gained new insights into the challenges affecting data: B2C and B2D politics, missing incentives for key stakeholders, and lack of strategy and expertise. Here’s what we’ve learned and the solutions we deliver.
The Dream of Public Data Infrastructure
Imagine enjoying a movie on your way to work, knowing that your autonomous vehicle is taking you safely and quickly to your destination. A commute where cars communicate about dangers and traffic, help find the nearest coffee shops and suggest parking spots or charging stations. A world of self-driving, clean and safe, connected cars. It’s what people in mobility dream about.
This dream depends on advancements in Open Data or Smart City APIs. We can’t move forward without overcoming challenges with data collection, maintenance, and distribution. Not enough open data exists, and what does is extremely heterogeneous. Few economic or legislative incentives encourage companies and organizations to produce clean data and make it available. Government employees often comply only with baseline requirements. We need to access all necessary data, deal with its inconsistency, figure out how to maintain it and educate participants. This might sound overwhelming. The good news? We understand the challenge and have some solutions.
In 2017, Xapix had the pleasure of partnering with FIWARE, providing our tool to help cities like Malaga, Spain, sustainably comply with FIWARE NGSI Smart City data standards. Xapix is normally used by automotive manufacturers, tier-one suppliers, and mobility companies. We’re also engaged in a research project on connected vehicle data standards with the mFUND program of the German Federal Ministry of Transport and Digital Infrastructure (BMVI). I’ve seen how tools like ours, as part of a greater political and industrial data strategy, can help build a smarter future.
Challenges with Industries and Corporate Politics
The standards that do exist aren’t clear or enforceable. Industry-led initiatives run for decades and achieve only partial adoption, as with IATA NDC in aviation. Participants branch off their own versions, implement the standard only partially, or add competing approaches if they don’t get their way in decision committees. Whether stakeholders are merely distracting or actively sabotaging is impossible to judge, which is why pretend-collaboration can also be a tool of corporate politics. Examples of failures to define crystal-clear industry standards include SQL, OAuth and JSON Schema, each of which exists today in many different versions or implementations. Positive examples like Swagger / OpenAPI and jsonapi.org shine all the brighter: it is possible!
Companies don’t respect even these standards. Resisting industry standards helps market leaders maintain their top position, so they tend to lobby against legislative standardization efforts. With enough market dominance, a company can simply ignore joint industry standardization efforts, or derail them by joining an initiative and blowing it up from the inside. Market leaders usually try to use their advantage in the amount of data they possess to lock in existing customers or to create better services and new products on their own, rather than sharing the data with the public and therefore with competitors. In agricultural machinery manufacturing, for example, the market leader has for decades ignored industry efforts to standardize the signals sent via the CAN bus, which would let tractors communicate with the competitor devices they are pulling and vice versa. Or, more familiar to web-savvy readers, Microsoft Internet Explorer ignored W3C browser standards until fairly recently.
Companies prefer to sell data rather than share it. Selling data is an industry-wide trend, especially when guarding data doesn’t provide a competitive advantage. Independent startups such as Otonomo and SmartCar in the mobility sector use this business model; however, the market is young and the monetary value of data is often unknown or unstable. Tesla has had tremendous success selling its autonomous driving data, which it owns according to its terms of service, directly to competitors. Those competitors need to catch up in research and development, but they don’t have a large enough fleet of vehicles to collect sufficient data.
A whole fleet of data aggregator businesses across industries serves as a patch for today’s deeply heterogeneous data landscape. The most successful of them usually do the hard work of collecting and integrating all available data for a specific use case from all providers, and then resell that data, often along with additional services. In the EU mobility B2B sector, HaCon for train schedules and Distribusion for ground transportation bookings come to mind. Consumers will probably be more familiar with brands like SkyScanner for flight tracking or moovel for car sharing and public transportation.
Challenges with Government and Legislation
Legislation that would make open sharing possible is progressing, but the road is still rocky. The challenges that exist in private industry exist in the public sector too: heterogeneous data, incomplete standards, and a lack of distribution channels all cause significant problems.
Government entities choose how to represent their shared data, causing innumerable instances of incompatibility. Differences in column names and capitalization within a single language already cause machine problems when parsing data. In places like the European Union, various languages and alphabets overlap to exacerbate the problems of inconsistent representation.
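To make the column-name problem concrete, here is a minimal Python sketch of how a consumer might normalize headers from different publishers before merging datasets. The function name `normalize_header`, the `ALIASES` table, and the sample headers are all hypothetical and chosen for illustration; they do not come from any particular standard or from our tool. The sketch only covers Latin-script headers. Other alphabets would need transliteration or a translation table, which is a much harder problem.

```python
import re
import unicodedata

# Hypothetical alias table: maps normalized local headers to one canonical name.
# A real project would maintain a table like this per dataset and language.
ALIASES = {
    "street_name": "street",
    "strasse": "street",
    "calle": "street",
    "parking_spots": "parking_spaces",
}


def normalize_header(raw: str) -> str:
    """Reduce a column header to a lowercase ASCII key, then apply aliases."""
    # casefold() handles capitalization and maps the German sharp s to "ss".
    text = raw.strip().casefold()
    # Decompose accented letters and drop the combining accent marks.
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Collapse spaces, hyphens and other punctuation into single underscores.
    text = re.sub(r"[^a-z0-9]+", "_", text).strip("_")
    # Fall back to the normalized header when no alias is known.
    return ALIASES.get(text, text)


if __name__ == "__main__":
    for header in ["Straße", "  Street Name ", "CALLE", "Código Postal"]:
        print(repr(header), "->", normalize_header(header))
```

Even this small example shows where the real effort lies: the normalization rules are easy to write, while the alias table has to be curated by hand for every publisher.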
Political pressure to adopt certain standards is weak and faces challenges from companies actively working against open-data initiatives. Most standards require the publication of only a small range of data without stipulating format or providing information about how to access it.
Companies that do comply still find ways to obscure their data. For instance, some countries oblige the oil industry to make environmental and production data accessible. Doing so goes against their own interests, so they organize data in confusing table aggregations and publish it in PDFs stored on local servers with changing IP addresses.
The location of publication isn’t consistent. Some cities publish data on their own websites, others provide it on their file servers. Some providers, such as Opendatasoft, sell hosting products as software as a service to cities. While those make data more accessible, it’s still necessary to go through various providers with their own interfaces and payment models rather than through consolidated sources.
Even if legislation requires an agency to publish its data, those working inside the agency may still have little incentive to help businesses or open-source communities. Producing and publishing Open Data generally sits at the edge of employees’ responsibilities, and not enough time or education is allocated to support these tasks. This is especially true for people working with data feeding into Smart City APIs, where continuous support and maintenance add to the workload.
We Need Better Tools
Tools that enable companies and government agencies to comply with strengthened standards and make data accessible and malleable can solve these challenges. To strengthen economic incentives, we need tools that integrate and exchange data to enable new business partnerships and data marketplaces. These tools should serve not only scientists and programmers but also a broader audience with little or no technical expertise. Such users can already find a few appropriate tools, but they aren’t widely adopted yet. New tools are in the making and I am proud to be part of this endeavour.