APIs and Privacy in the European Legal Context: A Study of 4000+ API Terms of Service
I work for the Digital Economy Unit, which investigates how information and communication technologies affect the economy and the digital transformation of government. More specifically, I am currently working on a study called API4IPS (APIs for Innovative Public Services). The study aims to analyse the strategic essentials for adopting APIs to innovate in the public sector, looking simultaneously at three pillars: technical, legal and organisational.
Why do we do that?
APIs are the connecting nodes of all systems. Most of the time, actors in the digital sphere connect to one another through an API, so APIs are technical enablers for some of the goals of the European Digital Strategy. I’ll mention only three of them:
- The European Data Strategy, in which APIs are enablers of data access, supply and control.
- The European Industrial Strategy, in which APIs are enablers of cross-fertilisation between and within sectors (fostered by innovation in small and medium enterprises).
- The European Strategy on AI, in which APIs enable access to algorithms and their controls, as well as access from algorithms to data for their training and observation.
Going back to the three pillars I mentioned earlier, the first one is technical. We analyse how best to manage APIs, not only individually but also as a coordinated effort across an organisation. We look at how to improve discoverability so that our externally facing APIs are actually used and taken up. This leads to the question of how to secure the infrastructure: APIs are doors to our digital premises, so their security needs to be assessed.
ToS: Coordination in digital ecosystems?
Today, I’ll focus on the legal and organisational pillars. We started an empirical legal analysis by gathering over 4000 API Terms of Service (ToS). We analysed the content of these documents to estimate whether there is real coordination in digital ecosystems from a systemic perspective. The aim is to find ways to ensure the robustness of digital infrastructures, of the digital tissue, and more particularly legal stability. So we collected those 4000 ToS, compared them and tried to understand whether these documents, which in the end are contracts, are defined homogeneously. We checked first whether the practices encoded in these ToS comply with the governing laws, and then whether they foster or hinder cooperation and fair competition among digital actors. Terms of service are a unilateral legal offer that API providers make to any consumer of their API. Whoever uses the API accepts the terms of service by default and therefore adheres to the contract.
Methodology
We obtained 4287 self-declared ToS documents from ProgrammableWeb, with whom we collaborated in 2019, and managed to download over 2800 of them with content. We used well-known natural language processing techniques to analyse both the structure of these documents and the conditions they describe. First of all, we analysed the homogeneity of the ToS, focusing on three main aspects: structure, API specificity (or, more precisely, how closely they relate to the digital chain in which they are embedded) and, finally, a comparison between ordinary API providers and big players such as Google, Twitter and Twilio. Regarding structure, we found well-established commonalities among the documents. The clauses we identified were more or less always the same: contracting parties, start of the contract, termination conditions, payments, governing law, liabilities, indemnification, warranty, privacy, severability, IPR, etc.
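As a rough illustration of how recurring clauses could be spotted across a corpus, here is a minimal keyword-based sketch in Python. It is not the method used in the study, which is only described as relying on well-known NLP techniques; the `CLAUSE_PATTERNS` dictionary and the function names are hypothetical, and the keyword lists are assumptions chosen for readability rather than recall.

```python
import re

# Hypothetical keyword patterns, one per clause type named in the talk.
# These are illustrative assumptions, not the patterns used in the study.
CLAUSE_PATTERNS = {
    "termination": r"\bterminat\w*",
    "payment": r"\b(?:payment|fee)s?\b",
    "governing_law": r"\bgoverning law\b|\bgoverned by\b",
    "liability": r"\bliabilit(?:y|ies)\b|\bliable\b",
    "indemnification": r"\bindemnif\w*",
    "warranty": r"\bwarrant\w*",
    "privacy": r"\bprivacy\b|\bpersonal data\b",
    "severability": r"\bseverab\w*",
}


def clauses_present(text):
    """Return the set of clause types whose keywords appear in `text`."""
    return {
        name
        for name, pattern in CLAUSE_PATTERNS.items()
        if re.search(pattern, text, re.IGNORECASE)
    }


def clause_coverage(documents):
    """Return, for each clause type, the share of documents mentioning it."""
    counts = {name: 0 for name in CLAUSE_PATTERNS}
    for text in documents:
        for name in clauses_present(text):
            counts[name] += 1
    total = len(documents)
    return {name: (n / total if total else 0.0) for name, n in counts.items()}
```

Calling `clause_coverage(list_of_tos_texts)` gives one share per clause type. Consistently high shares across clause types would point to the kind of structural commonality described above, although keyword matching will miss clauses that are worded unusually.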
Regarding API specificity, we counted the number of mentions of APIs in these terms of service and found that 40% of them didn’t mention APIs even once. This surprised us given the data set: ProgrammableWeb is a repository in which API providers declare their own APIs. We couldn’t understand how the ToS could be so unspecific. As we dug deeper, we understood that most of these terms of service, even though they relate to APIs, are dedicated only to the single application for which the API is used. This raises several questions. Instead of serving as an access point for multiple potential applications, are APIs being restricted in their uses? Are providers missing out on opportunities to have their APIs used? Do they lack a systemic vision of the services they provide?
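The mention count described above can be approximated with a simple regular expression. The sketch below is an assumption about how such a count might be done, not the study's actual code; `API_PATTERN` and both function names are hypothetical.

```python
import re

# Matches "API", "APIs" and the spelled-out form, case-insensitively.
# Word boundaries avoid false hits inside words such as "therapist".
API_PATTERN = re.compile(
    r"\bAPIs?\b|\bapplication programming interfaces?\b",
    re.IGNORECASE,
)


def count_api_mentions(text):
    """Count explicit mentions of APIs in one ToS document."""
    return len(API_PATTERN.findall(text))


def share_without_api_mentions(documents):
    """Return the fraction of documents that never mention APIs."""
    if not documents:
        return 0.0
    silent = sum(1 for text in documents if count_api_mentions(text) == 0)
    return silent / len(documents)
```

Applied to the downloaded ToS texts, `share_without_api_mentions` would yield a figure comparable in spirit to the 40% reported here, with the caveat that the exact matching rules used in the study are not stated.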
To continue this analysis, we looked at what the big providers do. They have a multi-layered Terms of Service structure: typically one set of terms covers horizontal API services, such as authentication, and further sets of terms are specific to the functionality provided through each API (such as Google Maps, Google Ads, etc.). That’s where we found a difference between the types of providers.
We then analysed compliance with governing laws, getting closer to the topic of privacy. We identified 111 legal documents from around the world that are mentioned in these terms of service. They include acts, regulations, directives, ordinances and decrees, and most of them were enacted in the late 1990s and 2000s. The other surprising point was that, even though ProgrammableWeb has more users in the US, 60% of these legal documents were from the EMEA region (European legislation). Not surprisingly, however, US laws were cited more often and in more ToS documents. GDPR was identified as the most prominent source of regulation. There were also mentions of soft regulatory instruments, such as standards, codes of conduct and codes of practice.
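To make the idea of counting legal references concrete, here is a small Python sketch that tallies how many ToS documents cite a given law. Only GDPR is named in the talk; the other entries in the hypothetical `LAW_PATTERNS` table (DMCA, CCPA) are illustrative picks, and the study's own list of 111 legal documents is not reproduced here.

```python
import re
from collections import Counter

# Illustrative patterns only. GDPR is named in the talk; DMCA and CCPA
# are assumptions added to show how the table would be extended.
LAW_PATTERNS = {
    "GDPR": re.compile(
        r"\bGDPR\b|General Data Protection Regulation|Regulation \(EU\) 2016/679",
        re.IGNORECASE,
    ),
    "DMCA": re.compile(r"\bDMCA\b|Digital Millennium Copyright Act", re.IGNORECASE),
    "CCPA": re.compile(r"\bCCPA\b|California Consumer Privacy Act", re.IGNORECASE),
}


def laws_cited(text):
    """Return the set of known laws referenced in one ToS document."""
    return {name for name, pattern in LAW_PATTERNS.items() if pattern.search(text)}


def citation_counts(documents):
    """Count, per law, the number of documents that cite it at least once."""
    counts = Counter()
    for text in documents:
        counts.update(laws_cited(text))
    return counts
```

Counting documents rather than raw occurrences keeps one very long ToS from dominating the ranking; `citation_counts(docs).most_common()` then lists the laws from most to least cited.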
Privacy
Almost one quarter of all the legal documents mentioned in the ToS relate to privacy. Others relate to digital governance topics such as ICT regulation, intellectual property, contracts and consumer protection. There are also mentions of sector-specific regulations covering the environment, health, finance and tax, export and trade, and anti-fraud. Regarding privacy and data subject rights, we identified 25 laws related to privacy. They came from 13 different areas, ranging from regional to national to multinational in scope, and as I said, the GDPR was the most mentioned. By the way, we found that some of these laws have since been amended or replaced but are still cited in the terms of service. It would be good to tell those companies to review their ToS. As far as data subject rights are concerned, we looked at the most common topics:
- data portability
- privacy protection
- erasure
- the right to be forgotten
- rectification aspects
- restriction on data processing
We also studied jurisdiction clauses to see whether they are a legal barrier to innovation. What you see on the map above are the places mentioned in the jurisdiction clauses, that is, the courts where you have to go to settle a dispute. Imagine you are a Spanish developer building an application on a US API, and the court you would have to use is in California: you might not be willing to take that risk because of the distance, the language barrier and so on. This discourages interaction.
The analysis of cooperation and fair competition also helped us understand the dynamics at play, looking at transparency, termination conditions, liability and warranties. Regarding transparency, we tried to assess how difficult it is for a human being to process and understand a terms of service contract. Not surprisingly, legal documents are very difficult to read, all the more so because 80% of the documents provided no definitions of their terms, the word count per sentence was very high and, as a result, the readability score was very bad. The European Platform-to-Business Regulation has already asked providers to make their ToS more comprehensible, so that unclear terms do not hinder the possibility of embarking on new ventures.
As far as termination clauses are concerned, we were struck by the fact that 35.7% of the documents analysed declared unbalanced termination conditions, meaning that the provider reserves the right to stop the service at any time without any responsibility. If these terms of service are API specific, they might create contract discontinuities in data value chains. This, of course, generates uncertainty about the viability and continuity of services and discourages innovation. In the same vein, our analysis of liability and warranties showed that 14% of the documents mentioned a total exclusion of liability and only 296 ToS offered some warranties. We are now digging deeper and will publish more results in 2022, notably about:
- Indemnification
- Out-of-court dispute settlement
- Payment conditions
- Suspension, modification or restriction of service provision
- IPRs
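Returning to the transparency indicators discussed earlier (sentence length and the presence of definitions), the following Python sketch shows one way such measures might be computed. The talk does not name the readability score that was used, so this example stops at average words per sentence; the function names and the heading heuristic are assumptions.

```python
import re


def split_sentences(text):
    """Naively split text into sentences on ., ! or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def average_sentence_length(text):
    """Return the mean number of words per sentence (0.0 for empty text)."""
    sentences = split_sentences(text)
    if not sentences:
        return 0.0
    words = sum(len(re.findall(r"\b\w[\w'’-]*", s)) for s in sentences)
    return words / len(sentences)


def has_definitions_section(text):
    """Heuristic: does any line start with a 'Definitions' heading?"""
    pattern = r"^\s*(?:\d+[.)]?\s*)?definitions\b"
    return bool(re.search(pattern, text, re.IGNORECASE | re.MULTILINE))
```

The sentence splitter is deliberately simple and will miscount around abbreviations such as "Inc." or "e.g."; a production analysis would more likely rely on a dedicated tokenizer.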
Q&A Section
Q: Do you think APIs deserve specific regulation in Europe, such as an API Act, or do contract and trade practices seem reliable enough that we should let lawyers develop a set of common practices?
A: It is very difficult to say that APIs themselves deserve regulation, because they are a technology, and we know that technologies evolve very quickly. Legislation is typically made with a broader scope in mind. However, many policy instruments other than regulation can be used to enforce good practices for APIs. APIs can also be used to monitor the implementation of data regulation. It is clear to me that APIs already play an important role in the data strategy, and some documents already say so explicitly. Others are less explicit but still refer to APIs as connecting dots, as in the Payment Services Directive.
Q: Now that you have studied 4000+ ToS, do you also plan to study how these ToS are applied? Are they applied in reality, or are they just declarative texts that don’t reflect how the APIs are actually implemented?
A: This is a very interesting question and there might be some evolution in this direction in the future.