3.1. Potential impact of regulations

There is no doubt that various anticipated and forthcoming EU policy and regulatory initiatives will have a profound impact both on research activities within the AI4Media project and on the commercial and non-commercial activities undertaken by AI4Media partners. In the following paragraphs, we briefly discuss the potential impact of EU regulatory initiatives in the field of AI on the AI4Media project.

Data and data access for researchers

The availability of social media data for academic research and journalism is a major challenge. Social media companies have no incentive or interest in revealing what kind of data they hold on users and how this data is used. Some social media data is accessible through Application Programming Interfaces (APIs), but most of the major social media companies make it difficult for academics and journalists to obtain comprehensive access to their data (Batrinca B., Treleaven P.C., 2015). Access to social media platforms’ data for researchers is currently governed mainly by contractual agreements and the platforms’ own terms of service. As stated in the Assessment of the Code of Practice on Disinformation, “it is a shared opinion amongst European researchers that the provision of data and search tools required to detect and analyse disinformation cases is still episodic and arbitrary and does not respond to the full range of research needs” (European Commission, 2015).

The most recent example is Facebook’s shutdown of the accounts of researchers at New York University (NYU) who used the Facebook Ad Library to study political advertising and misinformation as part of the Ad Observatory project (Hatmaker T., 2021). In Facebook’s view, NYU’s Ad Observatory studied political ads using unauthorized scraping to access and collect data from Facebook, in violation of the platform’s terms of service (Clark M., 2021).

As the “Artificial intelligence in the audio-visual sector” report by the European Audiovisual Observatory makes clear, “we need more data and independent research on the availability of different types of content, the consumption and engagement with that content, the participants involved in this process and the impact of these processes on individual and collective democratic and cultural performances” (Cappello M., 2020). Recent regulatory initiatives, such as the Digital Services Act, try to address this problem. Article 31 of the DSA proposal contains a specific provision on data access and scrutiny. It obliges very large online platforms (VLOPs) to provide the Digital Services Coordinator of establishment or the Commission, upon their reasoned request and within a reasonable period, with access to data that are necessary to monitor and assess compliance with the DSA. Upon a reasoned request from the Digital Services Coordinator of establishment or the Commission, VLOPs shall also provide access to data to “vetted researchers”. According to Art. 31(4), “in order to be vetted, researchers shall be affiliated with academic institutions, be independent from commercial interests, have proven records of expertise in the fields related to the risks investigated or related research methodologies, and shall commit and be in a capacity to preserve the specific data security and confidentiality requirements corresponding to each request.” The reasoned request to access data must, however, come from the Digital Services Coordinator of establishment or the Commission. Moreover, according to EU DisinfoLab, the new rules impose overly restrictive criteria for “vetted researchers”, narrowing the scope to university academics. This is unlikely to facilitate data access for a variety of other actors: journalists, educators, web developers, fact-checkers, digital forensics experts, and open-source investigators (EU DisinfoLab, 2021). The final scope of this provision will undoubtedly shape the way in which (vetted) researchers, journalists, and social activists will be able to access platforms’ data.

While discussing access to (media) data, it is worth mentioning that in December 2020 the Commission adopted an Action Plan to support the recovery and transformation of the media and audiovisual sector (Media and Audiovisual Action Plan) (European Commission, 2020). The Action Plan addresses the financial viability of the media sector, helping the media industry recover, fully seize the opportunities of digital transformation, and further support media pluralism. Importantly, under Action 4 ‘Unleashing innovation through a European media data space and encouraging new business models’, the EC proposes the concept of a “media data space” to support media companies in sharing data and developing innovative solutions.

The EC recognizes the importance of data for the media sector: “data spaces can change the way in which creators, producers, and distributors collaborate. They host relevant media data such as content, audience data and content meta-data as well as other types of data on users’ behaviors that might be useful to create content better tailored to consumer needs and distribute it more efficiently.” The initiative of a European “media data space” builds on the European Data Strategy and the proposed Data Governance Act. While it remains to be seen what shape the media data space will take, the creation of a shared data space would facilitate news verification by European fact-checking networks and help fact-checkers access data relevant to the spread of disinformation.
Finally, it is worth recalling that data and data governance requirements for training AI systems are growing. Art. 10 of the AI Act provides that high-risk AI systems which make use of techniques involving the training of models with data shall be developed on the basis of training, validation and testing data sets that meet quality criteria, such as appropriate data governance and management practices. Moreover, training, validation and testing data sets shall be relevant, representative, free of errors and complete. If adopted, these legally binding requirements will set a high standard for data processing.
Although this provision applies only to ‘high-risk’ AI systems in the current draft of the AI Act, its scope may change before the final text is adopted.
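To make these criteria more concrete, the following minimal sketch (in Python) illustrates the kind of automated checks a developer might run on training, validation and testing sets, e.g. completeness of required fields and a crude representativeness check. The specific checks, field names and thresholds are our own illustrative assumptions, not requirements derived from the text of Art. 10.

```python
# Illustrative only: simple data-quality checks of the kind Art. 10 AI Act gestures at
# (completeness, absence of errors, basic representativeness). All thresholds and field
# names below are assumptions for illustration, not legal requirements.
from collections import Counter

def check_dataset_quality(records, required_fields, group_field, min_group_share=0.05):
    """Run completeness and crude representativeness checks on a list of dict records."""
    issues = []

    # Completeness: every record must contain every required field with a non-empty value.
    for i, rec in enumerate(records):
        missing = [f for f in required_fields if rec.get(f) in (None, "")]
        if missing:
            issues.append(f"record {i}: missing fields {missing}")

    # Representativeness (very crude proxy): flag any group falling below a minimum share.
    groups = Counter(rec.get(group_field, "unknown") for rec in records)
    total = sum(groups.values())
    for group, count in groups.items():
        if total and count / total < min_group_share:
            issues.append(f"group '{group}' underrepresented: {count / total:.1%} of records")

    return issues

# Toy usage with hypothetical records
data = [
    {"text": "sample", "label": "neutral", "language": "en"},
    {"text": "", "label": "positive", "language": "en"},
    {"text": "voorbeeld", "label": "negative", "language": "nl"},
]
for issue in check_dataset_quality(data, ["text", "label"], "language"):
    print(issue)
```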

On the other hand, one must keep in mind that the privacy and data governance requirement of the AI HLEG applies to all AI systems, regardless of the context (high-risk or low-risk, academic research or commercial application). Similarly, the GDPR already contains legally binding requirements for all data processing activities involving personal data (including in the context of AI systems), such as the obligation to have a lawful ground for data processing and to adhere to the GDPR principles.

Academic research exception in the AI Act

The key issue which comes to the fore is the scope of exceptions for academic research. It is important to note that a research exception is provided only in a recital; it is not dealt with elsewhere in the text of the proposed Regulation. Recital 16 of the AI Act deals with the prohibition of placing on the market, putting into service or use of certain AI systems intended to distort human behaviour, whereby physical or psychological harms are likely to occur. Such AI systems deploy subliminal components that individuals cannot perceive, or exploit the vulnerabilities of children and other people due to their age or physical or mental incapacities. They do so with the intention of materially distorting the behaviour of a person, and in a manner that causes or is likely to cause harm to that or another person. Research for legitimate purposes in relation to such AI systems should, however, not be stifled by the prohibition, if such research does not amount to use of the AI system in human-machine relations that exposes natural persons to harm and is carried out in accordance with recognized ethical standards for scientific research. Should this provision be conceived as a general exception for research, or as a special exception related only to the prohibited AI practices referred to in recital 16 (i.e., systems intended to distort human behaviour, whereby physical or psychological harms are likely to occur), but not to other categories (e.g., biometric identification systems, social scoring systems)? It seems that this recital only addresses Art. 5(1)(a) and (b).

As a consequence, researchers will have to comply with the other obligations of the AI Regulation, i.e. those related to “high-risk” systems or certification procedures. Moreover, the recital provides that “research (…) should not be stifled”, but only if the research is done “for legitimate purposes”, “does not amount to the use of AI (…) that exposes natural persons to harm” and “is carried out in accordance with recognised ethical standards”. The meaning of these notions, especially the notion of “research for legitimate purposes”, is unclear, which may adversely affect the legal certainty of researchers.

AI Act’s applicability to media applications

In addition, the scope of the AI Act is not clear for AI systems applied in the media sector. For instance, it is not clear whether the provision prohibiting the use of subliminal techniques could cover some AI systems used in practice, such as recommender systems or systems used for targeted advertising. The conditions attached to manipulative AI, such as the use of subliminal techniques or the exploitation of a specific vulnerability of a specific group of persons, as well as the requirement of intent, may result in these provisions having a limited scope. More incidental manipulative systems (such as targeted advertising) are therefore not likely to be covered.
Though the explanatory memorandum suggests that other existing instruments still cover manipulative or exploitative practices beyond those prohibited under Art. 5, it fails to acknowledge that none of this legislation contains explicit provisions on manipulation. As Bublitz and Douglas emphasize, AI systems can powerfully influence individuals’ thoughts and behaviours by bypassing or weakening rational control (Bublitz J. C., Douglas T., 2021). This includes microtargeted advertisement, as well as abuse of trust in recommender systems and their influence on decision-making. Thus, these practices should also be deemed manipulative and should be properly addressed in the AI Act, instead of simply being referred to other legislation. Perhaps they could be classified as high-risk AI if they substantially influence thought or behaviour in ways that bypass or weaken rational control.

The high-risk AI systems listed in the annex to the AI Act do not include media applications, but the media sector is directly concerned when it comes to transparency obligations, both in the AI Act proposal and in the DSA proposal.
The scope of the AI Act is also unclear when it comes to the transparency obligations applicable to bots, emotion recognition systems and deepfakes (Art. 52 of the AI Act). Particular attention should be paid to how the definition of “emotion recognition system” and the transparency obligations applicable to such systems evolve as the AI Act proposal moves through the legislative process. In particular, will ‘sentiment analysis’ fall under the “emotion recognition system” definition? And can measuring and predicting a user’s affective response to multimedia content distributed on social media with the use of physiological signals be considered as such?

As a side note, according to the impact assessment of the AI Act, transparency obligations already exist in other cases which may involve AI, such as when a person is subject to solely automated decisions or is micro-targeted. The impact assessment refers to the following legislation providing transparency obligations: data protection legislation (Art. 13 and 14 of the GDPR), consumer protection law, the proposal for the e-Privacy Regulation and the proposal for the Digital Services Act. However, as explained in Section 4.2.1, the transparency provisions on recommender systems in Art. 29 DSA only apply to very large online platforms. Should the scope of this provision change in upcoming DSA drafts, automatically ranking user profiles and recommending content could become subject to the new obligations. Similarly, the provisions on targeted advertising in Art. 24 DSA only apply to online platforms, which currently limits the reach of that provision.
This shows that the AI Act proposal does not exist in a legal vacuum and that existing legislation is equally applicable to AI systems. The applicability of various legal frameworks to different AI systems and different types of platforms, however, makes the current legal picture puzzling.

Algorithmic copyright filtering

Concerning AI technologies to detect IP infringements, both the resolution and the action plan encourage the use of filtering tools. Though such tools can detect potential infringements faster than human review, it is important to note that AI technologies are not yet sophisticated enough to analyse the nuances of copyright protection. Algorithmic filtering creates particular problems for the detection of copyright limitations and exceptions (Samuelson P., 2020).
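To illustrate this structural limitation, the toy sketch below (in Python) mimics a match-based upload filter: it compares content against a reference set of protected works and never consults the context of use. Real filters rely on perceptual fingerprints rather than exact hashes, but the point remains that a match reveals nothing about whether a use is covered by a limitation or exception; the function and data here are purely hypothetical.

```python
# Illustrative only: a toy content-matching filter. Real systems use perceptual
# fingerprints rather than exact hashes, but the limitation is the same: the filter
# sees only whether content matches a protected work, not whether the use falls
# under an exception such as quotation, parody, or education.
import hashlib

# Hypothetical "reference database" of protected works, keyed by exact hash.
PROTECTED_WORKS = {hashlib.sha256(b"copyrighted song excerpt").hexdigest()}

def filter_upload(upload_bytes, context):
    """Flag uploads that match a protected work; note that 'context' is never consulted."""
    digest = hashlib.sha256(upload_bytes).hexdigest()
    if digest in PROTECTED_WORKS:
        # The match tells us the content is protected, but nothing about whether the
        # use is covered by a limitation or exception.
        return "blocked"
    return "allowed"

print(filter_upload(b"copyrighted song excerpt", context={"use": "quotation in a lecture"}))
# -> "blocked", even though the stated use might be covered by an exception.
```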

In the academic context, public domain works and non-exclusive/open licenses such as Creative Commons licenses are heavily used. However, such filtering technologies usually do not come equipped with an aggregated database of non-exclusively licensed and/or public domain works. Thus, without further clarification of how the implementation of such technologies affects the aforementioned exceptions, academic and creative work could suffer considerably.
Additionally, such filtering tools can interfere with freedom of expression by removing legal content, violating the right of access to knowledge and the freedom to share. Unfortunately, the issue of over-removal is gaining particular prominence in countries where notice-and-stay-down regimes have been trending, perhaps influenced by the EU’s policy of incentivizing the implementation of such tools. Furthermore, where automated decision-making (algorithmic filtering) is relied on to assess the legality or illegality of such works, it is important to have ex-ante human review mechanisms before removing content, in order to avoid violations of fundamental rights and to prevent bad-faith takedown notices.

Authors:
Lidia Dutkiewicz, Emine Ozge Yildirim, Noémie Krack from KU Leuven
Lucile Sassatelli from Université Côte d’Azur