AI comes with responsibilities: which provisions of the GDPR to comply with

Data protection is at the crossroads of the many applications of artificial intelligence processing personal data. As AI technologies, and in particular those based on machine learning, are inherently intrusive and maximise the volume of data processed, organisations must ensure that the rights and freedoms of individuals are not undermined when designing and using these tools.

The General Data Protection Regulation applies

First, organizations must be transparent about their use of AI and provide clear and concise information about how personal data is collected, used and processed. This includes the purposes for which the data is collected, the types of data collected and the legal basis for its collection and processing.

Personal data may be collected and used only for specified, explicit and legitimate purposes. Organizations must define at the design stage the purposes for which personal data are processed and ensure that processing complies with the General Data Protection Regulation (GDPR). The data used must be indispensable to achieving those purposes.

Next, each planned processing operation must have a legal basis, that is, a justification provided for by law. The GDPR provides for six legal bases: consent, compliance with a legal obligation, performance of a contract, performance of a task carried out in the public interest, protection of vital interests, and pursuit of a legitimate interest. In concrete terms, the legal basis is what entitles an organization to process personal data. It must be chosen before processing begins.

The principle of data minimisation also applies: the personal data collected and used must be adequate, relevant and limited to what is necessary for the stated purpose. The processing itself, and the types and quantities of personal data processed, must be proportionate to the purposes pursued; being merely useful is not enough.

As the use of large amounts of data is at the heart of the development and use of AI systems, minimisation can pose challenges for controllers.
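
One simple way to put minimisation into practice is to declare, for each purpose, the fields that are strictly necessary and to discard everything else before the data reaches a training pipeline. The sketch below is only an illustration: the purpose name, the field names and the ALLOWED_FIELDS mapping are hypothetical, and deciding which fields are actually necessary remains a legal and business assessment, not a coding one.

```python
# Hypothetical mapping from a declared purpose to the fields it needs.
# In practice this list would come from the controller's own analysis.
ALLOWED_FIELDS = {
    "credit_scoring": {"income", "outstanding_debt", "payment_history"},
}


def minimise(record: dict, purpose: str) -> dict:
    """Keep only the fields declared as necessary for the given purpose."""
    allowed = ALLOWED_FIELDS[purpose]  # raises KeyError for an undeclared purpose
    return {key: value for key, value in record.items() if key in allowed}


# Example record with fields that are not needed for the stated purpose.
raw = {
    "name": "Alice Example",
    "income": 42000,
    "outstanding_debt": 3000,
    "payment_history": "good",
    "religion": "undisclosed",
}
minimised = minimise(raw, "credit_scoring")
```

Failing loudly on an undeclared purpose is a deliberate choice here: it prevents data from being processed for a purpose that was never defined.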

The Commission Nationale de l'Informatique et des Libertés (CNIL, the French supervisory authority) makes the following recommendations (non-exhaustive list):

Critically assess the nature and quantity of data to be used;
Check the performance of the system when it is fed with new data;
Clearly distinguish the data used during the learning and production phases;
Use mechanisms for pseudonymisation or filtering/obfuscation of data;
Draw up and make available documentation on how to compile the dataset used and its properties (source of the data, sampling of the data, verification of their integrity, cleaning operations carried out, etc.);
Regularly reassess the risks for the persons concerned (privacy, risk of discrimination/bias, etc.);
Ensure data security, including precisely defined access authorisations to limit risks.

Source: CNIL, AI: how to comply with the GDPR?, 05 April 2022
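
As an illustration of the pseudonymisation recommendation above, direct identifiers can be replaced with a keyed hash before data enters a training set. This is a minimal sketch, not a method prescribed by the CNIL: the pseudonymise function and the key value are hypothetical, and the key would have to be stored separately from the dataset. Pseudonymised data remains personal data under the GDPR, since whoever holds the key can re-link it to individuals.

```python
import hashlib
import hmac


def pseudonymise(identifier: str, secret_key: bytes) -> str:
    """Replace an identifier with a keyed HMAC-SHA256 digest (hex string)."""
    return hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


# Placeholder key for the example only; a real key must be generated
# securely and kept apart from the pseudonymised dataset.
key = b"example-key-store-separately"

token_a = pseudonymise("alice@example.com", key)
token_b = pseudonymise("alice@example.com", key)
token_c = pseudonymise("bob@example.com", key)
```

The same identifier always maps to the same token, so records can still be linked for training, while different identifiers give different tokens.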

The retention period of the collected data must also be determined. Personal data cannot be kept indefinitely. The GDPR requires defining a period of time after which data must be deleted or, in some cases, archived. This retention period must be determined by the controller according to the purpose for which the data were collected.
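
A retention rule of this kind can be checked automatically. The following sketch assumes a hypothetical retention period of 365 days for a purpose named "model_training"; the actual period is for the controller to define, and the record layout is invented for the example.

```python
from datetime import date, timedelta

# Hypothetical retention periods per purpose, set by the controller.
RETENTION = {"model_training": timedelta(days=365)}


def is_expired(collected_on: date, purpose: str, today: date) -> bool:
    """True when the retention period for the purpose has elapsed."""
    return today > collected_on + RETENTION[purpose]


records = [
    {"id": 1, "collected_on": date(2022, 1, 10), "purpose": "model_training"},
    {"id": 2, "collected_on": date(2023, 3, 1), "purpose": "model_training"},
]
today = date(2023, 6, 1)

# Identifiers of records that are due for deletion or archiving.
to_delete = [
    r["id"] for r in records if is_expired(r["collected_on"], r["purpose"], today)
]
```

Passing the current date as a parameter, rather than reading the system clock inside the function, keeps the check easy to test and audit.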

Another key consideration for organizations is automated decision-making. Article 22 of the GDPR gives individuals the right (with the exceptions provided for in paragraph 2 of that article) not to be subject to a decision based solely on automated processing, including profiling, which produces legal effects concerning them or similarly significantly affects them. This means that organizations need to consider the potential impact of their AI systems on individuals and ensure that such decisions are not taken without adequate human oversight, unless one of those exceptions applies.
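
In a decision pipeline, this can translate into a routing step that sends any decision with a significant effect to a human reviewer instead of applying the model output directly. The sketch below is purely illustrative: the Decision class, its fields and the route function are hypothetical, the exceptions of Article 22(2) are not modelled, and whether an effect is "significant" is a legal assessment that the code simply takes as an input.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    subject_id: str
    model_score: float
    # Set by a prior legal/business assessment, e.g. a refusal of credit.
    significant_effect: bool


def route(decision: Decision) -> str:
    """Send significant decisions to a human reviewer; automate the rest."""
    if decision.significant_effect:
        return "human_review"
    return "automated"


loan_refusal = Decision("subject-001", model_score=0.18, significant_effect=True)
newsletter_ranking = Decision("subject-002", model_score=0.91, significant_effect=False)

outcomes = [route(loan_refusal), route(newsletter_ranking)]
```

Note that the routing depends only on the effect of the decision and not on the model score, so a confident model does not bypass human review.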

What measures should be taken?

The use of artificial intelligence therefore poses delicate challenges for controllers, who must ensure that personal data is collected, processed and stored in accordance with applicable data protection laws and regulations, including the GDPR.

Here are some recommendations to minimise the risk of non-compliance:

Adopt the ‘privacy by design’ principle so that the protection of personal data is built into AI-supported processing from the outset, particularly in terms of data governance, organisation and security.
Implement appropriate security measures to protect personal data against unauthorized access, disclosure or misuse.
Ensure that adequate documentation is in place to demonstrate the actions taken to achieve compliance, for example by setting up a database of the annotations describing the data and by recording the categorisation, cleaning and standardisation operations carried out.
Develop an appropriate communication policy to provide individuals with transparent, clear and understandable information on how their personal data will be used by the AI system, including their rights to access, rectify, erase or restrict the processing of their personal data.
Establish procedures to respond to requests for access or rectification of personal data, and to deal with complaints or other issues related to data protection.
Protect against risks related to AI models such as membership inference attacks, model exfiltration attacks or model inversion attacks.
Supervise the continuous improvement of algorithms, e.g. through retraining carried out at defined intervals.
Avoid algorithmic bias and put a framework around automated decision-making, including the possibility of human intervention so that a data subject can obtain a review of the processing of his or her data, express his or her point of view, obtain an explanation of the decision taken and challenge that decision.
Evaluate the system on a regular basis to verify ongoing GDPR compliance.
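Consider using synthetic, artificially generated data rather than data collected from real individuals. Synthetic data is designed to reproduce the statistical characteristics of real data without containing records of identifiable persons, although the risk of re-identification should still be assessed when it is derived from real data.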

Sources: Information Commissioner’s Office (ICO), Guidance on AI and Data Protection, 15 March 2023 / CNIL, AI: how to comply with the GDPR?, 05 April 2022

In conclusion, artificial intelligence has immense potential to shape our future, but it also raises significant concerns about the protection of personal data. As AI-supported technologies continue to develop rapidly, it is crucial to strike a balance between the application and use of artificial intelligence, on the one hand, and the need to comply with rules, guidelines and laws to ensure privacy for individuals, on the other.