In some respects, generative AI can be too effective at certain tasks. Synthetic data generation is one such function: it produces data that closely resembles real data, raising concerns about the potential for re-identification. The risk is heightened by a lack of testing and by working with very large datasets, and because the process is often automated with minimal human governance, it can amount to "black box" processing.
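To make the re-identification concern concrete, one basic test is to measure how often a synthetic record's combination of quasi-identifiers also appears in the real data. The following is a minimal sketch, assuming records are plain dictionaries; the field names and sample values are hypothetical, and a production check would also cover near-duplicates, not just exact matches.

```python
# Minimal re-identification check for synthetic data (illustrative only).
# Records are dicts of quasi-identifiers; field names are hypothetical.

def exact_match_rate(real_records, synthetic_records, quasi_identifiers):
    """Fraction of synthetic records whose quasi-identifier combination
    also appears in the real data -- a basic linkage-risk signal."""
    real_keys = {
        tuple(rec[q] for q in quasi_identifiers) for rec in real_records
    }
    hits = sum(
        1 for rec in synthetic_records
        if tuple(rec[q] for q in quasi_identifiers) in real_keys
    )
    return hits / len(synthetic_records)

real = [
    {"age": 34, "postcode": "SW1A", "diagnosis": "asthma"},
    {"age": 52, "postcode": "M1", "diagnosis": "diabetes"},
]
synthetic = [
    {"age": 34, "postcode": "SW1A", "diagnosis": "asthma"},  # leaked combination
    {"age": 41, "postcode": "LS2", "diagnosis": "asthma"},
]

rate = exact_match_rate(real, synthetic, ["age", "postcode"])
print(f"Quasi-identifier match rate: {rate:.0%}")  # 50% here -- review before release
```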
Chatbots are another area of concern: when interacting with customers, they may collect large amounts of personal data, IP addresses and cookies. The information collected risks breaching GDPR requirements on storing and transmitting sensitive PII and special category data, particularly with public chatbots, over which organizations have less visibility.
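One partial mitigation is to redact obvious PII from chat transcripts before they are stored or transmitted. Below is a minimal sketch, assuming simple regular expressions suffice for illustration; the two patterns shown (email, IPv4 address) are examples, and a real deployment would need far broader coverage and is not, on its own, a GDPR compliance control.

```python
import re

# Illustrative PII redaction for chat transcripts before logging.
# Pattern set is deliberately small; real systems need much more.
PII_PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "IPV4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}

def redact(text: str) -> str:
    """Replace matched PII with a labelled placeholder."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[REDACTED_{label}]", text)
    return text

message = "Hi, I'm john.doe@example.com connecting from 192.168.1.10"
print(redact(message))
# -> Hi, I'm [REDACTED_EMAIL] connecting from [REDACTED_IPV4]
```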
GDPR rights: Fulfilling GDPR data subject requests such as erasure of personal data is relatively straightforward for organizational databases. It is far more difficult to delete data from a trained machine learning model, and doing so may undermine the utility of the model itself. In addition, generative AI environments may involve many data controllers for the same personal data, so ownership and responsibilities may be blurred when executing data subject requests.
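The asymmetry is easy to see in code. In the hedged sketch below, removing a subject's rows from a relational store is a single statement, while the trained model can only be flagged for retraining, since removing a subject's influence from model weights ("machine unlearning") has no equally simple step. The table and column names are hypothetical.

```python
import sqlite3

def handle_erasure_request(conn: sqlite3.Connection, subject_id: str) -> None:
    # Straightforward part: remove the subject's rows from the database.
    conn.execute("DELETE FROM customer_data WHERE subject_id = ?", (subject_id,))
    # Hard part: any model trained on this data still encodes it.
    # All we can do here is record that retraining is required.
    conn.execute(
        "INSERT INTO retraining_queue (subject_id, reason) VALUES (?, ?)",
        (subject_id, "GDPR Art. 17 erasure"),
    )
    conn.commit()

# Usage with an in-memory database for demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer_data (subject_id TEXT, payload TEXT)")
conn.execute("CREATE TABLE retraining_queue (subject_id TEXT, reason TEXT)")
conn.execute("INSERT INTO customer_data VALUES ('user-42', 'profile...')")
handle_erasure_request(conn, "user-42")
print(conn.execute("SELECT COUNT(*) FROM customer_data").fetchone()[0])  # 0
```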
Data Bias: Generative AI models learn patterns and information from the data they are trained on. If the training data contains biases or prejudices, these may be reflected in the generated output. For example, if the training data is skewed towards certain demographics or perspectives, the generated content may reinforce those biases. Biases in training data should therefore be identified and mitigated to ensure fair outcomes.
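A simple first step toward identifying such skew is to compare each group's share of the training data against an expected baseline before training. The sketch below assumes records carry a demographic attribute; the attribute values, expected shares and the 10% tolerance threshold are hypothetical choices for the example.

```python
from collections import Counter

def representation_gap(records, attribute, expected_share):
    """Return each group's deviation from its expected share of the data."""
    counts = Counter(rec[attribute] for rec in records)
    total = sum(counts.values())
    return {
        group: counts.get(group, 0) / total - share
        for group, share in expected_share.items()
    }

# Hypothetical, deliberately skewed training set: 200 F vs 800 M records.
training_data = [{"gender": "F"}] * 200 + [{"gender": "M"}] * 800
gaps = representation_gap(training_data, "gender", {"F": 0.5, "M": 0.5})
for group, gap in gaps.items():
    flag = "SKEWED" if abs(gap) > 0.10 else "ok"
    print(f"{group}: {gap:+.0%} vs expected ({flag})")
```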
Organizations such as Clearview AI have been fined by regulatory authorities for collecting millions of images and other personal data from online public sources such as Facebook for use in their AI-powered identification products. Generative AI heightens the risk that businesses process and share personal information without the knowledge or consent of data subjects.
Lack of testing: High-risk applications are normally subject to privacy impact assessments, yet companies frequently overlook this step when considering open-source AI, which can be a significant risk. Although AI environments are more complex, companies still need to decide the scope of their assessment activities and justify it. Institutions such as the ICO in the UK have issued guidance on AI assessments.