In some respects, generative AI can be too effective at certain tasks. Synthetic data generation is one such function: it produces data that closely resembles real data, raising concerns about the potential for re-identification. The risk is heightened by a lack of testing and by working with very large datasets, and because the process is often automated with minimal human governance, it can amount to "black box" processing.
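To make the re-identification concern concrete, one basic test is to measure how often a synthetic record's combination of quasi-identifiers also appears in the real data. The following is a minimal sketch, assuming records are plain dictionaries; the field names and sample values are hypothetical, and a production check would also cover near-duplicates, not just exact matches.

```python
# Minimal re-identification check for synthetic data (illustrative only).
# Records are dicts of quasi-identifiers; field names are hypothetical.

def exact_match_rate(real_records, synthetic_records, quasi_identifiers):
    """Fraction of synthetic records whose quasi-identifier combination
    also appears in the real data -- a basic linkage-risk signal."""
    real_keys = {
        tuple(rec[q] for q in quasi_identifiers) for rec in real_records
    }
    hits = sum(
        1 for rec in synthetic_records
        if tuple(rec[q] for q in quasi_identifiers) in real_keys
    )
    return hits / len(synthetic_records)

real = [
    {"age": 34, "postcode": "SW1A", "diagnosis": "asthma"},
    {"age": 52, "postcode": "M1", "diagnosis": "diabetes"},
]
synthetic = [
    {"age": 34, "postcode": "SW1A", "diagnosis": "asthma"},  # leaked combination
    {"age": 41, "postcode": "LS2", "diagnosis": "asthma"},
]

rate = exact_match_rate(real, synthetic, ["age", "postcode"])
print(f"Quasi-identifier match rate: {rate:.0%}")  # 50% here -- review before release
```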
Chatbots are another area of concern: when interacting with customers, they may collect large amounts of personal data, IP addresses and cookies. The information collected risks breaching GDPR requirements on storing and transmitting sensitive PII and special category data, particularly with public chatbots, over which organizations have less visibility.
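One partial mitigation is to redact obvious PII from chat transcripts before they are stored or transmitted. Below is a minimal sketch, assuming simple regular expressions suffice for illustration; the two patterns shown (email, IPv4 address) are examples, and a real deployment would need far broader coverage and is not, on its own, a GDPR compliance control.

```python
import re

# Illustrative PII redaction for chat transcripts before logging.
# Pattern set is deliberately small; real systems need much more.
PII_PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "IPV4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}

def redact(text: str) -> str:
    """Replace matched PII with a labelled placeholder."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[REDACTED_{label}]", text)
    return text

message = "Hi, I'm john.doe@example.com connecting from 192.168.1.10"
print(redact(message))
# -> Hi, I'm [REDACTED_EMAIL] connecting from [REDACTED_IPV4]
```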
GDPR rights: Fulfilling GDPR data subject requests such as erasure of personal data is relatively straightforward for organizational databases. It is far more difficult to delete data from a trained machine learning model, and doing so may undermine the utility of the model itself. In addition, generative AI environments may involve many data controllers for the same personal data, so ownership and responsibilities may be blurred when executing data subject requests.
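The asymmetry is easy to see in code. In the hedged sketch below, removing a subject's rows from a relational store is a single statement, while the trained model can only be flagged for retraining, since removing a subject's influence from model weights ("machine unlearning") has no equally simple step. The table and column names are hypothetical.

```python
import sqlite3

def handle_erasure_request(conn: sqlite3.Connection, subject_id: str) -> None:
    # Straightforward part: remove the subject's rows from the database.
    conn.execute("DELETE FROM customer_data WHERE subject_id = ?", (subject_id,))
    # Hard part: any model trained on this data still encodes it.
    # All we can do here is record that retraining is required.
    conn.execute(
        "INSERT INTO retraining_queue (subject_id, reason) VALUES (?, ?)",
        (subject_id, "GDPR Art. 17 erasure"),
    )
    conn.commit()

# Usage with an in-memory database for demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer_data (subject_id TEXT, payload TEXT)")
conn.execute("CREATE TABLE retraining_queue (subject_id TEXT, reason TEXT)")
conn.execute("INSERT INTO customer_data VALUES ('user-42', 'profile...')")
handle_erasure_request(conn, "user-42")
print(conn.execute("SELECT COUNT(*) FROM customer_data").fetchone()[0])  # 0
```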
Data Bias: Generative AI models learn patterns and information from the data they are trained on. If the training data contains biases or prejudices, these may be reflected in the generated output. For example, if the training data is skewed towards certain demographics or perspectives, the generated content may reinforce those biases. Biases in training data should therefore be identified and mitigated to ensure fair outcomes.
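A simple first step toward identifying such skew is to compare each group's share of the training data against an expected baseline before training. The sketch below assumes records carry a demographic attribute; the attribute values, expected shares and the 10% tolerance threshold are hypothetical choices for the example.

```python
from collections import Counter

def representation_gap(records, attribute, expected_share):
    """Return each group's deviation from its expected share of the data."""
    counts = Counter(rec[attribute] for rec in records)
    total = sum(counts.values())
    return {
        group: counts.get(group, 0) / total - share
        for group, share in expected_share.items()
    }

# Hypothetical, deliberately skewed training set: 200 F vs 800 M records.
training_data = [{"gender": "F"}] * 200 + [{"gender": "M"}] * 800
gaps = representation_gap(training_data, "gender", {"F": 0.5, "M": 0.5})
for group, gap in gaps.items():
    flag = "SKEWED" if abs(gap) > 0.10 else "ok"
    print(f"{group}: {gap:+.0%} vs expected ({flag})")
```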
Organizations such as Clearview AI have been fined by regulatory authorities for collecting millions of images and other personal data from online public sources such as Facebook for use in their AI-powered identification products. Generative AI heightens the risk that businesses process and share personal information without the knowledge or consent of data subjects.
Lack of testing: High-risk applications are normally subject to privacy impact assessments, yet companies frequently overlook this step when considering open-source AI, which can be a significant risk. Although AI environments are more complex, companies still need to decide the scope of their assessment activities and justify it. Institutions such as the ICO in the UK have issued guidance on AI assessments.