Written by Ken Cox, expert on data privacy and President of Custom Private Cloud Hostirian
Since its release to the public, ChatGPT has been one of the most intriguing pieces of software on the market. As one of the first, and certainly the most visible, publicly available applications of artificial intelligence, it quickly became the talk of the town.
The town, in this case, was the whole world. ChatGPT was adopted quickly, breaking the record for the fastest run to a million users, which it reached within five days. Instagram held the previous record, taking two and a half months to go from zero to a million users.
The talk varied. People didn’t know, and many still don’t, what to make of it or what effects it would have on the world. It certainly has the potential to disrupt industries, leave some people without jobs, help others keep theirs, and streamline processes.
At the same time, no one could guarantee that it wouldn’t turn into the herald of an era where fake and inaccurate news abounds, and people have outsourced their ability to think critically to a language model that sees no issues with making stuff up.
We can confidently say that ChatGPT is a generative artificial intelligence tool that relies on input from its users to perform some of its functions. So while the jury’s out on whether it’s the best thing since bread came sliced or a baby Skynet that will be the end of us, we might as well deal with the problem at hand: ChatGPT’s use of data.
As the president of a leading data privacy company, Hostirian, I am no stranger to the skepticism surrounding artificial intelligence. When talking about ChatGPT and data, it’s essential to distinguish between the data OpenAI initially used to train its model and the data its users have supplied since the tool went public.
The former is a can of worms that put OpenAI and Meta in the headlines after authors sued the companies for copyright infringement, alleging their works were used without proper authorization. Whether or not the plaintiffs have a case is still to be determined. You only need to google Silverman v. OpenAI to keep up.
The latter concerns what happens to all the data users give ChatGPT, plus the data it collects to make its service available. That includes the data people hand OpenAI when creating their accounts, as well as everything they feed ChatGPT as raw material for content creation or error checking.
Finally, there’s the data ChatGPT collects automatically: usage data, cookie data, and browser data. Users should assume that ChatGPT saves every bit of information they share and that each type of information is put to its corresponding use.
So a card number is used to keep a user’s subscription active, not to train OpenAI’s model further. Conversely, a user’s conversations with ChatGPT can’t be used to bill for the subscription, but they can be added to the training corpus. And even if all the information is used perfectly appropriately, the fact that it’s stored at all leaves a couple of problems open for ChatGPT and its users.
One such problem became evident on March 20, 2023, when ChatGPT had to be pulled offline to fix a bug that allowed some users to see titles from other active users’ chat history. Upon further investigation, OpenAI determined that 1.2% of ChatGPT Plus subscribers may have had their payment-related information exposed. The information at risk included active users’ first and last names, email addresses, payment addresses, credit card types, credit card expiration dates, and the last four digits of credit card numbers.
Privacy breaches like this are only one type of security challenge that arises when companies gather sensitive information. Companies holding that kind of data can routinely profile their users, all in the name of personalization and improving the user experience, and those same profiles can then be used for targeted messaging or other unwanted purposes.
Companies also regularly share data with other platforms or third parties, which multiplies the situations where data could be misused or mishandled. These third parties might have different privacy standards, use data in ways the original provider did not anticipate or intend, or store it less securely.
Still, even though all of these risks are undeniable, people are not likely to stop using ChatGPT over them. That genie’s out of the bottle, and we can’t put it back.
However, we can ensure that ChatGPT is the safest data environment it can be, and every interested party can contribute to that effort. Users can, for starters, read the Privacy Policy and learn which data is collected, how it’s used, and whether and with whom it’s shared. Then they can make an informed decision about the information they feel comfortable sharing with ChatGPT.
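As a purely illustrative sketch, and not anything OpenAI provides, here is one way a cautious user or team might scrub obvious identifiers from a prompt before it ever reaches ChatGPT. The patterns and the redact helper are hypothetical and deliberately simple; real PII detection takes far more care.

```python
import re

# Hypothetical, deliberately simple patterns; real PII detection needs far more care.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "CARD": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
    "PHONE": re.compile(r"\b\d{3}[ -]?\d{3}[ -]?\d{4}\b"),
}

def redact(text: str) -> str:
    """Replace anything matching a known pattern with a labeled placeholder."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label} REDACTED]", text)
    return text

prompt = "My card is 4111 1111 1111 1111 and my email is jane@example.com"
print(redact(prompt))
# -> My card is [CARD REDACTED] and my email is [EMAIL REDACTED]
```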
For its part, OpenAI needs to be transparent about its data collection and sharing practices. It also needs to take proactive measures to enhance safety and security and be transparent about any breaches.
OpenAI has already taken steps to give users more control over which data goes into the training corpus. Users can now opt their conversations out of training and model improvement; opted-out conversations are retained for 30 days and then permanently deleted. The company also said it was looking into developing another type of subscription that would give professionals and enterprises even more control over their data.
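For illustration only, here is a minimal sketch, in the same hypothetical vein as the example above, of how a service could enforce a retention rule like the one just described: opted-out conversations are excluded from training and permanently purged once they are more than 30 days old. None of these names or structures come from OpenAI.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=30)  # assumed retention window from the policy above

@dataclass
class Conversation:
    id: str
    created_at: datetime
    opted_out: bool  # user chose to exclude this chat from training

def eligible_for_training(conversations: list[Conversation]) -> list[Conversation]:
    """Only conversations the user has NOT opted out may join the training corpus."""
    return [c for c in conversations if not c.opted_out]

def purge_expired(conversations: list[Conversation]) -> list[Conversation]:
    """Permanently drop opted-out conversations older than the retention window."""
    now = datetime.now(timezone.utc)
    return [
        c for c in conversations
        if not (c.opted_out and now - c.created_at > RETENTION)
    ]
```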
It’s also essential to note that none of these changes might have happened without pressure from regulators. Ideally, the AI industry would regulate itself, but that’s unlikely to happen. Government regulators might not be the heroes we deserve, but they’ve still got the job. So far, we’ve seen regulators on both sides of the Atlantic take an interest in OpenAI and act on it, and that’s a good thing for data protection and privacy.
Innovation always brings a mix of opportunities and risks. As AI and machine learning technologies become more integrated into our lives, how we handle personal data and privacy will invariably evolve. Ultimately, the goal should be to leverage the benefits of technologies like ChatGPT while safeguarding individual privacy and ensuring a secure digital environment.