In the first article of this two-part analysis, we looked at who owns the code created by AI chatbots like ChatGPT and explored the legal implications of using AI-generated code.
Part I: Who owns the code? If ChatGPT’s AI helps write your app, does it still belong to you?
Now, we’ll discuss issues of liability and exposure.
Functional liability
To frame this discussion, I turn to attorney and long-time Internet Press Guild member Richard Santalesa. With his tech journalism background, Santalesa understands this stuff from both a legal and a tech perspective. (He’s a founding member of the SmartEdgeLaw Group.)
“Until cases grind through the courts to definitively answer this question, the legal implications of AI-generated code are the same as with human-created code,” he advises.
Keep in mind, he continues, that code generated by humans is far from error-free. There will never be a service level agreement warranting that code is perfect or that users will have uninterrupted use of the services.
Santalesa also points out that it’s rare for every part of a software product to be entirely home-grown. “Most coders use SDKs and code libraries that they have not personally vetted or analyzed, but rely upon nonetheless,” he says. “I think AI-generated code, for the time being, will be in the same bucket as to legal implications.”
Send in the trolls
Sean O’Brien, a lecturer in cybersecurity at Yale Law School and founder of the Yale Privacy Lab, points out a risk for developers that’s undeniably worrisome:
“The chances that AI prompts might output proprietary code are very high, if we’re talking about tools such as ChatGPT and Copilot, which have been trained on a massive trove of code of both the open source and proprietary variety.
“We don’t know exactly what code was used to train the chatbots. This means we don’t know if segments of code output from ChatGPT and other similar tools are generated by the AI or merely echoed from code it ingested as part of the training process.”
If you’re a developer, it’s time to brace yourself. Here’s O’Brien’s prediction:
“I believe there will soon be an entire sub-industry of trolling that mirrors patent trolls, but this time surrounding AI-generated works. As more authors use AI-powered tools to ship code under proprietary licenses, a feedback loop is created. There will be software ecosystems polluted with proprietary code that are the subject of cease-and-desist claims by enterprising firms.”
As soon as O’Brien mentioned the troll factor, the hairs on the back of my neck stood up. This is going to get very, very messy.
Attorney Robert Piasentin, a partner in the technology group at Canadian business law firm McMillan LLP, also points out that chatbots could have been trained on open-source work and legitimate sources, alongside copyrighted work. All of that training data might include flawed or biased data (or algorithms) as well as corporate proprietary data.
Piasentin explains: “If the AI draws on incorrect, deficient or biased information, the output of the AI tool may give rise to various potential claims, depending on the nature of the potential damage or harm that the output may have caused (whether directly or indirectly).”
Here’s another thought: Some will attempt to corrupt the training corpora (the sources of knowledge that AIs use to provide their results). One of the things humans do is find ways to game the system. So not only will there be armies of legal trolls trying to find folks to sue, but there will be hackers, criminals, rogue nation states, high school students, and crackpots — all attempting to feed erroneous data into every AI they can find, either for the lulz or for much more nefarious reasons.
Perhaps we shouldn’t dwell too much on the dark side.
Who is at fault?
None of the lawyers, though, discussed who is at fault if the code generated by an AI results in some catastrophic outcome.
For example: The company delivering a product shares some responsibility for, say, choosing a library that has known deficiencies. If a product ships using a library that has known exploits and that product causes an incident that results in tangible harm, who owns that failure? The product maker, the library coder, or the company that chose the product?
Usually, it’s all three.
Now add AI code into the mix. Clearly, most of the responsibility falls on the shoulders of the coder who chooses to use code generated by an AI. After all, it’s common knowledge that the code may not work and needs to be thoroughly tested.
In a comprehensive lawsuit, will claimants also go after the companies that produce the AIs and even the organizations from which content was taken to train those AIs (even if done without permission)?
As every attorney has told me, there is very little case law thus far. We won’t really know the answers until something goes wrong, parties wind up in court, and it’s adjudicated thoroughly.
We’re in uncharted waters here. My best advice, for now, is to test your code thoroughly. Test, test, and then test some more.
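To make "test, test, and then test some more" concrete, here is a minimal sketch of what that might look like in Python. The `slugify` function is a hypothetical stand-in for something an AI tool might hand you; it is not taken from any real chatbot output, and the test cases are only illustrative. The idea is to treat the suggestion like any unvetted third-party code and probe the edge cases before shipping it.

```python
import re
import unittest


def slugify(title: str) -> str:
    """Hypothetical helper standing in for code suggested by an AI tool."""
    # Lowercase, then collapse each run of non-alphanumeric characters into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower())
    # Trim any hyphens left over at either end.
    return slug.strip("-")


class SlugifyTests(unittest.TestCase):
    """Edge cases a reviewer might check before trusting the suggestion."""

    def test_typical_title(self):
        self.assertEqual(slugify("Who Owns the Code?"), "who-owns-the-code")

    def test_empty_string(self):
        self.assertEqual(slugify(""), "")

    def test_only_punctuation(self):
        self.assertEqual(slugify("?!..."), "")

    def test_surrounding_whitespace(self):
        self.assertEqual(slugify("  AI  code  "), "ai-code")
```

Saved to a file, the tests can be run with `python -m unittest` followed by the file name. Passing tests won't settle any of the legal questions above, but they do document that you exercised reasonable care over code you didn't write yourself.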