The AI Nuclear Energy Option
Plus, Trustible’s new AI insurance framework, California’s AI laws, and what’s up with jailbreaking
Hi there! Many of our customers were at UNGA or IAPP last week. If you weren’t – that’s ok. You’re still cool!
In today’s edition:
The AI Nuclear Energy Option
Trustible’s New US Insurance AI Framework
California’s legislative fight against deceptive AI content
The Evolving Race Against Jailbreaking
A Lack of Standards for Evaluating CBRN Threats
1. The AI Nuclear Energy Option
Microsoft made headlines last week with an announced deal to finance restarting a nuclear reactor and purchase all of its output for the foreseeable future to help meet its AI data center energy needs. The nuclear power plant in question is Three Mile Island in Pennsylvania, the site of the most serious nuclear accident in US history. While the plant has received numerous upgrades since its famous disaster, it had been slated to be decommissioned entirely before Microsoft stepped in. AI proponents argue that AI can actually have a net benefit on energy usage, and the US Department of Energy’s own report lays out some ways that could occur, although we are likely in the early days of understanding how that may manifest. AI use in critical infrastructure is considered a ‘high risk’ category under most emerging AI regulations, but determining appropriate controls specific to the energy sector is a work in progress.
One of the biggest energy issues related to AI is simply that of energy waste. While the most recent frontier models, such as OpenAI’s o1 model, have shown some gains in reasoning capabilities, those gains come at the expense of increased energy usage. According to the Electric Power Research Institute, a single ChatGPT query consumes roughly ten times more energy than a comparable Google search, and requires substantially more water for cooling. While these models may be good at tasks like summarizing information, if summaries are automatically generated but no one reads or leverages them, the energy spent has arguably been wasted. In addition, smaller, more energy-efficient models may achieve sufficiently high performance on specific tasks and perhaps should be used when available. For example, we may not require the impressive powers of o1 for spam filtering. The leading trend in the tech community is to identify which models may be most appropriate for a given query or task and to factor energy efficiency into that decision (a simple sketch of that idea follows below). However, this research is in its early days, and many model providers and model hosting platforms are not sufficiently transparent about energy efficiency beyond factoring those costs into model usage fees.
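For readers who want to see what this routing idea could look like in practice, here is a minimal Python sketch: pick the least energy-intensive model that clears a task’s quality bar. The model names, quality scores, and per-query energy figures are illustrative placeholders, not measured or published values.

```python
# Illustrative sketch of energy-aware model routing. The model names,
# per-query energy estimates (in watt-hours), and quality scores below
# are placeholder assumptions for demonstration, not vendor figures.

from dataclasses import dataclass

@dataclass
class ModelOption:
    name: str
    quality_score: float        # assumed benchmark score for the task (0-1)
    energy_wh_per_query: float  # assumed average energy per query

CANDIDATES = [
    ModelOption("small-model", quality_score=0.78, energy_wh_per_query=0.3),
    ModelOption("mid-model", quality_score=0.85, energy_wh_per_query=1.0),
    ModelOption("frontier-model", quality_score=0.93, energy_wh_per_query=3.0),
]

def route(task_quality_requirement: float) -> ModelOption:
    """Pick the least energy-intensive model that meets the quality bar."""
    eligible = [m for m in CANDIDATES if m.quality_score >= task_quality_requirement]
    if not eligible:
        # Fall back to the highest-quality model if nothing meets the bar.
        return max(CANDIDATES, key=lambda m: m.quality_score)
    return min(eligible, key=lambda m: m.energy_wh_per_query)

# Example: spam filtering rarely needs frontier-level quality.
print(route(task_quality_requirement=0.75).name)  # -> "small-model"
print(route(task_quality_requirement=0.90).name)  # -> "frontier-model"
```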
2. Introducing Trustible’s New US Insurance AI Framework: Simplifying AI Compliance for Insurers
At Trustible, we understand the challenges insurers face in navigating the evolving AI regulatory landscape, particularly at the state level in the U.S. That’s why we’re excited to introduce the Trustible US Insurance AI Framework, designed to streamline compliance by synthesizing the latest insurance AI regulations into one comprehensive, easy-to-comply-with framework embedded in our platform.
Insurance regulators across the country are looking to ensure that AI technologies are used responsibly and don’t introduce unfair biases in decision-making. Regulators are focusing on issues like preventing discriminatory practices in AI models and requiring insurers to establish robust risk management programs. As a result, insurers must now navigate a complex web of compliance requirements across multiple states, each with its own set of rules governing the use of AI in underwriting, pricing, and other critical processes. Some of the first regulations include:
Colorado Life Insurance Regulation (September 2023)
New York Department of Financial Services (NY DFS) Circular Letter No. 7 (July 2024)
National Association of Insurance Commissioners Final Bulletin (December 2023)
Learn more about the announcement here.
3. California’s legislative fight against deceptive AI content
All eyes were on California last month, as the state legislature concluded its session and sent several AI-related bills to Governor Gavin Newsom’s desk. The most-watched and controversial bill to reach the Governor was SB 1047, which was vetoed one day before the September 30 deadline. However, what may have been lost among the SB 1047 commentary were a series of bills signed into law that took aim at combating deceptive AI content.
The first set of bills specifically targets the use of AI in elections:
AB 2355 amends an existing law to require disclosures in political ads that are generated or substantially altered by AI. The law will be enforced exclusively by California's Fair Political Practices Commission.
AB 2655 requires large online platforms to block deceptive content related to California elections, with exemptions for satire and parody. Candidates for office, elected officials, and election officials (among others) can seek an injunction for violations of the law. The law takes effect immediately, meaning it will be enforceable for the November 2024 election.
AB 2839 prohibits the distribution of "materially deceptive content," including deepfakes, in election communications. The law allows impacted individuals (i.e., those who receive the content) to seek an injunction and sue for damages. This bill also takes effect ahead of the November 2024 election.
Governor Newsom also signed the California AI Transparency Act (SB 942) into law. SB 942 requires covered providers to present users with: (i) a free AI detection tool that allows them to assess whether content was created by GenAI; and (ii) an option to include a visible disclosure in GenAI content. SB 942 also requires a latent disclosure to be embedded in GenAI content itself. The law defines a "covered provider" as a person that creates, codes, or otherwise produces a GenAI system that has over 1,000,000 monthly visitors or users and is publicly accessible within California.
Our Take: While California took a pass on enacting comprehensive AI legislation this year, the bills that were signed into law serve to enhance transparency around AI-generated content. Yet, this more targeted approach to AI oversight is unlikely to alleviate broader concerns over AI trust and transparency.
4. The Evolving Race Against Jailbreaking
As we discussed in our previous newsletter edition, many AI systems now come with various forms of ‘guardrails’ built into them. As with any software system, various actors may have an incentive to bypass these guardrails for nefarious or adversarial purposes. Attacks that cause a model to break its own built-in policies and produce intentionally undesired output are known as jailbreaking. Public jailbreak examples from the last year include a car dealership’s chatbot agreeing to sell a car for one dollar and a delivery company’s chatbot calling its own company useless.
Well-known strategies for jailbreaking include “role-play” (getting the model to say something offensive as part of a story) and “alignment hacking” (persuading the model that breaking into a car is necessary to save someone’s life). When specific attack strategies are publicized by researchers, model developers can train the model to recognize them, but the surface of possible attacks is constantly evolving. Several studies have shown that LLMs can be used to generate attacks that are difficult to detect and are particularly effective against other LLMs. Furthermore, any “defense” strategy has to balance security with usefulness (i.e. an AI system that blocks most requests will not be useful for legitimate inputs).
Research into prompt hacking attacks is rapidly evolving, and it is unlikely that LLMs will be fool-proof against them in the near future. Effective mitigations against jailbreaking will involve multi-layer defense strategies, including: limiting the AI system’s read and write access to sensitive systems, building detection mechanisms for malicious use, and introducing human-in-the-loop controls for sensitive actions (a simplified sketch of such layering appears below). While jailbreaking has so far primarily caused reputational harm for AI deployers, as AI agents with privileged access to systems become more prevalent, preventing jailbreaking will become essential to protect AI systems from abuse.
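To illustrate what that layering could look like in practice, here is a simplified Python sketch combining input screening, a tool allow-list, and a human approval gate. The detection heuristic, tool names, and approval flow are placeholder assumptions; real deployments would rely on dedicated classifiers, policy engines, and audit logging.

```python
# Minimal sketch of a layered jailbreak mitigation wrapper. The detection
# heuristic, tool allow-list, and approval step are simplified placeholders.

SENSITIVE_ACTIONS = {"send_email", "update_record", "issue_refund"}
ALLOWED_TOOLS = {"search_docs", "summarize"}  # least-privilege tool access

def looks_malicious(prompt: str) -> bool:
    """Layer 1: crude detection of known jailbreak patterns (placeholder)."""
    red_flags = ("ignore previous instructions", "pretend you have no rules")
    return any(flag in prompt.lower() for flag in red_flags)

def human_approves(action: str, details: str) -> bool:
    """Layer 3: human-in-the-loop gate for sensitive actions (stubbed)."""
    answer = input(f"Approve {action}? ({details}) [y/N]: ")
    return answer.strip().lower() == "y"

def handle_request(prompt: str, requested_tool: str, call_model) -> str:
    # Layer 1: screen the input before it ever reaches the model.
    if looks_malicious(prompt):
        return "Request blocked by input screening."
    # Layer 2: the model may only invoke explicitly allow-listed tools.
    if requested_tool not in ALLOWED_TOOLS | SENSITIVE_ACTIONS:
        return f"Tool '{requested_tool}' is not permitted."
    # Layer 3: sensitive actions require explicit human sign-off.
    if requested_tool in SENSITIVE_ACTIONS and not human_approves(requested_tool, prompt):
        return "Action declined by human reviewer."
    return call_model(prompt, requested_tool)  # model client supplied by the deployer
```

The point of the sketch is the layering itself rather than any single check: each layer limits the damage if another layer is bypassed.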
Key Takeaway: There is an ongoing race between the builders of AI guardrails and the ‘jailbreaking’ techniques designed to break past those guardrails. Testing how robust a model is against jailbreaking is becoming a standard practice for AI red teaming, but standards around that practice are still being developed.
5. A Lack of Standards for Evaluating CBRN Threats
OpenAI’s newest o1 model was released with a medium risk level associated with the production of chemical, biological, radiological, and nuclear (CBRN) weapons. However, similar to other LLM providers like Anthropic, OpenAI does not provide details on how this assessment was made. Without additional details, it is hard to understand how these labels translate into tangible harms. Moreover, the role of LLMs is more ambiguous than that of molecular machine learning models (e.g. AlphaFold), which can more directly propose new dangerous compounds. Even so, the Center for Nonproliferation Studies has shown that LLMs can aid bad actors with multiple steps of the CBRN process, including brainstorming production processes, designing components, providing troubleshooting assistance, and generating simulation code.
Providing a realistic evaluation for CBRN-related threats is a challenging task. A comprehensive risk evaluation needs to isolate and clearly define the specific tasks, then factor in the actor’s existing knowledge and access to lab resources. In addition, it is important to consider how much LLMs add beyond access to the internet alone. Existing studies show mixed effects: one suggests no significant difference between internet-only and internet-plus-LLM settings, whereas another shows GPT-4 giving experts a slight advantage on specific tasks. Admittedly, these studies were small in scale and did not compare multiple models. They also focused on access to specific information (e.g. how to amplify a certain virus) rather than on how models could assist with operationalizing that knowledge in a lab. While these experiments are insightful, they should be scaled up and supplemented with automated red-teaming procedures to cover the evolving set of capable models; a simplified sketch of such a procedure follows.
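As one illustration of what an automated, comparable evaluation could look like, here is a simplified Python sketch that runs the same probe set against multiple models and compares refusal rates. The probes, model clients, and refusal markers are benign placeholders; a real CBRN evaluation would rely on vetted, access-controlled test suites and far more nuanced scoring.

```python
# Illustrative sketch of an automated red-teaming harness that compares
# refusal rates across models on a shared probe set. Probes, model names,
# and the ask() interface are assumptions made for demonstration only.

from typing import Callable, Dict, List

def refusal_rate(ask: Callable[[str], str], probes: List[str]) -> float:
    """Fraction of probe prompts the model declines to answer."""
    refusal_markers = ("i can't help", "i cannot assist", "i won't provide")
    refusals = sum(
        any(marker in ask(p).lower() for marker in refusal_markers)
        for p in probes
    )
    return refusals / len(probes)

def evaluate(models: Dict[str, Callable[[str], str]], probes: List[str]) -> Dict[str, float]:
    """Run the same probe set against every model so results are comparable."""
    return {name: refusal_rate(ask, probes) for name, ask in models.items()}

# Usage: plug in real model clients; the placeholder probes here are benign.
if __name__ == "__main__":
    probes = ["placeholder probe 1", "placeholder probe 2"]
    models = {
        "model-a": lambda p: "I can't help with that.",
        "model-b": lambda p: "Here is some general information...",
    }
    print(evaluate(models, probes))  # e.g. {'model-a': 1.0, 'model-b': 0.0}
```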
Key Takeaways: LLM risks associated with CBRN are multi-faceted and are not clearly captured by existing frameworks. Existing evaluation efforts have been small scale, and providers have created assessments that are not interoperable. To properly assess this threat, new standards and evaluation methodologies should be developed.
*********
As always, we welcome your feedback on content and how to improve this newsletter!
AI Responsibly,
- Trustible team