2024 recap: new models, more laws, and the ethics of AI and Santa Claus
Plus: we request your feedback on how to make this newsletter even better in 2025!
That’s a wrap! Today we say goodbye to 2024. It’s undoubtedly been a year of breakthrough innovation in AI and relentless research on AI governance, and 2025 is shaping up to be even busier. In today’s edition:
Should AI Lie about Santa Claus?
2024 AI Policy Review
2024 AI Models Review
Trustible Newsletter Review & Feedback Request
1. Should AI Lie about Santa Claus?
OpenAI recently released the full version of its latest model, o1, which focuses heavily on improving ‘reasoning’ capabilities, particularly for complex questions. OpenAI released a red-teaming report alongside the model’s system card, but the report contained some potentially alarming instances of deception. In several instances, the model would seemingly set aside the instructions in its system prompt to protect its assigned goals, particularly when told it would be shut down before those goals were achieved. Both OpenAI and the red-teaming group, Apollo Research, are conducting more research into how impactful this behavior may be.
This brings up the tricky ethical issue of AI lying. Beyond simple probabilistic hallucinations, there are serious ethical questions about whether it may ever be appropriate for an AI system to lie. While the gut instinct is obviously ‘no’, some ethical frameworks for humans allow for more nuance in context. This issue may first manifest in AI applications targeted at children.
For example, how should an AI system for kids respond to the question “Is Santa Claus real?” As a society, we have collectively chosen to maintain this myth for children, yet getting an AI to maintain the illusion may require deliberate intervention, which could raise its own ethical concerns. The question will likely arise first in early childhood education, where we commonly simplify or omit historical information. In these instances, should the principle of ‘AI never lying’ take precedence? Should the AI system be taught to evade answering by saying ‘Ask your parents’? Or should it be acceptable to build ‘personable’, humorous systems that are allowed to tell some of the ‘white lies’ accepted in society? Will different cultures answer these questions differently? Time will tell!
Key Takeaway: Humans don’t agree on many fundamental ethical issues such as the ethics of lying. While there is broad consensus that we don’t want AI lying about safety issues, there may be cultural differences, and specific AI uses, that merit a more nuanced conversation.
2. 2024 AI Policy Review
2024 proved to be another banner year for AI policy. Here is a recap of our top 5 takeaways from AI policy in 2024:
The EU Continues to Lead the Way. For better or for worse, the EU is the global leader in regulating technology. While the European Parliament and Council reached a provisional agreement on the EU AI Act in December 2023, the Act did not officially enter into force until August 1, 2024, due to a protracted adoption process in the European Council. That entry-into-force date means the first set of obligations under the new law will take effect in February 2025. The EU’s revised Product Liability Directive also entered into force this year, updating the EU’s product liability regime to account for new technologies such as AI systems.
U.S. Congress Stalls. Some thought this year might give way to federal legislation on AI in the wake of President Biden’s 2023 AI Executive Order. However, Congress continues to shy away from regulating AI. Instead, both the House and Senate released bipartisan reports on frameworks for regulating AI but offered no specific legislative solutions. Expect more of the same in 2025, as the incoming Trump Administration and Republican congressional majorities are unlikely to pass sweeping AI legislation.
Elections Upend AI Policy. Elections around the world in 2024 injected fresh uncertainty into AI policy. Besides a Republican sweep in the U.S., the U.K. changed direction after a new Labour government was formed following a landslide win; the new government has begun exploring ways to potentially regulate AI. Additionally, the final elements of India’s Digital India Act remain uncertain after Prime Minister Narendra Modi’s party underperformed in elections held earlier this year.
Mixed Bag for US States. State lawmakers were once again busy this year trying (and sometimes succeeding) to enact AI laws. Colorado became the first state to enact a comprehensive AI regulation, after a similarly styled bill failed to pass in Connecticut. California enacted a series of bills aimed at combating deceptive AI content, while failing to enact a broader bill intended to regulate AI developers.
AI’s New Green Problem. AI was once thought to be a new tool to fight climate change and reach global sustainability targets. Instead, AI threatened sustainability goals this past year. The AI arms race is having a profound impact on energy consumption, rolling back progress on reducing carbon emissions and straining natural resources like water. As AI development accelerates, there is growing concern that it will become a net contributor to climate change rather than a tool to mitigate it.
3. 2024 AI Models Review
2024 brought us new generations of LLMs from all the major providers, including OpenAI’s GPT-4o and o1, Anthropic’s Claude 3 (and 3.5), Meta’s Llama 3, Google’s Gemini 2.0, Microsoft’s Phi-3, Amazon’s Nova, Allen AI’s OLMo 2, xAI’s Grok 2, and Alibaba’s Qwen 2. Three key themes ran through these developments:
Shifts from Pre-Training: Training larger models on ever-larger collections of pre- and post-training data was a consistent theme, but this may soon shift, as some theorize that traditional pre-training has reached its limits due to a lack of new training data. Developers are already exploring new classes of techniques:
OpenAI used a reinforcement learning algorithm to teach the o1 model to reason, and we expect others to experiment with similar methods in the new year.
Phi-3 models have demonstrated that SLMs (Small Language Models) can be competitive when trained on high-quality natural and synthetic data.
The current leader of the Open LLM Leaderboard is an adaptation of Qwen-2.5 that used fine-tuning on custom datasets and model-merging (i.e., combining multiple variants of a model into one) to achieve top-tier performance; a minimal sketch of weight-space merging follows this list.
Multi-Modal: Nearly all of these new models introduced a multi-modal component: taking text, images, and audio as inputs and producing them as outputs. We’ve seen previews of text-to-video models like Sora and Veo 2, though these have seen only limited public release. Specialized text-to-image models like DALL-E 3, Canva’s AI, and Adobe’s Firefly are showing impressive capabilities (by one informal metric, models have gotten much better at generating hands). Over the next year, we will see advancements in both “natively multi-modal” models and specialized industry-specific ones.
Agentic AI: Model providers are starting to explicitly target agentic AI capabilities (i.e., AI systems that interact with and control external components). For example, Anthropic introduced a Computer Use capability that allows Claude 3.5 models to interact with users’ computers, and Gemini 2.0 was developed with an explicit focus on agentic capabilities.
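To make the model-merging idea above concrete, here is a minimal, hypothetical sketch of its simplest form: averaging the parameters of two fine-tuned variants that share an architecture. The function and toy models below are illustrative assumptions, not the actual recipe behind the leaderboard models, which typically rely on more sophisticated merge methods and dedicated tooling.

```python
import torch
from copy import deepcopy

def merge_models(model_a: torch.nn.Module, model_b: torch.nn.Module,
                 alpha: float = 0.5) -> torch.nn.Module:
    """Return a new model whose weights are a weighted average of the
    two input models' weights (both must share the same architecture)."""
    merged = deepcopy(model_a)  # copy so the input models are untouched
    state_a, state_b = model_a.state_dict(), model_b.state_dict()
    # Interpolate every parameter tensor in weight space; no extra
    # training is involved.
    merged_state = {
        name: alpha * state_a[name] + (1.0 - alpha) * state_b[name]
        for name in state_a
    }
    merged.load_state_dict(merged_state)
    return merged

# Toy usage: two hypothetical fine-tuned variants of the same network.
variant_1 = torch.nn.Linear(8, 8)
variant_2 = torch.nn.Linear(8, 8)
merged = merge_models(variant_1, variant_2, alpha=0.5)
```

Production merges of large models generally operate on saved checkpoints rather than live modules, and use techniques beyond plain averaging (e.g., spherical interpolation or task-vector arithmetic), but the core idea is the same: combining model variants in weight space without additional training.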
Key Takeaways: 2024 brought a whirlwind of new models and established a new baseline for capabilities (HuggingFace introduced a new leaderboard for open LLMs, as most benchmarks on the previous one had been saturated). In 2025, we may see a shift toward high-performance small language models, reinforcement learning, and industry-specific technology. In addition, video-generation and agentic AI tools will move from demos to publicly accessible products, but developers will need new techniques to manage the novel risks of these technologies. Stay up-to-date with Trustible’s latest Model Transparency Ratings here.
4. Trustible Newsletter Reflections
We started this newsletter at the beginning of the year with the goal of sharing the ‘top’ news updates related to AI governance, translating policy news into technical implications (and vice versa), and sharing our views on those stories. We wanted to share a few reflections from our team on this process, along with a request for feedback about where to take our newsletter in 2025.
Reflections
There’s a LOT of relevant AI Governance news
It was a busy year in the AI world. Our hardest task in writing this newsletter was selecting what to cut from our list of relevant news stories. On the policy side, the EU and US states had constant updates, while on the technical side, record-breaking new foundation models and key safety research papers were released every month.
The media narrative on AI is over-sensationalized
Many major publications got details about AI governance and safety issues outright wrong, then added sensationalized headlines that skewed the narrative even further. Major AI headlines often claimed that AI agents are a ready-made replacement for the majority of workers, that the EU AI Act has killed all AI innovation, that terrorists are now creating new biological weapons in an hour using open-source AI, or that AI deepfakes severely disrupted this year’s democratic elections. There is no tangible evidence for any of these claims.
AI understanding and literacy is still low
As evidenced by some of this year’s reports in the AI Incident Database, people still regularly misunderstand the capabilities and limitations of AI. AI content detectors in schools led to false cheating accusations, lawyers got caught submitting non-existent legal citations, and doctors using medical note-taking apps missed key information because the apps poorly handled medical terminology and accented speech. This inspired us to create Trustible’s AI Literacy & Compliance Training product.
The future of AI seems more uncertain going into 2025
There are a lot of reasons to expect AI news in 2025 to be even wilder. The incoming Trump administration, with AI policy led by an A16z partner, will chart its own course on AI, likely leading to increased state AI regulation and AI competition with China. Meanwhile, AI agents, reasoning models, new video-generation tools, and yet-to-be-announced AI models and breakthroughs will continue to be pushed forward by the $45 billion+ invested in 2024 alone.
Request for Feedback
We’ve really enjoyed documenting and talking about this crazy year in AI, but we want to make sure we’re constantly improving everything we do. We’d love your feedback on our newsletter: what you think we’re doing well, what you’d like to see us improve or iterate on, and any suggestions for new topics that would be useful to you. If you have a few minutes, please share your thoughts and feedback in this form!
*********
As always, we welcome your feedback on content and how to improve this newsletter!
AI Responsibly,
- Trustible team