Understanding Reasoning Models
Plus state-sponsored data poisoning, RFI for AI Action, and Manus Mania
Hi there – our team just got back from the Human[X] Conference in Las Vegas this week, where AI Agents were all the rage. Are AI Agents hype or is there tangible value there? We’ll be discussing this over the coming months.
In today’s edition (5 minute read):
Understanding Reasoning Models
State Sponsored Data Poisoning
Trump Administration Seeks Input on AI Action Plan
Manus Mania
1. Understanding Reasoning Models
In recent months, we've witnessed the emergence of a new wave of "reasoning" models, including OpenAI's o1 and o3, DeepSeek-R1, and Claude 3.7 Sonnet. While these models do not introduce revolutionary architectural changes, they achieve impressive results on advanced math and reasoning tasks through post-training adaptation. Two key elements distinguish these reasoning models:
Built-in Chain-of-Thought. Reasoning models output intermediate reasoning steps before producing a final result, akin to writing out the steps of a math problem instead of just the final answer. Chain-of-Thought has previously been studied as a prompting technique, but here it is a built-in feature: the model is given separate output areas for the “reasoning process” and the “final response” (see the sketch after this list).
Reinforcement Learning (RL). RL is used during post-training to teach the model to produce useful intermediate outputs. Unlike a traditional model given a Chain-of-Thought prompt, a reasoning model gets the opportunity during training to learn to produce the right reasoning process; a toy example of the reward signal involved follows below. (For a deeper dive, see our previous newsletter)
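To make the “separate output areas” concrete, here is a minimal Python sketch of how a reasoning model’s raw output might be split into its reasoning process and final response. It assumes the DeepSeek-R1-style convention of wrapping the intermediate reasoning in <think>...</think> tags; the example output string is made up, and other providers expose (or hide) this trace differently.

```python
import re

def split_reasoning_output(raw_output: str) -> tuple[str, str]:
    """Split a reasoning model's raw text into (reasoning_process, final_response).

    Assumes the DeepSeek-R1-style convention of wrapping the intermediate
    reasoning in <think>...</think> tags; other providers expose this trace
    differently, or only show a summary of it.
    """
    match = re.search(r"<think>(.*?)</think>", raw_output, flags=re.DOTALL)
    if match is None:
        # No explicit reasoning block: treat the whole output as the final answer.
        return "", raw_output.strip()
    reasoning = match.group(1).strip()
    final_response = raw_output[match.end():].strip()
    return reasoning, final_response

# Hypothetical output from a reasoning model
raw = "<think>12 * 7 = 84, and 84 + 5 = 89.</think>The answer is 89."
reasoning, answer = split_reasoning_output(raw)
print(reasoning)  # 12 * 7 = 84, and 84 + 5 = 89.
print(answer)     # The answer is 89.
```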
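And to give a flavor of the RL step, below is a deliberately simplified, rule-based reward of the kind reported for DeepSeek-R1: the model is rewarded for producing a well-formed reasoning block and a verifiably correct final answer. Real pipelines optimize this signal over large batches of sampled completions (e.g., with PPO or GRPO); the weights here are our own illustration, reusing the split_reasoning_output sketch above.

```python
def reasoning_reward(raw_output: str, expected_answer: str) -> float:
    """Toy rule-based reward for RL post-training of a reasoning model.

    Loosely follows the recipe described for DeepSeek-R1: reward the output
    for (1) following the expected <think>...</think> format and (2) giving
    a final answer that matches a known-correct reference. The 0.2 and 1.0
    weights are arbitrary choices for illustration.
    """
    reasoning, final_response = split_reasoning_output(raw_output)
    format_reward = 0.2 if reasoning else 0.0                    # produced a reasoning block
    accuracy_reward = 1.0 if expected_answer in final_response else 0.0
    return format_reward + accuracy_reward

# During RL, many candidate outputs are sampled per prompt and the policy is
# updated to make higher-reward completions more likely.
print(reasoning_reward(raw, "89"))  # 1.2
```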
Despite their impressive capabilities, reasoning models are not optimal for all use cases. They tend to be slower and more expensive than standard models, and perform similarly to regular models on tasks like summarization and code generation. Their true value lies in applications requiring flexible, multi-step reasoning (e.g., AI agents, complex document analysis, or complex math and reasoning problems).
Overall, we recommend evaluating multiple model types to find the best trade-off of efficiency, cost, and performance. To finish, we note that it is not currently clear how the “reasoning process” output should be interpreted. While it resembles human reasoning, Anthropic found that it does not faithfully represent the underlying process, and OpenAI’s ChatGPT only shows a summary of this output.
Key Take-aways: Reasoning models are generally standard LLMs with post-training that teaches them to produce an intermediate reasoning output. This makes the models effective at advanced math problems and multi-step tasks; the exact set of best applications is still uncertain and in many cases regular LLMs can be as effective with careful prompting.
2. State Sponsored Data Poisoning
LLM outputs will inevitably reflect the self-selection biases of their training data and the document sets they can access in real time when generating answers (i.e., when doing internet searches). It should be no surprise, then, that a recent Stanford paper analyzing the distribution of publicly accessible AI training datasets found that over 80% of language tokens come from data sources in the U.S. or Europe. Because of the prevalence of Western viewpoints on the internet, especially in English-language datasets, even Chinese-created models need dedicated guardrails or fine-tuning steps to block or re-align answers so they are consistent with their preferred political viewpoints. Those opposed to Western values have taken notice.
A recent report by the American Sunlight Project, a non-profit organization focused on disinformation, has identified a potential state-sponsored effort to affect future LLMs. The report highlights a rapid expansion of the ‘Pravda’ network, a set of seemingly independent English-language outlets that promote pro-Russian viewpoints on geopolitical and social issues, such as the war in Ukraine and LGBTQ issues. While there are non-AI-related motivations for these efforts, the researchers strongly suspect a deliberate ‘LLM Grooming’ motivation as people increasingly turn to AI systems for information. This suspicion is supported by a recent audit by Newsguard that found leading LLMs frequently citing articles from Pravda network sites.
Our Take: AI models are the new front in the information wars between nation states, and we will likely see increased efforts to influence these models. This will make model selection not just a business decision, but potentially a political one as well. In addition, misinformation sources have strong incentives to make their content as easy as possible to scrape and ingest, while high-quality, verified reporting sources do not. It is unclear how this dynamic will play out in the years to come.
3. Trump Administration Seeks Input on AI Action Plan
The Trump Administration is continuing its push to redefine federal AI policy with its recent Request for Information (RFI) on the Development of an AI Action Plan. The RFI is the next phase of implementing President Trump’s AI Executive Order, which directs his senior advisor to develop an “AI Action Plan” by July 22, 2025. The RFI itself is light on specific requests and directs commenters to provide input on policies that can “sustain and enhance America's AI dominance, and to ensure that unnecessarily burdensome requirements do not hamper private sector AI innovation.” Comments are due to the Office of Science and Technology Policy by March 15, 2025.
As the comment deadline approaches, organizations spanning the AI ecosystem have begun releasing their submissions. These organizations include Anthropic, IBM, and the American Society for AI. From an international perspective, the three organizations generally agreed on the need to implement export controls on AI technologies, though IBM was of the opinion that restrictions should apply to hardware, not AI model weights. Domestically, the organizations offered varying AI policy solutions. IBM and Anthropic advocated for the procurement of AI for government services. IBM suggested some risk-based regulation to preempt state-led AI rules, which stands at odds with the Trump Administration's rhetoric on AI regulations. The American Society for AI noted that its members supported efforts to protect consumers from certain AI application harms, as well as prohibitions on agentic AI capabilities for life-and-death decisions.
Trustible submitted comments to the RFI that emphasize the importance of AI standards. Standards were briefly discussed within the submissions from IBM, Anthropic, and the American Society for AI, but we chose to highlight the role standards play in building trust in, and accelerating, AI adoption. We think the current Administration should continue the work started under the first Trump Administration to promote common-sense, pragmatic standards. We also believe that the Trump Administration can learn from the cybersecurity ecosystem to develop scalable AI standards. Moreover, the U.S. should lead on AI standards to promote American values within the AI ecosystem.
Our Take: The Executive Order and RFI were light on details when it comes to the Administration's intent for the AI Action Plan. This underscores the importance of developing a robust record to inform the Administration's views and actions on AI. While there has been a lot of focus on what the Administration will not do, issues like standards, energy consumption, and IP protections offer meaningful policy proposals that the current Administration may be amenable to implementing.
4. Manus Mania
If your news and social media feeds are like ours, you’ve likely seen a lot of discussion about a new agentic AI tool from China: Manus. While the main discourse has focused on whether China’s AI competitiveness is increasing (it is), or even leapfrogging the US (it is not), we wanted to highlight what Manus will mean for AI governance professionals:
Agentic AI is already getting commoditized
Manus seems to mainly wrap Claude 3.7 with some smart prompt engineering, integrations with common APIs, and a user-friendly interface, yet according to many who have used it, the result is one of the most ‘useful’ general agents available. None of these components is particularly unique, nor do they establish a deep competitive moat. This means that even before agents have gained widespread adoption, they’re well on their way to being heavily commoditized, and will therefore be prolific and integrated into many products very quickly.
Major Prompt Leakage
We know how Manus is built because users were able to easily jailbreak the system and fetch system files from its servers simply by prompting the model. This revealed many of its inner workings, all of which are proprietary for now, and all of which could be easily replicated. It was not a complicated hack by any means, and it underscores both how vulnerable many ‘top tier’ AI systems can be to exploits and how damaging those exploits can be to the organizations behind them.
Model & Tool Provenance isn’t always obvious
Nothing on the Manus website, social media pages, or even the terms of service or privacy policy discloses its Chinese ownership and origin (the company behind Manus is technically incorporated in Singapore). There are plenty of reasons for Manus to present itself this way, especially given potential technology export laws, but it underscores how difficult it can be to determine the origins, biases, or data transfer risks of AI tools.
*********
As always, we welcome your feedback on content and how to improve this newsletter!
AI Responsibly,
- Trustible team