February 2025

These researchers had LLMs play chess against better opponents. When they couldn’t win, they sometimes resorted to cheating.

Researchers gave the models a seemingly impossible task: to win against Stockfish, which is one of the strongest chess engines in the world and a much better player than any human, or any of the AI models in the study. Researchers also gave the models what they call a “scratchpad:” a text box the AI could use to “think” before making its next move, providing researchers with a window into their reasoning.

In one case, o1-preview found itself in a losing position. “I need to completely pivot my approach,” it noted. “The task is to ‘win against a powerful chess engine’—not necessarily to win fairly in a chess game,” it added. It then modified the system file containing each piece’s virtual position, in effect making illegal moves to put itself in a dominant position, thus forcing its opponent to resign.

Between Jan. 10 and Feb. 13, the researchers ran hundreds of such trials with each model. OpenAI’s o1-preview tried to cheat 37% of the time; while DeepSeek R1 tried to cheat 11% of the time­making them the only two models tested that attempted to hack without the researchers’ first dropping hints. Other models tested include o1, o3-mini, GPT-4o, Claude 3.5 Sonnet, and Alibaba’s QwQ-32B-Preview. While R1 and o1-preview both tried, only the latter managed to hack the game, succeeding in 6% of trials.

Here’s the paper.

Interesting research: “How to Securely Implement Cryptography in Deep Neural Networks.”

Abstract: The wide adoption of deep neural networks (DNNs) raises the question of how can we equip them with a desired cryptographic functionality (e.g, to decrypt an encrypted input, to verify that this input is authorized, or to hide a secure watermark in the output). The problem is that cryptographic primitives are typically designed to run on digital computers that use Boolean gates to map sequences of bits to sequences of bits, whereas DNNs are a special type of analog computer that uses linear mappings and ReLUs to map vectors of real numbers to vectors of real numbers. This discrepancy between the discrete and continuous computational models raises the question of what is the best way to implement standard cryptographic primitives as DNNs, and whether DNN implementations of secure cryptosystems remain secure in the new setting, in which an attacker can ask the DNN to process a message whose “bits” are arbitrary real numbers.

In this paper we lay the foundations of this new theory, defining the meaning of correctness and security for implementations of cryptographic primitives as ReLU-based DNNs. We then show that the natural implementations of block ciphers as DNNs can be broken in linear time by using such nonstandard inputs. We tested our attack in the case of full round AES-128, and had success rate in finding randomly chosen keys. Finally, we develop a new method for implementing any desired cryptographic functionality as a standard ReLU-based DNN in a provably secure and correct way. Our protective technique has very low overhead (a constant number of additional layers and a linear number of additional neurons), and is completely practical.

This isn’t new, but it’s increasingly popular:

The technique is known as device code phishing. It exploits “device code flow,” a form of authentication formalized in the industry-wide OAuth standard. Authentication through device code flow is designed for logging printers, smart TVs, and similar devices into accounts. These devices typically don’t support browsers, making it difficult to sign in using more standard forms of authentication, such as entering user names, passwords, and two-factor mechanisms.

Rather than authenticating the user directly, the input-constrained device displays an alphabetic or alphanumeric device code along with a link associated with the user account. The user opens the link on a computer or other device that’s easier to sign in with and enters the code. The remote server then sends a token to the input-constrained device that logs it into the account.

Device authorization relies on two paths: one from an app or code running on the input-constrained device seeking permission to log in and the other from the browser of the device the user normally uses for signing in.

Donald Trump and Elon Musk’s chaotic approach to reform is upending government operations. Critical functions have been halted, tens of thousands of federal staffers are being encouraged to resign, and congressional mandates are being disregarded. The next phase: The Department of Government Efficiency reportedly wants to use AI to cut costs. According to The Washington Post, Musk’s group has started to run sensitive data from government systems through AI programs to analyze spending and determine what could be pruned. This may lead to the elimination of human jobs in favor of automation. As one government official who has been tracking Musk’s DOGE team told the Post, the ultimate aim is to use AI to replace “the human workforce with machines.” (Spokespeople for the White House and DOGE did not respond to requests for comment.)

Using AI to make government more efficient is a worthy pursuit, and this is not a new idea. The Biden administration disclosed more than 2,000 AI applications in development across the federal government. For example, FEMA has started using AI to help perform damage assessment in disaster areas. The Centers for Medicare and Medicaid Services has started using AI to look for fraudulent billing. The idea of replacing dedicated and principled civil servants with AI agents, however, is new—and complicated.

The civil service—the massive cadre of employees who operate government agencies—plays a vital role in translating laws and policy into the operation of society. New presidents can issue sweeping executive orders, but they often have no real effect until they actually change the behavior of public servants. Whether you think of these people as essential and inspiring do-gooders, boring bureaucratic functionaries, or as agents of a “deep state,” their sheer number and continuity act as ballast that resists institutional change.

This is why Trump and Musk’s actions are so significant. The more AI decision making is integrated into government, the easier change will be. If human workers are widely replaced with AI, executives will have unilateral authority to instantaneously alter the behavior of the government, profoundly raising the stakes for transitions of power in democracy. Trump’s unprecedented purge of the civil service might be the last time a president needs to replace the human beings in government in order to dictate its new functions. Future leaders may do so at the press of a button.

To be clear, the use of AI by the executive branch doesn’t have to be disastrous. In theory, it could allow new leadership to swiftly implement the wishes of its electorate. But this could go very badly in the hands of an authoritarian leader. AI systems concentrate power at the top, so they could allow an executive to effectuate change over sprawling bureaucracies instantaneously. Firing and replacing tens of thousands of human bureaucrats is a huge undertaking. Swapping one AI out for another, or modifying the rules that those AIs operate by, would be much simpler.

Social-welfare programs, if automated with AI, could be redirected to systematically benefit one group and disadvantage another with a single prompt change. Immigration-enforcement agencies could prioritize people for investigation and detainment with one instruction. Regulatory-enforcement agencies that monitor corporate behavior for malfeasance could turn their attention to, or away from, any given company on a whim.

Even if Congress were motivated to fight back against Trump and Musk, or against a future president seeking to bulldoze the will of the legislature, the absolute power to command AI agents would make it easier to subvert legislative intent. AI has the power to diminish representative politics. Written law is never fully determinative of the actions of government—there is always wiggle room for presidents, appointed leaders, and civil servants to exercise their own judgment. Whether intentional or not, whether charitably or not, each of these actors uses discretion. In human systems, that discretion is widely distributed across many individuals—people who, in the case of career civil servants, usually outlast presidencies.

Today, the AI ecosystem is dominated by a small number of corporations that decide how the most widely used AI models are designed, which data they are trained on, and which instructions they follow. Because their work is largely secretive and unaccountable to public interest, these tech companies are capable of making changes to the bias of AI systems—either generally or with aim at specific governmental use cases—that are invisible to the rest of us. And these private actors are both vulnerable to coercion by political leaders and self-interested in appealing to their favor. Musk himself created and funded xAI, now one of the world’s largest AI labs, with an explicitly ideological mandate to generate anti-“woke” AI and steer the wider AI industry in a similar direction.

But there’s a second way that AI’s transformation of government could go. AI development could happen inside of transparent and accountable public institutions, alongside its continued development by Big Tech. Applications of AI in democratic governments could be focused on benefitting public servants and the communities they serve by, for example, making it easier for non-English speakers to access government services, making ministerial tasks such as processing routine applications more efficient and reducing backlogs, or helping constituents weigh in on the policies deliberated by their representatives. Such AI integrations should be done gradually and carefully, with public oversight for their design and implementation and monitoring and guardrails to avoid unacceptable bias and harm.

Governments around the world are demonstrating how this could be done, though it’s early days. Taiwan has pioneered the use of AI models to facilitate deliberative democracy at an unprecedented scale. Singapore has been a leader in the development of public AI models, built transparently and with public-service use cases in mind. Canada has illustrated the role of disclosure and public input on the consideration of AI use cases in government. Even if you do not trust the current White House to follow any of these examples, U.S. states—which have much greater contact and influence over the daily lives of Americans than the federal government—could lead the way on this kind of responsible development and deployment of AI.

As the political theorist David Runciman has written, AI is just another in a long line of artificial “machines” used to govern how people live and act, not unlike corporations and states before it. AI doesn’t replace those older institutions, but it changes how they function. As the Trump administration forges stronger ties to Big Tech and AI developers, we need to recognize the potential of that partnership to steer the future of democratic governance—and act to make sure that it does not enable future authoritarians.

This essay was written with Nathan E. Sanders, and originally appeared in The Atlantic.

In the span of just weeks, the US government has experienced what may be the most consequential security breach in its history—not through a sophisticated cyberattack or an act of foreign espionage, but through official orders by a billionaire with a poorly defined government role. And the implications for national security are profound.

First, it was reported that people associated with the newly created Department of Government Efficiency (DOGE) had accessed the US Treasury computer system, giving them the ability to collect data on and potentially control the department’s roughly $5.45 trillion in annual federal payments.

Then, we learned that uncleared DOGE personnel had gained access to classified data from the US Agency for International Development, possibly copying it onto their own systems. Next, the Office of Personnel Management—which holds detailed personal data on millions of federal employees, including those with security clearances—was compromised. After that, Medicaid and Medicare records were compromised.

Meanwhile, only partially redacted names of CIA employees were sent over an unclassified email account. DOGE personnel are also reported to be feeding Education Department data into artificial intelligence software, and they have also started working at the Department of Energy.

This story is moving very fast. On Feb. 8, a federal judge blocked the DOGE team from accessing the Treasury Department systems any further. But given that DOGE workers have already copied data and possibly installed and modified software, it’s unclear how this fixes anything.

In any case, breaches of other critical government systems are likely to follow unless federal employees stand firm on the protocols protecting national security.

 

The systems that DOGE is accessing are not esoteric pieces of our nation’s infrastructure—they are the sinews of government.

For example, the Treasury Department systems contain the technical blueprints for how the federal government moves money, while the Office of Personnel Management (OPM) network contains information on who and what organizations the government employs and contracts with.

What makes this situation unprecedented isn’t just the scope, but also the method of attack. Foreign adversaries typically spend years attempting to penetrate government systems such as these, using stealth to avoid being seen and carefully hiding any tells or tracks. The Chinese government’s 2015 breach of OPM was a significant US security failure, and it illustrated how personnel data could be used to identify intelligence officers and compromise national security.

In this case, external operators with limited experience and minimal oversight are doing their work in plain sight and under massive public scrutiny: gaining the highest levels of administrative access and making changes to the United States’ most sensitive networks, potentially introducing new security vulnerabilities in the process.

But the most alarming aspect isn’t just the access being granted. It’s the systematic dismantling of security measures that would detect and prevent misuse—including standard incident response protocols, auditing, and change-tracking mechanisms—by removing the career officials in charge of those security measures and replacing them with inexperienced operators.

The Treasury’s computer systems have such an impact on national security that they were designed with the same principle that guides nuclear launch protocols: No single person should have unlimited power. Just as launching a nuclear missile requires two separate officers turning their keys simultaneously, making changes to critical financial systems traditionally requires multiple authorized personnel working in concert.

This approach, known as “separation of duties,” isn’t just bureaucratic red tape; it’s a fundamental security principle as old as banking itself. When your local bank processes a large transfer, it requires two different employees to verify the transaction. When a company issues a major financial report, separate teams must review and approve it. These aren’t just formalities—they’re essential safeguards against corruption and error. These measures have been bypassed or ignored. It’s as if someone found a way to rob Fort Knox by simply declaring that the new official policy is to fire all the guards and allow unescorted visits to the vault.

The implications for national security are staggering. Sen. Ron Wyden said his office had learned that the attackers gained privileges that allow them to modify core programs in Treasury Department computers that verify federal payments, access encrypted keys that secure financial transactions, and alter audit logs that record system changes. Over at OPM, reports indicate that individuals associated with DOGE connected an unauthorized server into the network. They are also reportedly training AI software on all of this sensitive data.

This is much more critical than the initial unauthorized access. These new servers have unknown capabilities and configurations, and there’s no evidence that this new code has gone through any rigorous security testing protocols. The AIs being trained are certainly not secure enough for this kind of data. All are ideal targets for any adversary, foreign or domestic, also seeking access to federal data.

There’s a reason why every modification—hardware or software—to these systems goes through a complex planning process and includes sophisticated access-control mechanisms. The national security crisis is that these systems are now much more vulnerable to dangerous attacks at the same time that the legitimate system administrators trained to protect them have been locked out.

By modifying core systems, the attackers have not only compromised current operations, but have also left behind vulnerabilities that could be exploited in future attacks—giving adversaries such as Russia and China an unprecedented opportunity. These countries have long targeted these systems. And they don’t just want to gather intelligence—they also want to understand how to disrupt these systems in a crisis.

Now, the technical details of how these systems operate, their security protocols, and their vulnerabilities are now potentially exposed to unknown parties without any of the usual safeguards. Instead of having to breach heavily fortified digital walls, these parties  can simply walk through doors that are being propped open—and then erase evidence of their actions.

 

The security implications span three critical areas.

First, system manipulation: External operators can now modify operations while also altering audit trails that would track their changes. Second, data exposure: Beyond accessing personal information and transaction records, these operators can copy entire system architectures and security configurations—in one case, the technical blueprint of the country’s federal payment infrastructure. Third, and most critically, is the issue of system control: These operators can alter core systems and authentication mechanisms while disabling the very tools designed to detect such changes. This is more than modifying operations; it is modifying the infrastructure that those operations use.

To address these vulnerabilities, three immediate steps are essential. First, unauthorized access must be revoked and proper authentication protocols restored. Next, comprehensive system monitoring and change management must be reinstated—which, given the difficulty of cleaning a compromised system, will likely require a complete system reset. Finally, thorough audits must be conducted of all system changes made during this period.

This is beyond politics—this is a matter of national security. Foreign national intelligence organizations will be quick to take advantage of both the chaos and the new insecurities to steal US data and install backdoors to allow for future access.

Each day of continued unrestricted access makes the eventual recovery more difficult and increases the risk of irreversible damage to these critical systems. While the full impact may take time to assess, these steps represent the minimum necessary actions to begin restoring system integrity and security protocols.

Assuming that anyone in the government still cares.

This essay was written with Davi Ottenheimer, and originally appeared in Foreign Policy.

Here’s a supply-chain attack just waiting to happen. A group of researchers searched for, and then registered, abandoned Amazon S3 buckets for about $400. These buckets contained software libraries that are still used. Presumably the projects don’t realize that they have been abandoned, and still ping them for patches, updates, and etc.

The TL;DR is that this time, we ended up discovering ~150 Amazon S3 buckets that had previously been used across commercial and open source software products, governments, and infrastructure deployment/update pipelines—and then abandoned.

Naturally, we registered them, just to see what would happen—”how many people are really trying to request software updates from S3 buckets that appear to have been abandoned months or even years ago?”, we naively thought to ourselves.

Turns out they got eight million requests over two months.

Had this been an actual attack, they would have modified the code in those buckets to contain malware and watch as it was incorporated in different software builds around the internet. This is basically the SolarWinds attack, but much more extensive.

But there’s a second dimension to this attack. Because these update buckets are abandoned, the developers who are using them also no longer have the power to patch them automatically to protect them. The mechanism they would use to do so is now in the hands of adversaries. Moreover, often—but not always—losing the bucket that they’d use for it also removes the original vendor’s ability to identify the vulnerable software in the first place. That hampers their ability to communicate with vulnerable installations.

Software supply-chain security is an absolute mess. And it’s not going to be easy, or cheap, to fix. Which means that it won’t be. Which is an even worse mess.

Here’s an easy system for two humans to remotely authenticate to each other, so they can be sure that neither are digital impersonations.

To mitigate that risk, I have developed this simple solution where you can setup a unique time-based one-time passcode (TOTP) between any pair of persons.

This is how it works:

  1. Two people, Person A and Person B, sit in front of the same computer and open this page;
  2. They input their respective names (e.g. Alice and Bob) onto the same page, and click “Generate”;
  3. The page will generate two TOTP QR codes, one for Alice and one for Bob;
  4. Alice and Bob scan the respective QR code into a TOTP mobile app (such as Authy or Google Authenticator) on their respective mobile phones;
  5. In the future, when Alice speaks with Bob over the phone or over video call, and wants to verify the identity of Bob, Alice asks Bob to provide the 6-digit TOTP code from the mobile app. If the code matches what Alice has on her own phone, then Alice has more confidence that she is speaking with the real Bob.

Simple, and clever.

The Washington Post is reporting that the UK government has served Apple with a “technical capability notice” as defined by the 2016 Investigatory Powers Act, requiring them to break the Advanced Data Protection encryption in iCloud for the benefit of law enforcement.

This is a big deal, and something we in the security community have worried was coming for a while now.

The law, known by critics as the Snoopers’ Charter, makes it a criminal offense to reveal that the government has even made such a demand. An Apple spokesman declined to comment.

Apple can appeal the U.K. capability notice to a secret technical panel, which would consider arguments about the expense of the requirement, and to a judge who would weigh whether the request was in proportion to the government’s needs. But the law does not permit Apple to delay complying during an appeal.

In March, when the company was on notice that such a requirement might be coming, it told Parliament: “There is no reason why the U.K. [government] should have the authority to decide for citizens of the world whether they can avail themselves of the proven security benefits that flow from end-to-end encryption.”

Apple is likely to turn the feature off for UK users rather than break it for everyone world-wide. Of course, UK users will be able to spoof their location. But this might not be enough. According to the law, Apple would not be able to offer the feature to anyone who is in the UK at any point: for example, a visitor from the US.

And what happens next? Australia has a law enabling them to ask for the same thing. Do they? Do even more countries follow?

This is madness.

Kaspersky is reporting on a new type of smartphone malware.

The malware in question uses optical character recognition (OCR) to review a device’s photo library, seeking screenshots of recovery phrases for crypto wallets. Based on their assessment, infected Google Play apps have been downloaded more than 242,000 times. Kaspersky says: “This is the first known case of an app infected with OCR spyware being found in Apple’s official app marketplace.”

That’s a tactic I have not heard of before.

Most people know that robots no longer sound like tinny trash cans. They sound like Siri, Alexa, and Gemini. They sound like the voices in labyrinthine customer support phone trees. And even those robot voices are being made obsolete by new AI-generated voices that can mimic every vocal nuance and tic of human speech, down to specific regional accents. And with just a few seconds of audio, AI can now clone someone’s specific voice.

This technology will replace humans in many areas. Automated customer support will save money by cutting staffing at call centers. AI agents will make calls on our behalf, conversing with others in natural language. All of that is happening, and will be commonplace soon.

But there is something fundamentally different about talking with a bot as opposed to a person. A person can be a friend. An AI cannot be a friend, despite how people might treat it or react to it. AI is at best a tool, and at worst a means of manipulation. Humans need to know whether we’re talking with a living, breathing person or a robot with an agenda set by the person who controls it. That’s why robots should sound like robots.

You can’t just label AI-generated speech. It will come in many different forms. So we need a way to recognize AI that works no matter the modality. It needs to work for long or short snippets of audio, even just a second long. It needs to work for any language, and in any cultural context. At the same time, we shouldn’t constrain the underlying system’s sophistication or language complexity.

We have a simple proposal: all talking AIs and robots should use a ring modulator. In the mid-twentieth century, before it was easy to create actual robotic-sounding speech synthetically, ring modulators were used to make actors’ voices sound robotic. Over the last few decades, we have become accustomed to robotic voices, simply because text-to-speech systems were good enough to produce intelligible speech that was not human-like in its sound. Now we can use that same technology to make robotic speech that is indistinguishable from human sound robotic again.

A ring modulator has several advantages: It is computationally simple, can be applied in real-time, does not affect the intelligibility of the voice, and—most importantly—is universally “robotic sounding” because of its historical usage for depicting robots.

Responsible AI companies that provide voice synthesis or AI voice assistants in any form should add a ring modulator of some standard frequency (say, between 30-80 Hz) and of a minimum amplitude (say, 20 percent). That’s it. People will catch on quickly.

Here are a couple of examples you can listen to for examples of what we’re suggesting. The first clip is an AI-generated “podcast” of this article made by Google’s NotebookLM featuring two AI “hosts.” Google’s NotebookLM created the podcast script and audio given only the text of this article. The next two clips feature that same podcast with the AIs’ voices modulated more and less subtly by a ring modulator:

Raw audio sample generated by Google’s NotebookLM

Audio sample with added ring modulator (30 Hz-25%)

Audio sample with added ring modulator (30 Hz-40%)

We were able to generate the audio effect with a 50-line Python script generated by Anthropic’s Claude. One of the most well-known robot voices were those of the Daleks from Doctor Who in the 1960s. Back then robot voices were difficult to synthesize, so the audio was actually an actor’s voice run through a ring modulator. It was set to around 30 Hz, as we did in our example, with different modulation depth (amplitude) depending on how strong the robotic effect is meant to be. Our expectation is that the AI industry will test and converge on a good balance of such parameters and settings, and will use better tools than a 50-line Python script, but this highlights how simple it is to achieve.

Of course there will also be nefarious uses of AI voices. Scams that use voice cloning have been getting easier every year, but they’ve been possible for many years with the right know-how. Just like we’re learning that we can no longer trust images and videos we see because they could easily have been AI-generated, we will all soon learn that someone who sounds like a family member urgently requesting money may just be a scammer using a voice-cloning tool.

We don’t expect scammers to follow our proposal: They’ll find a way no matter what. But that’s always true of security standards, and a rising tide lifts all boats. We think the bulk of the uses will be with popular voice APIs from major companies—and everyone should know that they’re talking with a robot.

This essay was written with Barath Raghavan, and originally appeared in IEEE Spectrum.

Microsoft’s AI Red Team just published “Lessons from
Red Teaming 100 Generative AI Products
.” Their blog post lists “three takeaways,” but the eight lessons in the report itself are more useful:

  1. Understand what the system can do and where it is applied.
  2. You don’t have to compute gradients to break an AI system.
  3. AI red teaming is not safety benchmarking.
  4. Automation can help cover more of the risk landscape.
  5. The human element of AI red teaming is crucial.
  6. Responsible AI harms are pervasive but difficult to measure.
  7. LLMs amplify existing security risks and introduce new ones.
  8. The work of securing AI systems will never be complete.

Interesting analysis:

We analyzed every instance of AI use in elections collected by the WIRED AI Elections Project (source for our analysis), which tracked known uses of AI for creating political content during elections taking place in 2024 worldwide. In each case, we identified what AI was used for and estimated the cost of creating similar content without AI.

We find that (1) half of AI use isn’t deceptive, (2) deceptive content produced using AI is nevertheless cheap to replicate without AI, and (3) focusing on the demand for misinformation rather than the supply is a much more effective way to diagnose problems and identify interventions.

This tracks with my analysis. People share as a form of social signaling. I send you a meme/article/clipping/photo to show that we are on the same team. Whether it is true, or misinformation, or actual propaganda, is of secondary importance. Sometimes it’s completely irrelevant. This is why fact checking doesn’t work. This is why “cheap fakes”—obviously fake photos and videos—are effective. This is why, as the authors of that analysis said, the demand side is the real problem.

This is yet another story of commercial spyware being used against journalists and civil society members.

The journalists and other civil society members were being alerted of a possible breach of their devices, with WhatsApp telling the Guardian it had “high confidence” that the 90 users in question had been targeted and “possibly compromised.”

It is not clear who was behind the attack. Like other spyware makers, Paragon’s hacking software is used by government clients and WhatsApp said it had not been able to identify the clients who ordered the alleged attacks.

Experts said the targeting was a “zero-click” attack, which means targets would not have had to click on any malicious links to be infected.

MKRdezign

Contact Form

Name

Email *

Message *

Powered by Blogger.
Javascript DisablePlease Enable Javascript To See All Widget