To accelerate the transition to memory safe programming languages, the US Defense Advanced Research Projects Agency (DARPA) is driving the development of TRACTOR, a programmatic code conversion vehicle.
The term stands for TRanslating All C TO Rust. It’s a DARPA project that aims to develop machine-learning tools that can automate the conversion of legacy C code into Rust.
The reason to do so is memory safety. Memory safety bugs, such as buffer overflows, account for the majority of major vulnerabilities in large codebases. And DARPA’s hope is that AI models can help with the programming language translation, in order to make software more secure.
“You can go to any of the LLM websites, start chatting with one of the AI chatbots, and all you need to say is ‘here’s some C code, please translate it to safe idiomatic Rust code,’ cut, paste, and something comes out, and it’s often very good, but not always,” said Dan Wallach, DARPA program manager for TRACTOR, in a statement.
“This parlor trick impressed me. I’m sure it can scale to solve difficult real world problems.”
It’s a promising approach worth trying, but I won’t be holding my breath.
If DARPA really wanted safer languages, they could be pushing test coverage, not blindly converting stable well tested C code into untested Rust code.
This, like most AI speculation, reeks of looking for shortcuts instead of doing the boring job at hand.
You have tests?
Edit: guess could always use AI to auto generate tests /s
I’m thinking they also want to future proof this.
The number of C devs is shrinking. It’s a really difficult language to get competent with.
If DARPA really wanted safer languages, they could be pushing test coverage,
Or Ada…
Ada is not strictly safer. It’s not memory safe for example, unless you never free. The advantage it has is mature support for formal verification. But there’s literally no way you’re going to be able to automatically convert C to Ada + formal properties.
In any case Rust has about a gazillion in-progress attempts at adding various kinds of formal verification support. Kani, Prusti, Creusot, Verus, etc. etc. It probably won’t be long before it’s better than Ada.
Also if your code is Ada then you only have access to the tiny Ada ecosystem, which is probably fine in some domains (e.g. embedded) but not in general.
A: “We really need this super-important and highly-technical job done.”
B: “We could just hire a bunch of highly-technical people to do it.”
A: “No, we would have to hire people and that would cost us millions.”
B: “We could spend billions on untested technology and hope for the best.”
A: “Excellent work B! Charge the government $100M for our excellent idea.”
turning C code automatically into Rust…
Oh wow they must have some sick transpiler, super exciting…
With AI, of course
God fucking damnit.
Code works in C
Want to make it safer
Put it into a fucking LLM
You know sometimes I wonder if I’m an idiot or that maybe I just don’t have the right family connections to get a super high paying job
Key detail in the actual memo is that they’re not using just an LLM. “Wallach anticipates proposals that include novel combinations of software analysis, such as static and dynamic analysis, and large language models.”
They also are clearly aware of scope limitations. They explicitly call out some software, like entire kernels or pointer arithmetic heavy code, as being out of scope. They also seem to not anticipate 100% automation.
So with context, they seem open to any solutions to “how can we convert legacy C to Rust.” Obviously LLMs and machine learning are attractive avenues of investigation: current models are demonstrably able to write some valid Rust and transliterate some code. I use them, and they work more often than not for simpler tasks.
TL;DR: they want to accelerate converting C to Rust. LLMs and machine learning are some techniques they’re investigating as components.