Scale AI to evaluate large language models for Pentagon

Editor’s note: This article was updated Feb. 20, 2024, to more accurately reflect Scale AI’s tasks.

The U.S. Department of Defense selected Scale AI to help it test and evaluate generative artificial intelligence for military applications.

The California-based company announced the work with the Chief Digital and AI Office on Feb. 20, the same day the CDAO was scheduled to kick off its conference in Washington, D.C., featuring discussions on the topic.

Generative AI produces text, images or other data outputs from algorithmic models in response to user prompts. Large language models are a type of generative AI that learns statistical relationships among text documents or other inputs, either on its own or through a supervised training process, and uses them to produce essays, computer code, human-like conversation and more. Scale AI will produce benchmarks for such systems under the contract.

The Defense Department has expressed increasing interest in generative AI, but its uses remain debated. While a smart assistant or chatbot could efficiently find files, answer frequently asked questions or dig up contact information, such tools can also fuel disinformation campaigns, spoofing attempts and cyberattacks. The CDAO in August launched Task Force Lima to study and guide generative AI for national security purposes.

“Testing and evaluating generative AI will help the DOD understand the strengths and limitations of the technology, so it can be deployed responsibly,” Alexandr Wang, the founder and chief executive of Scale, said in a statement. “Scale is honored to partner with the DOD on this framework.”

The Tuesday announcement did not include a dollar value for the work.

Wang in July told Congress outdated data-retention and -management policies were hamstringing the Defense Department. What is needed, he said at the time, is “doubling down on some of these fast procurement methods and ensuring that we continue to innovate.”

“AI systems are only as good as the data that they are trained on,” he added.

Scale in 2022 won a nearly $250 million contract to provide federal agencies access to its technologies. The blanket purchase agreement was issued by the Joint AI Center, which was subsumed by the CDAO.

Colin Demarest is a reporter at C4ISRNET, where he covers military networks, cyber and IT. Colin previously covered the Department of Energy and its National Nuclear Security Administration — namely Cold War cleanup and nuclear weapons development — for a daily newspaper in South Carolina. Colin is also an award-winning photographer.
