Bridgewater's fine-tuned model beats frontier LLMs on finance at 1/14th the cost
Analysis: A custom model trained on one hedge fund's own expert judgment scored 84.7% on six real finance tasks, beating the best carefully prompted frontier model, which scored 78.2%, while running at about a fourteenth of the cost per call. Thinking Machines, the lab Mira Murati started after leaving OpenAI, ran the study with Bridgewater's AIA Labs and published the results on July 2. The jobs were unglamorous: scoring whether an article is relevant, reading a central-bank statement. Off-the-shelf models sat near 50% on a plain prompt. For a narrow task with good labeled data, teaching a small model your judgment now beats renting a genius by the token.