
DATA-DRIVEN INSIGHTS AND NEWS
ON HOW BANKS ARE ADOPTING AI
Live from NeurIPS

Source: Alex Inch | AI researchers flock to San Diego for NeurIPS
4 December 2025
Welcome back to the Banking Brief! This week, Evident’s Alex Inch is on the ground in San Diego at NeurIPS, the biggest AI conference of the year. In between presenting his own research, he's been talking with the bankers descending on the Pacific coast to see what they get from rubbing shoulders with the academic set. They swear they aren’t there *just* for the southern Californian weather. Plus, we comb through the ocean of papers accepted there and give you a look at the AI that’s to come for banking.
Speaking of the future, we're still on the hunt for your anti-predictions: Which AI-in-banking prediction that you've seen is dead wrong? Tell us at [email protected] and include your name.
People mentioned: Ilya Sutskever, Rich Sutton, Chee Mun Foong, Ranil Boteju, Rodrigo Castillo, Dhanny Mandil, Matt Partridge, Gert Van Assche, Gemma McNair, Marguerite Bérard, Carsten Bittner, Pat Opet and Michael Ruttledge.
This edition is 1,665 words, a 5-minute read. Check it out online. If you were forwarded the Brief, you can subscribe here.
– Alexandra Mousavizadeh & Annabel Ayles
TOP OF THE NEWS
BANKS INVADE IVORY TOWER
SAN DIEGO — The faded carpets and boxed lunches here at NeurIPS might remind you of a downmarket trade show. But what’s being hashed out in San Diego this week could define how banks use AI for years to come.
Welcome to the world’s largest academic AI conference: When Rich Sutton, the “father of reinforcement learning,” gives an 8:30 a.m. talk in a 16,000-seat hall, organizers scramble for overflow rooms. The way to secure a highly coveted spot at one of the yacht parties happening this week? Solve a math problem.
But among Google, Meta and the rest of Silicon Valley is Wall Street. Six of the 50 banks we track now sponsor this event, up from four last year. And 10 banks have published papers, up from six last year. They tell me their hopes are the same: Somewhere around here is their next AI star or a new way to develop their next tool.

They’re in the right place: The floor teems with thousands of Ph.D.s eyeing their next move and researchers looking to share their work. One senior AI engineer at a major UK bank told me he was headhunted right off the floor of last year’s conference. An RBC engineer said they’ve had to sharpen their pitch as the number of companies here has doubled in just five years.
But there’s more to it than talent. A TD Bank researcher told me hallway conversations had already helped clarify an approach they’re using to update their internal models. A Capital One engineer said he’d saved himself future headaches by chatting with a researcher about what methods hadn’t worked.
That’s what’s really on offer here: a chance to make the cutting edge practical and skip the failures in the middle. Banks here tell me they want to crack agentic AI or build tabular foundation models – systems that understand spreadsheets as fluently as ChatGPT speaks English.
But becoming an AI leader under the fluorescent lights of Wall Street begins under the San Diego sun.
– Alex Inch

How are the world’s leading banks and insurers powering their innovation strategies by advancing and protecting AI-related IP? In November we launched the Evident AI Patent Tracker, our member-only database of 1,500+ AI patents filed by 80 major banks and insurance companies, alongside our analysis of the latest trends in how patents contribute to firms’ AI strategies.
FROM THE EVIDENT AI INDEX
PAPER TRAIL
If you want to see how banks will use AI in the future, look at the accepted papers at NeurIPS. We waded through all 5,990 papers to pull out the three trends banks are – or ought to be – paying attention to if they want to scale AI.
#1 JUDGMENT DAY: Evaluation frameworks – ways to judge whether models behave correctly – are the top research focus, accounting for 13% of all the conference’s papers and more than 25% of those authored by banks. There's a good reason: Scaling AI depends on knowing model outputs can be trusted. And that requires new ways to gauge how they perform in different, increasingly complex situations.
- Papers in practice: A Goldman Sachs paper published for NeurIPS introduces PHANTOM, a new way to test whether models will hallucinate when generating long financial documents. BNY published a paper showcasing an “early warning system” that spots when models are about to give a risky answer. And JPMorganChase researchers showed how models can still infer answers from information they were told to “unlearn” (like toxic language or personal data), which gives the bank new insight into how to fight hallucinations.
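To make the idea concrete, here is a toy evaluation check in the spirit of the frameworks above: score an answer against its grounding document and flag likely hallucinations. The keyword-overlap heuristic and every name here are illustrative assumptions, not any bank's actual method (including PHANTOM).

```python
def grounded_fraction(answer: str, source: str) -> float:
    """Fraction of answer tokens that also appear in the source text."""
    answer_tokens = set(answer.lower().split())
    source_tokens = set(source.lower().split())
    if not answer_tokens:
        return 0.0
    return len(answer_tokens & source_tokens) / len(answer_tokens)

def flag_hallucination(answer: str, source: str, threshold: float = 0.5) -> bool:
    """Flag answers whose content is mostly absent from the source."""
    return grounded_fraction(answer, source) < threshold

source = "Q3 revenue rose 4% to $2.1bn on higher trading income"
assert not flag_hallucination("revenue rose 4% on trading income", source)
assert flag_hallucination("the bank acquired a fintech startup", source)
```

Production frameworks replace the overlap heuristic with learned judges and domain-specific test suites, but the harness shape – answer in, pass/fail out – is the same.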
HOT TOPICS
The top focus areas of NeurIPS research revolve around making AI models scale

#2 MODEL OZEMPIC: Another big focus (11% of papers) is how to slim down models so running them at scale doesn’t cost so much. Two approaches stand out: “Modular architecture” gets systems to allocate work to the models best suited for a particular task, which saves resources. And “quantization and pruning” get models to store less detailed information, which “shrinks” them so they’re more practical for day-to-day use.
- Papers in practice: Amazon researchers published a paper showing how firms can cut costs by breaking tasks into pieces and routing each piece to different parts of models instead of having the whole machine firing for every request. Researchers from Microsoft, Nvidia and Google show how to save compute power by getting models to be better at determining which tasks they can blaze through and which ones they need to think hard about. And a Morgan Stanley paper details how the bank can tighten the belt by getting AI models to pull in only relevant data when they build a financial forecast.
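The “store less detailed info” idea can be sketched in a few lines. This is a hypothetical example of post-training int8 quantization: map float32 weights to 8-bit integers with a per-tensor scale, cutting memory roughly 4x at a small accuracy cost.

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Return int8 weights plus the scale needed to reconstruct them."""
    scale = np.abs(weights).max() / 127.0
    q = np.round(weights / scale).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float32 weights from the int8 version."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=1000).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
# int8 storage is a quarter of float32, and round-trip error stays within
# half a quantization step
assert q.nbytes == w.nbytes // 4
assert np.max(np.abs(w - w_hat)) <= scale / 2 + 1e-6
```

Real deployments add per-channel scales, calibration data and pruning of near-zero weights, but the cost-saving mechanism is this same trade of precision for memory.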
#3 THINK DIFFERENT: You won’t see it on the chart above, but there’s a growing focus on how to redesign models to think more efficiently. Today’s LLMs can be clunky at scale: They read every word and overthink even the easy questions. But two approaches to fix that are gaining steam and worth banks’ attention: “Sparse attention” is a way to get models to skim-read prompts and tune out superfluous information. And while “Mixture of Experts” models have been around for a while, this architecture – which saves brainpower by only “waking up” the parts of a model needed for a given task – is getting new life as efficiency takes priority.
- Papers in practice: An Alibaba paper at NeurIPS shows how to get models to discard messy parts of long prompts. Google researchers detail how to get models to be choosier about which tasks they think hard about. And Capital One is exploring how to make MoE models practical for banking. The bank’s paper details how to change the way feedback loops work so that “experts” that aren’t used much in this type of architecture can still improve even when they’re not tapped for a task.
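The “waking up” metaphor maps onto a simple routing loop. Here is a minimal Mixture-of-Experts sketch: a gate scores every expert for each token and only the top-k experts actually run, so most of the model stays asleep per request. All shapes and names are illustrative, not any paper's implementation.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def moe_forward(x, gate_w, experts, k=2):
    """Route each token (rows of x) to its top-k experts and mix the outputs."""
    scores = softmax(x @ gate_w)              # (tokens, n_experts)
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        top = np.argsort(scores[t])[-k:]      # indices of the k best experts
        weights = scores[t, top] / scores[t, top].sum()
        for wgt, e in zip(weights, top):
            out[t] += wgt * experts[e](x[t])  # only k experts ever run
    return out

rng = np.random.default_rng(1)
d, n_experts, tokens = 8, 4, 3
gate_w = rng.normal(size=(d, n_experts))
mats = [rng.normal(size=(d, d)) for _ in range(n_experts)]
experts = [lambda v, m=m: v @ m for m in mats]  # each "expert" is a linear map
y = moe_forward(rng.normal(size=(tokens, d)), gate_w, experts)
assert y.shape == (tokens, d)
```

Capital One's feedback-loop change described above targets the weakness visible even in this sketch: experts that rarely win the top-k never see a gradient, so they stop improving.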
NOTABLY QUOTABLE
“From 2012 to 2020, it was the age of research. Now, from 2020 to 2025, it was the age of scaling…because people say, ‘This is amazing. You’ve got to scale more. Keep scaling.’ The one word: scaling. But now the scale is so big. Is the belief really, ‘Oh, it’s so big, but if you had 100x more, everything would be so different?’ It would be different, for sure. But is the belief that if you just 100x the scale, everything would be transformed? I don’t think that’s true. So it’s back to the age of research again, just with big computers.”
– Ilya Sutskever, on the Dwarkesh podcast, Nov. 25


Want to speak directly to tech decision-makers at the biggest banks around the world? Our highly engaged audience of more than 20,000 subscribers includes CIOs, CDAOs, CTOs and CEOs of the top banks and financial services companies. Sponsorship for 2026 is now open; secure your spot today.
USE CASE CORNER
AGENTIC VENMO
One challenge for banks as they scale AI tools: U.S.-built LLMs struggle when customers move between languages. Mistral’s “multilinguistic” model release this week shows labs outside the U.S. are working to use language skills to their advantage, but they’re not the only ones.
In this week’s Corner, we spoke with Chee Mun Foong, chief product officer of Malaysia’s Ryt Bank, about how the firm built a foundation model that understands customers who switch between Malay and English and turned it into the foundation for an agentic chatbot.

Use Case: Ryt AI
Vendor: YTL
Bank: Ryt Bank
Why it’s interesting: The Malaysian bank wanted its chatbot to be able to take actions – like moving money – on its customers’ behalf. But building in this kind of agentic capability to a chatbot means being absolutely sure the tool can understand what it's being asked to do. The bank built its own foundation model to understand customers that mix both Malay and English into the same conversation so that it could bring conversational banking to life. “This is a product of reimagining what user interfaces can be in the age of generative AI,” Foong said. “It’s going to be a source of innovation for the next few years.”
How it works: Customers can now speak, text (in multiple languages) or send pictures to a single chatbot. The bank’s multi-agent system then breaks down what those users want to do with their money: One agent listens to or reads what a customer sent to determine their intent, another checks the action against compliance guardrails, and an execution agent turns what the customer wants to do into a deterministic API call before giving the customer one last chance to approve the action. In practice, that means telling the bot to “send $40 to Mike” can trigger a bank transfer. “We have about eight agents so far,” said Foong.
By the numbers: The bank has yet to publish accuracy metrics, but more than 50,000 people use the tool and the chatbot is processing 80,000 transactions per month. Foong cites “strong adoption” across the customer base: “Interestingly, people in their 60s and 70s…find it comforting, it’s like chatting with your banker,” he said.
Differentiating factor: The bank sees its foundation model as a structural advantage that helped it develop the agentic tool. Models from OpenAI and Google “didn’t work so well for us given Malaysian ways of speaking,” Foong said. Instead of spending time trying to tune them to understand when customers use multiple languages in the same interaction, the bank can now rely on its own model already designed to parse that behavior. Next up for the bank: working on persistent memory, so the chatbot becomes “the teller that knows you,” Foong said.
Want to know more about the specific ways banks are rolling out AI? Check out our Use Case Tracker – the inventory of all the AI use cases announced by the world’s largest banks available to members.
IN THE NEWS
ENTENTE CORDIALE
HSBC and Mistral inked a deal this week that lets the bank use the French AI company’s foundation models. The bank will start with its communications, document analysis and translation tools. In the future, the companies will collaborate on credit and lending, customer onboarding and fraud prevention use cases. Look to BNP Paribas for how the partnership might progress: The French bank now uses Mistral models in about 100 use cases.
Citizens Financial has moved 750 of its applications onto the cloud to support AI development, leaving just 11 that run on prem, CIO Michael Ruttledge said last week. It’s helped the bank cut infrastructure costs by 10-15%. But it hasn’t yet cracked employee adoption at scale: About 800 people – half of the bank’s software engineers – are using GitHub Copilot. At BNY, 80% of the developer community use GitHub Copilot every day, and at TD Bank, 92% of engineers use the tool weekly.
OpenAI showed last week why being a CISO will never get easier. The model maker disclosed a security incident tied to a breach of analytics firm Mixpanel, one of its partners. Only “limited analytics data related to some users of the API” were exposed, but it shows why JPMorganChase CISO Pat Opet’s “open letter to suppliers” was prescient but still didn’t capture the full issue. It’s no longer just bank suppliers CISOs need to worry about; it’s the suppliers behind those suppliers.
STAT OF THE WEEK
How much of ABN AMRO’s 2028 tech budget it plans to direct to “commercial initiatives” that can drive revenue – including AI tools – rather than spending the bulk on maintaining IT systems, CEO Marguerite Bérard said at last week’s capital markets day. The bank is currently spending 85% of its tech budget keeping legacy systems running.
The bank will use the freed-up budget to roll out 100 new Gen AI use cases to supplement the 25 it already has in production. It’s counting on those tools delivering value fast: The bank will also reduce headcount by 5,200 over the next three years – more than 20% of its overall headcount. “The beauty of the AI cases is that you normally actually have very small teams who can implement AI use cases that have a big lever and big impact…this has a very nice return on investment,” said Carsten Bittner, chief innovation and technology officer.
TALENT MATTERS
SOUTH FOR THE WINTER
CommBank named Ranil Boteju chief AI officer. Boteju was chief data and analytics officer at Lloyds, leading a team of more than 2,000. He’ll report to Rodrigo Castillo, the bank’s CTO and interim CIO for business technology.
Bank of America hired Dhanny Mandil to lead its infrastructure developer platforms team. Mandil was global head of private cloud engineering at Wells Fargo and served as head of identity and access management engineering at JPMorganChase before that.
Matt Partridge is now AI platform initiative management lead at UBS, where he’ll be “developing the strategy and vision to provide a fast, secure and compliant platform to enable Agentic AI and system integration at scale,” he wrote on LinkedIn. He was previously the bank’s head of foreign exchange, fixed income & equities integration management.
Goldman Sachs hired Gert Van Assche as head of AI model analytics. Van Assche was previously CTO at DATAmundi, a company that labels and tags “multilingual data for machine translation and conversational AI applications,” he wrote on LinkedIn.
Barclays brought on Gemma McNair as director of workforce transformation where she’ll work on “implementing AI into our ways of working.” McNair was previously in a similar role at NatWest.
WHAT'S ON
Sun 30 Nov - Sun 7 Dec
NeurIPS, Mexico City & San Diego
Tues 2 - Thurs 4 Dec
Global Banking Summit, London
Weds 10 - Thurs 11 Dec
The AI Summit, New York
Mon 19 - Fri 23 Jan
WEF, Davos, Switzerland
- Alexandra Mousavizadeh | Co-founder & CEO | [email protected]
- Annabel Ayles | Co-founder & co-CEO | [email protected]
- Colin Gilbert | VP, Intelligence | [email protected]
- Andrew Haynes | VP, Innovation | [email protected]
- Alex Inch | Data Scientist | [email protected]
- Gabriel Perez Jaen | Research Manager | [email protected]
- Matthew Kaminski | Senior Advisor | [email protected]
- Kevin McAllister | Senior Editor | [email protected]
- Sam Meeson | AI Research Analyst | [email protected]
- Jay Prynne | Head of Design | [email protected]
- Marcus Gurtler | Junior Designer | [email protected]