Train Your Own Large Language Model? Too Expensive, Even for Goldman Sachs
The bank’s CIO is instead looking to hybrid approaches that combine LLMs with more specialized models.
In an interview on the Goldman Sachs website, the American bank’s CIO, Marco Argenti, shares his analysis of how enterprise use of generative AI is evolving. He highlights how much companies’ approach has changed in a matter of months: “In the beginning, everyone wanted to train their own model, build their proprietary model with their proprietary data, keeping the data largely on-premise to ensure tight control. Then, people began to realize that, to achieve the level of performance of large models, replicating infrastructure is simply too expensive – to the tune of hundreds of millions of dollars.”
GPT-4, the latest large language model (LLM) from OpenAI, was probably trained on trillions of words of text using thousands of processors, a process estimated to have cost more than 100 million dollars.
The LLM? “A brain to interpret the prompt”
This is Marco Argenti’s argument for a hybrid architecture that combines pre-trained LLMs, “appreciated for some of their abilities in reasoning, problem-solving, and logic,” with smaller, often open source models that are specialized for specific tasks and trained on proprietary data. In other words, foundation models available in the cloud are used to break complex problems down into more elementary ones, while more specialized models, hosted on-premises or in private clouds, handle those elementary tasks. “Hybrid generative AI is about using large models as a kind of brain charged with interpreting the prompt,” the CIO says. He further explains: “Sectors that rely heavily on proprietary data and are subject to very strict regulations will most likely be the first to adopt this architecture.” The remark obviously applies to the banking sector.
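To make the division of labor concrete, here is a minimal Python sketch of the hybrid pattern described above. It is an illustration under stated assumptions, not Goldman Sachs’ implementation: every name in it (`plan_with_large_model`, `risk_model`, `summary_model`, `general_model`, `SPECIALISTS`) is hypothetical, and the “large model” is replaced by simple keyword rules so the example runs without any external service.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class SubTask:
    kind: str     # which specialized model should handle this step
    payload: str  # the text handed to that model


def plan_with_large_model(prompt: str) -> List[SubTask]:
    """Hypothetical stand-in for the cloud-hosted foundation model.

    A real system would ask an LLM to interpret the prompt and split it
    into sub-tasks; keyword rules are used here only to keep this runnable.
    """
    lowered = prompt.lower()
    tasks: List[SubTask] = []
    if "risk" in lowered:
        tasks.append(SubTask("risk", prompt))
    if "summar" in lowered:
        tasks.append(SubTask("summary", prompt))
    if not tasks:
        tasks.append(SubTask("general", prompt))
    return tasks


# Hypothetical specialized models, assumed to run on-premises or in a
# private cloud and to be trained on proprietary data.
def risk_model(text: str) -> str:
    return "risk-model output for: " + text


def summary_model(text: str) -> str:
    return "summary-model output for: " + text


def general_model(text: str) -> str:
    return "general-model output for: " + text


SPECIALISTS: Dict[str, Callable[[str], str]] = {
    "risk": risk_model,
    "summary": summary_model,
    "general": general_model,
}


def answer(prompt: str) -> List[str]:
    """Let the 'brain' plan, then dispatch each sub-task to its specialist."""
    return [SPECIALISTS[task.kind](task.payload)
            for task in plan_with_large_model(prompt)]


if __name__ == "__main__":
    for line in answer("Summarize the risk report"):
        print(line)
```

In this sketch the proprietary data never has to leave the specialized models; only the planning step would involve the externally hosted LLM.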
RAG, vectorization, and prompt engineering
“In the beginning, everyone thought that without their own pre-trained model, it would be impossible to leverage the power of AI in business,” Marco Argenti continues. “Today, appropriate techniques such as RAG (Retrieval-augmented generation), content vectorization, and prompt engineering offer comparable, if not superior, performance to pre-trained models in some 95% of use cases, and for a fraction of the cost.”
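For readers unfamiliar with these techniques, the following Python sketch shows the general shape of RAG: documents are vectorized, the ones closest to the question are retrieved, and they are placed into the prompt sent to the model. It is a toy under explicit assumptions: the word-count “embedding”, the sample documents in `DOCS`, and the function names are all invented for illustration, and a production system would use a learned embedding model and a vector database instead.

```python
import math
import re
from collections import Counter
from typing import List


def vectorize(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words count vector (an assumption)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, documents: List[str], k: int = 1) -> List[str]:
    """Return the k documents most similar to the query."""
    q = vectorize(query)
    ranked = sorted(documents, key=lambda d: cosine(q, vectorize(d)),
                    reverse=True)
    return ranked[:k]


def build_prompt(query: str, documents: List[str]) -> str:
    """Prompt engineering step: inject the retrieved context into the prompt."""
    context = "\n".join(retrieve(query, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# Invented sample documents standing in for internal, proprietary content.
DOCS = [
    "The travel policy caps hotel expenses at 250 dollars per night.",
    "Code reviews require two approvals before merging.",
    "Quarterly reports are due on the fifth business day.",
]

if __name__ == "__main__":
    print(build_prompt("How many approvals do code reviews require?", DOCS))
```

The point of the design is that the model itself is never retrained: the proprietary knowledge arrives through the prompt, which is where the cost advantage cited by the CIO comes from.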
Marco Argenti, CIO of Goldman Sachs, outlines an architecture combining one or more commercially available LLMs with specialized AI models trained on internal data. (Photo: D.R.)
While the CIO expects 2024 to be strongly focused on finding AI use cases that deliver ROI, he also foresees the emergence of a complementary ecosystem of tools ensuring security, compliance, and data protection for AI applications, “as technology becomes embedded in critical tasks.” He also anticipates the arrival of multimodal models suited, for example, to interpreting time series. “This is perhaps where we will see the next race [among industry players, editor’s note] to capture a variety of use cases that have not yet been explored,” says Marco Argenti.
Last March, the CIO, who joined Goldman Sachs in 2019 from Amazon, explained that the bank’s developer teams were using generative AI to generate and test computer code, while presenting these early use cases as experiments. In November 2023, an innovation leader within The Firm, as Goldman Sachs is nicknamed, indicated that the bank was working on a dozen projects incorporating generative AI, and described the IT development uses (code generation and documentation writing) as among the most advanced. None of these projects, however, involves a client-facing application, because of regulations, as the Goldman Sachs official stressed at the time.