Oracle HeatWave’s in-database LLMs to help reduce infra costs

Posted by Richy George on 26 June, 2024

Oracle is adding new generative AI-focused features to its HeatWave data analytics cloud service, previously known as MySQL HeatWave.

The new name reflects that HeatWave offers more than MySQL support: it also includes HeatWave Gen AI, HeatWave Lakehouse, and HeatWave AutoML, said Nipun Agarwal, senior vice president of HeatWave at Oracle.

At its annual CloudWorld conference in September 2023, Oracle previewed a series of generative AI-focused updates for what was then MySQL HeatWave.

These updates included an interface driven by a large language model (LLM) that lets enterprise users interact with different aspects of the service in natural language, along with a new Vector Store, HeatWave Chat, and AutoML support for HeatWave Lakehouse.

Some of these updates, along with additional capabilities, have been combined to form the HeatWave Gen AI offering inside HeatWave, Oracle said, adding that all these capabilities and features are now generally available at no additional cost.

In-database LLM support to reduce cost

In a first among database vendors, Oracle has added support for LLMs inside a database, analysts said.

HeatWave Gen AI’s in-database LLM support, which relies on smaller LLMs with fewer parameters, such as Mistral-7B and Meta’s Llama 3-8B, running inside the database, is expected to reduce infrastructure costs for enterprises, they added.

“This approach not only reduces memory consumption but also enables the use of CPUs instead of GPUs, making it cost-effective, which given the cost of GPUs will become a trend at least in the short term until AMD and Intel catch up with Nvidia,” said Ron Westfall, research director at The Futurum Group.

Another reason to use smaller LLMs inside the database is the ability to have more influence on the model with fine-tuning, said David Menninger, executive director at ISG’s Ventana Research.

“With a smaller model the context provided via retrieval augmented generation (RAG) techniques has a greater influence on the results,” Menninger explained.

Westfall also cited IBM’s Granite models as an example, saying that using smaller models, especially for enterprise use cases, was becoming a trend.

The in-database LLMs, according to Oracle, will allow enterprises to search data, generate or summarize content, and perform RAG with HeatWave’s Vector Store.
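As a rough illustration of the RAG step mentioned above, the sketch below shows how passages retrieved from a vector store might be combined with a user’s question into a single prompt for an LLM. It is not HeatWave’s API: the function name `build_rag_prompt`, the prompt wording, and the sample passages are all hypothetical, and the passages are assumed to have been retrieved already.

```python
def build_rag_prompt(question, passages, max_passages=3):
    """Combine retrieved passages and a question into one prompt string.

    Hypothetical helper for illustration only; not part of any Oracle API.
    """
    # Keep only the top-ranked passages so the prompt stays small,
    # which matters for models with limited context windows.
    selected = passages[:max_passages]
    # Number each passage so the model (and a reader) can refer to it.
    context = "\n".join(f"[{i}] {p}" for i, p in enumerate(selected, start=1))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Made-up passages standing in for vector store search results.
retrieved = [
    "Refunds are issued within 14 days of the request.",
    "Invoices are sent on the first day of each month.",
]
prompt = build_rag_prompt("How long do refunds take?", retrieved)
print(prompt)
```

The resulting string would then be passed to whichever model generates the answer; because the retrieved context sits directly in the prompt, it shapes the response without any change to the model itself.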

Separately, HeatWave Gen AI also comes integrated with the company’s OCI Generative AI service, providing enterprises with access to pre-trained and other foundational models from LLM providers.

Rebranded Vector Store and scale-out vector processing

Over the last 12 months, a number of database vendors that didn’t already offer specialty vector databases have added vector capabilities to their wares, among them MongoDB, DataStax, Pinecone, and Cosmos DB for NoSQL. These additions let customers build AI and generative AI-based use cases over data stored in these databases without moving it to a separate vector store or database.

Oracle’s Vector Store, already showcased in September 2023, automatically creates embeddings after ingesting data in order to process queries faster.

Another capability added to HeatWave Gen AI is scale-out vector processing that will allow HeatWave to support VECTOR as a data type and in turn help enterprises process queries faster.

“Simply put, this is like adding RAG to a standard relational database,” Menninger said. “You store some text in a table along with an embedding of that text as a VECTOR data type. Then when you query, the text of your query is converted to an embedding. The embedding is compared to those in the table and the ones with the shortest distance are the most similar.”
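The lookup Menninger describes can be sketched in a few lines of Python. This is a toy under stated assumptions, not HeatWave’s implementation: the word-count `embed` function stands in for a real embedding model, the `TABLE` list stands in for a database table with a VECTOR column, and all names and sample texts are hypothetical.

```python
import math


def embed(text, vocabulary):
    """Toy embedding: count how often each vocabulary word appears in the text.

    A real system would call an embedding model here instead.
    """
    words = text.lower().split()
    return [float(words.count(term)) for term in vocabulary]


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_rows(query, rows, vocabulary, k=1):
    """Return the k rows whose stored embeddings are closest to the query."""
    query_vec = embed(query, vocabulary)
    ranked = sorted(
        rows, key=lambda row: euclidean_distance(query_vec, row["embedding"])
    )
    return ranked[:k]


# Hypothetical vocabulary and rows: each row stores the text and its embedding.
VOCABULARY = ["database", "vector", "invoice", "refund"]
TEXTS = [
    "vector search inside the database",
    "how to request a refund for an invoice",
]
TABLE = [{"text": t, "embedding": embed(t, VOCABULARY)} for t in TEXTS]

best = nearest_rows("refund my invoice", TABLE, VOCABULARY, k=1)
print(best[0]["text"])
```

The query text is embedded the same way as the stored rows, and the row with the shortest distance is treated as the most similar. Production systems typically use learned embeddings and approximate indexes rather than the full sort shown here.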

A graphical interface via HeatWave Chat

Another new capability added to HeatWave Gen AI is HeatWave Chat, a Visual Studio Code plug-in for MySQL Shell that provides a graphical interface for HeatWave Gen AI and lets developers ask questions in natural language or SQL.

The retention of chat history makes it easier for developers to refine search results iteratively, Menninger said.

HeatWave Chat also comes with a feature dubbed Lakehouse Navigator, which allows enterprise users to select files from object storage to create a new vector store.

This integration is designed to improve the experience and efficiency of developers and analysts building out a vector store, Westfall said.
