Dell Latitude E6410 Notebook| Quantity Available: 40+
This post is intended for businesses and other organizations interested... Read more →
Posted by Richy George on 23 May, 2024
This post was originally published on this siteRelational database provider EnterpriseDB (EDB) on Thursday introduced EDB Postgres AI, a new database aimed at transactional, analytical, and AI workloads.
EDB Postgres AI, which was internally named Project Beacon during its development, started its life as a data lakehouse project with support for Delta Live Tables and later evolved into a product that combines EDB’s PostgreSQL software and other components such as data lakehouse analytics into a singular unified offering.
PostgreSQL, which is an object-oriented relational database, has been gaining popularity because it is an open-source database with many potential providers and can be adapted to multiple workloads, said Matt Aslett, director at ISG’s Ventana Research.
“As a general-purpose database, PostgreSQL is suitable for both transactional and analytic applications,” Aslett explained.
The huge ecosystem of PostgreSQL-based databases that leverage the core technology and skills base makes PostgreSQL impossible to ignore, positioning it as a default standard enterprise-grade open-source database, experts said.
According to data from database knowledge base DB-Engines, PostgreSQL has been steadily rising in popularity and is currently the fourth most popular RDBMS (relational database management system) and fourth most popular database product overall in their rankings.
The constant rise in popularity has forced hyperscalers such as AWS, Google Cloud Platform, and Microsoft Azure to create database services built on PostgreSQL. Examples of these databases are AlloyDB, CitiusDB (PostgreSQL on Azure), Amazon Aurora, and Amazon RDS for PostgreSQL. Other rivals of EDB include YugabyteDB and CockroachDB.
The components of of EDB Postgres AI, which the company describes as an “intelligent data platform,” include a central management console with AI assistance, EDP Postgres databases, data lakehouse analytics, and AI/ML including vector database support.
The console, according to the company, provides a single pane to all EDB Postgres AI operations and helps manage the database landscape of an enterprise, in turn providing better observability. The console comes with an AI agent that can help enterprises manage on-premises databases.
EDB Postgres AI supports most databases that EDB offers including EDB Postgres, EDB Postgres Advanced Server, EDB Postgres Extended Server, and their respective distributed high availability versions. EDB Postgres Advanced Server also provides Oracle Database compatibility, the company said.
The lakehouse analytics module, according to EDB, brings structured and unstructured data together with the help of Nodes to be analyzed. Nodes support multiple formats, the company said, adding that it has built a custom store to support multiple data formats.
The AI/ML module includes vector support, which effectively gives the platform its capability to build AI-powered applications.
Additionally, Postgres AI comes with support for extensions such as Postgres Enterprise Manager, Barman, Query Advisor, and migration tools, such as the Migration Toolkit and the Migration Portal.
The Migration Portal, according to the company, is among the first EDB tools to include embedded AI via an AI copilot that can assist users in developing migration strategies.
The combination of these components or modules result in Postgres AI’s key capabilities such as rapid analytics, observability, vector support, high availability, and legacy modernization.
Explaining rapid analytics as a capability, EDB said that Postgres AI allows enterprises to spin up analytics clusters on demand.
“With EDB Postgres Lakehouse capabilities, operational data can be stored in a columnar format, optimizing it for fast analytics,” the company said in a statement.
EDB added that its acquisition of Splitgraph, a startup that provides a PostgreSQL-compatible serverless SQL API for building data-driven applications from multiple data sources, last year played a foundational role in building out the analytics capability.
The release of EDB Postgres AI saw the company partner with the likes of Red Hat, Nutanix, and SADA.
EDB’s collaboration with Red Hat will enable enterprises to build AI models on Red Hat OpenShift AI and deliver enterprise-grade, day-two operations with EDB Postgres AI, the company said.
EDB Postgres AI, according to Jozef de Vries, the chief engineering officer at EDB, is available as a managed service on AWS, GCP, and Azure.
“The analytics functionality is initially available only on AWS, with the other public clouds to follow soon,” Vries said, adding that Postgres AI can also be self-managed on a public cloud and private cloud environment of the customer’s choice.
EDB prices its EDB Postgres offerings on a per vCPU-hour basis. The company also provides a free tier across all of its database offerings.
The EDB Postgres offering costs $0.0856 per vCPU-hour, the company’s subscription listing showed. Other options, such as EDB Postgres Extended Server, EDB Postgres Advanced Server, and their distributed high availability versions cost $0.1655 per vCPU-hour, $0.2568 per vCPU-hour, $0.3424 per vCPU-hour, and $0.2511 per vCPU-hour respectively.
EDB Postgres AI, according to Aslett of Ventana Research, is being positioned to help enterprises bring AI capabilities to a variety of workloads regardless of deployment location.
“With EDB Postgres AI, EDB is addressing the requirement for data storage and processing both on-premises and in the cloud. This is important for AI-infused applications, which require high-performance AI inference at the point of interaction,” the research director explained.
Omdia’s chief analyst Bradley Shimmin sees the release of EDB Postgres AI as the repeat of the market branding mania from 1980s.
“It does seem like the 1980s with its ‘Turbo’ market branding mania, as we’re seeing a sudden and very pervasive influx of AI-branded products entering the market,” Shimmin said. Shimmin cited recent examples such as Red Hat Enterprise Linux AI and Oracle Database 23ai.
Shimmin said that he sees the EDB Postgres AI release as a mix of marketing hype and maturation of Postgres offerings.
“The Postgres database was certainly capable of supporting AI workloads before this release, plying vector data as a means of steering large language models away from confabulation and toward fact. What we are seeing from vendors like EDB with these AI branded releases is the vertical integration of functionality geared toward streamlining, simplifying, and accelerating AI-infused use cases,” the analyst explained.
The release of EDB Postgres AI coincides with EDB’s strategic move to a new corporate identity, EDB said.
Next read this:
Posted by Richy George on 22 May, 2024
This post was originally published on this siteThere appears to be many questions and few answers about MariaDB plc’s long-term strategy following an announcement that its shareholders have accepted an offer by California-based investment firm K1 Investment Management.
News that the company that provides database and SaaS services around the open-source database MariaDB had been acquired came on Monday, when it was announced that a trio of companies—K1; Meridian Bidco LLC, a K1 affiliate; and K5 Capital Advisors—“now have irrevocable shareholder support in respect of 68.51% of MariaDB shares.”
The company has had a litany of financial issues over the past 12 months, but when it released financial results for the quarter ended March 31 last week there was one bright spot: Its net loss had shrunk to $3.5 million, compared to a net Loss of $11.9 million a year earlier.
“We have demonstrated our ability to quickly turn our financial story around and are optimistic about the future performance of the business,” Paul O’Brien, CEO of MariaDB plc said in a statement accompanying the financial results.
As reported in InfoWorld in February, after going public in December 2022, the company saw its market capitalization plummet from $445 million to around $10 million by the end of 2023.
Carl Olofson, research vice president and database analyst with IDC, said that the key to determining what happens next is why the acquisition happened in the first place.
While executives at K1 and MariaDB plc have yet to comment on their future plans, Olofson said that “when you see something like this, there is one of two motivations. One is that you want to dismantle the company, and make a profit from the assets, which is not going to be the case here, because they do not really have assets.
“The other option is to really believe that with proper management, the right approach, the company can grow far beyond where it’s at now—make fabulous profits, sell it off and everybody walks away happy.”
In the open source database space, he said, there is a big difference when it comes to intellectual property (IP) between MariaDB and MySQL, the open-source database of which MariaDB is a fork. In the case of MySQL, the IP is owned by Oracle, but for MariaDB “it is owned by the MariaDB community, which is not part of the company. There is a clear distinction between MariaDB, the company, and MariaDB, the community.”
Olofson added that regardless of what happens to the corporate entity, the community will continue.
The open source project was created by Michael “Monty” Widenius, who was also a creator of MySQL. He set up the company Monty Program Ab which later became MariaDB, the company, and also the MariaDB Foundation, which is the custodian of the project’s open-source code.
Olofson spoke with Widenius briefly last year at a MariaDB user conference and described him as an “open source purist” who wants to “just put technology out there and let people do what they will with it.” But while that attitude might be great for users, it’s not good for MariaDB, the company: “That does not really help it in trying to survive commercially and meet payroll,” Olofson said.
As for the commercial side, while Olofson has no idea what K1’s corporate strategy might be moving forward, he said one option is to “go down the well-worn path of other open source software companies that are trying to make a living from the technology by what’s called an open core approach.”
That was already MariaDB’s strategy—it offers paid add-ons such as MaxScale, ColumnStore, or Galera Cluster, as well as consulting, migration and managed cloud services—“but they basically just ran out of money,” said Olofson.
“They were adding some really exciting and innovative features ,” he said, “and then they pulled back from some of that. I interpreted that as meaning they did not have the money to continue to support development.”
InfoWorld reached out to both MariaDB plc and K1 for comment, but at press time had not heard from either organization.
Next read this:
Posted by Richy George on 22 May, 2024
This post was originally published on this siteLift the hood on most business applications, and you’ll find they have some way to store and use structured data. Whether it’s a client-side app, an app with a web front end, or an edge-device app, chances are a business application needs a database. In many cases, an embedded database will do. Embedded databases are lightweight, compact, and portable—and for some applications, they are a better choice than a traditional server.
SQLite is an embeddable open source database, written in C and queryable with conventional SQL. SQLite is designed to be fast, portable, and reliable, whether you’re storing only kilobytes of data or multi-gigabyte blobs. We’ll take a look at SQLite, including where and when to use it and how it compares to alternatives such as MySQL, MariaDB, and other popular embedded databases.
The most common and obvious use case for SQLite is serving as a conventional, table-oriented relational database. SQLite supports transactions and atomic behaviors, so a program crash or even a power outage won’t leave you with a corrupted database. SQLite also has other features found in higher-end databases, such as full-text indexing, and support for large databases—up to 281 terabytes with row sizes up to 1GB.
SQLite also provides a fast and powerful way to store configuration data for a program. Instead of parsing a file format like JSON or YAML, a developer can use SQLite as an interface to those files—often far faster than operating on them manually. SQLite can work with in-memory data or external files (e.g., CSV files) as if they were native database tables, providing a handy way to query that data. It also natively supports JSON data, so data can be stored as JSON or queried in-place.
SQLite has many advantages, starting with its platform and language portability. Here are the main benefits of using SQLite:
SQLite is frequently compared to MySQL, the widely used open source database product that is a staple of today’s application stacks. As much as SQLite resembles MySQL, there are good reasons to favor one over the other, depending on the use case. The same is true for MariaDB, another popular database that is sometimes compared to SQLite.
SQLite has relatively few native data types—BLOB
, NULL
, INTEGER
, REAL
, and TEXT
. Both MySQL and MariaDB, on the other hand, have dedicated data types for dates and times, various precisions of integers and floats, and much more.
If you’re storing relatively few data types, or you want to use your data layer to perform validation on the data, SQLite is useful. However, if you want your data layer to provide its own validation and normalization, go with MySQL or MariaDB.
SQLite’s configuration and tuning options are minimal. Most of its internal or command-line flags deal with edge cases or backward compatibility. This fits with SQLite’s overall philosophy of simplicity: the default options are well-suited to most common use cases.
MySQL and MariaDB offer a veritable forest of database- and installation-specific configuration options—collations, indexing, performance tuning, storage engines, etc. The plethora of options is because these database products offer far more features. You may have to tweak them more, but it’s likely because you’re trying to do more in the first place.
SQLite is best suited for applications with a single concurrent user, such as in desktop or mobile apps. MySQL and MariaDB are designed to handle multiple concurrent users. They can also provide clustered and scale-out solutions, whereas SQLite can’t.
Some projects add scaling features to SQLite, although not as a direct substitute for MySQL or MariaDB. Canonical has created its own variant of SQLite, dqlite, designed to scale out across a cluster. Data is kept consistent across nodes by way of a Raft algorithm, and deploying dqlite has only marginally more administrative overhead than SQLite.
SQLite is far from the only embeddable database. Many others deliver similar features but emphasize different use cases or deployment models.
SQLite’s design choices make it well-suited for some scenarios but poorly suited for others. Here are some places where SQLite doesn’t work well:
Next read this:
Posted by Richy George on 21 May, 2024
This post was originally published on this siteMicrosoft released multiple updates to its database offerings at its Build 2024 conference.
One of the major updates to its database offerings includes the addition of vector search to Azure Cosmos DB for NoSQL.
Azure Cosmos DB for NoSQL, which is a non-relational database service, is a part of the larger Azure Cosmos DB database offering, a distributed database which implements a set of different consistency models enabling users to trade off performance against latency in their applications.
Fundamentally, NoSQL databases do away with SQL databases’ constraints of data types and consistency in order to achieve more speed, flexibility, and scale.
Azure Cosmos DB supports working with different data models, including APIs for MongoDB and Apache Cassandra. The database also has a flavor of PostgreSQL.
Last year at Build, Microsoft had introduced vector search to Cosmos DB, building on the company’s Cosmos DB for MongoDB vCore service.
The vector search in Cosmos DB for NoSQL, according to the company, is powered by DiskANN—a suite of scalable approximate nearest neighbor search algorithms that support real-time changes.
In addition, the company also made the Azure Database for PostgreSQL extension for Azure AI generally available in order to bring AI capabilities to data in PostgreSQL.
“This enables developers who prefer PostgreSQL to plug data directly into Azure AI for a simplified path to leverage LLMs and build rich PostgreSQL generative AI experiences,” the company said in a statement.
Additionally, the company said it is working to add an embeddings generation feature inside its Azure Database for PostgreSQL offering. The new feature, which is currently in preview, can generate embeddings within the database.
Further, the company said it was adding more capabilities to Copilot inside databases to help developers, including adding the ability to provide summaries of technical documentation in response to user questions inside Azure Database for MySQL.
In March, the company announced a private preview of Copilot in Azure SQL Database in order to offer natural language to SQL conversion, along with self-help for database administration.
Next read this:
Posted by Richy George on 16 May, 2024
This post was originally published on this siteIn May 1974, Donald Chamberlin and Raymond Boyce published a paper on SEQUEL, a structured query language that could be used to manage and sort data. After a change in title due to another company’s copyright on the word SEQUEL, Structured Query Language (SQL) was taken up by database companies like Oracle alongside their new-fangled relational database products later in the 1970s. The rest, as they say, is history.
SQL is now 50 years old. SQL was designed and then adopted around databases, and it has continued to grow and develop as a way to manage and interact with data. According to Stack Overflow, it is the third most popular language used by professional programmers on a regular basis. In 2023, the IEEE noted that SQL was the most popular language for developers to know when it came to getting a job, due to how it could be combined with other programming languages.
When you look at other older languages being used today, the likes of COBOL (launched in 1959) and FORTRAN (first compiled in 1958) are still going, too. While they can lead to well-paying roles, they are linked to existing legacy deployments rather than new and exciting projects. SQL, on the other hand, is still being used as part of work around AI, analytics, and software development. It continues to be the standard for how we interact with data on a daily basis.
When you look at SQL, you may ask why it has survived—even thrived—for so long. It is certainly not easy to learn, as it has a peculiar syntax that is very much of its time. The user experience around SQL can be challenging for new developers to pick up. Alongside this, every database vendor has to support SQL, but each also will have their own quirks or nuances in how they implement this support. Consequently, your approach for one database may not translate to another database easily, leading to both more work and more support requirements.
To make matters worse, it is easy to make mistakes in SQL that can have real and potentially catastrophic consequences. For example, missing a WHERE
clause in your instructions can cause you to delete an entire table rather than carrying out the transaction you want, leading to lost data and recovery work. Checking your logic and knowing how things work in practice is a necessary requirement.
So why is SQL still the leading way to work with data today, 50 years after it was first designed and released? SQL is based on strong mathematical theory, so it continues to perform effectively and support the use cases it was designed for. The truth is that when you combine SQL with relational databases, you can map the data that you create—and how you manage that data—to many business practices in a way that is reliable, effective, and scalable. Put simply, SQL works, and no replacement option has measured up in the same way.
As an example, SQL was the first programming language to return multiple rows per single request. This makes it easier to get data on what is taking place within a set of data—and consequently, within the business and its applications—and then turn it into something the business can use. Similarly, SQL made it easier to compartmentalize and segregate information into different tables, and then use the data in those tables for specific business tasks, such as putting customer data in one table and manufacturing data in another. The ability to perform transactions is the backbone of most processes today, and SQL made that possible at scale.
Another important reason for the success of SQL is that the language has always moved with the times. From its relational roots, SQL has added support for geographic information system (GIS) data, for JSON documents, and for XML and YAML over the years. This has kept SQL up to speed with how developers want to interact with data. Now, SQL can be combined with vector data, enabling developers to interact with data using SQL but carrying out vector searches for generative AI applications.
There have been attempts to replace SQL in the past. NoSQL (Not only SQL) databases were developed to replace relational databases and get away from the traditional models of working with and managing data at scale. However, rather than replacing SQL, these databases added their own SQL-like languages that replicated some of the methods and approaches that SQL has ingrained into how developers work.
In the past, natural language processing advocates have called for new methods that do away with SQL’s standardized and clunky approach. However, these attempts have ended up with methods that were just as clunky as what they tried to replace, which led to them being sidelined or ignored. Generative AI may take on more of the task of writing SQL for developers, as large language models have been exposed to large quantities of SQL code as part of their training. However, while this approach may develop and become more popular in time, it still relies on SQL for the actual interaction with those sets of data and to deliver the results back to the user. If anything, this will likely make SQL more important for the future, not less, even though it will be less visible to the developer.
Even if SQL ends up moving behind the curtain, it will continue to play a critical role in how we interact with and use data. With such a huge percentage of all our IT systems relying on data to function, SQL will not be going away any time soon. So, let’s celebrate SQL turning 50, and consider how we can continue to develop and use it in the future.
Charly Batista is PostgreSQL technical lead at Percona.
—
New Tech Forum provides a venue for technology leaders—including vendors and other outside contributors—to explore and discuss emerging enterprise technology in unprecedented depth and breadth. The selection is subjective, based on our pick of the technologies we believe to be important and of greatest interest to InfoWorld readers. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Send all inquiries to doug_dineley@foundryco.com.
Next read this:
Posted by Richy George on 15 May, 2024
This post was originally published on this siteMost people assume that analytical databases, or OLAPs, are big, powerful beasts—and they are correct. Systems like Snowflake, Redshift, or Postgres involve a lot of setup and maintenance, even in their cloud-hosted incarnations. But what if all you want is “just enough” analytics for a dataset on your desktop? In that case, DuckDB is worth exploring.
DuckDB is a tiny but powerful analytics database engine—a single, self-contained executable, which can run standalone or as a loadable library inside a host process. There’s very little you need to set up or maintain with DuckDB. In this way, it is more like SQLite than the bigger analytical databases in its class.
DuckDB is designed for column-oriented data querying. It ingests data from sources like CSV, JSON, and Apache Parquet, and enables fast querying using familiar SQL syntax. DuckDB supports libraries for all the major programming languages, so you can work with it programmatically using the language of your choice. Or you can use DuckDB’s command-line interface, either on its own or as part of a shell pipeline.
When you work with data in DuckDB, there are two modes you can use for that data. Persistent mode writes the data to disk so it can handle workloads bigger than system memory. This approach comes at the cost of some speed. In-memory mode keeps the data set entirely in memory, which is faster but retains nothing once the program ends. (SQLite can be used the same way.)
DuckDB can ingest data from a variety of formats. CSV, JSON, and Apache Parquet files are three of the most common. With CSV and JSON, DuckDB by default attempts to figure out the columns and data types on its own, but you can override that process as needed—for instance, to specify a format for a date column.
Other databases, like MySQL or Postgres, can also be used as data sources. You’ll need to load a DuckDB extension (more on this later) and provide a connection string to the database server; DuckDB doesn’t read the files for those databases directly. With SQLite, though, you connect to the SQLite database file as though it were just another data file.
To load data into DuckDB from an external source, you can use an SQL string, passed directly into DuckDB:
SELECT * FROM read_csv('data.csv');
You can also use methods in the DuckDB interface library for a given language. With the Python library for DuckDB, ingesting looks like this:
import duckdb
duckdb.read_csv("data.csv")
You can also query certain file formats directly, like Parquet:
SELECT * FROM 'test.parquet';
You can also issue file queries to create a persistent data view, which is usable as a table for multiple queries:
CREATE VIEW test_data AS SELECT * FROM read_parquet('test.parquet');
DuckDB has optimizations for working with Parquet files, so that it reads only what it needs from the file.
Other interfaces like ADBC and ODBC can also be used. ODBC serves as a connector for data visualization tools like Tableau.
Data imported into DuckDB can also be re-exported in many common formats: CSV, JSON, Parquet, Microsoft Excel, and others. This makes DuckDB useful as a data-conversion tool in a processing pipeline.
Once you’ve loaded data into DuckDB, you can query it using SQL expressions. The format for such expressions is no different from regular SQL queries:
SELECT * FROM users WHERE ID>1000 ORDER BY Name DESC LIMIT 5;
If you’re using a client API to query DuckDB, you can pass SQL strings through the API, or you can use the client’s relational API to build up queries programmatically. In Python, reading from a JSON file and querying it might look like this:
import duckdb
file = duckdb.read_json("users.json")
file.select("*").filter("ID>1000").order("Name").limit(5)
If you use Python, you can use the PySpark API to query DuckDB directly, although DuckDB’s implementation of PySpark doesn’t yet support the full feature set.
DuckDB’s dialect of SQL closely follows most common SQL dialects, although it comes with a few gratuitous additions for the sake of analytics. For instance, placing the SAMPLE clause in a query lets you run a query using only a subset of the data in a table. The resulting query runs faster but it may be less accurate. DuckDB also supports the PIVOT
keyword (for creating pivot tables), window functions and QUALIFY
clauses to filter them, and many other analytics functions in its SQL dialect.
DuckDB isn’t limited to the data formats and behaviors baked into it. Its extension API makes it possible to write third-party add-ons for DuckDB to support new data formats or other behaviors.
Some of the functionality included with DuckDB is implemented through first-party add-ons, like support for Parquet files. Others, like MySQL or Postgres connectivity, or vector similarity search, are also maintained by DuckDB’s team but provided separately.
Next read this:
Posted by Richy George on 2 May, 2024
This post was originally published on this siteMongoDB has made Atlas Stream Processing, a new capability it trailed last June, generally available, it announced at its MongoDB.local event in New York City.
It added Atlas Stream processing to its NoSQL Atlas database-as-a-service (DBaaS) in order to help enterprises manage real-time streaming data from multiple sources in a single interface.
The new interface that can process any kind of data and has a flexible data model, bypassing the need for developers to use multiple specialized programming languages, libraries, application programming interfaces (APIs), and drivers, while avoiding the complexity of using these multiple tools, the company said, adding that it can work with both streaming and historical data using the document model.
Atlas Search Nodes is also generally available on AWS and Google Cloud, although the capability is still in preview on Microsoft Azure. This too was showcased last year: It’s a new capability inside the Atlas database that isolates search workloads from database workloads in order to maintain database and search performance.
Users will have to wait for one new capability: Atlas Edge Server. This feature, now in preview, gives developers the capability to deploy and operate distributed applications in the cloud and at the edge, the company said. It provides a local instance of MongoDB with a synchronization server that runs on local or remote infrastructure and significantly reduces the complexity and risk involved in managing applications in edge environments, allowing applications to access operational data even with intermittent connections to the cloud.
One other MongoDB feature also entered general availability: its Vector Search integration with AWS’ generative AI service, Amazon Bedrock. This means that enterprises can use the integration to customize foundation large language models with real-time operational data by converting it into vector embeddings.
Further, enterprises can also use Agents for Amazon Bedrock for retrieval-augmented generation (RAG), the company said.
Next read this:
Posted by Richy George on 2 May, 2024
This post was originally published on this siteOracle is making the latest long-term support release version of its database offering — Database 23c — generally available for enterprises under the name Oracle Database 23ai.
The change in nomenclature can be attributed to the addition of new features to the database that are expected to help with AI-based application development among other tasks, the company said.
Database 23c, showcased for the first time at the company’s annual event in 2022, was released to developers in early 2023 before being released to enterprises, marking a shift in the company’s tradition for the first time.
Stiff competition from database rivals forced Oracle to shift its strategy for its databases business in favor of developers, who could offer the company a much-needed impetus for growth.
In September last year, Oracle said it was working on adding vector search capabilities to Database 23c at its annual CloudWorld conference.
These capabilities, dubbed AI Vector Search, included a new vector data type, vector indexes, and vector search SQL operators that enable the Oracle Database to store the semantic content of documents, images, and other unstructured data as vectors, and use these to run fast similarity queries, the company said.
AI Vector Search in Database 23c that has been passed onto 23ai along with other features, according to the company, also supports retrieval-augmented generation (RAG), a generative AI technique, that combines large language models (LLMs) and private business data to deliver responses to natural language questions.
Other notable features of Database 23c that have been passed onto 23ai include JSON Relational Duality, which unifies the relational and document data models, SQL support for Graph queries directly on OLTP data, and stored procedures in JavaScript, allowing developers to build applications in either relational or JSON paradigms.
Database 23ai, according to Oracle, will be available as a cloud service as well as on-premises through a variety of offerings, including Oracle Exadata Database Service, Oracle Exadata Cloud@Customer, and Oracle Base Database Service, as well as on Oracle Database@Azure.
While Oracle did not release Database 23ai’s pricing, the developer version of Database 23c continues to be free since its release.
The reason to offer Database 23c for free can be attributed to the company’s strategy to lower the barriers to the adoption of its database as rival database providers also add newer features, such as vector search, to support AI workloads.
Several database vendors, such as MongoDB, AWS, Google Cloud, Microsoft, Zilliz, DataStax, Pinecone, Couchbase, Snowflake, and SingleStore, have all added capabilities to support AI-based tasks.
Vector databases and vector search are two technologies that developers use to convert unstructured information into vectors, now more commonly called embeddings.
These embeddings, in turn, make storing, searching, and comparing the information easier, faster, and significantly more scalable for large datasets.
Next read this:
Posted by Richy George on 30 April, 2024
This post was originally published on this siteData has the potential to provide transformative business insights across various industries, yet harnessing that data presents significant challenges. Many businesses struggle with data overload, with vast amounts of data that are siloed and underutilized. How can organizations deal with large and growing volumes of data without sacrificing performance and operational efficiency? Another challenge is extracting insights from complex data. Traditionally, this work has required significant technical expertise, restricting access to specialized data scientists and analysts.
Recent AI breakthroughs in natural language processing are democratizing data access, enabling a wider range of users to query and interpret complex data sets. This broadened access helps organizations make informed decisions swiftly, capitalizing on the capability of AI copilots to process and analyze large-scale data in real time. AI copilots can also curb the high costs associated with managing large data sets by automating complex data processes and empowering less technical staff to undertake sophisticated data analysis, thus optimizing overall resource allocation.
Generative AI and large language models (LLMs) are not without their shortcomings, however. Most LLMs are built on general purpose, public knowledge. They won’t know the specific and sometimes confidential data of a particular organization. It’s also very challenging to keep LLMs up-to-date with ever-changing information. The most serious problem, however, is hallucinations—when the statistical processes in a generative model generate statements that simply aren’t true.
There’s an urgent need for AI that is more contextually relevant and less error-prone. This is particularly vital in predictive analytics and machine learning, where the quality of data can directly impact business outcomes.
TigerGraph CoPilot is an AI assistant that combines the powers of graph databases and generative AI to enhance productivity across various business functions, including analytics, development, and administration tasks. TigerGraph CoPilot allows business analysts, data scientists, and developers to use natural language to execute real-time queries against up-to-date data at scale. The insights can then be presented and analyzed through natural language, graph visualizations, and other perspectives.
TigerGraph CoPilot adds value to generative AI applications by increasing accuracy and reducing hallucinations. With CoPilot, organizations can tap the full potential of their data and drive informed decision-making across a spectrum of domains, including customer service, marketing, sales, data science, devops, and engineering.
TigerGraph CoPilot allows non-technical users to use their everyday speech to query and analyze their data, freeing them to focus on mining insights rather than having to learn a new technology or computer language. For each question, CoPilot employs a novel three-phase interaction with both the TigerGraph database and a LLM of the user’s choice, to obtain accurate and relevant responses.
The first phase aligns the question with the particular data available in the database. TigerGraph CoPilot uses the LLM to compare the question with the graph’s schema and replace entities in the question by graph elements. For example, if there is a vertex type of BareMetalNode and the user asks “How many servers are there?,” then the question will be translated to “How many BareMetalNode vertices are there?”
In the second phase, TigerGraph CoPilot uses the LLM to compare the transformed question with a set of curated database queries and functions in order to select the best match. Using pre-approved queries provides multiple benefits. First and foremost, it reduces the likelihood of hallucinations, because the meaning and behavior of each query has been validated. Second, the system has the potential of predicting the execution resources needed to answer the question.
In the third phase, TigerGraph CoPilot executes the identified query and returns the result in natural language along with the reasoning behind the actions. CoPilot’s graph-augmented natural language inquiry provides strong guardrails, mitigating the risk of model hallucinations, clarifying the meaning of each query, and offering an understanding of the consequences.
TigerGraph CoPilot also can create chatbots with graph-augmented AI on a user’s own documents. There’s no need to have an existing graph database. In this mode of operation, TigerGraph CoPilot builds a knowledge graph from source material and applies its unique variant of retrieval-augmented generation (RAG) to improve the contextual relevance and accuracy of answers to natural language questions.
First, when loading users’ documents, TigerGraph CoPilot extracts entities and relationships from document chunks and constructs a knowledge graph from the documents. Knowledge graphs organize information in a structured format, connecting data points through relationships. CoPilot will also identify concepts and build an ontology, adding semantics and reasoning to the knowledge graph, or users can provide their own concept ontology. Then, using this comprehensive knowledge graph, CoPilot performs hybrid retrievals, combining traditional vector search and graph traversals, to collect more relevant information and richer context to answer users’ questions.
Organizing the data as a knowledge graph allows a chatbot to access accurate, fact-based information quickly and efficiently, thereby reducing the reliance on generating responses from patterns learned during training, which can sometimes be incorrect or outdated.
TigerGraph CoPilot mitigates hallucinations by allowing LLMs to access the graph database via curated queries. It also adheres to the same role-based access control and security measures (already part of the TigerGraph database) to assure responsible AI. TigerGraph CoPilot also supports openness and transparency by open-sourcing its major components and allowing users to choose their LLM service.
By leveraging the TigerGraph database, TigerGraph CoPilot brings high performance to graph analytics. As a graph-RAG solution, it supports large-scale knowledge bases for knowledge graph-powered Q&A solutions.
Whether you are a business analyst, specialist, or investigator, TigerGraph CoPilot enables you to get information and insights quickly from your data. For example, CoPilot can generate reports for fraud investigators by answering questions like “Show me the list of recent fraud cases that were false positives.” CoPilot also facilitates more accurate investigations like “Who had transactions with account 123 in the past month with amounts larger than $1000?”
TigerGraph CoPilot can even answer “What if” questions by traversing your graph along dependencies. For example, you can easily find out “What suppliers can cover the shortage of part 123?” from your supply chain graph, or “What services would be affected by an upgrade to server 321” from your digital infrastructure graph.
TigerGraph CoPilot provides a complete solution for building Q&A chatbot on your own data and documents. Its knowledge graph-based RAG approach enables contextually accurate information retrieval that facilitates better answers and more informed decisions. CoPilot’s context-rich Q&A directly improves productivity and reduces costs in typical Q&A applications such as call centers, customer services, and knowledge search.
Furthermore, by merging a document knowledge graph and an existing business graph (e.g., product graph) into one intelligence graph, TigerGraph CoPilot can tackle problems that cannot be addressed by other RAG solutions. For example, by combining customers’ purchase history with product graphs, CoPilot can make more accurate personalized recommendations when customers type in their search queries or ask for recommendations. By combining patients’ medical history with healthcare graphs, doctors or health specialists can get more useful information about the patients to provide better diagnoses or treatments.
TigerGraph CoPilot addresses both the complex challenges associated with data management and analysis and the serious shortcomings of LLMs for business applications. By leveraging the power of natural language processing and advanced algorithms, organizations can unlock transformative business insights while navigating data overload and accessibility barriers. By tapping graph-based RAG, they can ensure the accuracy and relevance of LLM output.
CoPilot allows a wider range of users to leverage data effectively, driving informed decision-making and optimizing resource allocation across organizations. We believe it is a significant step forward in democratizing data access and empowering organizations to harness the full potential of their data assets.
Hamid Azzawe is CEO of TigerGraph.
—
Generative AI Insights provides a venue for technology leaders—including vendors and other outside contributors—to explore and discuss the challenges and opportunities of generative artificial intelligence. The selection is wide-ranging, from technology deep dives to case studies to expert opinion, but also subjective, based on our judgment of which topics and treatments will best serve InfoWorld’s technically sophisticated audience. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Contact doug_dineley@foundryco.com.
Next read this:
Posted by Richy George on 29 April, 2024
This post was originally published on this siteAt the end of March 2024, Mike Stonebraker announced in a blog post the release of DBOS Cloud, “a transactional serverless computing platform, made possible by a revolutionary new operating system, DBOS, that implements OS services on top of a distributed database.” That sounds odd, to put it mildly, but it makes more sense when you read the origin story:
The idea for DBOS (DataBase oriented Operating System) originated 3 years ago with my realization that the state an operating system must maintain (files, processes, threads, messages, etc.) has increased in size by about 6 orders of magnitude since I began using Unix on a PDP-11/40 in 1973. As such, storing OS state is a database problem. Also, Linux is legacy code at the present time and is having difficulty making forward progress. For example there is no multi-node version of Linux, requiring people to run an orchestrator such as Kubernetes. When I heard a talk by Matei Zaharia in which he said Databricks could not use traditional OS scheduling technology at the scale they were running and had turned to a DBMS solution instead, it was clear that it was time to move the DBMS into the kernel and build a new operating system.”
If you don’t know Stonebraker, he’s been a database-focused computer scientist (and professor) since the early 1970s, when he and his UC Berkeley colleagues Eugene Wong and Larry Rowe founded Ingres. Ingres later inspired Sybase, which was eventually the basis for Microsoft SQL Server. After selling Ingres to Computer Associates, Stonebraker and Rowe started researching Postgres, which later became PostgreSQL and also evolved into Illustra, which was purchased by Informix.
I heard Stonebraker talk about Postgres at a DBMS conference in 1980. What I got out of that talk, aside from an image of “jungle drums” calling for SQL, was the idea that you could add support for complex data types to the database by implementing new index types, extending the query language, and adding support for that to the query parser and optimizer. The example he used was geospatial information, and he explained one kind of index structure that would make 2D geometric database queries go very fast. (This facility eventually became PostGIS. The R-tree currently used by default in PostGIS GiST indexes wasn’t invented until 1984, so Mike was probably talking about the older quadtree index.)
Skipping ahead 44 years, it should surprise precisely nobody in the database field that DBOS uses a distributed version of PostgreSQL as its kernel database layer.
DBOS Transact, an open-source TypeScript framework, supports Postgres-compatible transactions, reliable workflow orchestration, HTTP serving using GET and POST, communication with external services and third-party APIs, idempotent requests using UUID keys, authentication and authorization, Kafka integration with exactly-once semantics, unit testing, and self-hosting. DBOS Cloud, a transactional serverless platform for deploying DBOS Transact applications, supports serverless app deployment, time-travel debugging, cloud database management, and observability.
Let’s highlight some major areas of interest.
The code shown in the screenshot below demonstrates transactions, as well as HTTP serving using GET. It’s worthwhile to read the code closely. It’s only 18 lines, not counting blank lines.
The first import (line 1) brings in the DBOS SDK classes that we’ll need. The second import (line 2) brings in the Knex.js SQL query builder, which handles sending the parameterized query to the Postgres database and returning the resulting rows. The database table schema is defined in lines 4 through 8; the only columns are a name
string and a greet_count
integer.
There is only one method in the Hello
class, helloTransaction
. It is wrapped in @GetApi
and @Transaction
decorators, which respectively cause the method to be served in response to an HTTP GET
request on the path /greeting/
followed by the username parameter you want to pass in and wrap the database call in a transaction, so that two instances can’t update the database simultaneously.
The database query string (line 16) uses PostgreSQL syntax to try to insert a row into the database for the supplied name and an initial count of 1. If the row already exists, then the ON CONFLICT
trigger runs an update operation that increments the count in the database.
Line 17 uses Knex.js to send the SQL query to the DBOS system database and retrieves the result. Line 18 pulls the count out of the first row of results and returns the greeting string to the calling program.
The use of SQL and a database for what feels like should be a core in-memory system API, such as a Linux atomic counter or a Windows interlocked variable, seems deeply weird. Nevertheless, it works.
When you run an application in DBOS Cloud it records every step and change it makes (the workflow) in the database. You can debug that using Visual Studio Code and the DBOS Time Travel Debugger extension. The time-travel debugger allows you to debug your DBOS application against the database as it existed at the time the selected workflow originally executed.
The DBOS Quickstart tutorial requires Node.js 20 or later and a PostgreSQL database you can connect to, either locally, in a Docker container, or remotely. I already had Node.js v20.9.0 installed on my M1 MacBook, but I upgraded it to v20.12.1 from the Node.js website.
I didn’t have PostgreSQL installed, so I downloaded and ran the interactive installer for v16.2 from EnterpriseDB. This installer creates a full-blown macOS server and applications. If I had used Homebrew instead, it would have created command-line applications, and if I had used Postgres.app, I would have gotten a menu-bar app.
The Quickstart proper starts by creating a DBOS app directory using Node.js.
martinheller@Martins-M1-MBP ~ % npx -y @dbos-inc/create@latest -n myapp Merged .gitignore files saved to myapp/.gitignore added 590 packages, and audited 591 packages in 25s found 0 vulnerabilities added 1 package, and audited 592 packages in 1s found 0 vulnerabilities added 129 packages, and audited 721 packages in 5s found 0 vulnerabilities Application initialized successfully!
Then you configure the app to use your Postgres server and export your Postgres password into an enviroment variable.
martinheller@Martins-M1-MBP ~ % cd myapp martinheller@Martins-M1-MBP myapp % npx dbos configure ? What is the hostname of your Postgres server? localhost ? What is the port of your Postgres server? 5432 ? What is your Postgres username? postgres martinheller@Martins-M1-MBP myapp % export PGPASSWORD=*********
After that, you create a “Hello” database using Node.js and Knex.js.
martinheller@Martins-M1-MBP myapp % npx dbos migrate 2024-04-09 15:01:42 [info]: Starting migration: creating database hello if it does not exist 2024-04-09 15:01:42 [info]: Database hello does not exist, creating... 2024-04-09 15:01:42 [info]: Executing migration command: npx knex migrate:latest 2024-04-09 15:01:43 [info]: Batch 1 run: 1 migrations 2024-04-09 15:01:43 [info]: Creating DBOS tables and system database. 2024-04-09 15:01:43 [info]: Migration successful!
With that complete, you build and run the DBOS app locally.
martinheller@Martins-M1-MBP myapp % npm run build npx dbos start > myapp@0.0.1 build > tsc 2024-04-09 15:02:30 [info]: Workflow executor initialized 2024-04-09 15:02:30 [info]: HTTP endpoints supported: 2024-04-09 15:02:30 [info]: GET : /greeting/:user 2024-04-09 15:02:30 [info]: DBOS Server is running at http://localhost:3000 2024-04-09 15:02:30 [info]: DBOS Admin Server is running at http://localhost:3001 ^C
At this point, you can browse to http://localhost:3000 to test the application. That done, you register for the DBOS Cloud and provision your own database there.
martinheller@Martins-M1-MBP myapp % npx dbos-cloud register -u meheller 2024-04-09 15:11:35 [info]: Welcome to DBOS Cloud! 2024-04-09 15:11:35 [info]: Before creating an account, please tell us a bit about yourself! Enter First/Given Name: Martin Enter Last/Family Name: Heller Enter Company: self 2024-04-09 15:12:06 [info]: Please authenticate with DBOS Cloud! Login URL: https://login.dbos.dev/activate?user_code=QWKW-TXTB 2024-04-09 15:12:12 [info]: Waiting for login... 2024-04-09 15:12:17 [info]: Waiting for login... 2024-04-09 15:12:22 [info]: Waiting for login... 2024-04-09 15:12:27 [info]: Waiting for login... 2024-04-09 15:12:32 [info]: Waiting for login... 2024-04-09 15:12:38 [info]: Waiting for login... 2024-04-09 15:12:44 [info]: meheller successfully registered! martinheller@Martins-M1-MBP myapp % npx dbos-cloud db provision iw_db -U meheller Database Password: ******** 2024-04-09 15:19:22 [info]: Successfully started provisioning database: iw_db 2024-04-09 15:19:28 [info]: {"PostgresInstanceName":"iw_db","HostName":"userdb-51fcc211-6ed3-4450-a90e-0f864fc1066c.cvc4gmaa6qm9.us-east-1.rds.amazonaws.com","Status":"available","Port":5432,"DatabaseUsername":"meheller","AdminUsername":"meheller"} 2024-04-09 15:19:28 [info]: Database successfully provisioned!
Finally, you can register and deploy your app in the DBOS Cloud.
martinheller@Martins-M1-MBP myapp % npx dbos-cloud app register -d iw_db 2024-04-09 15:20:09 [info]: Loaded application name from package.json: myapp 2024-04-09 15:20:09 [info]: Registering application: myapp 2024-04-09 15:20:11 [info]: myapp ID: d8806829-c5b8-4df0-8b5a-2d1bf87c3322 2024-04-09 15:20:11 [info]: Successfully registered myapp! martinheller@Martins-M1-MBP myapp % npx dbos-cloud app deploy 2024-04-09 15:20:35 [info]: Loaded application name from package.json: myapp 2024-04-09 15:20:35 [info]: Submitting deploy request for myapp 2024-04-09 15:21:09 [info]: Submitted deploy request for myapp. Assigned version: 1712676035 2024-04-09 15:21:13 [info]: Waiting for myapp with version 1712676035 to be available 2024-04-09 15:21:21 [info]: Successfully deployed myapp! 2024-04-09 15:21:21 [info]: Access your application at https://meheller-myapp.cloud.dbos.dev/
The “Hello” application does illustrate some of the core features of DBOS Transact and the DBOS Cloud, but it’s so basic that it’s barely a toy. The Programming Quickstart adds a few more details, and it’s worth your time to go through it. You’ll learn how to use communicator functions to access third-party services (email, in this example) as well as how to compose reliable workflows. You’ll literally interrupt the workflow and restart it without re-sending the email: DBOS workflows always run to completion and each of their operations executes once and only once. That’s possible because DBOS persists the output of each step in your database.
Once you’ve understood the programming Quickstart, you’ll be ready to try out the two DBOS demo applications, which do rise to the level of being toys. Both demos use Next.js for their front ends, and both use DBOS workflows, transactions, and communicators.
The first demo, E-Commerce, is a web shopping and payment processing system. It’s worthwhile reading the Under the Covers section of the README in the demo’s repository to understand how it works and how you might want to upgrade it to, for example, use a real-world payment provider.
The second demo, YKY Social, simulates a simple social network, and uses TypeORM rather than Knex.js for its database code. It also uses Amazon S3 for profile photos. If you’re serious about using DBOS yourself, you should work though both demo applications.
I have to say that DBOS and DBOS Cloud look very interesting. Reliable execution and time-travel debugging, for example, are quite desirable. On the other hand, I wouldn’t want to build a real application on DBOS or DBOS Cloud at this point. I have lots of questions, starting with “How does it scale in practice?” and probably ending with “How much will it cost at X scale?”
I mentioned earlier that DBOS code looks weird but works. I would imagine that any programming shop considering writing an application on it would be discouraged or even repelled by the “it looks weird” part, as developers tend to be set in their ways until what they are doing no longer works.
I also have to point out that the current implementation of DBOS is very far from the system diagram you saw near the beginning of this review. Where’s the minimal kernel? DBOS currently runs on macOS, Linux, and Windows. None of those are minimal kernels. DBOS Cloud currently runs on AWS. Again, not a minimal kernel.
So, overall, DBOS is a tantalizing glimpse of something that may eventually turn out to be cool. It’s new and shiny, and it comes from smart people, but it will be awhile before it could possibly become a mainstream system.
—
Cost: Free with usage limits; paid plans require you to contact sales.
Platform: macOS, Linux, Windows, AWS.
Next read this:
Copyright 2015 - InnovatePC - All Rights Reserved
Site Design By Digital web avenue