Posted by Richy George on 30 July, 2024
This post was originally published on this site.
In this Linux tip, we will try out the watch command. It runs another command repeatedly, overwriting the previous output each time, until you stop it with ^c (Ctrl + “c”). It’s handy when you’re sitting and waiting for some change in a command’s output.
By default, a command that is run through watch will run every two seconds. You can change the interval with the -n option. If you, for example, use the command “watch who”, the output will not change except for the date/time in the upper right corner – at least not until someone logs in or out of the system.
Every 2.0s: who fedora: Sat May 25 15:11:22 2024
fedora seat0 2024-05-25 14:24 (login screen)
fedora tty2 2024-05-25 14:24 (tty2)
shs pts/1 2024-05-25 14:25 (192.168.0.11)
Once another person logs in or someone logs out, a line will be added or removed from the list of logged in users.
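Assuming watch (from procps-ng) and timeout (from GNU coreutils) are available, you can try the interval and change-highlighting options like this:

```shell
# -n 1 sets a one-second refresh interval; -d highlights what changed
# between refreshes. Interactively you would run `watch -n 1 -d who` and
# press Ctrl+C to stop; here `timeout 3` ends the run after 3 seconds so
# the example terminates on its own.
timeout 3 watch -n 1 -d who || true
```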
Closing: Well, that’s your Linux tip for the watch command. It can be useful when you’re waiting for some change to happen on your Linux system.
If you have questions or would like to suggest a topic, please add a comment below. And don’t forget to subscribe to the InfoWorld channel on YouTube.
If you like this video, please hit the like and share buttons. For more Linux tips, be sure to follow us on Facebook, YouTube and NetworkWorld.com.
Jul 30, 2024 2 mins
Open Source
Posted by Richy George on 4 July, 2024
Oracle celebrated the beginning of July with the general availability of three releases of its open source database, MySQL: MySQL 8.0.38; the first update of its long-term support (LTS) version, MySQL 8.4; and the first major version of its 9.x innovation release, MySQL 9.0.
While the v8 releases are bug fixes and security releases only, MySQL 9.0 Innovation is a shiny new version with additional features, as well as some changes that may require attention when upgrading from a previous version.
The new 9.0 versions of MySQL Clients, Tools, and Connectors are also live, and Oracle recommends that they be used with MySQL Server 8.0 and 8.4 LTS as well as with 9.0 Innovation.
This initial 9.x Innovation release, Oracle says, is preparation for new features in upcoming releases. But it still contains useful additions, and users can upgrade to it from MySQL 8.4 LTS; the MySQL Configurator performs the upgrade automatically, without user intervention, during MSI installations on Windows.
The major changes include:
The insecure, aging SHA-1-based mysql_native_password authentication plugin, deprecated in MySQL 8.0, is gone, and the server now rejects mysql_native_password authentication requests from older client programs that lack the CLIENT_PLUGIN_AUTH capability. Before upgrading to 9.0, Oracle says, user accounts in 8.0 and 8.4 must be altered from mysql_native_password to caching_sha2_password.
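A hypothetical sketch of that pre-upgrade step (the account name and password below are placeholders, not from the source):

```sql
-- Find accounts still using the removed plugin...
SELECT user, host FROM mysql.user WHERE plugin = 'mysql_native_password';

-- ...and switch each one to the current default authentication plugin.
ALTER USER 'app_user'@'%' IDENTIFIED WITH caching_sha2_password BY 'new_password';
```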
In the Optimizer, ER_SUBQUERY_NO_1_ROW has been removed from the list of errors which are ignored by statements which include the IGNORE keyword. This change can make an UPDATE, DELETE, or INSERT statement which includes the IGNORE keyword raise errors if it contains a SELECT statement with a scalar subquery that produces more than one row.
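A sketch of the kind of statement affected; the table and column names are invented for illustration:

```sql
-- Under MySQL 8.x, IGNORE suppressed ER_SUBQUERY_NO_1_ROW here. Under 9.0,
-- the statement raises that error if the scalar subquery returns more than
-- one row.
UPDATE IGNORE orders
SET status = (SELECT status FROM order_audit WHERE order_id = orders.id)
WHERE id = 42;
```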
MySQL is now on a three-month release cadence, with major LTS releases every two years. In October, Oracle says we can expect bug and security releases MySQL 8.4.2 LTS and MySQL 8.0.39, and the MySQL 9.1 Innovation release, with new features as well as bug and security fixes.
Posted by Richy George on 2 July, 2024
Open-source vector database provider Qdrant has launched BM42, a vector-based hybrid search algorithm intended to provide more accurate and efficient retrieval for retrieval-augmented generation (RAG) applications. BM42 combines the best of traditional text-based search and vector-based search to lower the costs for RAG and AI applications, Qdrant said.
Qdrant’s BM42 was announced July 2. Traditional keyword search engines, using algorithms such as BM25, have been around for more than 50 years and are not optimized for the precise retrieval needed in modern applications, according to Qdrant. As a result they struggle with specific RAG demands, particularly with short segments that require further context for successful search and retrieval. Moving from keyword-based search to a fully vector-based approach sets a new industry standard, Qdrant said.
“BM42, for short texts which are more prominent in RAG scenarios, provides the efficiency of traditional text search approaches, plus the context of vectors, so is more flexible, precise, and efficient,” Andrey Vasnetsov, Qdrant CTO and co-founder, said. This helps to make vector search more universally applicable, he added.
Unlike traditional keyword-based search suited for long-form content, BM42 integrates sparse and dense vectors to pinpoint relevant information within a document. A sparse vector handles exact term matching, while dense vectors handle semantic relevance and deep meaning, according to the company.
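The sparse-plus-dense idea can be sketched as follows. This is not Qdrant’s BM42 implementation, just a minimal illustration of fusing exact term matching with embedding similarity (the 50/50 weighting is an assumption):

```python
import math

def sparse_score(query_terms, doc_terms):
    """Exact term matching: fraction of query terms present in the document."""
    doc = set(doc_terms)
    return sum(1 for t in query_terms if t in doc) / len(query_terms)

def dense_score(q_vec, d_vec):
    """Semantic relevance: cosine similarity between embedding vectors."""
    dot = sum(a * b for a, b in zip(q_vec, d_vec))
    norm = math.sqrt(sum(a * a for a in q_vec)) * math.sqrt(sum(b * b for b in d_vec))
    return dot / norm

def hybrid_score(query_terms, doc_terms, q_vec, d_vec, alpha=0.5):
    """Blend exact matching and semantic similarity; alpha is an assumed weight."""
    return (alpha * sparse_score(query_terms, doc_terms)
            + (1 - alpha) * dense_score(q_vec, d_vec))
```

In a real system the sparse side would come from an inverted index and the dense side from a trained embedding model; this sketch only shows how the two signals combine into one ranking score.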
Posted by Richy George on 26 June, 2024
Buoyed by customer demand, SingleStore, the company behind the relational database SingleStoreDB, has decided to natively integrate Apache Iceberg into its offering to help its enterprise customers make use of data stored in data lakehouses.
“With this new integration, SingleStore aims to transform the dormant data inside lakehouses into a valuable real-time asset for enterprise applications. Apache Iceberg, a popular open standard for data lakehouses, provides CIOs with cost-efficient storage and querying of large datasets,” said Dion Hinchcliffe, senior analyst at The Futurum Group.
Hinchcliffe pointed out that SingleStore’s integration includes updates that help its customers bypass the challenges that they may typically face when adopting traditional methods to make the data in Iceberg tables more immediate.
These challenges include complex, extensive ETL (extract, transform, load) workflows and compute-intensive Spark jobs.
Some of the key features of the integration are low-latency ingestion, bi-directional data flow, and real-time performance at lower costs, the company said.
Explaining how SingleStore achieves low latency across queries and updates, IDC research vice president Carl Olofson said that the company, formerly known as MemSQL, offers a memory-optimized, high-performance version of the relational database management system that uses memory features as a sort of cache.
“By doing so, the company can dramatically improve the speed with which Iceberg tables can be queried and updated,” Olofson explained, adding that the company might be proactively loading data from Iceberg into their internal memory-optimized format.
Before the Iceberg integration, SingleStore held data in a format optimized for rapid swapping into memory, where all data processing took place, the analyst said.
Several other database vendors, notably Databricks, have made attempts to adopt the Apache Iceberg table format due to its rising popularity with enterprises.
Earlier this month, Databricks agreed to acquire Tabular, the storage platform vendor led by the creators of Apache Iceberg, in order to promote data interoperability in lakehouses.
Another data lakehouse format, Delta Lake, developed by Databricks and later open sourced via The Linux Foundation, competes with Iceberg tables.
Currently, Databricks is working on another format that allows enterprises to use both Iceberg and Delta Lake tables.
Both Olofson and Hinchcliffe pointed out that several vendors and offerings — such as Google’s BigQuery, Starburst, IBM’s Watsonx.data, SAP’s DataSphere, Teradata, Cloudera, Dremio, Presto, Hive, Impala, StarRocks, and Doris — have integrated Iceberg as an open source analytics table format for very large datasets.
The native integration of Iceberg into SingleStoreDB is currently in public preview.
As part of the updates to SingleStoreDB, the company is adding new capabilities to its full-text search feature that improve relevance scoring, phonetic similarity, fuzzy matching, and keyword proximity-based ranking.
The combination of these capabilities allows enterprises to eliminate the need for additional specialty databases to build generative AI-based applications, the company explained.
Additionally, the company has introduced an autoscaling feature in public preview that allows enterprises to manage workloads or applications by scaling compute resources up or down.
It also lets users define thresholds for CPU and memory usage for autoscaling, to avoid any unnecessary consumption.
Further, the company said it is introducing a new deployment option for the database, Helios BYOC, a managed version of the database that runs in a customer’s virtual private cloud.
This offering is now available in private preview on AWS, and enterprise customers can run SingleStore in their own tenants while complying with data residency and governance policies, the company said.
Posted by Richy George on 26 June, 2024
Oracle is adding new generative AI-focused features to its HeatWave data analytics cloud service, previously known as MySQL HeatWave.
The new name highlights how HeatWave offers more than just MySQL support, and also includes HeatWave Gen AI, HeatWave Lakehouse, and HeatWave AutoML, said Nipun Agarwal, senior vice president of HeatWave at Oracle.
At its annual CloudWorld conference in September 2023, Oracle previewed a series of generative AI-focused updates for what was then MySQL HeatWave.
These updates included an interface driven by a large language model (LLM) that enables enterprise users to interact with different aspects of the service in natural language, a new Vector Store, HeatWave Chat, and AutoML support for HeatWave Lakehouse.
Some of these updates, along with additional capabilities, have been combined to form the HeatWave Gen AI offering inside HeatWave, Oracle said, adding that all these capabilities and features are now generally available at no additional cost.
In a first among database vendors, Oracle has added support for LLMs inside a database, analysts said.
HeatWave Gen AI’s in-database LLM support, which leverages smaller LLMs with fewer parameters such as Mistral-7B and Meta’s Llama 3-8B running inside the database, is expected to reduce infrastructure cost for enterprises, they added.
“This approach not only reduces memory consumption but also enables the use of CPUs instead of GPUs, making it cost-effective, which given the cost of GPUs will become a trend at least in the short term until AMD and Intel catch up with Nvidia,” said Ron Westfall, research director at The Futurum Group.
Another reason to use smaller LLMs inside the database is the ability to have more influence on the model with fine tuning, said David Menninger, executive director at ISG’s Ventana Research.
“With a smaller model the context provided via retrieval augmented generation (RAG) techniques has a greater influence on the results,” Menninger explained.
Westfall also gave the example of IBM’s Granite models, saying that the approach to using smaller models, especially for enterprise use cases, was becoming a trend.
The in-database LLMs, according to Oracle, will allow enterprises to search data, generate or summarize content, and perform RAG with HeatWave’s Vector Store.
Separately, HeatWave Gen AI also comes integrated with the company’s OCI Generative AI service, providing enterprises with access to pre-trained and other foundation models from LLM providers.
A number of database vendors that didn’t already offer specialty vector databases have added vector capabilities to their wares over the last 12 months—MongoDB, DataStax, Pinecone, and CosmosDB for NoSQL among them — enabling customers to build AI and generative AI-based use cases over data stored in these databases without moving data to a separate vector store or database.
Oracle’s Vector Store, already showcased in September, automatically creates embeddings after ingesting data in order to process queries faster.
Another capability added to HeatWave Gen AI is scale-out vector processing that will allow HeatWave to support VECTOR as a data type and in turn help enterprises process queries faster.
“Simply put, this is like adding RAG to a standard relational database,” Menninger said. “You store some text in a table along with an embedding of that text as a VECTOR data type. Then when you query, the text of your query is converted to an embedding. The embedding is compared to those in the table and the ones with the shortest distance are the most similar.”
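As a rough illustration of that description (the schema, the embedding dimension, and the DISTANCE syntax below are assumptions for illustration, not HeatWave’s documented API):

```sql
-- Text stored alongside its embedding as a VECTOR column.
CREATE TABLE docs (
  id INT PRIMARY KEY,
  body TEXT,
  body_vec VECTOR(384)  -- embedding of `body`
);

-- Return the three rows whose embeddings are closest to the query embedding:
-- the shortest distances are the most similar documents.
SELECT id, body
FROM docs
ORDER BY DISTANCE(body_vec, @query_vec, 'COSINE')
LIMIT 3;
```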
Another new capability added to HeatWave Gen AI is HeatWave Chat, a Visual Studio Code plug-in for MySQL Shell that provides a graphical interface for HeatWave Gen AI and enables developers to ask questions in natural language or SQL.
The retention of chat history makes it easier for developers to refine search results iteratively, Menninger said.
HeatWave Chat comes with another feature, dubbed the Lakehouse Navigator, which allows enterprise users to select files from object storage to create a new vector store.
This integration is designed to enhance user experience and efficiency of developers and analysts building out a vector store, Westfall said.
Posted by Richy George on 25 June, 2024
DataStax is updating its tools for building generative AI-based applications in an effort to ease and accelerate application development for enterprises, databases, and service providers.
One of these tools is Langflow, which DataStax acquired in April. It is an open source, web-based no-code graphical user interface (GUI) that allows developers to visually prototype LangChain flows and iterate them to develop applications faster.
LangChain is a modular framework for Python and JavaScript that simplifies the development of applications powered by generative AI large language models (LLMs).
According to the company’s Chief Product Officer Ed Anuff, the update to Langflow is a new version dubbed Langflow 1.0, which is the official open source release that comes after months of community feedback on the preview.
“Langflow 1.0 adds more flexible, modular components and features to support complex AI pipelines required for more advanced retrieval augmented generation (RAG) techniques and multi-agent architectures,” Anuff said, adding that Langflow’s execution engine was now Turing complete.
Turing completeness is a term used in computer science to describe a programmable system that can carry out or solve any computational problem.
Langflow 1.0 also comes with LangSmith integration that will allow enterprise developers to monitor LLM-based applications and perform observability on them, the company said.
A managed version of Langflow is also being made available via DataStax in a public preview.
“Astra DB environment details will be available in Langflow and users will be able to access Langflow via the Astra Portal, and usage will be free,” Anuff explained.
DataStax has also released a new version of RAGStack, its curated stack of open-source software for implementing RAG in generative AI-based applications using Astra DB Serverless or Apache Cassandra as a vector store.
The new version, dubbed RAGStack 1.0, comes with new features such as Langflow, Knowledge Graph RAG, and ColBERT among others.
The Knowledge Graph RAG feature, according to the company, provides an alternative way to retrieve information using a graph-based representation. This alternative method can be more accurate than vector-based similarity search alone with Astra DB, it added.
Other features include the introduction of Text2SQL and Text2CQL (Cassandra Query Language) to bring all kinds of data into the generative AI flow for application development.
While DataStax offers a separate non-managed version of RAGStack 1.0 under the name Luna for RAGStack, Anuff said that the managed version offers more value for enterprises.
“RAGStack is based on open source components, and you could take all of those projects and stitch them together yourself. However, we think there is a huge amount of value for companies in getting their stack tested and integrated for them, so they can trust that it will deliver at scale in the way that they want,” the chief product officer explained.
The company has also partnered with several other companies such as Unstructured to help developers extract and transform data to be stored in AstraDB for building generative AI-based applications.
“The partnership with Unstructured provides DataStax customers with the ability to use the latter’s capabilities to extract and transform data in multiple formats – including HTML, PDF, CSV, PNG, PPTX – and convert it into JSON files for use in AI initiatives,” said Matt Aslett, director at ISG’s Ventana Research.
Other partnerships include collaboration with the top embedding providers, such as OpenAI, Hugging Face, Mistral AI, and Nvidia among others.
Posted by Richy George on 13 June, 2024
According to research conducted by EDB, 35% of enterprise leaders will consider Postgres for their next project, and out of this group, the great majority believe that AI is going mainstream in their organization. Add to this that, for the first time ever, analytical workloads have begun to surpass transactional workloads.
Enterprises see the potential of Postgres to fundamentally transform the way they use and manage data, and they see AI as a huge opportunity and advantage. But the diverse data teams within these organizations face increasing fragmentation and complexity when it comes to their data. To operationalize data for AI apps, they demand better observability and control across the data estate, not to mention a solution that works seamlessly across clouds.
It’s clear that Postgres has the right to play and deliver for the AI generation of apps, and EDB has taken recent strides to do just this with the release of EDB Postgres AI, an intelligent platform for transactional, analytical, and AI workloads.
The new platform product offers a unified approach to data management and is designed to streamline operations across hybrid cloud and multi-cloud environments, meeting enterprises wherever they are in their digital transformation journey.
EDB Postgres AI helps elevate data infrastructure to a strategic technology asset, by bringing analytical and AI systems closer to customers’ core operational and transactional data—all managed through the popular open source database, Postgres.
Let’s take a look at the key features and advantages of EDB Postgres AI.
Analysts and data scientists need to launch critical new projects, and they need access to up-to-the-second transactional and operational data within their core Postgres databases. Yet these teams are often forced to default to clunky ETL or ELT processes that result in latency, data inconsistency, and quality issues that hamper the efficient extraction of insights.
EDB Postgres AI introduces a simple platform for deploying new analytics and data science projects rapidly, without the need for operationally expensive data pipelines and multiple platforms. EDB Postgres AI’s Lakehouse capabilities allow for the rapid execution of analytical queries on transactional data without impacting performance, all using the same intuitive interface. By storing operational data in a columnar format, EDB Postgres AI boosts query speeds by up to 30x compared to standard Postgres and reduces storage costs, making real-time analytics more accessible.
Even if data teams have made Postgres their primary database, chances are their data estate is still sprawled across a diverse mix of fully-managed and self-managed Postgres deployments. Managing these systems becomes increasingly difficult and costly, particularly when it comes to ensuring uptime, security and compliance.
The new capabilities of the recent EDB release will help customers create and deliver value greater than the sum of all the data parts, no matter where it is. EDB Postgres AI provides comprehensive observability tools that offer a unified view of Postgres deployments across different environments. This means that users can monitor and tune their databases, with automatic suggestions on improving query performance, AI-driven event detection and log analysis, and smart alerting when metrics exceed configurable thresholds.
With the surge in AI advancements, EDB sees a significant opportunity to enhance data management for our customers through AI integration. The strategy of the new platform is twofold: integrate AI capabilities into Postgres, and simultaneously, optimize Postgres for AI workloads.
Firstly, this release includes an AI-driven migration copilot, which is trained on EDB documentation and knowledge bases and helps answer common questions about migration errors including command line and schema issues, with instant error resolution and guidance tailored to database needs.
In addition, EDB remains focused on optimizing Postgres for AI workloads through support for vector data. With the pgvector extension and EDB’s pgai extension, the platform provides a single place for storing vector embeddings and performing similarity search, which is crucial for AI applications and simplifies the AI application pipeline for builders. This support allows developers to build sophisticated AI models directly within the Postgres ecosystem. The platform also lets users leverage the mature data management features of PostgreSQL, such as reliability with high availability, security with Transparent Data Encryption (TDE), and scalability with on-premises, hybrid, and cloud deployments.
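A small, hypothetical pgvector sketch (the table, data, and dimension are invented for illustration):

```sql
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE items (
  id bigserial PRIMARY KEY,
  content text,
  embedding vector(3)  -- real embeddings typically have hundreds of dimensions
);

INSERT INTO items (content, embedding) VALUES
  ('first document',  '[0.1, 0.2, 0.3]'),
  ('second document', '[0.9, 0.1, 0.0]');

-- Nearest neighbor by Euclidean distance (<->); pgvector also supports
-- cosine distance (<=>) and inner product (<#>).
SELECT content
FROM items
ORDER BY embedding <-> '[0.1, 0.2, 0.25]'
LIMIT 1;
```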
EDB Postgres AI transforms unstructured data management with its powerful new “retriever” functionality, which enables similarity search across vector data. The auto embedding feature automatically generates AI embeddings for data in Postgres tables, keeping them up to date via triggers. Coupled with the retriever’s ability to create embeddings for Amazon S3 data on demand, pgai provides a seamless way to make unstructured sources searchable by similarity. Users can also leverage a broad list of state-of-the-art encoder models from providers like Hugging Face and OpenAI. With just pgai.create_retriever() and pgai.retrieve(), developers gain vector similarity capabilities within their trusted Postgres database.
This dual approach ensures that Postgres becomes a comprehensive solution for both traditional and AI-driven data management needs.
EDB Postgres AI maintains the critical, enterprise-grade capabilities that EDB is known for. This includes the comprehensive Oracle Compatibility Mode, which helps customers break free from legacy systems while lowering TCO by up to 80% compared to legacy commercial databases. The product also supports EDB’s geo-distributed high-availability solutions, meaning customers can deploy multi-region clusters with five-nines availability to guarantee that data is consistent, timely, and complete—even during disruptions.
The release of EDB Postgres AI marks EDB’s 20th year as a leader of enterprise-grade Postgres and introduces the next evolution of the company—one even more proudly associated with Postgres. Why? Because we know that its flexibility and extensibility make Postgres uniquely positioned to solve the most complex and critical data challenges. Learn more about how EDB can help you use EDB Postgres AI for your most demanding applications.
Aislinn Shea Wright is VP of product management at EDB.
—
New Tech Forum provides a venue for technology leaders—including vendors and other outside contributors—to explore and discuss emerging enterprise technology in unprecedented depth and breadth. The selection is subjective, based on our pick of the technologies we believe to be important and of greatest interest to InfoWorld readers. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Send all inquiries to doug_dineley@foundryco.com.
Posted by Richy George on 12 June, 2024
Oracle is connecting its cloud to Google’s to offer Google customers high-speed access to database services. The move comes just nine months after it struck a similar deal with Microsoft to offer its database services on Azure. Separately, Microsoft is extending its Azure platform into Oracle’s cloud to give OpenAI access to more computing capacity on which to train its models.
“What started as a simple interconnect is becoming a more defined multicloud strategy for Oracle. The announcement is the beginning of a new trend—cloud providers are willing to work together to serve the needs of shared customers,” said Dave McCarthy, Research Vice President at IDC.
The Oracle-Google partnership will see the companies create a series of points of interconnect enabling customers of one to access services in the other’s cloud. Customers will be able to deploy general-purpose workloads with no cross-cloud data transfer charges, the companies said.
The two clouds will initially interconnect in 11 regions: Sydney, Melbourne, São Paulo, Montreal, Frankfurt, Mumbai, Tokyo, Singapore, Madrid, London, and Ashburn.
Oracle also plans to colocate its database hardware and software in Google’s datacenters, initially in North America and Europe, making it possible for joint customers to deploy, manage, and use Oracle database instances on Google Cloud without having to retool applications.
The two companies will market that service under the catchy name of Oracle Database@Google Cloud. Oracle Exadata Database Service, Oracle Autonomous Database Service, and Oracle Real Application Clusters (RAC) will all launch later this year across four regions: US East, US West, UK South, and Germany Central, with more planned later.
Oracle Database@Google Cloud customers will have access to a unified support service, and will be able to make purchases via the Google Cloud Marketplace using their existing Google Cloud commitments and Oracle license benefits.
The partnership with Google Cloud can be seen as a continuation of Oracle’s multicloud strategy that it started executing with the Microsoft partnership, analysts said, adding that Oracle expects that the new offerings will help many of its customers fully migrate from on-premises infrastructure to the cloud.
By adopting a multicloud approach, Oracle avoids going head-to-head with “entrenched” cloud providers. Instead, McCarthy said, Oracle is leveraging its strengths in data management to solve problems that other cloud providers cannot.
Oracle may have been swayed by the experience of its partnership with Microsoft Azure, dbInsight’s chief analyst Tony Baer said. Although AWS may have been a more obvious target to partner with next due to its reach, Google Cloud was probably “more hungry” for a partnership, he said.
McCarthy expects AWS to soon start exploring a similar partnership with Oracle, as the Azure and Google Cloud partnerships will put pressure on the hyperscaler.
“AWS faces the same challenges as the other clouds when it comes to Oracle workloads. I expect this increased competition from Azure and Google Cloud will force them to explore a similar route,” he said, adding that migrating Oracle workloads has always been tricky and cloud providers need to offer the combination of Oracle’s hardware and software to allow enterprises to unlock top notch performance across workloads.
Separately, Oracle is partnering with Microsoft to provide additional capacity for OpenAI by extending Microsoft’s Azure AI platform to Oracle Cloud Infrastructure (OCI).
“OCI will extend Azure’s platform and enable OpenAI to continue to scale,” said OpenAI CEO Sam Altman in a statement.
This partnership, according to independent semiconductor technology analyst Mike Demler, is all about increasing compute capacity.
“OpenAI runs on Microsoft’s Azure AI platform, and the models they’re creating continue to grow in size exponentially from one generation to the next,” Demler said.
While GPT-3 uses 175 billion parameters, the latest GPT-MoE (Mixture of Experts) is 10 times that large, with 1.8 trillion parameters, the independent analyst said, adding that the latter needs a lot more GPUs than Microsoft alone can supply in its cloud platform.
Posted by Richy George on 29 May, 2024
Like every other programming environment, you need a place to store your data when coding in the browser with JavaScript. Beyond simple JavaScript variables, there are a variety of options ranging in sophistication, from using localStorage to cookies to IndexedDB and the service worker cache API. This article is a quick survey of the common mechanisms for storing data in your JavaScript programs.
You are probably already familiar with JavaScript’s set of highly flexible variable types. We don’t need to review them here; they are very powerful and capable of modeling any kind of data from the simplest numbers to intricate cyclical graphs and collections.
The downside of using variables to store data is that they are confined to the life of the running program. When the program exits, the variables are destroyed. Of course, they may be destroyed before the program ends, but the longest-lived global variable will vanish with the program. In the case of the web browser and its JavaScript programs, even a single click of the refresh button annihilates the program state. This fact drives the need for data persistence; that is, data that outlives the life of the program itself.
An additional complication with browser JavaScript is that it runs in a sandboxed environment. It doesn’t have direct access to the operating system the way an installed application does; a JavaScript program relies on the agency of the browser APIs it runs within.
The other end of the spectrum from using built-in variables to store JavaScript data is sending the data off to a server. You can do this readily with a fetch() POST request. Provided everything works out on the network and the back-end API, you can trust that the data will be stored and made available in the future with another GET request.
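As a minimal sketch of the POST half of that round trip: the helper below builds the options object for a JSON POST so it can be inspected separately from the network call. The /api/notes endpoint in the usage comment is a hypothetical example, not part of any real API.

```javascript
// Build the fetch() options for a JSON POST. Splitting this out keeps
// the serialization logic testable without hitting the network.
function buildPostOptions(data) {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(data),
  };
}

// Usage in the browser (assumes a back-end route exists at /api/notes):
// fetch("/api/notes", buildPostOptions({ text: "hello" }))
//   .then((res) => res.json())
//   .then((saved) => console.log(saved));
```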
So far, we’re choosing between the transience of variables and the permanence of server-side persistence. Each approach has a particular profile in terms of longevity and simplicity. But a few other options are worth exploring.
There are two types of built-in “web storage” in modern browsers: localStorage and sessionStorage. These give you convenient access to longer-lived data. Both give you a key-value store, and each has its own lifecycle that governs how the data is handled:
localStorage saves a key-value pair that survives across page loads on the same domain.
sessionStorage operates similarly to localStorage, but the data lasts only as long as the page session.
In both cases, values are coerced to strings, meaning that a number will become a string version of itself and an object will become “[object Object]”. That’s obviously not what you want, so if you want to save an object, use JSON.stringify() when storing and JSON.parse() when retrieving.
Both localStorage and sessionStorage use setItem and getItem to set and retrieve values:
localStorage.setItem("foo", "bar");
localStorage.getItem("foo"); // returns "bar"
You can most clearly see the difference between the two by setting a value in each, closing the browser tab, then reopening a tab on the same domain and checking for your values. The value saved with localStorage will still exist, whereas the sessionStorage value will be null. You can use the devtools console to run this experiment:
localStorage.setItem("foo", "bar");
sessionStorage.setItem("foo", "bar");
// close the tab, reopen it
localStorage.getItem("foo"); // returns "bar"
sessionStorage.getItem("foo"); // returns null
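Because web storage coerces everything to a string, a small pair of JSON helpers keeps objects intact across the round trip. The names setJSON and getJSON below are hypothetical (they are not part of the Web Storage API); the helpers accept any Storage-like object with getItem/setItem, so the same code works for both localStorage and sessionStorage.

```javascript
// Store a value as JSON so objects survive string coercion.
function setJSON(storage, key, value) {
  storage.setItem(key, JSON.stringify(value));
}

// Read a value back, parsing the JSON; returns null for missing keys.
function getJSON(storage, key) {
  const raw = storage.getItem(key);
  return raw === null ? null : JSON.parse(raw);
}

// In the browser:
// setJSON(localStorage, "user", { name: "Pat", visits: 3 });
// getJSON(localStorage, "user"); // the object, not "[object Object]"
```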
Whereas localStorage and sessionStorage are tied to the page and domain, cookies give you a longer-lived option tied to the browser itself. They also use key-value pairs. Cookies have been around for a long time and are used for a wide range of cases, including ones that are not always welcome. Cookies are useful for tracking values across domains and sessions. They have specific expiration times, but the user can choose to delete them anytime by clearing their browser history.
Cookies are attached to requests and responses with the server, and can be modified (with restrictions governed by rules) by both the client and the server. Handy libraries like JavaScript Cookie simplify dealing with cookies.
Cookies are a bit funky when used directly, which is a legacy of their ancient origins. They are set for the domain on the document.cookie property, in a format that includes the value, the expiration time (an HTTP date string, like the one Date.prototype.toUTCString() produces), and the path. If no expiration is set, the cookie vanishes when the browser is closed. The path determines which paths on the domain the cookie is valid for.
Here’s an example of setting a cookie value:
document.cookie = "foo=bar; expires=Wed, 18 Dec 2024 12:00:00 GMT; path=/";
And to recover the value:
function getCookie(cname) {
  const name = cname + "=";
  const decodedCookie = decodeURIComponent(document.cookie);
  const ca = decodedCookie.split(';');
  for (let i = 0; i < ca.length; i++) {
    let c = ca[i];
    while (c.charAt(0) === ' ') {
      c = c.substring(1);
    }
    if (c.indexOf(name) === 0) {
      return c.substring(name.length, c.length);
    }
  }
  return "";
}
const cookieValue = getCookie("foo");
console.log("Cookie value for 'foo':", cookieValue);
In the above, we use decodeURIComponent to unpack the cookie string and then split it on its separator character, the semicolon (;), to access its component parts. To find the value, we match on the name of the cookie plus the equals sign.
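The same splitting logic can be generalized to return every cookie at once. The parseCookies() helper below is a hypothetical sketch (it takes the cookie string as an argument so it can be exercised outside the browser, and it assumes values don't themselves contain encoded semicolons):

```javascript
// Parse a document.cookie-style string into a key-value object.
function parseCookies(cookieString) {
  const result = {};
  if (!cookieString) return result;
  const parts = decodeURIComponent(cookieString).split(";");
  for (let part of parts) {
    part = part.trim(); // strip the space after each semicolon
    const eq = part.indexOf("=");
    if (eq > 0) {
      result[part.substring(0, eq)] = part.substring(eq + 1);
    }
  }
  return result;
}

// In the browser: parseCookies(document.cookie)
```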
An important consideration with cookies is security, specifically cross-site scripting (XSS) and cross-site request forgery (CSRF) attacks. (Setting HttpOnly on a cookie makes it only accessible on the server, which increases security but eliminates the cookie’s utility on the browser.)
IndexedDB is the most elaborate and capable in-browser data store. It's also the most complicated. IndexedDB uses asynchronous calls to manage operations. That's good because it lets you avoid blocking the thread, but it also makes for a somewhat clunky developer experience.
IndexedDB is really a full-blown object-oriented database. It can handle large amounts of data, modeled essentially like JSON. It supports sophisticated querying, sorting, and filtering. It's also available in service workers as a reliable persistence mechanism between thread restarts and between the main and worker threads.
When you create an object store in IndexedDB, it is associated with the domain and lasts until the user deletes it. It can be used as an offline datastore to handle offline functionality in progressive web apps, in the style of Google Docs.
To get a flavor of using IndexedDB, here's how you might create a new store:
let db = null; // A handle for the DB instance
let request = indexedDB.open("MyDB", 1); // Try to open the "MyDB" instance (async operation)
request.onupgradeneeded = function(event) { // Fires when MyDB is new or its schema version has changed
  db = event.target.result; // Set the DB handle to the result of the event
  if (!db.objectStoreNames.contains("myObjectStore")) { // If myObjectStore doesn't exist, create it
    let tasksObjectStore = db.createObjectStore("myObjectStore", { autoIncrement: true });
  }
};
The onsuccess handler fires when the database is successfully opened. It will fire without onupgradeneeded firing if the database and object store already exist. In either case, we save the db reference:
request.onsuccess = function(event) { db = event.target.result; };
If an error occurs, onerror fires instead:
request.onerror = function(event) { console.log("Error in db: " + event); };
The above IndexedDB code is simple (it just opens or creates a database and object store), but it gives you a sense of IndexedDB's asynchronous nature.
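Once the store exists, reads and writes go through transactions, which are also asynchronous. Here's a sketch (assuming the db handle and "myObjectStore" from the code above) that wraps the request events in promises so records can be added and read back with async/await:

```javascript
// Add a value; the autoIncrement store generates the key, which the
// promise resolves with.
function addRecord(db, value) {
  return new Promise((resolve, reject) => {
    const tx = db.transaction("myObjectStore", "readwrite");
    const req = tx.objectStore("myObjectStore").add(value);
    req.onsuccess = () => resolve(req.result); // req.result is the new key
    req.onerror = () => reject(req.error);
  });
}

// Read a record back by key.
function getRecord(db, key) {
  return new Promise((resolve, reject) => {
    const tx = db.transaction("myObjectStore", "readonly");
    const req = tx.objectStore("myObjectStore").get(key);
    req.onsuccess = () => resolve(req.result);
    req.onerror = () => reject(req.error);
  });
}

// Usage in the browser:
// const key = await addRecord(db, { task: "write article", done: false });
// const record = await getRecord(db, key);
```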
Service workers include a specialized data storage mechanism called cache. Cache makes it easy to intercept requests, save responses, and modify them if necessary. It’s primarily designed to cache responses (as the name implies) for offline use or to optimize response times. This is something like a customizable proxy cache in the browser that works transparently from the viewpoint of the main thread.
Here’s a look at caching a response using a cache-first strategy, wherein you try to get the response from the cache first, then fall back to the network (saving the response to the cache):
self.addEventListener('fetch', (event) => {
  const request = event.request;
  // Try serving assets from cache first
  event.respondWith(
    caches.match(request)
      .then((cachedResponse) => {
        // If found in cache, return the cached response
        if (cachedResponse) {
          return cachedResponse;
        }
        // If not in cache, fetch from network
        return fetch(request)
          .then((response) => {
            // Clone the response for potential caching
            const responseClone = response.clone();
            // Cache the new response for future requests
            caches.open('my-cache')
              .then((cache) => {
                cache.put(request, responseClone);
              });
            return response;
          });
      })
  );
});
This gives you a highly customizable approach because you have full access to the request and response objects.
We’ve looked at the commonly used options for persisting data in the browser, each with its own profile of longevity and complexity. When deciding which one to use, a useful rule of thumb is: what is the simplest option that meets my needs? Another concern is security, especially with cookies.
Other interesting possibilities are emerging around using WebAssembly for persistent storage. Wasm's near-native execution speed could give storage-heavy applications a performance boost. We'll look at using Wasm for data persistence another day.