Category Archives: Tech Reviews

Deno vs. Node.js: Which is better?

Posted on 10 August, 2022

This post was originally published on this site

In this article, you’ll learn about Node.js and Deno, the differences between CommonJS and ECMAScript modules, using TypeScript with Deno, and faster deployments with Deno Deploy. We’ll conclude with notes to help you decide between using Node.js or Deno for your next development project.

What is Node.js?

Node.js is a cross-platform JavaScript runtime environment that is useful for both servers and desktop applications. It runs a single-threaded event loop registered with the system to handle connections, and each new connection causes a JavaScript callback function to fire. The callback function can handle requests with non-blocking I/O calls. If necessary, it can spawn threads from a pool to execute blocking or CPU-intensive operations and to balance the load across CPU cores.

Node’s approach to scaling with callback functions requires less memory to handle more connections than most competitive architectures that scale with threads, including Apache HTTP Server, the various Java application servers, IIS and ASP.NET, and Ruby on Rails.

Node applications aren’t limited to pure JavaScript. You can use any language that transpiles to JavaScript; for example, TypeScript and CoffeeScript. Node.js incorporates the Google Chrome V8 JavaScript engine, which supports ECMAScript 2015 (ES6) syntax without any need for an ES6-to-ES5 transpiler such as Babel.

Much of Node’s utility comes from its large package library, which is accessible from the npm command. NPM, the Node Package Manager, is part of the standard Node.js installation, although it has its own website.

The JavaScript-based Node.js platform was introduced by Ryan Dahl in 2009. It was developed as a more scalable alternative to the Apache HTTP Server for Linux and macOS. NPM, written by Isaac Schlueter, launched in 2010. A native Windows version of Node.js debuted in 2011.

What is Deno?

Deno is a secure runtime for JavaScript and TypeScript that has been extended for WebAssembly, JavaScript XML (JSX), and its TypeScript extension, TSX. Developed by the creator of Node.js, Deno is an attempt to reimagine Node to leverage advances in JavaScript since 2009, including the TypeScript compiler.

Like Node.js, Deno is essentially a shell around the Google V8 JavaScript engine. Unlike Node, it includes the TypeScript compiler in its executable image. Dahl, who created both runtimes, has said that Node.js suffers from three major issues: a poorly designed module system based on centralized distribution; lots of legacy APIs that must be supported; and lack of security. Deno fixes all three problems.

Node’s module system problem was solved by an update in mid-2022.

CommonJS and ECMAScript modules

When Node was created, the de facto standard for JavaScript modules was CommonJS, which is what npm originally supported. Since then, the ECMAScript committee has officially blessed ECMAScript modules, also known as ES modules, which are supported by the jspm package manager. Deno also supports ES modules.

Experimental support for ES modules was added in Node.js 12.12 and is stable from Node.js 16 forward. TypeScript 4.7 also supports ES modules for Node.js 16.

The way to load a CommonJS module in JavaScript is to use the require statement. The way to load an ECMAScript module is to use an import statement along with a matching export statement.

The latest Node.js has loaders for both CommonJS and ES modules. How are they different? The CommonJS loader is fully synchronous; is responsible for handling require() calls; supports folders as modules; and tries adding extensions (.js, .json, or .node) if one was omitted from the require() call. The CommonJS loader cannot be used to load ECMAScript modules. The ES modules loader is asynchronous; is responsible for handling both import statements and import() expressions; does not support folders as modules (directory indexes such as ./startup/index.js must be fully specified); does not search for extensions; and accepts only .js, .mjs, and .cjs extensions for JavaScript text files. ES modules can be used to load JavaScript CommonJS modules.

Why Deno is better for security

Deno improves on Node's security mainly because, by default, it does not let a program access the disk, the network, subprocesses, or environment variables. When you need to allow any of these, you can opt in with a command-line flag, which can be as granular as you like; for example, --allow-read=/tmp. Another security improvement in Deno is that it always dies on uncaught errors. Node, by contrast, will allow execution to proceed after an uncaught error, with unpredictable results.

Can you use Node.js and Deno together?

As you consider whether to use Node.js or Deno for your next server-side JavaScript project, you’ll probably wonder whether you can combine them. The answer to that is a definite “maybe.”

First off, many times, using Node packages from Deno just works. Even better, there are workarounds for many of the common stumbling blocks. These include using the std/node modules of the Deno standard library to “polyfill” the built-in modules of Node; using CDNs to access the vast majority of npm packages in ways that work under Deno; and using import maps. Moreover, Deno has a Node compatibility mode starting with Deno 1.15.

On the downside, Node’s plugin system is incompatible with Deno; Deno’s Node compatibility mode doesn’t support TypeScript; and a few built-in Node modules (such as vm) are incompatible with Deno.

 If you’re a Node user thinking of switching to Deno, here’s a cheat sheet to help you.

Using TypeScript with Deno

Deno treats TypeScript as a first-class language, just like JavaScript or WebAssembly. It converts TypeScript (as well as TSX and JSX) into JavaScript, using a combination of the TypeScript compiler, which is built into Deno, and a Rust library called swc. When the code has been type-checked (if checking is enabled) and transformed, it is stored in a cache. In other words, unlike Node.js or a browser, you don’t need to manually transpile your TypeScript for Deno with the tsc compiler.

As of Deno 1.23, there is no TypeScript type-checking in Deno by default. Since most developers interact with the type-checker through their editor, type-checking again when Deno starts up doesn’t make a lot of sense. That said, you can enable type-checking with the --check flag to Deno.

Deno Deploy for faster deployments

Deno Deploy is a distributed system that allows you to run JavaScript, TypeScript, and WebAssembly close to users, at the edge, worldwide. Deeply integrated with the V8 runtime, Deno Deploy servers provide minimal latency and eliminate unnecessary abstractions. You can develop your script locally using the Deno CLI, and then deploy it to Deno Deploy’s managed infrastructure in less than a second, with no need to configure anything.

Built on the same modern systems as the Deno CLI, Deno Deploy provides the latest and greatest in web technologies in a globally scalable way:

  • Builds on the web: Use fetch, WebSocket, or a URL just like in the browser.
  • Built-in support for TypeScript and JSX: type-safe code, and intuitive server-side rendering without a build step.
  • Web-compatible ECMAScript modules: Import dependencies just like in a browser, without the need for explicit installations.
  • GitHub integration: Push to a branch, review a deployed preview, and merge to release to production.
  • Extremely fast: Deploy in less than a second; serve globally close to users.
  • Deploy from a URL: Deploy code with nothing more than a URL.

Deno Deploy has two tiers. The free tier is limited to 100,000 requests per day, 100 GiB data transfer per month, and 10ms CPU time per request. The pro tier costs $10 per month including 5 million requests per month and 100 GiB data transfer, plus $2-per-million additional requests per month and $0.30/GiB data transfer over the included quota; the pro tier allows 50ms CPU time per request.
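To make the pro-tier pricing concrete, here is a minimal Python sketch of the arithmetic. The dollar figures come from the paragraph above; the function name, the example traffic numbers, and the assumption that overage is charged pro rata (rather than per whole million requests or whole GiB) are mine, so treat the result as an estimate rather than a bill.

```python
# Pro-tier figures as quoted in the article; everything else is illustrative.
BASE_FEE_USD = 10.00
INCLUDED_REQUESTS = 5_000_000
INCLUDED_TRANSFER_GIB = 100
USD_PER_EXTRA_MILLION_REQUESTS = 2.00
USD_PER_EXTRA_GIB = 0.30


def pro_tier_monthly_cost(requests: int, transfer_gib: float) -> float:
    """Estimate a monthly pro-tier bill, ASSUMING overage is charged pro rata."""
    extra_requests = max(0, requests - INCLUDED_REQUESTS)
    extra_gib = max(0.0, transfer_gib - INCLUDED_TRANSFER_GIB)
    cost = (
        BASE_FEE_USD
        + extra_requests / 1_000_000 * USD_PER_EXTRA_MILLION_REQUESTS
        + extra_gib * USD_PER_EXTRA_GIB
    )
    return round(cost, 2)


if __name__ == "__main__":
    # Hypothetical month: 8 million requests and 150 GiB of data transfer.
    print(pro_tier_monthly_cost(8_000_000, 150))  # 10 + 3*2 + 50*0.30 = 31.0
```

Under these assumptions, a month that stays inside the included quota costs the flat $10, and the overage terms only kick in past 5 million requests or 100 GiB.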

Which to choose: Node.js or Deno?

As you might expect, the answer of which technology is better for your use case depends on many factors. My bottom line: If you have an existing Node.js deployment that isn’t broken, then don’t fix it. If you have a new project that you intend to write in TypeScript, then I’d strongly consider Deno. However, if your TypeScript project needs to use multiple Node.js packages that do not have Deno equivalents, you will need to weigh the Deno project’s feasibility. Starting with a proof-of-concept is pretty much mandatory: It’s hard to predict whether you can make a given Node.js package work in Deno without trying it.

Review: Snowflake aces Python machine learning

Posted on 3 August, 2022

Last year I wrote about eight databases that support in-database machine learning. In-database machine learning is important because it brings the machine learning processing to the data, which is much more efficient for big data, rather than forcing data scientists to extract subsets of the data to where the machine learning training and inference run.

These databases each work in a different way:

  • Amazon Redshift ML uses SageMaker Autopilot to automatically create prediction models from the data you specify via a SQL statement, which is extracted to an Amazon S3 bucket. The best prediction function found is registered in the Redshift cluster.
  • BlazingSQL can run GPU-accelerated queries on data lakes in Amazon S3, pass the resulting DataFrames to RAPIDS cuDF for data manipulation, and finally perform machine learning with RAPIDS XGBoost and cuML, and deep learning with PyTorch and TensorFlow.
  • BigQuery ML brings much of the power of Google Cloud Machine Learning into the BigQuery data warehouse with SQL syntax, without extracting the data from the data warehouse.
  • IBM Db2 Warehouse includes a wide set of in-database SQL analytics that includes some basic machine learning functionality, plus in-database support for R and Python.
  • Kinetica provides a full in-database lifecycle solution for machine learning accelerated by GPUs, and can calculate features from streaming data.
  • Microsoft SQL Server can train and infer machine learning models in multiple programming languages.
  • Oracle Cloud Infrastructure can host data science resources integrated with its data warehouse, object store, and functions, allowing for a full model development lifecycle.
  • Vertica has a nice set of machine learning algorithms built-in, and can import TensorFlow and PMML models. It can do prediction from imported models as well as its own models.

Now there’s another database that can run machine learning internally: Snowflake.

Snowflake overview

Snowflake is a fully relational ANSI SQL enterprise data warehouse that was built from the ground up for the cloud. Its architecture separates compute from storage so that you can scale up and down on the fly, without delay or disruption, even while queries are running. You get the performance you need exactly when you need it, and you only pay for the compute you use.

Snowflake currently runs on Amazon Web Services, Microsoft Azure, and Google Cloud Platform. It has recently added External Tables On-Premises Storage, which lets Snowflake users access their data in on-premises storage systems from companies including Dell Technologies and Pure Storage, expanding Snowflake beyond its cloud-only roots.

Snowflake is a fully columnar database with vectorized execution, making it capable of addressing even the most demanding analytic workloads. Snowflake’s adaptive optimization ensures that queries automatically get the best performance possible, with no indexes, distribution keys, or tuning parameters to manage.

Snowflake can support unlimited concurrency with its unique multi-cluster, shared data architecture. This allows multiple compute clusters to operate simultaneously on the same data without degrading performance. Snowflake can even scale automatically to handle varying concurrency demands with its multi-cluster virtual warehouse feature, transparently adding compute resources during peak load periods and scaling down when loads subside.

Snowpark overview

When I reviewed Snowflake in 2019, if you wanted to program against its API you needed to run the program outside of Snowflake and connect through ODBC or JDBC drivers or through native connectors for programming languages. That changed with the introduction of Snowpark in 2021.

Snowpark brings to Snowflake deeply integrated, DataFrame-style programming in the languages developers like to use, starting with Scala, then extending to Java and now Python. Snowpark is designed to make building complex data pipelines a breeze and to allow developers to interact with Snowflake directly without moving data.

The Snowpark library provides an intuitive API for querying and processing data in a data pipeline. Using this library, you can build applications that process data in Snowflake without moving data to the system where your application code runs.

The Snowpark API provides programming language constructs for building SQL statements. For example, the API provides a select method that you can use to specify the column names to return, rather than writing 'select column_name' as a string. Although you can still use a string to specify the SQL statement to execute, you benefit from features like intelligent code completion and type checking when you use the native language constructs provided by Snowpark.

Snowpark operations are executed lazily on the server, which reduces the amount of data transferred between your client and the Snowflake database. The core abstraction in Snowpark is the DataFrame, which represents a set of data and provides methods to operate on that data. In your client code, you construct a DataFrame object and set it up to retrieve the data that you want to use.

The data isn’t retrieved at the time when you construct the DataFrame object. Instead, when you are ready to retrieve the data, you can perform an action that evaluates the DataFrame objects and sends the corresponding SQL statements to the Snowflake database for execution.
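The lazy-evaluation idea described above can be sketched without a Snowflake account. The toy class below is not the Snowpark API: LazyFrame, its methods, and the sales table are hypothetical names, and SQLite from the Python standard library stands in for the Snowflake engine. It only illustrates the pattern in which DataFrame-style calls accumulate a query description and SQL is sent to the database when an action such as collect() runs.

```python
import sqlite3


class LazyFrame:
    """Toy stand-in for a lazily evaluated DataFrame (hypothetical, NOT Snowpark)."""

    def __init__(self, conn, table, columns=("*",), predicates=()):
        self._conn = conn
        self._table = table
        self._columns = tuple(columns)
        self._predicates = tuple(predicates)

    def select(self, *columns):
        # Returns a new query description; nothing is sent to the database.
        return LazyFrame(self._conn, self._table, columns, self._predicates)

    def filter(self, predicate):
        # Predicates are raw SQL fragments here, which is fine for a toy only.
        return LazyFrame(
            self._conn, self._table, self._columns, self._predicates + (predicate,)
        )

    def to_sql(self):
        sql = f"SELECT {', '.join(self._columns)} FROM {self._table}"
        if self._predicates:
            sql += " WHERE " + " AND ".join(f"({p})" for p in self._predicates)
        return sql

    def collect(self):
        # The action: only now is the SQL generated and executed.
        return self._conn.execute(self.to_sql()).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (title TEXT, qty INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?)", [("A", 3), ("B", 7), ("C", 1)])

# Building the frame touches no data; it just records what was asked for.
df = LazyFrame(conn, "sales").select("title", "qty").filter("qty > 2")
print(df.to_sql())
print(df.collect())
```

Because select() and filter() return new objects instead of running anything, the whole chain collapses into a single SQL statement, which is why this style keeps data movement between client and database low.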

Snowpark block diagram. Snowpark expands the internal programmability of the Snowflake cloud data warehouse from SQL to Python, Java, Scala, and other programming languages.

Snowpark for Python overview

Snowpark for Python is available in public preview to all Snowflake customers, as of June 14, 2022. In addition to the Snowpark Python API and Python Scalar User Defined Functions (UDFs), Snowpark for Python supports the Python UDF Batch API (Vectorized UDFs), Table Functions (UDTFs), and Stored Procedures.

These features combined with Anaconda integration provide the Python community of data scientists, data engineers, and developers with a variety of flexible programming contracts and access to open source Python packages to build data pipelines and machine learning workflows directly within Snowflake.

Snowpark for Python includes a local development experience you can install on your own machine, including a Snowflake channel on the Conda repository. You can use your preferred Python IDEs and dev tools, then upload your code to Snowflake knowing that it will be compatible.

By the way, Snowpark for Python is free and open source. That’s a change from Snowflake’s history of keeping its code proprietary.

The following sample Snowpark for Python code creates a DataFrame that aggregates book sales by year. Under the hood, DataFrame operations are transparently converted into SQL queries that get pushed down to the Snowflake SQL engine.

from snowflake.snowpark import Session
from snowflake.snowpark.functions import col, count, year

# fetch Snowflake connection information from a local config module
from config import connection_parameters

# build connection to Snowflake
session = Session.builder.configs(connection_parameters).create()

# use Snowpark API to aggregate book sales by year
booksales_df = session.table("sales")
booksales_by_year_df = (
    booksales_df.group_by(year(col("sold_time_stamp")))
    # alias the aggregate explicitly so the sort below can refer to it by name
    .agg(count(col("qty")).alias("qty_count"))
    .sort(col("qty_count"), ascending=False)
)

Getting started with Snowpark Python

Snowflake’s “getting started” tutorial demonstrates an end-to-end data science workflow using Snowpark for Python to load, clean, and prepare data and then deploy the trained model to Snowflake using a Python UDF for inference. In 45 minutes (nominally), it teaches:

  • How to create a DataFrame that loads data from a stage;
  • How to perform data and feature engineering using the Snowpark DataFrame API; and
  • How to bring a trained machine learning model into Snowflake as a UDF to score new data.

The task is the classic customer churn prediction for an internet service provider, which is a straightforward binary classification problem. The tutorial starts with a local setup phase using Anaconda; I installed Miniconda for that. It took longer than I expected to download and install all the dependencies of the Snowpark API, but that worked fine, and I appreciate the way Conda environments avoid clashes among libraries and versions.

This quickstart begins with a single Parquet file of raw data and extracts, transforms, and loads the relevant information into multiple Snowflake tables.

We’re looking at the beginning of the “Load Data with Snowpark” quickstart. This is a Python Jupyter Notebook running on my MacBook Pro that calls out to Snowflake and uses the Snowpark API. Step 3 originally gave me problems, because the documentation wasn’t clear about where to find my account ID and how much of it to include in the account field of the config file. For future reference, look in the “Welcome To Snowflake!” email for your account information.

Here we are checking the loaded table of raw historical customer data and beginning to set up some transformations.

Here we’ve extracted and transformed the demographics data into its own DataFrame and saved that as a table.

In step 12, we extract and transform the fields for a location table. As before, this is done with a SQL query into a DataFrame, which is then saved as a table.

Here we extract and transform data from the raw DataFrame into a Services table in Snowflake.

Next we extract, transform, and load the final table, Status, which shows the churn status and the reason for leaving. Then we do a quick sanity check, joining the Location and Services tables into a Join DataFrame, then aggregating total charges by city and type of contract for a Result DataFrame.

In this step we join the Demographics and Services tables to create a TRAIN_DATASET view. We use DataFrames for intermediate steps, and use a select statement on the joined DataFrame to reorder the columns.

Now that we’ve finished the ETL/data engineering phase, we can move on to the data analysis/data science phase.

This page introduces the analysis we’re about to perform.

We start by pulling in the Snowpark, Pandas, Scikit-learn, Matplotlib, datetime, NumPy, and Seaborn libraries, as well as reading our configuration. Then we establish our Snowflake database session, sample 10K rows from the TRAIN_DATASET view, and convert that to Pandas format.

We continue with some exploratory data analysis using NumPy, Seaborn, and Pandas. We look for non-numerical variables and classify them as categories.

Once we have found the categorical variables, then we identify the numerical variables and plot some histograms to see the distribution.

All four histograms.

Given the assortment of ranges we saw in the previous screen, we need to scale the variables for use in a model.

Having all the numerical variables lie in the range from 0 to 1 will help immensely when we build a model.
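A 0-to-1 range suggests min-max scaling, so here is a minimal pure-Python sketch of that technique. It assumes min-max scaling is what the notebook applies; the function name and the sample values are hypothetical and are not taken from the quickstart data.

```python
def min_max_scale(values):
    """Rescale numbers linearly so the smallest maps to 0.0 and the largest to 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column has no spread; map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical monthly charges, NOT data from the quickstart.
monthly_charges = [20.0, 45.0, 70.0, 120.0]
print(min_max_scale(monthly_charges))  # [0.0, 0.25, 0.5, 1.0]
```

Putting every numerical feature on the same scale keeps a variable with a large raw range, such as total charges, from dominating one with a small range, such as tenure in months.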

Three of the numerical variables have outliers. Let’s drop them to avoid having them skew the model.

If we look at the cardinality of the categorical variables, we see they range from 2 to 4 categories.

We pick our variables and write the Pandas data out to a Snowflake table, TELCO_TRAIN_SET.

Finally we create and deploy a user-defined function (UDF) for prediction, using more data and a better model.

Now we set up for deploying a predictor. This time we sample 40K values from the training dataset.

Now we’re setting up for model fitting, on our way to deploying a predictor. Splitting the dataset 80/20 is standard stuff.
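For readers new to the idea, an 80/20 split can be sketched in a few lines of standard-library Python. The helper name, the seed, and the stand-in rows are assumptions for illustration; the quickstart itself works on the sampled training data rather than on a list like this.

```python
import random


def train_test_split_rows(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of the rows, then hold out the last test_fraction of them."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_test = round(len(shuffled) * test_fraction)
    split_at = len(shuffled) - n_test
    return shuffled[:split_at], shuffled[split_at:]


rows = list(range(100))  # stand-in for 100 labeled customer records
train, test = train_test_split_rows(rows)
print(len(train), len(test))  # 80 20
```

The model is fitted on the 80% and scored on the held-out 20%, so the reported accuracy reflects rows the model never saw during training.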

This time we’ll use a Random Forest classifier and set up a Scikit-learn pipeline that handles the data engineering as well as doing the fitting.

Let’s see how we did. The accuracy is 99.38%, which isn’t shabby, and the confusion matrix shows relatively few false predictions. The most important feature is whether there is a contract, followed by tenure length and monthly charges.
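As a reminder of how an accuracy figure relates to a confusion matrix, here is a small Python sketch. The counts are hypothetical and chosen only to make the arithmetic easy to follow; they are not the matrix from the quickstart, and the function names are mine.

```python
def accuracy(tn, fp, fn, tp):
    """Fraction of all predictions that were correct."""
    return (tp + tn) / (tn + fp + fn + tp)


def precision(fp, tp):
    """Of the customers predicted to churn, the fraction that actually churned."""
    return tp / (tp + fp)


def recall(fn, tp):
    """Of the customers that actually churned, the fraction the model caught."""
    return tp / (tp + fn)


# Hypothetical counts for a binary churn classifier on 8,000 test rows.
tn, fp, fn, tp = 5800, 20, 30, 2150
print(f"accuracy:  {accuracy(tn, fp, fn, tp):.4f}")
print(f"precision: {precision(fp, tp):.4f}")
print(f"recall:    {recall(fn, tp):.4f}")
```

For churn problems it is worth looking at precision and recall alongside accuracy, since a model can score high accuracy on an imbalanced dataset while still missing many of the customers who leave.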

Now we define a UDF to predict churn and deploy it into the data warehouse.

Step 18 shows another way to register the UDF, using session.udf.register() instead of a select statement. Step 19 shows another way to run the prediction function, incorporating it into a SQL select statement instead of a DataFrame select statement.

You can go into more depth by running Machine Learning with Snowpark Python, a 300-level quickstart, which analyzes Citibike rental data and builds an orchestrated end-to-end machine learning pipeline to perform monthly forecasts using Snowflake, Snowpark Python, PyTorch, and Apache Airflow. It also displays results using Streamlit.

Overall, Snowpark for Python is very good. While I stumbled over a couple of things in the quickstart, they were resolved fairly quickly with help from Snowflake’s extensibility support.

I like the wide range of popular Python machine learning and deep learning libraries and frameworks included in the Snowpark for Python installation. I like the way Python code running on my local machine can control Snowflake warehouses dynamically, scaling them up and down at will to control costs and keep runtimes reasonably short. I like the efficiency of doing most of the heavy lifting inside the Snowflake warehouses using Snowpark. I like being able to deploy predictors as UDFs in Snowflake without incurring the costs of deploying prediction endpoints on major cloud services.

Essentially, Snowpark for Python gives data engineers and data scientists a nice way to do DataFrame-style programming against the Snowflake enterprise data warehouse, including the ability to set up full-blown machine learning pipelines to run on a recurrent schedule.

Cost: $2 per credit plus $23 per TB per month storage, standard plan, prepaid storage. 1 credit = 1 node*hour, billed by the second. Higher level plans and on-demand storage are more expensive. Data transfer charges are additional, and vary by cloud and region. When a virtual warehouse is not running (i.e., when it is set to sleep mode), it does not consume any Snowflake credits. Serverless features use Snowflake-managed compute resources and consume Snowflake credits when they are used.
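The credit arithmetic can be sketched in Python as follows. The rates are the standard-plan figures quoted above; the function names and the example workload are hypothetical, and the sketch ignores higher-level plans, data transfer charges, and any minimum-billing rules.

```python
USD_PER_CREDIT = 2.00     # standard plan, as quoted in the article
USD_PER_TB_MONTH = 23.00  # prepaid storage, as quoted in the article


def compute_cost(nodes: int, seconds_running: int) -> float:
    """One credit is one node-hour, billed by the second."""
    credits = nodes * seconds_running / 3600
    return round(credits * USD_PER_CREDIT, 2)


def storage_cost(terabytes: float) -> float:
    """Monthly storage charge at the prepaid rate."""
    return round(terabytes * USD_PER_TB_MONTH, 2)


if __name__ == "__main__":
    # Hypothetical workload: a 4-node warehouse running for 45 minutes, plus 2 TB stored.
    print(compute_cost(nodes=4, seconds_running=45 * 60))  # 3 credits -> 6.0
    print(storage_cost(2))                                 # 46.0
```

Because a suspended warehouse consumes no credits, the seconds_running term is the main lever for controlling compute spend.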

Platform: Amazon Web Services, Microsoft Azure, Google Cloud Platform.

JetBrains Fleet: The future of IDEs?

Posted on 22 June, 2022

JetBrains Fleet is a new multi-language programming editor and IDE that represents JetBrains’ attempt to rebuild the entire integrated development environment from scratch. Fleet is separate from JetBrains’ effort to overhaul the user interfaces and user experiences of its existing IDEs, such as IntelliJ IDEA, without changing the IDEs’ code-centric features and integrations. Fleet will not replace any existing JetBrains IDEs.

JetBrains says Fleet was “built from scratch,” based on its 20 years of experience developing IDEs, and features “a distributed IDE architecture and a reimagined UI.” For Java, Fleet uses the IntelliJ code-processing engine. For some other languages, Fleet uses a language server, à la Visual Studio Code, instead of the IntelliJ engine.

I said earlier that Fleet is an editor and IDE. When you start it up, Fleet is a lightweight code editor. Once you’ve loaded a code directory, you can turn on “smart” mode, which indexes your code and enables IDE functionality, such as project and context-aware code completion, navigation to definitions and usages, on-the-fly code quality checks, and quick fixes. Indexing a large project can take a while.

In many ways, the most direct competitor to Fleet is Visual Studio Code, with its language server architecture and large ecosystem of plugins. Fleet already has a language server architecture, but its plugin architecture is still being developed.

Fleet architecture

Fleet uses a distributed architecture that aims at simplicity of use for standalone instances, while also supporting collaborative development, remote/cloud IDEs, and multiple target file systems.

As shown in Figure 1 below, the Fleet architecture includes:

  • Front end – delivers the UI, parses the files, and provides limited highlighting for supported file types. There can be more than one front end attached to a workspace, allowing for collaborative development.
  • Workspace – the component whose main purpose is maintaining the front ends’ shared state when there are several of them. It also registers other components to provide information on the available services and APIs.
  • Back end – a headless service that does the heavy lifting: indexing, static analysis, advanced search, navigation, and the like. Every such operation is initiated by a request from the workspace, which then processes the response and dispatches the data to the components that require it. As a back end, you can use a headless IntelliJ IDEA or a language server. Note that back ends may have different requirements. For example, language servers need to run on the same machine where the source code is located, as shown in the diagram.
  • FSD (Fleet System Daemon) – a Fleet agent typically attached to the system where source code and SDKs reside. It is used to build the project, run code, execute terminal commands, and perform other actions in the target environment on behalf of Fleet.

Fleet is mainly written in Kotlin, which means it runs on the JVM. The UI framework is a home-grown solution using Skia (via Skiko). Fleet uses Rust for the Fleet System Daemon.

The diagram shows the architecture of Fleet. Multiple front ends correspond to multiple users, and multiple back ends perform different functions. FSD is the Fleet System Daemon, which is an agent used to build the project, run code, and execute terminal commands. IJ is an IntelliJ engine, and LSP is a Language Server Protocol instance.

Fleet language support

Fleet currently supports development in Java, Kotlin, Go, Python, JavaScript, JSON, TypeScript, and Rust. Support for PHP, C++, C#, and HTML will be available “soon.” Fleet doesn’t yet support Scala, Groovy, or any other programming language not mentioned above.

Installing Fleet

Because Fleet is currently in a closed beta, you need to apply for permission to try it out. Once Fleet is ready, which may take some months, it will be opened up to public preview. Once I had permission to join the closed preview, I installed Fleet from the JetBrains Toolbox app on my MacBook Pro.

Fleet smart mode

Smart mode is required for semantic highlighting, code completion, code refactoring, navigation, find usages, and type information retrieval for parameters and expressions. This list is not exhaustive and may vary for different languages and plugins.

According to the documentation, for Fleet’s smart mode features to work, it may need to execute project code, which might pose a problem when its source is untrusted. Actions like importing projects, running scripts, and executing Git commands may run malicious code. For this reason it is important to enable smart mode only when you trust the code authors.

I consider the whole concept of trusting the code authors to be a bit tenuous for open source projects. There have been several egregious cases where trusted, longtime repository maintainers suddenly went rogue for economic or political reasons, with serious consequences for projects that relied on their code. In general, I trust repositories that require code reviews before check-ins are committed, but the level of trust deserved by repositories with a single contributor is often uncertain.

Fleet code intelligence features are provided by components called back ends. Architecturally, they are separate from other components, so they may run either locally or remotely. Fleet identifies two types of back ends: IntelliJ IDEA-based (a headless instance of IntelliJ IDEA with plugins) and LSP-based (a server that talks to Fleet via the Language Server Protocol).

When you enable smart mode, Fleet launches a particular type of back end depending on the language. For example, Java is handled by IntelliJ IDEA, whereas Rust support is provided by an LSP server.

Fleet’s Git integration pane is independent of smart mode. Here we see the summary of repository changes in the left-hand tab, despite seeing that smart mode is off at the top right.

Here we see the progress of a Git pull at the upper right. Smart mode is still off.

Smart mode enables many code analysis functions in Fleet, such as the usage pop-up shown in the middle of the right-hand pane.

Distributed Fleet configurations

Fleet’s architecture is designed to support a range of configurations and workflows. You can run Fleet only on your machine, or move some of the processes elsewhere—for example by locating the code processing in the cloud.

The two distributed options that currently work are using JetBrains Space as the remote environment, and using a remote machine. Future options include running Fleet in one or more Docker containers, and running Fleet in cloud virtual machines.

The IDE to beat

Overall, JetBrains Fleet is very promising, but not yet ready for wide release. It has several ambitious features that could make it a strong competitor to Visual Studio Code, although Visual Studio Code has the advantages of being free, mature, very popular, and widely supported.

JetBrains has been successful in the past in the face of good, free competition. One need only look at IntelliJ IDEA, which managed to carve a niche for itself in the Java/Scala/Groovy world despite the existence of two free, popular, mature alternatives—Eclipse and NetBeans.

Currently, the most desired missing feature in Fleet (for me) is Markdown preview support. The minute Fleet starts to support old-style JetBrains plugins, it will get Markdown preview along with hundreds of other features. Nevertheless, the roadmap for Fleet includes a new plugin architecture. I’m sure there is a good reason for that, having to do with the updated architecture of Fleet, but I expect that support for old-style JetBrains plugins will happen soon for bootstrap purposes.

Cost: Not yet announced.

Platform: Windows, macOS, Linux.

Review: Visual Studio Code shines for Java

Posted on 8 June, 2022

There was a time when your choices for Java IDEs were Eclipse, NetBeans, or IntelliJ IDEA. That has changed somewhat. Among other innovations, Visual Studio Code now has good support for editing, running, and debugging Java code through a set of Java-specific extensions.

Visual Studio Code is a free, lightweight but powerful source code editor that runs on your desktop and on the web and is available for Windows, macOS, Linux, and Raspberry Pi OS. It comes with built-in support for JavaScript, TypeScript, and Node.js and has a rich ecosystem of extensions for other programming languages (such as Java, C++, C#, Python, PHP, and Go), runtimes (such as .NET and Unity), environments (such as Docker and Kubernetes), and clouds (such as Amazon Web Services, Microsoft Azure, and Google Cloud Platform).

Aside from the whole idea of being lightweight and starting quickly, Visual Studio Code has IntelliSense code completion for variables, methods, and imported modules; graphical debugging; linting, multi-cursor editing, parameter hints, and other powerful editing features; snazzy code navigation and refactoring; and built-in source code control including Git support. Much of this was adapted from Visual Studio technology.

Extensions to Visual Studio Code can use the Language Server Protocol (LSP), which defines the protocol used between an editor or IDE and a language server that provides language features such as autocomplete, go to definition, and find all references. A language server is meant to provide the language-specific smarts and communicate with development tools over a protocol that enables inter-process communication.
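To make that exchange concrete, here is a minimal sketch of what an editor might send to a language server for a "go to definition" request. It assumes the LSP base protocol's framing (a Content-Length header, a blank line, then a JSON-RPC 2.0 body). The class name, file URI, and cursor position are hypothetical, and the JSON is assembled by hand without escaping, which a real client would not do.

```java
import java.nio.charset.StandardCharsets;

// Sketch of LSP base-protocol framing: a Content-Length header, a blank line,
// then a JSON-RPC 2.0 body. The file URI and position used here are made up.
class LspFrameSketch {

    // Builds the JSON-RPC body for a "go to definition" request.
    // No JSON escaping is done, so the URI must not contain quotes.
    static String definitionRequest(int id, String uri, int line, int character) {
        return String.format(
            "{\"jsonrpc\":\"2.0\",\"id\":%d,\"method\":\"textDocument/definition\","
            + "\"params\":{\"textDocument\":{\"uri\":\"%s\"},"
            + "\"position\":{\"line\":%d,\"character\":%d}}}",
            id, uri, line, character);
    }

    // Prefixes the body with the header; the length is counted in UTF-8 bytes.
    static String frame(String body) {
        int length = body.getBytes(StandardCharsets.UTF_8).length;
        return "Content-Length: " + length + "\r\n\r\n" + body;
    }

    public static void main(String[] args) {
        String body = definitionRequest(1, "file:///workspace/Demo.java", 4, 10);
        System.out.println(frame(body));
    }
}
```

Because the editor only ever speaks this generic message format, the same editor code can drive a Java, Python, or Go language server without knowing anything about the language itself.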

In addition, extensions can use the Debug Adapter Protocol (DAP), which defines the abstract protocol used between a development tool (e.g. IDE or editor) and a debugger. The Debug Adapter Protocol makes it possible to implement a generic debugger for a development tool that can communicate with different debuggers via Debug Adapters.
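As a rough illustration, a development tool asking a debug adapter to set a conditional breakpoint might build a request like the one below. This is a sketch, not the Java debugger's actual code: it assumes DAP's request shape (a sequence number, a type, a command, and arguments), and the class name, source path, line number, and condition are all invented. On the wire, each message would also be preceded by a Content-Length header.

```java
// Sketch of a Debug Adapter Protocol request body. DAP messages carry a
// sequence number, a type, and (for requests) a command plus its arguments.
// The source path, line, and condition are invented for illustration.
class DapRequestSketch {

    // Builds a "setBreakpoints" request with one conditional breakpoint.
    // No JSON escaping is done, so the inputs must not contain quotes.
    static String setBreakpoints(int seq, String path, int line, String condition) {
        return String.format(
            "{\"seq\":%d,\"type\":\"request\",\"command\":\"setBreakpoints\","
            + "\"arguments\":{\"source\":{\"path\":\"%s\"},"
            + "\"breakpoints\":[{\"line\":%d,\"condition\":\"%s\"}]}}",
            seq, path, line, condition);
    }

    public static void main(String[] args) {
        System.out.println(setBreakpoints(1, "/workspace/Demo.java", 12, "i == 3"));
    }
}
```

The debug adapter translates a request like this into whatever the underlying debugger understands, which is what lets one generic debugger UI work with many different debuggers.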

Java extensions to Visual Studio Code

Visual Studio Code has a long list of Java extensions, not all of which are compatible with each other. The easiest way to get started is to install the Coding Pack for Java on Windows or macOS. The next easiest way on Windows and macOS, and the easiest way on Linux, is to install a JDK, VS Code, and Java extensions.

Extension Pack for Java

The Extension Pack for Java bundles six compatible Java extensions, one from Red Hat and the rest from Microsoft. It includes Language Support for Java by Red Hat, Debugger for Java, Test Runner for Java, Maven for Java, Project Manager for Java, and Visual Studio IntelliCode. Each of these is described below. The features of the Extension Pack for Java that were added in 2018 are illustrated with screen video captures in a Microsoft blog post.

Visual Studio Code Extension Pack for Java. All extensions containing “Java” are shown at the left; the Extension Pack for Java is shown at the right.

Language Support for Java by Red Hat

The Language Support for Java by Red Hat extension provides Java language support via Eclipse JDT Language Server, which in turn utilizes Eclipse JDT, M2Eclipse, and Buildship. The Java language support goes all the way up to refactoring, which can be found in the context menus.

The Eclipse JDT Language Server is a Java-specific implementation of the Language Server Protocol, and it may implement extensions to the protocol when they are deemed necessary. It also translates projects from build systems such as Maven (through the M2E project) into the JDT project structure. Half the contributions to the Eclipse JDT Language Server have come from Red Hat, and about a third have come from Microsoft.

In the main panel, we’re looking at the source code of one Java file in the context of a large AI program. The pop-up in the upper middle is a peek screen triggered by hovering over the method name.

Debugger for Java

Debugger for Java is a lightweight Java Debugger based on Java Debug Server, which extends the Language Support for Java by Red Hat. Features include launch and attach; breakpoints, conditional breakpoints, and logpoints; pause and continue; step in, out, and over; exceptions, variables, call stacks, and threads; evaluation; and Hot Code Replace (the Java equivalent of Visual Studio’s Edit and Continue).

Debugging a Java program. Notice the orange highlights that show variable values on the right, and the local variables panel at the top left. The lightning bolt at the right end of the floating debug control bar at the top center is the Java Hot Code Replace button, which is similar to Visual Studio’s Edit and Continue feature.
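If you want to try these debugger features yourself, a tiny program is enough. The one below is a hypothetical target of my own (the class and method names are not from any project mentioned here), with suggested experiments listed in the comments.

```java
// A tiny debugging target. Suggested experiments (line choices are up to you):
//  - set a breakpoint inside the loop, then step in to square()
//  - add a conditional breakpoint with the expression i == 3
//  - add a logpoint that prints the running total instead of pausing
class DebugTarget {

    static int square(int n) {
        return n * n;
    }

    // Sums the squares 1*1 + 2*2 + ... + limit*limit.
    static int sumOfSquares(int limit) {
        int total = 0;
        for (int i = 1; i <= limit; i++) {
            total += square(i);
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(5)); // 1 + 4 + 9 + 16 + 25 = 55
    }
}
```

While paused in the loop, changing the body of square() and saving is a simple way to see Hot Code Replace in action.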

Test Runner for Java

Test Runner for Java is a lightweight extension to run and debug Java test cases in Visual Studio Code. The extension supports the JUnit 4 (v4.8.0+), JUnit 5 (v5.1.0+), and TestNG (v6.8.0+) test frameworks.

Maven for Java

The Maven extension for VS Code provides a project explorer and shortcuts to execute Maven commands. It allows you to generate projects from Maven Archetypes, and generate POMs (Project Object Models); provides shortcuts to common goals, plugin goals, and customized commands; and preserves command history for fast re-runs.

Project Manager for Java

Project Manager for Java is a lightweight extension to provide additional Java project explorer features. It works with Language Support for Java by Red Hat to provide a Java project view, create Java projects, export JARs, and manage dependencies.

Visual Studio IntelliCode

The Visual Studio IntelliCode extension provides AI-assisted development features for Python, TypeScript/JavaScript, and Java developers in Visual Studio Code, with insights based on understanding your code context combined with machine learning. Contextual recommendations are based on practices found in thousands of high-quality open source projects on GitHub, each with a high star rating. This means you get context-aware code completions, tool tips, and signature help rather than alphabetical or most-recently-used lists. By predicting the most likely member in the list based on your coding context, AI-assisted IntelliSense saves you from having to hunt through the list yourself.

Other Java extensions of note

Check out Tomcat and Jetty if you’re working with those technologies.

If you’re working on Spring Boot, great support is provided by Pivotal and Microsoft in the form of Spring Boot Tools, Spring Initializr, and Spring Boot Dashboard.

And you might find Checkstyle handy when you need coherent code style, especially across multiple team members.

Running Visual Studio Code

There are currently at least four ways to run Visual Studio Code: the original desktop app, which runs on Windows, macOS, and Linux; online in a browser, with reduced functionality; online with Gitpod; and online with GitHub Codespaces. A fifth possibility is to use Visual Studio Code Remote – Containers; I won’t show you that because it looks essentially the same as using Gitpod and Visual Studio Code, with the difference that it uses a local instance of Docker.

Visual Studio Code Desktop

This is the OG version of VS Code, with full features.

Visual Studio Code editing and running a ShellSort implementation in Java locally, after checking out the TheAlgorithms/Java project from GitHub. We’re seeing the project structure in two views (files and classes) on the left, the source code on the top right, and the output on the bottom right.
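For readers who haven't met the algorithm, here is a minimal Shell sort sketch of the kind you could paste into VS Code and run. It is my own illustration using the simple n/2, n/4, ..., 1 gap sequence, not the TheAlgorithms/Java source, and the class name and sample data are made up.

```java
import java.util.Arrays;

// Minimal Shell sort sketch using the simple gap sequence n/2, n/4, ..., 1.
// This is not the TheAlgorithms/Java source, just an illustration.
class ShellSortSketch {

    static void sort(int[] a) {
        for (int gap = a.length / 2; gap > 0; gap /= 2) {
            // Gapped insertion sort: each element moves left in steps of gap.
            for (int i = gap; i < a.length; i++) {
                int current = a[i];
                int j = i;
                while (j >= gap && a[j - gap] > current) {
                    a[j] = a[j - gap];
                    j -= gap;
                }
                a[j] = current;
            }
        }
    }

    public static void main(String[] args) {
        int[] data = {9, 4, 7, 1, 8, 2};
        sort(data);
        System.out.println(Arrays.toString(data)); // [1, 2, 4, 7, 8, 9]
    }
}
```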

Visual Studio Code for the Web

This is a reduced-functionality, web-hosted VS Code editor. It can only run a few extensions, and can’t debug or run your code. It’s still useful for making small changes to the code directly in the repository without installing anything.

You can activate Visual Studio Code for the Web by browsing to its site directly, or by changing the “.com” domain in the repository address to “.dev” for supported sites, such as GitHub. To switch to a full-featured environment from Visual Studio Code for the Web, you can use the “Remote Repositories: Continue Working On…” item from the command palette.

Visual Studio Code Online. Notice the “dev” domain. You can edit in this environment, but most VS Code extensions won’t install and you can’t run or debug your code.

Visual Studio Code in Gitpod

Gitpod is a GitHub, GitLab, and Bitbucket add-on that can open a development environment for you directly from a repository. Visual Studio Code is only one of the IDEs that Gitpod supports, and it can install extensions, run code, and debug. Gitpod can open VS Code workspaces online in a browser, or in an instance of VS Code connecting remotely to the repository as shown below.

In addition to VS Code, Gitpod supports IntelliJ IDEA, command-line editors such as Vim, and editors running in Docker containers for Java development.

We’re looking at Visual Studio Code using SSH to connect to a GitHub repository under the control of Gitpod. We’re not editing a local checkout of the repository; instead, we’re using a local instance of VS Code to work with the repository directly.

GitHub Codespaces

GitHub Codespaces (beta) offers a development environment that’s hosted in the cloud. You can customize your project for Codespaces by committing configuration files to your repository (often known as “configuration as code”), which creates a repeatable codespace configuration for all users of your project.

Codespaces run on a variety of VM-based compute options, which you can configure from two-core machines up to 32-core machines. You can connect to your codespaces from the browser or locally using Visual Studio Code.

Invoking a cloud GitHub Codespace from GitHub. Drop down the code menu and pick the Codespaces pane, then click the green button at the bottom.

Debugging a Java program using a Codespace in a browser. I used the default four-core workspace size and started with a program that does nothing more than print a line. Here I’ve stepped into the library code. Note the call stack and the local variables at left.

Here we’re almost finished debugging, and the console shows the printed line.

VS Code for Java?

Overall, Visual Studio Code is very good as a Java IDE if you install the Extension Pack for Java. It’s merely OK as a Java editor without the extension pack, as becomes obvious when you run Visual Studio Code for the Web.

It speaks highly of Visual Studio Code that it has inspired so much energy from its open source community, even to the point where Red Hat has contributed heavily to its Java support. It also speaks highly of Visual Studio Code that it has been adopted for a third-party product like Gitpod, and for GitHub Codespaces. (GitHub is a Microsoft subsidiary.) I’m actually more impressed that VS Code has been adopted across groups at Microsoft than I am by the open source contributions, as the company has historically had more than its share of internal inter-group rivalries.

Would I drop my current Java IDE in favor of Visual Studio Code? Probably not. I’ve had large Java projects that wouldn’t build in VS Code on my 8 GB MacBook Pro—it ran out of memory. The same projects built just fine in Eclipse, NetBeans, and IntelliJ IDEA on the same machine with the same background programs running. 

On the other hand, I prefer Visual Studio Code for quick edits and work on small projects. You might prefer it for full-time Java work. It’s certainly worth trying out.

Cost: Free.

Platform: Windows, macOS, Linux.

