Amazon Bedrock: A solid generative AI foundation

Posted on 27 February, 2024

This post was originally published on this site

Amazon Bedrock, Amazon Web Services’ fully managed service for building, deploying, and scaling generative AI applications, offers a catalog of foundation models, implements retrieval-augmented generation (RAG) and vector embeddings, hosts knowledge bases, supports fine-tuning of foundation models, and allows continued pre-training of selected foundation models.

Amazon Bedrock complements the almost 30 other Amazon machine learning services available, including Amazon Q, the AWS generative AI assistant.

There are currently six major features in Amazon Bedrock:

  • Experiment with different models: Use the API or GUI in the console to test various prompts and configurations with different foundation models.
  • Integrate external data sources: Improve response generation by incorporating external data sources into knowledge bases, which can be queried to augment the responses from foundation models.
  • Develop customer support applications: Build applications that use foundation models, API calls, and knowledge bases to reason and execute tasks for customers.
  • Customize models: Tailor a foundation model for particular tasks or domains by providing training data for fine-tuning or continued pre-training.
  • Boost application efficiency: Optimize the performance of foundation model-based applications by purchasing provisioned throughput.
  • Choose the most suitable model: Compare the outputs of various models using standard or custom prompt data sets to choose the model that best aligns with the requirements of your application.

One major competitor to Amazon Bedrock is Azure AI Studio, which, while still in preview and somewhat under construction, checks most of the boxes for a generative AI application builder. Azure AI Studio is a nice system for picking generative AI models, grounding them with RAG using vector embeddings, vector search, and data, and fine-tuning them, all to create what Microsoft calls copilots, or AI agents.

Another major competitor is Google Vertex AI’s Generative AI Studio, which allows you to tune foundation models with your own data, using tuning options such as adapter tuning and reinforcement learning from human feedback (RLHF), or style and subject tuning for image generation. Generative AI Studio complements the Vertex AI model garden and foundation models as APIs.

Other possible competitors include LangChain (and LangSmith), Poe, and the ChatGPT GPT Builder. LangChain does require you to do some programming.

Amazon Bedrock model setup

There are two setup tasks for Bedrock: model setup and API setup. You need to request access to models before you can use them. If you want to use the AWS command line interface or any of the AWS SDKs, you also need to install and configure the CLI or SDK.

I didn’t bother with API setup, as I’m concentrating on using the console for the purposes of this review. Completing the model access request form was easier than it looked, and I was granted access to models faster than I expected.

You can’t use a model in Amazon Bedrock until you’ve requested and received permission to use it. Most vendors grant access immediately. Anthropic takes a few minutes, and requires you to fill out a short questionnaire about your planned usage. This screenshot was taken just before my Claude access requests were granted.

Amazon Bedrock model inference parameters

Amazon Bedrock uses slightly different parameters to control the response of models than, say, OpenAI. Bedrock controls randomness and diversity using the temperature of the probability distribution, the top K, and the top P. It controls the length of the output with the response length, penalties, and stop sequences.

Temperature modulates the probability for the next token. A lower temperature leads to more deterministic responses, and a higher temperature leads to more random responses. In other words, choose a lower temperature to increase the likelihood of higher-probability tokens and decrease the likelihood of lower-probability tokens; choose a higher temperature to increase the likelihood of lower-probability tokens and decrease the likelihood of higher-probability tokens. For example, a high temperature would allow the completion of “I hear the hoof beats of” to include unlikely beasts like unicorns, while a low temperature would weight the output to likely ungulates like horses.
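The effect of temperature can be illustrated with a minimal sketch, assuming the usual formulation in which raw token scores (logits) are divided by the temperature before a softmax. The candidate words and their scores are made up for illustration; they do not come from any Bedrock model.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities after dividing by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for completions of "I hear the hoof beats of"
candidates = ["horses", "zebras", "unicorns"]
logits = [4.0, 2.0, 0.5]

for t in (0.2, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, {c: round(p, 3) for c, p in zip(candidates, probs)})
```

At the low setting nearly all of the probability lands on the top-scoring word; at the high setting the distribution flattens and the unlikely candidates gain ground.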

Top K is the number of most-likely candidates that the model considers for the next token. Lower values limit the options to more likely outputs, like horses. Higher values allow the model to choose less likely outputs, like unicorns.

Top P is a cumulative probability cutoff: the model considers only the smallest set of most-likely candidates whose probabilities add up to P. As with top K, lower values limit the options to more likely outputs, and higher values allow the model to choose less likely outputs.
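As a rough illustration of how the two filters differ, here is a sketch that assumes the common nucleus-sampling reading of top P, in which P is a cumulative probability threshold. The probabilities are invented, and a real model applies these filters over its whole vocabulary.

```python
def top_k_filter(probs, k):
    """Keep the k most likely tokens and renormalize."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept = dict(ranked[:k])
    total = sum(kept.values())
    return {tok: p / total for tok, p in kept.items()}

def top_p_filter(probs, p):
    """Keep the smallest set of most likely tokens whose cumulative
    probability reaches p, then renormalize."""
    kept, cumulative = {}, 0.0
    for tok, prob in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept[tok] = prob
        cumulative += prob
        if cumulative >= p:
            break
    total = sum(kept.values())
    return {tok: q / total for tok, q in kept.items()}

# Hypothetical next-token probabilities
probs = {"horses": 0.80, "zebras": 0.15, "unicorns": 0.05}

print(top_k_filter(probs, 2))    # fixed number of candidates
print(top_p_filter(probs, 0.9))  # as many candidates as needed to cover 90%
print(top_p_filter(probs, 0.5))  # a single dominant candidate is enough
```

Top K always keeps the same number of candidates, while top P keeps more or fewer depending on how sharply peaked the distribution is.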

Response length controls the number of tokens in the generated response. Penalties can apply to length, repeated tokens, frequency of tokens, and type of tokens in a response. Stop sequences are sequences of characters that stop the model from generating further tokens.
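To make response length and stop sequences concrete, here is a small sketch that applies both limits to text after the fact. A real model stops generating when it hits a limit rather than trimming afterward, and it counts tokens rather than whitespace-separated words, so treat this only as an approximation of the behavior.

```python
def apply_limits(text, max_tokens, stop_sequences):
    """Cut text at the first stop sequence, then keep at most max_tokens
    whitespace-separated words (a crude stand-in for tokens)."""
    cut = len(text)
    for stop in stop_sequences:
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    words = text[:cut].split()
    return " ".join(words[:max_tokens])

raw = "Answer: 42\nHuman: what about the next question?"
print(apply_limits(raw, max_tokens=10, stop_sequences=["\nHuman:"]))
print(apply_limits("one two three four", max_tokens=2, stop_sequences=[]))
```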

Amazon Bedrock prompts, examples, and playgrounds

Amazon Bedrock currently displays 33 examples of generative AI model usage, and offers three playgrounds. Playgrounds provide a console environment to experiment with running inference on different models and with different configurations. You can start with one of the playgrounds (chat, text, or image), select a model, construct a prompt, and set the inference parameters. Or you can start with an example and open it in the appropriate playground with the model and inference parameters pre-selected and the prompt pre-populated. Note that you need to have been granted access to a model before you can use it in a playground.

Amazon Bedrock examples demonstrate prompts and parameters for various supported models and tasks. Tasks include summarization, question answering, problem solving, code generation, text generation, and image generation. Each example shows a model, prompt, parameters, and response, and presents a button you can press to open the example in a playground. The results you get in the playground may or may not match what is shown in the example, especially if the parameters allow for lower-probability tokens.

Our first example shows arithmetic word problem solving using a chain-of-thought prompt and the Llama 2 Chat 70B v1 model. There are several points of interest in this example. First, it works with a relatively small open-source chat model. (As an aside, there’s a related example that uses a 7B (billion) parameter model instead of the 70B parameter model used here; it also works.) Second, the chain-of-thought action is triggered by a simple addition to the prompt, “Let’s think step by step.” Note that if you remove that line, the model often goes off the rails and generates a wrong answer.
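A chain-of-thought prompt of this kind can be assembled with a few lines of code. In this sketch the system text and word problem are hypothetical, and the exact layout of the [INST] block is an assumption rather than the documented Llama 2 format.

```python
def build_cot_prompt(system, question, trigger="Let's think step by step."):
    """Assemble an instruction block, a question, and an optional
    chain-of-thought trigger line."""
    parts = [f"[INST]{system}[/INST]", question]
    if trigger:
        parts.append(trigger)
    return "\n".join(parts)

# Hypothetical system text and word problem
prompt = build_cot_prompt(
    "You are a careful math tutor.",
    "A farmer has 3 pens with 4 sheep in each pen. How many sheep are there?",
)
print(prompt)

# The same prompt without the trigger, for side-by-side comparison in a playground
print(build_cot_prompt("You are a careful math tutor.", "How many sheep?", trigger=""))
```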

The chain-of-thought problem-solving example uses a Llama 2 chat model and presents a typical second- or third-grade arithmetic word problem. Note the [INST]You are a…[/INST] block at the beginning of the prompt. This seems to be specific to Llama. You’ll see other models respond to different formats for defining instructions or system prompts.

The chain-of-thought problem-solving example running in the Amazon Bedrock Chat playground. This particular set of prompts and hyperparameters usually gives correct answers, although not in the exact same format every time. If you remove the “Let’s think step by step” part of the prompt it usually gives wrong answers. The temperature setting of 0.5 asks for moderate randomness in the probability mass function, and the top P setting of 0.9 allows the model to consider less likely outputs.

Our second example shows contract entity extraction using Cohere’s Command text generation model. Text LLMs (large language models) often allow for many different text processing functions.

Amazon Bedrock contract entity extraction example using Cohere’s Command text generation model. Note that the instruction here is on the first line followed by a colon, and then the contract body follows.

Contract entity extraction example running in the Amazon Bedrock text playground. Note that there was an opportunity for additional interaction in the playground, which didn’t show up in the example. While the temperature of this run was 0.9, Cohere’s Command model takes temperature values up to 5. The top P value is set to 1 (and displayed at 0.99) and the top K parameter is not set. These allow for high randomness in the generated text.

Our final example shows image inpainting, an application of image generation that uses a reference image, a mask, and prompts to produce a new image. Up until now, I’ve only done AI image inpainting in Adobe Photoshop, which has had the capability for a while.

Amazon Bedrock’s image inpainting example uses the Titan Image Generator G1 model. Note the reference image and mask image in the image configuration.

In order to actually select the flowers for inpainting, I had to move the mask from the default selection of the backpack to the area containing the white flowers in the reference image. When I didn’t do that, orange flowers were generated in front of the backpack.

Successful inpainting in Amazon Bedrock. Note that I could have used the mask prompt to refine the mask for complex mask selections in noncontiguous areas, for example selecting the flowers and the books. You can use the Info links to see explanations of individual hyperparameters.

Amazon Bedrock orchestration

Amazon Bedrock orchestration currently includes importing data sources into knowledge bases that you can then use for setting up RAG, and creating agents that can execute actions. These are two of the most important techniques available for building generative AI applications, falling between simple prompt engineering and expensive and time-consuming continued pre-training or fine-tuning.

Using knowledge bases takes multiple steps. Start by importing your data sources into an Amazon S3 bucket. When you do that, specify the chunking you’d like for your data. The default is approximately 300 tokens per chunk, but you can set your own size. Then set up your vector store and embeddings model in the database you prefer, or allow AWS to use its default of Amazon OpenSearch Serverless. Then create your knowledge base from the Bedrock console, ingest your data sources, and test your knowledge base. Finally, you can connect your knowledge base to a model for RAG, or take the next step and connect it to an agent. There’s a good one-hour video about this by Mani Khanuja, recorded at AWS re:Invent 2023.
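The chunking step can be pictured with a minimal sketch. It splits on whitespace and treats each word as one token, which is only an approximation; Bedrock does its own chunking and token counting during ingestion, so this is not its implementation.

```python
def chunk_text(text, chunk_size=300):
    """Split text into consecutive chunks of at most chunk_size words,
    using words as a rough stand-in for tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    words = text.split()
    return [
        " ".join(words[i:i + chunk_size])
        for i in range(0, len(words), chunk_size)
    ]

# A synthetic 700-word document
document = " ".join(f"word{i}" for i in range(700))
chunks = chunk_text(document)
print(len(chunks), [len(c.split()) for c in chunks])
```

Smaller chunks give more precise retrieval hits but less surrounding context per hit, which is why the chunk size is worth tuning for your own data.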

Agents orchestrate interactions between foundation models, data sources, software applications, and prompts, and call APIs to take actions. In addition to the components of RAG, agents can follow instructions, use an OpenAPI schema to define the APIs that the agent can invoke, and/or invoke a Lambda function.
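For a sense of what an agent's API definition looks like, here is a minimal OpenAPI 3.0 document built as a Python dictionary. The order-status API, its path, and its operation name are all hypothetical, and Bedrock may require fields beyond what this sketch includes.

```python
import json

# Hypothetical API that an agent could be allowed to call
order_status_api = {
    "openapi": "3.0.0",
    "info": {"title": "Order status API (hypothetical)", "version": "1.0.0"},
    "paths": {
        "/orders/{orderId}": {
            "get": {
                "operationId": "getOrderStatus",
                "description": "Look up the shipping status of one order.",
                "parameters": [
                    {
                        "name": "orderId",
                        "in": "path",
                        "required": True,
                        "schema": {"type": "string"},
                    }
                ],
                "responses": {
                    "200": {
                        "description": "Current status of the order",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "type": "object",
                                    "properties": {"status": {"type": "string"}},
                                }
                            }
                        },
                    }
                },
            }
        }
    },
}

def list_operations(schema):
    """Return (METHOD, path, operationId) for every operation in the schema."""
    return [
        (method.upper(), path, operation["operationId"])
        for path, methods in schema["paths"].items()
        for method, operation in methods.items()
    ]

print(json.dumps(order_status_api, indent=2))
print(list_operations(order_status_api))
```

The descriptions matter more than usual here, since the agent's foundation model reads them to decide which operation fits a user's request.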

Amazon Bedrock knowledge base creation and testing starts with this screen. There are several more steps.

Amazon Bedrock model assessment and deployment

The Assessment and Deployment panel in Amazon Bedrock contains functionality for model evaluation and provisioned throughput.

Model evaluation supports automatic evaluation of a single model, manual evaluation of up to two models using your own work team, and manual evaluation of as many models as you wish using an AWS-managed work team. Automatic evaluation uses recommended metrics, which vary depending on the type of task being evaluated, and can either use your own prompt data or built-in curated prompt data sets.
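To illustrate what an automatic accuracy metric does with a prompt data set, here is a generic exact-match sketch. It is not Bedrock's metric implementation, whose details depend on the task type; the predictions and reference answers are invented.

```python
def exact_match_accuracy(predictions, references):
    """Fraction of predictions that equal their reference answer after
    lowercasing and collapsing whitespace."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0

    def normalize(text):
        return " ".join(text.lower().split())

    hits = sum(normalize(p) == normalize(r) for p, r in zip(predictions, references))
    return hits / len(references)

# Hypothetical model outputs and reference answers
model_outputs = ["Paris", " 42 ", "blue"]
expected = ["paris", "42", "red"]
print(exact_match_accuracy(model_outputs, expected))
```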

Provisioned throughput allows you to purchase dedicated capacity to deploy your models. Pricing varies depending on the model that you use and the level of commitment you choose.

Automatic model evaluation selection in Amazon Bedrock. Bedrock can also set up human model evaluations. The metrics and data sets used vary with the task type being evaluated.

Amazon Bedrock’s provisioned throughput isn’t cheap, and it isn’t available for every model. Here we see the estimated monthly cost of provisioning five model units of the Llama 2 Chat 13B model with a one-month term: $77.3K. Upping the term to six months drops the monthly cost to $47.7K. You can’t edit the provisioned model units or term once you’ve purchased the throughput.
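The arithmetic behind those two quotes is worth spelling out. This sketch uses only the rounded figures from the caption above (five model units, $77.3K and $47.7K per month), so the results are approximate.

```python
MODEL_UNITS = 5
MONTHLY_ONE_MONTH_TERM = 77_300  # USD per month, rounded figure from the caption
MONTHLY_SIX_MONTH_TERM = 47_700  # USD per month, rounded figure from the caption

def term_discount(short_term_monthly, long_term_monthly):
    """Fractional saving in monthly cost from choosing the longer term."""
    return 1 - long_term_monthly / short_term_monthly

def total_commitment(monthly_cost, months):
    """Total spend committed over the whole term."""
    return monthly_cost * months

discount = term_discount(MONTHLY_ONE_MONTH_TERM, MONTHLY_SIX_MONTH_TERM)
six_month_total = total_commitment(MONTHLY_SIX_MONTH_TERM, 6)
per_unit_monthly = MONTHLY_ONE_MONTH_TERM / MODEL_UNITS

print(f"Monthly saving with six-month term: {discount:.1%}")
print(f"Total six-month commitment: ${six_month_total:,}")
print(f"One-month term, cost per model unit: ${per_unit_monthly:,.0f}")
```

The longer term lowers the monthly bill by well over a third, but it also locks in a total commitment several times larger than a single month at the higher rate.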

Model customization methods

It’s worth discussing ways of customizing models in general at this point. Below we’ll talk specifically about the customization methods implemented in Amazon Bedrock.

Prompt engineering, as shown above, is one of the simplest ways to customize a generative AI model. Typically, models accept two prompts, a user prompt and a system or instruction prompt, and generate an output. You normally change the user prompt all the time, and use the system prompt to define the general characteristics you want the model to take on. Prompt engineering is often sufficient to define the way you want a model to respond for a well-defined task, such as generating text in specific styles by presenting sample text or question-and-answer pairs. You can easily imagine creating a prompt for “Talk Like a Pirate Day.” Ahoy, matey.



Retrieval-augmented generation helps to ground LLMs with specific sources, often sources that weren’t included in the models’ original training. As you might guess, RAG’s three steps are retrieval from a specified source (the knowledge base in Amazon Bedrock parlance), augmentation of the prompt with the context retrieved from the source, and then generation using the model and the augmented prompt.
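The three steps map naturally onto three small functions. In this sketch, retrieval is a toy word-overlap ranking rather than a vector search, the knowledge base passages are invented, and the model is a stand-in callable rather than a Bedrock model.

```python
def retrieve(query, knowledge_base, top_n=2):
    """Rank passages by how many query words they share (a toy stand-in
    for a vector search against a knowledge base)."""
    query_words = set(query.lower().split())
    ranked = sorted(
        knowledge_base,
        key=lambda passage: len(query_words & set(passage.lower().split())),
        reverse=True,
    )
    return ranked[:top_n]

def augment(query, passages):
    """Build a prompt that places the retrieved context ahead of the question."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

def generate(prompt, model):
    """Call the model; here `model` is any function from prompt to text."""
    return model(prompt)

def rag_answer(query, knowledge_base, model):
    return generate(augment(query, retrieve(query, knowledge_base)), model)

# Toy passages and a stand-in model that only reports what it received
knowledge_base = [
    "Provisioned throughput is purchased in model units.",
    "Knowledge bases ingest documents from an S3 bucket.",
    "Playgrounds let you try prompts in the console.",
]
question = "Where do knowledge bases ingest documents from?"
echo_model = lambda prompt: f"(model received {len(prompt)} characters)"

print(augment(question, retrieve(question, knowledge_base)))
print(rag_answer(question, knowledge_base, echo_model))
```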

RAG procedures often use embedding to limit the length and improve the relevance of the retrieved context. Essentially, an embedding function takes a word or phrase and maps it to a vector of floating point numbers; these are typically stored in a database that supports a vector search index. The retrieval step then uses a semantic similarity search, typically using the cosine of the angle between the query’s embedding and the stored vectors, to find “nearby” information to use in the augmented prompt. Search engines usually do the same thing to find their answers.
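Cosine similarity itself takes only a few lines. The three-dimensional vectors below are made up for illustration; real embedding models produce vectors with hundreds or thousands of dimensions, and a vector database would use an index instead of this linear scan.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def nearest(query_vector, store, top_n=1):
    """Names of the stored vectors most similar to the query."""
    ranked = sorted(
        store.items(),
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:top_n]]

# Hypothetical embeddings
store = {
    "horses": [0.9, 0.1, 0.0],
    "unicorns": [0.7, 0.1, 0.6],
    "invoices": [0.0, 0.2, 0.9],
}
query_vector = [0.8, 0.2, 0.1]

print(nearest(query_vector, store, top_n=2))
```

Because cosine similarity depends only on direction, a short query and a long passage can still score as close neighbors if they point the same way in the embedding space.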

Agents, aka conversational retrieval agents, expand on the idea of conversational LLMs with some combination of tools, running code, embeddings, and vector stores. In other words, they are RAG plus additional steps. Agents often help to specialize LLMs to specific domains and to tailor the output of the LLM. Azure Copilots are usually agents; Google and Amazon use the term agents. LangChain and LangSmith simplify building RAG pipelines and agents.

Fine-tuning large language models is a supervised learning process that involves adjusting the model’s parameters to a specific task. It’s done by training the model on a smaller, task-specific data set that’s labeled with examples relevant to the target task. Fine-tuning often takes hours or days using many server-level GPUs and requires hundreds or thousands of tagged exemplars. It’s still much faster than extended pre-training.

Pre-training is the unsupervised learning process on huge text data sets that teaches LLMs the basics of language and creates a generic base model. Extended or continued pre-training adds unlabeled domain-specific or task-specific data sets to the base model to specialize the model, for example to add a language, add terms for a specialty such as medicine, or add the ability to generate code. Continued pre-training (using unsupervised learning) is often followed by fine-tuning (using supervised learning).

Customizing models in Amazon Bedrock with fine-tuning and continued pre-training

Both fine-tuning and continued pre-training tend to be expensive and lengthy processes. Even preparing the data for these can be a challenge. For fine-tuning, the challenge is getting the tagging done within budget. For continued pre-training, the challenge is to find a data set for your domain of interest that doesn’t introduce biases or toxicity of any kind.

Amazon Bedrock can create custom models by continued pre-training and/or with fine-tuning. You can manage your models and training jobs from this screen. Note the requirement for purchasing provisioned throughput to deploy your custom model.

Creating a fine-tuning job in Amazon Bedrock. Note that only certain models can currently be fine-tuned: four Amazon models, two Cohere models, and two Meta models.

You can manage your custom model training jobs as well as your custom models in Amazon Bedrock. Note the three status codes for jobs: failed, stopped, and complete. Only completed jobs will get a link from their custom model name. All jobs get links from their job names.

Digging into a training job detail in Amazon Bedrock shows you its source model, when it was started, its status, and various parameters and hyperparameters.

Once you have completed customizing your models in Amazon Bedrock you can manage them on the models tab. You can provision them, open them in the playground, delete them, and open their details.

Model details look similar to training job details in Amazon Bedrock, with a few differences, such as offering purchase and management of provisioned throughput.

While the setup of a continued pre-training job looks similar to the setup of a fine-tuning job, they have some major differences. Continued pre-training is an unsupervised learning job that needs a lot of untagged data and a lot of time. Fine-tuning is a supervised learning job that needs less data (but tagged!) and less time.
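The difference between tagged and untagged data shows up clearly in the training files. This sketch writes both kinds as JSON Lines; the field names ("prompt" and "completion" for fine-tuning, "input" for continued pre-training) are assumptions to check against the current Bedrock documentation, and the records themselves are invented.

```python
import json

def to_jsonl(records):
    """Serialize a list of dictionaries as JSON Lines, one record per line."""
    return "\n".join(json.dumps(record) for record in records)

# Supervised: each example pairs an input with the desired (tagged) output
fine_tuning_records = [
    {"prompt": "Classify the sentiment: 'Great battery life.'", "completion": "positive"},
    {"prompt": "Classify the sentiment: 'Screen cracked in a week.'", "completion": "negative"},
]

# Unsupervised: raw domain text with no labels
pretraining_records = [
    {"input": "Equine ungulates include horses, zebras, and donkeys."},
]

print(to_jsonl(fine_tuning_records))
print(to_jsonl(pretraining_records))
```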

Low-code generative AI using PartyRock

To accompany Amazon Bedrock, AWS has released a mostly free low-code platform for learning generative AI and building small AI apps. The introductory PartyRock blog post is by Jeff Barr, and tells you enough that you can dive in yourself; it also supplies links to PartyRock learning resources near the end of the post. If you don’t want to build an app yourself, you can still play with the apps others have built.

Generative AI app building on Bedrock

Amazon Bedrock is a credible competitor to Azure AI Studio. If you’re already committed to AWS rather than Microsoft Azure or Google Cloud, then Bedrock will certainly be a good choice for building and scaling generative AI applications. Bedrock offers fewer foundation models than Azure AI Studio, and furthermore lacks access to any OpenAI models, but it should do the job for most generative AI apps. Bedrock is currently a little behind Azure AI Studio when it comes to content filters, but that could easily change in the future.

Note that the cost of deploying generative AI apps tends to dwarf the cost of developing them. The cost of using Amazon Bedrock to do prompt engineering and develop RAG apps tends to be low (ignoring the people costs), and the cost of testing these in the Bedrock playgrounds is usually negligible. The cost of fine-tuning tends to be something that might give small companies pause. The cost of continued pre-training may also give medium-size companies pause. But deploy an app with a customized model at scale sufficient to serve a large audience with low lag for a long period of time, and soon you’re talking about real money.

Cost: Pricing is based on the model, the volume of input tokens and output tokens, and on whether you have purchased provisioned throughput for the model. For more information, see the Model providers page in the Amazon Bedrock console.
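For on-demand usage, a back-of-the-envelope estimate needs only the token counts and the per-token prices. The prices in this sketch are hypothetical placeholders, not actual Bedrock rates; substitute the figures listed for your chosen model.

```python
def on_demand_cost(input_tokens, output_tokens, price_in_per_1k, price_out_per_1k):
    """Estimated cost in USD when input and output tokens are billed
    at separate per-1,000-token rates."""
    return (
        input_tokens / 1000 * price_in_per_1k
        + output_tokens / 1000 * price_out_per_1k
    )

# Hypothetical monthly workload and hypothetical prices
monthly_cost = on_demand_cost(
    input_tokens=2_000_000,
    output_tokens=500_000,
    price_in_per_1k=0.001,
    price_out_per_1k=0.002,
)
print(f"Estimated monthly cost: ${monthly_cost:.2f}")
```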

Platform: Browser-based, hosted on AWS. API access available.

Google Project IDX: A promising next-generation cloud IDE

Posted on 29 January, 2024

In August 2023, a small group of Google development and UX leads bewailed the difficulty of setting up a development environment for multiplatform and full-stack apps, and offered their take on an experimental prototype intended to solve the issues. Difficulty setting up technology stacks for development is not a new problem. It has been an issue since at least the early 1980s, when personal computers became available. 

Project IDX is a browser-based development environment built on Code OSS and powered by Codey, a generative AI foundation model trained on code and built on PaLM 2. Project IDX is designed to make it easier to build, manage, and deploy full-stack web and multiplatform applications, using popular frameworks and languages.

Code OSS is the fully open-source version of Microsoft’s Visual Studio Code. The latter has a few proprietary additions, despite being free software.

At the time of its announcement in August, Project IDX was only available through a waitlist sign-up; my application was finally approved in December. Project IDX is still very much a rough-edged preview, but has an interesting design and some utility, even if it’s not yet intended for use in a production environment.

There are several products that compete with Project IDX at some level. These include AWS Cloud9, Gitpod, Online IDE, Replit, StackBlitz, Eclipse Che, Codeanywhere, and GitHub Codespaces.

Feels like Visual Studio Code

There are a number of features that make Project IDX look promising despite its rough edges and its feel of being under construction. For starters, it’s actually a familiar environment for anyone who uses Visual Studio Code. As I understand it, the portions of VS Code that aren’t included in Code OSS are the Microsoft-specific customizations, which don’t matter too much in this context.

Some of those customizations are replaced by the IDX AI powered by Codey. The IDX AI provides code suggestions as you type and offers an AI-powered code chat you can ask for help with your code, to generate new code, to translate code to another language, to explain code, and to write unit tests. Supposedly, IDX AI also highlights possible license requirements based on AI-generated code, although I haven’t seen that pop up.

Project IDX will feel familiar because of its similarity to VS Code. The top left “hamburger” menu replaces the top row menu in VS Code, and offers most of the same menu items when it pops out. The icons in the vertical row below that control the contents of the next column to the right, currently showing the file explorer, the code outline for the current file, the timeline for the current file, and the dependencies for the app. The large editing pane currently showing main.dart can display up to four tabs. The preview window to the right can also display the IDX AI pane and additional code file tabs. The large area at the bottom right displays code problems, output, a debug console, and a terminal.

Runs in a cloud workstation

The IDX Code OSS editor runs in a Google Cloud VM, called a Cloud Workstation. Normally, Cloud Workstation time is billed per hour at a rate that varies with the size of the machine type, from $0.16/hour to $9.36/hour. Project IDX is currently free.

Normally, Cloud Workstations support a variety of popular IDEs and Duet AI. Project IDX supports only Code OSS, and Codey instead of Duet. (I can’t tell you the difference between Duet AI and Codey in practice, although it might be an interesting comparison to investigate.) Cloud Workstations can normally run inside your private network and in your staging environment. Project IDX is currently restricted to its own environment.

Supports many languages and frameworks

You can create projects in Project IDX with built-in templates and GitHub imports. The templates support the JavaScript, TypeScript, and Dart languages and the Angular, React, Next.js, Vue, Svelte, and Flutter frameworks. In the future, Project IDX is due to support Python, Go, and “AI.” You can optionally use Nix to customize your workspace.

This menu offers you your initial choice of the kind of app you’ll generate or import. Each item (other than the “coming soon” group at the bottom) opens a secondary screen for specifying your app framework and naming your app.

The second-level screen for generating a new web app currently offers six choices: Angular, React, Next.js, Vue, Svelte, or a blank app, which implies writing your own HTML, JavaScript/TypeScript, and CSS. The Nix file is what you use to customize a workspace.

Integrates with Git and GitHub

GitHub imports can be of three types: web, Flutter, and “other,” which currently appears to mean JavaScript/TypeScript frameworks other than those explicitly listed. The frameworks explicitly supported include Angular, React, Next.js, Vue, and Svelte. 

If your GitHub project has JavaScript dependencies, you can run npm install in your IDX terminal window after your import completes. You can also turn your project into a Git repository from within IDX and sync that with GitHub.

Project IDX integrates well with Git and GitHub. At left, you can see the options to initialize a Git repository and publish it to GitHub.

Once you have created a repo and authenticated to GitHub, Project IDX can push the repo to GitHub. Here you can see the typical GitHub display of the file generated for the app by Angular.

Previews, deploys, and shares apps

In addition to a web preview, Project IDX presents previews in Android emulators and iOS simulators, where supported by the underlying template. All three work for a Flutter app. Only two, web preview and iOS simulator, work for an Angular app, since a stock Angular app isn’t native unless you add something like Ionic or NativeScript.

You can deploy directly from your workspace to Firebase hosting. On an experimental basis, you can share your workspace with complete shared access.

Project IDX comes with pre-installed extensions for the languages and frameworks it supports. It is supposed to support additional extensions that are available from OpenVSX, although I can’t confirm whether all of those work at this point—there are just too many (over 3,000) to check.

One current major limitation of Project IDX is that only two projects are allowed at once. You can get around this by saving projects to GitHub and juggling which you have open in IDX.

Note that there are numerous bug reports beyond the list in the FAQ.

The Flutter app reported two setup errors. Here I am trying to resolve one of them with the help of IDX AI. Unfortunately, the AI’s advice to use sudo apt-get to install Chrome turned out to be useless, since the IDX VM does not currently include either sudo or apt-get. I won’t call that a hallucination, since those utilities might be planned for a future version.

Lives in the Google Cloud

Project IDX shows a lot of promise. It’s visually similar to Visual Studio Code for the Web (which, sadly, lacks a terminal and debugger). It’s both visually and functionally similar to GitHub Codespaces and Gitpod, and it’s functionally similar to Eclipse Che.

One reason you might prefer Project IDX to any of those would be its hosting in a Google Cloud Workstation, which is a big advantage if you want to integrate with any Google Cloud services, or with other programs you have running in the Google Cloud. On the other hand, if your existing code runs on AWS, you might want to consider using AWS Cloud9.

My biggest concern about making a commitment to Project IDX would be Google’s long history of killing its projects and services. Remember Google+? Freebase? The Google Search Appliance? Polymer? Google Domains? All ex-parrots, they’ve rung down the curtain and joined the choir invisible.

Nevertheless, Project IDX has its attractions. As long as you create a GitHub repository from your workspace and keep it current, it’s certainly worth a try.

Cost: Free preview

Platform: Browser-based, hosted on Google Cloud

Azure AI Studio: A nearly complete toolbox for AI development

Posted on 22 January, 2024

On November 15, 2023, Microsoft announced Azure AI Studio, a new platform for generative AI application development, using OpenAI models such as GPT-4, as well as models from Microsoft Research, Meta, Hugging Face, and others. The motivation for the product, Microsoft said, is that “navigating the complexities of prompt engineering, vector search engines, the retrieval-augmented generation (RAG) pattern, and integration with Azure OpenAI Service can be daunting.”

It turns out that Azure AI Studio is a nice system for picking generative AI models, for grounding them with RAG using vector embeddings, vector search, and data, and for fine-tuning those models, all to create AI-powered copilots, or agents. It’s the “basement-level” tool for creating copilots, aimed at experienced developers and data scientists, while Microsoft’s Copilot Studio is a “2nd-floor level” low-code tool for customizing chatbots.

Azure AI Studio has competition from the usual suspects, plus a few you might not already know about. Amazon Bedrock competes with Azure AI Studio, and Amazon Q competes with Microsoft Copilots. Bedrock offers a catalog of foundation models, RAG and embeddings, knowledge bases, fine-tuning, and continued pre-training to build generative AI applications.

There’s a somewhat competing experiment from Google, called NotebookLM, which “only” lets you provide documents (Google Docs, PDFs, and pasted text) for RAG against one large language model. I put “only” in quotes because using RAG against one good model is often enough to produce a good generative AI application. Google has a long history of killing its experiments, so I’m not taking any bets on whether or how NotebookLM will become a product.

Google does have a professional product in this space. Google Vertex AI’s Generative AI Studio allows you to tune foundation models with your own data, using tuning options such as adapter tuning and reinforcement learning from human feedback (RLHF), or style and subject tuning for image generation. That complements the Vertex AI model garden and foundation models as APIs.

If you can write a little Python, JavaScript, or Go, you can accomplish many of the same things you can with Azure AI Studio—or possibly more—with LangChain and LangSmith. You can also accomplish some of the same things with Poe, which has a good selection of models and lets you customize bots with plain-text prompts as well as with code.

Azure AI Studio model catalog

Azure AI Studio hosts AI models from Microsoft Research, OpenAI, Meta, Hugging Face, and Databricks, as well as NVIDIA base models, so that you can find the current best model for your application, or at least one that works well enough. In addition, Azure AI Studio offers half a dozen Azure OpenAI language models, some of which have fine-tuning capabilities.

In general, the OpenAI models are offered “as a service,” meaning that they are deployed in a model pool with its own GPUs. When you provision them, you get an inference endpoint in your own subscription and possibly the ability to use them in fine-tuning and evaluation jobs. We’ll discuss fine-tuning when we talk about model customization below.

The Azure AI Studio model catalog has a wide selection of models from multiple vendors, including OpenAI, NVIDIA, Meta, Hugging Face, Databricks, and Microsoft Research. The models are classified by their inference skills as well as by their creators.

Azure AI Studio model benchmarks

Not every generative AI model has the same capabilities or performance. Historically, better models have been priced higher, but recently some free open-source models have exhibited excellent performance on common tasks.

There are a number of standard benchmarks for LLMs in particular, whose output is easier to score automatically than that of models that generate media. As you can see in the chart below, GPT-4 32K is the current champion among installed models on Azure for most accuracy benchmarks, but bear in mind that the LLM performance picture changes on an almost daily basis.

As I write this, Google claims that its new Gemini model surpasses GPT-4. I haven’t been able to test it to know whether that’s true. Apparently, the “really good” Ultra version of Gemini won’t be available until next year. The Pro version I did test is roughly at the level of GPT-3.5.

In addition, at least three competitive small language models have been released recently. They include Starling-LM-7B, which uses reinforcement learning from AI feedback (RLAIF), from UC Berkeley.

Azure AI Studio model benchmarks. Here we are comparing the model accuracy of four LLMs, GPT-3.5 Turbo, GPT-4 32K, Llama 2 70b, and Llama 2 70b chat, for question answering and text generation. Unsurprisingly, GPT-4 32K, the largest and most expensive model considered, came out on top. Note that chat models, which are optimized for interactive use, are not expected to outperform non-chat models on completion tasks.

Model as a service vs. model as a platform

Azure AI Studio offers models through two mechanisms: model as a service (MaaS), and model as a platform (MaaP). Model as a service means that you access the model through an API, and typically pay for usage as you go; the model itself lives in a central pool where it has access to GPUs. The Azure OpenAI models are all available as MaaS, which makes sense since they require so much GPU capacity to run. As I write this, six Meta Llama 2 models just became available as MaaS.

Model as a platform means that you deploy the model into VMs that belong to your Azure subscription. When I tried this I was deploying a Mistral 7B model to a single VM of type Standard_NC24ads_A100_v4, which has 24 vCPUs, 220.0 GiB of memory, one NVIDIA A100 PCIe GPU, and uses third-generation AMD EPYC 7V13 (Milan) processors. I wasn’t impressed by the ungrounded inference results from Mistral 7B on my custom prompts—the right answer was in there, but surrounded by irrelevant hallucinations—although I imagine I could fix that with prompt engineering and/or RAG. (See the “Model customization methods” section below.) There has been speculation that Mistral 7B was trained on benchmark test data, which could explain why it goes off the rails more than you would expect from its benchmark scores.

I’ve heard claims that the new Mixtral 8x7B eight-way mixture-of-experts model is much better, but it wasn’t available in the Azure AI Studio catalog when I was testing. GPT-4 is supposedly also an eight-way mixture-of-experts model, but it’s much bigger; OpenAI hasn’t yet confirmed how the model was built.

If your Azure account/subscription/region doesn’t have any GPU quotas, you can still deploy a generative AI model as a platform with shared GPU capacity. The trade-off for this is that shared GPU capacity is only good for a limited time, variously quoted as 24 or 168 hours. This is considered a stopgap until your cloud administrator can arrange some GPU quota for you.

Azure AI Studio model filtering criteria

Azure AI Studio can filter models by collections, the inference tasks they support, and the fine-tuning tasks they support. Currently there are eight collections, mostly representing model sources, such as Azure OpenAI, Meta, and Mistral AI. Currently there are 20 inference tasks, including text generation, question answering, embeddings, translation, and image classification. And there are 11 fine-tuning tasks, all drawn from the inference task list, but not including embeddings, which is more of an intermediate tool for implementing retrieval-augmented generation.

Azure AI Studio model filters. These were captured from a staging version of the product in December and are likely to change over time.

Model customization methods

It’s worth discussing ways of customizing models in general at this point. In the following section, you’ll see the tools and components in Azure AI Studio.

Prompt engineering is one of the simplest ways to customize a generative AI model. Typically, models accept two prompts, a user prompt and a system prompt, and generate an output. You normally change the user prompt all the time, and use the system prompt to define the general characteristics you want the model to take on.

Prompt engineering is often sufficient to define the way you want a model to respond for a well-defined task, such as generating text in specific styles. The image below shows the Azure AI Studio sample prompt for a Shakespearean writing assistant. You can easily imagine creating a similar prompt for “Talk Like a Pirate Day.” Ahoy, matey.
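To make the split between the two prompts concrete, here is a minimal Python sketch. It assumes the role/content message list that many chat-style APIs accept; the `build_messages` helper and the prompt text are made up for illustration and are not copied from the Azure AI Studio sample.

```python
def build_messages(system_prompt, user_prompt):
    """Pair a fixed system prompt with a per-request user prompt."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]


# The system prompt stays the same across requests (illustrative wording).
SYSTEM_PROMPT = "You are a writing assistant who answers in a Shakespearean style."

# The user prompt changes with every request.
messages = build_messages(SYSTEM_PROMPT, "Write a short greeting for a birthday card.")
for message in messages:
    print(f"{message['role']}: {message['content']}")
```

Swapping the system prompt for a pirate persona would change the style of every answer without touching the user prompts.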

LLMs often have hyperparameters that you can set as part of your prompt. Hyperparameter tuning is as much a thing for LLM prompts as it is for training machine learning models. The usual important hyperparameters for LLM prompts are temperature, context window, maximum number of tokens, and stop sequence, but can vary from model to model.

The temperature controls the randomness of the output; depending on the model it can range from 0 to 1 or 0 to 2. Higher temperature values ask for more randomness. In some models, 0 means “set the temperature automatically.” In other models, 0 means “no randomness.”
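One common way temperature is applied is to divide the model's raw token scores (logits) by the temperature before converting them to probabilities. The sketch below illustrates that idea with made-up scores; it treats 0 as "no randomness," which is an assumption that matches only some models, and `token_probabilities` is a hypothetical helper rather than any vendor's API.

```python
import math


def token_probabilities(logits, temperature):
    """Turn raw scores (logits) into probabilities at a given temperature."""
    if temperature < 0:
        raise ValueError("temperature must be non-negative")
    if temperature == 0:
        # Assumption: treat 0 as "no randomness" and always pick the top score.
        best = logits.index(max(logits))
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the peak for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


logits = [2.0, 1.0, 0.1]  # made-up scores for three candidate tokens
for temperature in (0, 0.2, 0.7, 1.5):
    probs = token_probabilities(logits, temperature)
    print(temperature, [round(p, 3) for p in probs])
```

Lower temperatures concentrate probability on the top-scoring token, while higher temperatures spread it across the alternatives.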

The context window controls the number of preceding tokens (words or subwords) that the model takes into account for its answer. The maximum number of tokens limits the length of the generated answer. A stop sequence tells the model to end its output as soon as it generates that sequence, which is handy for cutting a response off at a known delimiter.
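The effect of the context window and the maximum number of tokens can be sketched with plain lists. This is a simplification that uses whitespace-separated words as stand-ins for tokens; real models use subword tokenizers, and both helper names are hypothetical.

```python
def fit_to_context_window(tokens, context_window):
    """Keep only the most recent tokens that fit in the window."""
    if context_window <= 0:
        return []
    return tokens[-context_window:]


def cap_output(tokens, max_tokens):
    """Cut a generated answer off at the maximum number of tokens."""
    return tokens[:max(0, max_tokens)]


# Words stand in for tokens here, which is an assumption for readability.
history = "the quick brown fox jumps over the lazy dog".split()
print(fit_to_context_window(history, 4))  # the model sees only the last four
print(cap_output(["To", "be", "or", "not", "to", "be"], 3))
```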

Retrieval-augmented generation, or RAG, helps to ground LLMs with specific sources, often sources that weren’t included in the models’ original training. As you might guess, RAG’s three steps are retrieval from a specified source, augmentation of the prompt with the context retrieved from the source, and then generation using the model and the augmented prompt.
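The three steps can be sketched in a few lines of Python. This is a toy under stated assumptions: retrieval is a simple word-overlap ranking rather than a real search, the documents are invented, and `placeholder_model` stands in for a call to an actual LLM.

```python
import re


def words(text):
    """Lowercase a string and split it into alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, documents, top_k=1):
    """Step 1: rank documents by how many words they share with the query."""
    query_words = words(query)
    ranked = sorted(documents, key=lambda d: len(query_words & words(d)), reverse=True)
    return ranked[:top_k]


def augment(query, context_docs):
    """Step 2: add the retrieved context to the prompt."""
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


def generate(prompt, model):
    """Step 3: hand the augmented prompt to the model."""
    return model(prompt)


def placeholder_model(prompt):
    # Hypothetical stand-in; a real app would call an LLM endpoint here.
    return f"[model output for a {len(prompt)}-character prompt]"


documents = [
    "Shipping takes five business days.",
    "The refund window is 30 days from purchase.",
    "Support is open on weekdays.",
]
question = "What is the refund window?"
prompt = augment(question, retrieve(question, documents))
print(prompt)
print(generate(prompt, placeholder_model))
```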

RAG procedures often use embedding to limit the length and improve the relevance of the retrieved context. Essentially, an embedding function takes a word or phrase and maps it to a vector of floating point numbers. These are typically stored in a database that supports a vector search index. The retrieval step then uses a semantic similarity search, typically using the cosine of the angle between the query’s embedding and the stored vectors, to find “nearby” information to use in the augmented prompt. Search engines usually do the same thing to find their answers.
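Cosine similarity itself is simple to compute. The sketch below uses three-dimensional, hand-written vectors purely for illustration; real embeddings come from an embedding model and have hundreds or thousands of dimensions, and the dictionary here stands in for a vector search index.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest(query_vector, index, top_k=1):
    """Return the names of the stored vectors most similar to the query."""
    ranked = sorted(
        index.items(),
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:top_k]]


# Made-up vectors; a real index would store embeddings of document chunks.
index = {
    "refund policy": [0.9, 0.1, 0.0],
    "shipping times": [0.1, 0.9, 0.1],
    "office hours": [0.0, 0.2, 0.9],
}
print(nearest([0.8, 0.2, 0.1], index))
```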

Agents, aka conversational retrieval agents, expand on the idea of conversational LLMs with some combination of tools, running code, embeddings, and vector stores. In other words, they are RAG plus additional steps. Agents often help to specialize LLMs to specific domains and to tailor the output of the LLM. Azure Copilots are usually agents; Google and Amazon use the term agents. LangChain and LangSmith simplify building RAG pipelines and agents.

Fine-tuning large language models is a supervised learning process that involves adjusting the model’s parameters to a specific task. It’s done by training the model on a smaller, task-specific data set that’s labeled with examples relevant to the target task. Fine-tuning often takes hours or days using many server-level GPUs and requires hundreds or thousands of tagged exemplars. It’s still much faster than extended pretraining.

LoRA, or low-rank adaptation, is a method that freezes the pretrained weight matrix and represents the update to it as the product of two much smaller matrices. This approximates full supervised fine-tuning in a more parameter-efficient manner. The original Microsoft LoRA paper was published in 2021. A 2023 quantized variation on LoRA, QLoRA, reduces the amount of GPU memory required for the tuning process. LoRA and QLoRA typically reduce the number of tagged exemplars and time required compared to standard fine-tuning.
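The parameter savings are easy to see with some arithmetic. In the LoRA paper's formulation, the pretrained weights stay frozen and only two small matrices, B and A, are trained, with the effective weight being W + BA. The sketch below counts trainable parameters and applies a tiny update in pure Python; the layer size and rank are illustrative assumptions, not figures from any particular model.

```python
def lora_parameter_counts(d_out, d_in, rank):
    """Trainable parameters: full fine-tuning vs. a rank-r LoRA update."""
    full = d_out * d_in
    low_rank = rank * (d_out + d_in)  # B is d_out x r, A is r x d_in
    return full, low_rank


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def apply_lora(weight, b_matrix, a_matrix):
    """Effective weight: frozen W plus the low-rank product B @ A."""
    update = matmul(b_matrix, a_matrix)
    return [[w + u for w, u in zip(w_row, u_row)] for w_row, u_row in zip(weight, update)]


full, low_rank = lora_parameter_counts(4096, 4096, 8)  # assumed sizes
print(full, low_rank, full // low_rank)

# A 2x2 toy layer with a rank-1 update.
print(apply_lora([[1, 0], [0, 1]], [[1], [2]], [[0.5, 0.5]]))
```

With these assumed sizes, the low-rank update trains 256 times fewer parameters than full fine-tuning of the same layer.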

Pretraining is the unsupervised learning process on huge text data sets that teaches LLMs the basics of language and creates a generic base model. Extended or continued pretraining adds unlabeled domain-specific or task-specific data sets to the base model to specialize the model, for example to add a language, add terms for a specialty such as medicine, or add the ability to generate code. Continued pretraining (using unsupervised learning) is often followed by fine-tuning (using supervised learning).

Prompt engineering. This is an Azure AI Studio prompt sample for a Shakespearean writing assistant. There are five parts to the prompt: the modality, the task, the system message, a sample user message, and a sample desired response.

Azure AI Studio tools and components

Earlier in this review, you saw the Azure AI Studio model catalog and model benchmarks. In addition to those, in its Explore tab, Azure AI Studio offers speech, vision, and language capabilities, responsible AI, and prompt samples, such as the Shakespearean writing assistant you saw in the previous section.

In its Build tab, Azure AI Studio offers the Playground, Evaluation, Prompt Flow, Custom Neural Voice, and Fine-tuning tools, and components for Data, Indexes, Deployments, and Content Filters. In the Manage tab, you can see your resources, and (at least on the staging site) your quotas for each subscription and region.




Azure AI Studio includes Cognitive Service speech capabilities for building voice-enabled apps. Note that these are voice-specific models, not generative AI. The prebuilt voice services have links to samples you can run. The custom models have links to instructions for getting started, which may also have samples you can run.

The speech services include captioning, speech analytics, speech to text, translation with speech to text, and text to speech with pretrained and custom neural voices. The neural voices are very high quality, to the point where customers might not realize that they are AI-generated. The pretrained voice gallery currently includes 478 voices across 148 languages and variants; some of the voices can speak over 40 languages.

Azure AI Studio speech capabilities for building voice-enabled apps. These are Cognitive Services, not generative AI. The prebuilt services have links to samples you can run. The custom models have links to instructions for getting started, which may also have samples you can run.


Azure AI Studio also includes vision services. They add the ability to read text, analyze images, and detect faces to your app using machine learning and OCR, not generative AI.

Azure AI Studio vision services. As with the speech services, these are Cognitive Services, not generative AI.


Azure AI Studio unifies three individual language services in Azure AI services—Text Analytics, QnA Maker, and Language Understanding (LUIS). I honestly don’t know whether the services use the machine-learning-based language models that Microsoft has refined over the years, or new generative AI models. In either case, these models allow you to classify and summarize documents, get real-time translations, or integrate language into your bot experiences.

Azure AI Studio language services. These include pre-built, task-optimized language models and the ability to train your own custom model for a variety of tasks.

Responsible AI

The most current iteration of Azure’s responsible AI solution is the Content Safety Studio, shown in the first screenshot below. You can use it to moderate text and image content, filter generative AI for jailbreak risk, construct metaprompts for safety, detect protected material, and monitor online activity and data.

You can set the safety levels of a model with a content filter when you deploy the model, as shown in the second screenshot below.

Content safety is currently the only category listed under Responsible AI in Azure AI Studio. It includes options for moderating text, image, and multimodal content as well as safety solutions for generative AI, monitoring online activity, and building a custom moderation solution.

When you configure a content filter for an AI model in Azure AI Studio, you can adjust its sensitivity to both input and output material that includes violence, hate, sexual content, and self-harm.

Prompt samples

There are currently 25 prompt samples displayed in the Prompts section. Several are quite interesting. I recommend that you examine the Apple Cycle Analyst prompt to see how you’d teach an LLM how to interpret images, and the Chain of Thought Reasoning sample to see how to teach an LLM to solve basic arithmetic word problems. Without Chain of Thought guidance, most LLMs fail spectacularly on that kind of problem.

I’ve included a Shakespearean sonnet about daylight savings time that GPT-3.5 Turbo 16k and I generated after a few iterations on the user prompt. It uses the same system message you saw above to define the Shakespearean style. I didn’t have to explain the sonnet form.

Currently, Azure AI Studio features 25 prompt samples, including samples with text and image input.

This is a chat session and result from the Shakespearean writing assistant prompt we saw earlier in the “Model customization methods” section. A temperature of 0.7 is reasonable for generating creative material.


The Shakespearean sonnet example you just saw, and all the prompt samples I tried, open in Azure AI Studio’s Playground tool. If you prefer to work in code, you can use the link at the top right to open your project in Visual Studio Code (Web), which is nearly identical to Visual Studio Code on the desktop. The Playground is the most useful tool in Azure AI Studio as long as you’re only doing prompt engineering and hyperparameter tuning.


The Evaluation tool lets you run your language models and evaluate them against industry-standard metrics, then choose the best version for your needs. The metrics used are groundedness, coherence, fluency, relevance, and GPT similarity. To perform an evaluation, you first need to create a runtime. You might want to get here via the Playground and Prompt Flow.

Prompt Flow

Prompt Flow is the place you’d go from the Playground to enhance your model into an app with RAG, content filters, embedding, code, custom voice output, and fine-tuning. If you look at the files at the upper right of the screen, you’ll see Jinja, YAML, and text files that define the prompt, the flow of execution, and any requirements you want to add. (Jinja is an open source web template engine for the Python programming language. YAML is a data serialization language used for configuration files.)

The Prompt Flow screen in Azure AI Studio is your easy entry into heavy-duty AI app engineering. Prompt Flow is also available separately as an open-source project on GitHub, with its own SDK and Visual Studio Code extension.

Azure AI Studio’s Prompt Flow tool. I got here from the Playground, with the Shakespearean writing assistant sample prompt open. From here you can enhance the app to use data, use a vector index, do fine-tuning, do evaluation, and deploy the model.

Custom Neural Voice

Custom Neural Voice is a limited-access platform (you have to apply for permission to use it) that allows you to create a new AI voice for your application. You can design your unique voice persona and efficiently manage voice talents, data sets, models, test runs, and endpoint connections.


In the preview period, which is in effect as of this writing, the Fine-tuning tool supports only Llama 2 models, and only in projects located in the West US 3 region.


You can connect Azure AI Studio to data in Azure Blob Storage, Azure Data Lake Storage Gen 2, or Microsoft OneLake. Data can be in a single file or a folder. You can also upload data files.

You can use your own data to implement RAG (see the “Model customization methods” section above) to ground your model as long as it isn’t too long. The total data length needs to be smaller than the model’s context size; otherwise, you’ll need to use an embedding and a vector search index.
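The length check amounts to a simple comparison, sketched below. The four-characters-per-token estimate and the 500 tokens reserved for the answer are rough assumptions for illustration; a real application would count tokens with the model's own tokenizer, and both helper names are hypothetical.

```python
def estimate_tokens(text, chars_per_token=4):
    """Rough token estimate, rounded up (assumes about 4 characters per token)."""
    return -(-len(text) // chars_per_token)


def choose_grounding_strategy(documents, context_window, reserved_for_answer=500):
    """Put the data in the prompt if it fits; otherwise use a vector index."""
    total = sum(estimate_tokens(doc) for doc in documents)
    budget = context_window - reserved_for_answer
    return "inline" if total <= budget else "vector_index"


short_docs = ["A one-page FAQ. " * 25]   # 400 characters
long_docs = ["A long manual. " * 4000]   # 60,000 characters
print(choose_grounding_strategy(short_docs, context_window=4096))
print(choose_grounding_strategy(long_docs, context_window=4096))
```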

In addition, you can use image files (up to 16 MB each) for GPT-4 Turbo with Vision, from the Playground. Putting the images in a Blob Storage or Data Lake folder lets you give the model a URL and avoid uploading the images individually to the Playground.


Vector indexes using embeddings and Azure AI Search (vector search) make finding relevant data more efficient, and avoid the context length problem when implementing RAG. You can connect to the data in Azure Blob Storage, Azure Data Lake Storage Gen 2, or Microsoft OneLake when you create your index, or use data you’ve already uploaded in the data section.


Azure AI Studio supports deploying large language models, flows, and web apps. You can deploy models as a service (MaaS), or models as a platform (MaaP), as discussed above.

Flows are generative AI apps consisting of a sequence of tools, including models, your own data, and possibly embeddings, vector database lookup, and custom connections. When you deploy a flow, you create an endpoint for an AI service. You can also deploy a web app that uses your AI service.
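Conceptually, a flow is a pipeline in which each tool takes the running state and adds to it. The sketch below models that idea in plain Python. It is not the Prompt Flow SDK or its file format; every function name is hypothetical, a dictionary stands in for the data lookup, and the model call is a placeholder.

```python
def run_flow(steps, state):
    """Run each tool in order, passing the accumulated state along."""
    for step in steps:
        state = step(state)
    return state


def lookup_data(state):
    """Tool 1: find grounding text (a dict stands in for a vector index)."""
    facts = {"refund": "Refunds are accepted within 30 days."}
    question = state["question"].lower()
    hits = [text for keyword, text in facts.items() if keyword in question]
    return {**state, "context": hits}


def build_prompt(state):
    """Tool 2: combine the context and the question into one prompt."""
    context = " ".join(state["context"]) or "No context found."
    prompt = f"Context: {context}\nQuestion: {state['question']}"
    return {**state, "prompt": prompt}


def call_model(state):
    """Tool 3: placeholder for a call to a deployed model endpoint."""
    answer = f"[answer based on {len(state['context'])} context item(s)]"
    return {**state, "answer": answer}


result = run_flow(
    [lookup_data, build_prompt, call_model],
    {"question": "What is the refund policy?"},
)
print(result["prompt"])
print(result["answer"])
```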

Content Filters

This area lets you list and manage the content filters you use to sanitize model input and output, as discussed in the “Responsible AI” section above.

Your resources

This area, under the Manage tab, lists the permissions, compute instances, connections, policies, and billing for each of your AI projects.


Quotas for the different models and instance sizes available are currently viewable and manageable under the Manage tab in the staging version of the Azure AI Studio preview. I don’t see them in my production subscriptions, although they are available when selecting and deploying models.

Azure AI Studio quickstarts and tutorials

The number of quickstarts and tutorials in the Azure AI Studio documentation will undoubtedly grow over time. At the time of writing there are four quickstarts:

And there are three tutorials:

Yes, Azure AI Studio was designed to be usable by the blind. 

AI development without pain

Azure AI Studio, while still in preview and somewhat under construction, checks most of the boxes for a generative AI application builder. It’s clearly making progress, based on my peek at a staging site for the product, and also on the new features that dropped while I was working on my review.

You can build generative AI web apps using Azure AI Studio without having to write code. If you can write Python, all the better. I like the way the Playground and the Prompt Flow tools work.

As I mentioned in the introduction, you can accomplish many of the same things using competing products from Amazon (Bedrock) and Google (Generative AI Studio). If you can program, you can also accomplish many of the same things using LangChain and LangSmith.

But Azure AI Studio would be a good choice. It should allow you to build your AI apps efficiently with minimal pain, whether or not you write code, as long as you understand the principles of prompt engineering, embedding, RAG, and prompt flows.

Cost: Depends on model usage and size of instances deployed

Platform: Microsoft Azure cloud

Happy Hacking Keyboard Studio review: Mouse and keyboard in one tiny package

Posted by on 21 November, 2023

The Happy Hacking Keyboard line from PFU America is aimed at users who want a compact, but powerful and customizable keyboard with a great typing feel. The latest version of the HHKB (as it’s abbreviated) is the HHKB Studio, designed to compress both keyboard and mouse functionality into the most compact footprint possible. Like its predecessors, this keyboard isn’t cheap—its list price is $385—but it offers a mix of features you won’t find in other keyboards.

Let’s take a look.

HHKB Studio test drive

The HHKB Studio connects over USB-C or battery-powered Bluetooth, with both the cable and batteries included. Bluetooth pairing works with up to four distinct devices, and the keyboard can be used to command both Mac and Windows systems interchangeably.

I was fond of the soft-touch, smooth-sliding linear mechanical key switch mechanisms used in the HHKB Hybrid Type-S model I previously reviewed. The Studio uses the same switches, but you can swap in your own standard MX-stem switches—for instance, to give the non-alphanumeric keys a little more click, or to make the Esc key harder to actuate. The keycaps shipped with my unit used a gloss-black over matte-black color scheme that you’ll either find classy and stylish or next to impossible to make out. There is no key backlighting, but a brightly lit room helps.

HHKB Studio features a super compact keyboard layout with a pointing stick mouse and gesture pads.

The super-compact 60-key layout means no dedicated cursor controls or number pad. Key controls for the arrows are accessed by way of function key combos. Also, the left Control key now sits where Caps Lock usually does; you use FN + Tab to access Caps Lock if needed. Each FN key combo is printed on the bottom front of each keycap, but again the black keycap colors on my unit made them tough to read without direct lighting.

For cursor control, the HHKB Studio adds two other features. One is the pointing stick mouse, as popularized by the original IBM ThinkPad. It’s set in the middle of the G/H/B key cluster, and complemented with thumb-reachable mouse buttons set below the space bar. It takes some practice to work with, but for basic mousing about it’s convenient, and the keyboard comes with four replacement caps for the stick mouse.

The other cursor control feature is four “gesture pads” along the front edges and sides of the unit. Slide your fingers along the left side and left front edges to move the cursor; slide them along the right side and right front edges to scroll the current window or tab between windows. You can also freely reassign the corresponding key actions for these movements.

The gesture pads are powerful and useful enough that I rarely relied on the arrow-diamond key cluster or even the pointing stick to move the cursor. However, you can trigger the gesture pads by accident. A couple of times I innocently bumped the side of the unit when moving it, and ended up sending keystrokes to a different window.

Many hackable keyboard models use the VIA standard, meaning you can change your keyboard’s layout or behaviors through a web browser app. HHKB does not support VIA, unfortunately; the keymapping and control tool provided for it runs as an installable desktop application.

Bottom line

Like its predecessor, the Happy Hacking Keyboard Studio packs functionality and a great typing feel into a small form factor. This version ramps up the functionality even further by letting you do away with a mouse. But you’ll have to decide if $385 is a worthy price.

Review: 7 Python IDEs compared

Posted by on 18 October, 2023

Of all the metrics you could use to gauge the popularity and success of a language, one surefire indicator is the number of development environments available for it. Python’s rise in popularity has brought with it a strong wave of IDE support, with tools aimed at both the general programmer and those who use Python for tasks like scientific work and analytical programming.

These seven IDEs with Python support cover the gamut of use cases. Some are built exclusively for Python, while others are multilanguage IDEs that support Python through an add-on or have been retrofitted with Python-specific extensions. Each one has particular strengths and will likely be useful for a specific type of Python development or level of experience with Python. Many strive for universal appeal.

A good number of IDEs now are frameworks outfitted with plugins for specific languages and tasks, rather than applications written to support development in a given language. Because of that, your choice of IDE may be determined by whether or not you have experience with another IDE from the same family.

Let’s take a look at the leading IDEs for Python development today.


IDLE, the integrated development and learning environment included with almost every installation of Python, could be considered the default Python IDE. However, IDLE is by no means a substitute for full-blown development; it’s more like a fancy file editor. Still, IDLE remains one of the default options for Python developers to get a leg up with the language, and it has improved incrementally with each Python release. (See this case study in application modernization for an interesting discussion of the efforts to improve IDLE.)

IDLE is built entirely with components that ship with a default installation of Python. Aside from the CPython interpreter itself, this includes the Tkinter interface toolkit. One advantage of building IDLE this way is that it runs cross-platform with a consistent set of behaviors. As a downside, the interface can be terribly slow. Printing large amounts of text from a script into the console, for instance, is many orders of magnitude slower than running the script directly from the command line. Bear this in mind if you experience performance issues with a Python program in IDLE.

IDLE has a few immediate conveniences. It sports a built-in read-eval-print loop (REPL), or interactive console, for Python. In fact, this interactive shell is the first item presented to the user when IDLE is launched, rather than an empty editor. IDLE also includes a few tools found in other IDEs, such as providing suggestions for keywords or variables when you hit Ctrl-Space, and an integrated debugger. But the implementations for most of these features are primitive compared to other IDEs, and hidebound by Tkinter’s limited selection of UI components. And the collection of third-party add-ons available for IDLE (such as IdleX) is nowhere near as rich as you’ll find with other IDEs.

IDLE also has no concept of a project, and thus no provisions for working with a Python virtual environment. The only discernible way to do this is to create a venv and invoke IDLE from its parent installation of Python. Using any other tooling, like test suites, can only be done manually.
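One way to script that workaround is to create the venv with Python's standard `venv` module and then start IDLE with the venv's own interpreter via `python -m idlelib`. The sketch below only builds and prints the launch command; the `.venv` directory name and the helper functions are assumptions for illustration, and `create_venv_and_launch_idle` is defined but not called here because it would open the IDLE window.

```python
import os
import subprocess
import venv
from pathlib import Path


def venv_python(venv_dir):
    """Path to the interpreter inside a venv (layout differs on Windows)."""
    venv_dir = Path(venv_dir)
    if os.name == "nt":
        return venv_dir / "Scripts" / "python.exe"
    return venv_dir / "bin" / "python"


def idle_command(venv_dir):
    """Command that starts IDLE with the venv's interpreter."""
    return [str(venv_python(venv_dir)), "-m", "idlelib"]


def create_venv_and_launch_idle(venv_dir):
    """Create the venv, then launch IDLE inside it (opens a GUI window)."""
    venv.create(venv_dir, with_pip=True)
    subprocess.run(idle_command(venv_dir), check=True)


print(idle_command(".venv"))
```

Packages installed into that venv should then be importable from the IDLE shell, since IDLE runs under the venv's interpreter.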

In sum, IDLE is best for two scenarios: The first is when you want to hack together a quick Python script, and you need a preconfigured environment to work in. The second is for beginners who are just getting started with Python. Even beginners will need to graduate to a more robust option before long.

IDLE is free with Python, but its minimal feature set makes it best suited for beginners.

OpenKomodo IDE 12

OpenKomodo IDE is the open source version of what was ActiveState’s commercial Komodo IDE product. ActiveState ceased development on Komodo and now maintains it as an open source project. Unfortunately, that means many aspects of OpenKomodo now feel dated.

OpenKomodo works as both a standalone multi-language IDE and as a point of integration with ActiveState’s language platform. Python is one of many languages supported in Komodo, and one of many languages for which ActiveState provides custom runtime builds.

On installation, Komodo informs you about the programming languages, package managers, and other development tools it discovers on your system. This is a great way to get things configured out of the box. I could see, and be certain, that Komodo was using the right version of Python and the correct install of Git.

When you create a new project for a specific language, Komodo presents a slew of options to preconfigure that project. For Python projects, you can choose from one of several common web frameworks. A sample project contains examples and mini-tutorials for many supported languages, including Python. The bad news is many of these templates are dated—Django, for instance, is at version 1.10.

A convenient drop-down search widget gives you fast navigation to all methods and functions within a file. Key bindings are configurable and can be added by way of downloadable packages that emulate other editors (e.g., Sublime Text). For linting, Komodo can integrate with PyChecker, Pylint, pep8, or Pyflakes, although support for each of these is hard-wired separately rather than available through a generic mechanism for integrating linting tools.

OpenKomodo includes many additional tools that are useful across different languages, like the regular expression builder. Another powerful feature is the “Go to Anything” bar at the top center of the IDE, where you can search for most anything in your current project or the Komodo interface. These are great features, and also available in many other IDEs (Visual Studio Code, for instance).

Some of OpenKomodo’s most prominent features revolve around integration with the ActiveState platform. Teams can configure and build custom runtimes for languages, with all the packages they need included. This is meant to ensure that individual team members don’t have to set up the runtime and dependencies for a project; they can simply grab the same custom runtime with everything preloaded.

One major limitation is clunky support for working with Python virtual environments. One has to manually create a venv, then associate the Python runtime for a project with it. Switching virtual environments for a given project requires digging into the project settings. Also, OpenKomodo’s native Git integration is nowhere near as powerful as that of other IDEs. And while you can expand Komodo’s functionality with add-ons, there aren’t nearly as many of them for Komodo as there are for Visual Studio Code.

The Python edition of the OpenKomodo IDE provides strong Python support and blends in support for other programming languages as well.

LiClipse 10.0 / PyDev

The Eclipse Foundation’s Java-powered Eclipse editor supports many languages through add-ons. Python support comes by way of an add-on named PyDev, which you can use in two ways. You can add it manually to an existing Eclipse installation, or you can download a prepackaged version of Eclipse with PyDev called LiClipse. For this review I looked at the latter, since it provides the simplest and least stressful way to get up and running.

Aside from Python support, LiClipse also includes Git integration via Eclipse’s EGit add-on, support for Python’s Django web framework, and even support for Jython, the Python variant that runs on the JVM. This last seems fitting, given Eclipse’s Java roots, although Jython development has recently flagged.

LiClipse makes good use of the stock features in the Eclipse UI. All keys can be remapped, and LiClipse comes with a stock set of key bindings for Emacs emulation. The “perspectives” view system lets you switch among a number of panel views depending on the task at hand—development, debugging, or working with the project’s Git repository.

Some of the best features come by way of plugins included in the LiClipse package. Refactoring History lets you track changes across a codebase whenever a formal refactoring takes place—something that you theoretically could do with Git, but a dedicated tool comes in handy. Another truly nice feature is the ability to automatically trigger a breakpoint upon raising one or more exceptions, including exceptions you’ve defined.

LiClipse’s handling of virtual environments is hit-and-miss. While LiClipse doesn’t detect the presence of a venv in a project automatically, you can always configure and add them manually, and LiClipse integrates with Pipenv to create and manage them (assuming Pipenv is present in your base Python installation). There’s a nice GUI explorer to see which packages are installed, and in which Python venvs, and you can run pip from that GUI as well, although it’s buried a little deeply inside the LiClipse window hierarchy.

On the downside, it’s unnecessarily hard to do things like install new packages from a requirements.txt file, and it’s awkward to create a shell session with the environment activated in it, a common task that deserves its own tooling.
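
When an IDE makes these steps awkward, they can be scripted. The following is a minimal sketch using Python's standard venv module. It assumes a simple project layout, and the function names `create_env` and `pip_install_command` are hypothetical. The pip command is only built, not run, so nothing gets installed by accident.

```python
import os
import tempfile
import venv
from pathlib import Path


def create_env(env_dir, with_pip=False):
    """Create a virtual environment and return the path to its interpreter."""
    venv.EnvBuilder(with_pip=with_pip).create(env_dir)
    bin_dir = "Scripts" if os.name == "nt" else "bin"
    exe = "python.exe" if os.name == "nt" else "python"
    return Path(env_dir) / bin_dir / exe


def pip_install_command(python_path, requirements="requirements.txt"):
    """Build (but do not run) the command that installs a requirements file."""
    return [str(python_path), "-m", "pip", "install", "-r", requirements]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        python_path = create_env(Path(tmp) / ".venv")
        print(pip_install_command(python_path))
```

Pass `with_pip=True` to `create_env` if the resulting command is actually going to be executed, since the environment needs pip for the install step.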

LiClipse comes with its own code analysis tools built-in, but can be configured to use Mypy and Pylint as well. As with Komodo, though, these choices are hard-wired into the application; there isn’t a simple way to integrate other linters not on that list. Likewise, the one test framework with direct integration into LiClipse is unittest, by way of creating a special run configuration for your project.

LiClipse wraps the PyDev add-on in a lightweight distribution of Eclipse, but PyDev can be added to an existing Eclipse installation too.

PyCharm


JetBrains makes a series of IDEs for various languages, all based on the same core source code. PyCharm is the Python IDE, and it’s built to support the characteristic work patterns and practices of Python developers.

This attention to workflow is evident from the moment you first create a PyCharm project. You can choose templates for many common Python project types (Flask, Django, Google App Engine), including projects with associated JavaScript frameworks (Vue, Angular, etc.). You’re given the option of setting up a virtual environment from the interpreter of your choice, with a sample file in it. A convenient GUI lets you install modules to a venv using pip, and the IDE will even autodetect requirements.txt files and offer to auto-install any missing dependencies. A fair amount of effort on Python projects gets eaten by wrangling virtual environments, so these features are very welcome.
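
The idea of checking a requirements.txt file against what is installed can be sketched with the standard library. This is an illustration only, not PyCharm's implementation. It handles only simple `name==version` style lines, and the function names are hypothetical.

```python
import re
from importlib import metadata


def requirement_names(text):
    """Yield bare package names from simple requirements.txt content."""
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line or line.startswith("-"):
            continue  # skip blanks, comments, and option lines such as "-r other.txt"
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            yield match.group(0)


def missing_requirements(text):
    """Return the names that are not installed in the current environment."""
    missing = []
    for name in requirement_names(text):
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing
```

Version specifiers are ignored here; the check is only whether a distribution with that name is present at all.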

You’ll find this same attention to everyday details throughout the IDE. For instance, if you run a file in your project with Alt-Shift-F10, PyCharm offers to remember that run configuration for future use. This is handy for projects that might have multiple entry points. When you kick open a command-line instance inside PyCharm with a project loaded, PyCharm automatically activates that project’s virtual environment. For users on low-powered notebooks, PyCharm’s power-save mode disables background code analysis to keep the battery from being devoured.

Refactoring a project, another common source of tedium, also has a dedicated PyCharm tool. This goes beyond just renaming functions or methods; you can alter most every aspect of the code in question—change a function signature, for instance—and see a preview of what will be affected in the process. PyCharm provides its own code inspection tools, but a third-party plugin makes it possible to use Pylint.

Python projects benefit from robust test suites, but developers often procrastinate on creating them because of the boilerplate coding involved. PyCharm’s automatic test-generation feature lets you generate skeleton test suites for existing code, then populate them with the tests as needed. If you already have tests, you can configure a run profile to execute them, with support for all the popular testing frameworks (pytest, unittest, nose, etc.). There are other automated shortcuts, as well. For a class, you can automatically look up which methods to implement or override when creating a subclass, again cutting down on boilerplate code.

Another great testing tool, included by default, lets you open and examine the pstat data files created by Python’s cProfile performance-profiling tool. Pstat files are binaries from which you can generate various kinds of reports with Python, but this tool saves you a step when doing that. It even generates call graphs that can be exported to image files.
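
For reference, producing and reading such a file by hand takes only a few lines with the standard cProfile and pstats modules. In this sketch, `slow_sum` and the file name are hypothetical placeholders for real code.

```python
import cProfile
import io
import pstats
import tempfile
from pathlib import Path


def slow_sum(n):
    """A deliberately naive function to give the profiler something to measure."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def profile_to_file(path, n=50_000):
    """Profile slow_sum and write the binary stats file that pstats can read."""
    profiler = cProfile.Profile()
    profiler.enable()
    slow_sum(n)
    profiler.disable()
    profiler.dump_stats(str(path))


def top_functions(path, limit=5):
    """Load a stats file and return a text report sorted by cumulative time."""
    buffer = io.StringIO()
    stats = pstats.Stats(str(path), stream=buffer)
    stats.sort_stats("cumulative").print_stats(limit)
    return buffer.getvalue()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        stats_path = Path(tmp) / "demo.pstat"
        profile_to_file(stats_path)
        print(top_functions(stats_path))
```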

PyCharm can be expanded and tweaked greatly by way of the plugins available for it, which you can install directly via PyCharm’s UI. This includes support for common data or text formats used with Python (CSV and Markdown), third-party tooling like Docker, and support for other languages such as R and Rust.

PyCharm’s community edition should cover most use cases, but the paid professional edition adds features useful in enterprise settings, such as out-of-the-box Cython support, code coverage analysis tools, and profiling.

PyCharm’s rich set of features, even in its free edition, makes it a powerful choice for most Python development scenarios.



Python extension for Visual Studio Code

The explosive growth and popularity of Microsoft’s Visual Studio Code has fed development for add-ons that support just about every programming language and data format out there. Of the various add-ons for VS Code that provide Python support, the best-known and most widely used are also developed by Microsoft. Together, the editor and add-ons make for one of the best solutions available for Python development, even if some of the really granular features of PyCharm aren’t available.

When installed, Microsoft’s Python extension also installs support for Jupyter notebooks, which can be opened and used directly in the editor. The Python extension also provides Pylance, a language server that provides linting and type checking by way of the Pyright tool. Together, these components provide a solution that covers the vast majority of development scenarios. Another optional but useful extension allows applying the Black formatter to your codebase.

One drawback with the Python extension for VS Code is the lack of a general setup process, like a wizard, for creating a new Python project and configuring all of its elements. Each step must be done manually: creating the virtual environment, configuring paths, and so on. On the other hand, many of those steps, such as making a venv, are supported directly in the Python extension. VS Code also automatically detects virtual environments in a project directory, and makes a best effort to use them whenever you open a terminal window in the editor. This saves the hassle of having to manually activate the environment. VS Code can also detect virtual environments created with Poetry, the Python project-management tool, or Pipenv.
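
One plausible way to detect virtual environments in a project directory is to look for the pyvenv.cfg file that the standard venv module writes at the root of each environment. The sketch below illustrates that idea. It is an assumption about the general technique, not a description of what the Python extension actually does, and `find_virtualenvs` is a hypothetical name.

```python
from pathlib import Path


def find_virtualenvs(project_dir, max_depth=2):
    """Return directories under project_dir that contain a pyvenv.cfg marker."""
    root = Path(project_dir)
    found = []
    for cfg in root.rglob("pyvenv.cfg"):
        # Depth of the environment directory relative to the project root.
        depth = len(cfg.relative_to(root).parts) - 1
        if depth <= max_depth:
            found.append(cfg.parent)
    return sorted(found)
```

For a project with a `.venv` folder at its top level, `find_virtualenvs(".")` would return that folder's path.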

Another powerful feature in VS Code, the command palette, lets you find just about any command or setting by simply typing a word or two. Prefix your search term with “Py” or “Python” and you’ll get even more focused results. A broad variety of linters and code-formatting tools are supported natively in the Python extension.

One thing VS Code supports well with the Python extension is the discovery and execution of unit testing. Both Python’s native unittest and the third-party (but popular) pytest are supported. Run the “Python: Configure tests” command from the palette, and it will walk through test discovery and set up a test runner button on the status bar. Individual tests even have inline annotations that let you re-run or debug them. It’s a model for how I wish many other things could be done with the Python extension.
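
Test discovery of this kind rests on conventions that unittest itself exposes. As a minimal sketch of what discovery means (not the extension's own code), the following writes one hypothetical test file to a temporary directory and lets `unittest.TestLoader.discover` find and run it.

```python
import io
import tempfile
import textwrap
import unittest
from pathlib import Path

# Contents of a hypothetical test module, written to disk by demo() below.
SAMPLE_TEST = textwrap.dedent(
    '''
    import unittest

    class TestArithmetic(unittest.TestCase):
        def test_addition(self):
            self.assertEqual(1 + 1, 2)
    '''
)


def discover_and_run(start_dir, pattern="test_*.py"):
    """Discover unittest-style tests under start_dir and run them quietly."""
    suite = unittest.TestLoader().discover(str(start_dir), pattern=pattern)
    runner = unittest.TextTestRunner(stream=io.StringIO(), verbosity=0)
    return runner.run(suite)


def demo():
    """Write one test file to a temp directory and run discovery on it."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "test_sample_discovery.py").write_text(SAMPLE_TEST)
        return discover_and_run(tmp)
```

The returned result object carries `testsRun`, `failures`, and `errors`, which is roughly the information an editor needs to annotate individual tests.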

The Python extension for Visual Studio Code concentrates on the most broadly used parts of Python, and leaves the more esoteric corners to third parties. For instance, there is no support for the Cython superset of Python, which lets you compile Python into C. A third-party extension provides Cython syntax highlighting, but no actual integration of Cython workflow. This has become less crucial with the introduction of Cython’s “pure Python” syntax, but it’s an example of how the Python extension focuses on the most common use cases.

What’s best about the Python extension for Visual Studio Code is how it benefits from the flexibility and broad culture of extensions available for VS Code generally. Key bindings, for instance, can be freely remapped, and any number of themes are available to make VS Code’s fonts or color palettes more palatable.

VS Code’s open-ended architecture allows support for any number of languages, with Python being a major player.

Python Tools for Visual Studio 2022

If you already use Visual Studio in some form and are adding Python to the mix, using the Python Tools for Visual Studio add-on makes perfect sense. Microsoft’s open source plugin provides prepackaged access to a number of common Python frameworks, and it makes Python debugging and deployment functions available through Visual Studio’s interface in the same manner as any other major language.

When Visual Studio 2015 came along, InfoWorld’s Martin Heller was impressed by its treatment of open source languages as first-class citizens right alongside Microsoft’s own. Python is included among those languages, with a level of support that makes it worth considering as a development environment, no matter what kind of project you’re building.

There are two ways to get set up with Python on Visual Studio. You can add the Python Tools to an existing installation of Visual Studio, or you can download a stub that installs Visual Studio from scratch and adds Python Tools automatically. Both roads lead to the same Rome: A Visual Studio installation with templates for many common Python application types.

Out of the box, Python for Visual Studio can create projects that use some of the most widely used Python web frameworks: Flask, Flask with Jade (a templating language), Django, and Bottle. Also available are templates for generic web services, a simple command-line application, a Windows IoT core application that uses Python, and an option to create Visual Studio projects from existing Python code. I was pleased to see templates for IronPython, the revitalized Python port that runs on the .NET framework. Also available are templates for Scikit-learn projects, using the cookiecutter project templating system. That said, it would be nice to see more options for other machine learning systems, like PyTorch.

When you create a new project using one of these frameworks, Visual Studio checks to make sure you have the dependencies already available. If not, it presents a few choices. You can create a Python virtual environment and have the needed packages placed there. You can have the packages installed into the Python interpreter available systemwide. Or you can add the dependencies to the project manually. If you have an existing Python project and want to migrate it into Visual Studio, you can take an existing Python code directory (a copy is probably best) and migrate it to become a Visual Studio project.

One nice touch is that Visual Studio logs all the steps it takes when it sets up a project, so you know what changes were made and where everything is located. Visual Studio also smartly detects the presence of requirements.txt files, and can create a virtual environment for your project with those requirements preinstalled. If you’re porting an existing project that includes virtual environments, they too will be automatically detected and included. Unfortunately, Visual Studio doesn’t yet work with pyproject.toml files for setting up a project.

Visual Studio’s Solution Explorer contains not only the files associated with each of your Python projects, but also the accompanying Python environment, as well as any Python packages installed therein. Right-click on the environment and you can install packages interactively, automatically generate a requirements file, or add folders, .zip archives, or files to the project’s search path. Visual Studio automatically generates IntelliSense indexes for installed environments, so the editor’s on-the-fly suggestions are based on what’s installed in the entire Python environment you’re using, not only the current file or project.
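
Generating a requirements file from an environment amounts to listing the installed distributions with their versions. A minimal sketch with the standard importlib.metadata module follows. It approximates the idea rather than reproducing Visual Studio's behavior, and the function names are hypothetical.

```python
from importlib import metadata


def freeze_lines():
    """Return sorted 'name==version' lines for the current environment."""
    lines = set()
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip distributions with broken metadata
            lines.add(f"{name}=={dist.version}")
    return sorted(lines, key=str.lower)


def write_requirements(path):
    """Write the pinned list to a requirements file and return the line count."""
    lines = freeze_lines()
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(lines) + ("\n" if lines else ""))
    return len(lines)
```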

Smart techniques for working with Visual Studio’s metaphors abound. When you launch a web application for testing, through the green arrow launch icon in the toolbar, Visual Studio’s app launcher pops open the default web browser (or the browser you select) and points it at the application’s address and port. The Build menu has a Publish option that can deploy your application on a variety of cloud services, including Microsoft’s Azure App Service.

Python Tools for Visual Studio provides a built-in facility for running the Pylint and Mypy code analyzers. As with other Visual Studio features that depend on external packages, Visual Studio will attempt to install either of those packages if you haven’t yet set them up. You can also set up the linters by hand in your virtual environment; in fact I prefer this option because it is the most flexible.

I was disappointed by the absence of support for Cython, the project that allows Python modules to be compiled into C extensions, DLLs, and standalone executables. Cython uses Visual Studio as one of its compilers, but there’s no support for legacy Cython-format files in Python Tools for Visual Studio, nor direct support for compiling Cython modules in Visual Studio.

Microsoft offers first-class support for Python as a development language in Visual Studio, including support for web frameworks.

Spyder 5

Most Python IDEs are general purpose, meaning they’re suitable for any kind of Python development—or for developing in other languages along with Python. Spyder focuses on providing an IDE for scientific work rather than, say, web development or command-line applications. That focus makes Spyder less flexible than the other IDEs profiled here, especially since it doesn’t have the same range of immediate third-party extensibility, but it’s still quite powerful for its specific niche.

Spyder itself is written in Python. This might be its biggest quirk or its best feature, depending on how you see it. Spyder can be downloaded and installed as a module to run from within a given Python instance, set up as a standalone application, or it can be set up from within the Anaconda Python distribution or the portable WinPython distro. In all of these cases, the IDE will run from a particular instance of Python.

It is possible to install Spyder standalone with an installer, but the chief drawback there is the absence of per-project configuration. This mainly means there is no easy way to configure Spyder to work with a given project’s virtual environment when you launch the project; you can only configure Spyder as a whole to work with one particular venv.

Another approach is to create a venv and install Spyder into that, and launch Spyder from within it. However, this requires installing dozens of packages that total over 400MB, so it might not be practical for multiple projects that require it. Another downside: Regardless of the setup method, Spyder takes much longer to launch than the other IDEs profiled here.

Where Spyder shines is in making Python’s scientific computing tools immediately available in a single interface. The left-hand side of the UI is taken up with the usual project-file-tree/editor-tab-set display. But the right-hand side features two tabbed panes devoted to visualization and interactive tools. IPython and Jupyter notebooks run in their own pane, along with generated graphical plots (which you can show inline as well, or solely in the Plots tab).

I particularly liked the variable explorer that shows you, and lets you interactively edit, all the user-created variables in your IPython session. I also liked the built-in profiler pane, which lets you see statistics on which parts of your program take the most time to run. Unfortunately, I couldn’t get the profiler to work reliably with projects in their own venv unless I installed Spyder in the venv and launched it from there.

Key bindings in Spyder are all configurable, including those for panes other than the editor (e.g., the plotting view). But here again, key bindings can only be configured on an editor-wide basis. For unit testing, you will need to install a separate module, spyder-unittest, which works with Python’s own unittest and the pytest and nose frameworks.

Spyder focuses on math and science—hence its presence in the Anaconda distribution—but it can be used for other kinds of development work, too.


For those who don’t have much experience, PyCharm is one of the best IDEs to start with. It’s friendly to newcomers, but not hamstrung in its feature set. In fact, it sports some of the most useful features among all the IDEs profiled here. Many of those features are available only in the for-pay version, but there’s plenty in the free version to help a fledgling developer get started.

LiClipse and the Python Tools for Visual Studio are good choices for developers already intimately familiar with Eclipse and Microsoft Visual Studio, respectively. Both are full-blown development environments—as much as you’re going to find—that integrate Python quite nicely. However, they’re also sprawling, complex applications that come with a lot of cognitive overhead. If you’ve already mastered either of these IDEs, you’ll find it a great choice for Python work.

Microsoft’s Visual Studio Code editor, equipped with Microsoft’s Python extension, is a far more lightweight option than Visual Studio. VS Code has become immensely popular thanks to its wide range of extensions, which allow developers in projects that use not only Python but HTML and JavaScript, for instance, to assemble a collection of extensions to complement their workflow.

The Python incarnation of ActiveState’s Komodo IDE is a natural fit for developers who have already used the Komodo IDE for some other language, and it has unique features (like the regular expression evaluator) that ought to broaden its appeal. Komodo deserves a close look from both novices and experts.

Spyder is best suited to working with Jupyter notebooks or other scientific computing tools in distributions like Anaconda, rather than as a development platform for Python generally.

Finally, IDLE is best reserved for quick-and-dirty scripting, and even on that count, it might take a back seat to a standalone code editor with a Python syntax plugin. That said, IDLE is always there when you need it.

Llama Chat and Code Llama are good at coding

Posted by on 12 September, 2023

A funny thing happened when I was playing with Poe, a chatbot aggregator from Quora. I selected the recently released free almost-open-source Llama 2 70B Chat model from Meta and gave it the prompt “Generate a Python program to scrape a website. Include tests for and” That’s a shorter version of a prompt I used to test CodeWhisperer, Bard, and Copilot X in June. None of those three models generated a working program.

To my utter surprise, Llama 2 70B Chat aced this test:

[Screenshot: llama code generation 01. Image credit: IDG]

The bot went on to suggest some follow-up questions:

Tell me more.
Can you explain how the BeautifulSoup library works for parsing HTML content?
Are there any other libraries or tools commonly used for web scraping in Python?
What are some best practices for web scraping to ensure compliance with website terms of service?

The generated code looked good to me, so I copied it into Visual Studio Code, saved it, and ran it. It ran perfectly:

% /usr/local/bin/python3 /Users/martinheller/Code/Llama/
martinheller@Martins-M1-MBP Code % /usr/local/bin/python3 /Users/martinheller/Code/Llama/
Title: Welcome to
Link: #content
Link: #python-network
Link: /
Link: /jobs/
Link: /community-landing/
Link: #top
Link: /
Link: #site-map
Link: #
Link: javascript:;
Link: javascript:;
Link: javascript:;
Link: #
Link: /community/irc/
Link: /about/
Link: /about/apps/
Link: /about/quotes/
Link: /about/gettingstarted/
Link: /about/help/
Link: /downloads/
Link: /downloads/
Link: /downloads/source/
Link: /downloads/windows/
Link: /downloads/macos/
Link: /download/other/
Link: /download/alternatives
Link: /doc/

Comparing the Llama-generated code with the CodeWhisperer-generated code, the major difference is that Llama used the html.parser engine for Beautiful Soup, which worked, while CodeWhisperer used the lxml engine, which choked.
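
Beautiful Soup’s html.parser option is built on the html.parser module in Python’s standard library, which is why it works without installing anything extra. The following stdlib-only sketch shows the same title-and-links extraction without Beautiful Soup. It is not the code Llama generated; the class and function names are hypothetical, and it parses an HTML string rather than fetching a live page.

```python
from html.parser import HTMLParser


class TitleAndLinks(HTMLParser):
    """Collect the page title and every href found on anchor tags."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.links = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Only text that appears inside <title> is accumulated.
        if self._in_title:
            self.title += data


def scrape(html_text):
    """Parse an HTML string and return (title, links)."""
    parser = TitleAndLinks()
    parser.feed(html_text)
    parser.close()
    return parser.title.strip(), parser.links
```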

Llama 2 code explanation

I also asked Llama 2 70B Chat to explain the same sample program I had given to CodeWhisperer, Bard, and Copilot X. CodeWhisperer doesn’t currently have a chat window, so it doesn’t do code explanations, but Bard did a great job on this task and Copilot X did a good job.

[Screenshot: llama code generation 02. Image credit: IDG]
[Screenshot: llama code generation 03. Image credit: IDG]
[Screenshot: llama code generation 04. Image credit: IDG]

Llama’s explanation (shown above) is as good as, or possibly better than, what Bard generated. I don’t completely understand why Llama stopped in item 12, but I suspect that it may have hit a token limit, unless I accidentally hit the “stop” button in Poe and didn’t notice.

For more about Llama 2 in general, including discussion of its potential copyright violations and whether it’s open source or not, see “What is Llama 2? Meta’s large language model explained.”

Coding with Code Llama

A couple of days after I finished working with Llama 2, Meta AI released several Code Llama models. A few days after that, at Google Cloud Next 2023, Google announced that they were hosting Code Llama models (among many others) in the new Vertex AI Model Garden. Additionally, Perplexity made one of the Code Llama models available online, along with three sizes of Llama 2 Chat.

So there were several ways to run Code Llama at the time I was writing this article. It’s likely that there will be several more, and several code editor integrations, in the next months.

Poe didn’t host any Code Llama models when I first tried it, but during the course of writing this article Quora added Code Llama 7B, 13B, and 34B to Poe’s repertoire. Unfortunately, all three models gave me the dreaded “Unable to reach Poe” error message, which I interpret to mean that the model’s endpoint is busy or not yet connected. The following day, Poe updated, and running the Code Llama 34B model worked:

[Screenshot: llama code generation 05. Image credit: IDG]

As you can see from the screenshot, Code Llama 34B went one better than Llama 2 and generated programs using both Beautiful Soup and Scrapy.

Perplexity is a website that hosts a Code Llama model, as well as several other generative AI models from various companies. I tried the Code Llama 34B Instruct model, optimized for multi-turn code generation, on the Python code-generation task for website scraping:

[Screenshot: llama code generation 06. Image credit: IDG]

As far as it went, this wasn’t a bad response. I know that the requests.get() method and bs4 with the html.parser engine work for the two sites I suggested for tests, and finding all the links and printing their HREF tags is a good start on processing. A very quick code inspection suggested something obvious was missing, however:

[Screenshot: llama code generation 07. Image credit: IDG]

Now this looks more like a command-line utility, but different functionality is now missing. I would have preferred a functional form, but I said “program” rather than “function” when I made the request, so I’ll give the model a pass. On the other hand, the program as it stands will report undefined functions when compiled.

[Screenshot: llama code generation 08. Image credit: IDG]

Returning JSON wasn’t really what I had in mind, but for the purposes of testing the model I’ve probably gone far enough.

Llama 2 and Code Llama on Google Cloud

At Google Cloud Next 2023, Google Cloud announced that new additions to Google Cloud Vertex AI’s Model Garden include Llama 2 and Code Llama from Meta, and published a Colab Enterprise notebook that lets you deploy pre-trained Code Llama models with vLLM with the best available serving throughput.

If you need to use a Llama 2 or Code Llama model for less than a day, you can do so for free, and even run it on a GPU. Use Colab. If you know how, it’s easy. If you don’t, search for “run code llama on colab” and you’ll see a full page of explanations, including lots of YouTube videos and blog posts on the subject. Note that while Colab is free but time-limited and resource-limited, Colab Enterprise costs money but isn’t limited.

If you want to create a website for running LLMs, you can use the same vLLM library as used in the Google Cloud Colab Notebook to set up an API. Ideally, you’ll set it up on a server with a GPU big enough to hold the model you want to use, but that isn’t totally necessary: You can get by with something like an M1 or M2 Macintosh as long as it has enough RAM to run your model. You can also use LangChain for this, at the cost of writing or copying a few lines of code.

Running Llama 2 with Ollama

If you are using an Arm-based Macintosh as your workstation, you can run Llama models locally as a command-line utility. The invaluable Sharon Machlis explains how to use Ollama; it’s easy, although if you don’t have enough RAM for the model it’ll use virtual memory (i.e., SSD or, heaven forfend, spinning disk) and run really slowly. (Linux and Windows support is planned for Ollama.)
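
Because Ollama is a command-line utility, it can also be driven from a script. The sketch below assumes the `ollama` binary is on the PATH and that the prompt is passed as an argument to `ollama run` for a one-shot, non-interactive answer. The function names are hypothetical, and `run_prompt` returns None when Ollama is not installed.

```python
import shutil
import subprocess


def build_ollama_command(model, prompt):
    """Return the argument list for a one-shot, non-interactive prompt."""
    return ["ollama", "run", model, prompt]


def run_prompt(model, prompt, timeout=600):
    """Run the prompt if the ollama binary is on PATH; otherwise return None."""
    if shutil.which("ollama") is None:
        return None
    completed = subprocess.run(
        build_ollama_command(model, prompt),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,
    )
    return completed.stdout


if __name__ == "__main__":
    print(run_prompt("llama2", "Explain what a virtual environment is."))
```

The generous timeout reflects the point above: on a machine short of RAM, a single answer can take a long time.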

I tried out Ollama with several models (of the many it supports) on my M1 MacBook Pro, which unfortunately has only 8GB of RAM. I started with my standard Python web-scraping code generation task using Llama 2, apparently one of the smaller models (7B?). The result is similar to what I got from the Llama 2 70B model running on Poe, although not as well-structured. Note that Ollama only downloads the model the first time it needs it.

martinheller@Martins-M1-MBP ~ % ollama run llama2
pulling manifest
pulling 8daa9615cce3... 100% |███████████████| (3.8/3.8 GB, 51 MB/s)
pulling 8c17c2ebb0ea... 100% |██████████████████| (7.0/7.0 kB, 39 MB/s)
pulling 7c23fb36d801... 100% |█████████████████| (4.8/4.8 kB, 5.0 MB/s)
pulling 578a2e81f706... 100% |████████████████████| (95/95 B, 129 kB/s)
pulling e35ab70a78c7... 100% |████████████████████| (90/90 B, 550 kB/s)
pulling 96862bb35d77... 100% |██████████████████| (447/447 B, 2.8 MB/s)
verifying sha256 digest
writing manifest
>>> Generate a Python program to scrape a website. Include tests for and
Sure! Here is an example of how you can use the `requests` and `BeautifulSoup` libraries in Python to scrape a website:
import requests
from bs4 import BeautifulSoup

# Example website to scrape
url = ""

# Send a request to the website and get the HTML response
response = requests.get(url)

# Parse the HTML content using Beautiful Soup
soup = BeautifulSoup(response.content, 'html.parser')

# Find the specific information you want to scrape on the website
title = soup.find('h1', class_='main-title').text
print(title)  # Output: ""

# Find all the links on the page
links = soup.find_all('a')
for link in links:
    print(link.get('href'))  # Output: ["/python", "/docs", "/learn", etc.]

# Test the code with different websites
url2 = ""
response2 = requests.get(url2)
soup2 = BeautifulSoup(response2.content, 'html.parser')
title2 = soup2.find('h1', class_='main-title').text
print(title2)  # Output: "Infoworld"

# Find all the links on the page
links2 = soup2.find_all('a')
for link in links2:
    print(link.get('href'))  # Output: ["/news", "/features", "/opinions", etc.]
This code sends a request to the specified website, parses the HTML content using Beautiful Soup, and then finds specific information on the page (in this case, the title and all the links). You can modify the `url` variable at the top of the script to specify the website you want to scrape.

For example, if you want to scrape the Python documentation, you can set `url = "` and then use the same code as before to find the title and links on the page.

To test the code with different websites, you can modify the `url2` variable at the top of the script to specify a different website. For example, you can set `url2 = ""` to scrape that website, or any other website you want to test.

Note: Make sure you have the necessary permissions and legal rights to scrape a website. Web scraping can be illegal in some cases, so be sure to check the terms of service for the website you're scraping and respect their rules.

With that baseline established, I tried the same prompt using Code Llama. Again, I didn’t specify the model size, but it looks like it is 7B.

martinheller@Martins-M1-MBP ~ % ollama run codellama
pulling manifest
pulling 8268242df6f0... 100% |████████████████████████████████████████████| (3.8/3.8 GB, 50 MB/s)
pulling 2c8743bdc4ad... 100% |███████████████████████████████████████████████| (7.0/7.0 kB, 40 MB/s)
pulling 38fa20ee7daa... 100% |███████████████████████████████████████████████| (4.8/4.8 kB, 28 MB/s)
pulling 578a2e81f706... 100% |█████████████████████████████████████████████████| (95/95 B, 993 kB/s)
pulling 404e21afdc6a... 100% |█████████████████████████████████████████████████| (30/30 B, 187 kB/s)
pulling 86b68a7bb45d... 100% |███████████████████████████████████████████████| (507/507 B, 4.0 MB/s)
verifying sha256 digest
writing manifest
>>> Generate a Python program to scrape a website. Include tests for and
 To write a Python program that scrapes a website, you will need to use a library such as BeautifulSoup or Scrapy. These libraries allow you to navigate and parse the HTML structure of a webpage, and extract specific data from it.
Here is an example of how you might use BeautifulSoup to scrape the Python website at
import requests
from bs4 import BeautifulSoup

# Send a request to the Python website
url = ''
response = requests.get(url)

# Parse the HTML content using Beautiful Soup
soup = BeautifulSoup(response.content, 'html.parser')

# Find all links on the page
links = soup.find_all('a')

# Print each link to the console
for link in links:
To write tests for this program, you can use a testing framework such as unittest or pytest. These frameworks allow you to define test functions that can be run automatically to verify that your code is working correctly.
Here is an example of how you might write tests using the unittest module:
import unittest
from bs4 import BeautifulSoup
class TestPythonOrg(unittest.TestCase):
    def setUp(self):
        # Send a request to the Python website
        url = ''
        response = requests.get(url)
        # Parse the HTML content using Beautiful Soup
        soup = BeautifulSoup(response.content, 'html.parser')
        self.soup = soup
    def test_title(self):
        title = self.soup.find('title').text
        self.assertEqual(title, 'Python Documentation - Python v3.10')
    def test_description(self):
        description = self.soup.find('meta', {'name': 'description'})['content']
        self.assertIn('The official homepage for the Python Programming Language', description)
    def tearDown(self):

if __name__ == '__main__':
This test program sends a request to the Python website, parses the HTML content using Beautiful Soup, and then runs two separate tests: one that verifies that the page title is correct, and another that verifies that the page description contains certain keywords.
You can run this test program by saving it to a file (e.g. ``), opening a terminal or command prompt, and then running the following command:
This will run all of the test functions in the program, and print a message indicating whether each test passed or failed.



This model took my request for tests a little more seriously, and demonstrated how to use Python’s unittest module.
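
The unittest pattern that Code Llama describes can be sketched in a form that runs offline. The following is not the model’s output. It is a minimal illustration that parses a hard-coded HTML string with the standard library’s html.parser instead of fetching a live page with requests and Beautiful Soup. The sample HTML and all class names are hypothetical.

```python
import io
import unittest
from html.parser import HTMLParser

# Hypothetical stand-in for a downloaded page, so the tests need no network.
SAMPLE_HTML = (
    "<html><head><title>Example Page</title>"
    '<meta name="description" content="A page used for testing.">'
    "</head><body></body></html>"
)


class HeadParser(HTMLParser):
    """Record the <title> text and the description <meta> tag."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = None
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("name") == "description":
            self.description = attrs.get("content")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


class TestSamplePage(unittest.TestCase):
    def setUp(self):
        # Parse the fixed HTML before each test.
        self.page = HeadParser()
        self.page.feed(SAMPLE_HTML)

    def test_title(self):
        self.assertEqual(self.page.title, "Example Page")

    def test_description(self):
        self.assertIn("testing", self.page.description)


def run_tests():
    """Run the test case programmatically and return the unittest result."""
    suite = unittest.TestLoader().loadTestsFromTestCase(TestSamplePage)
    return unittest.TextTestRunner(stream=io.StringIO()).run(suite)
```

Testing against a fixed string rather than a live site keeps the assertions stable; a test that compares a live page title to a literal string will break whenever the site changes.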

I wondered how different the output of the Python-tuned Code Llama 7B model would be:

martinheller@Martins-M1-MBP ~ % ollama run codellama:7b-python
pulling manifest
pulling 1dd2448f61e7... 100% |████████████████████████████████████████████| (3.8/3.8 GB, 48 MB/s)
pulling 2c8743bdc4ad... 100% |███████████████████████████████████████████████| (7.0/7.0 kB, 52 MB/s)
pulling 38fa20ee7daa... 100% |███████████████████████████████████████████████| (4.8/4.8 kB, 64 MB/s)
pulling b507b9c2f6ca... 100% |██████████████████████████████████████████████████| (13/13 B, 60 kB/s)
pulling 404e21afdc6a... 100% |█████████████████████████████████████████████████| (30/30 B, 240 kB/s)
pulling b413591f8542... 100% |███████████████████████████████████████████████| (507/507 B, 3.2 MB/s)
verifying sha256 digest
writing manifest
>>> Generate a Python program to scrape a website. Include tests for and

import urllib.request, html.parser
class Website(object):
    """
    Class used to store the data scraped from the website
    :param str title: Title of article/page
    :param str author: Author of article/page
    :param str date: Date that the article was published on
    :param str content: Body text of article
    """
    def __init__(self, title=None, author=None, date=None, content=None):
        self.title = title #title of post/article
        self.author = author #author of the post or article
        self.date = date #date that the article was published on
        self.content = content #the body text of the post/article
class Article(object):
    """
    Class used to store data scraped from an infowars article
    :param str title: Title of article/page
    :param str author: Author of article/page
    :param str date: Date that the article was published on
    :param str content: Body text of article
    """
    def __init__(self, title=None, author=None, date=None, content=None):
        self.title = title #title of post/article
        self.author = author #author of the post or article
        self.date = date #date that the article was published on
        self.content = content #the body text of the post/article
def scrape(website_url, htmlparser=html.parser):
    """
    Function used to gather all the data from a website and return it in a form for use later

    :param str url: The URL of the website you are scraping from
    :param object parser: Defaults to the built-in html.parser but can be changed depending on the website's code style
    """
    #create website object and assign it to a variable called 'website'
    website = Website()
    #open url using urlopen function from urllib.request module
    with urllib.request.urlopen(website_url) as url:
        #use read function to get the contents of the page and store it in a variable called 'content'
        content = url.read()
        #use decode_content=True argument for html parsing module to be able to parse the url content
        parsed_content = htmlparser(markup=content, convert_charrefs=True, decode_entities=False, decode_content=True)
        #search through content of webpage and find all instances where a tag is used to bold/italicize text in the article body text
        title = parsed_content.find("h1", {"class": "entry-title"})
        author = parsed_content.find("div", {"class": "author-info"}).text
        date = parsed_content.find("span", {"class": "updated"}).text
        content = parsed_content.findAll("p")[3].text + '\n' + parsed_content.findAll("p")[4].text + '\n' + parsed_content.findAll("p")[5].text
        #store all the data gathered from the website into variables in the 'website' object created earlier
        website.title = title.text
        website.author = author
        website.date = date
        website.content = content
        return website

All of a sudden, we see Python classes and a function that uses them. This run forgot to generate the unit tests. Let’s try the same prompt and model again and see if we get different output.
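
One thing worth noting about the listing above: html.parser is a standard-library module, not a callable with a find() method, so the generated scrape() function would fail as written. For contrast, here is a minimal sketch of how the standard library's HTMLParser class is normally used, by subclassing it and overriding a handler. The LinkCollector class, the extract_links helper, and the sample HTML are hypothetical names introduced only for illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag fed to the parser (hypothetical helper)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag's attributes
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


def extract_links(html_text):
    """Return all link targets found in an HTML string."""
    collector = LinkCollector()
    collector.feed(html_text)
    collector.close()
    return collector.links


# Hypothetical sample markup, so the sketch runs without network access
sample = '<p><a href="/docs">Docs</a> and <a href="/about">About</a></p>'
print(extract_links(sample))
```

Because the parser works on a plain string, the same function can be exercised in a unit test without fetching a live page.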

martinheller@Martins-M1-MBP ~ % ollama run codellama:7b-python
>>> Generate a Python program to scrape a website. Include tests for and
#Python program to scrap data from website#
import requests
from bs4 import BeautifulSoup as soup
url_1 = ''
reqs_1 = requests.get(url_1)
url_2 = ''
reqs_2 = requests.get(url_2)
#HTML parsing from website
page_1 = reqs_1.text
#HTML parsing from website
page_2 = reqs_2.text
#BeautifulSoup object for website
soup_1 = soup(page_1, 'html.parser')
#BeautifulSoup object for website
soup_2 = soup(page_2,'html.parser')
def scrape():
    #Dictionary to store data from both websites and
    website_data = {}
    #Scraping all the text from headings of a website#
    headlines = soup_1.find_all('div',{'class':'headline'})
    for line in headlines:
        data = line.text
        if "Python" in data or 'Programming' in data:
            website_data['headlines'] = [data]
    #Scraping all the text from paragraphs of a website#
    paragrphs = soup_1.find_all('p')
    for line in paragrphs:
        data = line.text
        if "Python" in data or 'Programming' in data:
            website_data['paras'] = [data]
    #Scraping all the text from headings of a website#
    headlines2 = soup_2.find_all('p')
    for line in headlines2:
        data = line.text
        if "Python" in data or 'Programming' in data:
            website_data['headlines'] = [data]
    #Scraping all the text from paragraphs of a website#
    paragrphs2 = soup_1.find_all('p')
    for line in paragrphs2:
        data = line.text
        if "Python" in data or 'Programming' in data:
            website_data['paras'] = [data]
    #Return the dictionary of all text scraped from both websites and
    return website_data

This time the model generated code in a functional style instead of an object-oriented style. It did something ugly, though: Instead of using the home page of for its second test, it used the URL of an article about the Python programming language. Alas, that page does not currently exist, so we may have either bumped up against old content in the model, or the model had a hallucination.
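
A smaller issue in the same listing: each matching line is assigned as a new one-element list, so only the last match under each key would survive. A minimal sketch of accumulating matches instead is shown below; the collect_matches function and the sample lines are hypothetical and stand in for the text of scraped tags.

```python
def collect_matches(lines, keywords=("Python", "Programming")):
    """Return every line that mentions one of the keywords."""
    website_data = {}
    for data in lines:
        if any(word in data for word in keywords):
            # setdefault creates the list once; append keeps earlier matches
            website_data.setdefault("paras", []).append(data)
    return website_data


# Hypothetical stand-ins for the .text of scraped <p> tags
sample_lines = ["Python is popular", "Weather report", "Programming tips"]
print(collect_matches(sample_lines))
```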

Two Llamas for coding

As you’ve seen, Llama 2 Chat can generate and explain Python code quite well, right out of the box. There’s no need to fine-tune it further on code-generation tasks—although Meta has done exactly that for Code Llama.

Llama 2 Chat is not without controversy, however. Meta says that it’s open source, but the OSI begs to disagree, on two counts. Meta says that it’s more ethical and safer than other LLMs, but a class action lawsuit from three authors says that its training has violated their copyrights.

It’s nice that Llama 2 Chat works so well. It’s troubling that to train it to work well Meta may have violated copyrights. Perhaps, sooner rather than later, someone will find a way to train generative AIs to be effective without triggering legal problems.

Code Llama’s nine fine-tuned models offer additional capabilities for code generation, and the Python-specific versions seem to know something about Python classes and testing modules as well as about functional Python.

When the bigger Code Llama models are more widely available online running on GPUs, it will be interesting to see how they stack up against Llama 2 70B Chat. It will also be interesting to see how well the smaller Code Llama models perform for code completion when integrated with Visual Studio Code or another code editor.


Review: CodeWhisperer, Bard, and Copilot X

Posted by on 27 June, 2023


When I wrote about the GitHub Copilot preview in 2021, I noted that the AI pair programmer didn’t always generate good, correct, or even running code, but was still somewhat useful. At the time, I concluded that future versions could be real time-savers. Two years later, Copilot is improving. These days, it costs money even for individuals, and it has some competition. In addition, the scope of coding assistants has expanded beyond code generation to code explanations, pull request summaries, security scanning, and related tasks.

Three tools for AI pair programming

Let’s start with a quick overview of the tools under review, then we’ll dive in for a closer look at each one.

  • Amazon CodeWhisperer is the product that competes most directly with Copilot. A “coding companion” like Copilot, CodeWhisperer integrates with Visual Studio Code and JetBrains IDEs, generates code suggestions in response to comments and code completions based on existing code, and can scan code for security issues. CodeWhisperer supports five programming languages well, and another 10 with a lesser degree of support. It can optionally flag and log references to code it uses and optionally filter out code suggestions that resemble open source training data.
  • Google Bard is a web-based interface to LaMDA (Language Model for Dialogue Applications), a conversational AI model capable of fluid, multi-turn dialogue. Bard recently added the ability to help with coding and topics about coding. When Bard emits code that may be subject to an open source license, it cites its sources and provides the relevant information. Bard is also good at code explanations.
  • GitHub Copilot X is “leveling up” from the original Copilot with chat and terminal interfaces, support for pull requests, and early adoption of OpenAI’s GPT-4. Currently, to access Copilot X you need to have an active Copilot subscription and join the waiting list, with no guarantee about when you’ll get access to the new features. It took about a month for my invitation to arrive after I joined the waiting list.

Using one of these code generators is not the only way to generate code. To begin with, you can access general-purpose transformers like GPT-4 and its predecessors, including ChatGPT, BingGPT/Bing Chat (available in the Edge browser), and There are also other code-specific AI tools, such as StarCoder, Tabnine, Cody, AlphaCode, Polycoder, and Replit Ghostwriter. In every case I’ve mentioned, it is vital to use discretion and carefully test and review the generated code before using it.

How the tools were tested

In my previous article about code generation, I evaluated the AI code generators based on the rather easy task of writing a program to determine the number of days between two dates. Most did okay, although some needed more guidance than others. For this review, I tried the code generators on the more difficult task of scraping for a list of articles. I gave them an outline but no additional help. None generated correct code, although some came closer than others. As an additional task, I asked the tools that support code explanation to explain a Python code example from an MIT Open Courseware introductory programming course.

For reference, the outline I gave to the code generators is:

Scrape front page:
	Find all articles by looking for links with ‘article’ in the href; extract title, author, date from each
	List all articles alphabetically by title; eliminate duplicates 
	List all articles alphabetically by author last name
	List all articles latest first
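
The three listing steps in the outline can be sketched independently of the scraping itself. The following is a minimal illustration, assuming the title, author, and date have already been extracted into dictionaries; the sample records and the dedupe helper are hypothetical.

```python
from datetime import date

# Hypothetical records standing in for scraped (title, author, date) data
articles = [
    {"title": "Rust in 2023", "author": "Ann Lee", "date": date(2023, 5, 2)},
    {"title": "Async Python", "author": "Bob Adams", "date": date(2023, 6, 1)},
    {"title": "Rust in 2023", "author": "Ann Lee", "date": date(2023, 5, 2)},
]


def dedupe(records):
    """Drop repeated records, keeping the first occurrence of each."""
    seen = set()
    unique = []
    for rec in records:
        # Dictionaries are not hashable, so build a tuple key for the set
        key = (rec["title"], rec["author"], rec["date"])
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


unique = dedupe(articles)
by_title = sorted(unique, key=lambda r: r["title"].lower())
by_author = sorted(unique, key=lambda r: r["author"].split()[-1].lower())
latest_first = sorted(unique, key=lambda r: r["date"], reverse=True)

for rec in latest_first:
    print(rec["date"], rec["title"], rec["author"])
```

Sorting on real date objects rather than date strings is a deliberate choice here, since string comparison only gives chronological order for formats such as ISO 8601.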

In general, I tried to act like a more naive programmer than I am, just to see what the tools would do.

Now, let’s look more closely at each of our code generators.

Amazon CodeWhisperer

Within your IDE, Amazon CodeWhisperer analyzes your English language comments and surrounding code to infer what code it should generate to complete what you are typing. Then, it offers code as a suggestion that you can either accept or reject, or you can ask CodeWhisperer for alternate code, or ignore and continue writing your own code. CodeWhisperer’s large language model (LLM) was trained on billions of lines of code, including Amazon and open source code. Any given suggestion is based not only on your comments and immediate code context, but also on the contents of other files open in the IDE.

In addition to code generation, CodeWhisperer can scan your Python, Java, and JavaScript code for security vulnerabilities and suggest fixes for them. The vulnerability lists it consults include Open Web Application Security Project (OWASP), crypto library best practices, AWS API best practices, and other API best practices. Security scans occur on-demand, unlike code completion, which is offered continuously as you code unless you turn off suggestions.

Programming languages and IDEs

CodeWhisperer’s best programming languages for code generation (the most prevalent languages in the training corpus) are Java, Python, JavaScript, TypeScript, and C#. It has been trained to a lesser extent on Ruby, Go, PHP, C++, C, Shell, Scala, Rust, Kotlin, and SQL.

There are CodeWhisperer plugins for Visual Studio Code and JetBrains IDEs. You can also activate CodeWhisperer for use inside AWS Cloud9 and AWS Lambda; in both cases, you must edit your IAM permissions as well as checking the Enable CodeWhisperer box.

I installed CodeWhisperer in Visual Studio Code. There are four steps:

  • Add the CodeWhisperer plugin to VS Code.
  • Add a connection to AWS.
  • Sign in on the AWS website.
  • Start CodeWhisperer from the AWS developer tools panel.
CodeWhisperer usage instructions. IDG

Figure 1. Once you have installed and authorized CodeWhisperer (at left), you can see the usage instructions, shown here on the right.

Code suggestions and completions

It’s worth watching some of the videos demonstrating CodeWhisperer’s capabilities, listed on the CodeWhisperer resource page. They’ll tell you more than I can in words and screenshots. While watching them, it became clear to me that CodeWhisperer shines on code that calls AWS APIs.

Nevertheless, here’s a screenshot of the generated code from CodeWhisperer:

CodeWhisperer generating code. IDG

Figure 2. CodeWhisperer generating code.

Note that the generated code did not work, and also that CodeWhisperer generated multiple functions for more or less the same purpose. It was fairly close, however, and CodeWhisperer kind of read my mind about using Beautiful Soup to scrape the web page.

CodeWhisperer program. IDG

Figure 3. Trying the CodeWhisperer-generated program. 

Security scans and open source references

I ran the CodeWhisperer security scan on my project, as well. Unsurprisingly, no issues were detected in this file, although the CodeWhisperer scan found a resource leak in another file generated by GitHub Copilot X. The CodeWhisperer security scan is a differentiating feature that hasn’t yet surfaced in the competing AI code generators.

This file did not trigger any open source references, but when I opened the CodeWhisperer Reference Log, I found a link to the setting that controls whether CodeWhisperer will suggest code that has such references. In the Amazon examples I viewed, code suggestions with references displayed the license type at the top.

The bottom line for CodeWhisperer

CodeWhisperer generates code in an IDE and does security reviews, but it lacks a chat window and code explanations. The tool works more or less as advertised, and the price (free) is right for individual developers. For corporate developers, the $19 per month price for CodeWhisperer is the same as for Copilot. I’d favor CodeWhisperer over Copilot for shops that use AWS heavily and shops that need to know when code suggestions refer to open source.

Google Bard

Bard support for programming was announced April 21, 2023. The blog post mentioned “more than 20 programming languages including C++, Go, Java, Javascript, Python and Typescript.”

When I asked Bard for the full list, I received the following response:

Bard supports over 20 programming languages, including:

  • C++
  • Go
  • Java
  • JavaScript
  • Python
  • TypeScript
  • C#
  • PHP
  • Ruby
  • Swift
  • Kotlin
  • Rust
  • Julia
  • R
  • Haskell
  • Erlang
  • Scala
  • Lua
  • OCaml
  • Scheme

Bard can generate code in these languages, debug code, and explain code. It can also translate from one programming language to another.

Please note that Bard is still under development, so its capabilities may not be perfect. It is always a good idea to review AI-generated code carefully before using it.

I tested Bard on my code explanation and code generation tasks, which are shown below. The code explanation was quite good—so good that I wondered whether Bard had been trained on that MIT course code sample, which dates from 2008 and has been available on the web since then.

The code generation for the site scraper looked fairly credible at first glance but had several obvious problems on further inspection. It was a matter of clicking twice to open the code in Colab, but it threw a type error when I tried to run it, and it didn’t even get as far as the code I had identified as wrong.

Rather than fixing the problems, I tried running the code generation again, and the result made a bit more sense. It threw an attribute error when I tried to run it in Colab, however. I fed that traceback back to Bard, it generated a fix, I ran the “fixed” code in Colab, and there was a type error in a different spot. Enough, already! If I were actually a naive programmer, I would have thrown up my hands.

What about citing web pages and open source code? The Bard FAQ had this to say:

Bard, like some other standalone LLM experiences, is intended to generate original content and not replicate existing content at length. We’ve designed our systems to limit the chances of this occurring, and we will continue to improve how these systems function. If Bard does directly quote at length from a webpage, it cites that page.

Sometimes the same content may be found on multiple webpages and Bard attempts to point to a popular source. In the case of citations to code repositories, the citation may also reference an applicable open source license.

Given that Bard is completely free at this time, and requires no setup, it’s certainly worth trying when you need to generate, debug, and explain code in any of these 20 languages. Don’t forget to review, test, and debug any generated code.

Here is the instruction I gave to Bard and the resulting dialog:

I'm about to give you some code to explain. Please wait until the code is complete.

Sure, I will wait until you are finished giving me the code. Please feel free to take your time. I am here to help you in any way that I can.

Here are some tips for explaining code:
•	Start by explaining the overall purpose of the code. What does it do? What problem is it trying to solve?
•	Then, break down the code into smaller chunks and explain each one in turn.
•	Use clear and concise language. Avoid jargon and technical terms.
•	Use examples to illustrate your points.
•	Be patient and don't be afraid to ask questions.

I am confident that I can help you explain any code that you give me. Just let me know what you need and I will do my best to help.




And here is the code I provided:

import pylab, random

class Stock(object):
    def __init__(self, price, distribution, vol):
        self.price = price
        self.history = [price]
        self.distribution = distribution
        self.vol = vol
        self.lastChangeInfluence = 0.0
    def setPrice(self, price):
        self.price = price
    def getPrice(self):
        return self.price
    def makeMove(self, bias, mo):
        oldPrice = self.price
        baseMove = self.distribution(self.vol) + bias
        self.price = self.price * (1.0 + baseMove)
        self.price += mo*random.choice([0.0, 1.0])*self.lastChangeInfluence
        change = self.price - oldPrice
        if change >= 0:
            self.lastChangeInfluence = min(change, oldPrice*0.01)
        else:
            self.lastChangeInfluence = max(change, -oldPrice*0.01)
        self.history.append(self.price)
    def showHistory(self, fig, test):
        pylab.figure(fig)
        pylab.plot(self.history)
        pylab.title('Closing Prices, Test ' + test)

class SimpleMarket(object):
    def __init__(self, numStks, volUB):
        self.stks = []
        self.bias = 0.0
        for n in range(numStks):
            volatility = random.uniform(0, volUB)
            distribution = lambda vol: random.gauss(0.0, vol)
            stk = Stock(100.0, distribution, volatility)
            self.addStock(stk)
    def addStock(self, stk):
        self.stks.append(stk)
    def setBias(self, bias):
        self.bias = bias
    def getBias(self):
        return self.bias
    def getStocks(self):
        return self.stks[:]
    def move(self, mo):
        prices = []
        for s in self.stks:
            s.makeMove(self.bias, mo)
            prices.append(s.getPrice())
        return prices

class Market(SimpleMarket):
    def __init__(self, numStks, volUB, dailyBiasRange):
        SimpleMarket.__init__(self, numStks, volUB)
        self.dailyBiasRange = dailyBiasRange
    def move(self, mo):
        prices = []
        dailyBias = random.gauss(self.dailyBiasRange[0], self.dailyBiasRange[1])
        for s in self.stks:
            s.makeMove(self.bias + dailyBias, mo)
            prices.append(s.getPrice())
        return prices

def simMkt(mkt, numDays, mo):
    endPrices = []
    for i in range(numDays):
        vals = mkt.move(mo)
        vals = pylab.array(vals)
        mean = vals.sum()/float(len(vals))
        endPrices.append(mean)
    return endPrices

def plotAverageOverTime(endPrices, title):
    pylab.plot(endPrices)
    pylab.title(title)

def plotDistributionAtEnd(mkt, title, color):
    prices = []
    sumSoFar = 0
    for s in mkt.getStocks():
        prices.append(s.getPrice())
        sumSoFar += s.getPrice()
    mean = sumSoFar/float(len(prices))
    pylab.plot(prices, color)
    pylab.axhline(mean, color = color)
    pylab.ylabel('Last Sale')

def runTrial(showHistory, test, p):
    colors = ['b','g','r','c','m','y','k']

    mkt = Market(p['numStocks'], p['volUB'], p['dailyBiasRange'])
    endPrices = simMkt(mkt, p['numDays'], p['mo'])
    plotAverageOverTime(endPrices, 'Average Closing Prices')
    plotDistributionAtEnd(mkt, 'Distribution of Prices', colors[test%len(colors)])
    if showHistory:
        for s in mkt.getStocks():
            s.showHistory(test+2, str(test))

def runTest(numTrials):
    #Constants used in testing
    numDaysPerYear = 200.0
    params = {}
    params['numDays'] = 200
    params['numStocks'] = 500
    params['bias'] = 0.1/numDaysPerYear #General market bias
    params['volUB'] = 12.0/numDaysPerYear #Upper bound on volatility for a stock
    params['mo'] = 1.1/numDaysPerYear #Momentum factor
    params['dailyBiasRange'] = (0.0, 4.0/200.0)

    for t in range(1, numTrials+1):
        runTrial(True, t, params)


Here is some of Bard’s response, which was quite in depth:

The code you have provided is a Python program that simulates a stock market. The program defines three classes:

Stock: This class represents a single stock. It has properties for the stock's price, history, distribution, volatility, and last change influence.

SimpleMarket: This class represents a simple stock market. It has a list of stocks, a bias, and a volatility upper bound.

Market: This class inherits from SimpleMarket and adds a daily bias range property. The program also defines several functions:
•	simMkt: This function simulates the market for a given number of days and momentum factor.
•	plotAverageOverTime: This function plots the average closing price of the stocks over time.
•	plotDistributionAtEnd: This function plots the distribution of prices at the end of the simulation.
•	runTrial: This function runs a single simulation and plots the results.
•	runTest: This function runs a number of simulations and plots the results.

The program is run by calling the runTest function with the desired number of simulations. The program will then generate three plots:
•	A plot of the average closing price of the stocks over time.
•	A plot of the distribution of prices at the end of the simulation.
•	A plot of the history of prices for each stock.

The program can be used to simulate a variety of different stock markets. By changing the values of the parameters, you can change the volatility of the market, the bias of the market, and the momentum factor. You can also use the program to simulate different trading strategies. For example, you could simulate a strategy that buys stocks when they are undervalued and sells them when they are overvalued.

Bard also provided an explanation of how each part of the code worked:

•	The Stock class represents a single stock. It has properties for the stock's price, history, distribution, volatility, and last change influence. The price property is the current price of the stock. The history property is a list of the stock's prices over time. The distribution property is a function that generates random prices for the stock. The volatility property is a measure of how much the stock's price can fluctuate. The lastChangeInfluence property is a measure of how much the stock's price is affected by its previous change.

•	The SimpleMarket class represents a simple stock market. It has a list of stocks, a bias, and a volatility upper bound. The list of stocks property is a list of the stocks in the market. The bias property is a measure of how much the market is biased towards rising prices. The volatility upper bound property is a measure of how much the market can fluctuate.

•	The Market class inherits from SimpleMarket and adds a daily bias range property. The daily bias range property is a measure of how much the market's bias can change from day to day.

•	The simMkt function simulates the market for a given number of days and momentum factor. The number of days parameter is the number of days to simulate. The momentum factor parameter is a measure of how much the prices of stocks are affected by their momentum.

•	The plotAverageOverTime function plots the average closing price of the stocks over time. The plotDistributionAtEnd function plots the distribution of prices at the end of the simulation.

•	The runTrial function runs a single simulation and plots the results. The showHistory parameter is a boolean value that determines whether the history of prices for each stock is plotted. The test parameter is a unique identifier for the simulation.

•	The runTest function runs a number of simulations and plots the results. The numTrials parameter is the number of simulations to run.

I hope this explanation is helpful. Please let me know if you have any other questions.
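
To make the momentum mechanics concrete, here is a deterministic sketch of a single price update modeled on Stock.makeMove. It assumes the second lastChangeInfluence assignment belongs to an else branch, and it passes the random draws in as arguments so the arithmetic can be followed by hand. The make_move function and its parameter names are hypothetical; base_move stands for distribution(vol) + bias, which in the code is a relative change rather than a price.

```python
def make_move(price, base_move, mo, last_influence, momentum_on):
    """One price update with the random draws supplied by the caller.

    base_move stands for distribution(vol) + bias, and momentum_on stands
    for random.choice([0.0, 1.0]); both are assumptions of this sketch.
    """
    old_price = price
    price = price * (1.0 + base_move)           # apply the relative move
    price += mo * momentum_on * last_influence  # optional momentum carry-over
    change = price - old_price
    # The influence carried into the next move is clamped to +/-1% of the old price
    if change >= 0:
        influence = min(change, old_price * 0.01)
    else:
        influence = max(change, -old_price * 0.01)
    return price, influence


# A 5% rise from 100.0 with no prior influence; mo matches params['mo'] = 1.1/200
price, influence = make_move(100.0, 0.05, 1.1 / 200.0, 0.0, 1.0)
print(price, influence)
```

In this example the price change is about 5.0, but the stored influence is capped at 1.0, which is 1% of the old price.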

When I followed up, typing in, “where is the pylab module?” Bard responded with the location of the module, how to import it, a description of features, and a pointer to the documentation.

Additionally, here is the Bard-generated code for scraping for a list of articles:

import requests
from bs4 import BeautifulSoup

def scrape_infoworld():
  """Scrape the InfoWorld front page and return a list of articles."""

  # Get the HTML from the InfoWorld front page.
  response = requests.get("")
  soup = BeautifulSoup(response.content, "html.parser")

  # Find all articles by looking for links with 'article' in the href.
  articles = soup.find_all("a", href=lambda x: "article" in x["href"])

  # Extract the title, author, and date from each article.
  article_data = []
  for article in articles:
    title = article.find("h3").text
    author = article.find("span", class_="byline").text
    date = article.find("time").text
    article_data.append({
      "title": title,
      "author": author,
      "date": date,
    })

  # Sort the articles alphabetically by title.
  article_data.sort(key=lambda x: x["title"])

  # Remove any duplicate articles.
  article_data = list(set(article_data))

  # Sort the articles alphabetically by author last name.
  article_data.sort(key=lambda x: x["author"].split()[-1])

  # Sort the articles latest first.
  article_data.sort(key=lambda x: x["date"], reverse=True)

  return article_data

if __name__ == "__main__":
  articles = scrape_infoworld()

  # Print the articles.
  for article in articles:
    print(article)


This code will first get the HTML from the InfoWorld front page. Then, it will find all articles by looking for links with 'article' in the href. For each article, it will extract the title, author, and date. The articles will then be sorted alphabetically by title, removed of any duplicates, sorted alphabetically by author last name, and sorted latest first. Finally, the articles will be printed.

And here is the runtime result:

TypeError                                 Traceback (most recent call last)
<ipython-input-1-afbe7808577e> in <cell line: 40>()
     40 if __name__ == "__main__":
---> 41   articles = scrape_infoworld()
     43   # Print the articles.

6 frames

<ipython-input-1-afbe7808577e> in <lambda>(x)
     11   # Find all articles by looking for links with 'article' in the href.
---> 12   articles = soup.find_all("a", href=lambda x: "article" in x["href"])
     14   # Extract the title, author, and date from each article.

TypeError: string indices must be integers
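
The traceback is consistent with how Beautiful Soup treats a function passed as an attribute filter: the function receives the attribute's value (a string, or None when the tag has no such attribute), not the tag itself. The sketch below reproduces the error without Beautiful Soup by calling the filter directly on a string. The bad_filter and good_filter names and the sample href are hypothetical.

```python
def bad_filter(x):
    # Mirrors the generated lambda: treats x as a tag and indexes it by "href"
    return "article" in x["href"]


def good_filter(href):
    # Treats the argument as the attribute value itself, which may be None
    return href is not None and "article" in href


href_value = "/article/12345/example.html"  # hypothetical href value

try:
    bad_filter(href_value)
except TypeError as err:
    # Indexing a string with a string key raises TypeError
    print("TypeError:", err)

print(good_filter(href_value))
print(good_filter(None))
```

Under that reading, a corrected call would presumably look like soup.find_all("a", href=good_filter), although the later lines of the generated program would still need attention.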

The bottom line for Bard

Bard has a chat interface and both generates and explains code, but it doesn’t have an interactive IDE integration. Given that Bard is completely free at this time, and requires no setup, it’s certainly worth trying when you need to generate, debug, and explain code in any of the 20 supported languages.

GitHub Copilot X

GitHub Copilot X is greatly improved over the original GitHub Copilot, and can sometimes generate a correct function and set of tests without much human help. It still makes mistakes and hallucinates (generates false information), but not nearly as much as it once did.

In addition to generating code within a programming editor, currently supporting only the most current versions of Visual Studio and the latest insider version of Visual Studio Code, Copilot X adds a GPT-4 chat panel to the editor. It also adds a terminal interface, support for generating unit tests and pull request descriptions, and the ability to extract explanations from documentation.

I asked the Copilot X chat what programming languages it supports, and it answered, “I support a wide range of programming languages, including but not limited to: Python, JavaScript, TypeScript, Ruby, Java, C++, C#, PHP, Go, Swift, Kotlin, Rust, and many more.” I did my testing primarily in Python.

When I used the Copilot Chat facility to ask Copilot X to explain the MIT market simulation code, it gave a partially correct answer. I had to metaphorically pull its teeth to get it to explain the rest of the code.

Copilot X explanation. IDG

Figure 4. Copilot X did a decent but incomplete job of explaining the market simulator.

Copilot X’s most notable failure was the web-scraping code generation task. The tool generated a bunch of superficially credible-looking code that didn’t use Beautiful Soup, but it was clear from reviewing the code that it would never work. I kept bringing the problems to Copilot Chat, but it just dug itself a deeper hole. I could probably have started over and given it better hints, including handing it an import from bs4 and adding some comments showing the HTML and directory structure of the InfoWorld home page. I didn’t do it because that would not be in character for the naive coder persona I had adopted for this round of tests.

Copilot X responds to user feedback. IDG

Figure 5. Copilot X tried to generate the web scraping code without using Beautiful Soup (bs4). Later when I chatted about the solution it generated, it first claimed that it was using Beautiful Soup, but then admitted that it could not find an import.

As with all AI helpers, you have to take the code generated by Copilot X with a huge grain of salt, just as you would for a pull request from an unknown programmer.

The bottom line for Copilot X

In addition to generating code within an IDE, Copilot X adds a GPT-4 chat panel to the editor. It also adds a terminal interface, support for unit test generation, support for generating pull request descriptions, and the ability to extract explanations from technical documentation. Copilot X costs $10 per month for individuals and $19 per user per month for businesses. 


GitHub Copilot X works decently on simple problems, but not necessarily better than the combination of Amazon CodeWhisperer in a code editor and Google Bard in a browser. It’s too bad that CodeWhisperer doesn’t yet have a chat capability or the facility for explaining code, and it’s too bad that Bard doesn’t exactly integrate with an editor or IDE.

I’d be tempted to recommend Copilot X if it hadn’t gone off the rails on my advanced code generation task—mainly because it integrates chat and code generation in an editor. At this point, however, Copilot X isn’t quite ready. Overall, none of the code generation products are really up to snuff, although both Bard and Copilot X do a decent job of code explanation.

All of these products are in active development, so my recommendation is to keep watching them and experimenting, but don’t put your faith in any of them just yet.


First look: wasmCloud and Cosmonic

Posted by on 25 April, 2023


As you likely know by now, WebAssembly, or wasm, is an efficient, cross-platform, cross-language way to run code almost anywhere, including in a browser and on a server—even in a database. Cosmonic is a commercial platform-as-a-service (PaaS) for wasm modules. It builds on the open-source wasmCloud. This technology preview starts with a quick overview of wasm, then we’ll set up wasmCloud and Cosmonic and see what we can do with them.

What is wasm?

WebAssembly (wasm) is a “binary instruction format for a stack-based virtual machine.” It’s a portable compilation target for programming languages, including C, C++, C#, Rust, Go, Java, PHP, Ruby, Swift, Python, Kotlin, Haskell, and Lua; Rust is often the preferred language for wasm. There are three wasm-specific languages: AssemblyScript, Grain, and Motoko. Wasm targets include browsers (currently Chrome, Firefox, Safari, and Edge), Node.js, Deno, Wasmtime, Wasmer, and wasm2c.

Wasm tries to run at native speed in a small amount of memory. It runs in a memory-safe, sandboxed execution environment, even on the web.

WebAssembly System Interface (WASI) is a modular system interface for WebAssembly. Wasm has a component model with a W3C proposed specification. WebAssembly Gateway Interface (Wagi) is a proposed implementation of CGI for wasm and WASI. Spin is a multi-language framework for wasm applications.

What is wasmCloud?

wasmCloud is a CNCF-owned open source software platform that uses wasm and NATS to build distributed applications composed of portable units of WebAssembly business logic called actors. wasmCloud supports TinyGo and Rust for actor development. It also supports building platforms, which are capability providers. wasmCloud includes lattice, a self-forming, self-healing mesh network using NATS that provides a unified, flattened topology. wasmCloud runs almost everywhere: in the cloud, at the edge, in the browser, on small devices, and so on. The wasmCloud host runtime uses Elixir/OTP and Rust.

Many wasmCloud committers and maintainers work for Cosmonic (the company). Additionally, the wasmCloud wash cloud shell works with Cosmonic (the product).

What is Cosmonic?

Cosmonic is both a company and a product. The product is a WebAssembly platform as a service (PaaS) that builds on top of wasmCloud and uses wasm actors. Cosmonic offers a graphical cloud user interface for designing applications, and its own shell, cosmo, that complements wash and the wasmCloud GUI. Supposedly, anything you build that works in plain wasmCloud should work automatically in Cosmonic.

A host is a distributed wasmCloud runtime process that manages actors and capability providers. An actor is a WebAssembly module that can handle messages and invoke functions on capability providers. A capability is an abstraction or representation of some functionality required by your application that is not considered part of the core business logic. A capability provider is an implementation of the representation described by a capability contract. There can be multiple providers per capability with different characteristics.

A link is a runtime-defined connection between an actor and a capability provider. Links can be changed without needing to be redeployed or recompiled.
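
To make the actor, capability, provider, and link vocabulary concrete, here is a minimal Python sketch. It is a toy model and not the wasmCloud API: the class names, the "keyvalue" contract name, and both providers are hypothetical. It only illustrates the idea that an actor codes against a capability contract while a link, set at runtime, decides which provider serves it.

```python
from typing import Dict, Protocol, Tuple


class KeyValue(Protocol):
    """Hypothetical capability contract: what an actor may ask for."""

    def get(self, key: str) -> str: ...
    def put(self, key: str, value: str) -> None: ...


class InMemoryProvider:
    """One capability provider: keeps values in a dict."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def get(self, key: str) -> str:
        return self._data.get(key, "")

    def put(self, key: str, value: str) -> None:
        self._data[key] = value


class UppercasingProvider(InMemoryProvider):
    """A second provider for the same contract, with different behavior."""

    def put(self, key: str, value: str) -> None:
        super().put(key, value.upper())


class Host:
    """Toy host: holds links from (actor, contract) to a provider."""

    def __init__(self) -> None:
        self._links: Dict[Tuple[str, str], KeyValue] = {}

    def link(self, actor: str, contract: str, provider: KeyValue) -> None:
        # Re-linking replaces the provider; the actor code is untouched.
        self._links[(actor, contract)] = provider

    def provider_for(self, actor: str, contract: str) -> KeyValue:
        return self._links[(actor, contract)]


def greeter_actor(host: Host, name: str) -> str:
    """Actor business logic: knows the contract name, never the provider."""
    store = host.provider_for("greeter", "keyvalue")
    store.put("last_visitor", name)
    return f"Hello, {store.get('last_visitor')}!"


host = Host()
host.link("greeter", "keyvalue", InMemoryProvider())
first = greeter_actor(host, "World")

host.link("greeter", "keyvalue", UppercasingProvider())  # swap at runtime
second = greeter_actor(host, "World")
print(first, second)
```

The actor function is identical in both calls; only the link changed, which is the property the definitions above describe.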

A constellation is a managed, isolated network space that allows your actors and providers to securely communicate with each other regardless of physical or logical location; essentially, a Cosmonic-managed wasmCloud lattice. A super constellation is a larger constellation formed by securely connecting multiple environments through Cosmonic.

A wormhole is an ingress point into your constellation. An OCI distribution is a standard for artifact storage, retrieval, and distribution, implemented by (for example) the Azure Container Registry and the GitHub artifact registry.

The infrastructure view shows the virtual hosts running in your Cosmonic constellation. The logic view shows the logical relationships between components in your Cosmonic constellation or super constellation.

Installing and testing wasmCloud

Installation of wasmCloud varies with your system. I used brew on my M1 MacBook Pro; it installed more than I wanted because of dependencies, particularly the Rust compiler and cargo package manager, which I prefer to install from the Rust language website using rustup. Fortunately, a simple brew uninstall rust cleared the way for a standard rustup installation. While I was installing languages, I also installed TinyGo, the other language supported for wasmCloud actor development.

After installation, I asked the wash shell to tell me about its capabilities:

martinheller@Martins-M1-MBP ~ % wash --help
                               _____ _                 _    _____ _          _ _
                              / ____| |               | |  / ____| |        | | |
 __      ____ _ ___ _ __ ___ | |    | | ___  _   _  __| | | (___ | |__   ___| | |
 \ \ /\ / / _` / __| '_ ` _ \| |    | |/ _ \| | | |/ _` |  \___ \| '_ \ / _ \ | |
  \ V  V / (_| \__ \ | | | | | |____| | (_) | |_| | (_| |  ____) | | | |  __/ | |
   \_/\_/ \__,_|___/_| |_| |_|\_____|_|\___/ \__,_|\__,_| |_____/|_| |_|\___|_|_|

A single CLI to handle all of your wasmCloud tooling needs

Usage: wash [OPTIONS] <COMMAND>

  app       Manage declarative applications and deployments (wadm) (experimental)
  build     Build (and sign) a wasmCloud actor, provider, or interface
  call      Invoke a wasmCloud actor
  claims    Generate and manage JWTs for wasmCloud actors
  ctl       Interact with a wasmCloud control interface
  ctx       Manage wasmCloud host configuration contexts
  down      Tear down a wasmCloud environment launched with wash up
  drain     Manage contents of local wasmCloud caches
  gen       Generate code from smithy IDL files
  keys      Utilities for generating and managing keys
  lint      Perform lint checks on smithy models
  new       Create a new project from template
  par       Create, inspect, and modify capability provider archive files
  reg       Interact with OCI compliant registries
  up        Bootstrap a wasmCloud environment
  validate  Perform validation checks on smithy models
  help      Print this message or the help of the given subcommand(s)

  -o, --output <OUTPUT>  Specify output format (text or json) [default: text]
  -h, --help             Print help information
  -V, --version          Print version information

Then I made sure I could bring up a wasmCloud:

martinheller@Martins-M1-MBP ~ % wash up
🏃 Running in interactive mode, your host is running at http://localhost:4000
🚪 Press `CTRL+c` at any time to exit
17:00:20.343 [info] Wrote configuration file host_config.json
17:00:20.344 [info] Wrote configuration file /Users/martinheller/.wash/host_config.json
17:00:20.344 [info] Connecting to control interface NATS without authentication
17:00:20.344 [info] Connecting to lattice rpc NATS without authentication
17:00:20.346 [info] Host NCZVXJWZAKMJVVBLGHTPEOVZFV4AW5VOKXMD7GWZ5OSF5YF2ECRZGXXH (gray-dawn-8348) started.
17:00:20.346 [info] Host issuer public key: CCXQKGKOAAVXUQ7MT2TQ57J4DBH67RURBKT6KEZVOHHZYPJKU6EOC3VZ
17:00:20.346 [info] Valid cluster signers: CCXQKGKOAAVXUQ7MT2TQ57J4DBH67RURBKT6KEZVOHHZYPJKU6EOC3VZ
17:00:20.351 [info] Started wasmCloud OTP Host Runtime
17:00:20.356 [info] Running WasmcloudHostWeb.Endpoint with cowboy 2.9.0 at (http)
17:00:20.357 [info] Access WasmcloudHostWeb.Endpoint at http://localhost:4000
17:00:20.453 [info] Lattice cache stream created or verified as existing (0 consumers).
17:00:20.453 [info] Attempting to create ephemeral consumer (cache loader)
17:00:20.455 [info] Created ephemeral consumer for lattice cache loader

While I had the wasmCloud running, I viewed the website at port 4000 on my local machine:

wasmCloud local dashboard IDG

Figure 1. wasmCloud local dashboard on port 4000 after running wash up. There are no actors, providers, or links.

Then I stopped the wasmCloud:

martinheller@Martins-M1-MBP ~ % wash down

✅ wasmCloud host stopped successfully
✅ NATS server stopped successfully
🛁 wash down completed successfully




Installing and testing Cosmonic

I installed the Cosmonic CLI from the Quickstart page and asked it to tell me about itself:

martinheller@Martins-M1-MBP ~ % cosmo --help

⣾⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⠏  ⢻⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣷
⣿⣿⣿⣿⣿⣿⣿⣿⣿⡿⠁    ⠙⢿⣿⣿⣿⣿⣿⣿⣿⣿⣿
⣿⣿⣿⣿⣿⣿⡿⠛⠁        ⠈⠛⠛⠿⠿⠿⣿⣿⡿
⣿⣿⣿⣿⣿⣿⣷⣦⣀        ⣀⣤⣶⣶⣾⣿⣿⣿⣷
⣿⣿⣿⣿⣿⣿⣿⣿⣿⣷⡄    ⣴⣾⣿⣿⣿⣿⣿⣿⣿⣿⣿
⢿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣆  ⣼⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⡿

      C O S M O N I C

Usage: cosmo [OPTIONS] <COMMAND>

  build     Build (and sign) an actor, provider, or interface
  down      Stop the wasmCloud host and NATS leaf launched by `up`
  launch    Launch an actor on a local wasmCloud host
  login     Securely download credentials to authenticate this machine with Cosmonic infrastructure
  new       Create a new project from template
  up        Start a NATS leaf and wasmCloud host connected to Cosmonic infrastructure, forming a super constellation
  tutorial  Run through the tutorial flow
  whoami    Check connectivity to Cosmonic and query device identity information
  help      Print this message or the help of the given subcommand(s)

  -o, --output <OUTPUT>  Specify output format (text or json) [default: text]
  -h, --help             Print help
  -V, --version          Print version

Then, I went through the online interactive drag-and-drop tutorial to create an echo application, resulting in this diagram:

cosmonic logic view IDG

Figure 2. Cosmonic Logic view after going through the online tutorial. The reversed arrow indicates that the wormhole is connected for ingress into the echo application.

I also ran the local Quickstart hello tutorial:

martinheller@Martins-M1-MBP ~ % cosmo tutorial hello

⣾⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⠏  ⢻⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣷
⣿⣿⣿⣿⣿⣿⣿⣿⣿⡿⠁    ⠙⢿⣿⣿⣿⣿⣿⣿⣿⣿⣿
⣿⣿⣿⣿⣿⣿⡿⠛⠁        ⠈⠛⠛⠿⠿⠿⣿⣿⡿
⣿⣿⣿⣿⣿⣿⣷⣦⣀        ⣀⣤⣶⣶⣾⣿⣿⣿⣷
⣿⣿⣿⣿⣿⣿⣿⣿⣿⣷⡄    ⣴⣾⣿⣿⣿⣿⣿⣿⣿⣿⣿
⢿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣆  ⣼⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⡿

      C O S M O N I C
Welcome to cosmo!
✅ You're already authenticated!
⚙️  It looks like you don't have a wasmCloud host running locally. Launching one with:
    `cosmo up`
>>> ⠀⢀
Ok to download NATS and wasmCloud to /Users/martinheller/.cosmo ?: y
🟢 A wasmCloud host connected to your constellation has been started!

To stop the host, run:
    'cosmo down'
>>> ⡋⢀
To start the tutorial, we'll generate a new project with `cosmo new`. Proceed?: y
🌐 Next we'll download code for your hello world actor to the hello/ directory...
>>> ⢋⠁                      Cloning into '.'...
>>> ⠈⢙                      remote: Enumerating objects: 86, done.
remote: Counting objects: 100% (86/86), done.
remote: Compressing objects: 100% (56/56), done.
>>> ⠈⡙
>>> ⠀⢙
>>> ⠀⡙                      remote: Total 86 (delta 23), reused 76 (delta 22), pack-reused 0
Receiving objects: 100% (86/86), 312.66 KiB | 1.02 MiB/s, done.
Resolving deltas: 100% (23/23), done.
>>> ⠀⠩                      Already on 'main'
Your branch is up to date with 'origin/main'.
🔧   Using template subfolder `hello-world/rust`...
🔧   Generating template ...
[ 1/15]   Done: .cargo/config.toml
[ 7/15]   Done: .gitignore
✨   Done! New project created /Users/martinheller/hello
>>> ⠀⠠              No keypair found in "/Users/martinheller/.wash/keys/martinheller_account.nk".
                    We will generate one for you and place it there.
                    If you'd like to use alternative keys, you can supply them as a flag.

No keypair found in "/Users/martinheller/.wash/keys/hello_module.nk".
                    We will generate one for you and place it there.
                    If you'd like to use alternative keys, you can supply them as a flag.

>>> ⠀⢀
Now, we'll launch your hello actor and connect it to its capabilities. Proceed?: y
🚀 Launching your actor with:
    cosmo launch -p hello
🚀 Actor launched!
✅ You already have a Cosmonic-managed host running!
🔗 Launching capability providers and linking them to your actor...
    In the future, you can start providers from the UI at
✅ You're already running a required capability provider: HTTP Server
🌌 Creating a wormhole connected to your actor...
    In the future, you can create wormholes from the UI at

👇 Here's what we did:
⭐️ We started a wasmCloud host on your machine, connected to your constellation
🚀 We launched the hello world actor on your local wasmCloud host
⚙️  We started a managed host on the Cosmonic platform in your constellation
   We started an HTTP server capability provider on this host
🔗 We linked the actor on your local host to the provider running on your Cosmonic-managed host
🌌 We created a wormhole associated with this actor, allowing you to access your hello world app from the internet

Feel free to browse the code placed in the `hello/` directory.

If you're interested in how to deploy custom code to Cosmonic, check out our docs at:

If you want to go through this tutorial again in the future, simply run:
    cosmo tutorial hello

🎉 That's it! Access your actor securely through a wormhole now:

martinheller@Martins-M1-MBP ~ % curl
Hello, World!%

At this point, both my online and offline tutorials appeared in my Cosmonic constellation:

Cosmonic Logic view. IDG

Figure 3. Cosmonic Logic view after completing both the online Echo tutorial and the offline Hello World tutorial. The two applications share a single HTTP-Wormhole provider but have separate URLs.

Cosmonic infrastructure view. IDG

Figure 4. Cosmonic Infrastructure view after completing both the online Echo tutorial and the offline Hello World tutorial.

Running cosmo down stops the local host and NATS server from cosmo tutorial hello, but doesn’t affect the online tutorial result. The code generated by the tutorial is remarkably simple, given that it’s creating a web application with a wormhole:

Rust source for the Cosmo tutorial. IDG

Figure 5. Rust source for Hello actor generated by cosmo tutorial hello, displayed in Visual Studio Code. Note that the actual implementation only amounts to one to four lines of Rust code, depending on how you count.


We could go on and explore Cosmonic’s pre-built capabilities and examples, wasmCloud examples, and even build a complete wasmCloud/Cosmonic application.

At this point, you should have a reasonably good feeling for what is possible with this technology. Given that wasmCloud is free and open source, and that Cosmonic’s developer preview is also currently free, I encourage you to explore those possibilities and see what you come up with.

Preview: Google Cloud Dataplex wows

Posted by on 11 April, 2023


In the beginning, there was a database. On the second day, there were many databases, all isolated silos… and then also data warehouses, data lakes, data marts, all different, and tools to extract, transform, and load all of the data we wanted a closer look at. Eventually, there was also metadata, data classification, data quality, data security, data lineage, data catalogs, and data meshes. And on the seventh day, as it were, Google dumped all of this on an unwitting reviewer, as Google Cloud Dataplex.

OK, that was a joke. This reviewer sort of knew what he was getting into, although he still found the sheer quantity of new information (about managing data) hard to take in.

Seriously, the distributed data problem is real. And so are the data security, safety of personally identifiable information (PII), and governance problems. Dataplex performs automatic data discovery and metadata harvesting, which allows you to logically unify your data without moving it.

Google Cloud Dataplex performs data management and governance using machine learning to classify data, organize data in domains, establish data quality, determine data lineage, and both manage and govern the data lifecycle. As we’ll discuss in more detail below, Dataplex typically starts with raw data in a data lake, does automatic schema harvesting, applies data validation checks, unifies the metadata, and makes data queryable by Google-native and open source tools.

Competitors to Google Cloud Dataplex include AWS Glue and Amazon EMR, Microsoft Azure HDInsight and Microsoft Purview Information Protection, Oracle Coherence, SAP Data Intelligence, and Talend Data Fabric.

google cloud dataplex 01 IDG

Google Cloud Dataplex overview diagram. This diagram lists five Google analytics components, four functions of Dataplex proper, and seven kinds of data reachable via BigLake, of which three are planned for the future.

Google Cloud Dataplex features

Overall, Google Cloud Dataplex is designed to unify, discover, and classify your data from all of your data sources without requiring you to move or duplicate your data. The key to this is to extract the metadata that describes your data and store it in a central place. Dataplex’s key features:

Data discovery

You can use Google Cloud Dataplex to automate data discovery, classification, and metadata enrichment of structured, semi-structured, and unstructured data. You can manage technical, operational, and business metadata in a unified data catalog. You can search your data using a built-in faceted-search interface, the same search technology as Gmail.

Data organization and life cycle management

You can logically organize data that spans multiple storage services into business-specific domains using Dataplex lakes and data zones. You can manage, curate, tier, and archive your data easily.

Centralized security and governance

You can use Dataplex to enable central policy management, monitoring, and auditing for data authorization and classification, across data silos. You can facilitate distributed data ownership based on business domains with global monitoring and governance.

Built-in data quality and lineage

You can automate data quality across distributed data and enable access to data you can trust. You can use automatically captured data lineage to better understand your data, trace dependencies, and troubleshoot data issues.

Serverless data exploration

You can interactively query fully governed, high-quality data using a serverless data exploration workbench with access to Spark SQL scripts and Jupyter notebooks. You can collaborate across teams with built-in publishing, sharing, and search features, and operationalize your work with scheduling from the workbench.

How Google Cloud Dataplex works

As you identify new data sources, Dataplex harvests the metadata for both structured and unstructured data, using built-in data quality checks to enhance integrity. Dataplex automatically registers all metadata in a unified metastore. You can also access data and metadata through a variety of Google Cloud services, such as BigQuery, Dataproc Metastore, Data Catalog, and open source tools, such as Apache Spark and Presto.

The two most common use cases for Dataplex are a domain-centric data mesh and data tiering based on readiness. I went through a series of labs that demonstrate both.

google cloud dataplex 02 IDG

In this diagram, domains are represented by Dataplex lakes and owned by separate data producers. Data producers own creation, curation, and access control in their domains. Data consumers can then request access to the lakes (domains) or zones (sub-domains) for their analysis.

google cloud dataplex 03 IDG

Data tiering means that your ingested data is initially accessible only to data engineers and is later refined and made available to data scientists and analysts. In this case, you can set up a lake to have a raw zone for the data that the engineers have access to, and a curated zone for the data that is available to the data scientists and analysts.

Preparing your data for analysis

Google Cloud Dataplex is about data engineering and conditioning, starting with raw data in data lakes. It uses a variety of tools to discover data and metadata, organize data into domains, enrich the data with business context, track data lineage, test data quality, curate the data, secure data and protect private information, monitor changes, and audit changes.

The Dataplex process flow starts in cloud storage with raw ingested data, often in CSV tables with header rows. The discovery process extracts the schema and does some curation, producing metadata tables as well as queryable files in cloud storage using Dataflow flex and serverless Spark jobs; the curated data can be in Parquet, Avro, or ORC format. The next step uses serverless Spark SQL to transform the data, apply data security, store it in BigQuery, and create views with different levels of authorization and access. The fourth step creates consumable data products in BigQuery that business analysts and data scientists can query and analyze.

google cloud dataplex 04 IDG

Google Cloud Dataplex process flow. The data starts as raw CSV and/or JSON files in cloud storage buckets, then is curated into queryable Parquet, Avro, and/or ORC files using Dataflow flex and Spark. Spark SQL queries transform the data into refined BigQuery tables and secure and authorized views. Data profiling and Spark jobs bring the final data into a form that can be analyzed.
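
The discovery step is easier to picture with a small example. The following is a minimal sketch of schema harvesting from a CSV file with a header row, using only the Python standard library. It is not how Dataplex implements discovery; the sample data, the type names, and the inference rule (narrowest type that fits every non-empty value) are assumptions made for illustration.

```python
import csv
import io
from typing import Dict, List

# Hypothetical raw file, as it might land in a storage bucket.
RAW_CSV = """customer_id,name,balance,opened
1,Ada,1200.50,2021-03-01
2,Grace,87,2022-11-15
3,Linus,,2023-01-09
"""


def infer_type(values: List[str]) -> str:
    """Pick the narrowest type that fits every non-empty value."""
    present = [v for v in values if v != ""]
    if not present:
        return "STRING"
    for type_name, cast in (("INTEGER", int), ("FLOAT", float)):
        try:
            for v in present:
                cast(v)
            return type_name
        except ValueError:
            continue
    return "STRING"


def harvest_schema(csv_text: str) -> Dict[str, str]:
    """Read the header row for column names, then infer a type per column."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    columns = list(rows[0].keys()) if rows else []
    return {col: infer_type([row[col] for row in rows]) for col in columns}


schema = harvest_schema(RAW_CSV)
for column, type_name in schema.items():
    print(f"{column}: {type_name}")
```

A real discovery service would also recognize dates, nested JSON, and partitioned layouts; this sketch stops at integers, floats, and strings.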

In the banking example that I worked through, the Dataplex data mesh architecture has four data lakes for different banking domains. Each domain has raw data, curated data, and data products. The data catalog and data quality framework are centralized.

google cloud dataplex 05 IDG

Google Cloud Dataplex data mesh architecture. In this banking example, there are four domains in data lakes, for customer consumer banking, merchant consumer banking, lending consumer banking, and credit card consumer banking. Each data lake contains raw, curated, and product data zones. The central operations domain applies to all four data domains.

Automatic cataloging starts with schema harvesting and data validation checks, and creates unified metadata that makes data queryable. The Dataplex Attribute Store is an extensible infrastructure that lets you specify policy-related behaviors on the associated resources. That allows you to create taxonomies, create attributes and organize them in a hierarchy, associate one or more attributes to tables, and associate one or more attributes to columns.

You can track your data classification centrally and apply classification rules across domains to control the leakage of sensitive data such as social security numbers. Google calls this DLP (data loss prevention).

google cloud dataplex 06 IDG

Customer demographics data product. At this level information that is PII (personally identifiable information) or otherwise sensitive can be flagged, and measures can be taken to reduce the risk, such as masking sensitive columns from unauthorized viewers.
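
As a rough illustration of column masking, here is a minimal Python sketch. The column classification, the masking rule (show only the last four characters), and the sample record are all hypothetical; Dataplex and BigQuery apply masking through policies rather than application code like this.

```python
from typing import Dict, List, Set

# Hypothetical classification result: columns tagged as sensitive.
SENSITIVE_COLUMNS: Set[str] = {"ssn"}


def mask_value(value: str) -> str:
    """Keep the last four characters and hide the rest."""
    if len(value) <= 4:
        return "*" * len(value)
    return "*" * (len(value) - 4) + value[-4:]


def view_for(rows: List[Dict[str, str]], authorized: bool) -> List[Dict[str, str]]:
    """Return rows unchanged for authorized viewers, masked otherwise."""
    if authorized:
        return [dict(row) for row in rows]
    return [
        {col: mask_value(val) if col in SENSITIVE_COLUMNS else val
         for col, val in row.items()}
        for row in rows
    ]


customers = [{"name": "Ada", "ssn": "123-45-6789"}]
masked = view_for(customers, authorized=False)
print(masked)
```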

Automatic data profiling, currently in public preview, lets you identify common statistical characteristics of the columns of your BigQuery tables within Dataplex data lakes. Automatic data profiling performs scans to let you see the distribution of values for individual columns.
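
To give a feel for what a column profile contains, here is a minimal sketch in Python. The statistics chosen (row count, null fraction, distinct count, most common values) and the sample column are assumptions for illustration, not the exact set that Dataplex reports.

```python
from collections import Counter
from typing import Dict, List, Optional


def profile_column(values: List[Optional[str]]) -> Dict[str, object]:
    """Summarize one column: size, nulls, distinct values, top values."""
    present = [v for v in values if v is not None]
    counts = Counter(present)
    null_fraction = (len(values) - len(present)) / len(values) if values else 0.0
    return {
        "rows": len(values),
        "null_fraction": null_fraction,
        "distinct": len(counts),
        "top_values": counts.most_common(3),
    }


# Hypothetical "state" column from a customer table.
states = ["NY", "CA", "NY", None, "TX", "NY", "CA", None]
profile = profile_column(states)
print(profile)
```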

End-to-end data lineage helps you to understand the origin of your data and the transformations that have been applied to it. Among other benefits, data lineage allows you to trace the downstream impact of data issues and identify the upstream causes.

google cloud dataplex 07 IDG

Google Cloud Dataplex explorer data lineage. Here we are examining the SQL query that underlies one step in the data transformation process. This particular query was run as an Airflow DAG from Google Cloud Composer.
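
Lineage is essentially a directed graph, so tracing downstream impact and upstream causes comes down to walking that graph in one direction or the other. The sketch below shows the idea in Python. The table names and edges are hypothetical, and this is not the Dataplex lineage API.

```python
from collections import deque
from typing import Dict, List, Set

# Hypothetical lineage edges: each key feeds the tables in its list.
LINEAGE: Dict[str, List[str]] = {
    "raw.customers_csv": ["curated.customers"],
    "raw.transactions_csv": ["curated.transactions"],
    "curated.customers": ["product.customer_demographics", "product.spend_by_customer"],
    "curated.transactions": ["product.spend_by_customer"],
}


def reachable(start: str, edges: Dict[str, List[str]]) -> Set[str]:
    """Breadth-first walk returning every node reachable from start."""
    seen: Set[str] = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def reverse(edges: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Flip every edge so the same walk can go upstream."""
    flipped: Dict[str, List[str]] = {}
    for source, targets in edges.items():
        for target in targets:
            flipped.setdefault(target, []).append(source)
    return flipped


# What breaks if the raw customer file is bad?
downstream = reachable("raw.customers_csv", LINEAGE)
# Where could a bad number in the spend report have come from?
upstream = reachable("product.spend_by_customer", reverse(LINEAGE))
print(sorted(downstream))
print(sorted(upstream))
```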

Dataplex’s data quality scans apply auto-recommended rules to your data, based on the data profile. The rules screen for common issues such as null values, values (such as IDs) that should be unique but aren’t, and values that are out of range, such as birth dates that are in the future or the distant past.
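
The three kinds of rules mentioned above (null checks, uniqueness checks, and range checks) can be sketched in a few lines of Python. The column names, the 120-year plausibility window, and the sample rows are assumptions for illustration; Dataplex derives its recommended rules from the data profile rather than from hard-coded logic like this.

```python
from datetime import date
from typing import Dict, List, Optional

Row = Dict[str, Optional[str]]


def check_quality(rows: List[Row], today: date) -> List[str]:
    """Flag null IDs, duplicate IDs, and implausible birth dates."""
    problems: List[str] = []
    seen_ids = set()
    oldest_plausible = date(today.year - 120, 1, 1)  # assumed cutoff
    for index, row in enumerate(rows):
        customer_id = row.get("customer_id")
        if customer_id is None:
            problems.append(f"row {index}: customer_id is null")
        elif customer_id in seen_ids:
            problems.append(f"row {index}: duplicate customer_id {customer_id}")
        else:
            seen_ids.add(customer_id)

        born = row.get("birth_date")
        if born is None:
            problems.append(f"row {index}: birth_date is null")
            continue
        birth = date.fromisoformat(born)
        if birth > today or birth < oldest_plausible:
            problems.append(f"row {index}: birth_date {born} out of range")
    return problems


rows = [
    {"customer_id": "1", "birth_date": "1985-06-01"},
    {"customer_id": "1", "birth_date": "2090-01-01"},   # duplicate ID, future date
    {"customer_id": None, "birth_date": "1800-01-01"},  # null ID, distant past
]
problems = check_quality(rows, today=date(2023, 4, 11))
for problem in problems:
    print(problem)
```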

I half-joked at the beginning of this review about finding Google Cloud Dataplex somewhat overwhelming. It’s true, it is overwhelming. At the same time, Dataplex seems to be potentially the most complete system I’ve seen for turning raw data from silos into checked and governed unified data products ready for analysis.

Google Cloud Dataplex is still in preview. Some of its components are not in their final form, and others are still missing. Among the missing are connections to on-prem storage, streaming data, and multi-cloud data. Even in preview form, however, Dataplex is highly useful for data engineering.

Vendor: Google

Cost: Based on pay-as-you-go usage; $0.060/DCU-hour standard, $0.089/DCU-hour premium, $0.040/DCU-hour shuffle storage.
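
For a rough sense of what those rates mean, here is a small worked estimate in Python. The rates are the ones quoted above and may have changed since this review; the usage figures are entirely hypothetical.

```python
from decimal import Decimal
from typing import Dict

# Rates in US dollars per DCU-hour, as quoted in this review.
RATES: Dict[str, Decimal] = {
    "standard": Decimal("0.060"),
    "premium": Decimal("0.089"),
    "shuffle_storage": Decimal("0.040"),
}


def estimate_cost(dcu_hours: Dict[str, int]) -> Decimal:
    """Multiply each tier's DCU-hours by its rate and add them up."""
    return sum((RATES[tier] * hours for tier, hours in dcu_hours.items()), Decimal("0"))


# Hypothetical month: 100 standard, 20 premium, 50 shuffle storage DCU-hours.
usage = {"standard": 100, "premium": 20, "shuffle_storage": 50}
total = estimate_cost(usage)
print(f"Estimated cost: ${total:.2f}")
```

That works out to 6.00 + 1.78 + 2.00 dollars for the month in this made-up case.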

Platform: Google Cloud Platform.

Tailscale: Fast and easy VPNs for developers

Posted by on 15 March, 2023


Networking can be an annoying problem for software developers. I’m not talking about local area networking or browsing the web, but the much harder problem of ad hoc, inbound, wide area networking.

Suppose you create a dazzling website on your laptop and you want to share it with your friends or customers. You could modify the firewall on your router to permit incoming web access on the port your website uses and let your users know the current IP address and port, but that could create a potential security vulnerability. Plus, it would only work if you have control over the router and you know how to configure firewalls for port redirection.

Alternatively, you could upload your website to a server, but that’s an extra step that can often become time-consuming, and maintaining dedicated servers can be a burden, both in time and money. You could spin up a small cloud instance and upload your site there instead; that is often fairly cheap, but it is still an extra step that takes time.

Another potential solution is Universal Plug and Play (UPnP), which enables devices to set port forwarding rules by themselves. UPnP needs to be enabled on your router, but it’s only safe if the modem and router are updated and secure. If not, it creates serious security risks on your whole network. The usual advice from security vendors is not to enable it, since the UPnP implementations on many routers are still dangerous, even in 2023. On the other hand, if you have an Xbox in the house, UPnP is what it uses to set up your router for multiplayer gaming and chat.

A simpler and safer way is Tailscale, which allows you to create an encrypted, peer-to-peer virtual network using the secure WireGuard protocol without generating public keys or constantly typing passwords. It can traverse NAT and firewalls, span subnets, use UPnP to create direct connections if it’s available, and connect via its own network of encrypted TCP relay servers if UPnP is not available.

In some sense, all VPNs (virtual private networks) compete with Tailscale. Most other VPNs, however, route traffic through their own servers, which tends to increase the network latency. One major use case for server-based VPNs is to make your traffic look like it’s coming from the country where the server is located; Tailscale doesn’t help much with this. Another use case is to penetrate corporate firewalls by using a VPN server inside the firewall. Tailscale competes for this use case, and usually has a simpler setup.

Besides Tailscale, the only other peer-to-peer VPN is the free open source WireGuard, on which Tailscale builds. WireGuard doesn’t handle key distribution or pushed configurations. Tailscale takes care of all of that.

What is Tailscale?

Tailscale is an encrypted point-to-point VPN service based on the open source WireGuard protocol. Compared to traditional VPNs based on central servers, Tailscale often offers higher speeds and lower latency, and it is usually easier and cheaper to set up and use.

Tailscale is useful for software developers who need to set up ad hoc networking and don’t want to fuss with firewalls or subnets. It’s also useful for businesses that need to set up VPN access to their internal networks without installing a VPN server, which can often be a significant expense.

Installing and using Tailscale

Signing up for a Tailscale Personal plan was free and quick; I chose to use my GitHub ID for authentication. Installing Tailscale took a few minutes on each machine I tried: an M1 MacBook Pro, where I installed it from the macOS App Store; an iPad Pro, installed from the iOS App Store; and a Pixel 6 Pro, installed from the Google Play Store. Installing on Windows starts with a download from the Tailscale website, and installing on Linux can be done using a curl command and shell script, or a distribution-specific series of commands.

tailscale 01 IDG

You can install Tailscale on macOS, iOS, Windows, Linux, and Android. This tab shows the instructions for macOS.

Tailscale uses IP addresses in the 100.x.x.x range and automatically assigns DNS names, which you can customize if you wish. You can see your whole “tailnet” from the Tailscale site and from each machine that is active on the tailnet.
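
If you want to tell tailnet addresses apart from ordinary LAN addresses in a script, a check like the following works. It assumes the 100.x.x.x addresses come from the carrier-grade NAT block 100.64.0.0/10 (RFC 6598), which is the range Tailscale documents for tailnet IPv4 addresses; the function name is my own.

```python
import ipaddress

# Carrier-grade NAT block (RFC 6598), assumed here to be the tailnet range.
CGNAT = ipaddress.ip_network("100.64.0.0/10")


def looks_like_tailnet_address(addr: str) -> bool:
    """True if addr is an IPv4 address inside 100.64.0.0/10."""
    return ipaddress.ip_address(addr) in CGNAT


for candidate in ("100.101.102.103", "100.1.2.3", "192.168.1.10"):
    print(candidate, looks_like_tailnet_address(candidate))
```

Note that not every 100.x.x.x address falls inside the block: it covers 100.64.0.0 through 100.127.255.255.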

In addition to viewing your machines, you can view and edit the services available, the users of your tailnet, your access controls (ACL), your logs, your tailnet DNS, and your tailnet settings.

tailscale 02 IDG

Once the three devices were running Tailscale, I could see them all on my Tailscale login page. I chose to use my GitHub ID for authentication, as I was testing just for myself. If I were setting up Tailscale for a team I would use my team email address.

tailscale 06 IDG

Tailscale pricing.

Tailscale installs a CLI on desktop and laptop computers. It’s not absolutely necessary to use this command line, but many software developers will find it convenient.

How Tailscale works

Tailscale, unlike most VPNs, sets up peer-to-peer connections, aka a mesh network, rather than a hub-and-spoke network. It uses the open source WireGuard package (specifically the userspace Go variant, wireguard-go) as its base layer.

For public key distribution, Tailscale does use a hub-and-spoke configuration, with a coordination server at the hub. Fortunately, public key distribution takes very little bandwidth. Private keys, of course, are never distributed.

You may be familiar with generating public-private key pairs manually to use with ssh, and including a link to the private key file as part of your ssh command line. Tailscale does all of that transparently for its network, and ties the keys to whatever login or 2FA credentials you choose.

The key pair steps are:

  1. Each node generates a random public/private key pair for itself, and associates the public key with its identity.
  2. The node contacts the coordination server and leaves its public key and a note about where that node can currently be found, and what domain it’s in.
  3. The node downloads a list of public keys and addresses in its domain, which have been left on the coordination server by other nodes.
  4. The node configures its WireGuard instance with the appropriate set of public keys.
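
The four steps above can be sketched as a small simulation in Python. This is a toy model, not Tailscale's protocol or API: the class and method names, the example endpoints, and the domain name are hypothetical, and the "public key" is just a SHA-256 hash standing in for a real WireGuard (Curve25519) key pair.

```python
import hashlib
import secrets
from dataclasses import dataclass, field
from typing import Dict, Tuple

PeerInfo = Tuple[str, str]  # (public key, last known endpoint)


@dataclass
class CoordinationServer:
    """Hub that stores only public keys and endpoints, grouped by domain."""

    directory: Dict[str, Dict[str, PeerInfo]] = field(default_factory=dict)

    def register(self, domain: str, node: str, public_key: str, endpoint: str) -> None:
        self.directory.setdefault(domain, {})[node] = (public_key, endpoint)

    def peers(self, domain: str, node: str) -> Dict[str, PeerInfo]:
        return {n: info for n, info in self.directory.get(domain, {}).items() if n != node}


@dataclass
class Node:
    name: str
    domain: str
    endpoint: str
    # Step 1: each node generates its own private key, which never leaves it.
    private_key: str = field(default_factory=lambda: secrets.token_hex(32), repr=False)
    wireguard_peers: Dict[str, PeerInfo] = field(default_factory=dict)

    @property
    def public_key(self) -> str:
        # Stand-in only: real WireGuard derives a Curve25519 public key.
        return hashlib.sha256(self.private_key.encode()).hexdigest()

    def announce(self, server: CoordinationServer) -> None:
        # Step 2: leave the public key and current location on the server.
        server.register(self.domain, self.name, self.public_key, self.endpoint)

    def sync(self, server: CoordinationServer) -> None:
        # Steps 3 and 4: download peers in the domain and configure them locally.
        self.wireguard_peers = server.peers(self.domain, self.name)


server = CoordinationServer()
laptop = Node("laptop", "example-tailnet", "198.51.100.7:41641")
phone = Node("phone", "example-tailnet", "203.0.113.9:41641")
for node in (laptop, phone):
    node.announce(server)
for node in (laptop, phone):
    node.sync(server)
print(sorted(laptop.wireguard_peers))
```

The point of the design is visible in the data: the server's directory holds public keys and endpoints only, while traffic between the nodes would flow directly once each side has the other's key.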

Tailscale doesn’t handle user authentication itself. Instead, it always outsources authentication to an OAuth2, OIDC (OpenID Connect), or SAML provider, including Gmail, G Suite, and Office 365. This avoids the need to maintain a separate set of user accounts or certificates for your VPN.

tailscale 07 IDG

Tailscale CLI help. On macOS, the CLI executable lives inside the app package. A soft link to this executable doesn’t seem to work on my M1 MacBook Pro, possibly because Tailscale runs in a sandbox.

NAT traversal is a complicated process, one that I personally tried unsuccessfully to overcome a decade ago. NAT (network address translation) is one of the ways firewalls work: As a packet goes from your computer to the internet, the firewall translates your computer’s local address to your current public IP address and a random port number, and remembers that port number as yours. When a site returns a response to your request, your firewall recognizes the port and translates it back to your local address before passing you the response.

Tailscale status, Tailscale pings to two devices, and plain pings to the same devices using the native network. Notice that the Tailscale ping to the Pixel device first routes via a DERP server (see below) in NYC, and then manages to find the LAN connection.

Where’s the problem? Suppose you have two firewall clients trying to communicate peer-to-peer. Neither can succeed until someone or something tells both ends what port to use.

This arbitrator will be a server when you use the STUN (Session Traversal Utilities for NAT) protocol; while STUN works on most home routers, it unfortunately doesn’t work on most corporate routers. One alternative is the TURN (Traversal Using Relays around NAT) protocol, which uses relays to get around the NAT deadlock issue; the trouble with that is that TURN is a pain in the neck to implement, and there aren’t many existing TURN relay servers.

Tailscale implements a protocol of its own for this, called DERP (Designated Encrypted Relay for Packets). This use of the term DERP has nothing to do with being goofy, but it does suggest that someone at Tailscale has a sense of humor.

Tailscale has DERP servers around the world to keep latency low; these include nine servers in the US. If, for example, you are trying to use Tailscale to connect your smartphone from a park to your desktop at your office, the chances are good that the connection will route via the nearest DERP server. If you’re lucky, the DERP server will only be used as a side channel to establish the connection. If you’re not, the DERP server will carry the encrypted WireGuard traffic between your nodes.
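The two outcomes described above (relay as a side channel only, or relay carrying the traffic) can be sketched as a simple decision. This is an assumption-laden illustration rather than Tailscale's actual path-selection logic: `choose_path`, the region names, and the latency figures are all hypothetical.

```python
def choose_path(direct_ok, relay_latency_ms):
    """Pick the nearest relay for connection setup, then decide who carries the data.

    direct_ok: whether NAT traversal found a working peer-to-peer path.
    relay_latency_ms: hypothetical measured latency to each relay region
                      (must be non-empty).
    """
    nearest = min(relay_latency_ms, key=relay_latency_ms.get)
    if direct_ok:
        # Lucky case: the relay is only a side channel to establish the connection.
        return {"setup": nearest, "data": "direct"}
    # Unlucky case: the relay forwards the (still encrypted) WireGuard packets.
    return {"setup": nearest, "data": nearest}


latencies = {"nyc": 12, "sfo": 80, "fra": 95}  # illustrative numbers only
lucky = choose_path(True, latencies)
unlucky = choose_path(False, latencies)
```

Either way the relay only ever handles encrypted packets, so falling back to it costs latency rather than confidentiality.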

Tailscale vs. other VPNs

Tailscale offers a reviewer’s guide. I often look at such documents and then do my own thing because I’ve been around the block a couple of times and recognize when a company is putting up straw men and knocking them down, but this one is somewhat helpful. Here are some key differentiators to consider.

With most VPNs, when you are disconnected you have to log in again. It can be even worse when your company has two internet providers and two VPN servers to handle them, because you usually have to figure out what’s going on by trial and error or by attempting to call the network administrator, who is probably up to his or her elbows in crises. With Tailscale (and WireGuard), the connection just resumes. Similarly, many VPN servers have trouble with flaky connections such as LTE. Tailscale and WireGuard take the flakiness in stride.

With most VPNs, getting a naive user connected for the first time is an exercise in patience for the network administrator and possibly scary for the user who has to “punch a hole” in her home firewall to enable the connection. With Tailscale it’s a five-minute process that isn’t scary at all.

Most VPNs want to be exclusive. Connecting to two VPN concentrators at once is considered a cardinal sin and a potential security vulnerability, especially if they are at different companies. Tailscale doesn’t care. WireGuard can handle this situation just fine even with hub-and-spoke topologies, and with Tailscale point-to-point connections there is a Zero Trust configuration that exposes no vulnerability.

Tailscale solutions

Tailscale has documented about a dozen solutions to common use cases that can be addressed with its ad hoc networking. These range from wanting to code from your iPad to running a private Minecraft server without paying for hosting or opening up your firewall.

As we’ve seen, Tailscale is simple to use, but also sophisticated under the hood. It’s an easy choice for ad hoc networking, and a reasonable alternative to traditional hub-and-spoke VPNs for companies. The only common VPN function that I can think of that it won’t do is spoof your location so that you can watch geographically restricted video content—but there are free VPNs that handle that.

Cost: Personal, open source, and “friends and family” plans, free. Personal Pro, $48 per year. Team, $5 per user per month (free trial available). Business, $15 per user per month (free trial available). Custom plans, contact sales.

Platform: macOS 10.13 or later, Windows 7 SP1 or later, Linux (most major distros), iOS 15 or later, Android 6 or later, Raspberry Pi, Synology.
