
Llama Chat and Code Llama are good at coding

Posted on 12 September, 2023


A funny thing happened when I was playing with Poe, a chatbot aggregator from Quora. I selected the recently released free almost-open-source Llama 2 70B Chat model from Meta and gave it the prompt “Generate a Python program to scrape a website. Include tests for python.org and infoworld.com.” That’s a shorter version of a prompt I used to test CodeWhisperer, Bard, and Copilot X in June. None of those three models generated a working program.

To my utter surprise, Llama 2 70B Chat aced this test:

[Screenshot: llama code generation 01. IDG]

The bot went on to suggest some follow-up questions:

Tell me more.
Can you explain how the BeautifulSoup library works for parsing HTML content?
Are there any other libraries or tools commonly used for web scraping in Python?
What are some best practices for web scraping to ensure compliance with website terms of service?

The generated code looked good to me, so I copied it into Visual Studio Code, saved it, and ran it. It ran perfectly:

martinheller@Martins-M1-MBP Code % /usr/local/bin/python3 /Users/martinheller/Code/Llama/scrapesite.py
Title: Welcome to Python.org
Link: #content
Link: #python-network
Link: /
Link: https://www.python.org/psf/
Link: https://docs.python.org
Link: https://pypi.org/
Link: /jobs/
Link: /community-landing/
Link: #top
Link: /
Link: https://psfmember.org/civicrm/contribute/transact?reset=1&id=2
Link: #site-map
Link: #
Link: javascript:;
Link: javascript:;
Link: javascript:;
Link: #
Link: https://www.facebook.com/pythonlang?fref=ts
Link: https://twitter.com/ThePSF
Link: /community/irc/
Link: /about/
Link: /about/apps/
Link: /about/quotes/
Link: /about/gettingstarted/
Link: /about/help/
Link: http://brochure.getpython.info/
Link: /downloads/
Link: /downloads/
Link: /downloads/source/
Link: /downloads/windows/
Link: /downloads/macos/
Link: /download/other/
Link: https://docs.python.org/3/license.html
Link: /download/alternatives
Link: /doc/
…

Comparing the Llama-generated code with the CodeWhisperer-generated code, the major difference is that Llama used the html.parser parser for Beautiful Soup, which worked, while CodeWhisperer used the lxml parser, which choked.
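
To see what that difference amounts to, here is a minimal sketch of the scraping pattern both tools were aiming at. It is my own illustration, not the exact code either tool generated; the parser argument on the BeautifulSoup constructor is the only line that differs between the two approaches.

import requests
from bs4 import BeautifulSoup

html = requests.get("https://www.python.org/").text

# Python's built-in parser, which is what Llama chose; no extra install needed.
soup = BeautifulSoup(html, "html.parser")
# CodeWhisperer's choice; it needs `pip install lxml`, otherwise bs4 raises FeatureNotFound.
# soup = BeautifulSoup(html, "lxml")

print("Title:", soup.title.string)
for a in soup.find_all("a"):
    print("Link:", a.get("href"))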

Llama 2 code explanation

I also asked Llama 2 70B Chat to explain the same sample program I had given to CodeWhisperer, Bard, and Copilot X. CodeWhisperer doesn’t currently have a chat window, so it doesn’t do code explanations, but Bard did a great job on this task and Copilot X did a good job.

[Screenshots: llama code generation 02, 03, and 04. IDG]

Llama’s explanation (shown above) is as good as, or possibly better than, what Bard generated. I don’t completely understand why Llama stopped at item 12, but I suspect that it may have hit a token limit, unless I accidentally hit the “stop” button in Poe and didn’t notice.

For more about Llama 2 in general, including discussion of its potential copyright violations and whether it’s open source or not, see “What is Llama 2? Meta’s large language model explained.”

Coding with Code Llama

A couple of days after I finished working with Llama 2, Meta AI released several Code Llama models. A few days after that, at Google Cloud Next 2023, Google announced that they were hosting Code Llama models (among many others) in the new Vertex AI Model Garden. Additionally, Perplexity made one of the Code Llama models available online, along with three sizes of Llama 2 Chat.

So there were several ways to run Code Llama at the time I was writing this article. It’s likely that there will be several more, and several code editor integrations, in the coming months.

Poe didn’t host any Code Llama models when I first tried it, but during the course of writing this article Quora added Code Llama 7B, 13B, and 34B to Poe’s repertoire. Unfortunately, all three models gave me the dreaded “Unable to reach Poe” error message, which I interpret to mean that the model’s endpoint is busy or not yet connected. The following day, Poe updated, and running the Code Llama 34B model worked:

[Screenshot: llama code generation 05. IDG]

As you can see from the screenshot, Code Llama 34B went one better than Llama 2 and generated programs using both Beautiful Soup and Scrapy.
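
For readers who haven’t used Scrapy, a minimal spider for the same task looks something like the sketch below. This is my own illustration of the Scrapy approach, not Code Llama’s output; run it with `scrapy runspider scrape_links.py`.

import scrapy

class LinkSpider(scrapy.Spider):
    name = "links"
    start_urls = ["https://www.python.org/", "https://www.infoworld.com/"]

    def parse(self, response):
        # Record the page title, then every link href on the page.
        yield {"url": response.url, "title": response.css("title::text").get()}
        for href in response.css("a::attr(href)").getall():
            yield {"url": response.url, "link": href}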

Perplexity is a website that hosts a Code Llama model, as well as several other generative AI models from various companies. I tried the Code Llama 34B Instruct model, optimized for multi-turn code generation, on the Python code-generation task for website scraping:

[Screenshot: llama code generation 06. IDG]

As far as it went, this wasn’t a bad response. I know that the requests.get() method and bs4 with the html.parser engine work for the two sites I suggested for tests, and finding all the links and printing their href attributes is a good start on processing. A very quick code inspection suggested something obvious was missing, however:

[Screenshot: llama code generation 07. IDG]

Now this looks more like a command-line utility, but some of the functionality is missing. I would have preferred a functional form, but I said “program” rather than “function” when I made the request, so I’ll give the model a pass. On the other hand, the program as it stands will fail with errors about the undefined functions when run.

[Screenshot: llama code generation 08. IDG]

Returning JSON wasn’t really what I had in mind, but for the purposes of testing the model I’ve probably gone far enough.

Llama 2 and Code Llama on Google Cloud

At Google Cloud Next 2023, Google Cloud announced that new additions to Google Cloud Vertex AI’s Model Garden include Llama 2 and Code Llama from Meta, and published a Colab Enterprise notebook that lets you deploy pre-trained Code Llama models using vLLM for the best available serving throughput.

If you need to use a Llama 2 or Code Llama model for less than a day, you can do so for free, and even run it on a GPU. Use Colab. If you know how, it’s easy. If you don’t, search for “run code llama on colab” and you’ll see a full page of explanations, including lots of YouTube videos and blog posts on the subject. Note that Colab is free but time-limited and resource-limited, while Colab Enterprise costs money but isn’t limited.

If you want to create a website for running LLMs, you can use the same vLLM library used in the Google Cloud Colab notebook to set up an API. Ideally, you’ll set it up on a server with a GPU big enough to hold the model you want to use, but that isn’t totally necessary: You can get by with something like an M1 or M2 Macintosh as long as it has enough RAM to run your model. You can also use LangChain for this, at the cost of writing or copying a few lines of code.
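
As a concrete example, vLLM’s offline generation API boils down to a few lines. This is a hedged sketch: it assumes a machine vLLM supports (an NVIDIA GPU at the time of writing), and the Code Llama model name below is my assumption; substitute whichever variant you actually have access to.

from vllm import LLM, SamplingParams

# Model name is an assumption; swap in the Code Llama variant you want to serve.
llm = LLM(model="codellama/CodeLlama-7b-Instruct-hf")
params = SamplingParams(temperature=0.2, max_tokens=512)

prompts = ["Generate a Python program to scrape a website."]
for output in llm.generate(prompts, params):
    print(output.outputs[0].text)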

Running Llama 2 with Ollama

If you are using an Arm-based Macintosh as your workstation, you can run Llama models locally as a command-line utility. The invaluable Sharon Machlis explains how to use Ollama; it’s easy, although if you don’t have enough RAM for the model it’ll use virtual memory (i.e., SSD or, heaven forfend, spinning disk) and run really slowly. (Linux and Windows support is planned for Ollama.)
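
If you would rather call a local model from code than from the terminal, Ollama also serves an HTTP API on your machine. The sketch below is hedged: it assumes your Ollama version exposes the /api/generate endpoint on port 11434 (check the docs for your release) and that you have already pulled the llama2 model.

import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama2",
        "prompt": "Generate a Python program to scrape a website.",
        "stream": False,  # return one JSON object instead of a token stream
    },
    timeout=300,
)
print(resp.json().get("response", ""))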

I tried out Ollama with several models (of the many it supports) on my M1 MacBook Pro, which unfortunately has only 8GB of RAM. I started with my standard Python web-scraping code generation task using Llama 2, apparently one of the smaller models (7B?). The result is similar to what I got from the Llama 2 70B model running on Poe, although not as well-structured. Note that Ollama only downloads the model the first time it needs it.

martinheller@Martins-M1-MBP ~ % ollama run llama2
pulling manifest
pulling 8daa9615cce3... 100% |███████████████| (3.8/3.8 GB, 51 MB/s)
pulling 8c17c2ebb0ea... 100% |██████████████████| (7.0/7.0 kB, 39 MB/s)
pulling 7c23fb36d801... 100% |█████████████████| (4.8/4.8 kB, 5.0 MB/s)
pulling 578a2e81f706... 100% |████████████████████| (95/95 B, 129 kB/s)
pulling e35ab70a78c7... 100% |████████████████████| (90/90 B, 550 kB/s)
pulling 96862bb35d77... 100% |██████████████████| (447/447 B, 2.8 MB/s)
verifying sha256 digest
writing manifest
success
>>> Generate a Python program to scrape a website. Include tests for python.org and infoworld.com
Sure! Here is an example of how you can use the `requests` and `BeautifulSoup` libraries in Python to scrape a website:
```
import requests
from bs4 import BeautifulSoup

# Example website to scrape
url = "https://www.python.org"

# Send a request to the website and get the HTML response
response = requests.get(url)

# Parse the HTML content using Beautiful Soup
soup = BeautifulSoup(response.content, 'html.parser')

# Find the specific information you want to scrape on the website
title = soup.find('h1', class_='main-title').text
print(title)  # Output: "Python.org"

# Find all the links on the page
links = soup.find_all('a')
for link in links:
    print(link.get('href'))  # Output: ["/python", "/docs", "/learn", etc.]

# Test the code with different websites
url2 = "https://www.infoworld.com"
response2 = requests.get(url2)
soup2 = BeautifulSoup(response2.content, 'html.parser')
title2 = soup2.find('h1', class_='main-title').text
print(title2)  # Output: "Infoworld"

# Find all the links on the page
links2 = soup2.find_all('a')
for link in links2:
    print(link.get('href'))  # Output: ["/news", "/features", "/opinions", etc.]
```
This code sends a request to the specified website, parses the HTML content using Beautiful Soup, and then finds specific information on the page (in this case, the title and all the links). You can modify the `url` variable at the top of the script to specify the website you want to scrape.

For example, if you want to scrape the Python documentation, you can set `url = "https://www.python.org/doc/` and then use the same code as before to find the title and links on the page.

To test the code with different websites, you can modify the `url2` variable at the top of the script to specify a different website. For example, you can set `url2 = "https://www.infoworld.com"` to scrape that website, or any other website you want to test.

Note: Make sure you have the necessary permissions and legal rights to scrape a website. Web scraping can be illegal in some cases, so be sure to check the terms of service for the website you're scraping and respect their rules.

With that baseline established, I tried the same prompt using Code Llama. Again, I didn’t specify the model size, but it looks like it is 7B.

martinheller@Martins-M1-MBP ~ % ollama run codellama
pulling manifest
pulling 8268242df6f0... 100% |████████████████████████████████████████████| (3.8/3.8 GB, 50 MB/s)
pulling 2c8743bdc4ad... 100% |███████████████████████████████████████████████| (7.0/7.0 kB, 40 MB/s)
pulling 38fa20ee7daa... 100% |███████████████████████████████████████████████| (4.8/4.8 kB, 28 MB/s)
pulling 578a2e81f706... 100% |█████████████████████████████████████████████████| (95/95 B, 993 kB/s)
pulling 404e21afdc6a... 100% |█████████████████████████████████████████████████| (30/30 B, 187 kB/s)
pulling 86b68a7bb45d... 100% |███████████████████████████████████████████████| (507/507 B, 4.0 MB/s)
verifying sha256 digest
writing manifest
success
>>> Generate a Python program to scrape a website. Include tests for python.org and infoworld.com
 To write a Python program that scrapes a website, you will need to use a library such as BeautifulSoup or Scrapy. These libraries allow you to navigate and parse the HTML structure of a webpage, and extract specific data from it.
Here is an example of how you might use BeautifulSoup to scrape the Python website at python.org:
```
import requests
from bs4 import BeautifulSoup

# Send a request to the Python website
url = 'https://www.python.org'
response = requests.get(url)

# Parse the HTML content using Beautiful Soup
soup = BeautifulSoup(response.content, 'html.parser')

# Find all links on the page
links = soup.find_all('a')

# Print each link to the console
for link in links:
    print(link.get('href'))
```
To write tests for this program, you can use a testing framework such as unittest or pytest. These frameworks allow you to define test functions that can be run automatically to verify that your code is working correctly.
Here is an example of how you might write tests using the unittest module:
```
import unittest
from bs4 import BeautifulSoup
class TestPythonOrg(unittest.TestCase):
    def setUp(self):
        # Send a request to the Python website
        url = 'https://www.python.org'
        response = requests.get(url)
        # Parse the HTML content using Beautiful Soup
        soup = BeautifulSoup(response.content, 'html.parser')
        self.soup = soup
    def test_title(self):
        title = self.soup.find('title').text
        self.assertEqual(title, 'Python Documentation - Python v3.10')
    def test_description(self):
        description = self.soup.find('meta', {'name': 'description'})['content']
        self.assertIn('The official homepage for the Python Programming Language', description)
    def tearDown(self):
        pass

if __name__ == '__main__':
    unittest.main()
```
This test program sends a request to the Python website, parses the HTML content using Beautiful Soup, and then runs two separate tests: one that verifies that the page title is correct, and another that verifies that the page description contains certain keywords.
You can run this test program by saving it to a file (e.g. `test_python_org.py`), opening a terminal or command prompt, and then running the following command:
```
python test_python_org.py
```
This will run all of the test functions in the program, and print a message indicating whether each test passed or failed.

This model took my request for tests a little more seriously, and demonstrated how to use Python’s unittest module.

I wondered how different the output of the Python-tuned Code Llama 7B model would be:

martinheller@Martins-M1-MBP ~ % ollama run codellama:7b-python
pulling manifest
pulling 1dd2448f61e7... 100% |████████████████████████████████████████████| (3.8/3.8 GB, 48 MB/s)
pulling 2c8743bdc4ad... 100% |███████████████████████████████████████████████| (7.0/7.0 kB, 52 MB/s)
pulling 38fa20ee7daa... 100% |███████████████████████████████████████████████| (4.8/4.8 kB, 64 MB/s)
pulling b507b9c2f6ca... 100% |██████████████████████████████████████████████████| (13/13 B, 60 kB/s)
pulling 404e21afdc6a... 100% |█████████████████████████████████████████████████| (30/30 B, 240 kB/s)
pulling b413591f8542... 100% |███████████████████████████████████████████████| (507/507 B, 3.2 MB/s)
verifying sha256 digest
writing manifest
success
>>> Generate a Python program to scrape a website. Include tests for python.org and infoworld.com
"""

import urllib.request, html.parser
class Website(object):
    """
    Class used to store the data scraped from the website
    :param str title: Title of article/page
    :param str author: Author of article/page
    :param str date: Date that the article was published on
    :param str content: Body text of article
    """
    
    def __init__(self, title=None, author=None, date=None, content=None):
        self.title = title #title of post/article
        self.author = author #author of the post or article
        self.date = date #date that the article was published on
        self.content = content #the body text of the post/article
class Article(object):
    """
    Class used to store data scraped from an infowars article
    :param str title: Title of article/page
    :param str author: Author of article/page
    :param str date: Date that the article was published on
    :param str content: Body text of article
    """
    
    def __init__(self, title=None, author=None, date=None, content=None):
        self.title = title #title of post/article
        self.author = author #author of the post or article
        self.date = date #date that the article was published on
        self.content = content #the body text of the post/article
def scrape(website_url, htmlparser=html.parser):
    """
    Function used to gather all the data from a website and return it in a form for use later

    :param str url: The URL of the website you are scraping from
    :param object parser: Defaults to the built-in html.parser but can be changed depending on the website's code style
    """
    
    #create website object and assign it to a variable called 'website'
    website = Website()
    
    #open url using urlopen function from urllib.request module
    with urllib.request.urlopen(website_url) as url:
        #use read function to get the contents of the page and store it in a variable called 'content'
        content = url.read()
        
        #use decode_content=True argument for html parsing module to be able to parse the url content
        parsed_content = htmlparser(markup=content, convert_charrefs=True, decode_entities=False, decode_content=True)
        print("n")
        
        #search through content of webpage and find all instances where a tag is used to bold/italicize text in the article body text
        title = parsed_content.find("h1", {"class": "entry-title"})
        author = parsed_content.find("div", {"class": "author-info"}).text
        date = parsed_content.find("span", {"class": "updated"}).text
        content = parsed_content.findAll("p")[3].text + 'n' + parsed_content.findAll("p")[4].text + 'n' + parsed_content.findAll("p")[5].text
        
        #store all the data gathered from the website into variables in the 'website' object created earlier
        website.title = title.text
        website.author = author
        website.date = date
        website.content = content
        return website

All of a sudden, we see Python classes and a function that uses them. This run forgot to generate the unit tests. Let’s try the same prompt and model again and see if we get different output.

martinheller@Martins-M1-MBP ~ % ollama run codellama:7b-python
>>> Generate a Python program to scrape a website. Include tests for python.org and infoworld.com
.
#Python program to scrap data from website#
import requests
from bs4 import BeautifulSoup as soup
url_1 = 'http://python.org/'
reqs_1 = requests.get(url_1)
url_2 = 'https://www.infoworld.com/article/3067955/programming-language/python-the-best-programming-language.html'
reqs_2 = requests.get(url_2)
#HTML parsing from website python.org#
page_1 = reqs_1.text
#HTML parsing from website infoworld.com#
page_2 = reqs_2.text
#BeautifulSoup object for website python.org#
soup_1 = soup(page_1, 'html.parser')
#BeautifulSoup object for website infoworld.com#
soup_2 = soup(page_2,'html.parser')
def scrape():
    #Dictionary to store data from both websites python.org and infoworld.com#
    website_data = {}
    #Scraping all the text from headings of a website#
    headlines = soup_1.find_all('div',{'class':'headline'})
    for line in headlines:
        data = line.text
        if "Python" in data or 'Programming' in data:
            website_data['headlines'] = [data]
    
    #Scraping all the text from paragraphs of a website#
    paragrphs = soup_1.find_all('p')
    for line in paragrphs:
        data = line.text
        if "Python" in data or 'Programming' in data:
            website_data['paras'] = [data]
    
    #Scraping all the text from headings of a website#
    headlines2 = soup_2.find_all('p')
    for line in headlines2:
        data = line.text
        if "Python" in data or 'Programming' in data:
            website_data['headlines'] = [data]
    
    #Scraping all the text from paragraphs of a website#
    paragrphs2 = soup_1.find_all('p')
    for line in paragrphs2:
        data = line.text
        if "Python" in data or 'Programming' in data:
            website_data['paras'] = [data]
    
    #Return the dictionary of all text scraped from both websites python.org and infoworld.com#
    return website_data

This time the model generated code in a functional style instead of an object-oriented style. It did something ugly, though: Instead of using the home page of infoworld.com for its second test, it used the URL of an article about the Python programming language. Alas, that page does not currently exist, so we either bumped up against stale content in the model’s training data or the model hallucinated the URL.
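
Stale or hallucinated URLs are cheap to catch before you scrape them. A quick liveness check like the sketch below, which is my own addition rather than anything the models produced, would have flagged the dead article link immediately.

import requests

def url_is_live(url: str) -> bool:
    """Return True if the URL responds with a non-error status."""
    try:
        r = requests.head(url, allow_redirects=True, timeout=10)
        if r.status_code == 405:  # some sites reject HEAD; fall back to a light GET
            r = requests.get(url, stream=True, timeout=10)
        return r.status_code < 400
    except requests.RequestException:
        return False

print(url_is_live("https://www.infoworld.com/"))  # expect True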

Two Llamas for coding

As you’ve seen, Llama 2 Chat can generate and explain Python code quite well, right out of the box. There’s no need to fine-tune it further on code-generation tasks—although Meta has done exactly that for Code Llama.

Llama 2 Chat is not without controversy, however. Meta says that it’s open source, but the OSI begs to disagree, on two counts. Meta says that it’s more ethical and safer than other LLMs, but a class action lawsuit from three authors says that its training has violated their copyrights.

It’s nice that Llama 2 Chat works so well. It’s troubling that to train it to work well Meta may have violated copyrights. Perhaps, sooner rather than later, someone will find a way to train generative AIs to be effective without triggering legal problems.

Code Llama’s nine fine-tuned models offer additional capabilities for code generation, and the Python-specific versions seem to know something about Python classes and testing modules as well as about functional Python.

When the bigger Code Llama models are more widely available online running on GPUs, it will be interesting to see how they stack up against Llama 2 70B Chat. It will also be interesting to see how well the smaller Code Llama models perform for code completion when integrated with Visual Studio Code or another code editor.


Review: CodeWhisperer, Bard, and Copilot X

Posted on 27 June, 2023


When I wrote about the GitHub Copilot preview in 2021, I noted that the AI pair programmer didn’t always generate good, correct, or even running code, but was still somewhat useful. At the time, I concluded that future versions could be real time-savers. Two years later, Copilot is improving. These days, it costs money even for individuals, and it has some competition. In addition, the scope of coding assistants has expanded beyond code generation to code explanations, pull request summaries, security scanning, and related tasks.

Three tools for AI pair programming

Let’s start with a quick overview of the tools under review, then we’ll dive in for a closer look at each one.

  • Amazon CodeWhisperer is the product that competes most directly with Copilot. A “coding companion” like Copilot, CodeWhisperer integrates with Visual Studio Code and JetBrains IDEs, generates code suggestions in response to comments and code completions based on existing code, and can scan code for security issues. CodeWhisperer supports five programming languages well, and another 10 with a lesser degree of support. It can optionally flag and log references to code it uses and optionally filter out code suggestions that resemble open source training data.
  • Google Bard is a web-based interface to LaMDA (Language Model for Dialogue Applications), a conversational AI model capable of fluid, multi-turn dialogue. Bard recently added the ability to help with coding and topics about coding. When Bard emits code that may be subject to an open source license, it cites its sources and provides the relevant information. Bard is also good at code explanations.
  • GitHub Copilot X is “leveling up” from the original Copilot with chat and terminal interfaces, support for pull requests, and early adoption of OpenAI’s GPT-4. Currently, to access Copilot X you need to have an active Copilot subscription and join the waiting list, with no guarantee about when you’ll get access to the new features. It took about a month for my invitation to arrive after I joined the waiting list.

Using one of these code generators is not the only way to generate code. To begin with, you can access general-purpose transformers like GPT-4 and its predecessors, including ChatGPT, BingGPT/Bing Chat (available in the Edge browser), and Boo.ai. There are also other code-specific AI tools, such as StarCoder, Tabnine, Cody, AlphaCode, Polycoder, and Replit Ghostwriter. In every case I’ve mentioned, it is vital to use discretion and carefully test and review the generated code before using it.

How the tools were tested

In my previous article about code generation, I evaluated the AI code generators based on the rather easy task of writing a program to determine the number of days between two dates. Most did okay, although some needed more guidance than others. For this review, I tried the code generators on the more difficult task of scraping InfoWorld.com for a list of articles. I gave them an outline but no additional help. None generated correct code, although some came closer than others. As an additional task, I asked the tools that support code explanation to explain a Python code example from an MIT Open Courseware introductory programming course.

For reference, the outline I gave to the code generators is:


Scrape infoworld.com front page:
	Open https://www.infoworld.com/
	Find all articles by looking for links with ‘article’ in the href; extract title, author, date from each
	List all articles alphabetically by title; eliminate duplicates 
	List all articles alphabetically by author last name
	List all articles latest first
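
For comparison, here is a minimal hand-written sketch of the first steps of that outline. It is not output from any of the reviewed tools, and it deliberately stops at collecting and de-duplicating titled article links, because extracting authors and dates reliably depends on InfoWorld’s markup, which I haven’t encoded here.

import requests
from bs4 import BeautifulSoup

def scrape_infoworld_front_page():
    soup = BeautifulSoup(requests.get("https://www.infoworld.com/").text, "html.parser")
    articles = {}
    for a in soup.find_all("a", href=True):
        if "article" in a["href"]:
            title = a.get_text(strip=True)
            if title:                        # skip image-only links
                articles[title] = a["href"]  # keying by title drops duplicates
    return articles

if __name__ == "__main__":
    for title, href in sorted(scrape_infoworld_front_page().items()):
        print(f"{title}: {href}")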

In general, I tried to act like a more naive programmer than I am, just to see what the tools would do.

Now, let’s look more closely at each of our code generators.

Amazon CodeWhisperer

Within your IDE, Amazon CodeWhisperer analyzes your English language comments and surrounding code to infer what code it should generate to complete what you are typing. Then, it offers code as a suggestion that you can either accept or reject, or you can ask CodeWhisperer for alternate code, or ignore and continue writing your own code. CodeWhisperer’s large language model (LLM) was trained on billions of lines of code, including Amazon and open source code. Any given suggestion is based not only on your comments and immediate code context, but also on the contents of other files open in the IDE.
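
To make that workflow concrete: you type a comment (and perhaps a signature), and the tool proposes a body you can accept, reject, or cycle through. The completion below is purely illustrative, my own sketch rather than recorded CodeWhisperer output, but it is the kind of AWS-flavored suggestion the tool is tuned for.

# Comment you might type in the editor:
# Upload a file to an S3 bucket and return the object URL

import boto3

def upload_to_s3(file_path: str, bucket: str, key: str) -> str:
    # Body of the sort a coding assistant might suggest (illustrative only).
    s3 = boto3.client("s3")
    s3.upload_file(file_path, bucket, key)
    return f"https://{bucket}.s3.amazonaws.com/{key}"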

In addition to code generation, CodeWhisperer can scan your Python, Java, and JavaScript code for security vulnerabilities and suggest fixes for them. The vulnerability lists it consults include Open Web Application Security Project (OWASP), crypto library best practices, AWS API best practices, and other API best practices. Security scans occur on-demand, unlike code completion, which is offered continuously as you code unless you turn off suggestions.

Programming languages and IDEs

CodeWhisperer’s best programming languages for code generation (the most prevalent languages in the training corpus) are Java, Python, JavaScript, TypeScript, and C#. It has been trained to a lesser extent on Ruby, Go, PHP, C++, C, Shell, Scala, Rust, Kotlin, and SQL.

There are CodeWhisperer plugins for Visual Studio Code and JetBrains IDEs. You can also activate CodeWhisperer for use inside AWS Cloud9 and AWS Lambda; in both cases, you must edit your IAM permissions as well as check the Enable CodeWhisperer box.

I installed CodeWhisperer in Visual Studio Code. There are four steps:

  • Add the CodeWhisperer plugin to VS Code.
  • Add a connection to AWS.
  • Sign in on the AWS website.
  • Start CodeWhisperer from the AWS developer tools panel.
CodeWhisperer usage instructions. IDG

Figure 1. Once you have installed and authorized CodeWhisperer (at left), you can see the usage instructions, shown here on the right.

Code suggestions and completions

It’s worth watching some of the videos demonstrating CodeWhisperer’s capabilities, listed on the CodeWhisperer resource page. They’ll tell you more than I can in words and screenshots. While watching them, I realized that CodeWhisperer shines on code that calls AWS APIs.

Nevertheless, here’s a screenshot of the generated code from CodeWhisperer:

CodeWhisperer generating code. IDG

Figure 2. CodeWhisperer generating code.

Note that the generated code did not work, and also that CodeWhisperer generated multiple functions for more or less the same purpose. It was fairly close, however, and CodeWhisperer kind of read my mind about using Beautiful Soup to scrape the web page.

CodeWhisperer program. IDG

Figure 3. Trying the CodeWhisperer-generated program. 

Security scans and open source references

I ran the CodeWhisperer security scan on my project, as well. Unsurprisingly, no issues were detected in this file, although the CodeWhisperer scan found a resource leak in another file generated by GitHub Copilot X. The CodeWhisperer security scan is a differentiating feature that hasn’t yet surfaced in the competing AI code generators.

This file did not trigger any open source references, but when I opened the CodeWhisperer Reference Log, I found a link to the setting that controls whether CodeWhisperer will suggest code that has such references. In the Amazon examples I viewed, code suggestions with references displayed the license type at the top.

The bottom line for CodeWhisperer

CodeWhisperer generates code in an IDE and does security reviews, but it lacks a chat window and code explanations. The tool works more or less as advertised, and the price (free) is right for individual developers. For corporate developers, the $19 per month price for CodeWhisperer is the same as for Copilot. I’d favor CodeWhisperer over Copilot for shops that use AWS heavily and shops that need to know when code suggestions refer to open source.

Google Bard

Bard support for programming was announced April 21, 2023. The blog post mentioned “more than 20 programming languages including C++, Go, Java, Javascript, Python and Typescript.”

When I asked Bard for the full list, I received the following response:

Bard supports over 20 programming languages, including:

  • C++
  • Go
  • Java
  • JavaScript
  • Python
  • TypeScript
  • C#
  • PHP
  • Ruby
  • Swift
  • Kotlin
  • Rust
  • Julia
  • R
  • Haskell
  • Erlang
  • Scala
  • Lua
  • OCaml
  • Scheme

Bard can generate code in these languages, debug code, and explain code. It can also translate from one programming language to another.

Please note that Bard is still under development, so its capabilities may not be perfect. It is always a good idea to review AI-generated code carefully before using it.

I tested Bard on my code explanation and code generation tasks, which are shown below. The code explanation was quite good—so good that I wondered whether Bard had been trained on that MIT course code sample, which dates from 2008 and has been available on the web since then.

The code generation for the site scraper looked fairly credible at first glance but had several obvious problems on further inspection. It was a matter of clicking twice to open the code in Colab, but it threw a type error when I tried to run it, and it didn’t even get as far as the code I had identified as wrong.

Rather than fixing the problems, I tried running the code generation again, and the result made a bit more sense. It threw an attribute error when I tried to run it in Colab, however. I fed that traceback back to Bard, it generated a fix, I ran the “fixed” code in Colab, and there was a type error in a different spot. Enough, already! If I was actually a naive programmer, I would have thrown up my hands.

What about citing web pages and open source code? The Bard FAQ had this to say:

Bard, like some other standalone LLM experiences, is intended to generate original content and not replicate existing content at length. We’ve designed our systems to limit the chances of this occurring, and we will continue to improve how these systems function. If Bard does directly quote at length from a webpage, it cites that page.

Sometimes the same content may be found on multiple webpages and Bard attempts to point to a popular source. In the case of citations to code repositories, the citation may also reference an applicable open source license.

Given that Bard is completely free at this time, and requires no setup, it’s certainly worth trying when you need to generate, debug, and explain code in any of these 20 languages. Don’t forget to review, test, and debug any generated code.

Here is the instruction I gave to Bard and the resulting dialog:


I'm about to give you some code to explain. Please wait until the code is complete.

Sure, I will wait until you are finished giving me the code. Please feel free to take your time. I am here to help you in any way that I can.

Here are some tips for explaining code:
•	Start by explaining the overall purpose of the code. What does it do? What problem is it trying to solve?
•	Then, break down the code into smaller chunks and explain each one in turn.
•	Use clear and concise language. Avoid jargon and technical terms.
•	Use examples to illustrate your points.
•	Be patient and don't be afraid to ask questions.

I am confident that I can help you explain any code that you give me. Just let me know what you need and I will do my best to help.

And here is the code I provided:


import pylab, random

class Stock(object):
    def __init__(self, price, distribution, vol):
        self.price = price
        self.history = [price]
        self.distribution = distribution
        self.vol = vol
        self.lastChangeInfluence = 0.0
    def setPrice(self, price):
        self.price = price
        self.history.append(price)
    def getPrice(self):
        return self.price
    def makeMove(self, bias, mo):
        oldPrice = self.price
        baseMove = self.distribution(self.vol) + bias
        self.price = self.price * (1.0 + baseMove)
        self.price += mo*random.choice([0.0, 1.0])*self.lastChangeInfluence
        self.history.append(self.price)
        change = self.price - oldPrice
        if change >= 0:
            self.lastChangeInfluence = min(change, oldPrice*0.01)
        else:
            self.lastChangeInfluence = max(change, -oldPrice*0.01)
    def showHistory(self, fig, test):
        pylab.figure(fig)
        pylab.plot(self.history)
        pylab.title('Closing Prices, Test ' + test)
        pylab.xlabel('Day')
        pylab.ylabel('Price')

class SimpleMarket(object):
    def __init__(self, numStks, volUB):
        self.stks = []
        self.bias = 0.0
        for n in range(numStks):
            volatility = random.uniform(0, volUB)
            distribution = lambda vol: random.gauss(0.0, vol)
            stk = Stock(100.0, distribution, volatility)
            self.addStock(stk)
    def addStock(self, stk):
        self.stks.append(stk)
    def setBias(self, bias):
        self.bias = bias
    def getBias(self):
        return self.bias
    def getStocks(self):
        return self.stks[:]
    def move(self, mo):
        prices = []
        for s in self.stks:
            s.makeMove(self.bias, mo)
            prices.append(s.getPrice())
        return prices

class Market(SimpleMarket):
    def __init__(self, numStks, volUB, dailyBiasRange):
        SimpleMarket.__init__(self, numStks, volUB)
        self.dailyBiasRange = dailyBiasRange
    def move(self, mo):
        prices = []
        dailyBias = random.gauss(self.dailyBiasRange[0], self.dailyBiasRange[1])
        for s in self.stks:
            s.makeMove(self.bias + dailyBias, mo)
            prices.append(s.getPrice())
        return prices

def simMkt(mkt, numDays, mo):
    endPrices = []
    for i in range(numDays):
        vals = mkt.move(mo)
        vals = pylab.array(vals)
        mean = vals.sum()/float(len(vals))
        endPrices.append(mean)
    return endPrices

def plotAverageOverTime(endPrices, title):
    pylab.plot(endPrices)
    pylab.title(title)
    pylab.xlabel('Days')
    pylab.ylabel('Price')

def plotDistributionAtEnd(mkt, title, color):
    prices = []
    sumSoFar = 0
    for s in mkt.getStocks():
        prices.append(s.getPrice())
        sumSoFar += s.getPrice()
    mean = sumSoFar/float(len(prices))
    prices.sort()
    pylab.plot(prices, color)
    pylab.axhline(mean, color = color)
    pylab.title(title)
    pylab.xlabel('Stock')
    pylab.ylabel('Last Sale')
    pylab.semilogy()

def runTrial(showHistory, test, p):
    colors = ['b','g','r','c','m','y','k']

    mkt = Market(p['numStocks'], p['volUB'], p['dailyBiasRange'])
    mkt.setBias(p['bias'])
    endPrices = simMkt(mkt, p['numDays'], p['mo'])
    pylab.figure(1)
    plotAverageOverTime(endPrices, 'Average Closing Prices')
    pylab.figure(2)
    plotDistributionAtEnd(mkt, 'Distribution of Prices', colors[test%len(colors)])
    if showHistory:
        for s in mkt.getStocks():
            s.showHistory(test+2, str(test))

def runTest(numTrials):
    #Constants used in testing
    numDaysPerYear = 200.0
    params = {}
    params['numDays'] = 200
    params['numStocks'] = 500
    params['bias'] = 0.1/numDaysPerYear #General market bias
    params['volUB'] = 12.0/numDaysPerYear #Upper bound on volatility for a stock
    params['mo'] = 1.1/numDaysPerYear #Momentum factor
    params['dailyBiasRange'] = (0.0, 4.0/200.0)

    for t in range(1, numTrials+1):
        runTrial(True, t, params)

runTest(3)
pylab.show()

Here is some of Bard’s response, which was quite in depth:


The code you have provided is a Python program that simulates a stock market. The program defines three classes:

Stock: This class represents a single stock. It has properties for the stock's price, history, distribution, volatility, and last change influence.

SimpleMarket: This class represents a simple stock market. It has a list of stocks, a bias, and a volatility upper bound.

Market: This class inherits from SimpleMarket and adds a daily bias range property. The program also defines several functions:
•	simMkt: This function simulates the market for a given number of days and momentum factor.
•	plotAverageOverTime: This function plots the average closing price of the stocks over time.
•	plotDistributionAtEnd: This function plots the distribution of prices at the end of the simulation.
•	runTrial: This function runs a single simulation and plots the results.
•	runTest: This function runs a number of simulations and plots the results.

The program is run by calling the runTest function with the desired number of simulations. The program will then generate three plots:
•	A plot of the average closing price of the stocks over time.
•	A plot of the distribution of prices at the end of the simulation.
•	A plot of the history of prices for each stock.

The program can be used to simulate a variety of different stock markets. By changing the values of the parameters, you can change the volatility of the market, the bias of the market, and the momentum factor. You can also use the program to simulate different trading strategies. For example, you could simulate a strategy that buys stocks when they are undervalued and sells them when they are overvalued.

Bard also provided an explanation of how each part of the code worked:


•	The Stock class represents a single stock. It has properties for the stock's price, history, distribution, volatility, and last change influence. The price property is the current price of the stock. The history property is a list of the stock's prices over time. The distribution property is a function that generates random prices for the stock. The volatility property is a measure of how much the stock's price can fluctuate. The lastChangeInfluence property is a measure of how much the stock's price is affected by its previous change.

•	The SimpleMarket class represents a simple stock market. It has a list of stocks, a bias, and a volatility upper bound. The list of stocks property is a list of the stocks in the market. The bias property is a measure of how much the market is biased towards rising prices. The volatility upper bound property is a measure of how much the market can fluctuate.

•	The Market class inherits from SimpleMarket and adds a daily bias range property. The daily bias range property is a measure of how much the market's bias can change from day to day.

•	The simMkt function simulates the market for a given number of days and momentum factor. The number of days parameter is the number of days to simulate. The momentum factor parameter is a measure of how much the prices of stocks are affected by their momentum.

•	The plotAverageOverTime function plots the average closing price of the stocks over time. The plotDistributionAtEnd function plots the distribution of prices at the end of the simulation.

•	The runTrial function runs a single simulation and plots the results. The showHistory parameter is a boolean value that determines whether the history of prices for each stock is plotted. The test parameter is a unique identifier for the simulation.

•	The runTest function runs a number of simulations and plots the results. The numTrials parameter is the number of simulations to run.

I hope this explanation is helpful. Please let me know if you have any other questions.

When I followed up, typing in, “where is the pylab module?” Bard responded with the location of the module, how to import it, a description of features, and a pointer to the documentation.
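
For readers wondering the same thing: pylab ships as part of Matplotlib, so installing Matplotlib is all this code needs. A quick check, assuming a standard pip setup:

# pip install matplotlib
import pylab                # convenience module that pulls matplotlib.pyplot and numpy into one namespace
print(pylab.__file__)       # prints the path where Matplotlib installed it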

Additionally, here is the Bard-generated code for scraping InfoWorld.com for a list of articles:


import requests
from bs4 import BeautifulSoup

def scrape_infoworld():
  """Scrape the InfoWorld front page and return a list of articles."""

  # Get the HTML from the InfoWorld front page.
  response = requests.get("https://www.infoworld.com/")
  soup = BeautifulSoup(response.content, "html.parser")

  # Find all articles by looking for links with 'article' in the href.
  articles = soup.find_all("a", href=lambda x: "article" in x["href"])

  # Extract the title, author, and date from each article.
  article_data = []
  for article in articles:
    title = article.find("h3").text
    author = article.find("span", class_="byline").text
    date = article.find("time").text
    article_data.append({
      "title": title,
      "author": author,
      "date": date,
    })

  # Sort the articles alphabetically by title.
  article_data.sort(key=lambda x: x["title"])

  # Remove any duplicate articles.
  article_data = list(set(article_data))

  # Sort the articles alphabetically by author last name.
  article_data.sort(key=lambda x: x["author"].split()[-1])

  # Sort the articles latest first.
  article_data.sort(key=lambda x: x["date"], reverse=True)

  return article_data

if __name__ == "__main__":
  articles = scrape_infoworld()

  # Print the articles.
  for article in articles:
    print(article)

Explanation:

This code will first get the HTML from the InfoWorld front page. Then, it will find all articles by looking for links with 'article' in the href. For each article, it will extract the title, author, and date. The articles will then be sorted alphabetically by title, removed of any duplicates, sorted alphabetically by author last name, and sorted latest first. Finally, the articles will be printed.

And here is the runtime result:


---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-1-afbe7808577e> in <cell line: 40>()
     39 
     40 if __name__ == "__main__":
---> 41   articles = scrape_infoworld()
     42 
     43   # Print the articles.

6 frames

<ipython-input-1-afbe7808577e> in <lambda>(x)
     10 
     11   # Find all articles by looking for links with 'article' in the href.
---> 12   articles = soup.find_all("a", href=lambda x: "article" in x["href"])
     13 
     14   # Extract the title, author, and date from each article.

TypeError: string indices must be integers
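
The traceback is consistent with how Beautiful Soup treats callable attribute filters: when you pass a function as the href argument, find_all calls it with the attribute’s value (a string, or None), not with the whole tag, so x["href"] indexes into a string and fails. A minimal fix, keeping the rest of Bard’s code unchanged, would be to change that one line as sketched here:

# Bard's version: the lambda receives the href value itself, so x["href"] is a string-index error.
# articles = soup.find_all("a", href=lambda x: "article" in x["href"])

# Fixed filter: guard against missing hrefs and test the string directly.
articles = soup.find_all("a", href=lambda href: href is not None and "article" in href)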

The bottom line for Bard

Bard has a chat interface and both generates and explains code, but it doesn’t have an interactive IDE integration. Given that Bard is completely free at this time, and requires no setup, it’s certainly worth trying when you need to generate, debug, and explain code in any of the 20 supported languages.

GitHub Copilot X

GitHub Copilot X is greatly improved over the original GitHub Copilot, and can sometimes generate a correct function and set of tests without much human help. It still makes mistakes and hallucinates (generates false information), but not nearly as much as it once did.

In addition to generating code within a programming editor, currently supporting only the most current versions of Visual Studio and the latest insider version of Visual Studio Code, Copilot X adds a GPT-4 chat panel to the editor. It also adds a terminal interface, support for generating unit tests and pull request descriptions, and the ability to extract explanations from documentation.

I asked the Copilot X chat what programming languages it supports, and it answered “I support a wide range of programming languages, including but not limited to: Python, JavaScript, TypeScript, Ruby, Java, C++, C#, PHP, Go, Swift, Kotlin, Rust, and many more.” I did my testing primarily in Python.

When I used the Copilot Chat facility to ask Copilot X to explain the MIT market simulation code, it gave a partially correct answer. I had to metaphorically pull its teeth to get it to explain the rest of the code.

Copilot X explanation. IDG

Figure 4. Copilot X did a decent but incomplete job of explaining the market simulator.

Copilot X’s most notable failure was the web-scraping code generation task. The tool generated a bunch of superficially credible-looking code that didn’t use Beautiful Soup, but it was clear from reviewing the code that it would never work. I kept bringing the problems to Copilot Chat, but it just dug itself a deeper hole. I could probably have started over and given it better hints, including handing it an import from bs4 and adding some comments showing the HTML and directory structure of the InfoWorld home page. I didn’t do it because that would not be in character for the naive coder persona I had adopted for this round of tests.

Copilot X responds to user feedback. IDG

Figure 5. Copilot X tried to generate the web scraping code without using Beautiful Soup (bs4). Later when I chatted about the solution it generated, it first claimed that it was using Beautiful Soup, but then admitted that it could not find an import.

As with all AI helpers, you have to take the code generated by Copilot X with a huge grain of salt, just as you would for a pull request from an unknown programmer.

The bottom line for Copilot X

In addition to generating code within an IDE, Copilot X adds a GPT-4 chat panel to the editor. It also adds a terminal interface, support for unit test generation, support for generating pull request descriptions, and the ability to extract explanations from technical documentation. Copilot X costs $10 per month for individuals and $19 per user per month for businesses. 

Conclusion 

GitHub Copilot X works decently on simple problems, but not necessarily better than the combination of Amazon CodeWhisperer in a code editor and Google Bard in a browser. It’s too bad that CodeWhisperer doesn’t yet have a chat capability or the facility for explaining code, and it’s too bad that Bard doesn’t exactly integrate with an editor or IDE.

I’d be tempted to recommend Copilot X if it hadn’t gone off the rails on my advanced code generation task—mainly because it integrates chat and code generation in an editor. At this point, however, Copilot X isn’t quite ready. Overall, none of the code generation products are really up to snuff, although both Bard and Copilot X do a decent job of code explanation.

All of these products are in active development, so my recommendation is to keep watching them and experimenting, but don’t put your faith in any of them just yet.


First look: wasmCloud and Cosmonic

Posted on 25 April, 2023


As you likely know by now, WebAssembly, or wasm, is an efficient, cross-platform, cross-language way to run code almost anywhere, including in a browser and on a server—even in a database. Cosmonic is a commercial platform-as-a-service (PaaS) for wasm modules. It builds on the open-source wasmCloud. This technology preview starts with a quick overview of wasm, then we’ll set up wasmCloud and Cosmonic and see what we can do with them.

What is wasm?

WebAssembly (wasm) is a “binary instruction format for a stack-based virtual machine.” It’s a portable compilation target for programming languages, including C, C++, C#, Rust, Go, Java, PHP, Ruby, Swift, Python, Kotlin, Haskell, and Lua; Rust is often the preferred language for wasm. There are three wasm-specific languages: AssemblyScript, Grain, and Motoko. Wasm targets include browsers (currently Chrome, Firefox, Safari, and Edge), Node.js, Deno, Wasmtime, Wasmer, and wasm2c.

Wasm tries to run at native speed in a small amount of memory. It runs in a memory-safe, sandboxed execution environment, even on the web.

WebAssembly System Interface (WASI) is a modular system interface for WebAssembly. Wasm has a component model with a W3C proposed specification. WebAssembly Gateway Interface (Wagi) is a proposed implementation of CGI for wasm and WASI. Spin is a multi-language framework for wasm applications.

What is wasmCloud?

wasmCloud is a CNCF-owned open source software platform that uses wasm and NATS to build distributed applications composed of portable units of WebAssembly business logic called actors. wasmCloud supports TinyGo and Rust for actor development. It also supports building platforms, which are capability providers. wasmCloud includes lattice, a self-forming, self-healing mesh network using NATS that provides a unified, flattened topology. wasmCloud runs almost everywhere: in the cloud, at the edge, in the browser, on small devices, and so on. The wasmCloud host runtime uses Elixir/OTP and Rust.

Many wasmCloud committers and maintainers work for Cosmonic (the company). Additionally, the wasmCloud wash cloud shell works with Cosmonic (the product).

What is Cosmonic?

Cosmonic is both a company and a product. The product is a WebAssembly platform as a service (PaaS) that builds on top of wasmCloud and uses wasm actors. Cosmonic offers a graphical cloud user interface for designing applications, and its own shell, cosmo, that complements wash and the wasmCloud GUI. Supposedly, anything you build that works in plain wasmCloud should work automatically in Cosmonic.

A host is a distributed, wasmCloud runtime process that manages actors and capability providers. An actor is a WebAssembly module that can handle messages and invoke functions on capability providers. A capability is an abstraction or representation of some functionality required by your application that is not considered part of the core business logic. A capability provider is an implementation of the representation described by a capability contract. There can be multiple providers per capability with different characteristics.

A link is a runtime-defined connection between an actor and a capability provider. Links can be changed without needing to be redeployed or recompiled.

A constellation is a managed, isolated network space that allows your actors and providers to securely communicate with each other regardless of physical or logical location; essentially, a Cosmonic-managed wasmCloud lattice. A super constellation is a larger constellation formed by securely connecting multiple environments through Cosmonic.

A wormhole is an ingress point into your constellation. An OCI distribution is a standard for artifact storage, retrieval, and distribution, implemented by (for example) the Azure Container Registry and the GitHub artifact registry.

The infrastructure view shows the virtual hosts running in your Cosmonic constellation. The logic view shows the logical relationships between components in your Cosmonic constellation or super constellation.

Installing and testing wasmCloud

Installation of wasmCloud varies with your system. I used brew on my M1 MacBook Pro; it installed more than I wanted because of dependencies, particularly the Rust compiler and cargo package manager, which I prefer to install from the Rust language website using rustup. Fortunately, a simple brew uninstall rust cleared the way for a standard rustup installation. While I was installing languages, I also installed TinyGo, the other language supported for wasmCloud actor development.

After installation, I asked the wash shell to tell me about its capabilities:


martinheller@Martins-M1-MBP ~ % wash --help
[wasmCloud ASCII art banner]

A single CLI to handle all of your wasmCloud tooling needs


Usage: wash [OPTIONS] <COMMAND>

Commands:
  app       Manage declarative applications and deployments (wadm) (experimental)
  build     Build (and sign) a wasmCloud actor, provider, or interface
  call      Invoke a wasmCloud actor
  claims    Generate and manage JWTs for wasmCloud actors
  ctl       Interact with a wasmCloud control interface
  ctx       Manage wasmCloud host configuration contexts
  down      Tear down a wasmCloud environment launched with wash up
  drain     Manage contents of local wasmCloud caches
  gen       Generate code from smithy IDL files
  keys      Utilities for generating and managing keys
  lint      Perform lint checks on smithy models
  new       Create a new project from template
  par       Create, inspect, and modify capability provider archive files
  reg       Interact with OCI compliant registries
  up        Bootstrap a wasmCloud environment
  validate  Perform validation checks on smithy models
  help      Print this message or the help of the given subcommand(s)

Options:
  -o, --output <OUTPUT>  Specify output format (text or json) [default: text]
  -h, --help             Print help information
  -V, --version          Print version information

Then I made sure I could bring up a wasmCloud:


martinheller@Martins-M1-MBP ~ % wash up
🏃 Running in interactive mode, your host is running at http://localhost:4000
🚪 Press `CTRL+c` at any time to exit
17:00:20.343 [info] Wrote configuration file host_config.json
17:00:20.344 [info] Wrote configuration file /Users/martinheller/.wash/host_config.json
17:00:20.344 [info] Connecting to control interface NATS without authentication
17:00:20.344 [info] Connecting to lattice rpc NATS without authentication
17:00:20.346 [info] Host NCZVXJWZAKMJVVBLGHTPEOVZFV4AW5VOKXMD7GWZ5OSF5YF2ECRZGXXH (gray-dawn-8348) started.
17:00:20.346 [info] Host issuer public key: CCXQKGKOAAVXUQ7MT2TQ57J4DBH67RURBKT6KEZVOHHZYPJKU6EOC3VZ
17:00:20.346 [info] Valid cluster signers: CCXQKGKOAAVXUQ7MT2TQ57J4DBH67RURBKT6KEZVOHHZYPJKU6EOC3VZ
17:00:20.351 [info] Started wasmCloud OTP Host Runtime
17:00:20.356 [info] Running WasmcloudHostWeb.Endpoint with cowboy 2.9.0 at 0.0.0.0:4000 (http)
17:00:20.357 [info] Access WasmcloudHostWeb.Endpoint at http://localhost:4000
17:00:20.453 [info] Lattice cache stream created or verified as existing (0 consumers).
17:00:20.453 [info] Attempting to create ephemeral consumer (cache loader)
17:00:20.455 [info] Created ephemeral consumer for lattice cache loader

While I had the wasmCloud running, I viewed the website at port 4000 on my local machine:

wasmCloud local dashboard IDG

Figure 1. wasmCloud local dashboard on port 4000 after running wash up. There are no actors, providers, or links.

Then I stopped the wasmCloud:


martinheller@Martins-M1-MBP ~ % wash down

✅ wasmCloud host stopped successfully
✅ NATS server stopped successfully
🛁 wash down completed successfully

Installing and testing Cosmonic

I installed the Cosmonic CLI from the Quickstart page and asked it to tell me about itself:


martinheller@Martins-M1-MBP ~ % cosmo --help

          ⣀⣴⣶⣶⣦⣀
      ⢀⣠⣴⣾⣿⣿⣿⣿⣿⣿⣷⣦⣄⡀
   ⣀⣤⣶⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣶⣤⣀
⢀⣴⣾⣿⣿⣿⣿⣿⣿⣿⣿⣿⠋⠹⣿⣿⣿⣿⣿⣿⣿⣿⣿⣷⣦⡀
⣾⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⠏  ⢻⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣷
⣿⣿⣿⣿⣿⣿⣿⣿⣿⡿⠁    ⠙⢿⣿⣿⣿⣿⣿⣿⣿⣿⣿
⣿⣿⣿⣿⣿⣿⡿⠛⠁        ⠈⠛⠛⠿⠿⠿⣿⣿⡿
⣿⣿⣿⣿⣏
⣿⣿⣿⣿⣿⣿⣷⣦⣀        ⣀⣤⣶⣶⣾⣿⣿⣿⣷
⣿⣿⣿⣿⣿⣿⣿⣿⣿⣷⡄    ⣴⣾⣿⣿⣿⣿⣿⣿⣿⣿⣿
⢿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣆  ⣼⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⡿
⠈⠛⢿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣄⣰⣿⣿⣿⣿⣿⣿⣿⣿⣿⡿⠛⠁
   ⠈⠛⠻⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⠿⠛⠁
       ⠙⠻⢿⣿⣿⣿⣿⣿⣿⡿⠛⠋
          ⠈⠛⠿⠿⠛⠁

      C O S M O N I C


Usage: cosmo [OPTIONS] <COMMAND>

Commands:
  build     Build (and sign) an actor, provider, or interface
  down      Stop the wasmCloud host and NATS leaf launched by `up`
  launch    Launch an actor on a local wasmCloud host
  login     Securely download credentials to authenticate this machine with Cosmonic infrastructure
  new       Create a new project from template
  up        Start a NATS leaf and wasmCloud host connected to Cosmonic infrastructure, forming a super constellation
  tutorial  Run through the tutorial flow
  whoami    Check connectivity to Cosmonic and query device identity information
  help      Print this message or the help of the given subcommand(s)

Options:
  -o, --output <OUTPUT>  Specify output format (text or json) [default: text]
  -h, --help             Print help
  -V, --version          Print version

Then, I went through the online interactive drag-and-drop tutorial to create an echo application, resulting in this diagram:

cosmonic logic view IDG

Figure 2. Cosmonic Logic view after going through the online tutorial. The reversed arrow indicates that the wormhole is connected for ingress into the echo application.

I also ran the local Quickstart hello tutorial:


martinheller@Martins-M1-MBP ~ % cosmo tutorial hello


          ⣀⣴⣶⣶⣦⣀
      ⢀⣠⣴⣾⣿⣿⣿⣿⣿⣿⣷⣦⣄⡀
   ⣀⣤⣶⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣶⣤⣀
⢀⣴⣾⣿⣿⣿⣿⣿⣿⣿⣿⣿⠋⠹⣿⣿⣿⣿⣿⣿⣿⣿⣿⣷⣦⡀
⣾⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⠏  ⢻⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣷
⣿⣿⣿⣿⣿⣿⣿⣿⣿⡿⠁    ⠙⢿⣿⣿⣿⣿⣿⣿⣿⣿⣿
⣿⣿⣿⣿⣿⣿⡿⠛⠁        ⠈⠛⠛⠿⠿⠿⣿⣿⡿
⣿⣿⣿⣿⣏
⣿⣿⣿⣿⣿⣿⣷⣦⣀        ⣀⣤⣶⣶⣾⣿⣿⣿⣷
⣿⣿⣿⣿⣿⣿⣿⣿⣿⣷⡄    ⣴⣾⣿⣿⣿⣿⣿⣿⣿⣿⣿
⢿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣆  ⣼⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⡿
⠈⠛⢿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣄⣰⣿⣿⣿⣿⣿⣿⣿⣿⣿⡿⠛⠁
   ⠈⠛⠻⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⠿⠛⠁
       ⠙⠻⢿⣿⣿⣿⣿⣿⣿⡿⠛⠋
          ⠈⠛⠿⠿⠛⠁

      C O S M O N I C
Welcome to cosmo!
✅ You're already authenticated!
⚙️  It looks like you don't have a wasmCloud host running locally. Launching one with:
    `cosmo up`
>>> ⠀⢀
Ok to download NATS and wasmCloud to /Users/martinheller/.cosmo ?: y
🟢 A wasmCloud host connected to your constellation has been started!

To stop the host, run:
    'cosmo down'
>>> ⡋⢀
To start the tutorial, we'll generate a new project with `cosmo new`. Proceed?: y
🌐 Next we'll download code for your hello world actor to the hello/ directory...
>>> ⢋⠁                      Cloning into '.'...
>>> ⠈⢙                      remote: Enumerating objects: 86, done.
remote: Counting objects: 100% (86/86), done.
remote: Compressing objects: 100% (56/56), done.
>>> ⠈⡙
>>> ⠀⢙
>>> ⠀⡙                      remote: Total 86 (delta 23), reused 76 (delta 22), pack-reused 0
Receiving objects: 100% (86/86), 312.66 KiB | 1.02 MiB/s, done.
Resolving deltas: 100% (23/23), done.
>>> ⠀⠩                      Already on 'main'
Your branch is up to date with 'origin/main'.
🔧   Using template subfolder `hello-world/rust`...
🔧   Generating template ...
[ 1/15]   Done: .cargo/config.toml
[ 7/15]   Done: .gitignore
✨   Done! New project created /Users/martinheller/hello
>>> ⠀⠠              No keypair found in "/Users/martinheller/.wash/keys/martinheller_account.nk".
                    We will generate one for you and place it there.
                    If you'd like to use alternative keys, you can supply them as a flag.

No keypair found in "/Users/martinheller/.wash/keys/hello_module.nk".
                    We will generate one for you and place it there.
                    If you'd like to use alternative keys, you can supply them as a flag.

>>> ⠀⢀
Now, we'll launch your hello actor and connect it to its capabilities. Proceed?: y
🚀 Launching your actor with:
    cosmo launch -p hello
🚀 Actor launched!
✅ You already have a Cosmonic-managed host running!
🔗 Launching capability providers and linking them to your actor...
    In the future, you can start providers from the UI at https://app.cosmonic.com/constellations/logic
✅ You're already running a required capability provider: HTTP Server
🌌 Creating a wormhole connected to your actor...
    In the future, you can create wormholes from the UI at https://app.cosmonic.com/constellations/logic

👇 Here's what we did:
⭐️ We started a wasmCloud host on your machine, connected to your constellation
🚀 We launched the hello world actor on your local wasmCloud host
⚙️  We started a managed host on the Cosmonic platform in your constellation
   We started an HTTP server capability provider on this host
🔗 We linked the actor on your local host to the provider running on your Cosmonic-managed host
🌌 We created a wormhole associated with this actor, allowing you to access your hello world app from the internet

Feel free to browse the code placed in the `hello/` directory.

If you're interested in how to deploy custom code to Cosmonic, check out our docs at:
    https://cosmonic.com/docs/user_guide/deploying-your-application

If you want to go through this tutorial again in the future, simply run:
    cosmo tutorial hello

🎉 That's it! Access your actor securely through a wormhole now:
    curl https://nameless-pine-8370.cosmonic.app

martinheller@Martins-M1-MBP ~ % curl https://nameless-pine-8370.cosmonic.app
Hello, World!%

At this point, both my online and offline tutorials appeared in my Cosmonic constellation:

Cosmonic Logic view. IDG

Figure 3. Cosmonic Logic view after completing both the online Echo tutorial and the offline Hello World tutorial. The two applications share a single HTTP-Wormhole provider but have separate URLs.

Cosmonic infrastructure view. IDG

Figure 4. Cosmonic Infrastructure view after completing both the online Echo tutorial and the offline Hello World tutorial.

Running cosmo down stops the local host and NATS server from cosmo tutorial hello, but doesn’t affect the online tutorial result. The code generated by the tutorial is remarkably simple, given that it’s creating a web application with a wormhole:

Rust source for the Cosmo tutorial. IDG

Figure 5. Rust source for Hello actor generated by cosmo tutorial hello, displayed in Visual Studio Code. Note that the actual implementation only amounts to one to four lines of Rust code, depending on how you count.

Conclusion

We could go on and explore Cosmonic’s pre-built capabilities and examples, wasmCloud examples, and even build a complete wasmCloud/Cosmonic application.

At this point, you should have a reasonably good feeling for what is possible with this technology. Given that wasmCloud is free and open source, and that Cosmonic’s developer preview is also currently free, I encourage you to explore those possibilities and see what you come up with.

Preview: Google Cloud Dataplex wows

Posted by on 11 April, 2023

This post was originally published on this site

In the beginning, there was a database. On the second day, there were many databases, all isolated silos… and then also data warehouses, data lakes, data marts, all different, and tools to extract, transform, and load all of the data we wanted a closer look at. Eventually, there was also metadata, data classification, data quality, data security, data lineage, data catalogs, and data meshes. And on the seventh day, as it were, Google dumped all of this on an unwitting reviewer, as Google Cloud Dataplex.

OK, that was a joke. This reviewer sort of knew what he was getting into, although he still found the sheer quantity of new information (about managing data) hard to take in.

Seriously, the distributed data problem is real. So are the problems of data security, protecting personally identifiable information (PII), and governance. Dataplex performs automatic data discovery and metadata harvesting, which allows you to logically unify your data without moving it.

Google Cloud Dataplex performs data management and governance using machine learning to classify data, organize data in domains, establish data quality, determine data lineage, and both manage and govern the data lifecycle. As we’ll discuss in more detail below, Dataplex typically starts with raw data in a data lake, does automatic schema harvesting, applies data validation checks, unifies the metadata, and makes data queryable by Google-native and open source tools.

Competitors to Google Cloud Dataplex include AWS Glue and Amazon EMR, Microsoft Azure HDInsight and Microsoft Purview Information Protection, Oracle Coherence, SAP Data Intelligence, and Talend Data Fabric.

google cloud dataplex 01 IDG

Google Cloud Dataplex overview diagram. This diagram lists five Google analytics components, four functions of Dataplex proper, and seven kinds of data reachable via BigLake, of which three are planned for the future.

Google Cloud Dataplex features

Overall, Google Cloud Dataplex is designed to unify, discover, and classify your data from all of your data sources without requiring you to move or duplicate your data. The key to this is to extract the metadata that describes your data and store it in a central place. Dataplex’s key features:

Data discovery

You can use Google Cloud Dataplex to automate data discovery, classification, and metadata enrichment of structured, semi-structured, and unstructured data. You can manage technical, operational, and business metadata in a unified data catalog. You can search your data using a built-in faceted-search interface that uses the same search technology as Gmail.

Data organization and life cycle management

You can logically organize data that spans multiple storage services into business-specific domains using Dataplex lakes and data zones. You can manage, curate, tier, and archive your data easily.

Centralized security and governance

You can use Dataplex to enable central policy management, monitoring, and auditing for data authorization and classification, across data silos. You can facilitate distributed data ownership based on business domains with global monitoring and governance.

Built-in data quality and lineage

You can automate data quality across distributed data and enable access to data you can trust. You can use automatically captured data lineage to better understand your data, trace dependencies, and troubleshoot data issues.

Serverless data exploration

You can interactively query fully governed, high-quality data using a serverless data exploration workbench with access to Spark SQL scripts and Jupyter notebooks. You can collaborate across teams with built-in publishing, sharing, and search features, and operationalize your work with scheduling from the workbench.

How Google Cloud Dataplex works

As you identify new data sources, Dataplex harvests the metadata for both structured and unstructured data, using built-in data quality checks to enhance integrity. Dataplex automatically registers all metadata in a unified metastore. You can also access data and metadata through a variety of Google Cloud services, such as BigQuery, Dataproc Metastore, and Data Catalog, and through open source tools such as Apache Spark and Presto.

The two most common use cases for Dataplex are a domain-centric data mesh and data tiering based on readiness. I went through a series of labs that demonstrate both.

google cloud dataplex 02 IDG

In this diagram, domains are represented by Dataplex lakes and owned by separate data producers. Data producers own creation, curation, and access control in their domains. Data consumers can then request access to the lakes (domains) or zones (sub-domains) for their analysis.

google cloud dataplex 03 IDG

Data tiering means that your ingested data is initially accessible only to data engineers and is later refined and made available to data scientists and analysts. In this case, you can set up a lake to have a raw zone for the data that the engineers have access to, and a curated zone for the data that is available to the data scientists and analysts.

Preparing your data for analysis

Google Cloud Dataplex is about data engineering and conditioning, starting with raw data in data lakes. It uses a variety of tools to discover data and metadata, organize data into domains, enrich the data with business context, track data lineage, test data quality, curate the data, secure data and protect private information, monitor changes, and audit changes.

The Dataplex process flow starts in cloud storage with raw ingested data, often in CSV tables with header rows. The discovery process extracts the schema and does some curation, producing metadata tables as well as queryable files in cloud storage using Dataflow flex and serverless Spark jobs; the curated data can be in Parquet, Avro, or ORC format. The next step uses serverless Spark SQL to transform the data, apply data security, store it in BigQuery, and create views with different levels of authorization and access. The fourth step creates consumable data products in BigQuery that business analysts and data scientists can query and analyze.

google cloud dataplex 04 IDG

Google Cloud Dataplex process flow. The data starts as raw CSV and/or JSON files in cloud storage buckets, then is curated into queryable Parquet, Avro, and/or ORC files using Dataflow flex and Spark. Spark SQL queries transform the data into refined BigQuery tables and secure and authorized views. Data profiling and Spark jobs bring the final data into a form that can be analyzed.
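
To make the first curation step more concrete, here is a generic PySpark sketch of reading raw CSV files with header rows and rewriting them as Parquet. It is illustrative only and uses hypothetical bucket paths; in Dataplex the equivalent work is done for you by managed Dataflow flex and serverless Spark jobs.

# Illustrative only: the kind of raw-to-curated conversion Dataplex automates.
# The bucket paths are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("curate-raw-zone").getOrCreate()

# Read raw CSV files (with header rows) from the raw zone.
raw = (spark.read
       .option("header", True)
       .option("inferSchema", True)
       .csv("gs://example-raw-zone/customers/*.csv"))

# Write the same data to the curated zone in a queryable columnar format.
raw.write.mode("overwrite").parquet("gs://example-curated-zone/customers/")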

In the banking example that I worked through, the Dataplex data mesh architecture has four data lakes for different banking domains. Each domain has raw data, curated data, and data products. The data catalog and data quality framework are centralized.

google cloud dataplex 05 IDG

Google Cloud Dataplex data mesh architecture. In this banking example, there are four domains in data lakes, for customer consumer banking, merchant consumer banking, lending consumer banking, and credit card consumer banking. Each data lake contains raw, curated, and product data zones. The central operations domain applies to all four data domains.

Automatic cataloging starts with schema harvesting and data validation checks, and creates unified metadata that makes data queryable. The Dataplex Attribute Store is an extensible infrastructure that lets you specify policy-related behaviors on the associated resources. It allows you to create taxonomies, create attributes and organize them into a hierarchy, and associate one or more attributes with tables and columns.

You can track your data classification centrally and apply classification rules across domains to control the leakage of sensitive data such as social security numbers. Google calls this DLP (data loss prevention).

google cloud dataplex 06 IDG

Customer demographics data product. At this level information that is PII (personally identifiable information) or otherwise sensitive can be flagged, and measures can be taken to reduce the risk, such as masking sensitive columns from unauthorized viewers.

Automatic data profiling, currently in public preview, lets you identify common statistical characteristics of the columns of your BigQuery tables within Dataplex data lakes. Automatic data profiling performs scans to let you see the distribution of values for individual columns.

End-to-end data lineage helps you to understand the origin of your data and the transformations that have been applied to it. Among other benefits, data lineage allows you to trace the downstream impact of data issues and identify the upstream causes.

google cloud dataplex 07 IDG

Google Cloud Dataplex explorer data lineage. Here we are examining the SQL query that underlies one step in the data transformation process. This particular query was run as an Airflow DAG from Google Cloud Composer.

Dataplex’s data quality scans apply auto-recommended rules to your data, based on the data profile. The rules screen for common issues such as null values, values (such as IDs) that should be unique but aren’t, and values that are out of range, such as birth dates that are in the future or the distant past.
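
The following is a minimal sketch of the logic behind those three kinds of checks, expressed in plain pandas rather than Dataplex's own rule engine; the table and column names are hypothetical.

# Illustrative only: null, uniqueness, and range checks expressed in pandas.
import pandas as pd

customers = pd.DataFrame({
    "id": [1, 2, 2, 4],                                   # note the duplicate ID
    "email": ["a@example.com", None, "c@example.com", "d@example.com"],
    "birth_date": pd.to_datetime(["1980-05-01", "2045-01-01",
                                  "1975-07-30", "1810-02-14"]),
})

# Null check: required columns should not contain missing values.
null_count = customers["email"].isna().sum()

# Uniqueness check: IDs should not repeat.
duplicate_count = customers["id"].duplicated().sum()

# Range check: birth dates should fall within a plausible window.
today = pd.Timestamp.today()
out_of_range = ((customers["birth_date"] > today) |
                (customers["birth_date"] < pd.Timestamp("1900-01-01"))).sum()

print(null_count, duplicate_count, out_of_range)   # 1 1 2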

I half-joked at the beginning of this review about finding Google Cloud Dataplex somewhat overwhelming. It’s true, it is overwhelming. At the same time, Dataplex seems to be potentially the most complete system I’ve seen for turning raw data from silos into checked and governed unified data products ready for analysis.

Google Cloud Dataplex is still in preview. Some of its components are not in their final form, and others are still missing. Among the missing are connections to on-prem storage, streaming data, and multi-cloud data. Even in preview form, however, Dataplex is highly useful for data engineering.

Vendor: Google, https://cloud.google.com/dataplex 

Cost: Based on pay-as-you-go usage; $0.060/DCU-hour standard, $0.089/DCU-hour premium, $0.040/DCU-hour shuffle storage.

Platform: Google Cloud Platform.

Tailscale: Fast and easy VPNs for developers

Posted by on 15 March, 2023

This post was originally published on this site

Networking can be an annoying problem for software developers. I’m not talking about local area networking or browsing the web, but the much harder problem of ad hoc, inbound, wide area networking.

Suppose you create a dazzling website on your laptop and you want to share it with your friends or customers. You could modify the firewall on your router to permit incoming web access on the port your website uses and let your users know the current IP address and port, but that could create a potential security vulnerability. Plus, it would only work if you have control over the router and you know how to configure firewalls for port redirection.

Alternatively, you could upload your website to a server, but that's an extra step that can become time-consuming, and maintaining dedicated servers is a burden in both time and money. You could spin up a small cloud instance and upload your site there instead, which is often fairly cheap, but that's yet another step that takes time.

Another potential solution is Universal Plug and Play (UPnP), which enables devices to set port forwarding rules by themselves. UPnP needs to be enabled on your router, but it’s only safe if the modem and router are updated and secure. If not, it creates serious security risks on your whole network. The usual advice from security vendors is not to enable it, since the UPnP implementations on many routers are still dangerous, even in 2023. On the other hand, if you have an Xbox in the house, UPnP is what it uses to set up your router for multiplayer gaming and chat.

A simpler and safer way is Tailscale, which allows you to create an encrypted, peer-to-peer virtual network using the secure WireGuard protocol without generating public keys or constantly typing passwords. It can traverse NAT and firewalls, span subnets, use UPnP to create direct connections if it’s available, and connect via its own network of encrypted TCP relay servers if UPnP is not available.

In some sense, all VPNs (virtual private networks) compete with Tailscale. Most other VPNs, however, route traffic through their own servers, which tends to increase the network latency. One major use case for server-based VPNs is to make your traffic look like it’s coming from the country where the server is located; Tailscale doesn’t help much with this. Another use case is to penetrate corporate firewalls by using a VPN server inside the firewall. Tailscale competes for this use case, and usually has a simpler setup.

Besides Tailscale, the only other peer-to-peer VPN is the free open source WireGuard, on which Tailscale builds. WireGuard doesn't handle key distribution or pushed configurations; Tailscale takes care of all of that.

What is Tailscale?

Tailscale is an encrypted point-to-point VPN service based on the open source WireGuard protocol. Compared to traditional VPNs based on central servers, Tailscale often offers higher speeds and lower latency, and it is usually easier and cheaper to set up and use.

Tailscale is useful for software developers who need to set up ad hoc networking and don’t want to fuss with firewalls or subnets. It’s also useful for businesses that need to set up VPN access to their internal networks without installing a VPN server, which can often be a significant expense.

Installing and using Tailscale

Signing up for a Tailscale Personal plan was free and quick; I chose to use my GitHub ID for authentication. Installing Tailscale took a few minutes on each machine I tried: an M1 MacBook Pro, where I installed it from the macOS App Store; an iPad Pro, installed from the iOS App Store; and a Pixel 6 Pro, installed from the Google Play Store. Installing on Windows starts with a download from the Tailscale website, and installing on Linux can be done using a curl command and shell script, or a distribution-specific series of commands.

tailscale 01 IDG

You can install Tailscale on macOS, iOS, Windows, Linux, and Android. This tab shows the instructions for macOS.

Tailscale uses IP addresses in the 100.x.x.x range and automatically assigns DNS names, which you can customize if you wish. You can see your whole “tailnet” from the Tailscale site and from each machine that is active on the tailnet.

In addition to viewing your machines, you can view and edit the services available, the users of your tailnet, your access controls (ACL), your logs, your tailnet DNS, and your tailnet settings.

tailscale 02 IDG

Once the three devices were running Tailscale, I could see them all on my Tailscale login page. I chose to use my GitHub ID for authentication, as I was testing just for myself. If I were setting up Tailscale for a team I would use my team email address.

tailscale 06 IDG

Tailscale pricing.

Tailscale installs a CLI on desktop and laptop computers. It’s not absolutely necessary to use this command line, but many software developers will find it convenient.

How Tailscale works

Tailscale, unlike most VPNs, sets up peer-to-peer connections, aka a mesh network, rather than a hub-and-spoke network. It uses the open source WireGuard package (specifically the userspace Go variant, wireguard-go) as its base layer.

For public key distribution, Tailscale does use a hub-and-spoke configuration. The coordination server is at login.tailscale.com. Fortunately, public key distribution takes very little bandwidth. Private keys, of course, are never distributed.

You may be familiar with generating public-private key pairs manually for use with ssh, and passing the path to the private key file as part of your ssh command line. Tailscale does all of that transparently for its network, and ties the keys to whatever login or 2FA credentials you choose.

The key pair steps are as follows (a conceptual code sketch follows the list):

  1. Each node generates a random public/private key pair for itself, and associates the public key with its identity.
  2. The node contacts the coordination server and leaves its public key and a note about where that node can currently be found, and what domain it’s in.
  3. The node downloads a list of public keys and addresses in its domain, which have been left on the coordination server by other nodes.
  4. The node configures its WireGuard instance with the appropriate set of public keys.
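
To make steps 1 and 2 concrete, here is a conceptual sketch in Python. It is not Tailscale's actual implementation (which is written in Go and leans on WireGuard's own key handling); it just shows a node generating a Curve25519 key pair and preparing a registration record for a coordination server, using a hypothetical node name and endpoint.

# Conceptual sketch only; not Tailscale code. Requires the 'cryptography' package.
import base64
from cryptography.hazmat.primitives import serialization
from cryptography.hazmat.primitives.asymmetric.x25519 import X25519PrivateKey

# Step 1: generate a random public/private key pair for this node.
private_key = X25519PrivateKey.generate()          # the private key never leaves the machine
public_bytes = private_key.public_key().public_bytes(
    encoding=serialization.Encoding.Raw,
    format=serialization.PublicFormat.Raw,
)

# Step 2: what the node might tell a coordination server: its public key,
# plus where it can currently be reached (hypothetical values).
registration = {
    "node": "martins-m1-mbp",
    "public_key": base64.b64encode(public_bytes).decode(),
    "endpoint": "203.0.113.5:41641",
}
print(registration)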

Tailscale doesn’t handle user authentication itself. Instead, it always outsources authentication to an OAuth2, OIDC (OpenID Connect), or SAML provider, including Gmail, G Suite, and Office 365. This avoids the need to maintain a separate set of user accounts or certificates for your VPN.

tailscale 07 IDG

Tailscale CLI help. On macOS, the CLI executable lives inside the app package. A soft link to this executable doesn’t seem to work on my M1 MacBook Pro, possibly because Tailscale runs in a sandbox.

NAT traversal is a complicated process, one that I personally tried, unsuccessfully, to solve a decade ago. NAT (network address translation) is one of the ways firewalls work: As a packet goes from your computer to the internet, your computer's local address of, say, 192.168.1.191 gets translated by the firewall into your current public IP address plus a random port number, say 173.76.179.155:9876, and the firewall remembers that port number as yours. When a site returns a response to your request, your firewall recognizes the port and translates the destination back to your local address before passing you the response.

tailscale 08 IDG

Tailscale status, Tailscale pings to two devices, and plain pings to the same devices using the native network. Notice that the Tailscale ping to the Pixel device first routes via a DERP server (see below) in NYC, and then manages to find the LAN connection.

Where’s the problem? Suppose you have two firewall clients trying to communicate peer-to-peer. Neither can succeed until someone or something tells both ends what port to use.

This arbitrator will be a server when you use the STUN (Session Traversal Utilities for NAT) protocol; while STUN works on most home routers, it unfortunately doesn’t work on most corporate routers. One alternative is the TURN (Traversal Using Relays around NAT) protocol, which uses relays to get around the NAT deadlock issue; the trouble with that is that TURN is a pain in the neck to implement, and there aren’t many existing TURN relay servers.

Tailscale implements a protocol of its own for this, called DERP (Designated Encrypted Relay for Packets). This use of the term DERP has nothing to do with being goofy, but it does suggest that someone at Tailscale has a sense of humor.

Tailscale has DERP servers around the world to keep latency low; these include nine servers in the US. If, for example, you are trying to use Tailscale to connect your smartphone from a park to your desktop at your office, the chances are good that the connection will route via the nearest DERP server. If you’re lucky, the DERP server will only be used as a side channel to establish the connection. If you’re not, the DERP server will carry the encrypted WireGuard traffic between your nodes.

Tailscale vs. other VPNs

Tailscale offers a reviewer’s guide. I often look at such documents and then do my own thing because I’ve been around the block a couple of times and recognize when a company is putting up straw men and knocking them down, but this one is somewhat helpful. Here are some key differentiators to consider.

With most VPNs, when you are disconnected you have to log in again. It can be even worse when your company has two internet providers and has two VPN servers to handle them, because you usually have to figure out what’s going on by trial and error or by attempting to call the network administrator, who is probably up to his or her elbows in crises. With Tailscale (and WireGuard), the connection just resumes. Similarly, many VPN servers have trouble with flakey connections such as LTE. Tailscale and WireGuard take the flakiness in stride.

With most VPNs, getting a naive user connected for the first time is an exercise in patience for the network administrator and possibly scary for the user who has to “punch a hole” in her home firewall to enable the connection. With Tailscale it’s a five-minute process that isn’t scary at all.

Most VPNs want to be exclusive. Connecting to two VPN concentrators at once is considered a cardinal sin and a potential security vulnerability, especially if they are at different companies. Tailscale doesn’t care. WireGuard can handle this situation just fine even with hub-and-spoke topologies, and with Tailscale point-to-point connections there is a Zero Trust configuration that exposes no vulnerability.

Tailscale solutions

Tailscale has documented about a dozen solutions to common use cases that can be addressed with its ad hoc networking. These range from wanting to code from your iPad to running a private Minecraft server without paying for hosting or opening up your firewall.

As we’ve seen, Tailscale is simple to use, but also sophisticated under the hood. It’s an easy choice for ad hoc networking, and a reasonable alternative to traditional hub-and-spoke VPNs for companies. The only common VPN function that I can think of that it won’t do is spoof your location so that you can watch geographically restricted video content—but there are free VPNs that handle that.

Cost: Personal, open source, and “friends and family” plans, free. Personal Pro, $48 per year. Team, $5 per user per month (free trial available). Business, $15 per user per month (free trial available). Custom plans, contact sales.

Platform: macOS 10.13 or later, Windows 7 SP1 or later, Linux (most major distros), iOS 15 or later, Android 6 or later, Raspberry Pi, Synology.

Ballerina: A programming language for the cloud

Posted by on 8 March, 2023

This post was originally published on this site

Ballerina, which is developed and supported by WSO2, is billed as “a statically typed, open-source, cloud-native programming language.” What is a cloud-native programming language? In the case of Ballerina, it is one that supports networking and common internet data structures and that includes interfaces to a large number of databases and internet services. Ballerina was designed to simplify the development of distributed microservices by making it easier to integrate APIs, and to do so in a way that will feel familiar to C, C++, C#, and Java programmers.

Essentially, Ballerina is a C-like compiled language that has features for JSON, XML, and tabular data with SQL-like language-integrated queries, concurrency with sequence diagrams and language-managed threads, live sequence diagrams synched to the source code, flexible types for use both inside programs and in service interfaces, explicit error handling and concurrency safety, and network primitives built into the language.

There are two implementations of Ballerina. The currently available version, jBallerina, has a toolchain implemented in Java, compiles to Java bytecode, runs on a Java virtual machine, and interoperates with Java programs. A newer, unreleased (and incomplete) version, nBallerina, cross-compiles to native binaries using LLVM and provides a C foreign function interface. jBallerina can currently generate GraalVM native images on an experimental basis from its CLI, and can also generate cloud artifacts for Docker and Kubernetes. Ballerina has interface modules for PostgreSQL, MySQL, Microsoft SQL Server, Redis, DynamoDB, Azure Cosmos DB, MongoDB, Snowflake, Oracle Database, and JDBC databases.

For development, Ballerina offers a Visual Studio Code plug-in for source and graphical editing and debugging; a command-line utility with several useful features; a web-based sandbox; and a REPL (read-evaluate-print loop) shell. Ballerina can work with OpenAPI, GraphQL schemas, and gRPC schemas. It has a module-sharing platform called Ballerina Central, and a large library of examples. The command-line utility provides a build system and a package manager, along with code generators and the interactive REPL shell.

Finally, Ballerina offers integration with Choreo, WSO2’s cloud-hosted API management and integration solution, for observability, CI/CD, and devops, for a small fee. Ballerina itself is free open source.

Ballerina Language

The Ballerina Language combines familiar elements from C-like languages with unique features. For an example using familiar elements, here’s a “Hello, World” program with variables:

import ballerina/io;
string greeting = "Hello";
public function main() {
    string name = "Ballerina";
    io:println(greeting, " ", name);
}

Both int and float types are signed 64-bit in Ballerina. Strings and identifiers are Unicode, so they can accommodate many languages. Strings are immutable. The language supports methods as well as functions, for example:

// You can have Unicode identifiers.
function พิมพ์ชื่อ(string ชื่อ) {
    // Use \u{H} to specify character using Unicode code point in hex.
    io:println(ชื่\u{E2D});
}
string s = "abc".substring(1, 2);
int n = s.length();

In Ballerina, nil is the name for what is normally called null. A question mark after the type makes it nullable, as in C#. An empty pair of parentheses means nil.

int? v = ();

Arrays in Ballerina use square brackets:

int[] v = [1, 2, 3];

Ballerina maps are associative key-value structures, similar to Python dictionaries:

map<int> m = {
    "x": 1,
    "y": 2
};

Ballerina records are similar to C structs:

record { int x; int y; } r = {
    x: 1,
    y: 2
};

You can define named types and records in Ballerina, similar to C typedefs:

type MapArray map<string>[];
MapArray arr = [
    {"x": "foo"},
    {"y": "bar"}
];
type Coord record {
    int x;
    int y;
};

You can create a union of multiple types using the | character:

type flexType string|int;
flexType a = 1;
flexType b = "Hello";

Ballerina doesn't support exceptions, but it does support errors. The check keyword is shorthand for returning early when an expression evaluates to an error:

function intFromBytes(byte[] bytes) returns int|error {
    string|error ret = string:fromBytes(bytes);
    if ret is error {
        return ret;
    } else {
        return int:fromString(ret);
    }
}

This is the same function using check instead of if ret is error { return ret; }:

function intFromBytes(byte[] bytes) returns int|error {
    string str = check string:fromBytes(bytes);
    return int:fromString(str);
}

You can handle abnormal errors and make them fatal with the panic keyword. You can ignore return values and errors using the Python-like underscore _ character.

Ballerina has an any type, classes, and objects. Object creation uses the new keyword, as in Java. Unlike C's enums, Ballerina's enum types are shortcuts for unions of string constants. The match statement is like the switch case statement in C, only more flexible. Ballerina supports type inference via the var keyword. Functions in Ballerina are first-class types, so Ballerina can be used as a functional programming language. Ballerina supports asynchronous programming with the start, future, wait, and cancel keywords; these run in strands, which are logical threads.

Ballerina provides distinctive network services, tables and XML types, concurrency and transactions, and various advanced features. These are all worth exploring carefully; there’s too much for me to summarize here. The program in the image below should give you a feel for some of them.

ballerina 01 IDG

This example on the Ballerina home page shows the code and sequence diagram for a program that pulls GitHub issues from a repository and adds each issue as a new row to a Google Sheet. The code and diagram are linked; a change to one will update the other. The access tokens need to be filled in at the question marks before the program can run, and the ballerinax/googleapis.sheets package needs to be pulled from Ballerina Central, either using the “Pull unresolved modules” code action in VS Code or using the bal pull command from the CLI.

Ballerina standard libraries and extensions

There are more than a thousand packages in the Ballerina Central repository. They include the Ballerina Standard Library (ballerina/*), Ballerina-written extensions (ballerinax/*), and a few third-party demos and extensions.

The standard library is documented here. The Ballerina-written extensions tend to be connectors to third-party products such as databases, observability systems, event streams, and common web APIs, for example GitHub, Slack, and Salesforce.

Anyone can create an organization and publish (push) a package to Ballerina Central. Note that all packages in this repository are public. You can of course commit your code to GitHub or another source code repository, and control access to that.

Installing Ballerina

You can install Ballerina by downloading the appropriate package for your Windows, Linux, or macOS system and then running the installer. There are additional installation options, including building it from the source code. Then run bal version from the command line to verify a successful installation.

In addition, you should install the Ballerina extension for Visual Studio Code. You can double-check that the extension installed correctly in VS Code by running View -> Command Palette -> Ballerina. You should see about 20 commands.

The bal command line

The bal command line is a tool for managing Ballerina source code. It helps you manage Ballerina packages and modules, and test, build, and run programs. It also enables you to easily install, update, and switch among Ballerina distributions. See the screen shot below, which shows part of the output from bal help, or refer to the documentation.

ballerina bal help lg IDG

bal help shows the various subcommands available from the Ballerina command line. The commands include compilation, packaging, scaffolding and code generation, and documentation generation.

Ballerina Examples

Ballerina has, well, a lot of examples. You can find them in the Ballerina by Example learning page, and also in VS Code by running the Ballerina: Show Examples command. Going through the examples is an alternate way to learn Ballerina programming; it’s a good supplement to the tutorials and documentation, and supports unstructured discovery as well as deliberate searches.

One caution about the examples: Not all of them are self-explanatory, as though an intern who knew the product wrote them without thinking about learners or having any review by naive users. On the other hand, many are self-explanatory and/or include links to the relevant documentation and source code.

For instance, in browsing the examples I discovered that Ballerina has a testing framework, Testerina, which is defined in the module ballerina/test. The test module defines the necessary annotations to construct a test suite, such as @test:Config {}, and the assertions you might expect if you're familiar with JUnit, Rails unit tests, or any similar testing frameworks, for example the assertion test:assertEquals(). The test module also defines ways to specify setup and teardown functions, specify mock functions, and establish test dependencies.

ballerina examples IDG

Ballerina Examples, as viewed from VS Code’s Ballerina: Show Examples command. Similar functionality is available online.

Overall, Ballerina is a useful and feature-rich programming language for its intended purpose, which is cloud-oriented programming, and it is free open source. It doesn’t produce the speediest runtime modules I’ve ever used, but that problem is being addressed, both by experimental GraalVM native images and the planned nBallerina project, which will compile to native code.

At this point, Ballerina might be worth adopting for internal projects that integrate internet services and don’t need to run fast or be beautiful. Certainly, the price is right.

Cost: Ballerina Platform and Ballerina Language: Free open source under the Apache License 2.0. Choreo hosting: $150 per component per month after five free components, plus infrastructure costs.

Platform: Windows, Linux, macOS; Visual Studio Code.

nbdev v2 review: Git-friendly Jupyter Notebooks

Posted by on 1 February, 2023

This post was originally published on this site

There are many ways to go about programming. One of the most productive paradigms is interactive: You use a REPL (read-eval-print loop) to write and test your code as you code, and then copy the tested code into a file.

The REPL method, which originated in LISP development environments, is well-suited to Python programming, as Python has always had good interactive development tools. The drawback of this style of programming is that once you’ve written the code you have to separately pull out the tests and write the documentation, save all that to a repository, do your packaging, and publish your package and documentation.

Donald Knuth’s literate programming paradigm prescribes writing the documentation and code in the same document, with the documentation aimed at humans interspersed with the code intended for the computer. Literate programming has been used widely for scientific programming and data science, often using notebook environments, such as Jupyter Notebooks, Jupyter Lab, Visual Studio Code, and PyCharm. One issue with notebooks is that they sometimes don’t play well with repositories because they save too much information, including metadata that doesn’t matter to anyone. That creates a problem when there are merge conflicts, as notebooks are cell-oriented and source code repositories such as Git are line-oriented.

Jeremy Howard and Hamel Husain of fast.ai, along with about two dozen minor contributors, have come up with a set of command-line utilities that not only allow Jupyter Notebooks to play well with Git, but also enable a highly productive interactive literate programming style. In addition to producing correct Python code quickly, you can produce documentation and tests at the same time, save it all to Git without fear of corruption from merge conflicts, and publish to PyPI and Conda with a few commands. While there’s a learning curve for these utilities, that investment pays dividends, as you can be done with your development project in about the time it would normally take to simply write the code.

As you can see in the diagram below, nbdev works with Jupyter Notebooks, GitHub, Quarto, Anaconda, and PyPI. To summarize what each piece of this system does:

  • You can generate documentation using Quarto and host it on GitHub Pages. The docs support LaTeX, are searchable, and are automatically hyperlinked.
  • You can publish packages to PyPI and Conda, using tools that simplify package releases. Python best practices are automatically followed; for example, only exported objects are included in __all__.
  • There is two-way sync between notebooks and plaintext source code, allowing you to use your IDE for code navigation or quick edits.
  • Tests written as ordinary notebook cells are run in parallel with a single command.
  • There is continuous integration with GitHub Actions that runs your tests and rebuilds your docs.
  • Notebooks are Git-friendly, thanks to Jupyter/Git hooks that clean unwanted metadata and render merge conflicts in a human-readable format.
nbdev 01 IDG

The nbdev software works with Jupyter Notebooks, GitHub, Quarto, Anaconda, and PyPI to produce a productive, interactive environment for Python development.

nbdev installation

nbdev works on macOS, Linux, and most Unix-style operating systems. It requires a recent version of Python 3; I used Python 3.9.6 on macOS Ventura, running on an M1 MacBook Pro. nbdev works on Windows under WSL (Windows Subsystem for Linux), but not under cmd or PowerShell. You can install nbdev with pip or Conda. I used pip:

pip install nbdev

That installed 29 command-line utilities, which you can list using nbdev_help:

% nbdev_help
nbdev_bump_version              Increment version in settings.ini by one
nbdev_changelog                 Create a CHANGELOG.md file from closed and labeled GitHub issues
nbdev_clean                     Clean all notebooks in `fname` to avoid merge conflicts
nbdev_conda                     Create a `meta.yaml` file ready to be built into a package, and optionally build and upload it
nbdev_create_config             Create a config file.
nbdev_docs                      Create Quarto docs and README.md
nbdev_export                    Export notebooks in `path` to Python modules
nbdev_filter                    A notebook filter for Quarto
nbdev_fix                       Create working notebook from conflicted notebook `nbname`
nbdev_help                      Show help for all console scripts
nbdev_install                   Install Quarto and the current library
nbdev_install_hooks             Install Jupyter and git hooks to automatically clean, trust, and fix merge conflicts in notebooks
nbdev_install_quarto            Install latest Quarto on macOS or Linux, prints instructions for Windows
nbdev_merge                     Git merge driver for notebooks
nbdev_migrate                   Convert all markdown and notebook files in `path` from v1 to v2
nbdev_new                       Create an nbdev project.
nbdev_prepare                   Export, test, and clean notebooks, and render README if needed
nbdev_preview                   Preview docs locally
nbdev_proc_nbs                  Process notebooks in `path` for docs rendering
nbdev_pypi                      Create and upload Python package to PyPI
nbdev_readme                    None
nbdev_release_both              Release both conda and PyPI packages
nbdev_release_gh                Calls `nbdev_changelog`, lets you edit the result, then pushes to git and calls `nbdev_release_git`
nbdev_release_git               Tag and create a release in GitHub for the current version
nbdev_sidebar                   Create sidebar.yml
nbdev_test                      Test in parallel notebooks matching `path`, passing along `flags`
nbdev_trust                     Trust notebooks matching `fname`
nbdev_update                    Propagate change in modules matching `fname` to notebooks that created them

The nbdev developers suggest either watching this 90-minute video or going through this roughly one-hour written walkthrough. I did both, and also read through more of the documentation and some of the source code. I learned different material from each, so I’d suggest watching the video first and then doing the walkthrough. For me, the video gave me a clear enough idea of the package’s utility to motivate me to go through the tutorial.

Begin the nbdev walkthrough

The tutorial starts by having you install Jupyter Notebook:

pip install notebook

And then launching Jupyter:

jupyter notebook

The installation continues in the notebook, first by creating a new terminal and then using the terminal to install nbdev. You can skip that installation if you already did it in a shell, like I did.

Then you can use nbdev to install Quarto:

nbdev_install_quarto

That requires root access, so you’ll need to enter your password. You can read the Quarto source code or docs to verify that it’s safe.

At this point you need to browse to GitHub and create an empty repository (repo). I followed the tutorial and called mine nbdev_hello_world, and added a fairly generic description. Create the repo. Consult the instructions if you need them. Then clone the repo to your local machine. The instructions suggest using the Git command line on your machine, but I happen to like using GitHub Desktop, which also worked fine.

In either case, cd into your repo in your terminal. It doesn’t matter whether you use a terminal on your desktop or in your notebook. Now run nbdev_new, which will create a bunch of files in your repo. Then commit and push your additions to GitHub:

git add .
git commit -m'Initial commit'
git push

Go back to your repo on GitHub and open the Actions tab. You’ll see something like this:

nbdev 02 IDG

GitHub Actions after initial commit. There are two: a continuous integration (CI) workflow to clean your code, and a Deploy to GitHub Pages workflow to post your documentation.

Now enable GitHub Pages, following the optional instructions. It should look like this:

nbdev 03 IDG

Enabling GitHub Pages.

Open the Actions tab again, and you’ll see a third workflow:

nbdev 04 IDG

There are now three workflows in your repo. The new one generates web documentation.

Now open your generated website, at https://{user}.github.io/{repo}. Mine is at https://meheller.github.io/nbdev-hello-world/. You can copy that and change meheller to your own GitHub handle and see something similar to the following:

nbdev 05 IDG

Initial web documentation page for the package.

Continue the nbdev walkthrough

Now we're finally getting to the good stuff. You'll install Jupyter and Git hooks to automatically clean notebooks when you check them in,

nbdev_install_hooks

export your library,

nbdev_export

install your package,

pip install -e '.[dev]'

preview your docs,

nbdev_preview

(and click the link) and at long last start editing your Python notebook:

jupyter notebook

(and click on nbs, and click on 00_core.ipynb).
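
For reference, here is a minimal sketch of the kinds of cells the tutorial has you add to 00_core.ipynb, shown as flat code with one comment per cell. The say_hello function follows the walkthrough; treat the exact cell contents and directives as approximate.

# Cell 1: tell nbdev which module this notebook exports to.
#| default_exp core

# Cell 2: an exported cell; the directive sends say_hello to the generated core.py module.
#| export
def say_hello(to):
    "Say hello to somebody"
    return f"Hello {to}!"

# Cell 3: an ordinary test cell, run by nbdev_test and by the CI workflow.
from fastcore.test import test_eq
test_eq(say_hello("Hamel"), "Hello Hamel!")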

Edit the notebook as described, then prepare your changes:

nbdev_prepare

Edit index.ipynb as described, then push your changes to GitHub:

git add .
git commit -m'Add `say_hello`; update index'
git push

If you wish, you can push on and add advanced functionality.

nbdev 06 IDG

The nbdev-hello-world repo after finishing the tutorial.

As you’ve seen, especially if you’ve worked through the tutorial yourself, nbdev can enable a highly productive Python development workflow in notebooks, working smoothly with a GitHub repo and Quarto documentation displayed on GitHub Pages. If you haven’t yet worked through the tutorial, what are you waiting for?

Contact: fast.ai, https://nbdev.fast.ai/

Cost: Free open source under Apache License 2.0.

Platforms: macOS, Linux, and most Unix-style operating systems. It works on Windows under WSL, but not under cmd or PowerShell.

Review: Appsmith shines for low-code development on a budget

Posted by on 30 November, 2022

This post was originally published on this site

The low-code or no-code platform market has hundreds of vendors, which produce products of varying utility, price, convenience, and effectiveness. The low-code development market is at least partially built on the idea of citizen developers doing most of the work, although a 2021 poll by vendor Creatio determined that two-thirds of citizen developers are IT-related users. The same poll determined that low code is currently being adopted primarily for custom application development inside separate business units.

When I worked for a low-code or no-code application platform vendor (Alpha Software) a decade ago, more than 90% of the successful low-code customer projects I saw had someone from IT involved, and often more than one. There would usually be a business user driving the project, supported by a database administrator and a developer. The business user might put in the most time, but could only start the project with help from the database administrator, mostly to provide gated access to corporate databases. They usually needed help from a developer to finish and deploy the project. Often, the business user’s department would serve as guinea pigs, aka testers, as well as contributing to the requirements and eventually using the internal product.

Appsmith is one of about two dozen low-code or no-code development products that are open source. The Appsmith project is roughly 60% TypeScript, 25% Java, and 11% JavaScript. Appsmith describes itself as designed to build, ship, and maintain internal tools, with the ability to connect to any data source, create user interfaces (UIs) with pre-built widgets, code freely with an inbuilt JavaScript editor, and deploy with one click. Appsmith defines internal tools as custom dashboards, admin panels, and CRUD applications that enable your team to automate processes and securely interact with your databases and APIs. Ultimately, Appsmith competes with all 400+ low-code or no-code vendors, with the biggest competitors being the ones with similar capabilities.

Appsmith started as an internal tool for a game development company. “A few years ago, we published a game that went viral. Hundreds of help requests came in overnight. We needed a support app to handle them fast. That’s when we realized how hard it was to create a basic internal app quickly! Less than a year later, Appsmith had started taking shape.”

Starting with an internal application and enhancing it for customer use is a tough path to take, and doesn’t often end well. Here’s a hands-on review to help you decide whether Appsmith is the low-code platform you’ve been looking for.

Low-code development with Appsmith

Appsmith offers a drag-and-drop environment for building front-ends, database and API connectors to power back-ends, fairly simple embedded JavaScript coding capabilities, and easy publishing and sharing. Here’s a quick look at how these features work together in an Appsmith development project.

Connecting to data sources

You can connect to data sources from Appsmith using its direct connections, or by using a REST API. Supported data sources currently include Amazon S3, ArangoDB, DynamoDB, ElasticSearch, Firestore, Google Sheets, MongoDB, Microsoft SQL Server, MySQL, PostgreSQL, Redis, Redshift, Snowflake, and SMTP (to send mail). Many of these are not conventionally considered databases. Appsmith encrypts your credentials and avoids storing data returned from your queries. It uses connection pools for database connections, and limits the maximum number of queries that can run concurrently on a database to five, which could become a bottleneck if you have a lot of users and run complex queries. Appsmith also supports 17 API connectors, some of which are conventionally considered databases.

Building the UI

Appsmith offers about 45 widgets, including containers and controls. You can drag and drop widgets from the palette to the canvas. Existing widgets on the canvas will move out of the way of new widgets as you place them, and widgets can resize themselves while maintaining their aspect ratios.

Data access and binding

You can create, test, and name queries against each of your data sources. Then, you can use the named queries from the appropriate widgets. Query results are stored in the data property of the query object, and you can access the data using JavaScript written inside of “handlebars,” aka “mustache” syntax. Here’s an example:

{{ Query1.data }}

You can use queries to display raw or transformed data in a widget, display lists of data in dropdowns and tables, and to insert or update data captured from widgets into your database. Appsmith is reactive, so the widgets are automatically updated whenever the data in the query changes.

Writing code

You can use JavaScript inside handlebars anywhere in Appsmith. You can reference every entity in Appsmith as a JavaScript variable and perform all JavaScript functions and operations on them. This means you can reference all widgets, APIs, queries, and their associated data and properties anywhere in an application using the handlebar or mustache syntax.

In general, the JavaScript in Appsmith is restricted to single-line expressions. You can, however, write a helper function in a JavaScript Object to call from a single-line expression. You can also write immediately-invoked function expressions, which can contain multiline JavaScript inside the function definition.

Appsmith development features

On October 5, 2022, Appsmith announced a number of improvements. First, it achieved SOC 2 Type II certification, which means it completed a third-party audit of its information security controls. Second, it added GraphQL support. GraphQL is an open source data query and manipulation language for APIs, and a runtime for fulfilling queries with existing data; it was developed by Facebook, now Meta.

Appsmith now has an internal view for console logs; you don’t have to use the browser’s debugger. It also has added a widget that allows users to scan codes using their device cameras; it supports 12 formats, including 1D product bar codes such as UPC-A and -E, 1D industrial bar codes such as Code 39, and 2D codes such as QR and Data Matrix. It added three new slider controls: numbers, a range, and categories. Appsmith’s engineers halved the render time for widgets by only redrawing widgets that have changed.

Appsmith hosting options

You can use the cloud version of Appsmith (sign up at https://appsmith.com/)  or host Appsmith yourself. The open source version of Appsmith is free either way. Appsmith recommends using Docker or Kubernetes on a machine or virtual machine with two vCPUs and 4GB of memory. There are one-button install options for Appsmith on AWS and DigitalOcean.

If you want priority support, SAML and SSO, and unlimited private Git repos, you can pay for Appsmith Business. This service is open for early access as of this writing.

Appsmith widgets

Appsmith widgets include most of the controls and containers you’d expect to find in a drag-and-drop UI builder. They include simple controls, such as text, input, and button controls; containers such as generic containers, tabs, and forms; and media controls such as pictures, video players, audio in and out, a camera control for still and video shooting, and the new code scanner.

Hands-on with Appsmith

I went through the introductory Appsmith tutorial to build a customer support dashboard in the Appsmith Cloud. It uses a PostgreSQL database pre-populated with a users table. I found the tutorial style a little patronizing (note all the exclamation points and congratulatory messages), but perhaps that isn’t as bad as the “introductory” tutorials in some products that quickly go right over users’ heads.

Note that if you have a more advanced application in mind, you might find a starter template among the 20 offered.

The Appsmith tutorial starts with a summary of how Appsmith works.

The Appsmith tutorial start page. IDG

Here is the application we’re trying to build. I have tested it by, among other things, adding a name to record 8.

The application example. IDG

Here's a test of the SELECT query for the users table in the SQL sandbox.

Testing the SELECT query in the SQL sandbox. IDG

Here, we’re calling a prepared SQL query from JavaScript to populate the Customers table widget.

Calling a prepared SQL query from JavaScript. IDG
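In the app, that typically amounts to a one-line handlebars binding in the Customers table's Table Data property. A sketch, assuming the prepared query is named getCustomers (the name is illustrative):

{{ getCustomers.data }}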

Next, we view the property pane for the Customers table.

The property pane for the Customers table. IDG

Now, we can start building the Customer update form.

Build the Customer update form. IDG

We’ve added the JavaScript call that extracts the name to the Default Value property for the NameInput widget.

Adding a JavaScript call. IDG
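That extraction is a single handlebars expression in the NameInput widget's Default Value property. A sketch, assuming the table widget is named Table1 and the query returns a name column (both are illustrative):

{{ Table1.selectedRow.name }}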

And we’ve bound the email and country fields to the correct queries via JavaScript and SQL.

Binding the email and country fields to the correct queries. IDG
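The email input can reuse the selectedRow pattern, while the country dropdown can draw its options from a query. A sketch with illustrative names (getCountries is a hypothetical query returning a country column):

EmailInput Default Value: {{ Table1.selectedRow.email }}
CountrySelect Options: {{ getCountries.data.map(row => ({ label: row.country, value: row.country })) }}
CountrySelect Default Selected Value: {{ Table1.selectedRow.country }}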

Here, we’ve added an Update button below the form. It is not yet bound to an action.

Adding an Update button. IDG

The first step in processing the update is to execute the updateCustomerInfo query, as shown here.

Execute the updateCustomerInfo query. IDG

The updateCustomerInfo query is an SQL UPDATE statement. Note how it is parameterized using JavaScript variables bound to the form fields.

The query is parameterized using JavaScript variables bound to the form fields. IDG
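A sketch of what such a parameterized UPDATE can look like, assuming a users table with an id column and form widgets named NameInput, EmailInput, and CountrySelect (only NameInput and updateCustomerInfo appear in the tutorial; the other names are illustrative):

UPDATE users
SET name = {{ NameInput.text }},
    email = {{ EmailInput.text }},
    country = {{ CountrySelect.selectedOptionValue }}
WHERE id = {{ Table1.selectedRow.id }};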

The second step in the update is to get the customers again once the first query completes, as shown below.

Get the customers once the query completes. IDG
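In practice, that chaining can live in the Update button's onClick binding, with the second query running in the first query's success callback. A sketch using Appsmith's callback-style run() and showAlert(), assuming the queries are named updateCustomerInfo and getCustomers:

{{ updateCustomerInfo.run(() => getCustomers.run(), () => showAlert("Update failed", "error")) }}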

Now, we can test our application in the development environment before deploying it.

Test the app before deploying it. IDG

Once it’s deployed, we can run the application without seeing the development environment.

Run the application after it has been deployed. IDG

Notice that an Appsmith workspace contains all of your applications. (Mine are shown here.)

The Appsmith workspace. IDG

Here’s the page showing all 20 Appsmith templates available to use and customize.

A list of Appsmith templates. IDG

Conclusion

As you’ve seen, Appsmith is a competent drag-and-drop low-code application platform. It includes a free, open source option as long as you don’t need priority support, SAML, SSO, or more than three private Git repositories. For any of those features, or for custom access controls, audit logs, backup and restore, or custom branding, you’d need the paid Appsmith Business plan.

If you think Appsmith might meet your needs for internal departmental application development, I’d encourage you to try out the free, open source version. Trying it in the cloud is less work than self-hosting, although you may eventually want to self-host if you adopt the product.

Happy Hacking Keyboard review: Tiny typing comfort at a cost

Posted by on 9 November, 2022

This post was originally published on this site

The Happy Hacking Keyboard Hybrid Type-S is aimed at those with highly specific needs from a keyboard: a compact layout; quiet but comfortable typing; and the ability to switch between multiple Bluetooth-connected devices on the fly. Its price tag will raise eyebrows ($385 list), but it offers a package of features that is otherwise hard to find in a single keyboard.

The HHKB (as it's abbreviated) uses a key layout even more compact than that of most laptops, and the whole unit is small enough to throw into a knapsack. Function keys, arrows, and many other controls are accessed by way of a special "Fn" key. Delete is directly above the Return key. There's also no dedicated Caps Lock key; that function is accessed by pressing Fn+Tab.

Once I got used to the HHKB layout, though, typing on it was quite nice. (I wrote this review with it.) A big part of the HHKB's cost is its Topre electrostatic capacitive switch mechanisms, which generate a pleasant amount of tactile feedback while also reducing typing clatter. The overall noise level is on a par with a soft-touch notebook keyboard. Unfortunately, there's no key backlighting.

DIP switches and Fn key combinations let you choose between Mac and Windows key sets. For instance, when Windows is the selected key set, the Mac Option key is repurposed as the Windows menu key. Unfortunately, the few multimedia key bindings included by default (volume controls, mute, and eject) are Mac-only.

One really powerful HHKB feature is its multi-device support. You can pair the keyboard via Bluetooth with up to four devices (desktop PCs, phones, anything that pairs keyboards over Bluetooth) and switch between them with an Fn-Control-number key combo. It's an appealing option for those who struggle with KVM switches or who want to use the same keyboard at work and at home. You can also connect the HHKB directly to a system via USB-C, although a USB-C cable isn't included.

Another powerful feature is the HHKB’s key mapping utility. You can create custom key layouts if you don’t like the existing one. For instance, I remapped one of the Alt keys to behave like a dedicated Caps Lock key, and mapped the Fn+WASD keys to work as arrows, as the default arrow key bindings are not very comfortable for my hands. Key sets can be saved to files, or replaced at any time with the factory setting.

The Happy Hacking Keyboard Hybrid Type-S is wonderfully compact, comfortable, and quiet. I love its key mapping and Bluetooth switching capabilities. But I find its $385 price tag really hard to swallow.

Book review: ‘Python Tools for Scientists’

Posted by on 26 October, 2022

This post was originally published on this site

Python has earned a name as a go-to language for working quickly and conveniently with data, performing data analysis, and getting things done. But because the Python ecosystem is so vast and powerful, many people who are just starting with the language have a hard time sorting through it all. "Do I use NumPy or Pandas for this job?" they ask, or "What's the difference between Plotly and Bokeh?" Sound familiar?

Python Tools for Scientists, by Lee Vaughan (No Starch Press, San Francisco), to be released in January 2023, is a guide for the Pythonically perplexed. As described in the introduction, the book is intended to be used as "a machete for hacking through the dense jungle of Python distributions, tools, and libraries." In keeping with that goal, the book confines itself to one popular Python distribution for scientific work, Anaconda, and the common scientific computing tools and libraries packaged with it: the Spyder IDE, Jupyter Notebook, and JupyterLab, plus the NumPy, Matplotlib, Pandas, Seaborn, and Scikit-learn libraries.

Setting up a Python workspace

The first part of the book deals with setting up a workspace, in this case by installing Anaconda and getting familiar with tools like Jupyter and Spyder. It also covers the details of creating virtual environments and managing packages within them, with many detailed command-line instructions and screenshots throughout. 

Getting to know the Python language

For those who don't know Python at all, the book's second part is a compressed primer for the language. Aside from covering the basics (Python syntax, data and container types, flow control, and functions and modules), it also provides detail on classes and object-oriented programming, writing self-documenting code, and working with files (text, pickled data, and JSON). If you need a more in-depth introduction, the preface points you toward more robust learning resources. That said, this section by itself is as detailed as some standalone "get started with Python" guides.

Unpacking Anaconda

Part three tours many of the libraries packaged with Anaconda for general scientific computing (SciPy), deep learning, computer vision, natural language processing, dashboards and visualization, geospatial data and geovisualization, and many more. The goal of this section isn’t to demonstrate the libraries in depth, but rather to lay out their differences and allow for informed choices between them. An example is the recommendation for how to choose a deep learning library:

If you’re brand new to deep learning, consider Keras, followed by PyTorch. […] If you’re working with large datasets and need speed and performance, choose either PyTorch or TensorFlow.

Demonstrations

Part four goes into depth with several key libraries: NumPy, Matplotlib, Pandas, Seaborn (for data visualization), and Scikit-learn. Each library is demonstrated with practical examples. In the case of Pandas, Seaborn, and Scikit-learn, there’s a fun project involving a dataset (the Palmer Penguins Project) that you can interact with as you read along.

This book does not cover some aspects of scientific computing with Python. For instance, Cython and Numba aren’t discussed, and there’s no mention of cross-integration with other scientific-computing languages like R or FORTRAN. Instead, this book stays focused on its main mission: guiding you through the thicket of scientific Python offerings available using Anaconda.
