Monthly Archives: August 2022

Kissflow review: No code and low code for workflows

Posted on 24 August, 2022

This post was originally published on this site

Kissflow Work Platform provides a collection of tools to create, manage, and track workflows. It has five core modules to handle any type of work that moves your way: processes, projects, cases, datasets, and collaboration, although projects and cases are currently being merged into a new module, boards.

With both no-code and low-code tools, Kissflow promises to extend application development to your entire organization. The company also offers a BPM platform, not covered in this review.

Kissflow comes with more than 200 pre-built templates. You can install these templates from the Kissflow marketplace. After installing them, you have complete control to configure them to your needs. Examples of templates include sales enquiry, purchase request, employee transfer request, purchase catalog, software directory, lead qualification, sales pipeline, customer onboarding, IT help desk, bug tracking, and incident management.

Kissflow was renamed from OrangeScape Technologies in 2019. The company is headquartered in Chennai, India. Kissflow claims to have a million users in 10,000 companies distributed over 160 countries.

In addition to the platform, Kissflow offers paid training and consulting, which extends to building out your entire network of interconnected processes. Kissflow also has a partner reseller program.

As we’ll see, Kissflow has no-code modules aimed at non-programmers, and low-code modules (apps) aimed at IT. The user experience for each has been tailored to their target audiences, which makes them quite different.

There were over 400 vendors in the no-code and low-code development space the last time I looked. Gartner covers about 250 of them. Kissflow counts Microsoft Power Apps and Google Cloud AppSheet as competitors for its no-code modules, and OutSystems, Mendix, and Appian as competitors for its low-code modules.

[Image: kissflow 01. Credit: IDG]

Kissflow includes three modules for citizen developers (data forms, boards, and flows) and one module for IT teams (apps).

Kissflow data forms

A Kissflow form is an entity with which you can collect input from the people who participate in your flow. Forms are predominantly used in flows like processes, projects, and cases in Kissflow. A form has three primary components—section, field, and table. In addition, forms have workflows and permissions, and fields can be attached to computations.

[Image: kissflow 02. Credit: IDG]

Example of a Kissflow form, in this case for press release submission. The green “fx” triangles in fields mean that they are computed fields. The formula displays in the field properties.

[Image: kissflow 03. Credit: IDG]

This is the simulation view for a form. We’re looking at the mobile layout, as well as the simulated workflow. There is also a web layout.

[Image: kissflow 04. Credit: IDG]

The workflow for the press release form. The unnamed branches are there to demonstrate branching flows with decision points.

[Image: kissflow 05. Credit: IDG]

The permission view for a form. It shows the possible fine-grained, field-level permissions as well as the field layout and the flow attached to the form.

[Image: kissflow 06. Credit: IDG]

Filling out a form. Since I have development and administration rights, I have an “edit process” button at the lower left.

Kissflow processes

A process is a type of workflow that ensures a strict sequential set of steps performed on form data. Flow admins for a process can set up a form to carry data, and then make a predefined path for it to follow. The system automatically routes the requests through various steps until the item is complete. Processes are a great fit in places where you would want strict control and efficiency.

The screenshots above show the forms and flow for a press release request. Common processes include vacation requests, purchase requests, employee onboarding, budget approval requests, visitor pass requests, and vendor enrollments.

Kissflow cases, projects, and boards

Case systems are useful for support requests, incident management, service requests, bug tracking, help desks (as shown below), sales pipelines, customer onboarding, HR help desks, and facility service requests. Cases support both list and board views.

Kissflow projects are adaptable and support various project management methodologies such as value stream mapping, work breakdown structure, and iterative incremental development. Projects use Kanban boards as their default visualization, and also support list and matrix visualizations.

Projects and case systems are currently being combined into a new module, boards.

[Image: kissflow 07. Credit: IDG]

This help desk is a good example of a case system. It has two views, the list view shown and a board view. The list view emphasizes the cases; the board view emphasizes the status of the cases.

Kissflow datasets

A dataset is a collection of tabular data that can be used in your flows. Forms in your flows can look up information in your datasets or use the information for advanced assignment logic.

A view is a subset of your dataset. You can create views to restrict access to certain parts of a dataset.

Kissflow integrations

An integration consists of a series of sequential steps that indicate how data should be transferred and transformed between different Kissflow applications or other third-party applications. Each of these steps is built using connectors.

Any integration starts with a trigger—an event in one of your connectors that kick-starts your workflow. It pushes data from the flow to complete one or many connector actions.

Kissflow apps

Low-code app development in Kissflow uses visual tools and custom scripting to develop and deliver applications. Kissflow’s app platform provides a visual interface with drag-and-drop features and pre-configured components that allow for application development with little or no coding. You can optionally use JavaScript to add custom features to your application.

[Image: kissflow 08. Credit: IDG]

This IT asset management system is an example of a low-code Kissflow app. The pages and forms were created with drag-and-drop of stock components. You (or IT) can optionally add JavaScript to handle specific events.

[Image: kissflow 09. Credit: IDG]

To develop a Kissflow low-code app, you initially work in a development workspace. When you need testing, you can promote the app to a testing workspace, and then publish it when the testing is complete. Here we are looking at the app’s variables.

[Image: kissflow 10. Credit: IDG]

Here we’re viewing all of the pages in the IT asset management app. You can see that the AdminHome page is marked with a “home” icon.

[Image: kissflow 11. Credit: IDG]

This app has role-based navigation for employees, IT admins, and IT managers.

[Image: kissflow 12. Credit: IDG]

Here we’re editing the menu navigation system for an IT administrator role.

[Image: kissflow 13. Credit: IDG]

The model view shows the relationships among the various tables of the app. Clicking on an item opens a detail view for that item.

[Image: kissflow 14. Credit: IDG]

This is the model detail view for the asset entry table. It shows the fields in the table as well as the related tables.

Adding code in Kissflow

You can add functionality to pages and components by handling events. Kissflow can implement redirect and popup actions without making you write code, but for more complicated actions you will need to write some JavaScript.

[Image: kissflow 15. Credit: IDG]

The property sheets to the right of a Kissflow page include general, style, and event properties. For a page, there are two possible events to handle: onPageLoad and onPageUnload. Other events apply to components on the page, for example the onClick event, which causes redirection to detail pages for some of the cards displayed.

[Image: kissflow 16. Credit: IDG]

The JavaScript action we are viewing handles the onPageLoad event for the home page. The kf class provides Kissflow functionality. At the right you can view application variables, components, and other items that might be relevant to the code you’re writing.

Overall, Kissflow has a good selection of low-code and no-code capabilities, even though its Cases and Projects modules are currently in flux. Combining those two no-code modules into a single Boards module does seem like a good idea, as deciding whether you need a Case or Project system at the beginning of your development effort can be challenging, especially if you’re new to Kissflow.

I initially criticized the separation of Kissflow’s no-code modules and low-code apps into different development systems with inconsistent user experiences. After some convincing by a Kissflow product manager, I accepted that real developers need the three-stage (dev, test, production) deployment process implemented for apps, while citizen developers often find that too complicated. I’ve seen similar issues in several contexts, not just with citizen developers but also with data scientists.

As a side effect of Kissflow’s new implementation of apps and transition from Cases and Projects to Boards, its documentation has become at least partially out of date. (Some of the documentation pages still say OrangeTech, which is a clue to their age.) I’m sure that will all be fixed in time. Meanwhile, expect to ask lots of questions as you learn the product.

Cost: Small business: $10/user/month, 50 users minimum, $6,000 billed annually. Corporate: $20/user/month, 100 users minimum, $24,000 billed annually. Enterprise: Get a custom quote. A free trial is available without the need for a credit card. Courses on the Academy range from $50 to $550, payable by credit card.

Platform: Server: Kissflow is hosted on Google Cloud Platform. Client: Chrome 56+, Safari 13.2+, Edge 79+.

Posted Under: Tech Reviews

Deno vs. Node.js: Which is better?

Posted on 10 August, 2022

In this article, you’ll learn about Node.js and Deno, the differences between CommonJS and ECMAScript modules, using TypeScript with Deno, and faster deployments with Deno Deploy. We’ll conclude with notes to help you decide between using Node.js or Deno for your next development project.

What is Node.js?

Node.js is a cross-platform JavaScript runtime environment that is useful for both servers and desktop applications. It runs a single-threaded event loop registered with the system to handle connections, and each new connection causes a JavaScript callback function to fire. The callback function can handle requests with non-blocking I/O calls. If necessary, it can spawn threads from a pool to execute blocking or CPU-intensive operations and to balance the load across CPU cores.
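The event-loop model described above is not unique to JavaScript. As a rough analogy only (this is Python's asyncio, not Node.js itself), the sketch below runs one single-threaded loop, starts a handler per simulated connection, and hands the blocking part of the work to a thread pool. The names handle_connection, blocking_checksum, and serve are hypothetical.

```python
import asyncio
import threading
import time


def blocking_checksum(payload: bytes) -> int:
    """Stand-in for a blocking or CPU-heavy task (hypothetical example)."""
    time.sleep(0.01)  # simulate a blocking call
    return sum(payload) % 256


async def handle_connection(conn_id: int, loop_thread: int) -> dict:
    """Runs on the event loop thread, once per simulated connection."""
    # Non-blocking wait: the loop is free to serve other connections meanwhile.
    await asyncio.sleep(0.01)
    # Offload the blocking part to the default thread pool.
    payload = f"conn-{conn_id}".encode()
    checksum = await asyncio.to_thread(blocking_checksum, payload)
    return {
        "conn_id": conn_id,
        "checksum": checksum,
        # After the await, execution resumes on the loop's own thread.
        "on_loop_thread": threading.get_ident() == loop_thread,
    }


async def serve(n_connections: int) -> list:
    """Start one handler per connection and wait for all of them."""
    loop_thread = threading.get_ident()
    handlers = [handle_connection(i, loop_thread) for i in range(n_connections)]
    return await asyncio.gather(*handlers)


results = asyncio.run(serve(5))
```

All five handlers share one thread for their non-blocking work; only blocking_checksum runs on pool threads, which is the division of labor the paragraph describes.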

Node’s approach to scaling with callback functions requires less memory to handle more connections than most competitive architectures that scale with threads, including Apache HTTP Server, the various Java application servers, IIS and ASP.NET, and Ruby on Rails.

Node applications aren’t limited to pure JavaScript. You can use any language that transpiles to JavaScript; for example, TypeScript and CoffeeScript. Node.js incorporates the Google Chrome V8 JavaScript engine, which supports ECMAScript 2015 (ES6) syntax without any need for an ES6-to-ES5 transpiler such as Babel.

Much of Node’s utility comes from its large package library, which is accessible from the npm command. NPM, the Node Package Manager, is part of the standard Node.js installation, although it has its own website.

The JavaScript-based Node.js platform was introduced by Ryan Dahl in 2009. It was developed as a more scalable alternative to the Apache HTTP Server for Linux and macOS. NPM, written by Isaac Schlueter, launched in 2010. A native Windows version of Node.js debuted in 2011.

What is Deno?

Deno is a secure runtime for JavaScript and TypeScript that has been extended for WebAssembly, JavaScript XML (JSX), and its TypeScript extension, TSX. Developed by the creator of Node.js, Deno is an attempt to reimagine Node to leverage advances in JavaScript since 2009, including the TypeScript compiler.

Like Node.js, Deno is essentially a shell around the Google V8 JavaScript engine. Unlike Node, it includes the TypeScript compiler in its executable image. Dahl, who created both runtimes, has said that Node.js suffers from three major issues: a poorly designed module system based on centralized distribution; lots of legacy APIs that must be supported; and lack of security. Deno fixes all three problems.

Node’s module system problem was solved by an update in mid-2022.

CommonJS and ECMAScript modules

When Node was created, the de facto standard for JavaScript modules was CommonJS, which is what npm originally supported. Since then, the ECMAScript committee officially blessed ECMAScript modules, also known as ES modules, which are supported by the jspm package manager. Deno also supports ES modules.

Experimental support for ES modules was added in Node.js 12.12 and is stable from Node.js 16 forward. TypeScript 4.7 also supports ES modules for Node.js 16.

The way to load a CommonJS module in JavaScript is to use the require statement. The way to load an ECMAScript module is to use an import statement along with a matching export statement.

The latest Node.js has loaders for both CommonJS and ES modules. How are they different?

The CommonJS loader:

  • Is fully synchronous.
  • Is responsible for handling require() calls.
  • Supports folders as modules.
  • Tries adding extensions (.js, .json, or .node) if one was omitted from the require() call.
  • Cannot be used to load ECMAScript modules.

The ES modules loader:

  • Is asynchronous.
  • Is responsible for handling both import statements and import() expressions.
  • Does not support folders as modules (directory indexes such as ./startup/index.js must be fully specified).
  • Does not search for extensions.
  • Accepts only .js, .mjs, and .cjs extensions for JavaScript text files.
  • Can be used to load JavaScript CommonJS modules.

Why Deno is better for security

It is well known that Deno improves security over Node. Mainly, this is because Deno, by default, does not let a program access disk, network, subprocesses, or environment variables. When you need to allow any of these, you can opt in with a command-line flag, which can be as granular as you like; for example, --allow-read=/tmp. Another security improvement in Deno is that it always dies on uncaught errors. Node, by contrast, will allow execution to proceed after an uncaught error, with unpredictable results.

Can you use Node.js and Deno together?

As you consider whether to use Node.js or Deno for your next server-side JavaScript project, you’ll probably wonder whether you can combine them. The answer to that is a definite “maybe.”

First off, many times, using Node packages from Deno just works. Even better, there are workarounds for many of the common stumbling blocks. These include using the std/node modules of the Deno standard library to “polyfill” the built-in modules of Node; using CDNs to access the vast majority of npm packages in ways that work under Deno; and using import maps. Moreover, Deno has a Node compatibility mode starting with Deno 1.15.

On the downside, Node’s plugin system is incompatible with Deno; Deno’s Node compatibility mode doesn’t support TypeScript; and a few built-in Node modules (such as vm) are incompatible with Deno.

If you’re a Node user thinking of switching to Deno, here’s a cheat sheet to help you.

Using TypeScript with Deno

Deno treats TypeScript as a first-class language, just like JavaScript or WebAssembly. It converts TypeScript (as well as TSX and JSX) into JavaScript, using a combination of the TypeScript compiler, which is built into Deno, and a Rust library called swc. When the code has been type-checked (if checking is enabled) and transformed, it is stored in a cache. In other words, unlike with Node.js or a browser, you don’t need to manually transpile your TypeScript for Deno with the tsc compiler.

As of Deno 1.23, there is no TypeScript type-checking in Deno by default. Since most developers interact with the type-checker through their editor, type-checking again when Deno starts up doesn’t make a lot of sense. That said, you can enable type-checking with the --check flag to Deno.

Deno Deploy for faster deployments

Deno Deploy is a distributed system that allows you to run JavaScript, TypeScript, and WebAssembly close to users, at the edge, worldwide. Deeply integrated with the V8 runtime, Deno Deploy servers provide minimal latency and eliminate unnecessary abstractions. You can develop your script locally using the Deno CLI, and then deploy it to Deno Deploy’s managed infrastructure in less than a second, with no need to configure anything.

Built on the same modern systems as the Deno CLI, Deno Deploy provides the latest and greatest in web technologies in a globally scalable way:

  • Builds on the web: Use fetch, WebSocket, or a URL just like in the browser.
  • Built-in support for TypeScript and JSX: type-safe code, and intuitive server-side rendering without a build step.
  • Web-compatible ECMAScript modules: Import dependencies just like in a browser, without the need for explicit installations.
  • GitHub integration: Push to a branch, review a deployed preview, and merge to release to production.
  • Extremely fast: Deploy in less than a second; serve globally close to users.
  • Deploy from a URL: Deploy code with nothing more than a URL.

Deno Deploy has two tiers. The free tier is limited to 100,000 requests per day, 100 GiB data transfer per month, and 10ms CPU time per request. The pro tier costs $10 per month including 5 million requests per month and 100 GiB data transfer, plus $2-per-million additional requests per month and $0.30/GiB data transfer over the included quota; the pro tier allows 50ms CPU time per request.
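To make the pro-tier arithmetic concrete, here is a minimal sketch that uses only the figures quoted above. The function name and the example workload are made up, and the sketch assumes overage is pro-rated per request and per GiB rather than rounded up to whole units, which the pricing summary does not specify.

```python
def deno_deploy_pro_monthly_cost(requests: int, transfer_gib: float) -> float:
    """Estimate a monthly pro-tier bill in US dollars from the quoted prices."""
    base_fee = 10.00                 # $10 per month
    included_requests = 5_000_000    # 5 million requests included
    included_gib = 100.0             # 100 GiB data transfer included
    price_per_million_extra = 2.00   # $2 per million additional requests
    price_per_extra_gib = 0.30       # $0.30 per GiB over the quota

    extra_requests = max(0, requests - included_requests)
    extra_gib = max(0.0, transfer_gib - included_gib)

    # Assumption: overage is pro-rated, not rounded up to whole millions/GiB.
    total = (
        base_fee
        + extra_requests / 1_000_000 * price_per_million_extra
        + extra_gib * price_per_extra_gib
    )
    return round(total, 2)


# Hypothetical workload: 8 million requests and 150 GiB in one month.
example_cost = deno_deploy_pro_monthly_cost(8_000_000, 150.0)
# Workload that stays within the included quota.
within_quota_cost = deno_deploy_pro_monthly_cost(1_000_000, 20.0)
```

For the hypothetical workload, the overage is 3 million requests ($6) and 50 GiB ($15) on top of the $10 base fee.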

Which to choose: Node.js or Deno?

As you might expect, the answer of which technology is better for your use case depends on many factors. My bottom line: If you have an existing Node.js deployment that isn’t broken, then don’t fix it. If you have a new project that you intend to write in TypeScript, then I’d strongly consider Deno. However, if your TypeScript project needs to use multiple Node.js packages that do not have Deno equivalents, you will need to weigh the Deno project’s feasibility. Starting with a proof-of-concept is pretty much mandatory: It’s hard to predict whether you can make a given Node.js package work in Deno without trying it.

Posted Under: Tech Reviews

Review: Snowflake aces Python machine learning

Posted on 3 August, 2022

Last year I wrote about eight databases that support in-database machine learning. In-database machine learning is important because it brings the machine learning processing to the data, which is much more efficient for big data, rather than forcing data scientists to extract subsets of the data to where the machine learning training and inference run.

These databases each work in a different way:

  • Amazon Redshift ML uses SageMaker Autopilot to automatically create prediction models from the data you specify via a SQL statement, which is extracted to an Amazon S3 bucket. The best prediction function found is registered in the Redshift cluster.
  • BlazingSQL can run GPU-accelerated queries on data lakes in Amazon S3, pass the resulting DataFrames to RAPIDS cuDF for data manipulation, and finally perform machine learning with RAPIDS XGBoost and cuML, and deep learning with PyTorch and TensorFlow.
  • BigQuery ML brings much of the power of Google Cloud Machine Learning into the BigQuery data warehouse with SQL syntax, without extracting the data from the data warehouse.
  • IBM Db2 Warehouse includes a wide set of in-database SQL analytics that includes some basic machine learning functionality, plus in-database support for R and Python.
  • Kinetica provides a full in-database lifecycle solution for machine learning accelerated by GPUs, and can calculate features from streaming data.
  • Microsoft SQL Server can train and infer machine learning models in multiple programming languages.
  • Oracle Cloud Infrastructure can host data science resources integrated with its data warehouse, object store, and functions, allowing for a full model development lifecycle.
  • Vertica has a nice set of machine learning algorithms built-in, and can import TensorFlow and PMML models. It can do prediction from imported models as well as its own models.

Now there’s another database that can run machine learning internally: Snowflake.

Snowflake overview

Snowflake is a fully relational ANSI SQL enterprise data warehouse that was built from the ground up for the cloud. Its architecture separates compute from storage so that you can scale up and down on the fly, without delay or disruption, even while queries are running. You get the performance you need exactly when you need it, and you only pay for the compute you use.

Snowflake currently runs on Amazon Web Services, Microsoft Azure, and Google Cloud Platform. It has recently added External Tables On-Premises Storage, which lets Snowflake users access their data in on-premises storage systems from companies including Dell Technologies and Pure Storage, expanding Snowflake beyond its cloud-only roots.

Snowflake is a fully columnar database with vectorized execution, making it capable of addressing even the most demanding analytic workloads. Snowflake’s adaptive optimization ensures that queries automatically get the best performance possible, with no indexes, distribution keys, or tuning parameters to manage.

Snowflake can support unlimited concurrency with its unique multi-cluster, shared data architecture. This allows multiple compute clusters to operate simultaneously on the same data without degrading performance. Snowflake can even scale automatically to handle varying concurrency demands with its multi-cluster virtual warehouse feature, transparently adding compute resources during peak load periods and scaling down when loads subside.

Snowpark overview

When I reviewed Snowflake in 2019, if you wanted to program against its API you needed to run the program outside of Snowflake and connect through ODBC or JDBC drivers or through native connectors for programming languages. That changed with the introduction of Snowpark in 2021.

Snowpark brings to Snowflake deeply integrated, DataFrame-style programming in the languages developers like to use, starting with Scala, then extending to Java and now Python. Snowpark is designed to make building complex data pipelines a breeze and to allow developers to interact with Snowflake directly without moving data.

The Snowpark library provides an intuitive API for querying and processing data in a data pipeline. Using this library, you can build applications that process data in Snowflake without moving data to the system where your application code runs.

The Snowpark API provides programming language constructs for building SQL statements. For example, the API provides a select method that you can use to specify the column names to return, rather than writing 'select column_name' as a string. Although you can still use a string to specify the SQL statement to execute, you benefit from features like intelligent code completion and type checking when you use the native language constructs provided by Snowpark.

Snowpark operations are executed lazily on the server, which reduces the amount of data transferred between your client and the Snowflake database. The core abstraction in Snowpark is the DataFrame, which represents a set of data and provides methods to operate on that data. In your client code, you construct a DataFrame object and set it up to retrieve the data that you want to use.

The data isn’t retrieved at the time when you construct the DataFrame object. Instead, when you are ready to retrieve the data, you can perform an action that evaluates the DataFrame objects and sends the corresponding SQL statements to the Snowflake database for execution.
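The lazy behavior described above can be illustrated with a toy class in plain Python. This is not the Snowpark API; LazyFrame, fake_server, and the column names are all hypothetical, and the sketch only shows the general idea that transformations build up a SQL statement while nothing is sent until an action runs.

```python
class LazyFrame:
    """Toy stand-in for a lazily evaluated DataFrame (not the Snowpark API)."""

    def __init__(self, table, columns=None, predicates=None, executor=None):
        self.table = table
        self.columns = columns or ["*"]
        self.predicates = predicates or []
        self.executor = executor

    def select(self, *columns):
        # Transformation: returns a new description, sends nothing.
        return LazyFrame(self.table, list(columns), self.predicates, self.executor)

    def filter(self, predicate):
        # Transformation: returns a new description, sends nothing.
        return LazyFrame(
            self.table, self.columns, self.predicates + [predicate], self.executor
        )

    def to_sql(self):
        sql = f"SELECT {', '.join(self.columns)} FROM {self.table}"
        if self.predicates:
            sql += " WHERE " + " AND ".join(self.predicates)
        return sql

    def collect(self):
        # Action: only now is the SQL handed to the (fake) server.
        return self.executor(self.to_sql())


sent_statements = []


def fake_server(sql):
    """Records each statement it receives and returns a placeholder row."""
    sent_statements.append(sql)
    return [("row",)]


df = LazyFrame("sales", executor=fake_server).filter("qty > 0").select("title", "qty")
statements_before_action = list(sent_statements)  # nothing has been sent yet
rows = df.collect()                               # one statement is sent here
```

Chaining filter and select costs nothing on the server side; the single generated statement goes out only when collect is called.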

[Image: snowpark python 01. Credit: IDG]

Snowpark block diagram. Snowpark expands the internal programmability of the Snowflake cloud data warehouse from SQL to Python, Java, Scala, and other programming languages.

Snowpark for Python overview

Snowpark for Python is available in public preview to all Snowflake customers, as of June 14, 2022. In addition to the Snowpark Python API and Python Scalar User Defined Functions (UDFs), Snowpark for Python supports the Python UDF Batch API (Vectorized UDFs), Table Functions (UDTFs), and Stored Procedures.

These features combined with Anaconda integration provide the Python community of data scientists, data engineers, and developers with a variety of flexible programming contracts and access to open source Python packages to build data pipelines and machine learning workflows directly within Snowflake.

Snowpark for Python includes a local development experience you can install on your own machine, including a Snowflake channel on the Conda repository. You can use your preferred Python IDEs and dev tools, then upload your code to Snowflake knowing that it will be compatible.

By the way, Snowpark for Python is free and open source. That’s a change from Snowflake’s history of keeping its code proprietary.

The following sample Snowpark for Python code creates a DataFrame that aggregates book sales by year. Under the hood, DataFrame operations are transparently converted into SQL queries that get pushed down to the Snowflake SQL engine.

from snowflake.snowpark import Session
from snowflake.snowpark.functions import col, count, year

# fetch snowflake connection information
from config import connection_parameters

# build connection to Snowflake
session = Session.builder.configs(connection_parameters).create()

# use Snowpark API to aggregate book sales by year
booksales_df = session.table("sales")
booksales_by_year_df = (
    booksales_df.groupBy(year(col("sold_time_stamp")).alias("sale_year"))
    # alias the aggregate so that the sort below can refer to it by name
    .agg(count(col("qty")).alias("count"))
    .sort("count", ascending=False)
)

Getting started with Snowpark Python

Snowflake’s “getting started” tutorial demonstrates an end-to-end data science workflow using Snowpark for Python to load, clean, and prepare data and then deploy the trained model to Snowflake using a Python UDF for inference. In 45 minutes (nominally), it teaches:

  • How to create a DataFrame that loads data from a stage;
  • How to perform data and feature engineering using the Snowpark DataFrame API; and
  • How to bring a trained machine learning model into Snowflake as a UDF to score new data.

The task is the classic customer churn prediction for an internet service provider, which is a straightforward binary classification problem. The tutorial starts with a local setup phase using Anaconda; I installed Miniconda for that. It took longer than I expected to download and install all the dependencies of the Snowpark API, but that worked fine, and I appreciate the way Conda environments avoid clashes among libraries and versions.

This quickstart begins with a single Parquet file of raw data and extracts, transforms, and loads the relevant information into multiple Snowflake tables.

[Image: snowpark python 03. Credit: IDG]

We’re looking at the beginning of the “Load Data with Snowpark” quickstart. This is a Python Jupyter Notebook running on my MacBook Pro that calls out to Snowflake and uses the Snowpark API. Step 3 originally gave me problems, because I wasn’t clear from the documentation about where to find my account ID and how much of it to include in the account field of the config file. For future reference, look in the “Welcome To Snowflake!” email for your account information.

[Image: snowpark python 04. Credit: IDG]

Here we are checking the loaded table of raw historical customer data and beginning to set up some transformations.

[Image: snowpark python 05. Credit: IDG]

Here we’ve extracted and transformed the demographics data into its own DataFrame and saved that as a table.

[Image: snowpark python 06. Credit: IDG]

In step 12, we extract and transform the fields for a location table. As before, this is done with a SQL query into a DataFrame, which is then saved as a table.

[Image: snowpark python 07. Credit: IDG]

Here we extract and transform data from the raw DataFrame into a Services table in Snowflake.

[Image: snowpark python 08. Credit: IDG]

Next we extract, transform, and load the final table, Status, which shows the churn status and the reason for leaving. Then we do a quick sanity check, joining the Location and Services tables into a Join DataFrame, then aggregating total charges by city and type of contract for a Result DataFrame.

[Image: snowpark python 09. Credit: IDG]

In this step we join the Demographics and Services tables to create a TRAIN_DATASET view. We use DataFrames for intermediate steps, and use a select statement on the joined DataFrame to reorder the columns.

Now that we’ve finished the ETL/data engineering phase, we can move on to the data analysis/data science phase.

[Image: snowpark python 10. Credit: IDG]

This page introduces the analysis we’re about to perform.

[Image: snowpark python 11. Credit: IDG]

We start by pulling in the Snowpark, Pandas, Scikit-learn, Matplotlib, datetime, NumPy, and Seaborn libraries, as well as reading our configuration. Then we establish our Snowflake database session, sample 10K rows from the TRAIN_DATASET view, and convert that to Pandas format.

[Image: snowpark python 12. Credit: IDG]

We continue with some exploratory data analysis using NumPy, Seaborn, and Pandas. We look for non-numerical variables and classify them as categories.

[Image: snowpark python 13. Credit: IDG]

Once we have found the categorical variables, we identify the numerical variables and plot some histograms to see the distribution.

[Image: snowpark python 14. Credit: IDG]

All four histograms.

[Image: snowpark python 15. Credit: IDG]

Given the assortment of ranges we saw in the previous screen, we need to scale the variables for use in a model.

[Image: snowpark python 16. Credit: IDG]

Having all the numerical variables lie in the range from 0 to 1 will help immensely when we build a model.
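Scaling a variable into the 0-to-1 range is usually done with min-max scaling, x' = (x - min) / (max - min). The sketch below shows that formula in plain Python; it is an illustration under that assumption, not the notebook's own code, and the sample values are made up.

```python
def min_max_scale(values):
    """Rescale numbers linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread; map everything to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


monthly_charges = [20.0, 45.0, 70.0, 120.0]  # made-up sample values
scaled_charges = min_max_scale(monthly_charges)
```

After scaling, variables measured in very different units (months of tenure, dollars per month) share one range, so none dominates a model simply because its numbers are larger.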

[Image: snowpark python 17. Credit: IDG]

Three of the numerical variables have outliers. Let’s drop them to avoid having them skew the model.
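One common way to drop outliers is the interquartile range (IQR) rule, which discards values more than 1.5 IQRs outside the middle half of the data. The sketch below assumes that rule and assumes that "drop" means removing the outlying values; the quickstart may use a different criterion. The function name and sample data are hypothetical.

```python
import statistics


def drop_iqr_outliers(values, k=1.5):
    """Keep only values within k * IQR of the first and third quartiles."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if lower <= v <= upper]


sample = [1, 2, 3, 4, 5, 6, 7, 100]  # made-up data with one obvious outlier
cleaned = drop_iqr_outliers(sample)
```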

[Image: snowpark python 18. Credit: IDG]

If we look at the cardinality of the categorical variables, we see they range from 2 to 4 categories.

[Image: snowpark python 19. Credit: IDG]

We pick our variables and write the Pandas data out to a Snowflake table, TELCO_TRAIN_SET.

Finally we create and deploy a user-defined function (UDF) for prediction, using more data and a better model.

[Image: snowpark python 20. Credit: IDG]

Now we set up for deploying a predictor. This time we sample 40K values from the training dataset.

[Image: snowpark python 21. Credit: IDG]

Now we’re setting up for model fitting, on our way to deploying a predictor. Splitting the dataset 80/20 is standard stuff.
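For readers who have not seen one, an 80/20 split simply shuffles the rows and holds back a fifth of them for testing. The notebook most likely relies on a library helper for this; the plain-Python sketch below, with a hypothetical function name and a fixed seed chosen for repeatability, shows the idea on 40,000 stand-in rows.

```python
import random


def train_test_split_rows(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of the rows and split off a held-out test portion."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


rows = list(range(40_000))  # stand-in for the 40K sampled rows
train_rows, test_rows = train_test_split_rows(rows)
```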

[Image: snowpark python 22. Credit: IDG]

This time we’ll use a Random Forest classifier and set up a Scikit-learn pipeline that handles the data engineering as well as doing the fitting.

[Image: snowpark python 23. Credit: IDG]

Let’s see how we did. The accuracy is 99.38%, which isn’t shabby, and the confusion matrix shows relatively few false predictions. The most important feature is whether there is a contract, followed by tenure length and monthly charges.
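Accuracy and the confusion matrix are both simple counts over pairs of actual and predicted labels. The sketch below computes them in plain Python for a tiny made-up example (1 means churned, 0 means stayed); the labels are illustrative and unrelated to the 99.38% result above.

```python
def confusion_counts(actual, predicted):
    """Count true/false positives and negatives for binary labels (1 = churn)."""
    counts = {"tp": 0, "tn": 0, "fp": 0, "fn": 0}
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            counts["tp"] += 1
        elif a == 0 and p == 0:
            counts["tn"] += 1
        elif a == 0 and p == 1:
            counts["fp"] += 1
        else:
            counts["fn"] += 1
    return counts


def accuracy(counts):
    """Fraction of predictions that matched the actual label."""
    total = sum(counts.values())
    return (counts["tp"] + counts["tn"]) / total


actual_labels = [1, 1, 0, 0, 1, 0, 0, 1]     # made-up ground truth
predicted_labels = [1, 0, 0, 0, 1, 1, 0, 1]  # made-up model output
counts = confusion_counts(actual_labels, predicted_labels)
example_accuracy = accuracy(counts)
```

The false positives and false negatives are the "false predictions" that the confusion matrix makes visible and that a single accuracy figure hides.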

[Image: snowpark python 24. Credit: IDG]

Now we define a UDF to predict churn and deploy it into the data warehouse.

[Image: snowpark python 25. Credit: IDG]

Step 18 shows another way to register the UDF, using session.udf.register() instead of a select statement. Step 19 shows another way to run the prediction function, incorporating it into a SQL select statement instead of a DataFrame select statement.
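As a rough sketch of the session.udf.register() route, the code below registers an ordinary Python function as a UDF. It is not the quickstart's code: predict_churn_demo is a hypothetical rule-based stand-in for the trained model, the UDF name is made up, and register_churn_udf assumes you pass it an open Snowpark session.

```python
def predict_churn_demo(tenure_months: int, monthly_charges: float) -> float:
    """Hypothetical stand-in scorer; a real UDF would call a trained model."""
    score = 0.5 + 0.004 * monthly_charges - 0.02 * tenure_months
    return min(1.0, max(0.0, score))  # clamp to the 0..1 range


def register_churn_udf(session):
    """Register the scorer as a Snowflake UDF via session.udf.register().

    Assumes `session` is an open snowflake.snowpark.Session. The import is
    local so the scoring function can be tried without Snowpark installed.
    """
    from snowflake.snowpark.types import FloatType, IntegerType

    return session.udf.register(
        predict_churn_demo,
        return_type=FloatType(),
        input_types=[IntegerType(), FloatType()],
        name="predict_churn_demo",  # hypothetical UDF name
        replace=True,
    )


# The scorer can be checked locally before it is deployed.
new_customer_score = predict_churn_demo(2, 100.0)
loyal_customer_score = predict_churn_demo(60, 20.0)
```

Once registered, a UDF like this can be called by name from a SQL select statement or applied to DataFrame columns, which are the two invocation styles the quickstart steps contrast.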

You can go into more depth by running Machine Learning with Snowpark Python, a 300-level quickstart, which analyzes Citibike rental data and builds an orchestrated end-to-end machine learning pipeline to perform monthly forecasts using Snowflake, Snowpark Python, PyTorch, and Apache Airflow. It also displays results using Streamlit.

Overall, Snowpark for Python is very good. While I stumbled over a couple of things in the quickstart, they were resolved fairly quickly with help from Snowflake’s extensibility support.

I like the wide range of popular Python machine learning and deep learning libraries and frameworks included in the Snowpark for Python installation. I like the way Python code running on my local machine can control Snowflake warehouses dynamically, scaling them up and down at will to control costs and keep runtimes reasonably short. I like the efficiency of doing most of the heavy lifting inside the Snowflake warehouses using Snowpark. I like being able to deploy predictors as UDFs in Snowflake without incurring the costs of deploying prediction endpoints on major cloud services.

Essentially, Snowpark for Python gives data engineers and data scientists a nice way to do DataFrame-style programming against the Snowflake enterprise data warehouse, including the ability to set up full-blown machine learning pipelines to run on a recurrent schedule.

Cost: $2 per credit plus $23 per TB per month storage, standard plan, prepaid storage. 1 credit = 1 node*hour, billed by the second. Higher level plans and on-demand storage are more expensive. Data transfer charges are additional, and vary by cloud and region. When a virtual warehouse is not running (i.e., when it is set to sleep mode), it does not consume any Snowflake credits. Serverless features use Snowflake-managed compute resources and consume Snowflake credits when they are used.
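To make the arithmetic concrete, here is a minimal sketch that uses only the standard-plan figures quoted above: $2 per credit, $23 per TB per month of prepaid storage, and 1 credit = 1 node-hour billed by the second. The function name and example workload are made up, and data transfer charges and higher-level plans are ignored.

```python
def snowflake_standard_monthly_cost(nodes: int, running_seconds: int, storage_tb: float) -> float:
    """Estimate a monthly bill in US dollars from the quoted standard-plan prices."""
    price_per_credit = 2.00        # $2 per credit
    price_per_tb_month = 23.00     # $23 per TB per month, prepaid storage

    # 1 credit = 1 node-hour, billed by the second.
    credits = nodes * running_seconds / 3600
    compute_cost = credits * price_per_credit
    storage_cost = storage_tb * price_per_tb_month
    return round(compute_cost + storage_cost, 2)


# Hypothetical month: a 4-node warehouse that runs 90 minutes in total, plus 2 TB stored.
example_bill = snowflake_standard_monthly_cost(nodes=4, running_seconds=90 * 60, storage_tb=2.0)
# A suspended warehouse consumes no credits, so only storage is charged.
idle_bill = snowflake_standard_monthly_cost(nodes=4, running_seconds=0, storage_tb=2.0)
```

In the hypothetical month, the warehouse consumes 6 credits ($12) and storage adds $46.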

Platform: Amazon Web Services, Microsoft Azure, Google Cloud Platform.

Posted Under: Tech Reviews
