Quickstart

Querying in an IPython Notebook (Jupyter, Colab, etc.)

IPython is an interactive Python execution engine used by many notebooks (e.g., Jupyter, Colab) to execute code blocks. The first step in using Kaskada in a notebook is to install the Kaskada Python client package:

!pip install kaskada

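The next step is to create an API client. To do this, you'll need to obtain an API key from the Kaskada admin page. The example below reads the credentials from the CLIENT_ID and CLIENT_SECRET environment variables: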

import os
import kaskada as kda
from kaskada import compute
client = kda.client(os.getenv("CLIENT_ID"), os.getenv("CLIENT_SECRET"))
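If either environment variable is unset, os.getenv returns None, and the problem may only surface later as an authentication failure. A minimal sketch of a guard, assuming the credentials live in CLIENT_ID and CLIENT_SECRET as in the snippet above; the helper name read_credentials is hypothetical and not part of the Kaskada client:

```python
import os


def read_credentials(env=None):
    """Return (client_id, client_secret), failing fast if either is missing."""
    # Allow a plain dict to be passed in so the helper is easy to test.
    env = os.environ if env is None else env
    missing = [name for name in ("CLIENT_ID", "CLIENT_SECRET") if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["CLIENT_ID"], env["CLIENT_SECRET"]
```

The returned pair could then be passed to kda.client in place of the two os.getenv calls.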

Get to know the "magic" Extension

IPython supports what it calls "magic commands" - commands prefixed with % or %%, whose implementation may be provided by arbitrary Python code.

Kaskada provides a magic command to improve the Fenl authoring experience. The IPython extension is optional; you can use Kaskada without it. Feel free to use whichever client interface fits your workflow.

To use the magic extension you must first install the fenlmagic package in your notebook environment:

!pip install fenlmagic

Finally, load the extension to register the syntax extension with IPython:

%load_ext fenlmagic

You can make simple single-line queries by prefixing a code block with %fenl. The query results are computed and returned as a pandas DataFrame.

%fenl Purchase.amount | sum()

You can write longer queries by using the double-percent prefix %%fenl. In this case, the query content starts on the next line and includes the rest of the code block's contents:

%%fenl

let max_amount = Purchase.amount | max()
let min_amount = Purchase.amount | min()
in (Purchase.amount - min_amount) / max_amount
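
To make the arithmetic concrete, here is a plain-Python sketch of what this query computes. It assumes that Fenl aggregations such as max() and min() are cumulative over time and grouped per entity, so each purchase is compared against the running minimum and maximum seen so far for the same entity. The function name and the sample data are illustrative only.

```python
def normalized_amounts(purchases):
    """Compute (amount - running_min) / running_max for each purchase.

    `purchases` is an iterable of (entity, amount) pairs in time order.
    """
    running_min = {}
    running_max = {}
    results = []
    for entity, amount in purchases:
        # Update the per-entity running aggregates with the new event.
        running_min[entity] = min(amount, running_min.get(entity, amount))
        running_max[entity] = max(amount, running_max.get(entity, amount))
        top = running_max[entity]
        # Guard against division by zero; None stands in for an undefined result.
        value = None if top == 0 else (amount - running_min[entity]) / top
        results.append((entity, value))
    return results


sample = [("alice", 10.0), ("alice", 4.0), ("bob", 5.0), ("alice", 20.0), ("alice", 12.0)]
for entity, value in normalized_amounts(sample):
    print(entity, value)
```

Note that the first purchase for each entity always yields 0 under this model, because it is both the running minimum and the running maximum.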

The magic extension makes it easy to iterate on queries. Once you've crafted the perfect query, you might want to use the query string elsewhere, for example to create a view that saves your features. The fenl extension provides a --var flag that assigns the query string to a variable in your notebook's local environment:

%%fenl --var normalized_purchase_amount

let max_amount = Purchase.amount | max()
let min_amount = Purchase.amount | min()
in (Purchase.amount - min_amount) / max_amount

After executing this block, the variable normalized_purchase_amount will contain the query string, which you can pass to the Python client to create a view:

from kaskada import views

views.create_view(view = {
    "name": "NormalizedPurchaseAmount",
    "expression": normalized_purchase_amount,
})

Naming and Sharing Features

Once a query produces the results you want, you can reuse its query string elsewhere, for example to create a named feature or a view that groups several features. The --var flag of the fenl extension, described above, captures the query string in a notebook variable for this purpose.

Fenl expressions can be shared and re-used by creating a view. A view is a named expression. In subsequent Fenl expressions, the view's name is synonymous with the view's expression. Views are persisted in the Kaskada platform and are accessible to any user within an organization.

To create a view, we'll start by describing the expression we'd like to name. In this case, we're interested in describing what it means for a user to be "active". This definition depends on business logic and might require some iteration to get just right, so we'll use the IPython "magic" extension to explore the results of our query.

%%fenl --var is_active_user

count(Login, window = since(days = 30)) > 1

Notice that we added --var is_active_user to the beginning of the magic block; this causes the extension to assign the query string to the variable is_active_user when the block is run.
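
As a rough mental model of the --var flag: IPython hands a cell magic two strings, the rest of the first line and the cell body. The sketch below mimics that split in plain Python and stores the body under the requested name. It is an illustration under those assumptions, not fenlmagic's actual implementation; run_cell_magic and its namespace argument are hypothetical.

```python
import argparse
import shlex


def run_cell_magic(source, namespace):
    """Split a '%%fenl ...' cell into arguments and query text, honoring --var."""
    first_line, _, body = source.partition("\n")
    name, _, arg_string = first_line.partition(" ")
    if name != "%%fenl":
        raise ValueError("not a %%fenl cell")

    parser = argparse.ArgumentParser(prog="fenl", add_help=False)
    parser.add_argument("--var", default=None)
    args = parser.parse_args(shlex.split(arg_string))

    query = body.strip()
    if args.var is not None:
        # Expose the query text to the rest of the notebook.
        namespace[args.var] = query
    return query


notebook_namespace = {}
run_cell_magic(
    "%%fenl --var is_active_user\n\ncount(Login, window = since(days = 30)) > 1",
    notebook_namespace,
)
print(notebook_namespace["is_active_user"])
```

In the notebook the effect is the same: after the cell runs, is_active_user holds the query text.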

We can use this variable to create a view using the Python client without re-typing the expression:

from kaskada import view

view.create_view(
    name = "IsActiveUser",
    expression = is_active_user,
    client = client,
)

We've now created a view named IsActiveUser. We can verify it was created successfully by searching for views matching the name:

view.list_views(search = "User")

{
  "views": [
    {
      "view_name": "IsActiveUser",
      "expression": "count(Login, window = since(days = 30)) > 1"
    }
  ]
}
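
To check for the view programmatically rather than by eye, the response can be filtered in Python. This sketch assumes the response has been converted to a plain dict with a "views" list whose entries carry "view_name" and "expression" keys, as in the output shown here. The actual return type of list_views may differ, and find_view is a hypothetical helper.

```python
def find_view(response, name):
    """Return the entry whose view_name equals `name`, or None if absent."""
    for entry in response.get("views", []):
        if entry.get("view_name") == name:
            return entry
    return None


response = {
    "views": [
        {
            "view_name": "IsActiveUser",
            "expression": "count(Login, window = since(days = 30)) > 1",
        }
    ]
}
match = find_view(response, "IsActiveUser")
print(match["expression"] if match else "view not found")
```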

Using a View

Now that we've created a view, let's look at how the view can be used. We can use a view's name anywhere we could use the view's expression - the only restriction placed on views is that they must be valid Fenl expressions.

Here's an example of using a view to filter the values produced by an expression:

%%fenl

record{
  last_login: Login.time | last(),
  total_purchases: Purchase.amount | sum(),
} if IsActiveUser

Views may reference other views, so we could give this expression a name and create a view for it as well if we wanted to.
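
One way to picture a view's name being synonymous with its expression is as textual substitution. The sketch below expands view names inside an expression and follows references between views. It is a conceptual model only, not a description of how Kaskada resolves views; expand_views and the ActiveSpender view are made up for illustration.

```python
import re

# Matches whole identifiers such as IsActiveUser, Login, or count.
IDENTIFIER = re.compile(r"\b[A-Za-z_]\w*")


def expand_views(expression, views, _active=()):
    """Replace view names in `expression` with their parenthesized expressions."""

    def substitute(match):
        name = match.group(0)
        if name not in views:
            return name  # tables, fields, functions, and keywords stay as-is
        if name in _active:
            raise ValueError("circular view reference: " + name)
        # Expand the referenced view first, so views may reference other views.
        return "(" + expand_views(views[name], views, _active + (name,)) + ")"

    return IDENTIFIER.sub(substitute, expression)


views = {
    "IsActiveUser": "count(Login, window = since(days = 30)) > 1",
    "ActiveSpender": "IsActiveUser",  # hypothetical view built on another view
}
print(expand_views("Purchase.amount if ActiveSpender", views))
```

The circular-reference check reflects a design choice in this toy model: without it, a view that mentions itself would recurse forever.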

Views are useful any time you need to share or re-use expressions:

  • Cleaning operations

  • Common business logic

  • Final feature vectors

Next Steps:

  • Check out our docs for a version of this quickstart with copyable code blocks and runnable notebooks.

  • Check out our examples for specific bite-sized problems and solutions.

  • Check out Kaskada in action on industry-specific solutions and try it yourself!