
GitHub Actions

Powering scheduled jobs

Did you know that you can use GitHub Actions to automatically push new parameters to a repository, submit a prediction, or top up your balance? If you are new to GitHub Actions, notice the "Actions" tab in your GitHub repository, and the "New workflow" button.

 


GitHub Actions are scripts that can be triggered on a schedule, on a commit, or by other events you define. They are good for all sorts of automation, and if your repository is public, you don't have to pay for them. This post explores GitHub Actions through some example repositories:

  • microprediction/offline: Illustrates offline estimation of parameters, and automated code pushes to save them. This is used by several FitCrawlers, as noted in the crawler examples.
  • microprediction/keymaker: Fork this and you'll easily maintain a balance for any write key, or collection of write keys.
  • microprediction/microactors: Fork this to create your own scheduled copula or multivariate submissions.
  • microprediction/microactors-plots: Illustrates saving plots to a repo, and also a coalition of write keys acting like one.

 

You can fork and use some of these directly, or you can simply steal the ideas in the action specification (YAML) files. 

 

microprediction/offline

Visit https://github.com/microprediction/offline to see a minimalist example of a code repository that automatically updates parameters.

What it does

Every half hour this repository runs a GitHub Action that:

  • picks a stream at random,
  • updates some parameters,
  • writes them to a file (a pickle, though it need not be), and
  • commits the model file to the master branch.

In this way the repository always contains reasonably up-to-date model parameters.
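
To make the pattern concrete, here is a minimal sketch of what a fit.py in this style might look like. The stream shortlist and the "fitting" step are placeholders of my own; the real repository does something more interesting.

    import pickle
    import random
    from microprediction import MicroReader

    # Hypothetical shortlist; the real script chooses streams differently
    STREAM_NAMES = ['die.json', 'c5_bitcoin.json']

    def fit_one():
        mr = MicroReader()
        name = random.choice(STREAM_NAMES)         # pick a stream at random
        lagged = mr.get_lagged_values(name=name)   # recent historical values
        mu = sum(lagged) / len(lagged)             # toy "fit": mean and std
        sigma = (sum((x - mu) ** 2 for x in lagged) / len(lagged)) ** 0.5
        with open('model.pkl', 'wb') as f:
            pickle.dump({'name': name, 'mu': mu, 'sigma': sigma}, f)

    if __name__ == '__main__':
        fit_one()   # the workflow then commits model.pkl back to the repo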

How you can mimic it

Scheduled GitHub Actions won't run on a fork until you manually enable them. You should be prompted for that, but alternatively you can:

  1. Make a new public repository
  2. Manually create a new GitHub action.
  3. Cut and paste workflows/fit.yml into your newly created workflow file. 

Of course, you'll want to modify fit.py as well, to make it fit whatever models you need from whatever packages suit your fancy. 
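
For reference, a scheduled workflow of this kind looks roughly like the sketch below. This is not a copy of the repository's fit.yml; the action versions and the commit step in particular may differ.

    name: fit
    on:
      schedule:
        - cron: '*/30 * * * *'    # every half hour
      workflow_dispatch:          # also allow manual runs
    jobs:
      fit:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v2
          - uses: actions/setup-python@v2
            with:
              python-version: '3.8'
          - run: pip install microprediction
          - run: python fit.py
          - name: Commit updated parameters
            run: |
              git config user.name github-actions
              git config user.email github-actions@github.com
              git add model.pkl
              git commit -m "Update model parameters" || echo "No changes"
              git push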

 

microprediction/keymaker

Have you visited the keymaker? 


You can use this template to periodically top up your WRITE_KEY's balance using GitHub Actions, allowing you to publish streams (or make predictions) indefinitely. 

How to use this repo

  1. Fork it
  2. Create a GitHub secret called WRITE_KEY
  3. Enable GitHub Actions

Look under Settings->Secrets. 
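
Once the secret exists, your workflow can pass it to the script as an environment variable using standard GitHub Actions syntax (the step and script names here are hypothetical):

    steps:
      - name: Top up balance
        env:
          WRITE_KEY: ${{ secrets.WRITE_KEY }}   # injected from the repo secret
        run: python top_up.py

Your Python script then reads it with os.environ['WRITE_KEY'].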

Don't have a write_key yet?

Open up the New Key notebook in Colab, and run it.
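
Alternatively, you can mine one locally with the microprediction package's key helper. A sketch, assuming you're happy to wait (key creation is proof-of-work, and higher difficulties take much longer):

    from microprediction import new_key

    write_key = new_key(difficulty=9)   # difficulty 9 can take quite a while
    print(write_key)                    # store it somewhere safe, e.g. a GitHub secret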

 

microprediction/microactors

Next, visit microprediction/microactors to see an example of a workflow that triggers a simple script submitting predictions.  

How you use this repository

  1. Fork it
  2. Open up this notebook in Colab and run it to generate yourself a write key.
  3. Save the key as a GitHub secret called WRITE_KEY (instructions).
  4. Click "accept" when GitHub asks you if you want to enable GitHub Actions. Go to Actions and you'll see the only action used in your repo (like this one). You should be able to enable it.

That's all. Of course, you'll also want to go to www.microprediction.org and plug your write key into the dashboard.

Submitting Copula Predictions

GitHub Actions are particularly well suited to copula stream prediction. Here we use the zcurve lagged value retrieval and bivariate/trivariate submission methods recently added to MicroWriter. Have you noticed those before? Look for submit_copula and submit_zvalues to avoid having to remember where to find the space-filling curve function and its inverse.

The corresponding historical data retrieval methods are in MicroReader, as you would expect. Look for get_lagged_copulas if you want bivariate or trivariate values with uniform margins, or get_lagged_zvalues if you prefer to receive them as roughly N(0,1) standard normal numbers. 

While it is reasonable to assume that a continuously running process is better suited to predicting rapidly changing data, a scheduled job may work just fine for implied copulas that don't change too often.
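
Putting the pieces together, a scheduled submission script might look roughly like this. The stream name is made up, the "model" is a trivial resampler, and the exact signatures of get_lagged_zvalues and submit_zvalues should be checked against the current MicroWriter source.

    import random
    from microprediction import MicroWriter

    mw = MicroWriter(write_key='YOUR WRITE KEY')    # e.g. read from an env variable
    name = 'z2~stream_one~stream_two~3555.json'     # hypothetical z2~ stream

    zs = mw.get_lagged_zvalues(name=name)           # history, roughly N(0,1) margins
    # Trivial "model": resample historical z-vectors with a little jitter
    samples = [[z + 0.1 * random.gauss(0, 1) for z in random.choice(zs)]
               for _ in range(225)]                 # 225 samples per submission
    mw.submit_zvalues(name=name, zvalues=samples)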

What are z-streams again?

If none of this makes sense, see An Introduction to Z-Streams or the microprediction frequently asked questions. Put simply, some seemingly univariate time series, such as this one, are really multivariate implied copulas. You can retrieve them in multivariate format using the .get_lagged_copulas or .get_lagged_zvalues methods of the MicroReader.
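
For example, with a hypothetical trivariate stream name:

    from microprediction import MicroReader

    mr = MicroReader()
    name = 'z3~stream_one~stream_two~stream_three~3555.json'   # hypothetical
    copulas = mr.get_lagged_copulas(name=name)   # vectors with uniform margins
    zvalues = mr.get_lagged_zvalues(name=name)   # the same, as ~N(0,1) values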

microprediction/microactors-plots

Since we are on the topic of bivariate and trivariate prediction, we have a good excuse to produce more eye-candy. Actually, the repository microactors-plots goes a little further than producing a /gallery of copula fitting pictures like the one below. 

[Image: a fitted trivariate copula]

The Python script fit.py also illustrates a syndicate pattern, in which four different write keys collaborate on a (hopefully) more accurate representation of the joint distribution. If you are familiar with the Lottery Paradox (see blog article) you'll see why this can be effective.

By default, this repository uses the Copulas package to fit various types of vine copula. I've left plenty of room for improvement; for example, the following suggestions might help you race up the z2~ and z3~ leaderboards.

  • Pseudo-sampling (again, see blog article)
  • Selection of the best copula, rather than a random choice.
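
To give a flavor of the default approach, here is a minimal sketch of fitting a vine copula with the copulas package, using random stand-in data in place of lagged z-values:

    import numpy as np
    import pandas as pd
    from copulas.multivariate import VineCopula

    # Stand-in for history retrieved via get_lagged_zvalues
    df = pd.DataFrame(np.random.randn(200, 3), columns=['z1', 'z2', 'z3'])

    model = VineCopula('regular')   # the package also offers 'center' and 'direct'
    model.fit(df)
    samples = model.sample(225)     # one row per required prediction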

 

GitHub Action Usage Limits

To save you searching the documentation, here are the limits as of October 2020.

[Image: GitHub Actions usage limits table, October 2020]


The Fine Print

Don't abuse the intent of GitHub Actions.

No Crypto

Don't use GitHub Actions to mine cryptocurrency.

Terms and conditions

Refer to the terms and conditions to determine if your use case is appropriate. 
