Intermediate Python

Bouncing with Bash

One way to keep your program running indefinitely

Over time we'll add more tips here for those of you running crawlers on Linux or macOS. A tiny bit of bash goes a long way.

Automatic restarts

You probably know by now it is bad, bad, bad if your crawler stops running (without first withdrawing predictions). Rather than trying to make the crawler robust to internet outages and the like, we recommend wrapping it in a restart loop. 


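#!/bin/bash
# Restart the crawler whenever it exits, for any reason.
while :
do
    python3 /home/me/crawlers/crawler1.py
    sleep 5   # brief pause so an immediate crash does not become a tight loop
done
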
You can also do this in Python, or with the watch command.
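
A Python version of the restart loop might look like the sketch below. It assumes the crawler is launched as a child process with subprocess.run. The function name restart_loop and its pause and max_restarts parameters are made up for illustration and are not part of the microprediction library.

```python
import subprocess
import sys
import time


def restart_loop(cmd, pause=5.0, max_restarts=None):
    """Run cmd, and run it again every time it exits.

    cmd          -- command as a list, e.g. ["python3", "/home/me/crawlers/crawler1.py"]
    pause        -- seconds to wait between launches
    max_restarts -- stop after this many launches; None means loop forever
    Returns the list of exit codes seen (only reachable when max_restarts is set).
    """
    codes = []
    while max_restarts is None or len(codes) < max_restarts:
        result = subprocess.run(cmd)  # blocks until the child process exits
        codes.append(result.returncode)
        print(f"exited with code {result.returncode}; restarting", file=sys.stderr)
        time.sleep(pause)
    return codes


# Typical use (runs until you kill it):
# restart_loop(["python3", "/home/me/crawlers/crawler1.py"])
```

The max_restarts argument exists mainly so the loop can be exercised in a test. Leave it as None for a crawler that should run indefinitely.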

Bouncing your crawler

As a matter of taste, you might wish to restart a crawler periodically in bash, and perhaps use the restart to perform some maintenance, such as upgrading the microprediction library or running an offline task.


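#!/bin/bash
# Keep bouncing the crawler until 30,000,000 seconds (about 347 days) have elapsed.
START=$(date +%s)
while [ $(( $(date +%s) - 30000000 )) -lt "$START" ]; do
    # Note: with set -e the whole script exits if any step below fails,
    # including a non-zero exit status from the crawler itself.
    set -e
    . /home/me/.virtualenvs/micro/bin/activate
    # Install the library from GitHub (https, as GitHub no longer serves git://)
    pip install git+https://github.com/microprediction/microprediction.git
    # Run the crawler with a timeout of 82000 seconds
    python3.8 /home/me/crawlers/crawler_with_timeout.py 82000
    sleep 60
done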

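Here the crawler will bounce every day, roughly: the timeout of 82000 seconds is a little under 23 hours, after which the script returns, the loop sleeps for a minute, and everything starts again. This example is for the brave, as it pulls the latest code from GitHub on every pass! The timeout is passed as an argument to the Python script running the crawler. Your crawler's run method can take a timeout parameter in seconds.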
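
To make the timeout idea concrete, here is a minimal sketch of how a script such as crawler_with_timeout.py could read the timeout from the command line and stop once it has elapsed. It does not use the microprediction API. The functions parse_timeout and run, and the step callback standing in for one cycle of crawler work, are hypothetical names chosen for illustration.

```python
import time


def parse_timeout(argv, default=82000):
    """Return the timeout in seconds from argv[1], or the default if it is absent."""
    return int(argv[1]) if len(argv) > 1 else default


def run(step, timeout, pause=0.0, clock=time.monotonic):
    """Call step() repeatedly until timeout seconds have elapsed.

    step  -- hypothetical callable doing one cycle of crawler work
    clock -- time source, injectable so the loop can be tested quickly
    Returns the number of times step() was called.
    """
    deadline = clock() + timeout
    calls = 0
    while clock() < deadline:
        step()
        calls += 1
        time.sleep(pause)
    return calls


# Typical use inside the script:
# import sys
# run(step=do_one_cycle, timeout=parse_timeout(sys.argv), pause=1.0)
```

When run returns, the script exits normally and the bash loop starts the next pass.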

Some will prefer to use cron.

Summary

Maintaining a running process isn't that hard, even if your internet is shaky.

If you get knocked down, get up again!  

You are adding value in real time to ongoing operations, so bouncing back up is important. This isn't a drill!

Expect failures

Design your programs, where possible, so they can recover gracefully. You might withdraw all predictions after a crash is detected, save parameters to disk in a separate process (as per this example), or persist state somewhere (such as with the microstate package).
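
As one simple way to persist state, the sketch below saves a dictionary of parameters to a JSON file and reloads it on startup. It writes in the same process rather than a separate one, and the names save_state, load_state and the file crawler_state.json are assumptions made for illustration. Writing to a temporary file and then renaming it means a crash in the middle of a save should not leave a half-written file behind.

```python
import json
import os
import tempfile


def save_state(state, path):
    """Write state as JSON to a temp file in the same directory, then rename it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    os.replace(tmp_path, path)  # replaces any existing file in a single step


def load_state(path, default=None):
    """Return the saved state, or default if the file is missing or unreadable."""
    try:
        with open(path) as f:
            return json.load(f)
    except (FileNotFoundError, json.JSONDecodeError):
        return default


# Typical use at startup and after each update:
# state = load_state("crawler_state.json", default={})
# save_state(state, "crawler_state.json")
```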
