Over time we'll include more tips here for those of you running crawlers on Linux or macOS. A tiny bit of bash goes a long way.
You probably know by now it is bad, bad, bad if your crawler stops running (without first withdrawing predictions). Rather than trying to make the crawler robust to internet outages and the like, we recommend wrapping it in a restart loop.
You can also do this in Python, or with the watch command.
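As a rough sketch, a restart loop in Python might look like the following. The crawler path and the `max_restarts` knob are placeholders for illustration, not part of any library:

```python
import subprocess
import sys
import time

# Placeholder path; substitute your own crawler script.
CRAWLER = "/home/me/crawlers/crawler1.py"

def restart_loop(max_restarts=None, pause_seconds=1):
    """Re-launch the crawler whenever it exits.

    max_restarts is only there so the loop can be bounded for testing;
    in production you would let it run forever (the default).
    Returns the number of times the crawler was launched.
    """
    restarts = 0
    while max_restarts is None or restarts < max_restarts:
        result = subprocess.run([sys.executable, CRAWLER])
        print(f"Crawler exited with code {result.returncode}; restarting.")
        restarts += 1
        time.sleep(pause_seconds)  # avoid spinning hot in a crash loop
    return restarts

if __name__ == "__main__":
    restart_loop()
```

The brief sleep between launches is worth keeping: if the crawler dies instantly (bad credentials, missing file), an unpaused loop would hammer the machine.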
```bash
#!/bin/bash
while :
do
    python3 /home/me/crawlers/crawler1.py
done
```
As a matter of taste, you might wish to restart a crawler periodically from bash, and perhaps perform some maintenance along the way, such as upgrading the microprediction library or running an offline task.
```bash
#!/bin/bash
START=`date +%s`
while [ $(( $(date +%s) - 30000000 )) -lt $START ]; do
    set -e
    . /home/me/.virtualenvs/micro/bin/activate
    pip install git+https://github.com/microprediction/microprediction.git
    python3.8 /home/me/crawlers/crawler_with_timeout.py 82000
    sleep 60
done
```
Here the crawler will bounce roughly once a day: the script passes a timeout of 82000 seconds (just under 24 hours) to the Python script, and your crawler's run method can take a timeout parameter in seconds. The outer loop itself expires after 30000000 seconds, a little under a year. This example is for the brave, as it pulls the latest code release on every cycle!
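The timeout plumbing on the Python side can be sketched as below. This is a hypothetical stand-alone illustration of a run method that honors a timeout, not the microprediction library's actual API; the bash wrapper above simply supplies the number of seconds as the first argument:

```python
import sys
import time

def run(timeout_seconds, poll_interval=0.1):
    """Do crawler work in a loop, returning once timeout_seconds elapse.

    Hypothetical sketch: a real crawler would navigate streams and
    submit predictions inside the loop body. Returns the number of
    passes completed, purely for illustration.
    """
    deadline = time.time() + timeout_seconds
    passes = 0
    while time.time() < deadline:
        # ... one pass of crawling / predicting work goes here ...
        passes += 1
        time.sleep(poll_interval)
    return passes  # the wrapper script restarts us after we return

if __name__ == "__main__":
    # The bash wrapper passes the timeout as the first argument, e.g. 82000.
    run(float(sys.argv[1]))
```

Returning cleanly at the deadline (rather than being killed) gives the crawler a chance to finish an in-flight submission before the wrapper relaunches it.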
Some will prefer to use cron.
Maintaining a running process isn't that hard, even if your internet is shaky.
You are adding value in real time to ongoing operations, so bouncing back up quickly matters. This isn't a drill!