In the previous module we submitted predictions to a single data stream.
Now, we will create a crawler that predicts many different streams.
Here are the steps shown in the video:
As with Module 1, we use Google Colab to install the microprediction package:
pip install microprediction
Then we import new_key and MicroWriter and create a write key:
from microprediction import new_key, MicroWriter
write_key = new_key(difficulty=9)
This gives us a write key with which we can instantiate a MicroWriter as before (I recommend difficulty=10 or 11 rather than 9).
mw = MicroWriter(write_key=write_key)
We reveal the private write_key so you can cut and paste it into the dashboard
print(write_key)
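A note on the difficulty argument used above. As far as I understand it, new_key finds a key by brute-force search for one whose hash has a special form, so each extra unit of difficulty makes the search roughly 16 times longer. The sketch below is only a toy illustration of that idea and is not the library's actual algorithm: the function name mine_toy_key and the "digest starts with zeros" target are assumptions made up for this example.

```python
import hashlib
import random


def mine_toy_key(difficulty, seed=0):
    """Toy search (NOT the microprediction algorithm): draw random 32-character
    hex keys until the SHA-256 digest of one starts with `difficulty` zeros."""
    rng = random.Random(seed)          # seeded so the search is repeatable
    target = "0" * difficulty          # hypothetical target pattern
    attempts = 0
    while True:
        attempts += 1
        key = "%032x" % rng.getrandbits(128)
        digest = hashlib.sha256(key.encode()).hexdigest()
        if digest.startswith(target):
            return key, attempts


# Compare the attempts actually needed with the rough expectation of 16**d.
for d in (1, 2, 3):
    key, attempts = mine_toy_key(d)
    print("difficulty", d, "attempts", attempts, "expected about", 16 ** d)
```

The same scaling is why a difficulty=11 key takes noticeably longer to generate in Colab than a difficulty=9 key.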
We set up a Hacker account on PythonAnywhere, a cloud compute provider where you only pay for CPU compute seconds. This isn't the only way, and future modules will cover other providers and alternatives such as running locally on your own machine.
We open a bash console (Consoles->Bash) on PythonAnywhere and use pip3.8 to install the microprediction package.
pip3.8 install --user --upgrade microprediction
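We then create a script, for example first_crawler/default_crawler.py, containing:
from microprediction import MicroCrawler
crawler = MicroCrawler(write_key="65a4sdf65as")  # replace with your own write key
crawler.run()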
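Since the write key is private, you may prefer not to paste it into the script itself. Here is a minimal sketch of one way to do that, assuming you export the key in an environment variable first. The variable name MICROPREDICTION_WRITE_KEY and the helper names load_write_key and run_crawler are my own inventions for illustration, not part of the microprediction package.

```python
import os

# Hypothetical variable name chosen for this example; use whatever you like.
ENV_VAR = "MICROPREDICTION_WRITE_KEY"


def load_write_key(env=os.environ, var=ENV_VAR):
    """Return the write key stored in the environment, or fail with a clear message."""
    key = env.get(var, "").strip()
    if not key:
        raise RuntimeError("Set the %s environment variable to your write key" % var)
    return key


def run_crawler(write_key):
    """Start the same crawler as in the script above, using the given key."""
    # Imported here so the key-loading helper can be used without the package.
    from microprediction import MicroCrawler
    crawler = MicroCrawler(write_key=write_key)
    crawler.run()
```

At the bottom of your script you would then call run_crawler(load_write_key()).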
The crawler can be run from the bash console. For example:
python3.8 /home/yourusername/first_crawler/default_crawler.py
Don't ask me why I called it mw instead of crawler in the video, but that doesn't matter. What matters is where you are on the leaderboard. So, as with the previous Python module, punch your write key into the dashboard (which greets you at Microprediction.org, or from the top right corner at Microprediction.com) to see how your crawler is doing.
Once again it only took us ten minutes, and in this case just a few lines of Python, to kick off a crawler.
You can run your Python code anywhere you wish. I merely used PythonAnywhere as an example.
Unlike your submission from the first Python module, this crawler will constantly predict fast-moving time series. If you stop the program, your predictions will rapidly become stale and you may plummet down the leaderboards. For this reason we recommend using an always-on task at PythonAnywhere. There are plenty of alternatives, however, which we will cover in future modules.
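If you run the crawler somewhere that will not restart it for you, one option is to wrap it in a small restart loop so that a crash does not leave your predictions stale. The following is only a sketch using the standard library. The function keep_alive and its parameters are hypothetical names of my own, and how often the crawler actually needs restarting is something I have not measured.

```python
import time


def keep_alive(task, max_restarts=5, pause_seconds=10.0, sleep=time.sleep):
    """Run task(); if it raises, pause and start it again.

    Returns the number of restarts once task() finishes normally, and
    re-raises the last error after max_restarts failed restarts.
    Ctrl-C (KeyboardInterrupt) is not caught, so you can still stop it by hand.
    """
    restarts = 0
    while True:
        try:
            task()
            return restarts
        except Exception as exc:
            if restarts >= max_restarts:
                raise
            restarts += 1
            print("task failed (%r); restart %d of %d" % (exc, restarts, max_restarts))
            sleep(pause_seconds)
```

With the crawler from earlier, the call would look something like keep_alive(crawler.run). The sleep argument exists only so the pause can be swapped out when experimenting.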