How to configure GreatAI
GreatAI aims to provide reasonable defaults wherever possible. The current configuration is always prominently displayed (and updated) on the dashboard and in the command-line start-up banner.
Using great_ai.configure

You can override any of the default settings by calling great_ai.configure. If you don't call configure, the default settings are applied on the first call to most great-ai functions.
Warning

You must call great_ai.configure before calling (or decorating with) any other great-ai function. However, importing other functions before calling great_ai.configure is permitted.
import logging

from great_ai import RouteConfig, configure

configure(
    version='1.0.0',
    log_level=logging.INFO,
    seed=2,
    should_log_exception_stack=False,
    prediction_cache_size=0,  # completely disable caching
    disable_se4ml_banner=True,
    dashboard_table_size=200,
    route_config=RouteConfig(  # the unspecified routes are enabled by default
        feedback_endpoints_enabled=False,
        dashboard_enabled=False,
    ),
)
Using remote storage

The only aspect that cannot be automated is choosing the backing storage for the database and the file storage. Right now, you have three options for storing models and large datasets: LargeFileLocal, LargeFileMongo, and LargeFileS3. Without explicit configuration, LargeFileLocal is selected by default. It still version-controls your files but stores them only under a local path (which can, of course, be a remote volume mounted over NFS, HDFS, etc.).
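To make this concrete, here is a minimal sketch of using LargeFileLocal directly; the file-like interface and the keep_last_n argument are assumptions to verify against your installed version:

from great_ai.large_file import LargeFileLocal

# write a new, automatically versioned copy of the file;
# keep_last_n (assumed argument) prunes older versions
with LargeFileLocal('notes.txt', 'w', keep_last_n=3) as f:
    f.write('hello')

# read the latest version back
with LargeFileLocal('notes.txt', 'r') as f:
    print(f.read())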
Important

If your working directory contains a mongo.ini or s3.ini file, an attempt is made to auto-configure LargeFileMongo or LargeFileS3, respectively. To use LargeFileMongo or LargeFileS3 explicitly, configure them before calling any other great-ai function.
S3-compatible

An example s3.ini:

aws_region_name = eu-west-2
aws_access_key_id = MY_AWS_ACCESS_KEY  # ENV:MY_AWS_ACCESS_KEY would also work
aws_secret_access_key = MY_AWS_SECRET_KEY
large_files_bucket_name = bucket-for-models
from great_ai import save_model
from great_ai.large_file import LargeFileS3

# not strictly necessary: if s3.ini (or mongo.ini) is available in the current
# working directory, it is automatically used to configure the respective
# LargeFile implementation/database
LargeFileS3.configure_credentials_from_file('s3.ini')

model = [4, 3]
save_model(model, 'my-model')
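To load the saved model back, you can rely on the use_model decorator; the sketch below assumes the decorator injects the deserialised object through a parameter named model:

from great_ai import use_model

@use_model('my-model')  # fetches the latest saved version
def double_first(a, model):
    # model is the object saved above, i.e. [4, 3]
    return a * model[0]

print(double_first(2))  # prints 8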
Departing from AWS

With the aws_endpoint_url argument, it is possible to use any other S3-compatible service, such as Backblaze. In that case, it would be aws_endpoint_url=https://s3.us-west-002.backblazeb2.com.
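As a sketch, assuming LargeFileS3.configure_credentials accepts the same keys as s3.ini as keyword arguments (all values below are placeholders):

from great_ai.large_file import LargeFileS3

# hypothetical Backblaze credentials; aws_endpoint_url points the client
# at a non-AWS, S3-compatible service
LargeFileS3.configure_credentials(
    aws_region_name='us-west-002',
    aws_access_key_id='MY_KEY_ID',
    aws_secret_access_key='MY_SECRET_KEY',
    large_files_bucket_name='bucket-for-models',
    aws_endpoint_url='https://s3.us-west-002.backblazeb2.com',
)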
GridFS

GridFS specifies how to store files in MongoDB. The official MongoDB server and many compatible implementations support it.

An example mongo.ini:

MONGO_CONNECTION_STRING=mongodb://localhost:27017  # this is the default value
# MONGO_CONNECTION_STRING=ENV:MONGO_CONNECTION_STRING  # alternatively, take the value from the environment variable of the same name; when specified, it overrides the default
MONGO_DATABASE=my-database  # it is automatically created if it doesn't exist
from great_ai.large_file import LargeFileMongo
from great_ai import save_model
LargeFileMongo.configure_credentials_from_file('mongo.ini')
model = [4, 3]
save_model(model, 'my-model')
Simplifying config files

You can combine mongo.ini or s3.ini with your application's config file because the configure_credentials_from_file method ignores unneeded keys.
Using a database

By default, a thread-safe wrapper around TinyDB is utilised for saving the prediction traces into a local file. Unfortunately, this approach is unsuitable for most production needs.
MongoDB

Currently, only MongoDB is supported as a production-ready TracingDatabase. To use it, either place a file named mongo.ini in your working directory or explicitly call MongoDbDriver.configure_credentials_from_file or MongoDbDriver.configure_credentials.
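For example, a minimal sketch (assuming MongoDbDriver is importable from the top-level great_ai package and that the explicit arguments mirror the mongo.ini keys):

from great_ai import MongoDbDriver

# file-based configuration: equivalent to mongo.ini being auto-discovered
# in the working directory
MongoDbDriver.configure_credentials_from_file('mongo.ini')

# or pass the values directly
MongoDbDriver.configure_credentials(
    mongo_connection_string='mongodb://localhost:27017',
    mongo_database='my-database',
)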