# How to perform prediction with GreatAI
After creating a GreatAI service by wrapping your prediction function (and optionally configuring it), it's time to make some predictions.

Let's take the following example:
```python
from great_ai import GreatAI

@GreatAI.create
def greeter(your_name: str) -> str:
    return f'Hi {your_name}!'
```
## One-off prediction
Even though `greeter` is now an instance of `GreatAI`, you can continue using it as a regular function.
```python
>>> greeter('Bob')
Trace[str]({'created': '2022-07-11T14:31:46.183764',
            'exception': None,
            'feedback': None,
            'logged_values': {'arg:your_name:length': 3, 'arg:your_name:value': 'Bob'},
            'models': [],
            'original_execution_time_ms': 0.0381,
            'output': 'Hi Bob!',
            'tags': ['greeter', 'online', 'development'],
            'trace_id': '7c284fd7-7f0d-4464-b5f8-3ef126df34af'})
```
As you can see, the original return value is wrapped in a `Trace` object (which is also persisted in your database of choice). You can access the original value under the `output` property.
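For example, given the trace above, a minimal sketch (only the `output` property is documented here; reading the other printed keys as attributes in the same way is an assumption based on the representation above):

```python
trace = greeter('Bob')

print(trace.output)  # 'Hi Bob!' (the original return value)

# The remaining printed keys are assumed to be readable the same way:
print(trace.trace_id)
```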
## Online prediction
Likely, the main way you would like to expose your model is through an HTTP API. `@GreatAI.create` scaffolds a number of REST API endpoints for your model and creates a FastAPI app, available under the wrapped function's `app` property (`greeter.app` in this example). This can be served using uvicorn or any other ASGI server.
Since most ML code lives in Jupyter notebooks, deploying a notebook containing the inference function is also supported. To achieve this, uvicorn is wrapped by the `great-ai` command-line utility, which (among other things) takes care of feeding a notebook into uvicorn. It also supports auto-reloading.
### In development
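To try it locally, run the `great-ai` command-line utility on the file containing your function (the filename `greeter.ipynb` is only an assumption for this example):

```sh
great-ai greeter.ipynb
```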
!!! success
    Your model is accessible at localhost:6060.
Some configuration options are also supported.
!!! info "More options"
    For more options (but no notebook support), simply use uvicorn for starting your app (available at `greeter.app`).
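For instance, a minimal sketch of exposing the app to uvicorn directly, assuming the code lives in a file called `main.py` (the filename and module layout are assumptions):

```python
# main.py
from great_ai import GreatAI

@GreatAI.create
def greeter(your_name: str) -> str:
    return f'Hi {your_name}!'

# Expose the scaffolded FastAPI instance under a module-level name
# so that it can be started with: uvicorn main:app
app = greeter.app
```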
### In production
There are three main approaches for deploying a GreatAI service.
#### Manual deployment
The app is run in production mode if the value of the `ENVIRONMENT` environment variable is set to `production`. Simply run the following in the command line of a production machine:

```sh
ENVIRONMENT=production great-ai deploy.ipynb
```
This is the crudest approach; however, it might be fitting for some contexts.
#### Containerised deployment
Run the notebook directly in a container or create a service for it using your favourite container orchestrator.
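A minimal sketch of such a command, with several assumptions: that the image's working directory is `/app`, that your notebook is called `deploy.ipynb`, and that the port matches the development default of 6060:

```sh
docker run --rm \
    --publish 6060:6060 \
    --volume "$(pwd):/app" \
    --env ENVIRONMENT=production \
    scoutinscience/great-ai:latest deploy.ipynb
```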
You can replace `$(pwd)` with the path to your code's folder.
#### Use a Platform-as-a-Service
Similar to the previous approach, your code will run in a container. However, instead of managing it manually, you can choose from a plethora of PaaS providers (such as AWS ECS or the DigitalOcean App Platform) that take a Docker image as a source and handle the rest of the deployment.
To this end, you can also create a custom Docker image. It is especially useful if you have third-party dependencies, such as PyTorch or TensorFlow.
```dockerfile
FROM scoutinscience/great-ai:latest

# Remove this block if you don't have a requirements.txt
COPY requirements.txt ./
RUN pip install --no-cache-dir --requirement requirements.txt

# If you store your models in S3 or GridFS, it may be a good idea
# to cache them in the image so that you don't have to download
# them each time a container starts
RUN large-file --backend s3 --secrets s3.ini --cache my-domain-predictor

# Add your application code to the image
COPY . .

# The default ENTRYPOINT is great-ai; specify its argument using CMD
CMD ["deploy.ipynb"]
```
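After building the image (for example, with `docker build --tag my-domain-predictor .`), it can be pushed to a registry and handed over to the provider of your choice.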
## Batch prediction
Processing larger amounts of data on a single machine is made easy by GreatAI's `process_batch` method. It relies on multiprocessing (through `parallel_map`) to take full advantage of all available CPU cores.
```python
>>> greeter.process_batch(['Alice', 'Bob'])
[Trace[str]({'created': '2022-07-11T14:36:37.119183',
             'exception': None,
             'feedback': None,
             'logged_values': {'arg:your_name:length': 5, 'arg:your_name:value': 'Alice'},
             'models': [],
             'original_execution_time_ms': 0.1251,
             'output': 'Hi Alice!',
             'tags': ['greeter', 'online', 'development'],
             'trace_id': '90ffa15f-e839-41c4-8e7a-3211168bc138'}),
 Trace[str]({'created': '2022-07-11T14:36:37.166659',
             'exception': None,
             'feedback': None,
             'logged_values': {'arg:your_name:length': 3, 'arg:your_name:value': 'Bob'},
             'models': [],
             'original_execution_time_ms': 0.0571,
             'output': 'Hi Bob!',
             'tags': ['greeter', 'online', 'development'],
             'trace_id': 'f48e94c7-0815-48b3-a864-41349d3dae84'})]
```
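Each element of the returned list is a regular `Trace`, so the plain outputs can be collected just like before:

```python
traces = greeter.process_batch(['Alice', 'Bob'])
print([trace.output for trace in traces])  # ['Hi Alice!', 'Hi Bob!']
```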