watch(), and use request_context() so each request is traced end-to-end
with neural signals, raw payloads, and application responses.
Prerequisites
- Python 3.8+
- Flask
- PyTorch and a model you want to instrument (e.g. HuggingFace Transformers)
- API key and ingestion endpoint from the OneX Observability Dashboard
Step 1: Create the monitor
Create an OneXMonitor with your API key and endpoint. Use the dashboard to
get environment-specific endpoints (e.g. development vs production). Optional
config such as request_metadata and payload sampling helps tailor what gets
sent to the platform.
Set enable_logging=True to see framework detection and
signal export in the terminal.
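A sketch of the setup, assuming the SDK is imported as onex_observability (the import path, endpoint URL, and exact constructor parameters are assumptions based on this guide):

```python
from onex_observability import OneXMonitor  # import path is an assumption

# Key and environment-specific endpoint come from the OneX Observability Dashboard.
monitor = OneXMonitor(
    api_key="YOUR_API_KEY",
    endpoint="https://ingest.onex.example/v1",  # hypothetical development endpoint
    request_metadata={"service": "sentiment-api", "env": "development"},
    payload_sampling=1.0,   # capture every payload while developing
    enable_logging=True,    # print framework detection and signal export
)
```

Create the monitor once at module import time so every route shares the same exporter.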
Step 2: Load and watch the model
Load your model as usual, then pass it to monitor.watch(). The SDK
auto-detects the framework (PyTorch, TensorFlow, JAX), attaches hooks, and
streams signals asynchronously. All inference inside a request_context will
be associated with the same request.
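For example, with a PyTorch sentiment model from HuggingFace (this assumes the monitor from Step 1 and that watch() returns the instrumented model, which is an assumption):

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_NAME = "distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME)
model.eval()

# watch() auto-detects PyTorch, attaches hooks, and streams signals
# asynchronously; inference itself is unchanged.
model = monitor.watch(model)
```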
Step 3: Wrap routes with request context
Use monitor.request_context() inside each prediction route. Pass the raw
request payload (e.g. user input) and any route metadata. Run your model
forward inside the context, build the API response, then call
ctx.record_response() so the platform receives both neural signals and the
application-level response.
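A sketch of an instrumented route, assuming the monitor, tokenizer, and watched model from the previous steps (the request_context/record_response signatures follow this guide's usage and are otherwise assumptions):

```python
from flask import Flask, jsonify, request
import torch

app = Flask(__name__)

@app.route("/predict", methods=["POST"])
def predict():
    payload = request.get_json()
    # The first argument becomes the request payload event; metadata is optional.
    with monitor.request_context(payload, metadata={"route": "/predict", "version": "v1"}) as ctx:
        inputs = tokenizer(payload["text"], return_tensors="pt")
        with torch.no_grad():
            logits = model(**inputs).logits
        api_response = {"label": int(logits.argmax(dim=-1))}
        # Exported with the same request_id as the neural signals and payload.
        ctx.record_response(api_response)
    return jsonify(api_response)
```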
- Raw payload: The first argument to request_context is sent as the request payload event (e.g. {"text": "..."}), so the platform can correlate inputs with neural signals.
- Metadata: Optional metadata (e.g. route, version) is attached to request/response events.
- Response: ctx.record_response(api_response) ensures the application response is exported with the same request_id as the neural signals and payload.
Complete example
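Putting the pieces together, a minimal app might look like this (a sketch: the OneXMonitor API follows this guide's usage; the import path, endpoint, and exact signatures are assumptions):

```python
import atexit
import torch
from flask import Flask, jsonify, request
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from onex_observability import OneXMonitor  # import path is an assumption

monitor = OneXMonitor(
    api_key="YOUR_API_KEY",
    endpoint="https://ingest.onex.example/v1",  # hypothetical endpoint
    enable_logging=True,
)

MODEL_NAME = "distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = monitor.watch(
    AutoModelForSequenceClassification.from_pretrained(MODEL_NAME).eval()
)

app = Flask(__name__)

@app.route("/predict", methods=["POST"])
def predict():
    payload = request.get_json()
    with monitor.request_context(payload, metadata={"route": "/predict"}) as ctx:
        inputs = tokenizer(payload["text"], return_tensors="pt")
        with torch.no_grad():
            logits = model(**inputs).logits
        api_response = {"label": int(logits.argmax(dim=-1))}
        ctx.record_response(api_response)
    return jsonify(api_response)

atexit.register(monitor.stop)  # flush outstanding signal batches on exit

if __name__ == "__main__":
    app.run(port=5000)
```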
This combines the monitor from Step 1, the watched model from Step 2, and the instrumented route from Step 3.
Error handling
If an exception is raised inside request_context, you can still record a
failure response so the platform has a full trace. Wrap the model call in a
try/except and call record_response with a payload that indicates failure
(and, if you use manual instrumentation, success=False).
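A sketch of a route that records a failure response, assuming the monitor, tokenizer, and model from the earlier steps (error fields and status code are illustrative choices, not part of the SDK):

```python
@app.route("/predict", methods=["POST"])
def predict():
    payload = request.get_json()
    with monitor.request_context(payload, metadata={"route": "/predict"}) as ctx:
        try:
            inputs = tokenizer(payload["text"], return_tensors="pt")
            with torch.no_grad():
                logits = model(**inputs).logits
            api_response = {"ok": True, "label": int(logits.argmax(dim=-1))}
        except Exception as exc:
            # Record a failure payload so the platform still gets a full trace.
            api_response = {"ok": False, "error": str(exc)}
            ctx.record_response(api_response)
            return jsonify(api_response), 500
        ctx.record_response(api_response)
    return jsonify(api_response)
```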
Graceful shutdown
Call monitor.stop() when the Flask process exits so the SDK flushes
outstanding batches and closes cleanly. Using atexit or signal handlers
works well:
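For example (this assumes monitor.stop() is safe to call more than once; if it is not, guard it with a flag):

```python
import atexit
import signal
import sys

# Flush outstanding batches and close the exporter on normal interpreter exit.
atexit.register(monitor.stop)

def _handle_sigterm(signum, frame):
    # Also stop cleanly when the process is terminated (e.g. by a supervisor).
    monitor.stop()
    sys.exit(0)

signal.signal(signal.SIGTERM, _handle_sigterm)
```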
Next steps
- Configuration: See Configuration Reference for sampling, payload capture, logits/probabilities, and throughput limits.
- Integration overview: See Integration Guide for manual instrumentation, disabling request/response capture, and attention/metrics options.
- Signals: See Signals for the events the SDK sends to the platform.
