This section explains how to integrate the SDK into different environments, handle request/response capture, and customize exported data.

Core Concepts

  • OneXMonitor: central object that configures the exporter and framework-specific adapters.
  • Adapters: attach framework-specific hooks (PyTorch, TensorFlow, JAX).
  • Exporter: batches signals and sends them to ingestion endpoints asynchronously.
  • Request context: optional helper to capture raw request payloads and final application responses.

Typical Flow

  1. Instantiate OneXMonitor.
  2. Call monitor.watch(model) for each model you want to instrument.
  3. (Optional) Wrap incoming requests with monitor.request_context(...) to tag raw input + application response.
  4. Call monitor.stop() when your application shuts down.

Basic Example (PyTorch)

from flask import Flask, request, jsonify
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from onex import OneXMonitor

app = Flask(__name__)

monitor = OneXMonitor(
    api_key="your-api-key",  # Retrieve from https://dashboard.observability.getonex.ai
    endpoint="onex-ingestion-endpoint",  # Same dashboard provides the ingestion URL
    config={
        "payload_sample_items": 5,
        "payload_tensor_sample": 32,
        "request_metadata": {"app": "bert-api"},
    },
)

MODEL_NAME = "nlptown/bert-base-multilingual-uncased-sentiment"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_NAME,
    output_hidden_states=True,
    output_attentions=True,
)
model.eval()
model = monitor.watch(model)

@app.route("/predict", methods=["POST"])
def predict():
    payload = request.get_json(silent=True) or {}  # silent=True avoids a 415 on non-JSON bodies
    text = payload.get("text", "")

    with monitor.request_context({"text": text}, metadata={"route": "/predict"}) as ctx:
        inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
        with torch.no_grad():
            outputs = model(**inputs)

        probs = torch.softmax(outputs.logits, dim=-1)
        rating = int(torch.argmax(probs).item() + 1)
        confidence = float(torch.max(probs).item())

        api_response = {"rating": rating, "confidence": confidence, "text": text}
        ctx.record_response(api_response)

    return jsonify(api_response)

if __name__ == "__main__":
    app.run()
For all config options and OneXMonitor parameters, see the Configuration Reference.

Why the request context?

  • Adds a raw block to the request payload event (your original JSON, not just tensors)
  • Emits an application-response event alongside the automatic model-output event
  • Reuses the same request_id for all neural signals, request payload, and response records
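The real event schema is internal to the SDK, but the correlation idea can be sketched in plain Python (the field names below are illustrative assumptions, not the SDK's actual wire format):

```python
import uuid

def build_request_events(raw_payload, model_output, app_response):
    """Sketch: tag related events with one shared request_id.

    These dicts only illustrate the correlation the request
    context provides; the SDK's actual event schema differs.
    """
    request_id = str(uuid.uuid4())
    return [
        {"type": "request_payload", "request_id": request_id, "raw": raw_payload},
        {"type": "model_output", "request_id": request_id, "output": model_output},
        {"type": "application_response", "request_id": request_id, "response": app_response},
    ]

events = build_request_events({"text": "great!"}, {"logits": [0.1, 0.9]}, {"rating": 5})
assert len({e["request_id"] for e in events}) == 1  # all three share one id
```

On the platform side, that shared id is what lets a raw request, its neural signals, and the final application response be viewed as one trace.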

Attention metrics (BERT, GPT, ViT)

To capture attention signals (entropy per head, max/mean attention, head agreement) for Attention Health assessment, make sure the model returns attention weights: either load it with output_attentions=True or pass the flag on each forward call:
# BERT / ViT
model = AutoModel.from_pretrained(..., output_attentions=True)
outputs = model(**inputs)  # attention weights are passed to SDK hooks

# GPT-2 style
outputs = model(**inputs, output_attentions=True)
GPT models use hooks on transformer.h[i].attn; attention weights are only returned when the model forward receives output_attentions=True.
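The SDK computes these metrics internally from the returned attention weights. As a rough sketch of what "entropy per head" means, assuming attention of shape (num_heads, seq_len, seq_len) where each row is a probability distribution over keys:

```python
import numpy as np

def attention_entropy_per_head(attn: np.ndarray) -> np.ndarray:
    """Mean Shannon entropy of each head's attention distributions.

    attn: (num_heads, seq_len, seq_len); each row sums to 1.
    Returns one value per head, averaged over query positions.
    """
    eps = 1e-12  # avoid log(0) for one-hot rows
    ent = -(attn * np.log(attn + eps)).sum(axis=-1)  # (num_heads, seq_len)
    return ent.mean(axis=-1)

# Uniform attention maximizes entropy at log(seq_len);
# sharply focused (one-hot) attention drives it toward 0.
heads, seq = 4, 8
uniform = np.full((heads, seq, seq), 1.0 / seq)
print(attention_entropy_per_head(uniform))  # ~log(8) for every head
```

Low entropy means a head attends to few positions; uniformly high entropy can indicate a head that is not discriminating at all, which is why per-head entropy is a useful health signal.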

Enabling or disabling request/response capture

By default, the SDK sends both request payloads (model inputs) and response payloads (model outputs, application responses) to the platform, in addition to neural signals. You can disable either:
monitor = OneXMonitor(
    api_key="your-api-key",
    endpoint="onex-ingestion-endpoint",
    config={
        "capture_request_payload": False,   # Do not send request payloads
        "capture_response_payload": False,  # Do not send response payloads
    },
)
Neural signals are always sent to /api/signals/batch regardless of these settings. Use this when you want to reduce data volume or avoid sending sensitive input/output to the platform.

Manual Instrumentation

If you can’t call monitor.request_context, you can drive the adapter manually:
adapter = monitor.adapter
request_id = adapter.start_request_context(payload={"text": text})
try:
    outputs = model(**inputs)
    adapter.export_manual_response(
        request_id,
        response_payload={"rating": rating, "confidence": confidence},
        success=True,
    )
finally:
    adapter.end_request_context()
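If you drive the adapter manually in several places, the start/response/end sequence can be wrapped in a small context manager so the failure path gets reported too. This is a sketch, not part of the SDK; the stub adapter below stands in for monitor.adapter and only records the calls made:

```python
from contextlib import contextmanager

class StubAdapter:
    """Stands in for monitor.adapter; records the call sequence."""
    def __init__(self):
        self.calls = []
    def start_request_context(self, payload):
        self.calls.append("start")
        return "req-1"
    def export_manual_response(self, request_id, response_payload, success):
        self.calls.append(("response", success))
    def end_request_context(self):
        self.calls.append("end")

@contextmanager
def manual_request(adapter, payload):
    """Mirror the manual start/response/end sequence, reporting failures."""
    request_id = adapter.start_request_context(payload=payload)
    try:
        yield request_id
    except Exception:
        # Report the failure before closing the context.
        adapter.export_manual_response(request_id, response_payload=None, success=False)
        raise
    finally:
        adapter.end_request_context()

adapter = StubAdapter()
with manual_request(adapter, {"text": "hi"}) as request_id:
    adapter.export_manual_response(request_id, response_payload={"ok": True}, success=True)
assert adapter.calls == ["start", ("response", True), "end"]
```

The same wrapper works against the real adapter, since it uses only the three calls shown in the manual example above.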

Graceful Shutdown

import signal

def shutdown(*args):
    monitor.stop()
    raise SystemExit(0)

signal.signal(signal.SIGTERM, shutdown)
signal.signal(signal.SIGINT, shutdown)
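A signal handler can fire alongside normal interpreter exit (for example when you also register an atexit hook), so it is worth making the shutdown idempotent. A sketch, assuming only that monitor.stop() should run exactly once; the StubMonitor stands in for OneXMonitor so the example is self-contained:

```python
import atexit
import signal

class StubMonitor:
    """Stands in for OneXMonitor; counts stop() calls."""
    def __init__(self):
        self.stop_calls = 0
    def stop(self):
        self.stop_calls += 1

monitor = StubMonitor()
_stopped = False

def shutdown(*args):
    """Idempotent: the signal handler and atexit may both fire."""
    global _stopped
    if _stopped:
        return
    _stopped = True
    monitor.stop()

def handle_sigterm(*args):
    shutdown()
    raise SystemExit(0)

signal.signal(signal.SIGTERM, handle_sigterm)
atexit.register(shutdown)

shutdown()  # simulate the first trigger
shutdown()  # a second trigger is a no-op
assert monitor.stop_calls == 1
```

This guards against flushing the exporter twice when both a SIGTERM and the normal interpreter exit run the same handler.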