# Integration Guide
This section explains how to integrate the SDK into different environments, handle request/response capture, and customize exported data.
## Core Concepts

- `OneXMonitor`: central object that configures the exporter and framework-specific adapters.
- Adapters: attach framework-specific hooks (PyTorch, TensorFlow, JAX).
- Exporter: batches signals and sends them to ingestion endpoints asynchronously.
- Request context: optional helper to capture raw request payloads and final application responses.
## Typical Flow

- Instantiate `OneXMonitor`.
- Call `monitor.watch(model)` for each model you want to instrument.
- (Optional) Wrap incoming requests with `monitor.request_context(...)` to tag the raw input and the application response.
- Call `monitor.stop()` when your application shuts down.
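Condensed into code, the four steps look like this. This is a minimal sketch: `load_model()` and `prepare_inputs()` are hypothetical stand-ins for your own loading and preprocessing code, not part of the SDK.

```python
from onex import OneXMonitor

def load_model():
    """Placeholder: return your framework model here."""
    ...

def prepare_inputs(text):
    """Placeholder: tokenize/shape `text` for your model."""
    ...

# 1. Instantiate the monitor once at startup.
monitor = OneXMonitor(api_key="your-api-key", endpoint="https://your-ingestion-endpoint")

# 2. Instrument each model you want to observe.
model = monitor.watch(load_model())

# 3. (Optional) Tag each request's raw input and final response.
with monitor.request_context({"text": "example input"}) as ctx:
    output = model(prepare_inputs("example input"))
    ctx.record_response({"result": str(output)})

# 4. Flush and stop on shutdown.
monitor.stop()
```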
## Basic Example (PyTorch)
```python
from flask import Flask, request, jsonify
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

from onex import OneXMonitor

app = Flask(__name__)

monitor = OneXMonitor(
    api_key="your-api-key",  # Retrieve from https://dashboard.observability.getonex.ai
    endpoint="https://your-ingestion-endpoint",  # Same dashboard provides the ingestion URL
    config={
        "payload_sample_items": 5,
        "payload_tensor_sample": 32,
        "request_metadata": {"app": "bert-api"},
    },
)

MODEL_NAME = "nlptown/bert-base-multilingual-uncased-sentiment"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_NAME,
    output_hidden_states=True,
    output_attentions=True,
)
model.eval()
model = monitor.watch(model)


@app.route("/predict", methods=["POST"])
def predict():
    payload = request.json or {}
    text = payload.get("text", "")

    with monitor.request_context({"text": text}, metadata={"route": "/predict"}) as ctx:
        inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
        with torch.no_grad():
            outputs = model(**inputs)

        probs = torch.softmax(outputs.logits, dim=-1)
        rating = int(torch.argmax(probs).item() + 1)
        confidence = float(torch.max(probs).item())

        api_response = {"rating": rating, "confidence": confidence, "text": text}
        ctx.record_response(api_response)

    return jsonify(api_response)


if __name__ == "__main__":
    app.run()
```
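To exercise the endpoint, you can post a request from another process. The snippet below uses the `requests` library against the default Flask development-server address; the printed values are illustrative, since the actual rating and confidence depend on the input and model version.

```python
import requests

resp = requests.post(
    "http://127.0.0.1:5000/predict",  # default Flask dev-server address
    json={"text": "This product exceeded my expectations!"},
    timeout=10,
)
print(resp.json())  # e.g. {"rating": 5, "confidence": 0.87, "text": "..."}
```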
## Why the request context?

- Adds a `raw` block to the request payload event (your original JSON, not just tensors)
- Emits an application-response event alongside the automatic model-output event
- Reuses the same `request_id` for all neural signals, request payload, and response records
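As a purely hypothetical illustration of that last point (the real event schema depends on your SDK version; only `raw` and `request_id` come from the docs above, the other field names are invented here), the shared `request_id` ties the three records together like so:

```python
# Hypothetical event shapes, shown as Python dicts. Field names are
# illustrative only; consult the exporter docs for the real schema.
request_payload_event = {
    "request_id": "req-123",
    "raw": {"text": "Great phone!"},    # your original JSON
    "metadata": {"route": "/predict"},
}
model_output_event = {
    "request_id": "req-123",            # same id, emitted automatically
    "logits_sample": [0.1, 0.2, 0.7],   # truncated tensor sample
}
application_response_event = {
    "request_id": "req-123",            # same id again
    "response": {"rating": 3, "confidence": 0.7},
}
```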
## Manual Instrumentation

If you can’t call `monitor.request_context`, you can drive the adapter manually:
```python
adapter = monitor.adapter
request_id = adapter.start_request_context(payload={"text": text})
try:
    outputs = model(**inputs)
    adapter.export_manual_response(
        request_id,
        response_payload={"rating": rating, "confidence": confidence},
        success=True,
    )
finally:
    adapter.end_request_context()
```
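The snippet above only reports the success path. If inference can fail, it seems reasonable to report that too; the sketch below assumes `export_manual_response` also accepts `success=False` with an error payload (only `success=True` appears above, so verify this against your SDK version):

```python
adapter = monitor.adapter
request_id = adapter.start_request_context(payload={"text": text})
try:
    outputs = model(**inputs)
    adapter.export_manual_response(
        request_id,
        response_payload={"rating": rating, "confidence": confidence},
        success=True,
    )
except Exception as exc:
    # Assumption: the same call with success=False records a failed request.
    adapter.export_manual_response(
        request_id,
        response_payload={"error": str(exc)},
        success=False,
    )
    raise
finally:
    adapter.end_request_context()
```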
## Graceful Shutdown
```python
import signal

def shutdown(*args):
    monitor.stop()
    raise SystemExit(0)

signal.signal(signal.SIGTERM, shutdown)
signal.signal(signal.SIGINT, shutdown)
```
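If your process manager already installs its own signal handlers (Gunicorn, for example, manages SIGTERM in its workers), registering the stop call with the standard library's `atexit` is a simpler, best-effort alternative. Note that `atexit` hooks run only on normal interpreter exit, not when the process is killed outright.

```python
import atexit

# Flush any buffered signals when the interpreter exits normally.
atexit.register(monitor.stop)
```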