Quick Start
Install, instrument, validate, and ship Obtrace in production in under 30 minutes
This guide is intentionally opinionated: it gives you the shortest path to real production value, not just a local demo.
By the end of this quick start, you will have:
- One backend service instrumented and sending data.
- Optional frontend instrumentation connected to the same context model.
- Runtime and CI/CD context attached to incident investigation.
- AI-ready interfaces available (Ask AI, `llm.txt`, `mcp.json`).
Prerequisites
- Obtrace account with an API key.
- Access to one production-relevant service (not a toy service).
- Access to deployment configuration (env vars/secrets).
- CI/CD pipeline access (recommended for release correlation).
What Success Looks Like
Before you start, define acceptance criteria:
- Telemetry arrives continuously for the target service.
- You can filter by `service`, `env`, and `version` without ambiguity.
- At least one error or trace contains enough context for diagnosis.
- You can associate telemetry spikes with release/deploy metadata.
Step 1: Configure Authentication Correctly
Follow Authentication and create environment-scoped credentials.
Recommended key strategy:
- One key per environment (`dev`, `staging`, `prod`).
- Optional per-service keys for blast-radius control.
- Server-side keys only; never ship privileged keys to browser clients.
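The environment-scoped strategy above can be sketched as a small lookup helper. The `OBTRACE_KEY_*` variable names here are illustrative, not official; adapt them to your secrets layout:

```typescript
// Hypothetical helper: resolve an environment-scoped Obtrace API key.
// One key per environment keeps the prod key out of dev/staging configs.
type Env = "dev" | "staging" | "prod";

function resolveApiKey(
  env: Env,
  vars: Record<string, string | undefined>,
): string {
  // e.g. OBTRACE_KEY_PROD — an assumed naming convention.
  const key = vars[`OBTRACE_KEY_${env.toUpperCase()}`];
  if (!key) {
    // Fail fast rather than falling back to a shared key.
    throw new Error(`No Obtrace key configured for environment "${env}"`);
  }
  return key;
}
```

Failing fast when a key is missing prevents the common mistake of silently reusing one key across environments.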
Step 2: Instrument the Highest-Impact Backend Service
Pick the service that creates the largest operational risk (checkout, auth, billing, API gateway).
- Select runtime in SDK Catalog.
- Install and initialize SDK.
- Add canonical attributes to every event/span: `service`, `env`, `version`, and `region` (if multi-region).
Why this order: backend-first usually gives the highest diagnostic value per minute invested.
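One way to enforce the canonical attributes is a wrapper applied to every event before it is sent. This is a sketch under the assumption that your SDK accepts plain attribute objects; the field names match the tags listed above:

```typescript
// Canonical tags every event/span should carry.
interface CanonicalAttributes {
  service: string;
  env: string;
  version: string;
  region?: string; // only for multi-region deployments
}

// Merge canonical tags into an event. The event is spread first so the
// canonical values cannot be accidentally overridden by ad-hoc fields.
function withCanonicalAttributes<T extends object>(
  event: T,
  attrs: CanonicalAttributes,
): T & CanonicalAttributes {
  return { ...event, ...attrs };
}
```

Centralizing this in one helper is what keeps tags "stable and standardized across services" when you validate in Step 4.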
Step 3: Add Frontend Instrumentation (If User Impact Matters)
If your incidents affect user interaction, add browser telemetry.
- Use JavaScript Browser SDK.
- Capture page, route, and interaction context where relevant.
- Correlate frontend failures with backend requests whenever possible.
This is what turns “backend is slow” into “which user path degraded and why”.
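Frontend-to-backend correlation is usually done by propagating a trace header on each request. A minimal sketch using the W3C `traceparent` format (`version-traceid-spanid-flags`), assuming your SDKs honor that standard:

```typescript
// Generate n random bytes as lowercase hex.
function randomHex(bytes: number): string {
  let out = "";
  for (let i = 0; i < bytes; i++) {
    out += Math.floor(Math.random() * 256).toString(16).padStart(2, "0");
  }
  return out;
}

// W3C traceparent: version 00, 16-byte trace id, 8-byte span id, sampled flag.
function makeTraceparent(): string {
  return `00-${randomHex(16)}-${randomHex(8)}-01`;
}

// On the frontend, attach it to each request so the backend span joins
// the same trace (endpoint is illustrative):
// fetch("/api/checkout", { headers: { traceparent: makeTraceparent() } });
```

In practice the browser SDK should generate and propagate this for you; the sketch only shows what travels on the wire.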
Step 4: Validate Ingestion and Data Quality
Do not continue rollout before this gate.
Validation checklist:
- Data flow is continuous, not bursty.
- Timestamps are sane (no major clock drift).
- Sampling is intentional and documented.
- No frequent `401`/`403` responses or transport retries.
- Tags are stable and standardized across services.
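The timestamp check from the list above can be automated. A minimal sketch that flags clock drift between an event's timestamp and its ingestion time; the 5-minute tolerance is an assumed threshold, not a product default:

```typescript
// Flag events whose timestamp drifts too far from ingestion time.
// Persistent drift usually means an unsynced host clock (check NTP).
function hasClockDrift(
  eventTimestampMs: number,
  ingestedAtMs: number,
  toleranceMs = 5 * 60 * 1000, // 5 minutes — illustrative tolerance
): boolean {
  return Math.abs(ingestedAtMs - eventTimestampMs) > toleranceMs;
}
```

Run a check like this over a sample of recent events before advancing the rollout gate.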
Step 5: Add Runtime Integration
Connect your actual runtime early.
Goal: incident context should include where the workload ran, not only app-level events.
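As one example of "where the workload ran", a Node.js service can attach host-level context alongside app events. The field names and the `POD_NAME` variable (commonly injected via the Kubernetes downward API) are illustrative:

```typescript
import * as os from "os";

// Illustrative runtime context — map these fields to whatever your
// runtime integration expects.
interface RuntimeContext {
  hostname: string;
  platform: string;
  podName?: string; // assumed to be injected by the orchestrator, if any
}

function collectRuntimeContext(
  vars: Record<string, string | undefined>,
): RuntimeContext {
  return {
    hostname: os.hostname(),
    platform: os.platform(),
    podName: vars["POD_NAME"],
  };
}
```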
Step 6: Attach CI/CD Context
Integrate GitHub Actions so telemetry can be read together with release events.
Minimum release context fields to propagate:
- commit SHA
- build ID
- deploy timestamp
- environment
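In a GitHub Actions deploy step, the four fields above can be assembled from the runner's built-in variables (`GITHUB_SHA` and `GITHUB_RUN_ID` are real Actions variables; the output shape is illustrative):

```typescript
// Minimum release context to propagate with each deploy.
interface ReleaseContext {
  commitSha: string;
  buildId: string;
  deployedAt: string; // ISO 8601 deploy timestamp
  environment: string;
}

function buildReleaseContext(
  vars: Record<string, string | undefined>,
  environment: string,
): ReleaseContext {
  return {
    commitSha: vars["GITHUB_SHA"] ?? "unknown",
    buildId: vars["GITHUB_RUN_ID"] ?? "unknown",
    deployedAt: new Date().toISOString(),
    environment,
  };
}
```

Emit this object as a release event at deploy time so telemetry spikes can be lined up against exactly what shipped.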
Without release context, root-cause analysis wastes time reconstructing what shipped, and when.
Step 7: Enable AI Workflows (After Baseline Is Clean)
- Use floating Ask AI button in docs for contextual help.
- Publish machine-readable context at `/llm.txt` and `/mcp.json`.
- Review MCP for agent integrations.
AI workflows are only as good as your telemetry quality. Fix instrumentation quality first.
Common Mistakes
- Starting with too many services at once.
- Missing `service`/`env`/`version` tags.
- Reusing a single key across all environments.
- Rolling out before validating ingestion quality.
- Treating docs as reference-only instead of an operations playbook.
Next Paths
- Architecture and mental model: Introduction
- Operational rollout sequence: How to use
- Runtime decision criteria: Integration Matrix