OBSERVABILITY MATURITY SELF-ASSESSMENT

* 1. What is your primary job title?

CIO / Chief Information Officer

CTO / Chief Technology Officer

VP of Engineering

VP of IT Operations

Director of Engineering / Platform Engineering

Director of IT Operations / SRE Manager

Cloud Architect / Principal Engineer

Other (please specify)

* 2. How many employees does your organization have?

500 to 999

1,000 to 4,999

5,000 to 14,999

15,000 to 49,999

50,000 or more

* 3. Which industry best describes your organization?

Financial Services / Insurance

Retail / Consumer Goods / eCommerce

Healthcare / Life Sciences

Technology / Software

Manufacturing / Industrial

Energy / Utilities

Media / Entertainment

Government / Public Sector

Other (please specify)

* 4. Which cloud platforms does your organization currently use? (Select all that apply)

AWS

Microsoft Azure

Google Cloud Platform

On-premises / Private Cloud (primary)

Hybrid (cloud + on-premises)

Multi-cloud (two or more public cloud providers)

* 5. Which observability platforms is your organization currently using? (Select all that apply)

New Relic

Datadog

Coralogix

Dynatrace

AppDynamics (Cisco)

Elastic Observability

Prometheus / Grafana (open-source stack)

Splunk

AWS CloudWatch (primary tool)

Azure Monitor (primary tool)

We do not currently use a dedicated observability platform

Other

* 6. Approximately what percentage of your production services and applications have observability agents or instrumentation deployed?

Less than 25%

25% to 49%

50% to 74%

75% to 89%

90% or more

* 7. Which telemetry types does your organization currently collect? (Select all that apply)

Infrastructure metrics (CPU, memory, disk, network)

Application performance monitoring (APM) metrics (response times, error rates, throughput)

Distributed traces (end-to-end transaction tracing)

Structured application logs

Unstructured / raw logs

Synthetic monitoring (simulated user journeys)

Real User Monitoring (RUM / browser and mobile)

Business metrics (revenue, conversion, customer experience KPIs)

Security events (SIEM integration)

* 8. How would you rate your organization's observability coverage of the following areas?

Rating scale: None, Limited, Adequate, Thorough, Comprehensive

• Core production services

• Supporting / internal services

• Third-party integrations and dependencies

• Mobile and web front-end

• Data pipelines and batch processing

* 9. Does your organization currently implement Observability-as-Code (defining dashboards, alerts, and SLOs in version-controlled configuration files)?

Yes — fully implemented with version control and CI/CD integration

Yes — partially (some elements managed as code, others manual)

We are planning to implement this in the next 12 months

No — all observability configuration is managed manually

* 10. What is the primary barrier to expanding your observability instrumentation coverage?

Engineering capacity — we don't have the bandwidth to instrument more services

Skills gap — we lack expertise to implement instrumentation correctly

Tool cost — licensing costs are limiting our ability to expand coverage

Organizational complexity — difficult to get alignment across teams

Legacy systems — older systems are difficult to instrument

No clear ownership — unclear who is responsible for observability

We don't believe we have a coverage gap

* 11. How many active alert rules does your organization currently have configured across all observability tools?

Fewer than 50

50 to 199

200 to 499

500 to 999

1,000 or more

We do not know

* 12. Approximately what percentage of alerts your team receives in a typical week are genuinely actionable (require an actual response)?

Less than 15% — most alerts are noise

15% to 30%

31% to 50%

51% to 75%

More than 75% — our alerts are well-tuned

* 13. How does your team currently prioritize incidents and alerts?

By technical severity only (CPU usage, error rate thresholds)

By service tier (Tier 1/2/3 classification)

By customer or business impact (estimated revenue or user impact)

By SLO/error budget burn rate

We do not have a formal prioritization framework

* 14. Does your organization have formally defined Service Level Objectives (SLOs) with associated error budgets?

Yes — defined, actively monitored, and informing engineering decisions

Yes — defined, but rarely reviewed or acted upon

Partially — some services have SLOs, most do not

We are planning to implement SLOs in the next 12 months

No — we do not use SLOs

* 15. What is your organization's current mean time to detect (MTTD) for Priority 1 production incidents?

We detect before customer impact (proactive)

Less than 5 minutes

5 to 15 minutes

15 to 30 minutes

More than 30 minutes

We typically find out from customers first

* 16. What is your organization's current mean time to resolve (MTTR) for Priority 1 production incidents?

Less than 15 minutes

15 to 30 minutes

30 to 60 minutes

1 to 4 hours

More than 4 hours

* 17. Do your alert definitions include documented runbooks or remediation guidance?

Yes — all alerts link to runbooks

Yes — most alerts link to runbooks (more than 75%)

Partially — fewer than half of alerts have runbooks

No — alerts exist without remediation documentation

* 18. Can your team trace a customer's end-to-end journey through your production environment in real time?

Yes — full distributed tracing across all services in the customer journey

Partially — we can trace some services but have gaps

Only for specific high-priority journeys (e.g., checkout, login)

No — we have limited ability to trace customer journeys end-to-end

* 19. When a Priority 1 incident occurs, can your team immediately quantify the business impact (revenue affected, users impacted)?

Yes — we have real-time business impact dashboards

Approximately — we can estimate within 30 minutes

We can calculate it but it takes more than 30 minutes

We typically cannot quantify business impact in real time

We have never attempted to quantify business impact during incidents

* 20. How are observability dashboards and data consumed by non-engineering business stakeholders?

Executive dashboards are regularly reviewed by C-suite and VP-level leaders

Business stakeholders have self-service access to relevant metrics

Engineering provides on-request reports to business stakeholders

Business stakeholders rarely or never see observability data

Business stakeholders are not aware we have observability tooling

* 21. Does your organization connect observability data to cloud cost management (FinOps)?

Yes — we have integrated observability and FinOps dashboards

Partially — some cost data is visible in our observability platform

No — cost management and observability are handled separately

We do not have a formal FinOps practice

* 22. How would you rate your observability practice's contribution to the following business outcomes?

Rating scale: No Contribution, Minimal Contribution, Moderate Contribution, Strong Contribution, Significant Contribution

• Reducing unplanned downtime costs

• Accelerating new feature delivery

• Improving customer satisfaction scores (CSAT/NPS)

• Informing cloud cost optimization decisions

• Supporting compliance and audit readiness

* 23. How does your organization currently manage on-call responsibilities?

Formal on-call rotation with SLA-backed escalation procedures

On-call rotation exists but escalation procedures are informal

A small core team handles all incidents informally

Individual service owners are on-call for their own services only

There is no formal on-call process

* 24. Does your organization conduct formal post-incident reviews (blameless retrospectives) after major incidents?

Yes — after every P1 incident, with documented findings shared broadly

Yes — but inconsistently, only for the most severe incidents

Informally, without documented findings or follow-through

No — we do not have a formal post-incident review process

* 25. How would you describe the state of observability ownership within your engineering organization?

Centralized: a dedicated platform engineering or SRE team owns observability

Federated: ownership is distributed with a central team setting standards

Siloed: each team manages their own observability independently

Unclear: ownership is not formally defined

Outsourced: a third party manages our observability platform

* 26. How frequently does your team proactively review observability data outside of incident response?

Daily — team reviews dashboards and trends every day

Weekly — formal weekly review meetings using observability data

Monthly — periodic reviews only

On-demand — only when investigating an issue

Rarely or never

* 27. What percentage of your team would you estimate is actively using your observability platform at least weekly?

More than 75% — widespread adoption across the engineering org

50% to 75%

25% to 49%

Less than 25% — limited to a small subset of the team

We do not track platform adoption

* 28. How would you rate the observability-related skills and knowledge within your current engineering team?

Rating scale: Very Limited, Basic Awareness, Moderate Capability, Strong Capability, Advanced / Highly Skilled

• Instrumentation and agent configuration

• Dashboard design and data visualization

• SLO definition and error budget management

• Distributed tracing implementation

• Observability platform administration and optimization

* 29. How satisfied is your organization with the current ROI from your observability platform investment?

Very Dissatisfied - We are not getting value

Dissatisfied

Neutral / Moderate Value

Satisfied

Very Satisfied - ROI is clear and measurable

* 30. How much does your organization spend annually on observability platforms and tooling (including licenses, infrastructure, and related tools)?

Less than $100K

$100K to $499K

$500K to $999K

$1M to $4.9M

$5M or more

We do not track observability-specific spend separately

* 31. Approximately what percentage of your purchased observability platform capacity (licenses, ingest, etc.) is actively utilized?

Less than 30% — significant unused capacity

30% to 49%

50% to 74%

75% to 89%

90% or more — we are near or at capacity limits

* 32. Has your organization experienced significant observability data ingest cost overruns in the past 12 months?

Yes — significant overruns requiring budget reallocation

Yes — minor overruns managed within existing budget

No — ingest costs are predictable and within budget

We do not actively track observability ingest costs

* 33. How many distinct monitoring and observability tools (including APM, logging, infrastructure monitoring, AIOps) does your organization currently use?

1 to 2 tools (consolidated)

3 to 4 tools

5 to 7 tools

8 to 10 tools

More than 10 tools

* 34. Which of the following best describes your organization's observability platform strategy over the next 12 to 24 months?

Consolidate onto fewer platforms (rationalization in progress)

Stay with current platform mix — optimize in place

Expand with additional tools or platforms

Evaluate and potentially replace primary platform

No formal observability platform strategy defined

* 35. How is your observability budget expected to change over the next 12 months?

Significant increase (more than 20%)

Moderate increase (5% to 20%)

Flat — approximately the same

Decrease

Observability budget is not separately tracked

* 36. Which observability capabilities are your highest investment priorities for the next 12 months? (Rank top 3)

* 37. Is your organization currently evaluating or using AI-powered features in your observability platform?

Yes — actively using AI features with measurable value

Yes — piloting or evaluating AI features

No — planning to evaluate in the next 12 months

No — not a current priority

We are skeptical about the ROI of AI in IT operations

* 38. Does your organization currently use or plan to use managed observability services (outsourcing platform management and/or incident response)?

Yes — we currently use a managed observability service provider

Actively evaluating managed services

Planning to evaluate in the next 12 months

No — we manage observability entirely in-house

We were not aware this option existed

* 39. What is the single biggest barrier preventing your organization from advancing its observability maturity?

Lack of internal expertise and skills

Insufficient engineering bandwidth

Tool complexity — existing tools are difficult to use optimally

Budget constraints

Organizational silos — lack of cross-team alignment

No executive sponsor for observability investment

Legacy technology limiting instrumentation possibilities

Unclear ROI — difficult to justify investment

* 40. How prepared is your organization for AI-powered IT operations that require high-quality observability foundations?

Not at All Prepared: Our observability foundation is minimal or fragmented, not AI supported.

Slightly Prepared: Some observability tools or practices exist, but they are inconsistent or limited.

Moderately Prepared: Basic observability practices (metrics, logs, monitoring) are in place.

Well Prepared: Strong observability established; consistent telemetry, monitoring, and visibility.

Fully Prepared: Our observability maturity is ready to enable AI-powered operations.