Guide

SASE Day-2 Operations: Monitoring, Troubleshooting, and Tuning

February 16, 2026 (6 min read)

TL;DR

Day-2 ops is where SASE deployments succeed or fail. Set up DEM baselines in week one. Review the TLS bypass list monthly, because it only grows unless you prune it. Run quarterly policy audits against your application inventory. Track three metrics: P95 latency to your top 10 SaaS apps, policy exception count, and mean time to resolve access issues. If any metric trends in the wrong direction for two consecutive weeks, investigate immediately.

Every SASE vendor sells you the deployment. Nobody talks about what happens on day 31. The platform is live, users are flowing through SSE PoPs, SD-WAN tunnels are up, and the project team moves to the next initiative. Six months later, the TLS bypass list has grown from 12 entries to 47, half the DLP policies fire so many false positives that the SOC ignores them, and nobody can explain why latency to Salesforce increased by 40ms last Thursday. Day-2 operations is the discipline that prevents this entropy. It is not glamorous, but it is the difference between a SASE deployment that delivers ongoing value and one that slowly decays into an expensive pipe.

Establish baselines in week one

The single most important day-2 task is establishing performance baselines before the project team disbands. Use your DEM (Digital Experience Monitoring) tooling to capture baseline metrics for your top 10 SaaS applications by user count: P50 and P95 latency, DNS resolution time, TLS handshake time, and time-to-first-byte. Also capture baseline metrics for your top 5 private applications accessed through ZTNA. Store these baselines in a shared document or dashboard that the operations team can reference for the next 12 months.
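As a rough illustration of what a stored baseline could contain, the sketch below reduces a list of raw latency samples to P50 and P95. It assumes you can export per-application samples in milliseconds from your DEM tool; the function name and the input format are hypothetical, not any vendor's API.

```python
from statistics import quantiles


def summarize_latency(samples_ms):
    """Return the P50 and P95 of a list of latency samples (milliseconds)."""
    if len(samples_ms) < 2:
        raise ValueError("need at least two samples to compute percentiles")
    # quantiles(n=100) returns the 99 cut points P1..P99.
    # "inclusive" treats the samples as the full observed population.
    cuts = quantiles(samples_ms, n=100, method="inclusive")
    return {"p50": cuts[49], "p95": cuts[94]}
```

The same reduction can be applied to DNS resolution time, TLS handshake time, and time-to-first-byte, giving one small record per application to keep in the shared baseline document.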

Why baselines matter: when a user reports that Salesforce is slow, you need to compare current P95 latency against the baseline. If the baseline was 180ms and current is 185ms, the problem is not SASE. If current is 340ms, something changed — a PoP routing issue, a policy change that added inspection overhead, or an ISP path change. Without baselines, every performance complaint becomes a finger-pointing exercise between the network team, the security team, and the SaaS vendor.
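The comparison described above is simple arithmetic, sketched here with the Salesforce numbers from the example. The 10% tolerance is borrowed from the troubleshooting sequence later in this guide; the function names are illustrative only.

```python
def deviation_pct(baseline_ms, current_ms):
    """Percentage change of current latency relative to the baseline."""
    if baseline_ms <= 0:
        raise ValueError("baseline must be positive")
    return (current_ms - baseline_ms) / baseline_ms * 100.0


def assess(baseline_ms, current_ms, tolerance_pct=10.0):
    """Classify a latency reading against its baseline.

    tolerance_pct is an assumed default; tune it to your environment.
    """
    if deviation_pct(baseline_ms, current_ms) <= tolerance_pct:
        return "within baseline"
    return "investigate"
```

With a 180ms baseline, a current reading of 185ms is about 2.8% above baseline and falls within tolerance, while 340ms is about 89% above baseline and warrants investigation.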

Operational cadence

Task	Frequency	Owner	Tool
Review DEM latency dashboards for top 10 apps	Weekly	Network ops	DEM dashboard (ThousandEyes, Zscaler DEX, Netskope Proactive DEM)
Review DLP incident queue — triage true positives	Daily → weekly	Security ops	SSE DLP dashboard
Review CASB shadow IT report — new unsanctioned apps	Weekly	Security ops	CASB discovery dashboard
Check tunnel health across all branch sites	Weekly	Network ops	SD-WAN orchestrator
Review posture non-compliance — devices failing policy	Weekly	Endpoint team	ZTNA posture dashboard
Audit TLS bypass list — remove entries no longer needed	Monthly	Security ops	SSE policy console
Full policy audit against application inventory	Quarterly	Security architect	SSE policy console + CMDB
Failover testing — controlled link kills at 2-3 sites	Quarterly	Network ops	SD-WAN + SSE dashboards

TLS bypass list management

The TLS inspection bypass list is the single biggest source of security debt in SASE deployments. During deployment, every application that breaks under TLS inspection gets added to the bypass list as a quick fix. Six months later, you have 40-50 domains bypassing inspection, and nobody remembers why half of them were added. Some of those domains no longer need the bypass: the application vendor fixed the certificate pinning issue in a subsequent update, or the application was replaced by a web-based alternative.

Implement a monthly review process: export the bypass list, check each entry against a justification document (why was it added? what breaks with inspection enabled?), and test re-enabling inspection for entries older than 6 months. Many will pass without issues because the underlying application was patched or upgraded. Set a target: the bypass list should contain fewer than 20 entries for a typical enterprise. If yours is above 40, you have a hygiene problem.

Every bypass entry should have an owner (the application team responsible), a justification (the specific technical reason inspection breaks), an expiration date (when to re-test), and a remediation plan (what needs to change for inspection to work). Without this documentation, the bypass list becomes permanent — and every bypassed domain is a domain where your DLP, malware scanning, and URL filtering are blind.
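One way to make these four fields enforceable is to keep the bypass list as structured records and flag entries during the monthly review. The sketch below is a minimal example under assumptions: the record layout, field names, and the 183-day cutoff (roughly 6 months) are hypothetical, and the export from your SSE policy console will look different.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class BypassEntry:
    domain: str
    added_on: date
    owner: Optional[str] = None             # responsible application team
    justification: Optional[str] = None     # why inspection breaks
    retest_by: Optional[date] = None        # expiration / re-test date
    remediation_plan: Optional[str] = None  # what must change


def review_flags(entry, today, max_age_days=183):
    """Return the reasons an entry needs attention in the monthly review."""
    flags = []
    for name in ("owner", "justification", "retest_by", "remediation_plan"):
        if not getattr(entry, name):
            flags.append(f"missing {name}")
    if entry.retest_by and entry.retest_by <= today:
        flags.append("re-test overdue")
    if today - entry.added_on > timedelta(days=max_age_days):
        flags.append("older than 6 months")
    return flags
```

An entry that returns an empty list is fully documented and not yet due; anything else goes on the review agenda with the listed reasons.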

Troubleshooting slow applications

When a user reports slow application access through SASE, follow this diagnostic sequence:

1. Check DEM end-to-end path visualization. Identify which segment is slow: endpoint to PoP, PoP inspection latency, PoP to application, or application response time. This immediately narrows the investigation from 'SASE is slow' to a specific segment.
2. Compare current latency against baseline. If within 10% of baseline, the problem is likely not SASE. Direct the user to the application team or their local ISP.
3. Check for recent policy changes. A new TLS inspection rule, DLP policy, or URL category change can add latency. Correlate the user's complaint timeline with the policy change log.
4. Check PoP health. If multiple users at the same site report slowness, check if the primary SSE PoP is degraded. Look at the vendor's status page and your tunnel health metrics.
5. Check SD-WAN path selection. If the issue is site-specific, the SD-WAN may have failed over to a secondary path with higher latency. Check path selection logs and WAN link health.
6. Packet capture as last resort. If the above steps do not identify the issue, capture at the endpoint and at the ZTNA connector to compare. Look for TCP retransmissions, TLS handshake failures, or DNS resolution delays.
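The first step of this sequence amounts to finding the path segment whose latency grew the most. A minimal sketch, assuming your DEM tool can report per-segment latency in milliseconds for both the baseline and the current measurement; the segment keys are hypothetical names for the four segments listed above.

```python
SEGMENTS = ("endpoint_to_pop", "pop_inspection", "pop_to_app", "app_response")


def slowest_segment(baseline, current):
    """Return (segment, increase_ms) for the segment that degraded the most.

    Both arguments are dicts mapping each name in SEGMENTS to latency in ms.
    """
    deltas = {name: current[name] - baseline[name] for name in SEGMENTS}
    worst = max(deltas, key=deltas.get)
    return worst, deltas[worst]
```

If the largest increase sits in the PoP-to-application segment, for instance, the investigation moves to PoP health and routing rather than to the endpoint or the inspection policy.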

Policy tuning after deployment

SASE policies are not set-and-forget. Application landscapes change, new SaaS tools are adopted, departments reorganize, and threat patterns evolve. Run a full policy audit quarterly. Compare your SSE policy set against your current application inventory: are all applications covered? Are there policies for applications that were decommissioned (orphaned rules)? Are DLP patterns still aligned with your data classification scheme?
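The coverage and orphaned-rule checks reduce to two set differences. The sketch below assumes you can export the application names referenced by SSE policies and the application names in your inventory (for example from the CMDB) as plain lists, and that names match after lowercasing; real exports usually need more normalization than this.

```python
def audit_policies(policy_apps, inventory_apps):
    """Compare applications referenced by policies against the inventory."""
    policy = {name.strip().lower() for name in policy_apps}
    inventory = {name.strip().lower() for name in inventory_apps}
    return {
        # In the inventory, but no policy references them.
        "uncovered": sorted(inventory - policy),
        # Referenced by a policy, but no longer in the inventory.
        "orphaned": sorted(policy - inventory),
    }
```

Both lists should be reviewed by a person before any rule is changed, since a name mismatch between the two exports looks identical to a genuine gap.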

The most common post-deployment policy issues: DLP false positive fatigue (overly broad regex patterns that flag legitimate business documents — tune the patterns, do not disable DLP), CASB shadow IT noise (hundreds of low-risk apps generating alerts — create a sanctioned app list and only alert on truly risky categories like file sharing and AI tools), and ZTNA posture drift (devices gradually falling out of compliance as OS updates lag — work with the endpoint team on enforcement timelines, not just reporting).

Operational metrics dashboard

Build a single-pane dashboard with these metrics. Review it weekly as a team:

Metric	Target	Red flag
P95 latency — top 10 SaaS apps	Within 15% of baseline	> 25% above baseline for 2+ days
SSE tunnel uptime	> 99.9%	Any tunnel below 99.5% in a week
Policy exception count (TLS bypasses)	< 20	> 40 or growing month-over-month
DLP true positive rate	> 60%	< 30% (noise is drowning real incidents)
ZTNA posture compliance	> 95% of devices	< 85% (too many non-compliant devices accessing apps)
Mean time to resolve access issues	< 30 minutes	> 2 hours average
SD-WAN path failover events per week	< 5 per site	> 20 per site (unstable WAN links)
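The point-in-time thresholds in this table can be encoded directly, which helps keep the weekly review consistent. The sketch below covers the six rows with absolute thresholds; the metric keys and the "amber" label for values between target and red flag are assumptions, and the time-based conditions (2+ days, month-over-month growth) and the baseline-relative latency row are deliberately left out.

```python
from typing import NamedTuple


class Threshold(NamedTuple):
    target: float
    red: float
    higher_is_better: bool


# Values taken from the table above; keys are illustrative names.
THRESHOLDS = {
    "tunnel_uptime_pct": Threshold(99.9, 99.5, True),
    "tls_bypass_count": Threshold(20, 40, False),
    "dlp_true_positive_pct": Threshold(60, 30, True),
    "posture_compliance_pct": Threshold(95, 85, True),
    "mttr_minutes": Threshold(30, 120, False),
    "failovers_per_site_week": Threshold(5, 20, False),
}


def status(metric, value):
    """Classify a metric reading as green, amber, or red."""
    t = THRESHOLDS[metric]
    if t.higher_is_better:
        if value > t.target:
            return "green"
        if value < t.red:
            return "red"
    else:
        if value < t.target:
            return "green"
        if value > t.red:
            return "red"
    return "amber"  # between target and red flag
```

For example, a bypass list of 47 entries is classified red, 30 entries amber, and 12 entries green.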

Sources & further reading

Gartner, "Best Practices for SASE Operations" — gartner.com/reviews/market/single-vendor-sase
Zscaler, "Digital Experience Monitoring Best Practices" — zscaler.com/products/digital-experience-monitoring
Cisco ThousandEyes, "Network Troubleshooting Guide" — thousandeyes.com/resources
Netskope, "SSE Policy Management Guide" — netskope.com/products/security-service-edge

Frequently asked questions

On staffing: for a 1,000-user deployment with 50+ branch sites, budget 0.5 FTE for the first 6 months (weekly reviews, policy tuning, troubleshooting), dropping to 0.25 FTE once operational processes are mature. For a 5,000+ user deployment, budget 1 FTE dedicated to SASE operations for the first year. This person straddles the network and security teams and owns the DEM dashboard, policy lifecycle, and vendor relationship.

A common day-2 mistake is treating the TLS bypass list as permanent. During deployment, applications that break under TLS inspection get added to the bypass list as a quick fix. If nobody reviews this list, it grows to 40-50 domains within 6 months, creating blind spots where your DLP, malware scanning, and URL filtering cannot see traffic. Implement a monthly review with re-testing and you will find that 30-40% of bypass entries can be removed because the application was patched or replaced.

To measure the value of the deployment, track three metrics: number of security incidents blocked (SWG malware blocks, DLP data leak prevention, CASB unsanctioned app blocks), mean time to detect and respond to incidents (SASE centralizes logs and correlation), and infrastructure cost savings (decommissioned VPN concentrators, proxies, and MPLS circuits). Most organizations see a 30-50% reduction in security incident volume and a 40-60% reduction in remote access helpdesk tickets within 6 months of full SASE deployment.

For dashboards, start with the vendor's own: they are purpose-built and require no configuration. Within 3-6 months you will outgrow them because you need to correlate SASE metrics with non-SASE data (SIEM alerts, endpoint telemetry, application performance monitoring). At that point, export SASE logs and DEM metrics via API to your SIEM or observability platform (Splunk, Elastic, Datadog) and build unified dashboards. Budget 2-4 weeks of engineering time for the initial integration.
