SLA Monitoring Endpoints

Endpoints Overview

| Method | Endpoint | Description | Auth required |
|--------|----------|-------------|---------------|
| GET | /api/sla/status | Get the current SLA status | ❌ No |
| GET | /api/sla/report | Get the full SLA report | ❌ No |
| GET | /api/sla/metrics | Get the core SLA metrics | ❌ No |
| GET | /api/sla/availability | Get availability metrics | ❌ No |
| GET | /api/sla/response-times | Get response time metrics | ❌ No |
| GET | /api/sla/error-rate | Get error rate metrics | ❌ No |

GET /api/sla/status

Returns the current SLA status and details of any breaches.

Request

Headers:

| Header | Value | Required |
|--------|-------|----------|
| Accept | application/json | ✅ Yes |

Query parameters: none

Body parameters: none (GET request)

Response

Success (200 OK):

{
  "success": true,
  "data": {
    "status": "healthy",
    "breaches": [],
    "timestamp": "2025-10-06T12:00:00Z",
    "healthy": true
  },
  "message": "SLA status retrieved successfully"
}

Possible statuses:

  • healthy - All SLA metrics are within target
  • warning - Approaching an SLA breach
  • critical - An SLA breach has been detected
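A client may want to validate the `status` field at runtime and map it to an alert severity. The sketch below is one way to do that; `isSLAStatusValue`, `severityFor` and the `'none' | 'medium' | 'high'` scale are hypothetical names chosen for illustration, not part of the API.

```typescript
type SLAStatusValue = 'healthy' | 'warning' | 'critical';
type Severity = 'none' | 'medium' | 'high';

const KNOWN_STATUSES: readonly SLAStatusValue[] = ['healthy', 'warning', 'critical'];

// Runtime check: narrows an arbitrary value to one of the documented statuses.
function isSLAStatusValue(value: unknown): value is SLAStatusValue {
  return (
    typeof value === 'string' &&
    (KNOWN_STATUSES as readonly string[]).indexOf(value) !== -1
  );
}

// Assumed mapping from status to alert severity (not defined by the API).
const SEVERITY_BY_STATUS: Record<SLAStatusValue, Severity> = {
  healthy: 'none',
  warning: 'medium',
  critical: 'high'
};

function severityFor(status: SLAStatusValue): Severity {
  return SEVERITY_BY_STATUS[status];
}
```

Rejecting unknown values early keeps a UI badge or alert rule from silently treating an unexpected status as healthy.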

Example with breaches:

{
  "success": true,
  "data": {
    "status": "critical",
    "breaches": [
      "Availability below 99.9% (current: 99.5%)",
      "P95 response time exceeds 2000ms (current: 2500ms)"
    ],
    "timestamp": "2025-10-06T12:00:00Z",
    "healthy": false
  }
}

cURL example

curl -X GET "http://localhost:8080/api/sla/status" \
  -H "Accept: application/json"

TypeScript example

interface SLAStatus {
  status: 'healthy' | 'warning' | 'critical';
  breaches: string[];
  timestamp: string;
  healthy: boolean;
}

async function getSLAStatus(): Promise<SLAStatus> {
  const response = await fetch('/api/sla/status');
  const { data } = await response.json();
  return data;
}

// Usage for real-time monitoring
async function monitorSLAStatus() {
  const status = await getSLAStatus();

  if (!status.healthy) {
    console.error(`⚠️ SLA Status: ${status.status}`);
    status.breaches.forEach(breach => {
      console.error(`  - ${breach}`);
    });

    // Send an alert (sendAlert is assumed to be defined by the application)
    await sendAlert({
      severity: status.status === 'critical' ? 'high' : 'medium',
      message: `SLA breaches detected: ${status.breaches.length}`,
      details: status.breaches
    });
  } else {
    console.log('✅ SLA Status: Healthy');
  }
}
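The example above reads `data` without checking the HTTP status or the `success` flag, and it does not send the `Accept` header that the request table lists as required. A minimal sketch of a shared helper follows. It assumes that every endpoint wraps its payload in the `{ success, data, message }` envelope shown in the sample responses and that any non-2xx status should be treated as a failure; `fetchSLAData`, `unwrapEnvelope` and `FetchLike` are hypothetical names.

```typescript
// Envelope shape taken from the sample responses on this page.
interface SLAEnvelope<T> {
  success: boolean;
  data: T;
  message?: string;
}

// Minimal structural types so the sketch does not depend on DOM typings.
interface HttpResponseLike {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}

type FetchLike = (
  url: string,
  init: { headers: Record<string, string> }
) => Promise<HttpResponseLike>;

// Pure step: validate the envelope and return its `data` field.
function unwrapEnvelope<T>(body: unknown): T {
  if (typeof body !== 'object' || body === null) {
    throw new Error('SLA response body is not a JSON object');
  }
  const envelope = body as Partial<SLAEnvelope<T>>;
  if (envelope.success !== true || envelope.data === undefined) {
    throw new Error('SLA response did not contain a successful envelope');
  }
  return envelope.data;
}

// Hypothetical helper: pass the global `fetch` (or a test double) as `fetchImpl`.
async function fetchSLAData<T>(path: string, fetchImpl: FetchLike): Promise<T> {
  const response = await fetchImpl(path, {
    headers: { Accept: 'application/json' }
  });
  if (!response.ok) {
    throw new Error(`SLA request to ${path} failed with HTTP ${response.status}`);
  }
  return unwrapEnvelope<T>(await response.json());
}
```

With such a helper, `getSLAStatus` could be reduced to `fetchSLAData<SLAStatus>('/api/sla/status', fetch)`, and the same call shape would work for the other endpoints.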

React component example

import { useState, useEffect } from 'react';

function SLAStatusBadge() {
  const [status, setStatus] = useState<SLAStatus | null>(null);

  useEffect(() => {
    fetchStatus();
    const interval = setInterval(fetchStatus, 60000); // Every 60 seconds
    return () => clearInterval(interval);
  }, []);

  async function fetchStatus() {
    try {
      const data = await getSLAStatus();
      setStatus(data);
    } catch (error) {
      console.error('Failed to fetch SLA status:', error);
    }
  }

  if (!status) return null;

  const badgeClass = status.healthy ? 'badge-success' : 'badge-danger';
  const icon = status.healthy ? '✅' : '⚠️';

  return (
    <div className={`sla-badge ${badgeClass}`}>
      <span className="icon">{icon}</span>
      <span className="status">{status.status.toUpperCase()}</span>
      {status.breaches.length > 0 && (
        <span className="breach-count">{status.breaches.length}</span>
      )}
    </div>
  );
}

GET /api/sla/report

Returns the full, detailed SLA report with historical data.

Request

Headers:

| Header | Value | Required |
|--------|-------|----------|
| Accept | application/json | ✅ Yes |

Query parameters: none

Body parameters: none (GET request)

Response

Success (200 OK):

{
  "success": true,
  "data": {
    "period": {
      "start": "2025-09-06T00:00:00Z",
      "end": "2025-10-06T23:59:59Z",
      "durationDays": 30
    },
    "summary": {
      "availability": 99.95,
      "uptime": "29d 23h 36m",
      "downtime": "24m",
      "errorRate": 0.08,
      "totalRequests": 15000000,
      "failedRequests": 12000
    },
    "targets": {
      "availability": 99.9,
      "maxErrorRate": 0.1,
      "p95ResponseTime": 2000,
      "p99ResponseTime": 5000
    },
    "compliance": {
      "availability": true,
      "errorRate": true,
      "responseTimeP95": true,
      "responseTimeP99": true,
      "overallCompliance": true
    },
    "incidents": [
      {
        "timestamp": "2025-09-15T14:30:00Z",
        "duration": "15 minutes",
        "impact": "Database connection timeout",
        "affected": ["user-api", "admin-api"],
        "resolved": true
      }
    ],
    "timestamp": "2025-10-06T12:00:00Z"
  },
  "message": "SLA report generated successfully"
}
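The `summary`, `targets` and `compliance` objects are related, so a client can cross-check them. The sketch below recomputes the error rate from the request counts and re-derives the two compliance flags that depend only on `summary` and `targets`. It assumes the comparisons are `availability ≥ target` and `errorRate ≤ maxErrorRate`; the function and interface names are hypothetical.

```typescript
interface SummaryLike {
  availability: number;
  errorRate: number;
  totalRequests: number;
  failedRequests: number;
}

interface TargetsLike {
  availability: number;
  maxErrorRate: number;
}

// Error rate in percent, recomputed from the raw request counts.
function errorRateFromCounts(failedRequests: number, totalRequests: number): number {
  return totalRequests === 0 ? 0 : (failedRequests / totalRequests) * 100;
}

// Re-derives the availability and error-rate compliance flags
// (assumed comparisons: availability >= target, errorRate <= maxErrorRate).
function deriveCompliance(summary: SummaryLike, targets: TargetsLike) {
  return {
    availability: summary.availability >= targets.availability,
    errorRate: summary.errorRate <= targets.maxErrorRate
  };
}
```

For the sample response, 12,000 failed requests out of 15,000,000 give an error rate of 0.08%, which matches `summary.errorRate`, and both derived flags come out as `true`.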

cURL example

curl -X GET "http://localhost:8080/api/sla/report" \
  -H "Accept: application/json"

TypeScript example

interface SLAReport {
  period: {
    start: string;
    end: string;
    durationDays: number;
  };
  summary: {
    availability: number;
    uptime: string;
    downtime: string;
    errorRate: number;
    totalRequests: number;
    failedRequests: number;
  };
  targets: {
    availability: number;
    maxErrorRate: number;
    p95ResponseTime: number;
    p99ResponseTime: number;
  };
  compliance: {
    availability: boolean;
    errorRate: boolean;
    responseTimeP95: boolean;
    responseTimeP99: boolean;
    overallCompliance: boolean;
  };
  incidents: Array<{
    timestamp: string;
    duration: string;
    impact: string;
    affected: string[];
    resolved: boolean;
  }>;
  timestamp: string;
}

async function getSLAReport(): Promise<SLAReport> {
  const response = await fetch('/api/sla/report');
  const { data } = await response.json();
  return data;
}

// Build the payload for a PDF report (generatePDF is assumed to be provided by the application)
async function generatePDFReport() {
  const report = await getSLAReport();

  const pdfData = {
    title: 'SLA Compliance Report',
    period: `${report.period.start} - ${report.period.end}`,
    sections: [
      {
        title: 'Summary',
        data: report.summary
      },
      {
        title: 'Compliance Status',
        data: report.compliance
      },
      {
        title: 'Incidents',
        data: report.incidents
      }
    ]
  };

  // Send to PDF generation service
  await generatePDF(pdfData);
}

GET /api/sla/metrics

Returns the core SLA metrics in real time.

Request

Headers:

| Header | Value | Required |
|--------|-------|----------|
| Accept | application/json | ✅ Yes |

Query parameters: none

Body parameters: none (GET request)

Response

Success (200 OK):

{
  "success": true,
  "data": {
    "availability": 99.95,
    "uptime": "29d 23h 36m",
    "errorRate": 0.08,
    "responseTimeP50": "35ms",
    "responseTimeP95": "150ms",
    "responseTimeP99": "450ms",
    "timestamp": "2025-10-06T12:00:00Z"
  },
  "message": "SLA metrics retrieved successfully"
}
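This endpoint returns the percentiles only as strings such as `"150ms"`, so a client that wants to compare or chart them needs numbers. A minimal parsing sketch, assuming the unit is always `ms` as in the sample response (`parseMilliseconds` is a hypothetical helper):

```typescript
// Parses strings such as "35ms" or "450ms" into a number of milliseconds.
// Assumption: the unit suffix is always "ms"; anything else is rejected.
function parseMilliseconds(value: string): number {
  const match = /^(\d+(?:\.\d+)?)\s*ms$/.exec(value.trim());
  if (!match) {
    throw new Error(`Unexpected response time format: "${value}"`);
  }
  return Number(match[1]);
}
```

Throwing on an unknown format is deliberate: silently returning `NaN` would make a later threshold comparison evaluate to `false` without any visible error.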

cURL example

curl -X GET "http://localhost:8080/api/sla/metrics" \
  -H "Accept: application/json"

TypeScript example

interface SLAMetrics {
  availability: number;
  uptime: string;
  errorRate: number;
  responseTimeP50: string;
  responseTimeP95: string;
  responseTimeP99: string;
  timestamp: string;
}

async function getSLAMetrics(): Promise<SLAMetrics> {
  const response = await fetch('/api/sla/metrics');
  const { data } = await response.json();
  return data;
}

// Dashboard widget with auto-refresh
function SLAMetricsWidget() {
  const [metrics, setMetrics] = useState<SLAMetrics | null>(null);

  useEffect(() => {
    fetchMetrics();
    const interval = setInterval(fetchMetrics, 30000); // Every 30 seconds
    return () => clearInterval(interval);
  }, []);

  async function fetchMetrics() {
    try {
      const data = await getSLAMetrics();
      setMetrics(data);
    } catch (error) {
      console.error('Failed to fetch metrics:', error);
    }
  }

  if (!metrics) return <div>Loading metrics...</div>;

  return (
    <div className="sla-metrics-widget">
      <h3>SLA Metrics</h3>
      <div className="metric">
        <span className="label">Availability:</span>
        <span className="value">{metrics.availability}%</span>
      </div>
      <div className="metric">
        <span className="label">Uptime:</span>
        <span className="value">{metrics.uptime}</span>
      </div>
      <div className="metric">
        <span className="label">Error Rate:</span>
        <span className="value">{metrics.errorRate}%</span>
      </div>
      <div className="response-times">
        <h4>Response Times</h4>
        <div>P50: {metrics.responseTimeP50}</div>
        <div>P95: {metrics.responseTimeP95}</div>
        <div>P99: {metrics.responseTimeP99}</div>
      </div>
      <small>Last updated: {new Date(metrics.timestamp).toLocaleString()}</small>
    </div>
  );
}

GET /api/sla/availability

Returns detailed system availability metrics.

Request

Headers:

| Header | Value | Required |
|--------|-------|----------|
| Accept | application/json | ✅ Yes |

Query parameters: none

Body parameters: none (GET request)

Response

Success (200 OK):

{
  "success": true,
  "data": {
    "availability": 99.95,
    "uptime": "29d 23h 36m",
    "uptimeSeconds": 2590560,
    "timestamp": "2025-10-06T12:00:00Z",
    "slaTarget": 99.9,
    "meetsSLA": true
  },
  "message": "Availability metrics retrieved successfully"
}
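The human-readable `uptime` string can be converted to seconds on the client. The sketch below assumes the format is a space-separated list of `<number><unit>` tokens with the units `d`, `h`, `m` and `s`, as in the sample; `uptimeToSeconds` is a hypothetical helper.

```typescript
const SECONDS_PER_UNIT: Record<string, number> = { d: 86400, h: 3600, m: 60, s: 1 };

// Converts a string such as "29d 23h 36m" into a number of seconds.
// Assumption: tokens are separated by whitespace and use only d, h, m, s.
function uptimeToSeconds(uptime: string): number {
  const tokens = uptime.trim().split(/\s+/);
  let total = 0;
  for (const token of tokens) {
    const match = /^(\d+)([dhms])$/.exec(token);
    if (!match) {
      throw new Error(`Unexpected uptime token: "${token}"`);
    }
    total += Number(match[1]) * SECONDS_PER_UNIT[match[2]];
  }
  return total;
}
```

For `"29d 23h 36m"` this yields 2,590,560 seconds (29 × 86,400 + 23 × 3,600 + 36 × 60).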

cURL example

curl -X GET "http://localhost:8080/api/sla/availability" \
  -H "Accept: application/json"

TypeScript example

interface AvailabilityMetrics {
  availability: number;
  uptime: string;
  uptimeSeconds: number;
  timestamp: string;
  slaTarget: number;
  meetsSLA: boolean;
}

async function getAvailability(): Promise<AvailabilityMetrics> {
  const response = await fetch('/api/sla/availability');
  const { data } = await response.json();
  return data;
}

// Availability chart component
function AvailabilityChart() {
  const [metrics, setMetrics] = useState<AvailabilityMetrics | null>(null);

  useEffect(() => {
    fetchMetrics();
    const interval = setInterval(fetchMetrics, 60000);
    return () => clearInterval(interval);
  }, []);

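  async function fetchMetrics() {
    try {
      const data = await getAvailability();
      setMetrics(data);
    } catch (error) {
      // Without this, a failed poll would surface as an unhandled promise rejection
      console.error('Failed to fetch availability:', error);
    }
  }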

  if (!metrics) return null;

  const percentage = metrics.availability;
  const target = metrics.slaTarget;
  const meetsSLA = metrics.meetsSLA;

  return (
    <div className="availability-chart">
      <h3>System Availability</h3>
      <div className="chart-container">
        <div className="progress-bar">
          <div
            className={`progress ${meetsSLA ? 'success' : 'warning'}`}
            style={{ width: `${percentage}%` }}
          >
            {percentage}%
          </div>
          <div className="target-line" style={{ left: `${target}%` }}>
            SLA Target: {target}%
          </div>
        </div>
      </div>
      <div className="details">
        <div>Uptime: {metrics.uptime}</div>
        <div>Status: {meetsSLA ? '✅ Meeting SLA' : '⚠️ Below SLA'}</div>
      </div>
    </div>
  );
}

GET /api/sla/response-times

Returns response time metrics (percentiles).

Request

Headers:

| Header | Value | Required |
|--------|-------|----------|
| Accept | application/json | ✅ Yes |

Query parameters: none

Body parameters: none (GET request)

Response

Success (200 OK):

{
  "success": true,
  "data": {
    "responseTimeP50": "35ms",
    "responseTimeP95": "150ms",
    "responseTimeP99": "450ms",
    "responseTimeP50Ms": 35,
    "responseTimeP95Ms": 150,
    "responseTimeP99Ms": 450,
    "timestamp": "2025-10-06T12:00:00Z",
    "slaTargetP95Ms": 2000,
    "slaTargetP99Ms": 5000,
    "meetsP95SLA": true,
    "meetsP99SLA": true
  },
  "message": "Response time metrics retrieved successfully"
}

SLA Targets:

  • P95: at most 2000ms (2 seconds)
  • P99: at most 5000ms (5 seconds)

cURL example

curl -X GET "http://localhost:8080/api/sla/response-times" \
  -H "Accept: application/json"

TypeScript example

interface ResponseTimeMetrics {
  responseTimeP50: string;
  responseTimeP95: string;
  responseTimeP99: string;
  responseTimeP50Ms: number;
  responseTimeP95Ms: number;
  responseTimeP99Ms: number;
  timestamp: string;
  slaTargetP95Ms: number;
  slaTargetP99Ms: number;
  meetsP95SLA: boolean;
  meetsP99SLA: boolean;
}

async function getResponseTimes(): Promise<ResponseTimeMetrics> {
  const response = await fetch('/api/sla/response-times');
  const { data } = await response.json();
  return data;
}

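// Response time histogram
// Assumes `Bar` is the bar chart component from react-chartjs-2:
//   import { Bar } from 'react-chartjs-2';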
function ResponseTimeHistogram() {
  const [metrics, setMetrics] = useState<ResponseTimeMetrics | null>(null);

  useEffect(() => {
    fetchMetrics();
    const interval = setInterval(fetchMetrics, 30000);
    return () => clearInterval(interval);
  }, []);

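  async function fetchMetrics() {
    try {
      const data = await getResponseTimes();
      setMetrics(data);
    } catch (error) {
      // Without this, a failed poll would surface as an unhandled promise rejection
      console.error('Failed to fetch response times:', error);
    }
  }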

  if (!metrics) return null;

  return (
    <div className="response-time-histogram">
      <h3>Response Time Distribution</h3>
      <div className="percentiles">
        <div className="percentile">
          <span className="label">P50 (Median):</span>
          <span className="value">{metrics.responseTimeP50}</span>
        </div>
        <div className={`percentile ${metrics.meetsP95SLA ? 'success' : 'warning'}`}>
          <span className="label">P95:</span>
          <span className="value">{metrics.responseTimeP95}</span>
          <span className="target">Target: {metrics.slaTargetP95Ms}ms</span>
          {metrics.meetsP95SLA ? '✅' : '⚠️'}
        </div>
        <div className={`percentile ${metrics.meetsP99SLA ? 'success' : 'warning'}`}>
          <span className="label">P99:</span>
          <span className="value">{metrics.responseTimeP99}</span>
          <span className="target">Target: {metrics.slaTargetP99Ms}ms</span>
          {metrics.meetsP99SLA ? '✅' : '⚠️'}
        </div>
      </div>
      <div className="chart">
        <Bar
          data={{
            labels: ['P50', 'P95', 'P99'],
            datasets: [{
              label: 'Response Time (ms)',
              data: [
                metrics.responseTimeP50Ms,
                metrics.responseTimeP95Ms,
                metrics.responseTimeP99Ms
              ],
              backgroundColor: [
                'rgba(75, 192, 192, 0.6)',
                metrics.meetsP95SLA ? 'rgba(75, 192, 192, 0.6)' : 'rgba(255, 99, 132, 0.6)',
                metrics.meetsP99SLA ? 'rgba(75, 192, 192, 0.6)' : 'rgba(255, 99, 132, 0.6)'
              ]
            }]
          }}
        />
      </div>
    </div>
  );
}

GET /api/sla/error-rate

Returns error rate metrics.

Request

Headers:

| Header | Value | Required |
|--------|-------|----------|
| Accept | application/json | ✅ Yes |

Query parameters: none

Body parameters: none (GET request)

Response

Success (200 OK):

{
  "success": true,
  "data": {
    "errorRate": "0.08",
    "timestamp": "2025-10-06T12:00:00Z",
    "slaTarget": "0.1",
    "meetsSLA": true
  },
  "message": "Error rate metrics retrieved successfully"
}

SLA Target:

  • Max Error Rate: 0.1% (at most 1 error per 1000 requests)
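Unlike the other endpoints, this one returns `errorRate` and `slaTarget` as decimal strings. A client that wants to keep exact decimal semantics when checking the budget, rather than converting to binary floating point, can compare scaled integers. The sketch below assumes the strings are plain non-negative decimals with at most six fractional digits; `toScaledInteger` and `withinErrorBudget` are hypothetical helpers.

```typescript
// Converts a decimal string such as "0.08" into an integer number of
// millionths (scale = 6), so "0.08" becomes 80000 and "0.1" becomes 100000.
function toScaledInteger(value: string, scale: number = 6): number {
  const match = /^(\d+)(?:\.(\d+))?$/.exec(value.trim());
  if (!match) {
    throw new Error(`Not a plain decimal string: "${value}"`);
  }
  let fraction = match[2] || '';
  if (fraction.length > scale) {
    throw new Error(`More than ${scale} decimal places: "${value}"`);
  }
  while (fraction.length < scale) {
    fraction += '0';
  }
  return Number(match[1] + fraction);
}

// True when the error rate does not exceed the target.
function withinErrorBudget(errorRate: string, slaTarget: string): boolean {
  return toScaledInteger(errorRate) <= toScaledInteger(slaTarget);
}
```

The same scaled values can be subtracted to obtain the remaining budget without floating-point rounding noise.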

cURL example

curl -X GET "http://localhost:8080/api/sla/error-rate" \
  -H "Accept: application/json"

TypeScript example

interface ErrorRateMetrics {
  errorRate: string; // SafeDecimal format
  timestamp: string;
  slaTarget: string; // SafeDecimal format
  meetsSLA: boolean;
}

async function getErrorRate(): Promise<ErrorRateMetrics> {
  const response = await fetch('/api/sla/error-rate');
  const { data } = await response.json();
  return data;
}

// Error rate gauge
function ErrorRateGauge() {
  const [metrics, setMetrics] = useState<ErrorRateMetrics | null>(null);

  useEffect(() => {
    fetchMetrics();
    const interval = setInterval(fetchMetrics, 30000);
    return () => clearInterval(interval);
  }, []);

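  async function fetchMetrics() {
    try {
      const data = await getErrorRate();
      setMetrics(data);
    } catch (error) {
      // Without this, a failed poll would surface as an unhandled promise rejection
      console.error('Failed to fetch error rate:', error);
    }
  }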

  if (!metrics) return null;

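  const errorRate = parseFloat(metrics.errorRate);
  const target = parseFloat(metrics.slaTarget);
  // Share of the SLA target in use, clamped to 100 so the needle stays on the 180° dial
  const percentage = target > 0 ? Math.min((errorRate / target) * 100, 100) : 0;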

  return (
    <div className="error-rate-gauge">
      <h3>Error Rate</h3>
      <div className="gauge-container">
        <div className={`gauge ${metrics.meetsSLA ? 'success' : 'danger'}`}>
          <div className="needle" style={{ transform: `rotate(${percentage * 1.8}deg)` }}></div>
          <div className="gauge-value">
            {errorRate}%
          </div>
        </div>
      </div>
      <div className="gauge-labels">
        <span className="current">Current: {metrics.errorRate}%</span>
        <span className="target">Target:  {metrics.slaTarget}%</span>
      </div>
      <div className={`status ${metrics.meetsSLA ? 'success' : 'warning'}`}>
        {metrics.meetsSLA ? '✅ Within SLA' : '⚠️ Exceeds SLA'}
      </div>
    </div>
  );
}

Common Use Cases

1. Comprehensive SLA Dashboard

async function createSLADashboard() {
  const [status, report, metrics, availability, responseTimes, errorRate] = await Promise.all([
    getSLAStatus(),
    getSLAReport(),
    getSLAMetrics(),
    getAvailability(),
    getResponseTimes(),
    getErrorRate()
  ]);

  return {
    overview: {
      status: status.status,
      healthy: status.healthy,
      breaches: status.breaches.length
    },
    compliance: report.compliance,
    realtime: {
      availability: availability.availability,
      p95ResponseTime: responseTimes.responseTimeP95Ms,
      errorRate: parseFloat(errorRate.errorRate)
    },
    targets: {
      availability: availability.slaTarget,
      p95ResponseTime: responseTimes.slaTargetP95Ms,
      errorRate: parseFloat(errorRate.slaTarget)
    }
  };
}

2. Automated SLA Alerting

async function monitorSLAAndAlert() {
  const status = await getSLAStatus();

  if (!status.healthy) {
    // Fetch detailed metrics
    const [availability, responseTimes, errorRate] = await Promise.all([
      getAvailability(),
      getResponseTimes(),
      getErrorRate()
    ]);

    // Determine alert severity
    let severity: 'low' | 'medium' | 'high' = 'medium';
    if (status.status === 'critical') {
      severity = 'high';
    }

    // Send alert with details
    await sendAlert({
      severity,
      title: 'SLA Breach Detected',
      breaches: status.breaches,
      metrics: {
        availability: !availability.meetsSLA,
        p95ResponseTime: !responseTimes.meetsP95SLA,
        p99ResponseTime: !responseTimes.meetsP99SLA,
        errorRate: !errorRate.meetsSLA
      }
    });
  }
}

// Run monitoring every minute
setInterval(monitorSLAAndAlert, 60000);
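Polling every minute as above sends a new alert on every run for as long as a breach persists. One common way to reduce that noise is to alert only when the status changes. The sketch below illustrates the idea; `decideAlert` and the `'none' | 'alert' | 'recovery'` result are hypothetical and not part of the API.

```typescript
type Status = 'healthy' | 'warning' | 'critical';
type AlertDecision = 'none' | 'alert' | 'recovery';

// Higher rank means worse status.
const STATUS_RANK: Record<Status, number> = { healthy: 0, warning: 1, critical: 2 };

// Compares the previous and current status and decides what to send.
// `previous` is null on the first poll, which is treated like "healthy".
function decideAlert(previous: Status | null, current: Status): AlertDecision {
  const before = previous === null ? 0 : STATUS_RANK[previous];
  const now = STATUS_RANK[current];
  if (now > before) return 'alert'; // status got worse
  if (now === 0 && before > 0) return 'recovery'; // back to healthy
  return 'none'; // unchanged, or improved but still not healthy
}
```

The caller would keep the last seen status in a variable, call `decideAlert` after each poll, and invoke `sendAlert` only when the result is not `'none'`.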

3. Historical SLA Trend Analysis

interface SLATrend {
  timestamp: string;
  availability: number;
  errorRate: number;
  p95ResponseTime: number;
}

const slaHistory: SLATrend[] = [];

async function trackSLATrends() {
  const [availability, responseTimes, errorRate] = await Promise.all([
    getAvailability(),
    getResponseTimes(),
    getErrorRate()
  ]);

  const dataPoint: SLATrend = {
    timestamp: new Date().toISOString(),
    availability: availability.availability,
    errorRate: parseFloat(errorRate.errorRate),
    p95ResponseTime: responseTimes.responseTimeP95Ms
  };

  slaHistory.push(dataPoint);

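  // Keep only the last 24 hours. Points are appended in time order, so prune
  // from the front in place; otherwise slaHistory would grow without bound.
  const cutoff = Date.now() - 24 * 60 * 60 * 1000;
  while (slaHistory.length > 0 && new Date(slaHistory[0].timestamp).getTime() <= cutoff) {
    slaHistory.shift();
  }

  return slaHistory.slice();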
}
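Once data points are collected, they can be condensed into a few figures for a trend view. The sketch below is one possible summary (worst availability, average error rate, worst P95) over a list of points shaped like `SLATrend`; `summarizeTrend` and `TrendSummary` are hypothetical names.

```typescript
interface TrendPoint {
  timestamp: string;
  availability: number;
  errorRate: number;
  p95ResponseTime: number;
}

interface TrendSummary {
  samples: number;
  minAvailability: number;
  avgErrorRate: number;
  maxP95ResponseTime: number;
}

// Summarizes a window of data points; returns null for an empty window.
function summarizeTrend(points: TrendPoint[]): TrendSummary | null {
  if (points.length === 0) return null;
  let minAvailability = Infinity;
  let errorRateSum = 0;
  let maxP95ResponseTime = -Infinity;
  for (const point of points) {
    minAvailability = Math.min(minAvailability, point.availability);
    errorRateSum += point.errorRate;
    maxP95ResponseTime = Math.max(maxP95ResponseTime, point.p95ResponseTime);
  }
  return {
    samples: points.length,
    minAvailability,
    avgErrorRate: errorRateSum / points.length,
    maxP95ResponseTime
  };
}
```

Calling `summarizeTrend(await trackSLATrends())` would give the worst and average values for the retained window.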

4. SLA Compliance Report Generation

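async function generateComplianceReport(period: '7d' | '30d' | '90d') {
  // /api/sla/report takes no query parameters, so `period` is only a label here;
  // the window actually covered is given in report.period
  const report = await getSLAReport();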

  const complianceReport = {
    period: period,
    generatedAt: new Date().toISOString(),
    summary: {
      overallCompliance: report.compliance.overallCompliance,
      availability: {
        actual: report.summary.availability,
        target: report.targets.availability,
        compliant: report.compliance.availability
      },
      responseTime: {
        p95: {
          compliant: report.compliance.responseTimeP95,
          target: report.targets.p95ResponseTime
        },
        p99: {
          compliant: report.compliance.responseTimeP99,
          target: report.targets.p99ResponseTime
        }
      },
      errorRate: {
        actual: report.summary.errorRate,
        target: report.targets.maxErrorRate,
        compliant: report.compliance.errorRate
      }
    },
    incidents: report.incidents,
    recommendations: generateRecommendations(report)
  };

  return complianceReport;
}

function generateRecommendations(report: SLAReport): string[] {
  const recommendations: string[] = [];

  if (!report.compliance.availability) {
    recommendations.push('Improve system availability through redundancy');
  }

  if (!report.compliance.responseTimeP95) {
    recommendations.push('Optimize database queries and add caching');
  }

  if (!report.compliance.errorRate) {
    recommendations.push('Review error logs and fix recurring issues');
  }

  return recommendations;
}

5. Real-time SLA Monitoring Widget

function SLAMonitoringWidget() {
  const [status, setStatus] = useState<SLAStatus | null>(null);
  const [metrics, setMetrics] = useState<SLAMetrics | null>(null);

  useEffect(() => {
    fetchData();
    const interval = setInterval(fetchData, 30000); // Every 30 seconds
    return () => clearInterval(interval);
  }, []);

  async function fetchData() {
    try {
      const [statusData, metricsData] = await Promise.all([
        getSLAStatus(),
        getSLAMetrics()
      ]);
      setStatus(statusData);
      setMetrics(metricsData);
    } catch (error) {
      console.error('Failed to fetch SLA data:', error);
    }
  }

  if (!status || !metrics) return <div>Loading...</div>;

  return (
    <div className={`sla-widget ${status.healthy ? 'healthy' : 'unhealthy'}`}>
      <div className="status-header">
        <h3>SLA Status</h3>
        <span className={`badge ${status.status}`}>
          {status.status.toUpperCase()}
        </span>
      </div>

      <div className="metrics-grid">
        <div className="metric">
          <span className="label">Availability</span>
          <span className="value">{metrics.availability}%</span>
        </div>
        <div className="metric">
          <span className="label">Error Rate</span>
          <span className="value">{metrics.errorRate}%</span>
        </div>
        <div className="metric">
          <span className="label">P95 Response</span>
          <span className="value">{metrics.responseTimeP95}</span>
        </div>
      </div>

      {status.breaches.length > 0 && (
        <div className="breaches">
          <h4>⚠️ SLA Breaches</h4>
          <ul>
            {status.breaches.map((breach, idx) => (
              <li key={idx}>{breach}</li>
            ))}
          </ul>
        </div>
      )}

      <small>Last updated: {new Date(status.timestamp).toLocaleString()}</small>
    </div>
  );
}

Technical Details

SLA Targets

The Saga project uses the following SLA targets:

| Metric | Target | Criticality |
|--------|--------|-------------|
| Availability | ≥ 99.9% | 🔴 Critical |
| Error Rate | ≤ 0.1% | 🔴 Critical |
| P95 Response Time | ≤ 2000ms | 🟡 High |
| P99 Response Time | ≤ 5000ms | 🟡 High |

Monitoring Architecture

Real-time Metrics Collection:

  • Metrics are collected every 10 seconds
  • Aggregation into 1-minute, 5-minute and hourly buckets
  • Retention: 7 days of detailed metrics, 90 days of aggregated metrics
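The bucketing step can be sketched as follows. The document does not say which statistics each bucket stores, so the count, average and maximum below are assumptions, as are the names `Sample`, `Bucket` and `aggregateIntoBuckets`.

```typescript
interface Sample {
  timestampMs: number;
  value: number;
}

interface Bucket {
  startMs: number;
  count: number;
  average: number;
  max: number;
}

// Groups raw samples into fixed-size time buckets (for example 60000 ms).
function aggregateIntoBuckets(samples: Sample[], bucketMs: number): Bucket[] {
  if (bucketMs <= 0) throw new Error('bucketMs must be positive');
  const acc: Record<string, { count: number; sum: number; max: number }> = {};
  for (const sample of samples) {
    // Align each sample to the start of its bucket.
    const startMs = Math.floor(sample.timestampMs / bucketMs) * bucketMs;
    const key = String(startMs);
    if (!acc[key]) acc[key] = { count: 0, sum: 0, max: -Infinity };
    const entry = acc[key];
    entry.count += 1;
    entry.sum += sample.value;
    entry.max = Math.max(entry.max, sample.value);
  }
  return Object.keys(acc)
    .map(key => Number(key))
    .sort((a, b) => a - b)
    .map(startMs => {
      const entry = acc[String(startMs)];
      return {
        startMs,
        count: entry.count,
        average: entry.sum / entry.count,
        max: entry.max
      };
    });
}
```

With samples every 10 seconds, a 1-minute bucket holds six samples; the 5-minute and hourly buckets follow by calling the same function with `300000` and `3600000`.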

Alerting Thresholds:

  • Warning: 95% of SLA target (for example, 99.855% availability)
  • Critical: Below SLA target (for example, < 99.9% availability)

Calculation Methods

Availability Calculation:

Availability = (Uptime / Total Time) * 100

Error Rate Calculation:

Error Rate = (Failed Requests / Total Requests) * 100
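The availability formula also gives the downtime budget implied by a target. A minimal sketch with hypothetical helper names, working in seconds:

```typescript
// Availability in percent: (uptime / total time) * 100.
function availabilityPercent(uptimeSeconds: number, totalSeconds: number): number {
  if (totalSeconds <= 0) throw new Error('totalSeconds must be positive');
  return (uptimeSeconds / totalSeconds) * 100;
}

// Downtime (in seconds) that an availability target allows over a period.
function allowedDowntimeSeconds(targetPercent: number, periodSeconds: number): number {
  return periodSeconds * (1 - targetPercent / 100);
}
```

For a 30-day period (2,592,000 seconds), a 99.9% target allows about 2,592 seconds of downtime, or roughly 43.2 minutes. An uptime of 2,590,560 seconds in the same period works out to about 99.94% availability.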

Response Time Percentiles:

  • P50 (Median): 50% of requests faster than this value
  • P95: 95% of requests faster than this value
  • P99: 99% of requests faster than this value
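There are several ways to compute a percentile, and this document does not say which one the service uses. The sketch below shows the nearest-rank method, one common definition, with a hypothetical `percentile` helper:

```typescript
// Nearest-rank percentile: the smallest value such that at least p percent
// of the samples are less than or equal to it.
function percentile(values: number[], p: number): number {
  if (values.length === 0) throw new Error('Cannot take a percentile of an empty sample');
  if (p <= 0 || p > 100) throw new Error('p must be in the range (0, 100]');
  const sorted = values.slice().sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100); // 1-based rank
  return sorted[rank - 1];
}
```

For the five response times 10, 20, 30, 40 and 50 ms, the nearest-rank P50 is 30 ms and the P95 is 50 ms, which shows how a small sample makes high percentiles track the maximum.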

Best Practices

Monitoring Best Practices:

  • ✅ Monitor SLA metrics every 30-60 seconds
  • ✅ Set up automated alerts for SLA breaches
  • ✅ Review SLA reports weekly
  • ✅ Investigate a warning or critical status immediately
  • ✅ Maintain historical trends for capacity planning

Response to SLA Breaches:

  1. Immediate: Check /api/sla/status for breach details
  2. Investigation: Review /api/sla/report for incident context
  3. Mitigation: Address root cause based on specific metric breach
  4. Post-mortem: Document incident and prevention measures

Document version: 2.0.0
Last updated: 2025-10-06
Related modules: sla, monitoring, health, performance