
FastAPI Project Deployment Automation Guide Part 3: Zero-Downtime Deployment and Monitoring

코샵 2025. 2. 5. 10:15

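This part covers setting up zero-downtime deployment and building a monitoring system.

Zero-Downtime Deployment Setup

The script below implements the Blue-Green deployment approach: it starts the new version next to the running one, waits until it is healthy, switches Nginx over, and only then removes the old version.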

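# scripts/blue_green_deploy.py
import os
import subprocess
import sys
import time
import urllib.request

import docker

from nginx_config_generator import generate_nginx_config

# Assumptions (adjust to your environment): the app listens on port 8000
# inside the container, exposes GET /health, and Nginx loads NGINX_CONF.
APP_PORT = 8000
HEALTH_PATH = "/health"
NGINX_CONF = os.getenv("NGINX_CONF", "/etc/nginx/conf.d/fastapi.conf")
DOMAIN = os.getenv("DOMAIN", "localhost")


class BlueGreenDeployer:
    def __init__(self):
        self.client = docker.from_env()
        self.blue_port = 8000
        self.green_port = 8001

    def get_current_deployment(self):
        containers = self.client.containers.list(
            filters={"label": "app=fastapi"}
        )
        # Falls back to "green" when nothing is running, so the very first
        # deployment goes to blue.
        return "blue" if any(
            c.labels.get("environment") == "blue" for c in containers
        ) else "green"

    def wait_until_healthy(self, port: int, timeout: int = 60) -> bool:
        # Poll the health endpoint instead of sleeping for a fixed time.
        url = f"http://localhost:{port}{HEALTH_PATH}"
        deadline = time.time() + timeout
        while time.time() < deadline:
            try:
                with urllib.request.urlopen(url, timeout=2) as resp:
                    if resp.status == 200:
                        return True
            except OSError:
                pass  # not accepting connections yet
            time.sleep(2)
        return False

    def update_nginx_config(self, port: int):
        # Render the upstream config for the new port and reload Nginx.
        with open(NGINX_CONF, "w") as f:
            f.write(generate_nginx_config(port, DOMAIN))
        subprocess.run(["nginx", "-s", "reload"], check=True)

    def deploy(self, image_name: str):
        current = self.get_current_deployment()
        new_color = "green" if current == "blue" else "blue"
        port = self.green_port if new_color == "green" else self.blue_port

        # Start the new version: container port APP_PORT -> host port `port`
        container = self.client.containers.run(
            image_name,
            detach=True,
            ports={f"{APP_PORT}/tcp": port},
            labels={
                "app": "fastapi",
                "environment": new_color
            }
        )

        # Health check: abort and keep the old version if it fails
        if not self.wait_until_healthy(port):
            container.stop()
            container.remove()
            raise RuntimeError(f"{new_color} failed its health check")

        # Point Nginx at the new version
        self.update_nginx_config(port)

        # Remove the previous version
        for c in self.client.containers.list(
            filters={"label": ["app=fastapi", f"environment={current}"]}
        ):
            c.stop()
            c.remove()


if __name__ == "__main__":
    image_name = os.getenv("IMAGE_NAME")
    if not image_name:
        sys.exit("IMAGE_NAME environment variable is not set")
    BlueGreenDeployer().deploy(image_name)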

Dynamic Nginx Configuration

# scripts/nginx_config_generator.py
from jinja2 import Template

nginx_template = """
upstream fastapi_backend {
    server localhost:{{ port }};
}

server {
    listen 80;
    server_name {{ domain }};

    location / {
        proxy_pass http://fastapi_backend;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
"""

def generate_nginx_config(port: int, domain: str):
    template = Template(nginx_template)
    return template.render(port=port, domain=domain)
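
The rendered configuration still has to reach disk before Nginx can load it. The following is a minimal sketch, assuming the string returned by `generate_nginx_config` should replace an existing config file. `write_config_atomically` and the target path are hypothetical names and are not part of the original scripts. Writing to a temporary file and then renaming it means Nginx never reads a half-written file during a reload.

```python
import os
import tempfile


def write_config_atomically(config_text: str, target_path: str) -> None:
    """Write config_text to target_path via a temp file and an atomic rename."""
    directory = os.path.dirname(os.path.abspath(target_path))
    # Create the temp file in the same directory so os.replace stays on one
    # filesystem, which is what makes the rename atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as tmp_file:
            tmp_file.write(config_text)
        # mkstemp creates the file as 0600; widen it so it stays readable.
        os.chmod(tmp_path, 0o644)
        os.replace(tmp_path, target_path)
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

A typical call would be `write_config_atomically(generate_nginx_config(8001, "example.com"), "/etc/nginx/conf.d/fastapi.conf")`, where the domain and path are placeholders. Running `nginx -t` before `nginx -s reload` is a reasonable extra check that the new file parses.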

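Monitoring Stack Setup

A combined monitoring setup using Prometheus for metrics, Grafana for dashboards, and Loki for logs, with Promtail shipping log files to Loki.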

# docker-compose.monitoring.yml
version: '3.8'

services:
  prometheus:
    image: prom/prometheus
    volumes:
      - ./prometheus/prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus_data:/prometheus
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
      - '--web.console.libraries=/usr/share/prometheus/console_libraries'
      - '--web.console.templates=/usr/share/prometheus/consoles'
    ports:
      - "9090:9090"
    networks:
      - monitoring

  grafana:
    image: grafana/grafana
    volumes:
      - ./grafana/provisioning:/etc/grafana/provisioning
      - grafana_data:/var/lib/grafana
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=${GRAFANA_PASSWORD}
      - GF_USERS_ALLOW_SIGN_UP=false
    ports:
      - "3000:3000"
    networks:
      - monitoring

  loki:
    image: grafana/loki
    ports:
      - "3100:3100"
    command: -config.file=/etc/loki/local-config.yaml
    networks:
      - monitoring

  promtail:
    image: grafana/promtail
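    volumes:
      - /var/log:/var/log
      # Needed by the "containers" job in promtail-config.yml
      - /var/lib/docker/containers:/var/lib/docker/containers:ro
      - ./promtail-config.yml:/etc/promtail/config.yml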
    command: -config.file=/etc/promtail/config.yml
    networks:
      - monitoring

networks:
  monitoring:

volumes:
  prometheus_data:
  grafana_data:
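
Once the stack is up, it helps to confirm that each service actually answers before wiring dashboards and alerts to it. The sketch below is one way to do that with the standard library only. It assumes the host ports published in the compose file above and uses the readiness endpoints each project exposes (`/-/ready` for Prometheus, `/api/health` for Grafana, `/ready` for Loki). `check_stack` and `http_status` are hypothetical helper names.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict

# Readiness endpoints on the host ports published in the compose file.
STACK_ENDPOINTS = {
    "prometheus": "http://localhost:9090/-/ready",
    "grafana": "http://localhost:3000/api/health",
    "loki": "http://localhost:3100/ready",
}


def http_status(url: str, timeout: float = 3.0) -> int:
    """Return the HTTP status code for url, or 0 if the request fails."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status such as 503.
        return exc.code
    except OSError:
        # Connection refused, DNS failure, timeout, and similar.
        return 0


def check_stack(
    endpoints: Dict[str, str] = STACK_ENDPOINTS,
    fetch: Callable[[str], int] = http_status,
) -> Dict[str, bool]:
    """Map each service name to True when its endpoint answers 200."""
    return {name: fetch(url) == 200 for name, url in endpoints.items()}
```

Calling `check_stack()` after `docker compose -f docker-compose.monitoring.yml up -d` returns a dictionary such as `{"prometheus": True, ...}`. The `fetch` parameter exists only so the function can be exercised without a running stack.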

Custom Metrics Collection

Add metrics collection to the FastAPI application with a middleware that counts requests and records their latency.

# app/core/metrics.py
from prometheus_client import Counter, Histogram, Info
from fastapi import Request
import time

REQUEST_COUNT = Counter(
    "http_requests_total",
    "Total HTTP requests",
    ["method", "endpoint", "status"]
)

REQUEST_LATENCY = Histogram(
    "http_request_duration_seconds",
    "HTTP request latency",
    ["method", "endpoint"]
)

APP_INFO = Info("fastapi_app", "Application information")

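async def metrics_middleware(request: Request, call_next):
    start_time = time.perf_counter()
    # Assume a 500 until a response arrives, so that unhandled exceptions
    # are still counted as server errors.
    status_code = 500
    try:
        response = await call_next(request)
        status_code = response.status_code
        return response
    finally:
        duration = time.perf_counter() - start_time
        REQUEST_COUNT.labels(
            method=request.method,
            endpoint=request.url.path,
            status=status_code
        ).inc()

        REQUEST_LATENCY.labels(
            method=request.method,
            endpoint=request.url.path
        ).observe(duration)

# Register the middleware and expose the metrics in the application module,
# for example (assuming the FastAPI instance is named `app`):
#   app.middleware("http")(metrics_middleware)
#   app.mount("/metrics", prometheus_client.make_asgi_app())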
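
Labeling by `request.url.path` creates a separate time series for every distinct URL, so a route such as `/users/{user_id}` can end up with one series per user. One possible mitigation, sketched below, is to label by the route template instead. It assumes the matched route object is available as `scope["route"]` with a `path` attribute once routing has run. FastAPI sets this for its own routes, but it is worth verifying for your version. `endpoint_label` is a hypothetical helper.

```python
from types import SimpleNamespace
from typing import Any, Mapping


def endpoint_label(scope: Mapping[str, Any], fallback: str = "unmatched") -> str:
    """Return the route template (e.g. /users/{user_id}) for a request scope."""
    route = scope.get("route")
    path = getattr(route, "path", None)
    # Requests that matched no route (404s) share one fallback label.
    return path if isinstance(path, str) else fallback


# Stand-in for the scope of a request that matched /users/{user_id}.
matched_scope = {"route": SimpleNamespace(path="/users/{user_id}")}
print(endpoint_label(matched_scope))  # /users/{user_id}
print(endpoint_label({}))             # unmatched
```

Inside `metrics_middleware`, the label would be computed as `endpoint_label(request.scope)` after `await call_next(request)` has returned, since the route is only known once routing has run.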

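Alert Configuration

Alert rules for the metrics collected above. The rules use the Prometheus alerting rule syntax (`groups`, `alert`, `expr`, `for`). Grafana's own alert provisioning files use a different schema, so treat the file path below as illustrative and load the rules wherever your rule evaluator expects them (for Prometheus, through `rule_files` in `prometheus.yml`).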

# grafana/provisioning/alerting/alerts.yml
groups:
  - name: FastAPI Alerts
    rules:
      - alert: HighErrorRate
        expr: sum(rate(http_requests_total{status=~"5.."}[5m])) / sum(rate(http_requests_total[5m])) > 0.1
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: High HTTP error rate
          description: Error rate is above 10% for more than 5 minutes

      - alert: SlowResponses
        expr: histogram_quantile(0.95, sum(rate(http_request_duration_seconds_bucket[5m])) by (le)) > 1
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: Slow response times
          description: 95th percentile of response times is above 1 second
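
To make the two expressions concrete, the sketch below reproduces their arithmetic in plain Python. It is a simplified illustration and not Prometheus's actual implementation: `error_ratio` mirrors the division in `HighErrorRate`, and `histogram_quantile` follows the general idea of finding the bucket that contains the target rank and interpolating linearly inside it. The bucket values are made-up sample data.

```python
from typing import List, Tuple


def error_ratio(error_rate: float, total_rate: float) -> float:
    """Share of requests that returned 5xx, given per-second rates.

    Returns 0.0 when there is no traffic. (PromQL yields NaN there, which
    also does not satisfy the `> 0.1` comparison.)
    """
    return error_rate / total_rate if total_rate > 0 else 0.0


def histogram_quantile(q: float, buckets: List[Tuple[float, float]]) -> float:
    """Estimate the q-quantile from cumulative (upper_bound, count) buckets.

    The bucket with the largest upper bound must be +Inf, as in Prometheus.
    """
    buckets = sorted(buckets)
    total = buckets[-1][1]
    if total == 0:
        return float("nan")
    rank = q * total
    lower_bound, lower_count = 0.0, 0.0
    for upper_bound, count in buckets:
        if count >= rank:
            if upper_bound == float("inf"):
                # Rank falls in the +Inf bucket: report the highest finite bound.
                return lower_bound
            span = count - lower_count
            if span <= 0:
                return lower_bound
            fraction = (rank - lower_count) / span
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, lower_count = upper_bound, count
    return lower_bound


# Sample data: 0.6 errors/s out of 5 requests/s, and 100 observed requests.
ratio = error_ratio(0.6, 5.0)
sample_buckets = [(0.1, 50), (0.5, 90), (1.0, 98), (float("inf"), 100)]
p95 = histogram_quantile(0.95, sample_buckets)
print(ratio > 0.1)  # True: about 12% errors, so HighErrorRate would fire
print(p95 > 1)      # False: p95 is about 0.81 s, so SlowResponses stays quiet
```

Because the estimate is interpolated between bucket bounds, its accuracy depends on how the `Histogram` buckets are chosen around the 1 second threshold.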

Log Collection Setup

# promtail-config.yml
server:
  http_listen_port: 9080
  grpc_listen_port: 0

positions:
  filename: /tmp/positions.yaml

clients:
  - url: http://loki:3100/loki/api/v1/push

scrape_configs:
  - job_name: system
    static_configs:
      - targets:
          - localhost
        labels:
          job: varlogs
          __path__: /var/log/*log

  - job_name: containers
    static_configs:
      - targets:
          - localhost
        labels:
          job: containerlogs
          __path__: /var/lib/docker/containers/*/*.log
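
Promtail ships whatever lines it finds in these files, so the format the application writes determines how easy the logs are to query in Loki. As one option, the sketch below emits one JSON object per line using only the standard library. `JsonFormatter`, `configure_json_logging`, and the chosen field names are illustrative assumptions, not something the setup above requires.

```python
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    """Format each log record as a single-line JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            # Keep the traceback inside the same JSON line.
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload, ensure_ascii=False)


def configure_json_logging(level: int = logging.INFO) -> None:
    """Send JSON logs to stdout, where Docker's log driver captures them."""
    handler = logging.StreamHandler(sys.stdout)
    handler.setFormatter(JsonFormatter())
    root = logging.getLogger()
    root.handlers = [handler]
    root.setLevel(level)
```

Note that Docker's default `json-file` driver wraps every line in its own JSON envelope, so the `containers` job would typically need a Promtail pipeline stage such as `docker` to unwrap it before the application's fields can be parsed in Loki.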

Implementing a Rollback Strategy

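# scripts/rollback.py
import logging
import sys

import docker

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


class Rollback:
    def __init__(self):
        self.client = docker.from_env()
        self.registry = "your-registry"

    def get_previous_image(self):
        # Only tagged images can be started by name, so skip untagged ones.
        images = [
            image for image in self.client.images.list(
                filters={"label": "app=fastapi"}
            )
            if image.tags
        ]
        if len(images) < 2:
            raise RuntimeError("No previous image available to roll back to")
        # 'Created' is an ISO 8601 timestamp, so the newest image sorts last.
        return sorted(
            images,
            key=lambda x: x.attrs['Created']
        )[-2].tags[0]

    def execute_rollback(self):
        try:
            previous_image = self.get_previous_image()
            logger.info(f"Rolling back to {previous_image}")

            # Stop and remove the running containers so port 8000 is free
            for current in self.client.containers.list(
                filters={"label": "app=fastapi"}
            ):
                current.stop()
                current.remove()

            # Start the previous version on the blue port. The "blue" label
            # keeps the next Blue-Green deployment consistent; if Nginx
            # currently points at port 8001, regenerate its config as well.
            self.client.containers.run(
                previous_image,
                detach=True,
                ports={'8000/tcp': 8000},
                labels={"app": "fastapi", "environment": "blue"}
            )

            logger.info("Rollback completed successfully")
            return True

        except Exception as e:
            logger.error(f"Rollback failed: {str(e)}")
            return False


if __name__ == "__main__":
    rollback = Rollback()
    success = rollback.execute_rollback()
    sys.exit(0 if success else 1)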

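A system set up this way provides the following benefits:

  1. Updates without service interruption, thanks to zero-downtime deployment
  2. Real-time visibility into system state through detailed monitoring
  3. Fast rollback when a problem occurs
  4. More efficient troubleshooting through centralized logs

Deployment automation and monitoring are areas that need continuous improvement. Adjust the setup to match the scale and requirements of your system.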