Version: 1.0
Date: March 25, 2026
Context: Phase 3 of the backend migration plan: removing the X-Tenant-ID header, resolving the tenant from the JWT, injecting the correct DB session per request, and adapting the SQLAlchemy models
Architect: Carlos Alberto Torres Camargo
Classification: Internal, Architecture
This document describes the transformation of DataVault's multi-tenancy model: from logical isolation (filtering by tenant_id in a single DB) to physical isolation (a separate DB per tenant, resolved from JWT claims via SimappeAdmin).
File: backend/app/middleware/multitenancy.py (44 lines)
# Current state: reads the tenant from the HTTP header
class TenantContext:
    def __init__(self):
        self.tenant_id: Optional[uuid.UUID] = None
        self.tenant: Optional[Tenant] = None


# Module-level singleton shared by every request
tenant_context = TenantContext()


async def get_tenant_from_header(request: Request) -> Optional[uuid.UUID]:
    """Extract tenant ID from X-Tenant-ID header."""
    tenant_header = request.headers.get("X-Tenant-ID")
    if not tenant_header:
        return None
    try:
        return uuid.UUID(tenant_header)
    except ValueError:
        return None


async def validate_user_tenant_access(user_id: uuid.UUID, tenant_id: uuid.UUID) -> bool:
    """Validate that user has access to the specified tenant."""
    async with AsyncSessionLocal() as db:
        result = await db.execute(
            select(UserTenant).filter(
                UserTenant.user_id == user_id,
                UserTenant.tenant_id == tenant_id
            )
        )
        user_tenant = result.scalar_one_or_none()
        return user_tenant is not None
Problems:
| # | Problem | Risk |
|---|---|---|
| 1 | X-Tenant-ID can be forged by the client | Security: any user could send another tenant's tenant_id |
| 2 | tenant_context is a global singleton, so concurrent requests race on it | Bug: a request can read the tenant set by another concurrent request |
| 3 | validate_user_tenant_access queries the local DB (UserTenant table) | That table will not exist in the database-per-tenant model |
| 4 | Every query must include filter(Model.tenant_id == tenant_id) manually | A single forgotten filter leaks data across tenants |
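Problem 2 can be reproduced in a few lines. The sketch below is a minimal, self-contained illustration, not DataVault code: `SharedTenantContext`, `handle_request` and the tenant labels are hypothetical stand-ins. Two overlapping handlers write to one shared object, and the first one reads back the other's tenant after it yields to the event loop.

```python
import asyncio


class SharedTenantContext:
    """Hypothetical stand-in for the module-level TenantContext singleton."""

    def __init__(self) -> None:
        self.tenant_id = None


shared_context = SharedTenantContext()


async def handle_request(tenant_id: str, delay: float) -> str:
    # Each "request" stores its tenant in the shared object...
    shared_context.tenant_id = tenant_id
    # ...then awaits (a DB call in real code), letting other requests run.
    await asyncio.sleep(delay)
    # By now another request may have overwritten the value.
    return shared_context.tenant_id


async def simulate() -> list:
    # Request ABC starts first but finishes last; XYZ overwrites the singleton.
    return await asyncio.gather(
        handle_request("ABC", delay=0.02),
        handle_request("XYZ", delay=0.01),
    )


observed = asyncio.run(simulate())
print(observed)
```

Both handlers end up reporting the tenant of whichever request wrote last, which is exactly the cross-request read described in the table.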
tenant_id in models
Example in models/employee.py:
class Employee(Base):
    __tablename__ = "employees"

    id = Column(UUID(as_uuid=True), primary_key=True, default=uuid.uuid4)
    tenant_id = Column(UUID(as_uuid=True), ForeignKey("tenants.id"), nullable=False)  # ← manual filter column
    first_name = Column(String(100), nullable=False)
    last_name = Column(String(100), nullable=False)
    ...
Pattern in all routers:
# Pattern repeated in EVERY endpoint of EVERY router
@router.get("/employees")
async def list_employees(
    request: Request,
    current_user = Depends(get_current_active_user),
    db: AsyncSession = Depends(get_db)
):
    tenant_id = request.headers.get("X-Tenant-ID")  # ← read manually from the header
    result = await db.execute(
        select(Employee)
        .filter(Employee.tenant_id == tenant_id)  # ← manual filter
        .order_by(Employee.created_at.desc())
    )
    ...
Entities with tenant_id (to be removed):
| Model | File | Has tenant_id FK |
|---|---|---|
| Employee | models/employee.py | Yes |
| Contract | models/contract.py | Yes |
| Certification | models/certification.py | Yes |
| Training | models/training.py | Yes |
| Absence | models/absence.py | Yes |
| HRDocument | models/hr_document.py | Yes |
| Notification | models/notification.py | Yes |
| Project | models/ingest.py | Yes |
| Ingest | models/ingest.py | Yes |
| AuditLog | models/audit_log.py | Yes |
| LegalHold | models/legal_hold.py | Yes |
| RetentionPolicy | models/document_retention.py | Yes |
| DocumentRetention | models/document_retention.py | Yes |
| ModulePermission | models/module_permission.py | Yes |
| FileAssignment | models/file_assignment.py | Yes |
| PreservationMessage | models/preservation_message.py | Yes |
| Tenant | models/tenant.py | N/A (it is the tenants table itself) |
| UserTenant | models/user.py | FK to Tenant |
┌──────────────────────────────────────────────────────────────────┐
│  BEFORE (current)                                                │
│                                                                  │
│  DB: datavault (single)                                          │
│  ┌─────────────────────────────────────────────────────────┐     │
│  │ tenants  │ users  │ employees (tenant_id FK)            │     │
│  │ ──────── │ ────── │ ────────────────────────            │     │
│  │ id: ABC  │ id: 1  │ id: 10, tenant_id: ABC              │     │
│  │ id: XYZ  │ id: 2  │ id: 11, tenant_id: ABC              │     │
│  │          │        │ id: 12, tenant_id: XYZ              │     │
│  └─────────────────────────────────────────────────────────┘     │
│  Problem: a forgotten WHERE tenant_id = X leaks data             │
└──────────────────────────────────────────────────────────────────┘
┌──────────────────────────────────────────────────────────────────┐
│  AFTER (target)                                                  │
│                                                                  │
│  DB: vault_abc (tenant ABC)      DB: vault_xyz (tenant XYZ)      │
│  ┌──────────────────────────┐    ┌──────────────────────────┐    │
│  │ employees                │    │ employees                │    │
│  │ ────────                 │    │ ────────                 │    │
│  │ id: 10 (no tenant_id)    │    │ id: 12 (no tenant_id)    │    │
│  │ id: 11 (no tenant_id)    │    │                          │    │
│  └──────────────────────────┘    └──────────────────────────┘    │
│  The DB IS the tenant → a missing filter cannot leak data        │
└──────────────────────────────────────────────────────────────────┘
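The target layout can be illustrated with two throwaway databases. This is only a sketch under stated assumptions: it uses in-memory SQLite from the standard library rather than the project's asyncpg-backed databases, and the `open_tenant_db` helper, the `tenant_dbs` mapping and the sample names are hypothetical. It shows that, once each tenant has its own database, an unfiltered SELECT only sees that tenant's rows.

```python
import sqlite3


def open_tenant_db(rows):
    """Create an isolated in-memory DB whose employees table has no tenant_id."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE employees (id INTEGER PRIMARY KEY, first_name TEXT)")
    conn.executemany("INSERT INTO employees (id, first_name) VALUES (?, ?)", rows)
    conn.commit()
    return conn


# One database per tenant, mirroring vault_abc and vault_xyz in the diagram.
tenant_dbs = {
    "vault_abc": open_tenant_db([(10, "Ana"), (11, "Luis")]),
    "vault_xyz": open_tenant_db([(12, "Marta")]),
}


def list_employee_ids(db_name):
    # No WHERE clause: the connection itself scopes the query to one tenant.
    cursor = tenant_dbs[db_name].execute("SELECT id FROM employees ORDER BY id")
    return [row[0] for row in cursor.fetchall()]


print(list_employee_ids("vault_abc"))
print(list_employee_ids("vault_xyz"))
```

The remaining risk moves from the query to the connection: the data is only as isolated as the logic that picks which database a request talks to.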
Action: rewrite middleware/multitenancy.py completely. It no longer reads headers; it works from the JWT.
# ============================================================
# NEW middleware/multitenancy.py
# ============================================================
import logging
from contextvars import ContextVar
from typing import Optional

from app.utils.auth import SimappeUserSession

logger = logging.getLogger(__name__)

# ContextVars holding the user session for the current request
# (safe across threads and asyncio tasks, unlike the previous singleton)
_current_user_session: ContextVar[Optional[SimappeUserSession]] = ContextVar(
    "current_user_session", default=None
)
_current_raw_token: ContextVar[Optional[str]] = ContextVar(
    "current_raw_token", default=None
)


def set_request_context(user_session: SimappeUserSession, raw_token: str) -> None:
    """
    Set the context for the current request.
    Called from the authentication middleware.
    """
    _current_user_session.set(user_session)
    _current_raw_token.set(raw_token)
    # Store the token on user_session so get_tenant_db() can use it
    user_session._raw_token = raw_token


def get_current_user_session() -> Optional[SimappeUserSession]:
    """Return the user session of the current request."""
    return _current_user_session.get()


def get_current_raw_token() -> Optional[str]:
    """Return the raw JWT of the current request."""
    return _current_raw_token.get()


def clear_request_context() -> None:
    """Clear the context when the request finishes."""
    _current_user_session.set(None)
    _current_raw_token.set(None)
Critical differences from the previous version:
| Aspect | Before | After |
|---|---|---|
| Context storage | TenantContext (singleton, shared) | ContextVar (per-request, asyncio-safe) |
| Tenant source | X-Tenant-ID header | JWT claims (customerId + companyId) |
| Access validation | Local DB query (UserTenant) | Implicit in the JWT (holding the claim grants access) |
| Race condition risk | High (mutable singleton) | None (each asyncio task works on its own copy of the context) |
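The isolation attributed to ContextVar can be checked with the standard library alone. The sketch below is an illustration, not the project's middleware: `current_tenant` and `handle_request` are hypothetical names. It runs two overlapping requests, the interleaving that would corrupt a shared singleton, and each asyncio task still reads back its own value. It also uses the token returned by `set()` to restore the previous value, an alternative to calling `set(None)`.

```python
import asyncio
from contextvars import ContextVar
from typing import List, Optional

# Hypothetical variable standing in for _current_user_session.
current_tenant: ContextVar[Optional[str]] = ContextVar("current_tenant", default=None)


async def handle_request(tenant_id: str, delay: float) -> Optional[str]:
    # set() returns a token that remembers the previous value.
    token = current_tenant.set(tenant_id)
    try:
        await asyncio.sleep(delay)  # other requests run while this one waits
        return current_tenant.get()
    finally:
        # Restore whatever was there before this request set its value.
        current_tenant.reset(token)


async def simulate() -> List[Optional[str]]:
    # gather() wraps each coroutine in its own task, and each task
    # runs in a copy of the caller's context.
    return await asyncio.gather(
        handle_request("ABC", delay=0.02),
        handle_request("XYZ", delay=0.01),
    )


observed = asyncio.run(simulate())
leftover = current_tenant.get()  # the outer context was never modified
print(observed, leftover)
```

The isolation is per task (per context copy), not per coroutine: two coroutines awaited directly inside the same task share one context.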
File: backend/app/main.py
Change: add a middleware that injects the context into every request.
# ============================================================
# NEW middleware in main.py
# ============================================================
from starlette.middleware.base import BaseHTTPMiddleware
from starlette.requests import Request
from starlette.responses import Response

from app.middleware.multitenancy import set_request_context, clear_request_context
from app.utils.auth import validate_simappe_jwt


class TenantContextMiddleware(BaseHTTPMiddleware):
    """
    Middleware that extracts the JWT from the request and sets the
    tenant context for the whole processing chain.
    Excluded routes: /, /health, /actuator/health, /docs, /openapi.json
    """
    EXCLUDED_PATHS = {"/health", "/actuator/health", "/docs", "/openapi.json", "/"}

    async def dispatch(self, request: Request, call_next) -> Response:
        # Skip public routes (exact path match)
        if request.url.path in self.EXCLUDED_PATHS:
            return await call_next(request)

        # Extract the token from the Authorization header
        auth_header = request.headers.get("Authorization", "")
        if auth_header.startswith("Bearer "):
            token = auth_header[7:]
            user_session = validate_simappe_jwt(token)
            if user_session:
                set_request_context(user_session, token)

        # Requests without a valid token continue with no tenant context
        try:
            response = await call_next(request)
            return response
        finally:
            clear_request_context()


# In main.py, BEFORE including the routers:
app.add_middleware(TenantContextMiddleware)
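The header handling inside `dispatch` can be isolated in a small pure function, which makes the edge cases easy to test. This is a hedged sketch: `extract_bearer_token` is a hypothetical helper, not part of the codebase. It keeps the case-sensitive `"Bearer "` prefix check used above, additionally treats an empty token as missing, and takes a plain mapping instead of Starlette's headers object.

```python
from typing import Mapping, Optional

BEARER_PREFIX = "Bearer "


def extract_bearer_token(headers: Mapping[str, str]) -> Optional[str]:
    """Return the raw token from an Authorization header, or None."""
    auth_header = headers.get("Authorization", "")
    if not auth_header.startswith(BEARER_PREFIX):
        return None
    token = auth_header[len(BEARER_PREFIX):].strip()
    # An empty token ("Bearer ") is treated the same as a missing header.
    return token or None


print(extract_bearer_token({"Authorization": "Bearer abc.def.ghi"}))
print(extract_bearer_token({"Authorization": "Basic dXNlcjpwYXNz"}))
print(extract_bearer_token({}))
```

Requests for which this returns None would reach the routers without a tenant context, so rejecting them presumably falls to the route dependencies, such as `require_company_selected`.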
Action: remove tenant_id from ALL business models. Drop the tenants and user_tenants tables.
Example: Employee (the same pattern applies to every model):
# ============================================================
# BEFORE: models/employee.py
# ============================================================
class Employee(Base):
    __tablename__ = "employees"

    id = Column(UUID(as_uuid=True), primary_key=True, default=uuid.uuid4)
    tenant_id = Column(UUID(as_uuid=True), ForeignKey("tenants.id"), nullable=False)  # ← REMOVE
    first_name = Column(String(100), nullable=False)
    last_name = Column(String(100), nullable=False)
    ...


# ============================================================
# AFTER: models/employee.py
# ============================================================
class Employee(Base):
    __tablename__ = "employees"

    id = Column(UUID(as_uuid=True), primary_key=True, default=uuid.uuid4)
    # tenant_id REMOVED: the whole DB belongs to the tenant
    first_name = Column(String(100), nullable=False)
    last_name = Column(String(100), nullable=False)
    ...
Full list of changes per model:
| Model | Change | Columns to remove | Relationships to remove |
|---|---|---|---|
| Employee | Remove tenant_id FK | tenant_id | None |
| Contract | Remove tenant_id FK | tenant_id | None |
| Certification | Remove tenant_id FK | tenant_id | None |
| Training | Remove tenant_id FK | tenant_id | None |
| Absence | Remove tenant_id FK | tenant_id | None |
| HRDocument | Remove tenant_id FK | tenant_id | None |
| Notification | Remove tenant_id FK | tenant_id | None |
| Project | Remove tenant_id FK | tenant_id | None |
| Ingest | Remove tenant_id FK | tenant_id | None |
| AuditLog | Remove tenant_id FK | tenant_id | None |
| LegalHold | Remove tenant_id FK | tenant_id | None |
| RetentionPolicy | Remove tenant_id FK | tenant_id | None |
| DocumentRetention | Remove tenant_id FK | tenant_id | None |
| ModulePermission | Remove tenant_id FK | tenant_id | None |
| FileAssignment | Remove tenant_id FK | tenant_id | None |
| PreservationMessage | Remove tenant_id FK | tenant_id | None |
| Tenant | Drop the entire table | Entire table | All |
| UserTenant | Drop the entire table | Entire table | All |
| User | Evaluate: keep it local or use Simappe? | user_tenants relationship | user_tenants |
Each of the 24 router files must update its dependency injection pattern.
Change pattern (applies to ALL routers):
# ============================================================
# BEFORE: current pattern in api/employees.py (and all the others)
# ============================================================
from app.utils.security import get_current_active_user
from app.database import get_db


@router.get("/")
async def list_employees(
    current_user = Depends(get_current_active_user),  # ← looks up the User in the local DB
    db: AsyncSession = Depends(get_db),               # ← single, fixed DB
    request: Request = None
):
    tenant_id = request.headers.get("X-Tenant-ID")    # ← read manually from the header
    result = await db.execute(
        select(Employee)
        .filter(Employee.tenant_id == tenant_id)      # ← manual filter
    )
    ...


# ============================================================
# AFTER: new pattern
# ============================================================
from app.utils.security import require_company_selected, SimappeUserSession
from app.database import get_tenant_db


@router.get("/")
async def list_employees(
    user_session: SimappeUserSession = Depends(require_company_selected),  # ← JWT claims
    db: AsyncSession = Depends(get_tenant_db),                             # ← the tenant's DB
):
    # No tenant_id filter is needed anymore:
    # the whole DB belongs to the tenant, so a plain SELECT is enough
    result = await db.execute(
        select(Employee)
        # NO .filter(Employee.tenant_id == ...)
    )
    ...
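`get_tenant_db` is defined outside this document. As a rough illustration of what a dependency like it needs behind the scenes, the sketch below caches one engine-like object per tenant so that repeated requests from the same tenant reuse it. Everything here is an assumption made for illustration: the `TenantEngineRegistry` class, the `(customerId, companyId)` key, and the factory callable, which in real code would presumably wrap something like `create_async_engine`.

```python
import threading
from typing import Callable, Dict, Generic, Tuple, TypeVar

EngineT = TypeVar("EngineT")
TenantKey = Tuple[str, str]  # hypothetical key: (customerId, companyId)


class TenantEngineRegistry(Generic[EngineT]):
    """Caches one engine per tenant; the factory is called once per key."""

    def __init__(self, factory: Callable[[TenantKey], EngineT]) -> None:
        self._factory = factory
        self._engines: Dict[TenantKey, EngineT] = {}
        self._lock = threading.Lock()

    def get(self, key: TenantKey) -> EngineT:
        # The lock keeps two threads from building the same engine twice.
        with self._lock:
            if key not in self._engines:
                self._engines[key] = self._factory(key)
            return self._engines[key]


created = []


def fake_engine_factory(key: TenantKey) -> str:
    # Stand-in for building a real engine from the tenant's DB config.
    created.append(key)
    return f"engine-for-{key[0]}-{key[1]}"


registry = TenantEngineRegistry(fake_engine_factory)
first = registry.get(("cust-1", "comp-A"))
second = registry.get(("cust-1", "comp-A"))
other = registry.get(("cust-1", "comp-B"))
print(first, other, len(created))
```

Caching matters because creating an engine (and its connection pool) on every request would be expensive; a real registry would also need a way to dispose of engines for tenants that are removed.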
Routers to update (24 files):
| Router | File | Size | Complexity |
|---|---|---|---|
| auth | api/auth.py | 14.6 KB | Already resolved in Doc 01 |
| employees | api/employees.py | 11.8 KB | Medium |
| contracts | api/contracts.py | 11.8 KB | Medium |
| certifications | api/certifications.py | 12.3 KB | Medium |
| trainings | api/trainings.py | 10.2 KB | Medium |
| absences | api/absences.py | 22.3 KB | High (Excel import) |
| hr_documents | api/hr_documents.py | 13.4 KB | High (file upload) |
| notifications | api/notifications.py | 5.6 KB | Low |
| audit | api/audit.py | 4.7 KB | Low |
| repository | api/repository.py | 166 KB | Very high |
| file_assignments | api/file_assignments.py | 23.6 KB | Medium |
| ingest | api/ingest.py | 70.5 KB | High |
| module_permissions | api/module_permissions.py | 20.6 KB | Medium |
| dashboard | api/dashboard.py | 42.3 KB | High |
| admin_tenants | api/admin_tenants.py | 9.1 KB | Evaluate removal |
| admin_users | api/admin_users.py | 16.5 KB | Adapt to Simappe |
| admin_user_tenants | api/admin_user_tenants.py | 17.3 KB | Evaluate removal |
| legal_hold | api/legal_hold.py | 45.0 KB | High |
| document_retention | api/document_retention.py | 69.8 KB | High |
| preservation | api/preservation.py | 54.9 KB | High |
| preservation_messages | api/preservation_messages.py | 4.2 KB | Low |
| oais_integration | api/oais_integration.py | 10.2 KB | Medium |
| cloud | api/cloud.py | 40.9 KB | High |
| tenants | api/tenants.py | 9.8 KB | Evaluate removal |
# ============================================================
# Alembic migration: remove tenant_id from all tables
# ============================================================
"""Remove tenant_id from all models - switch to database-per-tenant
Revision ID: xxxx
"""
from alembic import op
import sqlalchemy as sa


def upgrade():
    # 1. Drop the FK constraints and the tenant_id columns
    tables_with_tenant_id = [
        'employees', 'contracts', 'certifications', 'trainings',
        'absences', 'hr_documents', 'notifications', 'projects',
        'ingests', 'audit_logs', 'legal_holds', 'retention_policies',
        'document_retentions', 'module_permissions', 'file_assignments',
        'preservation_messages',
    ]
    for table in tables_with_tenant_id:
        # Drop the FK constraint first
        # (assumes it is named fk_<table>_tenant_id; adjust if the real name differs)
        op.drop_constraint(
            f'fk_{table}_tenant_id', table, type_='foreignkey'
        )
        # Then drop the column
        op.drop_column(table, 'tenant_id')

    # 2. Drop the local tenancy tables
    op.drop_table('user_tenants')
    op.drop_table('tenants')


def downgrade():
    # Recreate the tenancy tables (rollback)
    op.create_table('tenants',
        sa.Column('id', sa.UUID(), primary_key=True),
        sa.Column('name', sa.String(255), nullable=False),
        sa.Column('created_at', sa.DateTime()),
        sa.Column('updated_at', sa.DateTime()),
    )
    # ... recreate user_tenants and the tenant_id columns
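The upgrade above assumes every foreign key is named `fk_<table>_tenant_id`. If the constraints were created without explicit names, the database will have chosen its own (PostgreSQL, for example, typically generates `<table>_<column>_fkey`) and `drop_constraint` will fail. One possible safeguard, sketched here under assumptions, is to look the names up at migration time. The `tenant_fk_names` helper and the sample rows are hypothetical; the helper operates on the list of dictionaries that SQLAlchemy's `Inspector.get_foreign_keys()` returns (each with `name` and `constrained_columns` keys), so it can be exercised without a database.

```python
from typing import Any, Dict, List, Optional


def tenant_fk_names(foreign_keys: List[Dict[str, Any]], column: str = "tenant_id") -> List[str]:
    """Return the names of FK constraints that constrain only `column`."""
    names: List[str] = []
    for fk in foreign_keys:
        name: Optional[str] = fk.get("name")
        # Unnamed constraints (name is None) cannot be dropped by name.
        if name and fk.get("constrained_columns") == [column]:
            names.append(name)
    return names


# Sample data shaped like Inspector.get_foreign_keys("employees") output.
sample = [
    {"name": "employees_tenant_id_fkey", "constrained_columns": ["tenant_id"],
     "referred_table": "tenants", "referred_columns": ["id"]},
    {"name": "employees_manager_id_fkey", "constrained_columns": ["manager_id"],
     "referred_table": "employees", "referred_columns": ["id"]},
]

print(tenant_fk_names(sample))
```

Inside `upgrade()`, the input would come from `sa.inspect(op.get_bind()).get_foreign_keys(table)`, and each returned name would be passed to `op.drop_constraint`.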
WARNING: This migration must be run against EACH tenant DB separately. It is not run against a central DB.
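Running the migration once per tenant database can be scripted. The following is a minimal sketch, not a finished tool: the function names, the `alembic.ini` path and the example URLs are assumptions. It applies the migration to each URL in turn, records failures instead of stopping at the first one, and takes the runner as a parameter so the loop can be tested without a database. The default runner uses Alembic's Python API (`Config`, `set_main_option`, `command.upgrade`).

```python
from typing import Callable, Dict, Iterable


def alembic_upgrade(db_url: str, ini_path: str = "alembic.ini") -> None:
    """Upgrade one database to head via Alembic's Python API."""
    # Imported lazily so the orchestration below works without Alembic installed.
    from alembic import command
    from alembic.config import Config

    config = Config(ini_path)
    config.set_main_option("sqlalchemy.url", db_url)
    command.upgrade(config, "head")


def migrate_all_tenants(
    db_urls: Iterable[str],
    run_migration: Callable[[str], None] = alembic_upgrade,
) -> Dict[str, str]:
    """Run the migration on every tenant DB; return {url: error} for failures."""
    failures: Dict[str, str] = {}
    for url in db_urls:
        try:
            run_migration(url)
        except Exception as exc:  # keep going so one bad tenant does not block the rest
            failures[url] = str(exc)
    return failures


# Demonstration with a fake runner (hypothetical URLs, no real databases).
migrated = []


def fake_runner(url: str) -> None:
    if url.endswith("vault_broken"):
        raise RuntimeError("connection refused")
    migrated.append(url)


result = migrate_all_tenants(
    [
        "postgresql://host/vault_abc",
        "postgresql://host/vault_broken",
        "postgresql://host/vault_xyz",
    ],
    run_migration=fake_runner,
)
print(migrated, result)
```

The default runner only takes effect if the project's `env.py` reads `sqlalchemy.url` from the Alembic config, as the stock template does; a customized `env.py` would need its own way of receiving the tenant URL.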
# ============================================================
# Script: create the DB for a new tenant
# ============================================================
from sqlalchemy.ext.asyncio import create_async_engine


async def initialize_tenant_database(db_config: TenantDatabaseConfig):
    """
    Creates the schema and tables in a new tenant's DB.
    The tables are created directly from the models, then the Alembic
    migrations are marked as applied up to the latest revision.
    """
    # 1. Create a temporary engine for the schema setup
    engine = create_async_engine(db_config.asyncpg_url)

    # 2. Create the tables
    async with engine.begin() as conn:
        await conn.run_sync(Base.metadata.create_all)

    # 3. Mark the migrations as applied
    #    (the tables are already in their final state),
    #    for example with `alembic stamp head`; not implemented here
    await engine.dispose()
| # | Risk | Control | Verification |
|---|---|---|---|
| R1 | A router forgets to switch get_db to get_tenant_db | Global search: grep -r "get_db" api/ must return 0 results (get_tenant_db does not match this pattern) | CI/CD script |
| R2 | A model still declares tenant_id | Search: grep -r "tenant_id" models/ must return 0 results | CI/CD script |
| R3 | A query keeps a residual tenant_id filter | Search: grep -r "tenant_id" api/ must return 0 results | CI/CD script |
| R4 | The Alembic migration fails on some tenant DB | Run in dry-run mode first. Rollback available | Prior test on a test DB |
| R5 | Orphaned data after removing tenant_id | The migration drops only the column, not the rows. The data already lives in the correct DB | Manual verification |
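The grep-based controls in R1 to R3 can be bundled into one script for CI. What follows is a sketch under assumptions: the `find_residuals` function, the pattern list and the throwaway files are illustrative. Word boundaries are used so that `get_db` is flagged as a whole identifier.

```python
import re
import tempfile
from pathlib import Path
from typing import Dict, List

# Identifiers that should no longer appear once the migration is complete.
RESIDUAL_PATTERNS = {
    "get_db": re.compile(r"\bget_db\b"),
    "tenant_id": re.compile(r"\btenant_id\b"),
    "X-Tenant-ID": re.compile(r"X-Tenant-ID"),
}


def find_residuals(root: Path) -> Dict[str, List[str]]:
    """Map each pattern name to 'file:line' locations under root."""
    hits: Dict[str, List[str]] = {name: [] for name in RESIDUAL_PATTERNS}
    for path in sorted(root.rglob("*.py")):
        lines = path.read_text(encoding="utf-8").splitlines()
        for number, line in enumerate(lines, start=1):
            for name, pattern in RESIDUAL_PATTERNS.items():
                if pattern.search(line):
                    hits[name].append(f"{path.name}:{number}")
    return hits


# Self-check against a throwaway directory with one migrated and one stale file.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "migrated.py").write_text("db = Depends(get_tenant_db)\n", encoding="utf-8")
    (root / "stale.py").write_text(
        "db = Depends(get_db)\nq = q.filter(Employee.tenant_id == tenant_id)\n",
        encoding="utf-8",
    )
    report = find_residuals(root)

print(report)
```

In CI, the job would fail whenever any list in the report is non-empty, pointing reviewers at the exact file and line.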
1. Rewrite middleware/multitenancy.py (ContextVar)
   └── Breaks nothing: the old middleware still exists
2. Add TenantContextMiddleware to main.py
   └── Injects the context, but nothing uses it yet
3. Update the models (remove tenant_id)
   └── Requires an Alembic migration on a test DB
4. Update the routers (get_db → get_tenant_db)
   └── Migrate one at a time, starting with the simplest:
       a) notifications (5.6 KB)
       b) audit (4.7 KB)
       c) preservation_messages (4.2 KB)
       d) employees (11.8 KB)
       ... through repository (166 KB, last)
5. Drop the Tenant and UserTenant tables
   └── Only once ALL routers are migrated
6. Verify: grep for residual tenant_id, X-Tenant-ID, get_db
| Version | Date | Author | Description |
|---|---|---|---|
| 1.0.0 | 2026-03-25 | Carlos Torres | Created the multi-tenancy migration document |