rpmjp/portfolio
rpmjp/projects/skillbridge/database-design.md
CompletedFebruary – April 2026

SkillBridge AI — Test Prep Academy Platform

Multi-tenant AI learning platform for test prep academies. Socratic AI tutor that refuses to give answers, AI quiz generation from uploaded files, async pre-grading, weak-spot detection, and AI parent summaries. Full role-based platform across admin, instructor, student, and parent.

LIVE DEMO
Python 3.12FastAPIPostgreSQL 16SQLAlchemy 2.0ReactTypeScriptTailwindGroq / Llama
Languages
Python54.2%
TypeScript45.1%
CSS0.3%
JavaScript0.2%
Dockerfile0.1%
Mako0.1%
database-design.md

Database Design

PostgreSQL 16, accessed through SQLAlchemy 2.0, migrated with Alembic. The schema is built around three principles: multi-tenant isolation, JSONB for data that evolves, and soft deletes for anything an academy might want back.


Core entities

academies        (id, slug, name, created_at)            -- the tenant
users            (id, tenant_id FK, email, password_hash, role, ...)
                 -- role: admin | instructor | student | parent
courses          (id, tenant_id FK, title, description, instructor_id FK, ...)
enrollments      (id, tenant_id FK, course_id FK, student_id FK, ...)
assignments      (id, tenant_id FK, course_id FK, title, instructions,
                  max_points, due_at, tutor_mode, rubric JSONB, ...)
submissions      (id, tenant_id FK, assignment_id FK, student_id FK,
                  content, files, grade, feedback, status, ...)
quizzes          (id, tenant_id FK, title, questions JSONB,
                  standard_id, topic_label, ...)
quiz_submissions (id, tenant_id FK, quiz_id FK, student_id FK,
                  answers JSONB, ai_suggestions JSONB, ...)
tutor_sessions   (id, tenant_id FK, assignment_id FK, student_id FK,
                  messages JSONB, mode, ...)
study_plans      (id, tenant_id FK, student_id FK, plan JSONB, ...)
parent_links     (id, tenant_id FK, parent_id FK, student_id FK)
exam_questions   (id, type, prompt, options JSONB, answer, ...)
gamification     (id, tenant_id FK, user_id FK, xp, badges JSONB, ...)

Key design decisions

tenant_id on every domain row. The isolation foundation. See multi-tenancy.md for the full reasoning. Every table above (except the shared exam_questions bank) carries it.

JSONB for data that evolves. Several columns are JSONB rather than normalized tables:

  • assignments.rubric — criteria are a list of {name, max_points, description}. Rubric shape varies per assignment; JSONB absorbs that without a rubric_criteria join table and the migrations it would need.
  • quizzes.questions — a quiz is a list of questions with per-type shapes (multiple-choice has options, short-answer has a reference answer). JSONB stores the whole structure; normalizing it into questions + options tables would add joins and rigidity for data that's always read and written as a unit.
  • tutor_sessions.messages — the full chat thread. Append-only, always read whole. A messages table would be over-engineering.
  • study_plans.plan, quiz_submissions.answers, quiz_submissions.ai_suggestions, gamification.badges — same logic.

The rule: if the data is always read and written as a single unit and its shape varies, JSONB. If it's queried and filtered across rows, a real column.

String columns over Postgres ENUM types for status and role fields. Adding a new submission status or user role shouldn't require an ALTER TYPE migration with its locking behavior — it's an application-layer change. String columns keep deploys simple. The application validates the allowed values via Pydantic and Python enums; the database stores strings.

Soft delete via deleted_at. Deleting a course, assignment, or user sets a timestamp rather than removing the row. Queries filter deleted_at IS NULL by default. This means an accidental delete is recoverable and the audit trail survives — important for an academy that might delete a course and then want last semester's data back.


The exam question bank is shared, not tenant-scoped

One deliberate exception to tenant scoping: exam_questions. The SAT and GRE question banks are standardized content — every academy prepping students for the GRE wants the same GRE questions. Scoping them per-tenant would mean duplicating identical questions across every academy. The bank is shared platform content; only the students' attempts at those questions are tenant-scoped.

The bank is populated by import scripts (import_question_bank.py, import_gre_question_bank.py) and includes GRE question types beyond plain multiple-choice — quantitative comparison and multiple-select. See exam-engine.md.