Impact Comparison Specification
Overview
Impact Comparison measures the change in survey scores over time within a single survey. By comparing a respondent's first attempt (baseline) to their latest attempt, organizations can quantify whether their programs are creating measurable change.
All comparisons happen within one survey. Cross-survey comparison is not supported because questions, order, and context differ between surveys, making the comparison statistically invalid.
Core Concept
A respondent may take a survey multiple times (attempt 1, 2, 3...). The impact comparison always uses: ✅
- Baseline: Attempt #1 (the first time they took it)
- Latest: The most recent completed attempt
If a respondent has only one attempt, they are excluded from the comparison (no delta to calculate).
If a respondent has three or more attempts, only the first and last are compared. Intermediate attempts are not used in the impact calculation (they can be viewed in the respondent's individual history).
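The selection rule above can be sketched as follows. This is a hypothetical helper, not the actual implementation; attempt records are assumed to carry an `attempt_number` field:

```python
def select_comparison_attempts(attempts):
    """Pick the baseline (attempt #1) and latest attempt for impact comparison.

    attempts: list of dicts with an 'attempt_number' key, in any order.
    Returns (baseline, latest), or None if the respondent has fewer than
    two completed attempts (excluded from the comparison).
    """
    ordered = sorted(attempts, key=lambda a: a["attempt_number"])
    if len(ordered) < 2:
        return None  # single attempt: no delta to calculate
    # Intermediate attempts are intentionally ignored.
    return ordered[0], ordered[-1]
```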
What Gets Compared
Per Respondent ✅
For each respondent who has 2+ completed attempts:
- Each scored question's answer from attempt #1 vs their latest attempt
- Per-question delta (absolute and percentage change)
- Overall score delta
Aggregate (Organization Level) ✅
Across all respondents with 2+ attempts:
- Average baseline score (mean of all first attempts)
- Average latest score (mean of all latest attempts)
- Overall impact delta
- Per-question average delta
- Number of respondents included in comparison
- Number of respondents excluded (only 1 attempt)
Calculation
Per-Question Delta ✅
question_delta = latest_score - baseline_score
question_delta_percent = (question_delta / baseline_score) * 100 (only when baseline > 0; if the baseline score is 0, the percentage is undefined and should be reported as null)
Overall Survey Delta
respondent_baseline = sum(question_scores_attempt_1) / sum(max_possible_scores)
respondent_latest = sum(question_scores_latest) / sum(max_possible_scores)
respondent_delta = respondent_latest - respondent_baseline
overall_delta = mean(all respondent_deltas)
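The formulas above translate directly into code. A minimal sketch (function names are illustrative, not from the actual codebase):

```python
def question_delta(baseline_score, latest_score):
    """Per-question delta; percentage is None when the baseline is 0."""
    delta = latest_score - baseline_score
    percent = (delta / baseline_score) * 100 if baseline_score > 0 else None
    return delta, percent

def respondent_delta(attempt1_scores, latest_scores, max_scores):
    """Normalized overall delta for one respondent (0-1 scale)."""
    baseline = sum(attempt1_scores) / sum(max_scores)
    latest = sum(latest_scores) / sum(max_scores)
    return latest - baseline

def overall_delta(respondent_deltas):
    """Mean of all per-respondent deltas."""
    return sum(respondent_deltas) / len(respondent_deltas)
```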
Direction ✅
if delta > 0.01: "improved"
if delta < -0.01: "declined"
else: "unchanged"
The 0.01 threshold prevents floating point noise from showing as change.
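The direction rule is a straightforward three-way classification; sketched as:

```python
CHANGE_THRESHOLD = 0.01  # absorbs floating point noise

def classify_direction(delta):
    """Map a delta to 'improved', 'declined', or 'unchanged'."""
    if delta > CHANGE_THRESHOLD:
        return "improved"
    if delta < -CHANGE_THRESHOLD:
        return "declined"
    return "unchanged"
```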
Real-Time Computation
Impact comparison is computed on demand, not pre-calculated. When an admin requests the impact report for a survey, the system:
- Fetches all completed responses for that survey
- Groups by respondent (email or user_id)
- For each respondent with 2+ attempts, takes first and last
- Computes per-question and overall deltas
- Aggregates across all respondents
- Returns the result
This is acceptable for up to ~1000 respondents; no caching or background jobs are needed at the current scale.
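The steps above can be sketched end to end. Field names (`respondent_id`, `attempt_number`, `score`) are assumptions about the response shape, and `score` is taken to be already normalized to 0-1:

```python
from collections import defaultdict

def compute_impact(responses):
    """Group completed responses by respondent, compare first vs. latest
    attempt, and aggregate deltas across all eligible respondents."""
    by_respondent = defaultdict(list)
    for r in responses:
        by_respondent[r["respondent_id"]].append(r)

    deltas, excluded = [], 0
    for attempts in by_respondent.values():
        attempts.sort(key=lambda a: a["attempt_number"])
        if len(attempts) < 2:
            excluded += 1  # single attempt: no delta to calculate
            continue
        deltas.append(attempts[-1]["score"] - attempts[0]["score"])

    return {
        "overall_delta": sum(deltas) / len(deltas) if deltas else None,
        "respondents_included": len(deltas),
        "respondents_excluded": excluded,
    }
```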
Survey Immutability Rule
Once a survey has received responses, its questions cannot be modified.
Changing questions after people have responded would invalidate the comparison (answers to the old question are no longer comparable to answers to the new question). If an organization needs to change questions:
- They must delete all existing responses first
- Or create a new survey with the updated questions
The system should enforce this: any PUT to a survey's questions returns an error if response_count > 0. ✅
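A minimal sketch of that enforcement check, framework-agnostic (the error type and survey shape are hypothetical; in the real API this would surface as an HTTP error on the PUT):

```python
class SurveyLockedError(Exception):
    """Raised when editing questions on a survey that already has responses."""

def update_questions(survey, new_questions):
    """Apply the immutability rule before accepting a question edit."""
    if survey["response_count"] > 0:
        raise SurveyLockedError(
            "Questions cannot be modified after responses exist; "
            "delete all responses or create a new survey."
        )
    survey["questions"] = new_questions
    return survey
```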
API
Get Impact Report for a Survey
GET /api/v1/surveys/{survey_id}/impact
Returns the impact comparison computed in real time.
Response:
{
  "survey_id": "abc123",
  "survey_title": "Community Health Assessment",
  "total_respondents": 50,
  "respondents_with_multiple_attempts": 35,
  "respondents_excluded": 15,
  "baseline_avg_score": 45.2,
  "latest_avg_score": 72.1,
  "overall_delta": 26.9,
  "overall_delta_percent": 59.5,
  "per_question": [
    {
      "question_text": "How would you rate your access to healthcare?",
      "question_id": "q1",
      "baseline_avg": 2.1,
      "latest_avg": 4.3,
      "delta": 2.2,
      "delta_percent": 104.8,
      "direction": "improved"
    },
    {
      "question_text": "How often do you visit a healthcare provider?",
      "question_id": "q2",
      "baseline_avg": 3.0,
      "latest_avg": 3.1,
      "delta": 0.1,
      "delta_percent": 3.3,
      "direction": "improved"
    }
  ],
  "summary": "Overall impact improved by 59.5% across 35 respondents with multiple attempts. Strongest improvement in 'Access to healthcare' (+2.2). 15 respondents had only one attempt and were excluded.",
  "computed_at": "2026-03-29T01:00:00Z"
}
Access Control
- Only OWNER and ADMIN of the survey's organization can view impact reports
- Respondents can see their own individual attempt history (already exists via sessions)
Frontend
Impact Report Page (/surveys/{id}/impact) ✅
Stats Cards (top row):
- Total respondents
- Respondents with 2+ attempts
- Overall baseline avg
- Overall latest avg
- Impact delta (with direction arrow)
Per-Question Table:
| Question | Baseline Avg | Latest Avg | Delta | Direction |
|---|---|---|---|---|
| Access to healthcare | 2.1 | 4.3 | +2.2 | Improved |
| Visit frequency | 3.0 | 3.1 | +0.1 | Improved |
Visual Bar Chart: Horizontal bars showing baseline (gray) vs latest (colored) for each question, making improvement visually obvious.
Summary Text: Auto-generated narrative paragraph describing the overall impact, strongest improvements, and any declines.
Respondent Breakdown (expandable): List of individual respondents showing their personal baseline → latest delta. Allows drilling into who improved most and who needs more support.
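The auto-generated summary text described above could be assembled from the aggregate fields. A hypothetical sketch (parameter names are assumptions, mirroring the API response fields):

```python
def build_summary(delta_percent, included, excluded,
                  strongest_question, strongest_delta):
    """Compose the narrative summary from aggregate impact figures."""
    direction = "improved" if delta_percent > 0 else "declined"
    parts = [
        f"Overall impact {direction} by {abs(delta_percent):.1f}% "
        f"across {included} respondents with multiple attempts.",
        f"Strongest improvement in '{strongest_question}' "
        f"({strongest_delta:+.1f}).",
        f"{excluded} respondents had only one attempt and were excluded.",
    ]
    return " ".join(parts)
```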
What This Replaces
The current impact/comparisons API (which compares two separate surveys by matching question text) should be replaced with this single-survey approach. The cross-survey comparison is invalid because:
- Questions created independently are not the same question, even if the text matches
- Question order and surrounding context affect responses
- Modifying one survey's questions silently breaks the text matching
- There is no guarantee the same respondents took both surveys
Relationship to Existing Features
- Survey Sessions provide the multi-attempt infrastructure (attempt numbering, per-user history)
- Scoring System provides the per-question score calculation
- Impact Comparison adds the aggregate first-vs-latest delta across all respondents
- Maslow Analytics already does per-user attempt comparison; this extends the pattern to all surveys at the org level