Analytics & Statistics

F20.04 11 features

At a glance

Analytics & Statistics turns raw survey responses into a publishable evidence base. It combines descriptive statistics, inferential tests, NPS calculation, verbatim text analysis, and ranking aggregation in a real-time dashboard. A no-code query builder supports ad-hoc crosstabs and segment exploration, while small-cell suppression and privacy-aware CSV/XLSX/JSON exports support GDPR compliance from the first chart to the final dataset shared with stakeholders.

How it works

Every submitted response is streamed into an analytics pipeline that recomputes per-question statistics in near real time. Numeric and Likert questions surface mean, median, standard deviation, and a 95% confidence interval; NPS questions render the classic split into promoters (9-10), passives (7-8), and detractors (0-6) with the score and trend over time. Categorical relationships are explored with Chi-square tests and Cramer's V to quantify association strength, while numeric pairs use Pearson and Spearman correlations rendered as an interactive matrix with significance shading.
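The per-question figures described above can be sketched in a few lines. This is a minimal illustration, not the pipeline's actual code: the function names `describe` and `nps` are hypothetical, the confidence interval assumes a normal approximation (z = 1.96), and the inputs are assumed to be non-empty lists of numbers.

```python
import math
import statistics


def describe(values, z=1.96):
    """Mean, median, sample standard deviation and an approximate 95% CI.

    Assumption: a normal-approximation interval (mean +/- z * sd / sqrt(n));
    the product may use a different method, e.g. a t-interval.
    """
    n = len(values)
    mean = statistics.fmean(values)
    # Sample standard deviation is undefined for a single value; report 0.0.
    sd = statistics.stdev(values) if n > 1 else 0.0
    half_width = z * sd / math.sqrt(n)
    return {
        "n": n,
        "mean": mean,
        "median": statistics.median(values),
        "stddev": sd,
        "ci95": (mean - half_width, mean + half_width),
    }


def nps(scores):
    """Classic NPS: percentage of promoters (9-10) minus percentage of detractors (0-6)."""
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    passives = len(scores) - promoters - detractors
    score = 100.0 * (promoters - detractors) / len(scores)
    return {
        "promoters": promoters,
        "passives": passives,
        "detractors": detractors,
        "score": score,
    }


likert = [1, 2, 3, 4, 5]
recommend = [10, 9, 9, 8, 7, 6, 3, 10, 2, 9]
print(describe(likert))
print(nps(recommend))  # 5 promoters, 2 passives, 3 detractors, score 20.0
```

Passives count toward the denominator but not the numerator, which is why a survey with many 7-8 answers drifts toward a score of zero.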

Segment breakdowns let analysts pivot any question by age band, gender, region, license type, or discipline. The dashboard plots side-by-side distributions and applies a small-cell suppression rule (k-anonymity, n<5) that automatically masks segments too small to publish without re-identification risk. Open-ended text answers feed a text-analysis module that produces TF-IDF term scores, k-means clusters of similar verbatims, and word-frequency clouds so qualitative themes surface alongside quantitative results. Ranking questions are aggregated with the Borda count, which scores every position on each ballot and so yields a collective preference order that is less sensitive to vote splitting than a first-choice tally.
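The small-cell rule can be illustrated as follows. This is a sketch under stated assumptions: `suppress_small_cells` is a hypothetical name, the threshold k = 5 mirrors the n<5 rule above, and masked cells are represented as `None`. It does not attempt complementary suppression, where a published total could let a reader back out a masked count.

```python
from collections import Counter


def suppress_small_cells(segment_values, k=5, mask=None):
    """Count respondents per segment and mask any segment with fewer than k.

    segment_values: one segment label per response, e.g. a region name.
    Returns {segment: count or mask}.
    """
    counts = Counter(segment_values)
    return {segment: (n if n >= k else mask) for segment, n in counts.items()}


regions = ["North"] * 12 + ["South"] * 7 + ["Islands"] * 3
print(suppress_small_cells(regions))
# {'North': 12, 'South': 7, 'Islands': None}
```

When several segments are masked, merging them into a coarser category before counting is one way to keep them publishable.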

For ad-hoc work, a query builder exposes filters, breakdowns, and crosstab axes in a no-code form; queries can be saved, shared, or pinned to the dashboard. Exports are available as CSV, XLSX, and JSON, and they re-apply the survey's privacy mode at export time: anonymous exports strip the user_id column, pseudonymous exports replace user_id with its hash, and attributed exports require explicit role permission and are logged for audit. All charts include sample-size badges and confidence-interval whiskers so consumers can judge reliability at a glance.
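The three privacy modes at export time could look roughly like this. Everything beyond the `user_id` column and the mode names is an assumption made for illustration: the function name, the `salt`, `can_export_attributed`, and `audit_log` parameters are hypothetical, and salted SHA-256 stands in for whatever hash the platform actually uses.

```python
import hashlib


def apply_privacy_mode(rows, mode, salt="", can_export_attributed=False, audit_log=None):
    """Return a copy of rows (list of dicts) with user_id handled per privacy mode."""
    if mode == "anonymous":
        # Drop the identifier column entirely.
        return [{k: v for k, v in row.items() if k != "user_id"} for row in rows]
    if mode == "pseudonymous":
        # Replace the identifier with a hash (assumed here: salted SHA-256).
        pseudonymised = []
        for row in rows:
            raw = (salt + str(row["user_id"])).encode("utf-8")
            pseudonymised.append({**row, "user_id": hashlib.sha256(raw).hexdigest()})
        return pseudonymised
    if mode == "attributed":
        if not can_export_attributed:
            raise PermissionError("attributed export requires explicit role permission")
        if audit_log is not None:
            # Hypothetical audit record; a real system would also log who and when.
            audit_log.append({"event": "attributed_export", "rows": len(rows)})
        return [dict(row) for row in rows]
    raise ValueError(f"unknown privacy mode: {mode!r}")


rows = [{"user_id": 42, "q1": 5}]
print(apply_privacy_mode(rows, "anonymous"))  # [{'q1': 5}]
```

Applying the mode at export time, rather than trusting how the data was stored, means a saved query cannot leak identifiers if the survey's privacy mode is tightened later.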

Key capabilities

Descriptive statistics with 95% confidence intervals and classic NPS scoring
Correlation matrix (Pearson/Spearman) and Cramer's V for association strength
Chi-square tests for segment independence with significance highlighting
Segment breakdowns by age, gender, region, license type, and discipline
k-anonymity small-cell suppression (n<5) for GDPR-safe publication
Text analysis (TF-IDF, k-means, frequency) and Borda count for rankings
Ad-hoc query builder with crosstabs and CSV/XLSX/JSON export honoring privacy mode
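For the categorical tests listed above, the chi-square statistic and Cramer's V can be computed directly from a crosstab of observed counts. The sketch below uses hypothetical function names and assumes every row and column total is non-zero. It returns the statistic and degrees of freedom only; turning that into a p-value needs the chi-square distribution, which is not in the Python standard library.

```python
import math


def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table.

    table: list of rows, each a list of observed counts.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, dof


def cramers_v(table):
    """Cramer's V: sqrt(chi2 / (n * (min(rows, cols) - 1))), ranging from 0 to 1."""
    chi2, _ = chi_square(table)
    n = sum(sum(row) for row in table)
    min_dim = min(len(table), len(table[0])) - 1
    return math.sqrt(chi2 / (n * min_dim))


# Rows: two age bands; columns: satisfied / not satisfied (made-up counts).
crosstab = [[10, 20], [20, 10]]
stat, dof = chi_square(crosstab)
print(round(stat, 3), dof, round(cramers_v(crosstab), 3))  # 6.667 1 0.333
```

With one degree of freedom the 5% critical value is 3.841, so this made-up table would be flagged as significant, while a V of about 0.33 indicates a moderate rather than a strong association.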

In practice

A federation analyst opens the dashboard the morning after closing a consultation. NPS sits at +24 with a clear upward trend across four weeks of fieldwork. She pivots satisfaction by region and the small-cell rule masks two micro-regions, prompting her to widen the breakdown to NUTS-2.

A correlation matrix shows a strong Spearman link between perceived officiating quality and willingness to renew licenses, and a Chi-square test confirms the relationship is significant across age bands. She drops the open-ended verbatims into the text analysis module, identifies a cluster about junior coaching, builds an ad-hoc crosstab to size it by club, and exports a privacy-aware XLSX for the board pack.

Features in this subsystem

ID	Status	Features
F20.04.01	Shipped	Descriptive statistics — mean, median, stddev, 95% CI per question ✅ PL-T079
F20.04.02	Shipped	NPS calculation — classic Net Promoter Score with promoters/passives/detractors ✅ PL-T079
F20.04.03	Shipped	Correlation matrix — Pearson/Spearman between numeric questions ✅ PL-T079
F20.04.04	Shipped	Cramer's V — categorical association strength ✅ PL-T079
F20.04.05	Shipped	Chi-square test — independence test for segment breakdowns ✅ PL-T079
F20.04.06	Shipped	Segment breakdowns — pivot by age, gender, region, license type, discipline ✅ PL-T079
F20.04.07	Shipped	Small-cell suppression — k-anonymity (n<5) for GDPR compliance ✅ PL-T079
F20.04.08	Shipped	Text analysis — TF-IDF, k-means clustering, word frequency ✅ PL-T079
F20.04.09	Shipped	Borda count — ranking aggregation for preference questions ✅ PL-T079
F20.04.10	Shipped	Query builder — ad-hoc analytics queries with crosstab support ✅ PL-T079
F20.04.11	Shipped	Export — CSV, XLSX, JSON export with privacy mode enforcement ✅ PL-T079
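As an illustration of the ranking aggregation in F20.04.09, a basic Borda count can be sketched as below. The function name is hypothetical, and the sketch assumes the standard variant in which each ballot ranks all m options and awards m-1 points to first place down to 0 for last; how the product treats partial rankings or ties is not specified here.

```python
from collections import defaultdict


def borda_count(ballots):
    """Aggregate ranked ballots into a collective order.

    ballots: list of rankings, each a list of options from most to least preferred.
    Returns [(option, points), ...] sorted by points, ties broken alphabetically.
    """
    scores = defaultdict(int)
    for ballot in ballots:
        m = len(ballot)
        for position, option in enumerate(ballot):
            scores[option] += m - 1 - position
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


ballots = [
    ["A", "B", "C"],
    ["B", "A", "C"],
    ["B", "C", "A"],
]
print(borda_count(ballots))  # [('B', 5), ('A', 3), ('C', 1)]
```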

Related subsystems

F20.03 Response Collection (20. Surveys & Research)
F20.05 Quality Control (20. Surveys & Research)
F20.06 Cross-Tenant Federation (20. Surveys & Research)
