GTPS-CV Methodology - AKTI SURVEY

Executive Summary

GTPS-CV is a field-driven survey methodology designed to produce representative, manipulation-resistant data in environments where digital surveys are vulnerable to farming, outsourcing, or coordinated bias. Version 1.0 introduces structured sampling protocols, expanded demographic capture, situational location classification, and statistical weighting frameworks to achieve population representativeness.

Core Principles

Trust earned through performance — not inherited or assumed
Geographic diversity as a quality signal — preventing regional capture
Cross-validation among coordinators — detecting anomalies through comparison
Stratified sampling with post-stratification weighting — achieving representativeness
Diversity over volume — rewarding coverage breadth, not response counts

Part 1: Coordinator Structure and Trust System

1.1 Trust Hierarchy

The platform operates through a hierarchical coordinator network:

Tier	Trust Range	Appointment	Capacity
Tier 1 (Anchor)	90–100	Site-appointed	10–20 coordinators
Tier 2	70–89	Recruited by Tier 1	Up to 10 per parent
Tier 3	50–69	Recruited by Tier 2	Up to 10 per parent
Tier 4 (Field)	30–49	Recruited by Tier 3	Up to 10 per parent

1.2 Trust Initialization

New coordinators are initialized at:

Initial Trust = min(Parent Trust × 0.85, Tier Maximum) - Admin Adjustment

Where:
- Parent Trust × 0.85 starts every recruit at least 15% below the parent's trust score
- Tier Maximum caps trust at tier ceiling
- Admin Adjustment allows manual reduction (0–20 points)
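
As a minimal sketch of the initialization rule above, the function below applies the formula directly. The name initial_trust and its argument names are illustrative and not part of the specification; the tier ceilings in the examples come from the table in 1.1.

```python
def initial_trust(parent_trust: float, tier_max: float,
                  admin_adjustment: float = 0.0,
                  decay_rate: float = 0.85) -> float:
    """Initial Trust = min(Parent Trust x decay_rate, Tier Maximum) - Admin Adjustment."""
    if not 0 <= admin_adjustment <= 20:
        raise ValueError("admin_adjustment must be between 0 and 20 points")
    return min(parent_trust * decay_rate, tier_max) - admin_adjustment


# Tier 1 parent (trust 95) recruiting into Tier 2 (ceiling 89): decay applies, the cap does not.
tier2_recruit = initial_trust(95, 89)                      # about 80.75
# Tier 2 parent (trust 89) recruiting into Tier 3 (ceiling 69): the tier ceiling applies.
tier3_recruit = initial_trust(89, 69)                      # 69
# The same Tier 3 recruit with a 5-point manual reduction.
tier3_reduced = initial_trust(89, 69, admin_adjustment=5)  # 64
```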

1.3 Trust Dynamics

Trust scores update after each survey period based on four factors:

Factor	Weight	Measurement
Cross-validation consistency	35%	Deviation from area peers
Geographic cluster diversity	30%	Unique location clusters covered
Demographic cluster diversity	20%	Spread across demographic cells
Protocol compliance	15%	Geo-fence adherence, timing rules

Trust Update Formula:

New Trust = Current Trust + (Performance Score - 50) × Learning Rate × Trust Modifier

Where:
- Performance Score = weighted sum of factors (0–100)
- Learning Rate = 0.1 (slow adjustment)
- Trust Modifier = 1.0 for Trust > 70, 1.2 for Trust ≤ 70 (scores at or below 70 move 20% faster, which speeds recovery but also amplifies penalties)

Trust is bounded: minimum 10, maximum 100. Coordinators falling below 20 are flagged for review.
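
The update rule and bounds above can be sketched as follows. The function names and the dictionary keys for the four factors are assumptions made for illustration; each factor is assumed to be scored on a 0–100 scale before weighting.

```python
# Factor weights from the table in 1.3 (they sum to 1.0).
FACTOR_WEIGHTS = {
    "cross_validation": 0.35,
    "geo_diversity": 0.30,
    "demo_diversity": 0.20,
    "protocol_compliance": 0.15,
}


def performance_score(factors: dict) -> float:
    """Weighted sum of the four factor scores, each on a 0-100 scale."""
    return sum(FACTOR_WEIGHTS[name] * factors[name] for name in FACTOR_WEIGHTS)


def update_trust(current: float, performance: float, learning_rate: float = 0.1) -> float:
    """Apply the trust update formula, then clamp the result to the 10-100 bounds."""
    modifier = 1.0 if current > 70 else 1.2
    updated = current + (performance - 50) * learning_rate * modifier
    return max(10.0, min(100.0, updated))


def flagged_for_review(trust: float) -> bool:
    """Coordinators falling below 20 are flagged for review."""
    return trust < 20


score = performance_score({
    "cross_validation": 90, "geo_diversity": 80,
    "demo_diversity": 70, "protocol_compliance": 100,
})                                   # 31.5 + 24 + 14 + 15 = 84.5
new_trust = update_trust(60, score)  # 60 + 34.5 * 0.1 * 1.2, about 64.14
```

A performance score of exactly 50 leaves trust unchanged, so 50 acts as the neutral point of the scale.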

Part 2: Sampling Framework

2.1 Location Selection Protocol

Representativeness begins with systematic location selection, not convenience.

Primary Sampling Units (PSUs)

Each survey defines PSUs based on:

Administrative boundaries (districts, municipalities)
Population density zones
Known demographic distributions

Location Assignment

Coordinators receive location assignments through:

Randomized Grid Assignment: Geographic area divided into grid cells; coordinators assigned random cells
Quota-Based Distribution: Cells weighted by population; more coordinators assigned to denser areas
Rotation Schedule: Assignments rotate weekly to prevent familiarity bias
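
One way to combine the three assignment rules above is a population-weighted random draw that is repeated each week. This is a sketch under stated assumptions: the cell identifiers, population counts, and the assign_cells helper are hypothetical, and the draw is made with replacement, so two coordinators may share a cell.

```python
import random


def assign_cells(coordinators, cell_population, seed=None):
    """Draw one grid cell per coordinator, with probability proportional to population."""
    rng = random.Random(seed)
    cells = list(cell_population)
    weights = [cell_population[c] for c in cells]
    drawn = rng.choices(cells, weights=weights, k=len(coordinators))
    return dict(zip(coordinators, drawn))


# Hypothetical grid cells and population counts, for illustration only.
population = {"cell-A": 12000, "cell-B": 3000, "cell-C": 500}
team = ["c1", "c2", "c3", "c4"]
week1 = assign_cells(team, population, seed=1)
week2 = assign_cells(team, population, seed=2)   # weekly rotation: a fresh draw
```

A production scheduler would likely also prevent a coordinator from drawing the same cell in consecutive weeks, which this sketch does not enforce.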

2.2 Respondent Selection Protocol

Within assigned locations, coordinators follow systematic selection:

Time-Interval Sampling

Begin collection at assigned start time
Approach first eligible adult after arrival
After each completion, wait 3 minutes before next approach
Continue until quota met or time window closes

Systematic Selection Rules

Approach adults (18+) who appear available
If refused, wait 1 minute, approach next eligible person
No targeting based on appearance, dress, or perceived demographics
Record all refusals for response rate calculation

2.3 Situational Location Classification

Every response captures situational context to enable stratification and bias detection.

Location Categories

Category Code	Location Type	Expected Demographics
RES-APT	Apartment complex	Mixed urban
RES-HSE	Residential house area	Suburban/family
EDU-SCH	School vicinity	Parents, staff
EDU-COL	College/University	Young adults 18–25
TRN-STA	Train/Metro station	Commuters, mixed
TRN-RDE	Train/Metro ride	Commuters, mixed
BUS-STP	Bus stop	Mixed, lower-middle income
BUS-RDE	Bus ride	Mixed, lower-middle income
COM-MLL	Shopping mall	Consumers, mixed
COM-MKT	Street market/bazaar	Local community
COM-CAF	Cafe/Restaurant	Urban, varied income
REL-MOS	Mosque vicinity	Muslim community
REL-CHR	Church vicinity	Christian community
REL-TMP	Temple vicinity	Hindu/Buddhist community
REL-OTH	Other religious site	Varies
REC-PRK	Public park	Families, recreation
REC-SPT	Sports facility	Active adults
WRK-OFF	Office district	White-collar workers
WRK-IND	Industrial area	Blue-collar workers
GOV-OFF	Government office	Citizens, bureaucrats
HLT-HSP	Hospital vicinity	Patients, caregivers
OTH	Other (specify)	Requires description

Urban/Rural Classification

Code	Definition	Criteria
URB-1	Metro urban	City population > 1 million
URB-2	Urban	City population 100K–1M
URB-3	Semi-urban	Town population 20K–100K
RUR-1	Rural town	Population 5K–20K
RUR-2	Rural village	Population < 5K
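
The settlement code can be derived from a population figure as sketched below. The ranges in the table share their endpoints (for example 100K appears in both URB-2 and URB-3), so this sketch assumes a population exactly on a shared boundary belongs to the larger category. The function name is illustrative.

```python
def settlement_code(population: int) -> str:
    """Map a settlement's population to an Urban/Rural code.

    Assumption: a population exactly on a shared boundary (100K, 20K, 5K)
    is assigned to the larger category; URB-1 requires strictly more than 1 million.
    """
    if population > 1_000_000:
        return "URB-1"
    if population >= 100_000:
        return "URB-2"
    if population >= 20_000:
        return "URB-3"
    if population >= 5_000:
        return "RUR-1"
    return "RUR-2"
```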

Part 3: Data Collection Specification

3.1 Geographic Data (Geo-Cluster)

Every response captures a three-part geo-cluster:

Component	Field	Source	Required
Physical	GPS Coordinates	Browser GPS	Yes
Situational	Location Type	Coordinator selection	Yes
Settlement	Urban/Rural Code	Derived + Coordinator	Yes

GPS Requirements:

Accuracy threshold: ≤ 50 meters
Responses without GPS are flagged as "unverified"
Unverified responses weighted at 50% in analysis
More than 20% unverified from a coordinator triggers review
IP geolocation is used only to cross-check GPS, never as the primary location source
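
The GPS rules above can be sketched as three small checks. The requirements do not say how to treat a fix that exists but is less accurate than 50 meters; this sketch assumes it is handled like a missing fix. All names are illustrative.

```python
from typing import Optional

GPS_ACCURACY_THRESHOLD_M = 50     # default accuracy threshold
UNVERIFIED_WEIGHT = 0.5           # unverified responses count at 50%
UNVERIFIED_REVIEW_SHARE = 0.20    # review when more than 20% are unverified


def is_verified(accuracy_m: Optional[float]) -> bool:
    """A response is verified when a GPS fix exists and meets the accuracy threshold.

    Assumption: a fix worse than the threshold is treated like a missing fix.
    """
    return accuracy_m is not None and accuracy_m <= GPS_ACCURACY_THRESHOLD_M


def analysis_weight(accuracy_m: Optional[float]) -> float:
    """Weight applied to a response in analysis."""
    return 1.0 if is_verified(accuracy_m) else UNVERIFIED_WEIGHT


def coordinator_needs_review(accuracies: list) -> bool:
    """True when more than 20% of a coordinator's responses are unverified."""
    if not accuracies:
        return False
    unverified = sum(1 for a in accuracies if not is_verified(a))
    return unverified / len(accuracies) > UNVERIFIED_REVIEW_SHARE
```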

Geo-Cluster Example:

{
  "gps": { "lat": 23.8103, "lng": 90.4125, "accuracy": 12 },
  "situational": "TRN-STA",
  "settlement": "URB-1"
}

3.2 Demographic Data (Demo-Cluster)

Four demographic attributes form the demo-cluster:

Gender

Code	Label
M	Male
F	Female
X	Other/Prefer not to say

Age Bracket

Code	Range
A1	18–24
A2	25–34
A3	35–44
A4	45–54
A5	55–64
A6	65+

Occupation Category

Code	Category	Examples
OCC-STU	Student	School, college, university
OCC-UNE	Unemployed/Seeking	Job seekers
OCC-HOM	Homemaker	Primary household managers
OCC-RET	Retired	Pensioners
OCC-AGR	Agriculture/Fishing	Farmers, fishers, laborers
OCC-MAN	Manual/Trade	Construction, factory, drivers
OCC-SVC	Service sector	Retail, hospitality, security
OCC-CLR	Clerical/Office	Admin, data entry, reception
OCC-PRO	Professional	Engineers, doctors, lawyers, teachers
OCC-MGT	Management/Executive	Managers, directors, business owners
OCC-GOV	Government/Public	Civil servants, military, police
OCC-OTH	Other	Specify in notes

Income Level (Self-Reported)

Code	Description	Anchor Question
INC-1	Struggling	"Difficulty meeting basic needs"
INC-2	Getting by	"Cover basics with little extra"
INC-3	Comfortable	"Meet needs with some savings"
INC-4	Well-off	"Comfortable with regular savings"
INC-5	Affluent	"No financial concerns"
INC-X	Prefer not to say	—

Part 4: Diversity Scoring and Rewards

4.1 Core Principle: Diversity Over Volume

Coordinators are never rewarded for volume. The incentive system rewards coverage diversity across geographic and demographic clusters. A coordinator with 20 responses across 15 unique clusters scores higher than one with 100 responses from 3 clusters.

4.2 Diversity Score Calculation

Each coordinator's performance is measured by two diversity indices:

Geographic Diversity Index (GDI)

Measures spread across unique geo-clusters:

GDI = (Unique Geo-Clusters Covered / Total Possible Geo-Clusters in Area) × 100

Where Unique Geo-Cluster = unique combination of:
- GPS grid cell (500m × 500m)
- Situational location type
- Urban/Rural code
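
A rough sketch of how a geo-cluster key and the GDI might be computed is shown below. The conversion from degrees to a 500 m grid uses a simple equirectangular approximation with an assumed 111,320 meters per degree of latitude; the specification does not prescribe a gridding method, and the function names are illustrative.

```python
import math

CELL_SIZE_M = 500
METERS_PER_DEGREE_LAT = 111_320   # approximation used only for this sketch


def grid_cell(lat: float, lng: float) -> tuple:
    """Approximate (row, col) index of the 500 m x 500 m cell containing a point."""
    row = math.floor(lat * METERS_PER_DEGREE_LAT / CELL_SIZE_M)
    meters_per_degree_lng = METERS_PER_DEGREE_LAT * math.cos(math.radians(lat))
    col = math.floor(lng * meters_per_degree_lng / CELL_SIZE_M)
    return row, col


def geo_cluster_key(lat, lng, situational, settlement) -> tuple:
    """Unique geo-cluster = grid cell + situational location type + urban/rural code."""
    return grid_cell(lat, lng), situational, settlement


def gdi(responses, total_possible_clusters: int) -> float:
    """GDI = unique geo-clusters covered / total possible geo-clusters x 100."""
    covered = {geo_cluster_key(*r) for r in responses}
    return 100 * len(covered) / total_possible_clusters


responses = [
    (23.8103, 90.4125, "TRN-STA", "URB-1"),
    (23.8103, 90.4125, "TRN-STA", "URB-1"),   # repeat of the same geo-cluster
    (23.8103, 90.4125, "COM-CAF", "URB-1"),   # same cell, different location type
]
score = gdi(responses, total_possible_clusters=40)   # 2 unique clusters out of 40
```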

Demographic Diversity Index (DDI)

Measures spread across demographic cells:

DDI = (Unique Demo-Clusters Covered / Target Demo-Clusters) × 100

Where Unique Demo-Cluster = unique combination of:
- Gender (3 options)
- Age bracket (6 options)
- Occupation (12 options)
- Income level (6 options)

Maximum theoretical cells = 3 × 6 × 12 × 6 = 1,296
Practical target cells (based on population distribution) ≈ 100–200

Combined Diversity Score (CDS)

CDS = (GDI × 0.5) + (DDI × 0.5)
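
The DDI and CDS formulas can be sketched as below. The cap at 100 when coverage exceeds the target is an assumption of this sketch, since the formula as written has no upper limit; the target of 150 cells is a hypothetical value inside the 100–200 range given above.

```python
def ddi(demo_responses, target_clusters: int) -> float:
    """DDI = unique demo-clusters covered / target demo-clusters x 100.

    Assumption: the index is capped at 100 when coverage exceeds the target.
    """
    covered = set(demo_responses)
    return min(100.0, 100 * len(covered) / target_clusters)


def cds(gdi_score: float, ddi_score: float,
        gdi_weight: float = 0.5, ddi_weight: float = 0.5) -> float:
    """CDS = (GDI x gdi_weight) + (DDI x ddi_weight)."""
    return gdi_score * gdi_weight + ddi_score * ddi_weight


# Each demo-cluster is a (gender, age, occupation, income) tuple.
demo = [
    ("F", "A2", "OCC-SVC", "INC-2"),
    ("F", "A2", "OCC-SVC", "INC-2"),   # repeat, adds no coverage
    ("M", "A4", "OCC-MAN", "INC-1"),
    ("X", "A1", "OCC-STU", "INC-X"),
]
ddi_score = ddi(demo, target_clusters=150)   # 3 of 150 target cells = 2.0
combined = cds(40.0, ddi_score)              # 40 * 0.5 + 2 * 0.5 = 21.0
```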

4.3 Reward Point System

Points are earned based on diversity contribution, not response count:

Action	Points	Condition
New geo-cluster coverage	10	First response in that geo-cluster for this survey
New demo-cluster coverage	10	First response in that demo-cluster for this survey
Repeat geo-cluster	1	Additional response in already-covered geo-cluster
Repeat demo-cluster	1	Additional response in already-covered demo-cluster
Cross-validation bonus	5	Response aligns with nearby coordinator (±8% margin)
Hard-to-reach bonus	15	Response from designated underserved area/demographic

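Point Decay for Repetition:

The 1-point repeat value in the table above is a base amount (repeat_cluster_base in Part 9) that shrinks with every earlier response already recorded in the same cluster:

Points per repeat = repeat_cluster_base / (1 + repeat_count_in_cluster)

Here repeat_count_in_cluster is the number of earlier responses in that cluster. With the default base of 1:

Response 1 in cluster: 10 points (new)
Response 2 in cluster: 1 / (1 + 1) = 0.5 points
Response 3 in cluster: 1 / (1 + 2) = 0.33 points
Response 10 in cluster: 1 / (1 + 9) = 0.1 points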
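
To illustrate how strongly the decay rule above favors breadth, the sketch below totals the points earned within a single cluster dimension. The function name is hypothetical, and bonuses are left out.

```python
def cluster_points(n_responses: int, new_points: float = 10,
                   repeat_base: float = 1.0) -> float:
    """Total points earned from n responses that all fall in one cluster."""
    if n_responses <= 0:
        return 0.0
    total = float(new_points)                 # response 1 opens the cluster
    for earlier in range(1, n_responses):     # responses 2..n are repeats
        total += repeat_base / (1 + earlier)
    return total


concentrated = cluster_points(100)    # 100 responses in a single cluster
spread = 15 * cluster_points(1)       # 15 responses, each opening a new cluster
```

Under these defaults the 100 concentrated responses earn fewer than 15 points in total, while the 15 spread responses earn 150.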

4.4 Leaderboard Rankings

Leaderboards display coordinators ranked by:

Primary Rank: Combined Diversity Score (CDS)
Secondary Rank: Cross-validation consistency rate
Tertiary Rank: Protocol compliance rate

Volume (total responses) is displayed but never used for ranking.

4.5 Reward Tiers (When Sponsorship Available)

Tier	CDS Threshold	Reward Type
Platinum	CDS >= 80	Monetary + Certificate + Badge
Gold	60 <= CDS < 80	Monetary + Certificate
Silver	40 <= CDS < 60	Certificate + Recognition
Bronze	20 <= CDS < 40	Recognition
Participant	CDS < 20	Participation acknowledgment

4.6 Non-Monetary Recognition

Recognition	Criteria
Digital volunteer certificate	Complete at least one survey with CDS >= 20
"Diversity Champion" badge	Top 10% CDS in any survey
"Coverage Pioneer" badge	First to cover a hard-to-reach cluster
"Trusted Collector" badge	Trust score >= 85 for 3+ consecutive surveys
Public leaderboard ranking	All active coordinators
Social media shoutout	Weekly top 5 by CDS

Part 5: Cross-Validation System

5.1 Validation Clusters

Coordinators operating in overlapping areas are grouped into validation clusters. A validation cluster requires:

Minimum 3 coordinators
Minimum 20 responses per coordinator
Same survey, same time window

5.2 Statistical Comparison

For each survey question, answer distributions are compared:

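The deviation score for a coordinator is the divergence between that coordinator's answer distribution and the cluster mean distribution:

Deviation Score = JSD(Coordinator Distribution, Cluster Mean Distribution)

Jensen-Shannon Divergence (JSD) is computed with base-2 logarithms, so it ranges from 0 to 1:
- JSD = 0: Identical distributions
- JSD = 1: Completely different distributions (no overlap)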

Thresholds

JSD Score	Interpretation	Action
< 0.05	Excellent alignment	Trust boost (+2)
0.05 to < 0.10	Acceptable variance	No change
0.10 to < 0.20	Elevated variance	Flag for review
>= 0.20	Significant deviation	Trust penalty (-5), manual review
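
The comparison and thresholds above can be sketched in plain Python. This sketch assumes base-2 logarithms (which bound JSD at 1), a cluster mean that includes every coordinator in the cluster, and threshold bands that include their lower bound and exclude their upper bound. The function names and the sample distributions are hypothetical.

```python
import math


def jsd(p, q) -> float:
    """Jensen-Shannon Divergence with base-2 logarithms, so the result lies in [0, 1]."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with x_i = 0 contribute nothing (0 * log 0 is taken as 0).
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def cluster_mean(distributions):
    """Element-wise mean of the coordinators' answer distributions."""
    n = len(distributions)
    return [sum(col) / n for col in zip(*distributions)]


def action_for(score: float) -> str:
    # Assumption: each band includes its lower bound and excludes its upper bound.
    if score < 0.05:
        return "trust boost (+2)"
    if score < 0.10:
        return "no change"
    if score < 0.20:
        return "flag for review"
    return "trust penalty (-5), manual review"


# Hypothetical answer shares for one three-option question from three coordinators.
cluster = [[0.50, 0.30, 0.20], [0.48, 0.32, 0.20], [0.10, 0.10, 0.80]]
mean = cluster_mean(cluster)
scores = [jsd(dist, mean) for dist in cluster]
actions = [action_for(s) for s in scores]
```

If a library routine is used instead, note that scipy.spatial.distance.jensenshannon returns the Jensen-Shannon distance, which is the square root of the divergence, so its output would need to be squared before applying these thresholds.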

5.3 Handling Legitimate Outliers

Some coordinators may collect from genuinely different sub-populations. To prevent penalizing accurate outliers:

Cluster Segmentation: If a coordinator's geo/demo profile differs significantly from peers, they form a separate validation sub-cluster
Historical Comparison: Coordinator's results compared to their own historical patterns
Manual Override: Flagged cases reviewed by Tier 1 coordinators before trust penalties apply

Part 6: Post-Stratification Weighting

6.1 Purpose

Raw survey data rarely matches population proportions. Post-stratification weighting adjusts for over/under-representation to produce population-representative estimates.

6.2 Weighting Cells

Responses are grouped into weighting cells based on:

Dimension	Categories	Source of Population Proportions
Region	Administrative units	Census data
Settlement	URB-1, URB-2, URB-3, RUR-1, RUR-2	Census data
Gender	M, F, X	Census data
Age	A1–A6	Census data

6.3 Weight Calculation

Cell Weight = (Population Proportion in Cell / Sample Proportion in Cell)

Example:

Population: 25% are URB-1, Female, A2
Sample: 35% are URB-1, Female, A2
Weight = 0.25 / 0.35 = 0.71
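
The worked example above can be reproduced with a short sketch. The cell_weights helper and the second cell are hypothetical; population proportions would come from census data as described in 6.2.

```python
from collections import Counter


def cell_weights(sample_cells, population_proportions):
    """Cell Weight = population proportion in cell / sample proportion in cell."""
    counts = Counter(sample_cells)
    n = len(sample_cells)
    return {
        cell: population_proportions[cell] / (count / n)
        for cell, count in counts.items()
    }


# Reproduces the example: 35% of the sample versus 25% of the population.
target = ("URB-1", "F", "A2")
other = ("URB-2", "M", "A3")          # hypothetical second cell holding the remainder
sample = [target] * 35 + [other] * 65
weights = cell_weights(sample, {target: 0.25, other: 0.75})
```

Cells that exist in the population but have no responses receive no weight here; they would need to be reported as coverage gaps rather than weighted.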

6.4 Weight Trimming

Extreme weights inflate the variance of weighted estimates, so weights are trimmed to a fixed range:

Trimmed Weight = max(0.2, min(5.0, Raw Weight))

Cells with weights outside 0.2–5.0 are flagged as poorly sampled.
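
A minimal sketch of the trimming rule, returning both the trimmed weight and the poorly-sampled flag. The function name is illustrative.

```python
def trim_weight(raw: float, lower: float = 0.2, upper: float = 5.0):
    """Return (trimmed weight, flag), where flag marks a poorly sampled cell."""
    trimmed = max(lower, min(upper, raw))
    return trimmed, trimmed != raw
```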

6.5 Effective Sample Size

Weighting reduces statistical power. Report effective sample size:

n_eff = (Sum of weights)^2 / Sum(weights^2)

Results should report both raw n and n_eff.
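
The formula above is Kish's approximation of effective sample size. A short sketch, with hypothetical weights, shows how uneven weights lower n_eff relative to raw n.

```python
def effective_sample_size(weights) -> float:
    """n_eff = (sum of weights)^2 / sum of squared weights."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)


equal = effective_sample_size([1.0] * 200)                  # equal weights: n_eff equals raw n
uneven = effective_sample_size([0.5] * 100 + [2.0] * 100)   # same raw n, uneven weights
```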

Part 7: Anti-Manipulation Design

7.1 Structural Defenses

Threat	Defense Mechanism
GPS spoofing	Cross-reference with IP, device fingerprint, movement patterns
Fake coordinators	Trust decay, referral accountability, minimum activity thresholds
Answer farming	Cross-validation detects anomalous uniformity
Demographic targeting	Diversity scoring discourages cluster concentration
Coordinator collusion	Random validation cluster assignment, rotation
Volume gaming	Repeat responses earn rapidly decaying points; rankings and reward tiers use diversity scores only

7.2 Detection Signals

The system monitors for:

Velocity anomalies: Too many responses too quickly
Pattern uniformity: Identical or near-identical answer sequences
Geo-impossibility: Consecutive responses from locations too far apart to have been reached in the time between them
Demographic skew: Extreme concentration in single demo-cluster
Cross-validation failure: Persistent deviation from area peers
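
As one example of these signals, a geo-impossibility check can be sketched with the haversine formula. The 120 km/h speed ceiling is a hypothetical threshold chosen for illustration and does not come from the specification, and the function names are assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0
MAX_PLAUSIBLE_SPEED_KMH = 120.0   # hypothetical threshold, not from the specification


def haversine_km(lat1, lng1, lat2, lng2) -> float:
    """Great-circle distance between two points, in kilometers."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lng2 - lng1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_geo_impossible(first, second) -> bool:
    """Each argument is (lat, lng, unix_seconds) for one response by the same coordinator."""
    distance = haversine_km(first[0], first[1], second[0], second[1])
    hours = abs(second[2] - first[2]) / 3600
    if hours == 0:
        return distance > 0
    return distance / hours > MAX_PLAUSIBLE_SPEED_KMH
```

A real detector would likely allow for GPS jitter before flagging two near-simultaneous responses; this sketch does not.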

7.3 Response Actions

Signal Severity	Automatic Action	Human Review
Low	Flag for monitoring	No
Medium	Reduce trust score	Optional
High	Suspend data acceptance	Required
Critical	Suspend coordinator	Required

Part 8: Reporting Standards

8.1 Required Disclosures

All published results must include:

Methodology version used (e.g., GTPS-CV v1.0)
Raw sample size and effective sample size
Collection period and geographic scope
Weighting variables and trimming applied
Coverage gaps: Which geo/demo clusters are under-represented
Coordinator network size and average trust score
Cross-validation pass rate

8.2 Confidence Reporting

Results presented with:

Point estimate
95% confidence interval (accounting for design effect)
Margin of error
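
The specification does not fix how the design effect is computed. As a hedged sketch, the interval below accounts only for the design effect of weighting by substituting n_eff for raw n under a normal approximation; it ignores any clustering by coordinator or PSU, which would widen the interval further. The function names are illustrative.

```python
import math


def weighted_proportion(values, weights) -> float:
    """Weighted share of responses with value 1 (values are 0 or 1)."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def confidence_interval(p: float, n_eff: float, z: float = 1.96):
    """Return (lower, upper, margin of error) using n_eff in place of raw n."""
    margin = z * math.sqrt(p * (1 - p) / n_eff)
    return max(0.0, p - margin), min(1.0, p + margin), margin


estimate = weighted_proportion([1, 0, 1, 0], [1.0, 1.0, 2.0, 2.0])   # 3 / 6 = 0.5
low, high, moe = confidence_interval(estimate, n_eff=400)            # margin = 1.96 * 0.025
```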

8.3 Limitations Statement

Every report includes standardized limitations:

"GTPS-CV produces probability-approximating samples through systematic field protocols, but is not a true probability sample. Results should be interpreted as indicative of population sentiment with the disclosed margins of uncertainty. Post-stratification weights adjust for known demographic imbalances but cannot correct for unmeasured biases."

Part 9: System Parameters (Configurable)

Parameter	Default	Range	Description
trust_decay_rate	0.85	0.70–0.95	Trust multiplier for new recruits
learning_rate	0.10	0.05–0.20	Speed of trust adjustment
cv_margin_acceptable	0.10	0.05–0.15	JSD threshold for acceptable variance
weight_trim_lower	0.20	0.10–0.50	Minimum weight
weight_trim_upper	5.00	3.00–10.00	Maximum weight
gps_accuracy_threshold	50m	20–100m	Maximum acceptable GPS uncertainty
min_cluster_size_cv	3	2–5	Minimum coordinators for cross-validation
new_cluster_points	10	5–20	Points for new cluster coverage
repeat_cluster_base	1	0.5–2	Base points for repeat cluster
gdi_weight	0.50	0.30–0.70	Weight of GDI in CDS calculation
ddi_weight	0.50	0.30–0.70	Weight of DDI in CDS calculation

Part 10: Governance and Sponsor Neutrality

10.1 Sponsor Restrictions

Sponsors cannot influence:

Survey question wording or answer options
Trust algorithm parameters
Geographic targeting or exclusion
Weighting methodology
Data acceptance logic
Coordinator selection or rewards
Diversity scoring formulas

10.2 Sponsor Permissions

Sponsors may:

Fund the reward pool
Request specific geographic coverage (without exclusions)
Receive anonymized, weighted aggregate results
Display brand acknowledgment in survey interface

10.3 Editorial Independence

Survey design and analysis remain under platform editorial control. Sponsor-requested surveys undergo review to ensure:

Questions are non-leading
Answer options are balanced
Topic is appropriate for field collection

Appendix A: Glossary

Term	Definition
Geo-cluster	Unique combination of GPS grid cell + situational location + settlement type
Demo-cluster	Unique combination of gender + age + occupation + income
CDS	Combined Diversity Score: weighted average of GDI and DDI
GDI	Geographic Diversity Index
DDI	Demographic Diversity Index
JSD	Jensen-Shannon Divergence: statistical measure of distribution difference
PSU	Primary Sampling Unit
Post-stratification	Statistical adjustment to align sample with population proportions
Trust decay	Reduction in initial trust score across coordinator tiers
Cross-validation	Comparison of results between coordinators in same area
n_eff	Effective sample size after weighting adjustment

Appendix B: Comparison with Original Methodology

Aspect	Original	Version 1.0
Location data	GPS only	GPS + Situational + Settlement (3-part geo-cluster)
Demographics	Gender, Age	Gender, Age, Occupation, Income (4-part demo-cluster)
Sampling protocol	Undefined ("random in public")	Systematic time-interval with location assignment
Reward basis	Volume-influenced points	Diversity-only (CDS-based)
Cross-validation	Fixed ±5–10% threshold	Statistical (Jensen-Shannon Divergence)
Weighting	Geographic normalization only	Full post-stratification with trimming
Trust formula	Described but unspecified	Published formula with parameters
Reporting standards	None specified	Required disclosures and limitations
Outlier handling	Not addressed	Cluster segmentation + historical comparison

Appendix C: Implementation Checklist

Appendix D: Data Schema Summary

Geo-Cluster Schema

geo_cluster:
  gps:
    lat: float (required)
    lng: float (required)
    accuracy: integer meters (required)
  situational: enum [RES-APT, RES-HSE, EDU-SCH, EDU-COL, TRN-STA,
                     TRN-RDE, BUS-STP, BUS-RDE, COM-MLL, COM-MKT,
                     COM-CAF, REL-MOS, REL-CHR, REL-TMP, REL-OTH,
                     REC-PRK, REC-SPT, WRK-OFF, WRK-IND, GOV-OFF,
                     HLT-HSP, OTH] (required)
  settlement: enum [URB-1, URB-2, URB-3, RUR-1, RUR-2] (required)
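
A record can be checked against the geo-cluster schema above with a small validator. This is a sketch rather than a reference implementation: the function name and error messages are hypothetical, and any numeric value is accepted for accuracy.

```python
SITUATIONAL = {
    "RES-APT", "RES-HSE", "EDU-SCH", "EDU-COL", "TRN-STA", "TRN-RDE",
    "BUS-STP", "BUS-RDE", "COM-MLL", "COM-MKT", "COM-CAF", "REL-MOS",
    "REL-CHR", "REL-TMP", "REL-OTH", "REC-PRK", "REC-SPT", "WRK-OFF",
    "WRK-IND", "GOV-OFF", "HLT-HSP", "OTH",
}
SETTLEMENT = {"URB-1", "URB-2", "URB-3", "RUR-1", "RUR-2"}


def validate_geo_cluster(record: dict) -> list:
    """Return a list of problems found; an empty list means the record is valid."""
    errors = []
    gps = record.get("gps")
    if not isinstance(gps, dict):
        errors.append("gps is required")
    else:
        for field in ("lat", "lng", "accuracy"):
            if not isinstance(gps.get(field), (int, float)):
                errors.append(f"gps.{field} is required and must be numeric")
    if record.get("situational") not in SITUATIONAL:
        errors.append("situational must be a known location code")
    if record.get("settlement") not in SETTLEMENT:
        errors.append("settlement must be a known urban/rural code")
    return errors


example = {
    "gps": {"lat": 23.8103, "lng": 90.4125, "accuracy": 12},
    "situational": "TRN-STA",
    "settlement": "URB-1",
}
problems = validate_geo_cluster(example)   # the example from 3.1 passes
```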

Demo-Cluster Schema

demo_cluster:
  gender: enum [M, F, X] (required)
  age: enum [A1, A2, A3, A4, A5, A6] (required)
  occupation: enum [OCC-STU, OCC-UNE, OCC-HOM, OCC-RET, OCC-AGR,
                    OCC-MAN, OCC-SVC, OCC-CLR, OCC-PRO, OCC-MGT,
                    OCC-GOV, OCC-OTH] (required)
  income: enum [INC-1, INC-2, INC-3, INC-4, INC-5, INC-X] (required)

Document Version: 1.0

Methodology Status: Production-Ready Framework

Next Review: After first 1,000-response pilot
