Schema reference — S3 layout

All structured data is in Hetzner Object Storage at s3://tnelection2026/. Total: ~2.2 GB across 9 dataset folders.

results/

ECI 2026 candidate-level and round-wise results, already in S3 for all 234 assembly constituencies (ACs).

results/curated/year=2026/ac={N}/candidate_totals.parquet

| Column | Type | Source |
|---|---|---|
| sn | int | ECI table row number |
| candidate | str | candidate full name |
| party | str | party name (ECI spelling) |
| evm_votes | int | EVM tally |
| postal_votes | int | postal/ETPBS tally |
| total_votes | int | evm + postal |
| pct_votes | float | percent of valid votes in AC |
| state | str | "TN" |
| ac_no | int | 1..234 |
| ac_name | str | e.g. "KOLATHUR" |
| year | int | 2026 |
| run_date | str | fetch date |
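
The path template and the `total_votes` definition above can be sketched in a few lines of Python. This is only an illustration: the helper names are hypothetical, and it assumes `ac={N}` is the unpadded AC number (the layout does not say whether it is zero-padded).

```python
from __future__ import annotations

from typing import Iterator

BUCKET = "tnelection2026"  # bucket name from the layout above
AC_COUNT = 234             # ac_no ranges over 1..234


def candidate_totals_uri(ac_no: int, year: int = 2026) -> str:
    """Build the S3 URI of one AC's candidate_totals.parquet.

    Assumption: the ac= partition value is the plain, unpadded AC number.
    """
    if not 1 <= ac_no <= AC_COUNT:
        raise ValueError(f"ac_no must be in 1..{AC_COUNT}, got {ac_no}")
    return (
        f"s3://{BUCKET}/results/curated/year={year}/ac={ac_no}/"
        "candidate_totals.parquet"
    )


def all_candidate_totals_uris() -> Iterator[str]:
    """Yield the URI for every AC, in ac_no order."""
    for ac_no in range(1, AC_COUNT + 1):
        yield candidate_totals_uri(ac_no)


def totals_are_consistent(row: dict) -> bool:
    """Check the documented relation total_votes = evm_votes + postal_votes."""
    return row["total_votes"] == row["evm_votes"] + row["postal_votes"]
```

A check like `totals_are_consistent` is cheap to run over every row after a fetch and catches scrape errors before they reach downstream tables.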

results/curated/year=2026/ac={N}/round_votes.parquet

| Column | Type |
|---|---|
| round_no | int |
| candidate | str |
| party | str |
| brought_forward | int (running total at start of round) |
| current_round | int (votes this round) |
| total | int (running total at end of round) |
| state, ac_no, ac_name, year, run_date | metadata |
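
The column descriptions imply two invariants for a single candidate: within a round, `brought_forward + current_round` should equal `total`, and each round's `brought_forward` should equal the previous round's `total`. A minimal sketch of that check follows; the function name is hypothetical and the rows are plain dicts rather than a parquet reader's output.

```python
from __future__ import annotations


def check_rounds(rows: list[dict]) -> list[str]:
    """Return a list of problems found in one candidate's round_votes rows."""
    problems: list[str] = []
    previous_total = None
    for row in sorted(rows, key=lambda r: r["round_no"]):
        # Within a round: start-of-round total plus this round's votes.
        if row["brought_forward"] + row["current_round"] != row["total"]:
            problems.append(
                f"round {row['round_no']}: brought_forward + current_round != total"
            )
        # Across rounds: the running total must carry over unchanged.
        if previous_total is not None and row["brought_forward"] != previous_total:
            problems.append(
                f"round {row['round_no']}: brought_forward != previous round total"
            )
        previous_total = row["total"]
    return problems
```

An empty list means the rounds are internally consistent. Whether the published ECI data always satisfies both invariants is not something this page confirms, so treat violations as flags for review rather than hard errors.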

historical/

TN assembly election (AE) results for 2021, 2016, and 2011, used as the swing baseline.

historical/curated/year=2021/kracekumar_detailed.parquet

4,232 rows — one per (AC, candidate) for the 2021 TN AE.

| Column | Type |
|---|---|
| sn | int (O.S.N. in source) |
| Candidate | str |
| Party | str |
| evm_votes, postal_votes, total_votes | int |
| pct_votes | float |
| ac_name | str |
| ac_no | int |
| district_name | str |
| constituency_type | str ("General" / "SC" / "ST") |
| position | int (rank, 1 = winner) |
| Gender | str ("M" / "F") |
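
Because `position` ranks candidates within an AC, the winner and the winning margin can be derived without re-sorting by votes. The sketch below uses a hypothetical helper name and made-up rows. Note the capitalised `Candidate` column in this file, unlike the lowercase `candidate` in the 2026 results.

```python
from __future__ import annotations


def winning_margin(ac_rows: list[dict]) -> tuple[str, int | None]:
    """Return (winner name, margin in votes) for one AC's rows.

    The margin is None when there is no position-2 row,
    for example an uncontested seat.
    """
    by_position = {row["position"]: row for row in ac_rows}
    winner = by_position[1]  # position 1 = winner
    runner_up = by_position.get(2)
    if runner_up is None:
        return winner["Candidate"], None
    return winner["Candidate"], winner["total_votes"] - runner_up["total_votes"]


# Made-up rows for illustration only; not real 2021 results.
sample = [
    {"Candidate": "A", "position": 2, "total_votes": 60_000},
    {"Candidate": "B", "position": 1, "total_votes": 75_000},
    {"Candidate": "C", "position": 3, "total_votes": 9_000},
]
print(winning_margin(sample))
```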

demographics/

Census 2011 religion mix at district level.

demographics/curated/district_religion.parquet

32 rows — one per TN district as of Census 2011.

| Column | Type |
|---|---|
| district_code | str (Census 6-digit MDDS code) |
| district_name | str |
| Total_persons | int |
| Hindu_persons, Hindu_pct | int, float |
| Muslim_persons, Muslim_pct | int, float |
| Christian_persons, Christian_pct | int, float |
| Sikh_*, Buddhist_*, Jain_*, Other_*, NotStated_* | int, float |

demographics/curated/ac_demographics.parquet

234 rows — each AC joined to its district's religion mix.

| Column | Source |
|---|---|
| ac_no, ac_name, district_name, constituency_type | 2021 results |
| district_code, Hindu_pct, Muslim_pct, Christian_pct, Sikh_pct, ... | Census 2011 |
| dist_norm | normalised district name (used for the join) |

Gap

31 of 234 ACs have null religion fields. Those are in districts created after Census 2011 (Tirupathur, Mayiladuthurai, Ranipet, Chengalpattu, Kallakurichi, Tenkasi).
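
The gap follows from joining on a normalised district name: an AC whose 2021 district has no Census 2011 counterpart finds no match and keeps null religion fields. The sketch below illustrates that mechanism only. The normalisation rule (lowercase, letters only) is an assumption, since this page does not state how `dist_norm` is actually derived, and the census values are placeholders rather than real figures.

```python
from __future__ import annotations

import re


def normalise_district(name: str) -> str:
    """Hypothetical normalisation: lowercase, letters only."""
    return re.sub(r"[^a-z]", "", name.lower())


def join_ac_to_census(acs: list[dict], census: list[dict]) -> list[dict]:
    """Left-join ACs to district rows on the normalised district name."""
    by_norm = {normalise_district(d["district_name"]): d for d in census}
    joined = []
    for ac in acs:
        dist_norm = normalise_district(ac["district_name"])
        district = by_norm.get(dist_norm)  # None for post-2011 districts
        joined.append({
            **ac,
            "dist_norm": dist_norm,
            "district_code": district["district_code"] if district else None,
            "Hindu_pct": district["Hindu_pct"] if district else None,
        })
    return joined


# Placeholder values for illustration; not real census figures or codes.
census_rows = [{"district_name": "Chennai", "district_code": "000000", "Hindu_pct": 50.0}]
ac_rows = [
    {"ac_name": "KOLATHUR", "district_name": "CHENNAI"},
    {"ac_name": "TENKASI", "district_name": "Tenkasi"},
]
for row in join_ac_to_census(ac_rows, census_rows):
    print(row["ac_name"], row["dist_norm"], row["Hindu_pct"])
```

Filling the gap would need a mapping from each new district back to its 2011 parent district, which this dataset does not currently describe.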

insights/

insights/curated/swing_2021_to_2026.parquet

3,393 rows — one per (AC × party) for parties contesting in 2021 or 2026.

| Column | Type |
|---|---|
| ac_no | int |
| ac_name, district_name, constituency_type | from 2021 |
| party_norm | str, normalised to |
| pct_2021 | float (null if party didn't contest in 2021, e.g. TVK) |
| pct_2026 | float (null if party didn't contest in 2026) |
| swing_pct | float = pct_2026 − pct_2021 (with nulls treated as 0) |
| votes_2021, votes_2026, total_2021, total_2026 | int |
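
The `swing_pct` rule is simple arithmetic, but the null handling is easy to get wrong. A minimal sketch, with a hypothetical function name and `None` standing in for a parquet null:

```python
from __future__ import annotations


def swing_pct(pct_2021: float | None, pct_2026: float | None) -> float:
    """Compute pct_2026 - pct_2021, treating a null on either side as 0."""
    before = 0.0 if pct_2021 is None else pct_2021
    after = 0.0 if pct_2026 is None else pct_2026
    return after - before


# Illustrative inputs only; not real vote shares.
print(swing_pct(None, 30.0))  # new party: swing equals its 2026 share
print(swing_pct(40.0, None))  # party absent in 2026: swing is minus its 2021 share
print(swing_pct(35.5, 30.0))  # contested both years
```

One consequence worth keeping in mind: for a party with no 2021 baseline, such as TVK, `swing_pct` is just its 2026 vote share, so it is not directly comparable with the swing of a party that contested both elections.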

Other folders

| Folder | Purpose | Status |
|---|---|---|
| form20/ | Booth-level PDFs + (partial) booth × candidate parquets | partial: 12 ACs of 234 |
| voters/ | Voter-roll PDFs + OCR'd voter lists | partial: 10 booths in Kolathur |
| candidates/ | MyNeta candidate data | deprioritized |
| caste/ | Reservation + Wikipedia prose per AC | 234 ACs (sparse) |
| religion/ | State-baseline religion per AC | 234 ACs (state-level only) |
| alliance/ | Party→alliance map per year | 4 years (2026, 2021, 2016, 2011) |
| geo/ | DataMeet shapefile + parquet WKT | 234 polygons |

Quick DuckDB query

```sql
INSTALL httpfs; LOAD httpfs;
SET s3_endpoint='hel1.your-objectstorage.com';
SET s3_access_key_id='...';
SET s3_secret_access_key='...';

-- Top 10 ACs by TVK swing, with their districts.
-- TVK has no 2021 baseline (pct_2021 is null, treated as 0),
-- so swing_pct here equals its 2026 vote share.
SELECT ac_no, ac_name, district_name, swing_pct
FROM read_parquet('s3://tnelection2026/insights/curated/swing_2021_to_2026.parquet')
WHERE party_norm = 'TVK'
ORDER BY swing_pct DESC
LIMIT 10;
```

Built from public data — ECI, Census 2011, kracekumar/tn_elections.