About Neer Vazhvu

An open-source water intelligence platform for Indian cities - Chennai and Madurai live, Bengaluru onboarding - built to make public data accessible and actionable.

Reading this dashboard

How “Days of Water Left” Works

We compute three scenarios based on current reservoir storage, daily consumption, and inflow patterns:

Pessimistic (no rain): Assumes zero inflow. Storage divided by net daily demand (consumption minus desalination).

Current trend: Uses the 7-day rolling average inflow. Storage divided by (demand minus recent inflow).

Seasonal rains: Uses the historical average inflow for this calendar month across all available years.

Default Assumptions

Parameter	Default	Source
Daily consumption	830 MLD	CMWSSB annual report
Desalination output	190 MLD	Minjur (100) + Nemmeli (100)
Groundwater supply	Not modeled	Conservative assumption
Evaporation losses	Not modeled	Planned for V2

Users can adjust consumption and desalination values via sliders on the dashboard.
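
The three scenarios differ only in the inflow term, so they can be sketched with one helper. This is a minimal illustration, not the dashboard's code: the function names are hypothetical, storage is assumed to arrive in mcft and inflow in MLD, and the seasonal scenario is assumed to use the same "storage divided by (net demand minus inflow)" form as the current-trend scenario.

```python
import math

MCFT_TO_ML = 28.3168  # 1 million cubic feet is about 28.3168 million litres


def days_left(storage_ml, inflow_mld=0.0,
              consumption_mld=830.0, desalination_mld=190.0):
    """Days until storage is exhausted; infinite if supply covers demand."""
    net_drawdown = consumption_mld - desalination_mld - inflow_mld
    if net_drawdown <= 0:
        return math.inf  # storage is not falling under this scenario
    return storage_ml / net_drawdown


def scenarios(storage_mcft, inflow_7day_mld, inflow_month_avg_mld, **defaults):
    """Return the three 'days of water left' estimates for one storage value."""
    storage_ml = storage_mcft * MCFT_TO_ML
    return {
        "pessimistic": days_left(storage_ml, 0.0, **defaults),
        "current_trend": days_left(storage_ml, inflow_7day_mld, **defaults),
        "seasonal_rains": days_left(storage_ml, inflow_month_avg_mld, **defaults),
    }
```

Passing `consumption_mld` or `desalination_mld` as keyword arguments to `scenarios` plays the role of the dashboard sliders.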

What each page shows

Dashboard & reservoirs

The dashed violet line on the storage trend chart shows an ARIMAX-based forecast for each reservoir, extending 6 months into the future. The shaded band around it represents an 80% confidence interval - the range within which actual storage is expected to fall, 4 out of 5 times.

Technique

We use AutoARIMA from the statsforecast library with exogenous regressors (ARIMAX). AutoARIMA automatically selects the best ARIMA(p,d,q) order and seasonal component by testing multiple model configurations and choosing the one with the lowest corrected Akaike information criterion (AICc).

Each reservoir is forecasted independently; six separate models are fitted.
The model is re-trained daily as new data arrives from the CMWSSB scraper.
Exogenous variables: Inflow and outflow (cusecs) are fed as external regressors alongside storage.
Future flow estimation: Since future inflow/outflow are unknown, the model uses historical seasonal averages as proxy values for the forecast horizon.
Graceful fallback: If a reservoir has sparse inflow/outflow data (fewer than 30% of values non-zero over the last 2 years), the model automatically falls back to pure ARIMA without exogenous variables.
All predictions are clamped to [0, reservoir capacity].
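
The fallback rule and the clamping step can be sketched in a few lines. This is an illustration under stated assumptions, not the pipeline's code: both function names are hypothetical, and pooling inflow and outflow into a single non-zero share is an assumption (the source does not say whether the two series are tested separately).

```python
def has_usable_flows(inflow, outflow, min_nonzero_share=0.30):
    """True when enough flow readings are non-zero to justify ARIMAX.

    Assumption: inflow and outflow are pooled before the 30% test.
    """
    values = [v for v in list(inflow) + list(outflow) if v is not None]
    if not values:
        return False  # nothing to regress on, use pure ARIMA
    nonzero = sum(1 for v in values if v != 0)
    return nonzero / len(values) >= min_nonzero_share


def clamp_forecast(predictions, capacity):
    """Clamp every predicted storage value to [0, reservoir capacity]."""
    return [min(max(p, 0.0), capacity) for p in predictions]
```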

How we measure reservoir catchment rainfall

This is used for the dashboard catchment rainfall card for the four core Chennai supply reservoirs.

We use reviewed operational catchment polygons for Poondi, Red Hills, Chembarambakkam, and Cholavaram. These are hybrid review geometries built from HydroBASINS, MERIT Hydro, and local drainage review rather than simple circles around reservoir centroids.
For each catchment, we sum CHIRPS rainfall over the last 7, 30, and 90 days.
We compare those totals with a same-season historical baseline built from the prior 20 years of CHIRPS windows.
The app does not expose the raw rainfall rasters. It only shows the bucketed result: well below, below, near normal, above, or well above normal.

Groundwater page

The choropleth map shows depth to water table in metres below ground level (mbgl) for each of Chennai’s 200 GCC wards. Lower values mean the water table is closer to the surface (healthier). Thresholds are based on CGWB classification for South Indian alluvial aquifers.

Year-over-year trends compare the same month across consecutive years. A change of more than 0.5m is classified as improving or declining.

Ward Risk Scoring: Each of Chennai’s 200 wards receives a composite risk score (0-100) based on groundwater depth (40%), year-over-year trend (30%), city-wide reservoir stress (20%), and seasonal vulnerability (10%). Scores are fully explainable.
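
As a rough sketch of how the 0.5 m trend rule and the weighted composite fit together: the weights below come from the text, but the function names, the "stable" label for changes within 0.5 m, and the assumption that each component has already been normalised to 0-100 are illustrative only.

```python
WEIGHTS = {
    "groundwater_depth": 0.40,
    "yoy_trend": 0.30,
    "reservoir_stress": 0.20,
    "seasonal_vulnerability": 0.10,
}


def classify_trend(depth_last_year_m, depth_this_year_m, threshold_m=0.5):
    """Compare the same month across two years (depths in mbgl)."""
    change = depth_this_year_m - depth_last_year_m
    if change > threshold_m:
        return "declining"   # water table is deeper than last year
    if change < -threshold_m:
        return "improving"   # water table is closer to the surface
    return "stable"          # assumed label for changes within the threshold


def ward_risk(components):
    """Weighted composite (0-100) plus the per-component contributions."""
    contributions = {name: WEIGHTS[name] * components[name] for name in WEIGHTS}
    score = round(min(max(sum(contributions.values()), 0.0), 100.0), 1)
    return score, contributions
```

Returning the contributions alongside the score is one simple way to keep a composite explainable: each ward can show how many points every component added.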

Live CGWB station overlay

The ward choropleth is sourced from OpenCity's monthly ward-level groundwater dataset, which is authoritative but usually lags by weeks to months. To pair it with ground-truth readings, we also plot ~35 CGWB (Central Ground Water Board) monitoring stations in Chennai district as circle markers, pulled directly from the India WRIS Ground Water Level API.

Manual vs Telemetric stations - why the two can differ sharply

Manual stations are quarterly CGWB field-crew readings, usually from shallow dug wells (~5-11 m deep) sampling the unconfined aquifer. This is the water table residents actually pump from.
Telemetric stations are DWLR sensors that transmit readings daily. They are usually installed on deeper bore wells or piezometers tapping confined or semi-confined aquifers (often 19-200 m deep), so they answer a different question than the manual dug wells and the two readings should not be conflated.
The station panel surfaces the well type, total well depth, and aquifer type from WRIS metadata so you can tell which well is which before comparing readings.

Sensor quality flags

DWLRs fail silently - a broken sensor keeps reporting the same depth forever, which would poison a naive dashboard. The groundwater_wris_latest database view scores every station with a data_quality_flag so suspect readings are surfaced explicitly rather than averaged into the ward colours:

stuck - Telemetric station with >=10 readings in the last 60 days whose median daily delta is under 1 cm. This is robust to one-off step changes: a genuinely steady aquifer still passes, but a flat-lined sensor gets caught.
stale - The latest reading is older than the station's expected cadence. Mode-aware: Telemetric becomes stale after 14 days (a DWLR should report daily), Manual only after 180 days (CGWB resurveys it seasonally).
ok - Station has at least one recent reading and is neither stuck nor stale.
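
The three flags can be sketched as a single function. The real logic lives in the groundwater_wris_latest database view; the Python below is only an illustration, and several details are assumptions: the function name, checking stale before stuck, treating a station with no readings as stale, and measuring the "daily delta" as the absolute difference between consecutive readings.

```python
from datetime import date, timedelta
from statistics import median


def data_quality_flag(mode, readings, today):
    """mode: 'telemetric' or 'manual'.
    readings: list of (date, depth_mbgl) tuples, sorted oldest first.
    """
    if not readings:
        return "stale"  # assumption: no data is treated as stale
    max_age_days = 14 if mode == "telemetric" else 180
    if (today - readings[-1][0]).days > max_age_days:
        return "stale"
    if mode == "telemetric":
        recent = [(d, v) for d, v in readings if (today - d).days <= 60]
        if len(recent) >= 10:
            deltas = [abs(b[1] - a[1]) for a, b in zip(recent, recent[1:])]
            if median(deltas) < 0.01:  # under 1 cm, depths are in metres
                return "stuck"
    return "ok"
```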

On the map, suspect stations render with a neutral grey fill and a dashed amber ring so they never get confused with trustworthy readings, and the station panel shows an amber 'Possible sensor failure' banner with the exact 60-day range and reading count. The legend's sensor-status sub-section exposes toggles so reviewers can hide stuck or stale markers entirely.

Water bodies & restoration

The Water Bodies page combines current OpenStreetMap polygons, a curated set of 15 historically significant lost or encroached water bodies, and a new satellite context layer for a reviewed Phase 1 target set. For selected lakes and reservoirs, the detail panel now shows historical persistence, current spread versus the usual seasonal baseline, and an observation freshness/confidence label.

How we measure water-body spread by season

This is used for the "Satellite Context" block shown on selected lakes and reservoirs.

We start from a curated Phase 1 target list instead of all 1,787 mapped water bodies, so we can QA the outputs and avoid noisy tiny ponds or industrial water features.
Current spread is estimated from a 45-day Sentinel-2 NDWI composite. NDWI compares green and near-infrared light to detect water, including turbid and dark water that other classifiers miss. We turn the water signal inside each polygon into an observed water-spread area in hectares.
Seasonal baseline comes from JRC Global Surface Water monthly recurrence for the same calendar month. This gives us an expected wet-area footprint for March vs April vs monsoon months, instead of comparing everything to one annual average.
We compare observed spread to the seasonal baseline, compute a simple anomaly ratio, and label it as much lower, lower, near normal, higher, or much higher. We also compute historical persistence as the share of months where the water body meaningfully holds water.
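
A minimal sketch of the measurement chain described above, from NDWI to hectares to the anomaly ratio. The NDWI formula, (green - NIR) / (green + NIR), is the standard definition; everything else is an assumption for illustration: the function names, the NDWI > 0 water threshold, the 10 m pixel size (100 m² per pixel, as for Sentinel-2 green and NIR bands), and the 10% wet-fraction cut-off used for "meaningfully holds water".

```python
def ndwi(green, nir):
    """Normalised Difference Water Index for one pixel."""
    total = green + nir
    return 0.0 if total == 0 else (green - nir) / total


def water_area_ha(pixels, pixel_area_m2=100.0, threshold=0.0):
    """pixels: iterable of (green, nir) reflectances inside the polygon."""
    wet = sum(1 for g, n in pixels if ndwi(g, n) > threshold)
    return wet * pixel_area_m2 / 10_000  # square metres to hectares


def anomaly_ratio(observed_ha, baseline_ha):
    """Observed spread over the seasonal baseline; None if no baseline."""
    return None if baseline_ha <= 0 else observed_ha / baseline_ha


def persistence(monthly_wet_fraction, min_wet_fraction=0.10):
    """Share of months in which the body holds water above the cut-off."""
    months = list(monthly_wet_fraction)
    if not months:
        return None
    return sum(1 for f in months if f >= min_wet_fraction) / len(months)
```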

Rich-Data Deep-Zoom Panel (flagship bodies)

Seven Chennai water bodies have a dedicated full-screen panel layered on top of the standard /water-bodies map. Clicking a flagship body opens yearly satellite imagery 1984-present, cumulative water-loss and built-gain tints, per-year zonal stats, and a sources & methodology modal.

Onboarded today (7): Pallikaranai Marsh (TNSWA gazetted Ramsar Site #2481 boundary), Sholavaram Lake, Red Hills Reservoir (Puzhal), Chembarambakkam Lake, Porur Lake, Velachery Lake, Perumbakkam Lake. The last six use the OpenStreetMap relation as the primary boundary.

Yearly chips 1984-present - Landsat 5 TM (1984-1998), Landsat 5+7 (1999-2012), Landsat 7+8 (2013-2018), Sentinel-2 SR Harmonized (2019-present). All chips are pre-loaded on panel open so the time-lapse plays without flicker.
Cumulative water-loss tint over the body polygon, computed from JRC Global Surface Water v1.4 (baseline 1988-92 vs end 2017-21).
Cumulative built-gain tint over the 1 km halo, computed from Google Dynamic World V1 (baseline 2016-18 vs end 2023-25).
Per-year stats - water surface % in body (JRC 1984-2021, spliced with Dynamic World water class 2022-now so the chart runs continuous through 2026), built fraction % in halo (Dynamic World 2016-now), buildings in halo, buildings in body (Overture Maps quarterly, Open Buildings v3 fallback).
Caveats stated in-panel - the JRC v1.4 series ends at 2021, so post-2021 water-fraction readings come from Dynamic World (slight methodology step at the splice). The 1 km halo is editorial reference (not a legally codified buffer) for every body except Pallikaranai, where it aligns with the NGT 1 km eco-sensitive zone reference.
Pallikaranai-only set-algebra - For Pallikaranai both a gazetted TNSWA boundary (1246.76 ha) and an OSM natural=wetland polygon (1073.06 ha) exist; the panel surfaces the 233.06 ha gap between the two.

How we simplify the outputs for users

Water-body spread uses an observed/baseline ratio: below 0.60 = much lower, 0.60-0.85 = lower, 0.85-1.15 = near normal, 1.15-1.40 = higher, above 1.40 = much higher.
Catchment rainfall uses anomaly buckets against the historical seasonal baseline: <= -50% well below, <= -20% below, < 20% near normal, < 50% above, and >= 50% well above.
Low-confidence satellite rows are hidden from the water-body detail panel. We only show the summary when optical coverage is good enough to be useful.
Not every mapped water body shows this yet. Phase 1 is limited to a reviewed target set so we can quality-check the outputs before expanding coverage.
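
The two bucketing rules above translate directly into threshold checks. In this sketch the function names are hypothetical, and the handling of the exact boundary values for the spread ratio (0.60, 0.85, 1.15, 1.40) is an assumption, since the text gives ranges without stating which side each boundary belongs to. The rainfall thresholds follow the stated inequalities.

```python
def spread_label(ratio):
    """Label an observed/baseline water-spread ratio."""
    if ratio < 0.60:
        return "much lower"
    if ratio < 0.85:
        return "lower"
    if ratio <= 1.15:
        return "near normal"
    if ratio <= 1.40:
        return "higher"
    return "much higher"


def rainfall_anomaly_pct(total_mm, baseline_mm):
    """Percentage departure from the same-season historical baseline."""
    return (total_mm - baseline_mm) / baseline_mm * 100.0


def rainfall_label(anomaly_pct):
    """Label a catchment rainfall anomaly (percent)."""
    if anomaly_pct <= -50:
        return "well below"
    if anomaly_pct <= -20:
        return "below"
    if anomaly_pct < 20:
        return "near normal"
    if anomaly_pct < 50:
        return "above"
    return "well above"
```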

Lake Restoration Priority

The restoration ranker scores all 1,787 water bodies on restoration priority using a 5-component spatial analysis model. Each component is scored 0-100 and combined as a weighted average:

Water Body Size (25%): Larger water bodies provide greater groundwater recharge and flood mitigation impact.

Proximity to Lost Water Bodies (20%): Water bodies near historically lost lakes are in stressed areas where restoration compensates for lost water surface.

Proximity to Polluted Rivers (20%): Water bodies near dead or degraded river stretches (by dissolved oxygen readings) could serve as settling or treatment wetlands.

Industrial Pollution Proximity (15%): Water bodies near industrial discharge zones face greater contamination risk; restoring them helps protect groundwater.

Water Body Type (20%): Reservoirs and natural lakes are prioritised over canals, drains, and wastewater infrastructure.

Scores are computed from static spatial data and do not account for population density, land ownership, or restoration cost. The ranker is designed to support GCC budget allocation for lake restoration programmes.

Lost & Encroached Water Bodies - Per-Record Sources

Water body	Status	Source / Reference
Long Tank (Otteri Nullah)	Fully lost	Care Earth Trust / IIT Madras Water Bodies Study
Nungambakkam Tank	Fully lost	Survey of India historical maps / CMDA
Kodambakkam Lake	Fully lost	Care Earth Trust water body survey
Virugambakkam Lake	Severely reduced	CMDA Master Plan / Care Earth Trust
Pallikaranai Marsh	Severely reduced	Care Earth Trust / Ashoka Trust for Research in Ecology
Perungudi Lake	Fully lost	Care Earth Trust / Greater Chennai Corporation records
Madipakkam Lake	Partially encroached	Madras High Court / NGT records
Sholinganallur Marshland	Severely reduced	Salim Ali Centre for Ornithology / Care Earth Trust
Villivakkam Lake	Partially encroached	CMDA / Revenue records / Care Earth Trust
Kolathur Lake	Partially encroached	Care Earth Trust water body inventory
Tambaram Tank	Severely reduced	Survey of India maps / Tamil Nadu PWD records
Manali Wetlands	Severely reduced	Tamil Nadu Pollution Control Board (TNPCB) reports
Ennore Creek Wetlands	Severely reduced	Coastal Management Society / National Green Tribunal (Chennai)
Muttukadu Backwaters	Partially encroached	CMDA Coastal Regulation Zone / Care Earth Trust
Chetpet Lake	Severely reduced	GCC / Care Earth Trust / Madras High Court order 2018

Rivers page

The river map shows four rivers - Cooum, Adyar, Buckingham Canal, and Kosasthalaiyar - colour-coded by overall water quality status derived from CPCB monitoring data.

Flood page

The Flood page overlays historical flood hazard zones from OpenCity on the ward map, together with the Greater Chennai Corporation storm water drain (SWD) network. The goal is to let residents see whether their street sits inside a documented hazard footprint and whether a drain is mapped nearby.

My Ward page

A single-page rollup of every spatial layer the dashboard knows about - reservoirs, groundwater depth, ward risk score, water bodies, rivers, flood zones, drains, sewerage, CGWB stations - all filtered to one ward. It is the deep-link target when you click a ward anywhere in the app.

Ward Report Card

Each of Chennai's 200 wards is ranked on 5 governance-quality metrics. Percentile-based A-F grades compare every ward against the full city. All density metrics are area-normalized (per sq km). The composite score is a weighted sum of per-metric percentiles.

Metric	Weight	Direction
Drainage coverage	25%	Higher = better
Sewerage infrastructure	25%	Higher = better
Flood exposure	25%	Lower = better
Water body health	15%	Lower = better
Water body density	10%	Higher = better

Grades: A (80th+ percentile), B (60-79th), C (40-59th), D (20-39th), F (below 20th). The overall grade uses the same thresholds on the composite score's percentile rank.
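
The report card logic can be sketched as follows. The weights, directions, and grade thresholds come from the table and text above. The percentile definition used here (share of other wards with a strictly worse value) is an assumption, as are the function names and the metric keys; this is not the computeWardRankings implementation.

```python
# metric: (weight, higher_is_better)
METRICS = {
    "drainage_coverage": (0.25, True),
    "sewerage_infrastructure": (0.25, True),
    "flood_exposure": (0.25, False),
    "water_body_health": (0.15, False),
    "water_body_density": (0.10, True),
}


def percentile_ranks(values, higher_is_better=True):
    """Percent of the other wards that each ward outperforms."""
    n = len(values)
    if n < 2:
        return [100.0] * n
    ranks = []
    for v in values:
        worse = sum(1 for o in values if (o < v if higher_is_better else o > v))
        ranks.append(100.0 * worse / (n - 1))
    return ranks


def grade(percentile):
    for letter, floor in (("A", 80), ("B", 60), ("C", 40), ("D", 20)):
        if percentile >= floor:
            return letter
    return "F"


def report_card(wards):
    """wards: {ward_id: {metric_name: area-normalised value}}."""
    ids = list(wards)
    composite = {w: 0.0 for w in ids}
    for metric, (weight, higher_is_better) in METRICS.items():
        ranks = percentile_ranks([wards[w][metric] for w in ids], higher_is_better)
        for w, rank in zip(ids, ranks):
            composite[w] += weight * rank
    overall = percentile_ranks([composite[w] for w in ids])
    return {w: {"composite": round(composite[w], 1), "grade": grade(p)}
            for w, p in zip(ids, overall)}
```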

Uplift Planner

The uplift planner answers: "If I had INR X crore for my ward, where should I invest it?" It uses a greedy budget optimizer to allocate a hypothetical budget across 5 intervention types, maximizing the ward's composite improvement per crore spent.

How it works

Gap analysis: compares the ward's current value on each metric against the city distribution to identify where it lags.
Greedy optimizer: at each step, evaluates every feasible intervention and picks the one with the highest composite-score improvement per crore. Repeats until the budget is spent or all caps are hit.
Exact projection: builds a modified ward profile with the projected metric values and reruns the full ranking engine (computeWardRankings) to determine the exact after-state grade and percentile - not an approximation.

Data-backed caps

Each intervention is capped by real ward data: flood mitigation limited to the actual number of high/very-high hazard zones; water body restoration limited to bodies rated critical or high; revival limited to documented lost bodies. Infrastructure interventions (drains, sewerage) have practical per-ward caps.

Cost estimates

All cost ranges come from published GCC, CMWSSB, Smart Cities Mission, NDMA, and NGO project reports. Each allocation shows a low-high range; the optimizer uses the midpoint. These are illustrative - actual costs depend on site conditions, land, and procurement.

Intervention	Cost/unit	Metric
Build storm drains	1.5-3.0 Cr/km	Drainage coverage
Extend sewage network	3.0-6.0 Cr/km	Sewerage infrastructure
Flood zone mitigation	5-15 Cr/zone	Flood exposure
Restore water bodies	2-8 Cr/body	Water body health
Revive lost water bodies	10-25 Cr/body	Water body density
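
A simplified sketch of the greedy step, using the midpoint of each cost range from the table. The class and function names are hypothetical, and so is `gain_per_unit`: the planner described above recomputes the composite improvement at every step by rerunning the ranking engine, whereas this sketch assumes a fixed gain per unit so that it stays self-contained.

```python
from dataclasses import dataclass


@dataclass
class Intervention:
    name: str
    cost_low: float       # INR crore per unit
    cost_high: float      # INR crore per unit
    gain_per_unit: float  # hypothetical composite-score points per unit
    cap: int              # data-backed maximum number of units

    @property
    def unit_cost(self):
        return (self.cost_low + self.cost_high) / 2  # optimizer uses midpoint


def greedy_allocate(budget_cr, interventions):
    """Repeatedly buy the unit with the best gain per crore."""
    remaining = budget_cr
    units = {i.name: 0 for i in interventions}
    while True:
        feasible = [i for i in interventions
                    if units[i.name] < i.cap and i.unit_cost <= remaining]
        if not feasible:
            break  # budget spent or every cap hit
        best = max(feasible, key=lambda i: i.gain_per_unit / i.unit_cost)
        units[best.name] += 1
        remaining -= best.unit_cost
    return units, remaining
```

For example, with a 15 crore budget and the hypothetical gains `Intervention("Build storm drains", 1.5, 3.0, 1.0, 2)`, `Intervention("Extend sewage network", 3.0, 6.0, 1.5, 1)`, and `Intervention("Flood zone mitigation", 5, 15, 5.0, 1)`, the loop buys one flood-zone unit first, then two kilometres of drains, and stops with 0.5 crore left because the sewage unit no longer fits.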

Chennai Water Facts page

A journalist-ready snapshot page at /facts that surfaces Chennai's water state as quotable numbers with sources, dates, and methodology attached. Organised into four freshness tiers so staleness is never hidden: Today (live from monitoring feeds), This Year (latest government publications with vintage year), Chennai Water History (milestone events and peak records), and Infrastructure (structural capacity). Every card has copy-quote, tweet, and copy-link buttons. Powered by Schema.org Dataset + Observation structured data for search engines, with a public JSON API at /api/facts.

Intelligence & AI Narratives

Beyond raw data display, Neer Vazhvu runs three intelligence modules daily to generate actionable insights.

Daily Briefing: A template-based intelligence summary generated each morning with a headline, key metrics, threshold-based alerts, and actionable recommendations. No LLM required - purely data-driven.

AI-Generated Narratives: Claude AI reads live reservoir, groundwater, and risk data to generate a daily city briefing and monthly ward-level analysis in both English and Tamil. Each narrative includes source data freshness dates.

Ward Profile Index: Every data layer - water bodies, flood zones, drainage, sewerage, rivers, industrial zones - is spatially mapped to each of Chennai's 200 wards. This enables cross-domain context: click any feature on any map to see its ward's complete water picture.

Cascade reconstruction methodology - Chennai

Chennai's tanks were once organised into chained cascades (system kanmoi): water from upper tanks overflowed through feeder channels into lower tanks, which fed the next, and so on. Most cascade channels are now broken by encroachment. The cascade overlay surfaces a terrain-derived hypothesis of how the cascade structure should have been organised, given the actual elevation and flow direction of the land.

See cascade health scores: the "Tank cascades at risk - Chennai" page ranks every documented and auto-derived cascade by fragility and priority, with citations and court / restoration anchors where known.

What you are seeing

Sky-blue circles (720 tanks): one per OpenStreetMap water-body polygon at least 1 ha in size. Size encodes cascade depth (deeper-in-the-chain tanks render larger).
Sky-blue lines (430 edges): predicted tank-to-tank cascade links. Each upstream tank has at most one outflow.
Amber lines (50 outflows): tanks whose flow direction points to a river within ~2 km, modelling the river itself as the terminal sink.

Inputs

Tank polygons: OpenStreetMap water=* features. Any feature whose water_type is in {river, canal, stream, drain, ditch, wastewater} is excluded so river segments don't get treated as tanks.
Elevation: WWF/HydroSHEDS/03CONDEM - HydroSHEDS conditioned DEM at 3 arc-second (~90 m) resolution. "Conditioned" means sinks have been pre-filled so flow routing behaves predictably.
Flow direction: WWF/HydroSHEDS/03DIR - the corresponding ESRI D8 flow-direction raster. Each pixel encodes which of its eight neighbours water drains to.
River barriers: the {city}-rivers.geojson we already use on the map.

Algorithm (per tank)

Compute centroid; sample DEM elevation and D8 flow direction at that point in a single batched Earth Engine call.
Find all other tanks within 3 km whose elevation is lower.
Reject candidates that fall outside ±67.5° of the upstream tank's flow-direction bearing - terrain-aware directionality, not just "is downhill".
Reject candidates whose straight-line edge would cross a mapped river segment - water doesn't flow across rivers.
Pick the single steepest remaining candidate (elevation drop / distance) as this tank's outflow.
For tanks with no tank-to-tank outflow but a flow direction pointing to a river within 2 km: mark drains_to_river and draw an amber arrow to the nearest in-cone river point.
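
The tank-to-tank part of these steps can be sketched in plain Python. This is an illustration, not the pipeline's code: the dictionary keys (`id`, `pt`, `elev`, `d8`), the function names, and the `crosses_river` callback are hypothetical, elevations and D8 codes are assumed to have been sampled already, and the river-outlet step is omitted. The D8 codes follow the ESRI convention (1 = east, doubling clockwise to 128 = north-east).

```python
import math

# ESRI D8 code -> compass bearing in degrees, clockwise from north
D8_BEARING = {1: 90, 2: 135, 4: 180, 8: 225, 16: 270, 32: 315, 64: 0, 128: 45}


def distance_km(a, b):
    """Haversine distance between (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def bearing_deg(a, b):
    """Initial compass bearing from a to b."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    y = math.sin(lon2 - lon1) * math.cos(lat2)
    x = (math.cos(lat1) * math.sin(lat2)
         - math.sin(lat1) * math.cos(lat2) * math.cos(lon2 - lon1))
    return math.degrees(math.atan2(y, x)) % 360


def angle_diff(x, y):
    """Smallest absolute difference between two bearings."""
    return abs((x - y + 180) % 360 - 180)


def pick_outflow(tank, others, max_km=3.0, cone_half_angle=67.5,
                 crosses_river=lambda a, b: False):
    """Return (downstream_id, score_m_per_km) or None."""
    flow = D8_BEARING.get(tank["d8"])
    if flow is None or tank["elev"] is None:
        return None
    best = None
    for other in others:
        if other["id"] == tank["id"] or other["elev"] is None:
            continue
        drop = tank["elev"] - other["elev"]
        dist = distance_km(tank["pt"], other["pt"])
        if drop <= 0 or dist == 0 or dist > max_km:
            continue  # not downhill, or outside the search radius
        if angle_diff(bearing_deg(tank["pt"], other["pt"]), flow) > cone_half_angle:
            continue  # outside the flow-direction cone
        if crosses_river(tank["pt"], other["pt"]):
            continue  # edge would cross a mapped river
        score = drop / dist
        if best is None or score > best[1]:
            best = (other["id"], score)  # keep the steepest candidate
    return best
```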

What this is NOT

Not a registry of historical channels. We don't claim that any specific cascade link historically existed; we claim the terrain would have organised water this way.
Not full hydrological flow accumulation. A stricter approach would trace flow paths pixel-by-pixel through the DEM. We use a "downhill within a flow-direction cone" heuristic that's correct for most obvious cases but can miss subtle terrain features that aren't river-mapped.
Not a real-time water transport model. Edge existence does not imply current water flow.
Not a model of any inflow that isn't tank-to-tank. Reservoirs receive water from at least four sources that this graph cannot represent: (a) direct rainfall on the lake surface, (b) catchment runoff via unmapped channels and overland flow, (c) the river the reservoir dams (rivers are deliberately excluded from cascade nodes), and (d) engineered canals, pipelines, and trans-basin diversions. A reservoir showing 0 cascade inflows here is not isolated in real life - Chembarambakkam Lake, for example, is fed by all four kinds of inflow (its 71.6 km² Adyar catchment, the upper Adyar itself, plus Krishna water via the Kandaleru-Poondi canal and Cauvery water from Veeranam) yet none of those appears in this layer. The cascade graph is solely about tank-to-tank structure derived from terrain.

Known limitations

DEM resolution ~90 m. Adequate for district-scale cascade structure; may miss very small channels. In flat terrain (e.g. coastal Chennai) elevation differences often round to the same integer metre, so the flow-direction cone does most of the work.
Single outflow per tank (default). Real tanks often have one feeder channel and one separate surplus channel; the V1 algorithm models only the steepest candidate edge per upstream tank. A per-district allow_multi_outflow opt-in relaxes this and keeps near-tied candidates (within 30% of the best score by default), modelling tanks with both feeder and surplus. Off by default for Chennai; we plan to enable it for plateau-geography districts where terrain gradients are weaker and multi-branch cascades are documented in the historical record.
River-coverage gaps. The river-crossing barrier is only as complete as the OSM river polylines. Where the polyline is sparse, edges may slip through.
Edges are labelled predicted only. A future iteration will cross-check predicted edges against OSM waterway=* tags and Sentinel-1/2 monsoon imagery, then label each edge as intact / partial / broken / encroached.
OSM water_type=reservoir is ambiguous in this region. In Madurai roughly 87% of cascade nodes carry that tag, including many traditional kanmoi tanks that historically fed downstream cascades. The algorithm therefore does NOT auto-classify reservoirs as terminal sinks. A per-district curation hook (terminal_sink_osm_ids) exists for marking specific known engineered reservoirs (large dams whose outflow is via spillway / canal rather than via gravity to another tank); it is currently empty pending validation against TN PWD / DHAN inventories.

Reading `cascade_position = 1`: headwater, not source

Tanks at cascade_position = 1 have no tank-to-tank inflow in this graph. They are the shallowest nodes in the network, not the literal source of water in the basin. Real inflow into these tanks comes from rainfall on the lake surface, surface runoff from the surrounding catchment via channels not in OpenStreetMap, and (in dammed basins) the river itself - none of which are modelled here.

We call these headwater tanks rather than "sources" to avoid implying the cascade graph accounts for where water actually originates. A reservoir with cascade_position = 1 is not isolated from rainfall and runoff; it just sits at the top of whatever tank-to-tank chain the terrain organises.

Edge confidence

Each predicted edge carries a confidence field bucketed by its score_m_per_km (elevation drop normalised by edge length). Thresholds:

HIGH (≥ 5 m/km): a clear downhill gradient, unambiguous even given HydroSHEDS 90 m elevation noise.
MEDIUM (1-5 m/km): plausible cascade link with moderate confidence. Most kanmoi-cascade edges fall here.
LOW (< 1 m/km): below 0.2 m drop per 200 m. Near the noise floor of the conditioned DEM; the edge may be terrain noise as much as real flow.

For Chennai: 126 high (29%), 248 medium (58%), 56 low (13%).
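
The bucketing and the Chennai percentages can be reproduced with a few lines. The function names are hypothetical; the thresholds and counts are the ones stated above, with MEDIUM taken as 1 m/km up to but not including 5 m/km.

```python
def edge_confidence(drop_m, length_km):
    """Bucket an edge by elevation drop per kilometre of edge length."""
    score_m_per_km = drop_m / length_km
    if score_m_per_km >= 5:
        return "HIGH"
    if score_m_per_km >= 1:
        return "MEDIUM"
    return "LOW"


def shares(counts):
    """Whole-number percentage of edges in each bucket."""
    total = sum(counts.values())
    return {name: round(100 * n / total) for name, n in counts.items()}


chennai = shares({"high": 126, "medium": 248, "low": 56})
# chennai == {"high": 29, "medium": 58, "low": 13}
```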

Isolated tanks: why each one is isolated

A tank is "isolated" in this graph when it has no tank-to-tank inflow, no tank-to-tank outflow, and no river sink. The pipeline re-walks the candidate-evaluation gates for each such tank and stamps it with one of these reasons, surfaced in the on-map hover tooltip:

elevation_sampling_failed - the HydroSHEDS DEM returned no value at the tank's centroid, so the algorithm has nothing to compare against. Usually a data-coverage gap at the boundaries of the 90 m DEM.
no_neighbors_in_range - no other tanks within the 3 km radius the cascade window uses. Real geographic effect, common on the rural fringe of the district.
all_neighbors_uphill - in-range tanks exist but every one of them is at a higher elevation. The tank sits at a local basin low; water has nowhere downhill to go through the tank network in this window.
all_neighbors_out_of_cone - downhill tanks exist in range, but all sit outside the ±67.5° cone aligned with the upstream tank's D8 flow direction. The terrain wants water to go somewhere other than where the nearest downhill tank is.
all_neighbors_river_blocked - downhill, in-cone, in-range tanks exist, but every edge to them would cross a mapped river LineString. May indicate either real river-cut isolation or a gap where the OSM river polylines are over-segmented relative to ground truth.
unknown_isolation - defensive fallback. Should be empty in practice.
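
The gate walk can be sketched as a chain of filters, each one returning its reason as soon as it eliminates every remaining candidate. The reason strings are the ones listed above; the function name and the neighbour fields (`in_range`, `elev`, `in_cone`, `river_blocked`) are hypothetical and assumed to be precomputed for a tank already known to be isolated.

```python
def isolation_reason(tank_elev, neighbours):
    """neighbours: list of dicts with in_range, elev, in_cone, river_blocked."""
    if tank_elev is None:
        return "elevation_sampling_failed"
    in_range = [n for n in neighbours if n["in_range"]]
    if not in_range:
        return "no_neighbors_in_range"
    downhill = [n for n in in_range
                if n["elev"] is not None and n["elev"] < tank_elev]
    if not downhill:
        return "all_neighbors_uphill"
    in_cone = [n for n in downhill if n["in_cone"]]
    if not in_cone:
        return "all_neighbors_out_of_cone"
    if all(n["river_blocked"] for n in in_cone):
        return "all_neighbors_river_blocked"
    return "unknown_isolation"  # defensive fallback
```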

What you can use it for today

Spot likely historical hubs: tanks with high in-degree are where multiple terrain-driven flow paths converge. Maximum cascade depth in Chennai is 6.
Surface river-front tanks: anything with an amber outflow is a tank that drains directly into a river - useful for restoration prioritisation since the ecological functions differ from internal-cascade tanks.
Identify isolated tanks: tanks with neither inflow, outflow, nor river sink carry an isolation_reason field distinguishing genuine basin orphans from data-coverage gaps. See the bucket-by-bucket breakdown above.

Parameter rationale + sensitivity

Each tunable parameter was chosen with a stated rationale. The sensitivity tables below show how each output statistic responds when the parameter is varied. Generated by the cascade pipeline's sensitivity stage; raw data at public/data/cascade/chennai-cascade-sensitivity.json.

max_downstream_distance_km default 3

How far an upstream tank looks for a downhill neighbour. 3 km is the historical median spacing between tanks in well-documented kanmoi networks (DHAN Vayalagam field data). Below 1.5 km the graph fragments sharply; above 5 km the algorithm starts connecting tanks that have no historical relationship.

value	nodes	edges	isolated	max depth	outlets
1.5	720	228	325	5	58
2	720	315	228	5	54
3	720	430	130	6	50
4	720	488	90	7	48
5	720	525	73	8	46

cone half-angle (degrees) default 67.5

How wide the directional cone around the upstream tank's D8 flow direction is; a candidate must fall inside it to qualify. A half-angle of 67.5 degrees spans a 135-degree sector: the principal D8 direction plus one neighbouring direction on each side (3 of 8 D8 directions, each covering 45 degrees). The default trades local D8 instability (the 90 m DEM produces noisy flow directions in flat terrain) against false-positive edges (a 90-degree half-angle admits every candidate in the downstream half-plane, including ones the water would never actually reach).

value	nodes	edges	isolated	max depth	outlets
22.5	720	262	306	5	33
45	720	372	178	6	45
67.5	720	430	130	6	50
90	720	468	98	7	51

min_tank_area_ha default 1

Minimum OSM water_type polygon size to enter the graph. 1 ha excludes most temple tanks, garden ponds, and roadside catchments while preserving the structural cascade. Raising the threshold thins the graph rapidly: at 5 ha Madurai keeps 72% of nodes; at 10 ha only 56%.

value	nodes	edges	isolated	max depth	outlets
1	720	430	130	6	50
2	570	324	107	6	45
5	418	214	98	6	32
10	324	135	104	5	29

max_river_outlet_distance_km default 2

Distance budget within which a tank with no tank-to-tank outflow can register a 'drains to river' arrow. 2 km matches typical surplus-channel lengths in TN sub-basin engineering. Tightening to 1 km loses roughly a third of river-outlet arrows (50 to 32 in the Chennai table below); loosening to 3 km adds plausible-but-uncertain outlets that may be drainage rather than designed surplus.

value	nodes	edges	isolated	max depth	outlets
1	720	430	140	6	32
2	720	430	130	6	50
3	720	430	122	6	62

Data Source Index

All operational data is collected by the Python pipeline and supporting scripts that power the dashboard. Raw source data and Earth Engine summaries are upserted into Supabase (PostgreSQL) and then exposed as small, readable product signals.

Reservoirs & weather

CMWSSB Lake Level Page - Daily (scraped at 06:00 IST)

Daily reservoir levels for 6 reservoirs: Poondi, Cholavaram, Red Hills, Chembarambakkam, Veeranam, and Kannankottai. Includes storage (mcft), water level (ft), inflow/outflow (cusecs), and rainfall (mm).

Open-Meteo - Daily (zero lag)

Primary weather source for Chennai (13.08°N, 80.27°E): precipitation, temperature, humidity, reference evapotranspiration (ET₀), and wind speed. Zero data lag, no API key required. ET₀ is used in the ARIMAX forecasting model to account for reservoir evaporation losses.

NASA POWER (fallback) - Daily (2-day lag)

Fallback weather source. Satellite-derived data for Chennai: precipitation, max/min temperature, and relative humidity. Activated automatically when Open-Meteo is unreachable. 2-day data lag.

OpenCity Chennai (Lake Storage) - Historical (2003-2021)

Monthly reservoir storage data (mcft) for all 6 reservoirs, spanning 2003-2021. Used as historical seed for the forecasting model.

IMD Gridded Rainfall (via imdlib) - One-time generation (refreshed annually)

56-year monthly rainfall history (1970-2025) from IMD's 0.25-degree gridded dataset, extracted for the Chennai grid cell. Includes annual totals and long-term monthly normals for drought/flood/Day Zero year identification.

Groundwater

India WRIS Ground Water Level API (CGWB Stations) - Daily (zero lag)

Station-level groundwater time series for ~35 CGWB monitoring stations in Chennai district, pulled daily from the India WRIS Ground Water Level API. Mix of Manual (quarterly dug wells, unconfined aquifer) and Telemetric (daily DWLR bore wells, confined aquifer) stations with well type, well depth, and aquifer metadata. Each station is scored server-side with a stuck/stale/ok data quality flag.

India WRIS / CGWB (Block Exploitation) - Static fetch (refreshed periodically)

Block-level groundwater exploitation data (2011-2024) from CGWB via India WRIS ArcGIS API. Shows classification (Safe to Over-Exploited), development percentage, net availability, and extraction draft for ~15 blocks in and around Chennai.

OpenCity Chennai (Groundwater) - Monthly (fetched days 1-3)

Ward-wise depth to water table (metres below ground level) for all 200 GCC wards across 15 zones. Sourced from CGWB/GCC monitoring wells. Data available from 2021 onwards.

Water bodies & historical

First Census of Water Bodies (data.gov.in) - One-time historical seed

305 Chennai water bodies from the First Census of Water Bodies (2018-19) by the Ministry of Jal Shakti. Includes ownership, storage capacity (original vs present), encroachment status, depth, construction year, and basin information. Overlaid as markers on the Water Bodies map.

Kaggle Chennai Water Management - One-time historical seed

15 years of daily reservoir data (2004-2019) compiled by Sudalai Rajkumar. Used as additional historical training data for the forecasting model.

Care Earth Trust / NGT / CMDA: Lost Water Bodies - Manually curated (static)

15 manually curated lost or encroached water bodies, compiled from published research, court records, and environmental organisation reports. See the Water Bodies Map section below for per-record provenance.

Rivers & pollution

Chennai Rivers Restoration Trust (CRRT) - Manually curated (static)

9 restoration projects across Adyar, Cooum, Buckingham Canal, and Kosasthalaiyar rivers from the Chennai Rivers Restoration Trust. Includes project status, budget, area, implementing agencies, and outcome metrics.

CPCB National Water Monitoring Programme (NWMP) - Annual (manually refreshed)

Annual reports from the Central Pollution Control Board's National Water Monitoring Programme. Source for DO, BOD, pH, and conductivity readings at monitoring stations on the Cooum, Adyar, Buckingham Canal, and Kosasthalaiyar rivers. Supplemented by IIT Madras / Anna University peer-reviewed studies and NGT Chennai bench orders.

Nethaji Mariappan et al. (2017): Cooum Sewage Inlets - One-time historical seed

31 geo-located sewage inlets along the Cooum river, Otteri Nullah tributary, and Buckingham Canal. Discharge volumes (m³/day) from PWD Chennai, published in Nature Environment and Pollution Technology, Vol. 16, No. 3. Supplemented by Sheriff & Hussain (2012) groundwater contamination study.

NGT Southern Bench / TNPCB / CPCB: Industrial Pollution Sources - Manually curated (static)

7 major industrial facilities in the Ennore-Manali corridor, curated from NGT Southern Bench orders (2017-2022), TNPCB enforcement records, CPCB industrial monitoring reports, and academic studies. Each facility entry includes pollutant types, documented incidents with volumes and dates, and NGT order summaries.

Flood & drainage

OpenCity Chennai (Flood Hazard Data) - Static GeoJSON (re-run script to refresh)

CFLOWS 1.0 flood hazard zones (5 categories, operationalized Nov 2019 by IIT Bombay + IIT Madras + NCCR; not publicly updated since), 2015 Chennai flood hotspots with vulnerability ratings, 2015 inundation depth readings, 2020 Cyclone Nivar hotspots, and return period flood maps (5-200yr). Newer models (JICA Chennai Flood Control Master Plan 2024; TN RTFF & SDSS live Oct 2025 at chennaifloodmonitor.tn.gov.in) are not publicly redistributable as GIS.

GCC Storm Water Drain Survey - Static GeoJSON (re-run script to refresh)

10,308 official storm water drain segments from GCC survey (2023) across 197 wards, with street name, drain type, depth, width, length, material, and condition status.

CMWSSB Sewerage Network - Static GeoJSON (re-run script to refresh)

CMWSSB sewerage infrastructure: 13 operational sewage treatment plants (STPs) with 745 MLD total installed capacity across 6 major campuses (Kodungaiyur, Koyambedu, Nesapakkam, Perungudi, Alandur, Sholinganallur). Map shows 8 treatment-site points; several campuses have multiple plant units commissioned in different years. Also 348 pumping stations (SPS) linked to STPs, and 3,834 pumping main segments with pipe material and size.

Satellite & Earth Engine

NDWI Water Detection (via Sentinel-2) - Periodic summary refresh

NDWI (Normalized Difference Water Index) water masks computed from Sentinel-2 green and near-infrared bands via Google Earth Engine. Used for both the water spread summary numbers and the satellite evidence overlay, replacing Dynamic World for more accurate detection of turbid and dark water.
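
The index itself is NDWI = (Green - NIR) / (Green + NIR), which for Sentinel-2 uses bands B3 and B8. The sketch below computes it on plain Python lists to show the arithmetic; the production pipeline runs in Google Earth Engine, and the water threshold of 0.0 used here is an illustrative assumption rather than the project's setting.

```python
from typing import List

Band = List[List[float]]  # 2-D grid of surface reflectance values


def ndwi(green: float, nir: float) -> float:
    """Normalized Difference Water Index for one pixel."""
    total = green + nir
    if total == 0:
        return 0.0  # avoid division by zero on empty pixels
    return (green - nir) / total


def water_mask(green_band: Band, nir_band: Band, threshold: float = 0.0) -> List[List[bool]]:
    """Mark pixels whose NDWI exceeds the (assumed) threshold as water."""
    return [
        [ndwi(g, n) > threshold for g, n in zip(g_row, n_row)]
        for g_row, n_row in zip(green_band, nir_band)
    ]


# Water reflects more green than near-infrared; vegetation does the opposite.
mask = water_mask([[0.3, 0.1]], [[0.1, 0.3]])
```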

JRC Global Surface Water (Monthly Recurrence) - Historical monthly baseline

JRC Global Surface Water monthly recurrence used as the seasonal baseline for the same calendar month. This is how we judge whether current spread is lower or higher than usual for this time of year.

CHIRPS Daily Rainfall - Daily (zero lag)

CHIRPS daily rainfall over reviewed reservoir catchments. We use this for 7, 30, and 90 day rainfall totals and seasonal anomaly buckets on the dashboard.
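
The trailing totals amount to summing the most recent N daily values. A minimal sketch, assuming a catchment-averaged daily series in millimetres ordered oldest first (the function name is hypothetical):

```python
from typing import Dict, List, Sequence


def trailing_totals(daily_mm: List[float], windows: Sequence[int] = (7, 30, 90)) -> Dict[int, float]:
    """Sum the last N days of rainfall for each window length.

    If fewer than N days are available, the total covers what exists.
    """
    return {n: sum(daily_mm[-n:]) for n in windows}


# 100 days of a constant 1 mm/day gives totals equal to the window lengths.
totals = trailing_totals([1.0] * 100)
```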

Copernicus Sentinel-2 (via Earth Engine) - Evidence refresh (manual dispatch)

True-color satellite imagery for reviewed evidence frames. Sentinel-2 captures 10 m resolution optical imagery every 5 days, used to produce visual evidence of water presence at flagship water bodies.

HydroBASINS / MERIT Hydro - Static fetch (refreshed periodically)

HydroBASINS and MERIT Hydro support the reviewed operational catchment polygons used for the four core Chennai supply reservoirs. These geometries are reviewed for storytelling use, not presented as formal legal boundaries.

Base geography

OpenStreetMap (Overpass API) - Static GeoJSON (re-run script to refresh)

All current water bodies (lakes, tanks, reservoirs, ponds, marshes) within the Chennai metropolitan bounding box. Queried via the Overpass API and saved as a static GeoJSON. 1,635 polygon features, ~95,000 ha total surface. Also source for river polyline geometry (Cooum, Adyar, Buckingham Canal, Kosasthalaiyar) and industrial zone polygons in the north Chennai corridor. Data reflects OSM contributor edits as of the last script run.

AI narratives

Anthropic Claude API - Daily / Monthly

AI-generated city and ward narratives that connect reservoir, groundwater, and risk data. Claude Sonnet writes the city narratives; Claude Haiku writes the ward narratives.

Data quality & limitations

How we classify river health

CPCB publishes two parallel river water quality classification systems, and they don't always agree. Knowing which one drives our river status badges matters for reading the dashboard honestly.

Designated Best-Use classes (A-E)

Computed by comparing each NWMP station's current dissolved oxygen, BOD, and coliform readings against the class thresholds. Updates with every reading. Class A = drinking with disinfection only; Class B = outdoor bathing; Class C = drinking with conventional treatment; Class D = fisheries/wildlife; Class E = irrigation only. Below E = practically dead.

Polluted River Stretch (PRS) Priority I-V

A historical, multi-year stretch-level designation reflecting cumulative pollution. Slow to update; once a stretch is on the list it tends to stay there even if recent readings improve. Priority I = worst, Priority V = least bad of the polluted stretches.

Our status badges ("dead", "severely degraded", "degraded", "stressed", "healthy") are computed from current readings via the Designated Best-Use thresholds, not from the PRS Priority list. We take the worst classification across a river's monitored stations and surface that as the river-level status. Cooum, Adyar, and Buckingham Canal hold their labels under both signals (the data and PRS Priority I agree). For rivers where the two disagree (for example Vaigai, with PRS Priority III but Class C/D NWMP readings), the badge reflects what the data shows now.

Methodology lives in src/lib/utils/river-classification.ts; readings come from CPCB NWMP annual River Water Quality reports.
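
The "worst station wins" rule can be sketched as follows. This is not the code in river-classification.ts: it is a Python illustration that takes per-station classes as given, and the mapping from class to badge label is a hypothetical assumption made only so the example runs.

```python
from typing import Dict, List

# Classes ordered from best to worst.
CLASS_ORDER: List[str] = ["A", "B", "C", "D", "E", "below E"]

# Hypothetical class-to-badge mapping (assumption, not the project's table).
BADGE: Dict[str, str] = {
    "A": "healthy",
    "B": "healthy",
    "C": "stressed",
    "D": "degraded",
    "E": "severely degraded",
    "below E": "dead",
}


def river_status(station_classes: List[str]) -> str:
    """Return the badge for the worst class among a river's stations.

    Raises ValueError if no stations are supplied.
    """
    worst = max(station_classes, key=CLASS_ORDER.index)
    return BADGE[worst]


# One Class D station is enough to pull the whole river down to "degraded".
status = river_status(["B", "D", "C"])
```

Taking the worst station is deliberately conservative: a single badly polluted reach determines the river-level badge even if upstream stations read clean.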

Known Data Quality Issues

Government census data is invaluable but not perfect. We document known issues here for transparency. If you spot an error, please report it on GitHub.

Census: Mixed units in water_spread_area

The MoJS census methodology specifies hectares for water spread area, but 39 of 286 Chennai records appear to use square meters instead. Example: RETTAI ERI is recorded as 1,053,177 - this is sq m (~105 ha), confirmed against satellite imagery and Wikipedia (87–114 ha actual). Due to this inconsistency, census markers on the map use a uniform size as location indicators rather than representing actual water body area.
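
A simple plausibility check illustrates how such records can be spotted. The sketch assumes that any value above a ceiling of 5,000 ha is really in square metres; that ceiling is a hypothetical choice for illustration, and flagged records still need manual confirmation as described above.

```python
from typing import Tuple

SQ_M_PER_HA = 10_000
MAX_PLAUSIBLE_HA = 5_000  # assumed ceiling for a Chennai water body


def normalise_area_ha(raw_value: float) -> Tuple[float, bool]:
    """Return (area in hectares, whether a sq m conversion was applied)."""
    if raw_value > MAX_PLAUSIBLE_HA:
        return raw_value / SQ_M_PER_HA, True
    return raw_value, False


# RETTAI ERI: 1,053,177 is implausible as hectares, so treat it as sq m.
area_ha, converted = normalise_area_ha(1_053_177)
```

Here `area_ha` comes out at about 105.3 ha, consistent with the 87–114 ha range cited above.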

Census: Encroachment vs. storage capacity mismatch

Storage capacity and encroachment were surveyed independently. Some water bodies show 70%+ encroachment but 100% storage capacity remaining - the capacity figure was not revised to reflect lost area. These cases are flagged with an amber warning in the detail panel.
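
The amber-warning condition can be expressed as a one-line check. The 70% and 100% cut-offs come from the description above; the function and parameter names are hypothetical, and the exact rule used in the detail panel may differ.

```python
def capacity_mismatch(
    encroachment_pct: float,
    capacity_remaining_pct: float,
    encroachment_threshold: float = 70.0,
    capacity_threshold: float = 100.0,
) -> bool:
    """True when heavy encroachment coexists with an unrevised full capacity."""
    return (
        encroachment_pct >= encroachment_threshold
        and capacity_remaining_pct >= capacity_threshold
    )
```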

Census: Point coordinates only, no boundary polygons

The census provides only a single lat/lon point per water body, not boundary shapes. Where possible, census records are matched to nearby OpenStreetMap polygons (within 200m) so the actual water body shape is shown and census metadata (ownership, encroachment, capacity) appears in the detail panel. Unmatched census records are shown as small dots at the reported location.
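
The matching step can be sketched as a nearest-polygon search with a 200 m cut-off. As a simplification, this example measures the distance from the census point to each polygon's vertices rather than to its true boundary or interior, so it can miss a point that lies inside a large polygon; a real implementation would likely use a geometry library. All names are hypothetical.

```python
import math
from typing import Dict, List, Optional, Tuple

LatLon = Tuple[float, float]  # (latitude, longitude) in degrees
EARTH_RADIUS_M = 6_371_000


def haversine_m(a: LatLon, b: LatLon) -> float:
    """Great-circle distance between two points in metres."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def match_polygon(
    point: LatLon,
    polygons: Dict[str, List[LatLon]],
    max_distance_m: float = 200.0,
) -> Optional[str]:
    """Return the id of the nearest polygon within the cut-off, else None."""
    best_id: Optional[str] = None
    best_distance = max_distance_m
    for polygon_id, ring in polygons.items():
        distance = min(haversine_m(point, vertex) for vertex in ring)
        if distance <= best_distance:
            best_id, best_distance = polygon_id, distance
    return best_id
```

Records for which `match_polygon` returns `None` correspond to the unmatched census points drawn as small dots.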

Satellite: seasonal baseline is month-level, not day-level

The current satellite context compares a recent 45-day observation window to JRC monthly recurrence for the same calendar month. This is a strong seasonal reference, but it does not mean we know the exact expected spread for every day of the month.

Catchments: reviewed operational geometry, not legal survey boundary

Poondi, Red Hills, Chembarambakkam, and Cholavaram catchments are built from a mix of HydroBASINS, MERIT Hydro, and local drainage review. They are appropriate for rainfall context and inflow-support storytelling, but should not be treated as official cadastral boundaries.

Known Limitations

Estimates are approximations. Actual water availability depends on factors not modeled (groundwater extraction, Krishna water transfer, distribution losses, industrial use).
CMWSSB data may occasionally be stale (weekends, holidays). The dashboard shows a freshness indicator.
Groundwater data from OpenCity may lag by months. The map always shows the most recent available period.
Forecasts use ARIMAX (AutoARIMA with inflow/outflow as exogenous regressors) and work best with 2+ years of daily data.
Risk scores are relative indicators for comparison between wards, not absolute measures of water safety.
Satellite spread is a summary of surface water extent, not a direct measure of storage volume, water quality, or inflow source. A lake can look broad and still hold less usable water than expected.
Reservoir catchment polygons are reviewed operational geometries for rainfall context, not official legal boundaries. This matters especially in Chennai's managed canal and transfer system.
Current satellite context relies on optical Sentinel-2 observations. During persistently cloudy periods, some water bodies may temporarily lose this insight until a radar fallback is added.

About the project

Disclaimer

Not an official government tool. Neer Vazhvu is an independent, open-source project. It is not affiliated with, endorsed by, or connected to CMWSSB, GCC, CGWB, or any government body.

Informational purposes only. All data, estimates, and forecasts are provided “as is” for general awareness. Always refer to official CMWSSB advisories for critical decisions.

No personal data collected. Neer Vazhvu does not collect, store, or process any personal information. There are no user accounts, cookies, or analytics trackers.

Open Source

Neer Vazhvu is fully open source. The code, data pipeline, and methodology are transparent and available on GitHub. Contributions, bug reports, and data corrections are welcome.

View on GitHub

Support this project

Neer Vazhvu is free and open source. If you find it useful, consider supporting us on Patreon to help cover satellite data, hosting, and API costs.

Support on Patreon

value	nodes	edges	isolated	max depth	outlets
1.5	720	228	325	5	58
2	720	315	228	5	54
3	720	430	130	6	50
4	720	488	90	7	48
5	720	525	73	8	46
