About Neer Vazhvu

An open-source water intelligence dashboard for Chennai, built to make public data accessible and actionable.


Reading this dashboard

How “Days of Water Left” Works

We compute three scenarios based on current reservoir storage, daily consumption, and inflow patterns:

Pessimistic (no rain): Assumes zero inflow. Storage divided by net daily demand (consumption minus desalination).
Current trend: Uses the 7-day rolling average inflow. Storage divided by (demand minus recent inflow).
Seasonal rains: Uses the historical average inflow for this calendar month across all available years.

Default Assumptions

Parameter | Value | Source / note
Daily consumption | 830 MLD | CMWSSB annual report
Desalination output | 190 MLD | Minjur (100) + Nemmeli (100)
Groundwater supply | Not modeled | Conservative assumption
Evaporation losses | Not modeled | Planned for V2

Users can adjust consumption and desalination values via sliders on the dashboard.
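
A minimal sketch of the pessimistic and current-trend scenarios, using the default assumptions above. The function name and the unit conversions are illustrative assumptions, not the dashboard's actual code: storage is assumed to arrive in mcft and inflow in cusecs (the units the CMWSSB source reports), and both are converted to million litres before dividing.

```python
# Unit conversions (physical constants, not project-specific values).
ML_PER_MCFT = 28.3168                          # 1 million cubic feet is about 28.3168 million litres
MLD_PER_CUSEC = 28.3168 * 86_400 / 1_000_000   # 1 cusec sustained for a day is about 2.4466 ML

def days_of_water_left(storage_mcft, consumption_mld=830.0,
                       desalination_mld=190.0, inflow_cusecs=0.0):
    """Hypothetical helper: storage divided by net daily demand.

    inflow_cusecs=0 gives the pessimistic (no rain) scenario; passing the
    7-day rolling average inflow gives the current-trend scenario.
    """
    net_demand_mld = (consumption_mld - desalination_mld
                      - inflow_cusecs * MLD_PER_CUSEC)
    if net_demand_mld <= 0:
        # Inflow plus desalination covers demand, so storage is not drawn down.
        return float("inf")
    return storage_mcft * ML_PER_MCFT / net_demand_mld

pessimistic = days_of_water_left(6400)
current_trend = days_of_water_left(6400, inflow_cusecs=100)
```

The seasonal-rains scenario would call the same function with the historical average inflow for the calendar month in place of the rolling average.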

What each page shows

Dashboard & reservoirs

The dashed violet line on the storage trend chart shows an ARIMAX-based forecast for each reservoir, extending 6 months into the future. The shaded band around it represents an 80% confidence interval - the range within which actual storage is expected to fall, 4 out of 5 times.

Technique

We use AutoARIMA from the statsforecast library with exogenous regressors (ARIMAX). AutoARIMA automatically selects the best ARIMA(p,d,q) order and seasonal component by testing multiple model configurations and choosing the one with the lowest information criterion (AICc).

  • Each reservoir is forecasted independently; six separate models are fitted.
  • The model is re-trained daily as new data arrives from the CMWSSB scraper.
  • Exogenous variables: Inflow and outflow (cusecs) are fed as external regressors alongside storage.
  • Future flow estimation: Since future inflow/outflow are unknown, the model uses historical seasonal averages as proxy values for the forecast horizon.
  • Graceful fallback: If a reservoir has sparse inflow/outflow data (fewer than 30% of readings non-zero in the last 2 years), the model automatically falls back to pure ARIMA without exogenous variables.
  • All predictions are clamped to [0, reservoir capacity].
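
The rules around the model (not the model fit itself) can be sketched in plain Python. All function names here are hypothetical, and the example capacity is made up; the sketch only illustrates the 30% fallback test, the seasonal-average proxy for future flows, and the clamping step described above.

```python
from collections import defaultdict

def has_usable_flows(flow_values, min_nonzero_share=0.30):
    """True if enough flow readings are non-zero to justify exogenous regressors."""
    if not flow_values:
        return False
    nonzero = sum(1 for v in flow_values if v is not None and v != 0)
    return nonzero / len(flow_values) >= min_nonzero_share

def monthly_average_flow(history):
    """history: iterable of (month_number, flow_cusecs). Returns {month: mean flow}.

    These averages stand in for the unknown future inflow/outflow values.
    """
    buckets = defaultdict(list)
    for month, flow in history:
        buckets[month].append(flow)
    return {month: sum(vals) / len(vals) for month, vals in buckets.items()}

def clamp_forecast(predictions, capacity_mcft):
    """Clamp every predicted storage value to [0, reservoir capacity]."""
    return [min(max(p, 0.0), capacity_mcft) for p in predictions]

# Example with made-up numbers.
use_arimax = has_usable_flows([0, 0, 5, 5])           # half the readings are non-zero
proxy = monthly_average_flow([(6, 10.0), (6, 30.0), (7, 50.0)])
clamped = clamp_forecast([-5, 100, 4000], capacity_mcft=3000)
```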

How we measure reservoir catchment rainfall

This is used for the dashboard catchment rainfall card for the four core Chennai supply reservoirs.

  • We use reviewed operational catchment polygons for Poondi, Red Hills, Chembarambakkam, and Cholavaram. These are hybrid review geometries built from HydroBASINS, MERIT Hydro, and local drainage review rather than simple circles around reservoir centroids.
  • For each catchment, we sum CHIRPS rainfall over the last 7, 30, and 90 days.
  • We compare those totals with a same-season historical baseline built from the prior 20 years of CHIRPS windows.
  • The app does not expose the raw rainfall rasters. It only shows the bucketed result: well below, below, near normal, above, or well above normal.

Groundwater page

The choropleth map shows depth to water table in metres below ground level (mbgl) for each of Chennai’s 200 GCC wards. Lower values mean the water table is closer to the surface (healthier). Thresholds are based on CGWB classification for South Indian alluvial aquifers.

Year-over-year trends compare the same month across consecutive years. A change of more than 0.5 m in either direction is classified as improving (shallower water table) or declining (deeper water table).

Ward Risk Scoring: Each of Chennai’s 200 wards receives a composite risk score (0-100) based on groundwater depth (40%), year-over-year trend (30%), city-wide reservoir stress (20%), and seasonal vulnerability (10%). Scores are fully explainable.
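
A sketch of how the trend rule and the weighted score fit together. The weights and the 0.5 m threshold come from the text above; the function names, the component keys, the "stable" label, and the assumption that every component is already scaled to 0-100 are illustrative.

```python
RISK_WEIGHTS = {
    "groundwater_depth": 0.40,
    "yoy_trend": 0.30,
    "reservoir_stress": 0.20,
    "seasonal_vulnerability": 0.10,
}

def classify_trend(depth_now_m, depth_last_year_m, threshold_m=0.5):
    """Compare depth to water table (mbgl) for the same month in two years."""
    change = depth_now_m - depth_last_year_m
    if change > threshold_m:
        return "declining"    # water table is deeper than last year
    if change < -threshold_m:
        return "improving"    # water table is closer to the surface
    return "stable"           # assumed label for changes within the threshold

def composite_risk(components):
    """Weighted sum of 0-100 component scores, giving a 0-100 ward risk score."""
    return sum(RISK_WEIGHTS[name] * components[name] for name in RISK_WEIGHTS)

score = composite_risk({
    "groundwater_depth": 80,
    "yoy_trend": 50,
    "reservoir_stress": 40,
    "seasonal_vulnerability": 20,
})
```

Because the score is a plain weighted sum, each ward's result can be broken down into the contribution of every component, which is what makes it explainable.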

Live CGWB station overlay

The ward choropleth is sourced from OpenCity's monthly ward-level groundwater dataset, which is authoritative but usually lags by weeks to months. To pair it with ground-truth readings, we also plot ~35 CGWB (Central Ground Water Board) monitoring stations in Chennai district as circle markers, pulled directly from the India WRIS Ground Water Level API.

Manual vs Telemetric stations - why the two can differ sharply

  • Manual stations are quarterly CGWB field-crew readings, usually from shallow dug wells (~5-11 m deep) sampling the unconfined aquifer. This is the water table residents actually pump from.
  • Telemetric stations are DWLR sensors that transmit readings daily. They are usually installed on deeper bore wells or piezometers tapping confined or semi-confined aquifers (often 19-200 m deep), so they answer a different question than the manual dug wells and the two readings should not be conflated.
  • The station panel surfaces the well type, total well depth, and aquifer type from WRIS metadata so you can tell which well is which before comparing readings.

Sensor quality flags

DWLRs fail silently - a broken sensor keeps reporting the same depth forever, which would poison a naive dashboard. The groundwater_wris_latest database view scores every station with a data_quality_flag so suspect readings are surfaced explicitly rather than averaged into the ward colours:

  • stuck - Telemetric station with >=10 readings in the last 60 days whose median daily delta is under 1 cm. This is robust to one-off step changes: a genuinely steady aquifer still passes, but a flat-lined sensor gets caught.
  • stale - The latest reading is older than the station's expected cadence. Mode-aware: Telemetric becomes stale after 14 days (a DWLR should report daily), manual only after 180 days (CGWB resurveys it seasonally).
  • ok - Station has at least one recent reading and is neither stuck nor stale.
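
The three flags can be sketched as a single function. This is an illustration in Python rather than the SQL behind the groundwater_wris_latest view; the function name, the order of the checks (stuck before stale), the treatment of a station with no readings, and the use of consecutive-reading differences as the "daily delta" are all assumptions.

```python
from datetime import date, timedelta
from statistics import median

STALE_AFTER_DAYS = {"telemetric": 14, "manual": 180}

def data_quality_flag(mode, readings, today):
    """mode: "telemetric" or "manual"; readings: (date, depth_m) pairs, oldest first."""
    if not readings:
        return "stale"  # assumption: no readings at all is treated as stale
    recent = [(d, v) for d, v in readings if (today - d).days <= 60]
    if mode == "telemetric" and len(recent) >= 10:
        deltas = [abs(b[1] - a[1]) for a, b in zip(recent, recent[1:])]
        # The median ignores a single step change, so only a flat line trips this.
        if median(deltas) < 0.01:  # under 1 cm
            return "stuck"
    latest_date = readings[-1][0]
    if (today - latest_date).days > STALE_AFTER_DAYS[mode]:
        return "stale"
    return "ok"

# Made-up example: a sensor reporting exactly 12.34 m for 20 days in a row.
today = date(2025, 1, 31)
flat = [(today - timedelta(days=i), 12.34) for i in range(19, -1, -1)]
flag = data_quality_flag("telemetric", flat, today)
```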

On the map, suspect stations render with a neutral grey fill and a dashed amber ring so they never get confused with trustworthy readings, and the station panel shows an amber 'Possible sensor failure' banner with the exact 60-day range and reading count. The legend's sensor-status sub-section exposes toggles so reviewers can hide stuck or stale markers entirely.

Water bodies & restoration

The Water Bodies page combines current OpenStreetMap polygons, a curated set of 15 historically significant lost or encroached water bodies, and a new satellite context layer for a reviewed Phase 1 target set. For selected lakes and reservoirs, the detail panel now shows historical persistence, current spread versus the usual seasonal baseline, and an observation freshness/confidence label.

How we measure water-body spread by season

This is used for the "Satellite Context" block shown on selected lakes and reservoirs.

  • We start from a curated Phase 1 target list instead of all 1,787 mapped water bodies, so we can QA the outputs and avoid noisy tiny ponds or industrial water features.
  • Current spread is estimated from a 45-day Sentinel-2 NDWI composite. NDWI compares green and near-infrared light to detect water, including turbid and dark water that other classifiers miss. We turn the water signal inside each polygon into an observed water-spread area in hectares.
  • Seasonal baseline comes from JRC Global Surface Water monthly recurrence for the same calendar month. This gives us an expected wet-area footprint for March vs April vs monsoon months, instead of comparing everything to one annual average.
  • We compare observed spread to the seasonal baseline, compute a simple anomaly ratio, and label it as much lower, lower, near normal, higher, or much higher. We also compute historical persistence as the share of months where the water body meaningfully holds water.
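
For readers unfamiliar with NDWI, here is a small pure-Python sketch of the index and of turning water pixels into hectares. The real pipeline runs on Earth Engine; the function names, the 0.0 water threshold, and the per-pixel list input are assumptions made for illustration. The 10 m pixel size matches Sentinel-2's optical resolution.

```python
def ndwi(green, nir):
    """Normalized Difference Water Index: (green - NIR) / (green + NIR)."""
    total = green + nir
    return 0.0 if total == 0 else (green - nir) / total

def water_area_ha(pixels, pixel_size_m=10.0, threshold=0.0):
    """pixels: (green, nir) reflectance pairs for pixels inside one polygon.

    Pixels with NDWI above the threshold count as water; 10,000 sq m = 1 ha.
    """
    water_pixels = sum(1 for g, n in pixels if ndwi(g, n) > threshold)
    return water_pixels * pixel_size_m ** 2 / 10_000

def historical_persistence(monthly_has_water):
    """Share of observed months in which the water body held water."""
    return sum(monthly_has_water) / len(monthly_has_water)

area = water_area_ha([(0.30, 0.10), (0.10, 0.30), (0.20, 0.05)])
share = historical_persistence([True, True, False, True])
```

Water reflects more green than near-infrared light, so water pixels have positive NDWI while vegetation and bare soil tend to be negative.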

How we produce reviewed satellite evidence frames

For flagship lakes and reservoirs, the detail panel offers a "See Satellite Evidence" button that opens a dialog with actual Sentinel-2 true-color imagery and a toggleable NDWI water-mask overlay.

  • For each flagship water body and each monthly reference date, the pipeline searches Sentinel-2 imagery within a configurable window.
  • Scenes are ranked by usable coverage, proximity to the reference date, and cloud percentage. The best scene is selected and downloaded as a true-color thumbnail.
  • An NDWI water-mask overlay is computed from the same Sentinel-2 scene's green and near-infrared bands, clipped to the water body boundary, and stored alongside the true-color image.
  • Frames are visually reviewed before being published. Only reviewed frames appear in the evidence dialog by default.
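
Scene selection can be sketched as a filter followed by a sort. The Scene fields, the 15-day default window, and the strict priority order (coverage first, then date proximity, then cloud percentage) are assumptions; the text above names the three criteria but not how they are combined.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Scene:
    scene_id: str           # hypothetical identifier
    acquired: date
    usable_coverage: float  # share of the water body with usable pixels, 0..1
    cloud_pct: float

def pick_best_scene(scenes, reference, window_days=15):
    """Return the best scene within the window around the reference date, or None."""
    candidates = [s for s in scenes
                  if abs((s.acquired - reference).days) <= window_days]
    if not candidates:
        return None
    return min(candidates, key=lambda s: (
        -s.usable_coverage,                  # more usable coverage first
        abs((s.acquired - reference).days),  # then closest to the reference date
        s.cloud_pct,                         # then the least cloudy
    ))

reference = date(2025, 3, 1)
best = pick_best_scene([
    Scene("A", date(2025, 3, 4), 0.95, 10.0),
    Scene("B", date(2025, 2, 28), 0.95, 20.0),
    Scene("C", date(2025, 4, 10), 0.99, 0.0),   # outside the window
], reference)
```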

How we simplify the outputs for users

  • Water-body spread uses an observed/baseline ratio: below 0.60 = much lower, 0.60-0.85 = lower, 0.85-1.15 = near normal, 1.15-1.40 = higher, above 1.40 = much higher.
  • Catchment rainfall uses anomaly buckets against the historical seasonal baseline: <= -50% well below, <= -20% below, < 20% near normal, < 50% above, and >= 50% well above.
  • Low-confidence satellite rows are hidden from the water-body detail panel. We only show the summary when optical coverage is good enough to be useful.
  • Not every mapped water body shows this yet. Phase 1 is limited to a reviewed target set so we can quality-check the outputs before expanding coverage.
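
The two bucketing rules above translate directly into threshold checks. The function names are hypothetical, and because the text does not say which bucket the exact spread boundaries (0.60, 0.85, 1.15, 1.40) fall into, the choice made below is an assumption.

```python
def spread_label(observed_ha, baseline_ha):
    """Label observed water spread against the seasonal baseline."""
    if baseline_ha <= 0:
        return None  # no meaningful baseline to compare against
    ratio = observed_ha / baseline_ha
    if ratio < 0.60:
        return "much lower"
    if ratio < 0.85:
        return "lower"
    if ratio <= 1.15:
        return "near normal"
    if ratio <= 1.40:
        return "higher"
    return "much higher"

def rainfall_label(observed_mm, baseline_mm):
    """Label a catchment rainfall total against the same-season baseline."""
    if baseline_mm <= 0:
        return None
    anomaly_pct = (observed_mm - baseline_mm) / baseline_mm * 100
    if anomaly_pct <= -50:
        return "well below"
    if anomaly_pct <= -20:
        return "below"
    if anomaly_pct < 20:
        return "near normal"
    if anomaly_pct < 50:
        return "above"
    return "well above"
```

For example, a lake showing 50 ha against a 100 ha baseline has a ratio of 0.5 and is labelled much lower, while 130 mm of rain against a 100 mm baseline is a +30% anomaly and is labelled above.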

Lake Restoration Priority

The restoration ranker scores all 1,787 water bodies on restoration priority using a 5-component spatial analysis model. Each component is scored 0-100 and combined as a weighted average:

Water Body Size (25%): Larger water bodies provide greater groundwater recharge and flood mitigation impact.

Proximity to Lost Water Bodies (20%): Water bodies near historically lost lakes are in stressed areas where restoration compensates for lost water surface.

Proximity to Polluted Rivers (20%): Water bodies near dead or degraded river stretches (by dissolved oxygen readings) could serve as settling or treatment wetlands.

Industrial Pollution Proximity (15%): Water bodies near industrial discharge zones face greater contamination risk; restoring them helps protect groundwater.

Water Body Type (20%): Reservoirs and natural lakes are prioritised over canals, drains, and wastewater infrastructure.

Scores are computed from static spatial data and do not account for population density, land ownership, or restoration cost. The ranker is designed to support GCC budget allocation for lake restoration programmes.
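
A sketch of the weighted average and the resulting ranking. The weights are the ones listed above; the component keys, the function names, and the example scores are hypothetical, and how each 0-100 component score is derived from the spatial data is not shown.

```python
RESTORATION_WEIGHTS = {
    "size": 0.25,
    "lost_body_proximity": 0.20,
    "polluted_river_proximity": 0.20,
    "industrial_proximity": 0.15,
    "body_type": 0.20,
}

def restoration_priority(component_scores):
    """Weighted average of five 0-100 component scores."""
    missing = set(RESTORATION_WEIGHTS) - set(component_scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(w * component_scores[name] for name, w in RESTORATION_WEIGHTS.items())

def rank_water_bodies(bodies):
    """bodies: {name: component_scores}. Returns names, highest priority first."""
    return sorted(bodies, key=lambda name: restoration_priority(bodies[name]), reverse=True)

# Made-up example scores for two fictional water bodies.
ranking = rank_water_bodies({
    "Lake X": {"size": 90, "lost_body_proximity": 40, "polluted_river_proximity": 20,
               "industrial_proximity": 10, "body_type": 100},
    "Pond Y": {"size": 10, "lost_body_proximity": 80, "polluted_river_proximity": 30,
               "industrial_proximity": 20, "body_type": 40},
})
```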

Lost & Encroached Water Bodies - Per-Record Sources

Water body | Status | Source
Long Tank (Otteri Nullah) | Fully lost | Care Earth Trust / IIT Madras Water Bodies Study
Nungambakkam Tank | Fully lost | Survey of India historical maps / CMDA
Kodambakkam Lake | Fully lost | Care Earth Trust water body survey
Virugambakkam Lake | Severely reduced | CMDA Master Plan / Care Earth Trust
Pallikaranai Marsh | Severely reduced | Care Earth Trust / Ashoka Trust for Research in Ecology
Perungudi Lake | Fully lost | Care Earth Trust / Greater Chennai Corporation records
Madipakkam Lake | Partially encroached | Madras High Court / NGT records
Sholinganallur Marshland | Severely reduced | Salim Ali Centre for Ornithology / Care Earth Trust
Villivakkam Lake | Partially encroached | CMDA / Revenue records / Care Earth Trust
Kolathur Lake | Partially encroached | Care Earth Trust water body inventory
Tambaram Tank | Severely reduced | Survey of India maps / Tamil Nadu PWD records
Manali Wetlands | Severely reduced | TNPCB / Tamil Nadu Pollution Control Board reports
Ennore Creek Wetlands | Severely reduced | Coastal Management Society / National Green Tribunal (Chennai)
Muttukadu Backwaters | Partially encroached | CMDA Coastal Regulation Zone / Care Earth Trust
Chetpet Lake | Severely reduced | GCC / Care Earth Trust / Madras High Court order 2018

Rivers page

The river map shows four rivers - Cooum, Adyar, Buckingham Canal, and Kosasthalaiyar - colour-coded by overall water quality status derived from CPCB monitoring data.

Flood page

This page overlays historical flood hazard zones from OpenCity on the ward map, together with the Greater Chennai Corporation storm water drain (SWD) network. The goal is to let residents see whether their street sits inside a documented hazard footprint and whether a drain is mapped nearby.

My Ward page

A single-page rollup of every spatial layer the dashboard knows about - reservoirs, groundwater depth, ward risk score, water bodies, rivers, flood zones, drains, sewerage, CGWB stations - all filtered to one ward. It is the deep-link target when you click a ward anywhere in the app.

Ward Report Card

Each of Chennai's 200 wards is ranked on 5 governance-quality metrics. Percentile-based A-F grades compare every ward against the full city. All density metrics are area-normalized (per sq km). The composite score is a weighted sum of per-metric percentiles.

Metric | Weight | Direction
Drainage coverage | 25% | Higher = better
Sewerage infrastructure | 25% | Higher = better
Flood exposure | 25% | Lower = better
Water body health | 15% | Lower = better
Water body density | 10% | Higher = better

Grades: A (80th+ percentile), B (60-79th), C (40-59th), D (20-39th), F (below 20th). The overall grade uses the same thresholds on the composite score's percentile rank.
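
The grading scheme can be sketched as three small functions. The weights, directions, and grade cut-offs come from the table and text above; the metric keys, the function names, and the exact percentile definition (share of wards this ward beats or ties) are assumptions.

```python
# (weight, higher_is_better) per metric, from the table above.
REPORT_CARD_METRICS = {
    "drainage_coverage": (0.25, True),
    "sewerage_infrastructure": (0.25, True),
    "flood_exposure": (0.25, False),
    "water_body_health": (0.15, False),
    "water_body_density": (0.10, True),
}

def percentile_rank(value, city_values, higher_is_better=True):
    """Percentage of wards that this ward beats or ties on one metric."""
    if higher_is_better:
        beaten = sum(1 for v in city_values if v <= value)
    else:
        beaten = sum(1 for v in city_values if v >= value)
    return 100 * beaten / len(city_values)

def composite_score(percentiles):
    """Weighted sum of per-metric percentiles (each 0-100)."""
    return sum(weight * percentiles[name]
               for name, (weight, _) in REPORT_CARD_METRICS.items())

def grade(percentile):
    """Map a percentile (0-100) to an A-F grade."""
    if percentile >= 80:
        return "A"
    if percentile >= 60:
        return "B"
    if percentile >= 40:
        return "C"
    if percentile >= 20:
        return "D"
    return "F"
```

For the overall grade, the composite score would itself be ranked against every other ward's composite score and that percentile passed to the same grade function.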

Uplift Planner

The uplift planner answers: "If I had INR X crore for my ward, where should I invest it?" It uses a greedy budget optimizer to allocate a hypothetical budget across 5 intervention types, maximizing the ward's composite improvement per crore spent.

How it works

  1. Gap analysis: compares the ward's current value on each metric against the city distribution to identify where it lags.
  2. Greedy optimizer: at each step, evaluates every feasible intervention and picks the one with the highest composite-score improvement per crore. Repeats until the budget is spent or all caps are hit.
  3. Exact projection: builds a modified ward profile with the projected metric values and reruns the full ranking engine (computeWardRankings) to determine the exact after-state grade and percentile - not an approximation.

Data-backed caps

Each intervention is capped by real ward data: flood mitigation limited to the actual number of high/very-high hazard zones; water body restoration limited to bodies rated critical or high; revival limited to documented lost bodies. Infrastructure interventions (drains, sewerage) have practical per-ward caps.

Cost estimates

All cost ranges come from published GCC, CMWSSB, Smart Cities Mission, NDMA, and NGO project reports. Each allocation shows a low-high range; the optimizer uses the midpoint. These are illustrative - actual costs depend on site conditions, land, and procurement.

Intervention | Cost/unit | Metric
Build storm drains | 1.5-3.0 Cr/km | Drainage coverage
Extend sewage network | 3.0-6.0 Cr/km | Sewerage infrastructure
Flood zone mitigation | 5-15 Cr/zone | Flood exposure
Restore water bodies | 2-8 Cr/body | Water body health
Revive lost water bodies | 10-25 Cr/body | Water body density
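
The greedy step can be sketched as a loop over feasible interventions. The cost ranges below are taken from the table above and the midpoint rule from the text; the class, the function name, the per-unit gain values, and the caps are made up for illustration. The sketch also holds each gain constant, whereas the optimizer described above re-evaluates the composite-score improvement at every step.

```python
from dataclasses import dataclass

@dataclass
class Intervention:
    name: str
    cost_low_cr: float
    cost_high_cr: float
    gain_per_unit: float  # hypothetical composite-score gain per unit built
    cap: int              # maximum units allowed by the ward's data

    @property
    def cost_cr(self):
        # The optimizer uses the midpoint of the published cost range.
        return (self.cost_low_cr + self.cost_high_cr) / 2

def greedy_allocate(budget_cr, interventions):
    """Repeatedly buy one unit of the intervention with the best gain per crore."""
    remaining = budget_cr
    units = {i.name: 0 for i in interventions}
    while True:
        feasible = [i for i in interventions
                    if units[i.name] < i.cap and i.cost_cr <= remaining]
        if not feasible:
            break  # budget spent or every cap hit
        best = max(feasible, key=lambda i: i.gain_per_unit / i.cost_cr)
        units[best.name] += 1
        remaining -= best.cost_cr
    return units, remaining

plan, leftover = greedy_allocate(15.0, [
    Intervention("Build storm drains", 1.5, 3.0, gain_per_unit=1.0, cap=2),
    Intervention("Flood zone mitigation", 5.0, 15.0, gain_per_unit=6.0, cap=1),
])
```

In this made-up run the flood zone is funded first (0.6 points per crore against roughly 0.44 for drains), then two kilometres of drains, leaving 0.5 crore unspent.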

Intelligence & AI Narratives

Beyond raw data display, Neer Vazhvu runs three intelligence modules daily to generate actionable insights.

Daily Briefing: A template-based intelligence summary generated each morning with a headline, key metrics, threshold-based alerts, and actionable recommendations. No LLM required - purely data-driven.
AI-Generated Narratives: Claude AI reads live reservoir, groundwater, and risk data to generate a daily city briefing and monthly ward-level analysis in both English and Tamil. Each narrative includes source data freshness dates.
Ward Profile Index: Every data layer - water bodies, flood zones, drainage, sewerage, rivers, industrial zones - is spatially mapped to each of Chennai's 200 wards. This enables cross-domain context: click any feature on any map to see its ward's complete water picture.

Data Source Index

All operational data is collected by the Python pipeline and supporting scripts that power the dashboard. Raw source data and Earth Engine summaries are upserted into Supabase (PostgreSQL) and then exposed as small, readable product signals.

Reservoirs & weather

CMWSSB Lake Level Page - Daily (scraped at 06:00 IST)

Daily reservoir levels for 6 reservoirs: Poondi, Cholavaram, Red Hills, Chembarambakkam, Veeranam, and Kannankottai. Includes storage (mcft), water level (ft), inflow/outflow (cusecs), and rainfall (mm).

Open-Meteo - Daily (zero lag)

Primary weather source for Chennai (13.08°N, 80.27°E): precipitation, temperature, humidity, reference evapotranspiration (ET₀), and wind speed. Zero data lag, no API key required. ET₀ is used in the ARIMAX forecasting model to account for reservoir evaporation losses.

NASA POWER (fallback) - Daily (2-day lag)

Fallback weather source. Satellite-derived data for Chennai: precipitation, max/min temperature, and relative humidity. Activated automatically when Open-Meteo is unreachable. 2-day data lag.

OpenCity Chennai (Lake Storage) - Historical (2003-2021)

Monthly reservoir storage data (mcft) for all 6 reservoirs, spanning 2003-2021. Used as historical seed for the forecasting model.

IMD Gridded Rainfall (via imdlib) - One-time generation (refreshed annually)

56-year monthly rainfall history (1970-2025) from IMD's 0.25-degree gridded dataset, extracted for the Chennai grid cell. Includes annual totals and long-term monthly normals for drought/flood/Day Zero year identification.

Groundwater

Station-level groundwater time series for ~35 CGWB monitoring stations in Chennai district, pulled daily from the India WRIS Ground Water Level API. Mix of Manual (quarterly dug wells, unconfined aquifer) and Telemetric (daily DWLR bore wells, confined aquifer) stations with well type, well depth, and aquifer metadata. Each station is scored server-side with a stuck/stale/ok data quality flag.

India WRIS / CGWB (Block Exploitation) - Static fetch (refreshed periodically)

Block-level groundwater exploitation data (2011-2024) from CGWB via India WRIS ArcGIS API. Shows classification (Safe to Over-Exploited), development percentage, net availability, and extraction draft for ~15 blocks in and around Chennai.

OpenCity Chennai (Groundwater) - Monthly (fetched days 1-3)

Ward-wise depth to water table (metres below ground level) for all 200 GCC wards across 15 zones. Sourced from CGWB/GCC monitoring wells. Data available from 2021 onwards.

Water bodies & historical

305 Chennai water bodies from the First Census of Water Bodies (2018-19) by the Ministry of Jal Shakti. Includes ownership, storage capacity (original vs present), encroachment status, depth, construction year, and basin information. Overlaid as markers on the Water Bodies map.

Kaggle Chennai Water Management - One-time historical seed

15 years of daily reservoir data (2004-2019) compiled by Sudalai Rajkumar. Used as additional historical training data for the forecasting model.

15 manually curated lost or encroached water bodies, compiled from published research, court records, and environmental organisation reports. See the Lost & Encroached Water Bodies - Per-Record Sources section above for per-record provenance.

Rivers & pollution

9 restoration projects across Adyar, Cooum, Buckingham Canal, and Kosasthalaiyar rivers from the Chennai Rivers Restoration Trust. Includes project status, budget, area, implementing agencies, and outcome metrics.

Annual reports from the Central Pollution Control Board's National Water Monitoring Programme. Source for DO, BOD, pH, and conductivity readings at monitoring stations on the Cooum, Adyar, Buckingham Canal, and Kosasthalaiyar rivers. Supplemented by IIT Madras / Anna University peer-reviewed studies and NGT Chennai bench orders.

31 geo-located sewage inlets along the Cooum river, Otteri Nullah tributary, and Buckingham Canal. Discharge volumes (m³/day) from PWD Chennai, published in Nature Environment and Pollution Technology, Vol. 16, No. 3. Supplemented by Sheriff & Hussain (2012) groundwater contamination study.

7 major industrial facilities in the Ennore-Manali corridor, curated from NGT Southern Bench orders (2017-2022), TNPCB enforcement records, CPCB industrial monitoring reports, and academic studies. Each facility entry includes pollutant types, documented incidents with volumes and dates, and NGT order summaries.

Flood & drainage

OpenCity Chennai (Flood Hazard Data) - Static GeoJSON (re-run script to refresh)

CFLOWS model flood hazard zones (5 categories), 2015 Chennai flood hotspots with vulnerability ratings, 2015 inundation depth readings, 2020 Cyclone Nivar hotspots, and return period flood maps (5-200yr).

GCC Storm Water Drain Survey - Static GeoJSON (re-run script to refresh)

10,308 official storm water drain segments from GCC survey (2023) across 197 wards, with street name, drain type, depth, width, length, material, and condition status.

CMWSSB Sewerage Network - Static GeoJSON (re-run script to refresh)

CMWSSB sewerage infrastructure: 8 sewage treatment plants (STPs) with capacity, 348 pumping stations (SPS) linked to STPs, and 3,834 pumping main segments with pipe material and size.

Satellite & Earth Engine

NDWI (Normalized Difference Water Index) water masks computed from Sentinel-2 green and near-infrared bands via Google Earth Engine. Used for both the water spread summary numbers and the satellite evidence overlay, replacing Dynamic World for more accurate detection of turbid and dark water.

JRC Global Surface Water monthly recurrence used as the seasonal baseline for the same calendar month. This is how we judge whether current spread is lower or higher than usual for this time of year.

CHIRPS Daily Rainfall - Daily (zero lag)

CHIRPS daily rainfall over reviewed reservoir catchments. We use this for 7, 30, and 90 day rainfall totals and seasonal anomaly buckets on the dashboard.

Copernicus Sentinel-2 (via Earth Engine) - Evidence refresh (manual dispatch)

True-color satellite imagery for reviewed evidence frames. Sentinel-2 captures 10 m resolution optical imagery every 5 days, used to produce visual evidence of water presence at flagship water bodies.

HydroBASINS / MERIT Hydro - Static fetch (refreshed periodically)

HydroBASINS and MERIT Hydro support the reviewed operational catchment polygons used for the four core Chennai supply reservoirs. These geometries are reviewed for storytelling use, not presented as formal legal boundaries.

Base geography

OpenStreetMap (Overpass API) - Static GeoJSON (re-run script to refresh)

All current water bodies (lakes, tanks, reservoirs, ponds, marshes) within the Chennai metropolitan bounding box. Queried via the Overpass API and saved as a static GeoJSON. 1,635 polygon features, ~95,000 ha total surface. Also source for river polyline geometry (Cooum, Adyar, Buckingham Canal, Kosasthalaiyar) and industrial zone polygons in the north Chennai corridor. Data reflects OSM contributor edits as of the last script run.

AI narratives

Anthropic Claude API - Daily / Monthly

AI-generated city and ward narratives connecting reservoir, groundwater, and risk data (Claude Sonnet for city, Haiku for wards).

Data quality & limitations

Known Data Quality Issues

Government census data is invaluable but not perfect. We document known issues here for transparency. If you spot an error, please report it on GitHub.

Census: Mixed units in water_spread_area

The MoJS census methodology specifies hectares for water spread area, but 39 of 286 Chennai records appear to use square metres instead. Example: RETTAI ERI is recorded as 1,053,177 - this is sq m (~105 ha), confirmed against satellite imagery and Wikipedia (87–114 ha actual). Due to this inconsistency, census markers on the map use a uniform size as location indicators rather than representing actual water body area.

Census: Encroachment vs. storage capacity mismatch

Storage capacity and encroachment were surveyed independently. Some water bodies show 70%+ encroachment but 100% storage capacity remaining - the capacity figure was not revised to reflect lost area. These cases are flagged with an amber warning in the detail panel.

Census: Point coordinates only, no boundary polygons

The census provides only a single lat/lon point per water body, not boundary shapes. Where possible, census records are matched to nearby OpenStreetMap polygons (within 200 m) so the actual water body shape is shown and census metadata (ownership, encroachment, capacity) appears in the detail panel. Unmatched census records are shown as small dots at the reported location.

Satellite: seasonal baseline is month-level, not day-level

The current satellite context compares a recent 45-day observation window to JRC monthly recurrence for the same calendar month. This is a strong seasonal reference, but it does not mean we know the exact expected spread for every day of the month.

Catchments: reviewed operational geometry, not legal survey boundary

Poondi, Red Hills, Chembarambakkam, and Cholavaram catchments are built from a mix of HydroBASINS, MERIT Hydro, and local drainage review. They are appropriate for rainfall context and inflow-support storytelling, but should not be treated as official cadastral boundaries.

Known Limitations

  • Estimates are approximations. Actual water availability depends on factors not modeled (groundwater extraction, Krishna water transfer, distribution losses, industrial use).
  • CMWSSB data may occasionally be stale (weekends, holidays). The dashboard shows a freshness indicator.
  • Groundwater data from OpenCity may lag by months. The map always shows the most recent available period.
  • Forecasts use ARIMAX (AutoARIMA with inflow/outflow as exogenous regressors) and work best with 2+ years of daily data.
  • Risk scores are relative indicators for comparison between wards, not absolute measures of water safety.
  • Satellite spread is a summary of surface water extent, not a direct measure of storage volume, water quality, or inflow source. A lake can look broad and still hold less usable water than expected.
  • Reservoir catchment polygons are reviewed operational geometries for rainfall context, not official legal boundaries. This matters especially in Chennai's managed canal and transfer system.
  • Current satellite context relies on optical Sentinel-2 observations. During persistently cloudy periods, some water bodies may temporarily lose this insight until a radar fallback is added.

About the project

Disclaimer

Not an official government tool. Neer Vazhvu is an independent, open-source project. It is not affiliated with, endorsed by, or connected to CMWSSB, GCC, CGWB, or any government body.

Informational purposes only. All data, estimates, and forecasts are provided “as is” for general awareness. Always refer to official CMWSSB advisories for critical decisions.

No personal data collected. Neer Vazhvu does not collect, store, or process any personal information. There are no user accounts, cookies, or analytics trackers.

Open Source

Neer Vazhvu is fully open source. The code, data pipeline, and methodology are transparent and available on GitHub. Contributions, bug reports, and data corrections are welcome.


Support this project

Neer Vazhvu is free and open source. If you find it useful, consider supporting us on Patreon to help cover satellite data, hosting, and API costs.
