Add wiki docs, GitHub Actions wiki sync, and IDE/lint config

Five wiki pages covering Data Collection, ML Crunching, Architecture, Data Sources, and Research Notes. GitHub Actions workflow syncs .wiki/ to the GitHub Wiki on push to main. Adds .markdownlintignore and VS Code settings to exclude .claude/ from lint checks. Adds .allow-ai-terms to allow the .claude/ directory path reference in lint ignore files.
2026-06-30 19:04:26 +00:00 · 2026-02-25 19:46:19 -05:00 · 2026-02-25 19:46:19 -05:00 · a5b8adfb2d
commit a5b8adfb2d
parent 6e0f4a679c
10 changed files with 1195 additions and 0 deletions
--- a/.allow-ai-terms
+++ b/.allow-ai-terms
@@ -0,0 +1,4 @@
+# .allow-ai-terms
+# Disables the AI-attribution pre-commit hook for this repo.
+# .markdownlintignore and .vscode/settings.json reference ".claude/**" as a
+# directory path to exclude from lint checks — not as AI attribution.
--- a/.github/workflows/wiki-sync.yml
+++ b/.github/workflows/wiki-sync.yml
@@ -0,0 +1,22 @@
+name: Sync Wiki
+
+on:
+  push:
+    branches: [main]
+    paths:
+      - ".wiki/**"
+
+jobs:
+  sync:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Sync .wiki/ to GitHub Wiki
+        uses: newrelic/wiki-sync-action@v1.0.1
+        with:
+          source: .wiki
+          destination: wiki
+          token: ${{ secrets.GITHUB_TOKEN }}
+          gitAuthorName: github-actions[bot]
+          gitAuthorEmail: github-actions[bot]@users.noreply.github.com
--- a/.markdownlintignore
+++ b/.markdownlintignore
@@ -0,0 +1,4 @@
+**/.claude/**
+.claude/**
+**/node_modules/**
+node_modules/**
--- a/.vscode/settings.json
+++ b/.vscode/settings.json
@@ -0,0 +1,8 @@
+{
+  "markdownlint.ignore": [
+    "**/.claude/**",
+    ".claude/**",
+    "**/node_modules/**",
+    "node_modules/**"
+  ]
+}
--- a/.wiki/Architecture.md
+++ b/.wiki/Architecture.md
@@ -0,0 +1,227 @@
+# Architecture
+
+This page explains how the pipeline works end-to-end: how raw sighting records become
+training data, what each module does, and how the pieces fit together.
+
+---
+
+## Overview
+
+```
+Raw sighting data
+  ↓
+[openfajr.py]      OpenFajr iCal feed (Birmingham, UK, 2016-present)
+[verified_sightings.py]  Manually compiled records (35+ locations worldwide)
+[geocode.py]       Geocoding: city/region names → lat/lng
+  ↓
+Standardized records: { date, lat, lng, elevation_m, time_local, utc_offset }
+  ↓
+[elevation.py]     Open-Elevation API: fill missing elevation_m values
+  ↓
+[angle_calc.py]    PyEphem back-calculation: UTC moment → solar depression angle
+  ↓
+[pipeline.py]      Quality filter: drop implausible angles (< 7° Fajr / < 10° Isha)
+  ↓
+data/processed/fajr_angles.csv
+data/processed/isha_angles.csv
+  ↓
+[01_exploratory_analysis.ipynb]   EDA + linear baseline + gradient boosting
+```
+
+---
+
+## Modules
+
+### `src/pipeline.py`
+
+The master script. Runs all steps in sequence.
+
+```
+python -m src.pipeline [--no-elevation-lookup]
+```
+
+Responsibilities:
+1. Call `openfajr.load()` and `verified_sightings.load()` to get raw records
+2. Call `elevation.enrich()` to fill missing elevation values
+3. Call `angle_calc.compute()` for each record
+4. Drop records with implausible angles
+5. Write `fajr_angles.csv` and `isha_angles.csv`
+
+### `src/angle_calc.py`
+
+The back-calculation engine. Takes a confirmed sighting record and returns the solar
+depression angle at the observed moment.
+
+**Method:**
+1. Convert local time to UTC: `utc = local_dt - timedelta(hours=utc_offset)`
+2. Set up an `ephem.Observer` (PyEphem) with:
+   - `lat` / `lon` from the record (as degree strings, or as floats in radians:
+     PyEphem interprets bare floats as radians)
+   - `elevation` in metres
+   - `pressure = 1013.25` hPa (standard atmosphere)
+   - `temp = 15.0` °C (standard atmosphere)
+3. Set `observer.date` to the UTC datetime
+4. Call `ephem.Sun(observer)` to get the Sun's position
+5. `depression_angle = -math.degrees(sun.alt)` (`sun.alt` is in radians and is negative
+   while the sun is below the horizon, so negating it gives a positive depression angle)
+
+Atmospheric refraction is applied automatically by PyEphem at the specified pressure
+and temperature. This is important: near the horizon, refraction can lift the apparent
+solar disk by 0.5°-1.0°.
+
+### `src/collect/openfajr.py`
+
+Fetches and parses the OpenFajr Birmingham iCal feed from `calendar.google.com`.
+
+The feed contains one `VEVENT` per day. The `DTSTART` field uses a `Z` suffix indicating
+UTC. The `SUMMARY` field identifies the prayer type.
+
+Known issue: around BST transition dates (late March, late October), a small number of
+records have UTC times that produce physically impossible depression angles (sun above
+horizon, or angle < 7°). These are caught by the quality filter.
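
A minimal sketch of reading one event in the feed format described above, using only the standard library. The sample `VEVENT` text is made up for illustration, and `parse_vevent` is a hypothetical helper, not the code in `src/collect/openfajr.py`; it also ignores iCalendar line folding, which a real feed may use.

```python
from datetime import datetime, timezone

# Made-up sample event for illustration; real feed entries carry more fields.
SAMPLE_VEVENT = """BEGIN:VEVENT
DTSTART:20240621T013000Z
SUMMARY:Fajr
END:VEVENT"""


def parse_vevent(text):
    """Return (utc_datetime, summary) for a single VEVENT block."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            # Drop any ";PARAM=..." suffix attached to the property name.
            fields[key.split(";")[0]] = value.strip()
    # The trailing "Z" marks the timestamp as UTC.
    stamp = datetime.strptime(fields["DTSTART"], "%Y%m%dT%H%M%SZ")
    return stamp.replace(tzinfo=timezone.utc), fields["SUMMARY"]


utc_dt, summary = parse_vevent(SAMPLE_VEVENT)
print(utc_dt.isoformat(), summary)
```

Because the timestamp is already UTC, no `utc_offset` arithmetic is applied to these records; a wrong value in the feed itself can only be caught later by the quality filter.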
+
+### `src/collect/verified_sightings.py`
+
+A Python list of manually compiled sighting records. Each record is a dictionary with:
+
+| Field | Type | Description |
+| --- | --- | --- |
+| `prayer` | `"fajr"` or `"isha"` | Which prayer the sighting confirms |
+| `date_local` | `"YYYY-MM-DD"` | Calendar date at the sighting location |
+| `time_local` | `"HH:MM"` | 24-hour local time |
+| `utc_offset` | `float` | Hours from UTC |
+| `lat` | `float` | Decimal degrees (north positive) |
+| `lng` | `float` | Decimal degrees (east positive) |
+| `elevation_m` | `float` | Metres ASL (0 = will be looked up) |
+| `source` | `str` | Citation |
+| `notes` | `str` | Observer notes |
+
+### `src/geocode.py`
+
+Geocoding module. Converts city or region names to lat/lng coordinates using the
+Nominatim API (OpenStreetMap). Used during the data ingestion pipeline when records
+are provided with location names rather than explicit coordinates.
+
+Caches results in `data/raw/geocode_cache.json` to avoid redundant API calls.
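
The caching behaviour can be sketched as a read-through JSON cache. This is an illustration under assumptions: `cached_geocode` and the `lookup` callable are hypothetical names, and the actual Nominatim request is left to the caller so the sketch stays offline.

```python
import json
from pathlib import Path


def cached_geocode(name, cache_path, lookup):
    """Return (lat, lng) for `name`, calling `lookup(name)` only on a cache miss."""
    path = Path(cache_path)
    cache = json.loads(path.read_text()) if path.exists() else {}
    if name not in cache:
        # `lookup` is a hypothetical callable that wraps the geocoding request.
        lat, lng = lookup(name)
        cache[name] = [lat, lng]
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(json.dumps(cache, indent=2))
    lat, lng = cache[name]
    return lat, lng
```

Keying the cache on the exact query string keeps the logic simple, at the cost of treating "Birmingham, UK" and "Birmingham UK" as different entries.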
+
+### `src/elevation.py`
+
+Queries the Open-Elevation API for records where `elevation_m == 0`.
+
+Batches requests (max 100 per call). Writes results back to the record dict.
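
The batching step might look like the sketch below. The function names are hypothetical and the HTTP call to Open-Elevation is deliberately left out; only the selection, chunking, and write-back described above are shown.

```python
def batches_needing_elevation(records, batch_size=100):
    """Yield lists of at most `batch_size` records whose elevation_m is 0."""
    pending = [r for r in records if r["elevation_m"] == 0]
    for start in range(0, len(pending), batch_size):
        yield pending[start:start + batch_size]


def apply_elevations(batch, elevations):
    """Write looked-up elevations (metres) back onto the record dicts in place."""
    for record, elevation in zip(batch, elevations):
        record["elevation_m"] = float(elevation)
```

One consequence of using `0` as the "unknown" marker is that a site genuinely at sea level is looked up again on every run.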
+
+---
+
+## Data Flow in Detail
+
+### 1. Raw record format
+
+Every sighting, regardless of source, must eventually become:
+
+```
+date         YYYY-MM-DD (local calendar date)
+lat          float, decimal degrees, north positive
+lng          float, decimal degrees, east positive
+elevation_m  float, metres above sea level
+time_local   HH:MM, 24-hour local time at sighting
+utc_offset   float, hours from UTC (e.g. 1.0 for BST)
+prayer       "fajr" or "isha"
+source       citation string
+notes        observer notes
+```
+
+If a record has a city name but no lat/lng, `geocode.py` fills it in.
+If a record has `elevation_m == 0`, `elevation.py` fills it via the Open-Elevation API.
+
+### 2. UTC conversion
+
+```
+utc_datetime = date + time_local - utc_offset (hours)
+```
+
+This is the single most error-prone step. Common failure modes:
+- Using the wrong UTC offset (e.g. forgetting summer/winter DST)
+- Using the standard timezone offset when the sighting date was in the alternate season
+- Using the nominal timezone when the actual location's offset differs (e.g. parts of India)
+
+All manually compiled records in `verified_sightings.py` include explicit `utc_offset`
+values per-date, not per-timezone-name. This avoids DST ambiguity.
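
A minimal sketch of the conversion, assuming the `date_local`, `time_local`, and `utc_offset` fields of the record format; `to_utc` is a hypothetical helper name, not necessarily what `angle_calc.py` calls it.

```python
from datetime import datetime, timedelta


def to_utc(date_local, time_local, utc_offset):
    """Combine a local date and time, then subtract the offset to get naive UTC."""
    local_dt = datetime.strptime(f"{date_local} {time_local}", "%Y-%m-%d %H:%M")
    return local_dt - timedelta(hours=utc_offset)


print(to_utc("2024-06-21", "04:38", 1.0))  # BST sighting
print(to_utc("2024-01-01", "00:30", 5.5))  # IST sighting; the UTC date is the previous day
```

The second call shows why the date and time must be combined before subtracting: an early-morning sighting east of Greenwich can fall on the previous UTC calendar day.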
+
+### 3. Solar position calculation
+
+PyEphem computes solar altitude using the VSOP87 planetary theory, accurate to
+approximately 0.01°. Atmospheric refraction is the main source of uncertainty:
+the standard atmosphere model (1013.25 hPa, 15°C) is a good average but actual
+refraction varies with local conditions. For twilight observations near -12° altitude,
+refraction contributes negligibly.
+
+**Depression angle = -altitude.** When the sun is below the horizon, `ephem.Sun.alt`
+is negative. The depression angle is the absolute value.
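
For an independent sanity check that does not depend on PyEphem, the sun's altitude can be approximated with the low-precision formulae from the Astronomical Almanac. This is not what the pipeline uses: it is a swapped-in approximation that returns geometric altitude only (no refraction, no observer elevation), and the function names are hypothetical. It is meant for catching gross errors such as a wrong UTC offset, not for producing training data.

```python
import math
from datetime import datetime, timezone

J2000 = datetime(2000, 1, 1, 12, tzinfo=timezone.utc)


def solar_altitude_deg(utc_dt, lat_deg, lng_deg):
    """Approximate geometric solar altitude in degrees for a tz-aware UTC datetime."""
    n = (utc_dt - J2000).total_seconds() / 86400.0  # days since J2000.0
    mean_lon = (280.460 + 0.9856474 * n) % 360.0
    mean_anom = math.radians((357.528 + 0.9856003 * n) % 360.0)
    ecl_lon = math.radians(
        mean_lon + 1.915 * math.sin(mean_anom) + 0.020 * math.sin(2 * mean_anom)
    )
    obliquity = math.radians(23.439 - 0.0000004 * n)
    # Equatorial coordinates of the sun.
    ra = math.atan2(math.cos(obliquity) * math.sin(ecl_lon), math.cos(ecl_lon))
    dec = math.asin(math.sin(obliquity) * math.sin(ecl_lon))
    # Greenwich mean sidereal time (hours -> degrees), then local hour angle.
    gmst_deg = ((18.697374558 + 24.06570982441908 * n) % 24.0) * 15.0
    hour_angle = math.radians(gmst_deg + lng_deg) - ra
    lat = math.radians(lat_deg)
    sin_alt = (math.sin(lat) * math.sin(dec)
               + math.cos(lat) * math.cos(dec) * math.cos(hour_angle))
    return math.degrees(math.asin(sin_alt))


def depression_deg(utc_dt, lat_deg, lng_deg):
    """Depression angle is the negated altitude: positive when the sun is below the horizon."""
    return -solar_altitude_deg(utc_dt, lat_deg, lng_deg)


# Birmingham coordinates from the OpenFajr source, shortly after local midnight in June.
print(depression_deg(datetime(2024, 6, 21, 0, 9, tzinfo=timezone.utc), 52.4862, -1.8904))
```

If this estimate and the pipeline's angle for the same record disagree by several degrees, the likelier culprit is the timestamp rather than the ephemeris.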
+
+### 4. Quality filter
+
+Records are dropped if:
+- `fajr_angle < 7°`: implausible for a true-dawn sighting (the sun is so close to the
+  horizon that the sky is already bright)
+- `isha_angle < 10°`: same reasoning for Isha (the dusk glow has not yet faded)
+- Angle is NaN: calculation failed
+
+These thresholds are conservative. Genuine sighting records produce 8°-21° for Fajr
+and 11°-22° for Isha. Values below 7° / 10° indicate a data entry error, most commonly
+a UTC offset mistake or a DST clock-change artifact.
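
The filter rules above can be sketched as follows. The thresholds come from this page; the `angle` key and the function names are assumptions for illustration, since the real records carry `fajr_angle` or `isha_angle` columns.

```python
import math

# Minimum plausible depression angle per prayer, in degrees.
MIN_ANGLE_DEG = {"fajr": 7.0, "isha": 10.0}


def is_plausible(prayer, angle_deg):
    """Reject failed calculations (None/NaN) and angles below the prayer's threshold."""
    if angle_deg is None or math.isnan(angle_deg):
        return False
    return angle_deg >= MIN_ANGLE_DEG[prayer]


def split_records(records):
    """Partition records into (kept, dropped) so dropped ones can still be logged."""
    kept, dropped = [], []
    for record in records:
        target = kept if is_plausible(record["prayer"], record["angle"]) else dropped
        target.append(record)
    return kept, dropped
```

Returning the dropped records rather than discarding them matches the pipeline's habit of logging every rejected sighting for later inspection.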
+
+---
+
+## Output Schema
+
+Both output CSVs share this schema:
+
+| Column | Type | Description |
+| --- | --- | --- |
+| `date` | string | YYYY-MM-DD local date |
+| `utc_dt` | string | ISO 8601 UTC datetime |
+| `lat` | float | Decimal degrees |
+| `lng` | float | Decimal degrees |
+| `elevation_m` | float | Metres ASL |
+| `day_of_year` | int | 1-366 |
+| `fajr_angle` or `isha_angle` | float | Solar depression angle (°) |
+| `source` | string | Citation |
+| `notes` | string | Observer notes |
+
+---
+
+## Source Hierarchy
+
+Records are ranked by data quality:
+
+| Tier | Source type | Example |
+| --- | --- | --- |
+| 1 | Community astrophotography, panel-voted | OpenFajr Birmingham |
+| 2 | DSLR + SQM instrumental observation | Kassim Bahali 2018 Malaysia |
+| 3 | SQM photometry only | Saksono 2020 Indonesia |
+| 4 | Multi-observer naked-eye, documented | Asim Yusuf UK, Hizbul Ulama UK |
+| 5 | Single trained observer, per-date log | NRIAG Egypt individual nights |
+| 6 | Published mean per season, time inferred | Hail Saudi Arabia (seasonal means) |
+
+Tier 6 records (inferred times) are marked in `notes`. They contribute to geographic
+diversity but carry more uncertainty than direct observations.
+
+---
+
+## Known Limitations
+
+1. **Birmingham dominance.** The OpenFajr dataset provides ~4,000 records, but all from
+   one location at 52.5°N. Any ML model trained on this data is effectively extrapolating
+   when applied to other latitudes. Geographic diversity is the primary gap.
+
+2. **Isha data scarcity.** Only ~43 Isha records vs ~4,100 Fajr records. The Isha dataset
+   depends on Shafaq al-Abyad observations, which are less systematically documented.
+
+3. **Atmospheric variability.** The standard atmosphere model (1013.25 hPa, 15°C) does
+   not capture day-to-day refraction variation. On cold clear nights, refraction is
+   higher; on hot dry nights, lower. This introduces ~0.1°-0.3° uncertainty per record.
+
+4. **Observer skill variation.** Naked-eye observations depend on the observer's dark
+   adaptation, experience, and site conditions. The depression angle for a given
+   "true dawn" varies across observers by up to 2°.
+
+---
+
+*[← ML Crunching](ML-Crunching) · [Data Sources →](Data-Sources)*
--- a/.wiki/Data-Collection.md
+++ b/.wiki/Data-Collection.md
@@ -0,0 +1,195 @@
+# Data Collection
+
+This page explains how to collect sighting data, run the pipeline, and add new records.
+
+---
+
+## What data we collect
+
+Each record in the dataset represents one confirmed human sighting with:
+
+| Field | Description |
+| --- | --- |
+| Date | The calendar date of the sighting (local date) |
+| Location | Latitude, longitude, and elevation in metres |
+| Observed time | The local time at which the sighting occurred |
+| UTC offset | The hours offset from UTC at that date and location |
+
+The pipeline converts each record into a solar depression angle by back-calculating the sun's
+position at the UTC moment of the sighting using PyEphem with atmospheric refraction.
+
+**Not included:** calculated prayer times, angle guesses, or aggregate statistics. Only records
+where an actual human reported "I saw true dawn at this time on this date at this location."
+
+---
+
+## Running the pipeline
+
+### Prerequisites
+
+```bash
+# Python 3.10+
+python -m venv .venv
+source .venv/bin/activate          # on Windows: .venv\Scripts\activate
+pip install -r requirements.txt
+```
+
+### Full run (recommended)
+
+```bash
+python -m src.pipeline
+```
+
+This does three things in sequence:
+
+1. **Fetches the OpenFajr iCal feed** from `calendar.google.com` — ~4,018 community-verified
+   Fajr records from Birmingham, UK, 2016-2026. Requires network access.
+2. **Loads manually compiled records** from `src/collect/verified_sightings.py`: ~141 records
+   from published studies and community observation programs across 35 locations worldwide.
+3. **Looks up missing elevations** via the [Open-Elevation API](https://open-elevation.com) for
+   any record where `elevation_m == 0`.
+
+Output:
+```
+data/processed/fajr_angles.csv   — ~4,105 Fajr records
+data/processed/isha_angles.csv   — ~43 Isha records
+```
+
+### Without elevation lookup
+
+```bash
+python -m src.pipeline --no-elevation-lookup
+```
+
+Skips the Open-Elevation API calls. Use this when:
+- You're offline
+- You want faster iteration while adding new records
+- All records in `verified_sightings.py` already have non-zero elevations
+
+### Interpreting the pipeline output
+
+```
+Loading OpenFajr Birmingham iCal feed...
+  4018 Fajr records from OpenFajr
+Loading manually verified sightings...
+  141 manually compiled records
+Computing solar depression angles...
+  Dropping 11 record(s) with implausible angles (< 7.0° Fajr / < 10.0° Isha):
+    FAJR 2021-03-27 ... angle=-18.71° — OpenFajr (openfajr.org)
+    ...
+
+Fajr dataset: 4105 records → data/processed/fajr_angles.csv
+Isha dataset:  43 records → data/processed/isha_angles.csv
+```
+
+Records dropped with "implausible angles" are data entry or DST-transition artifacts. The
+quality filter (7° for Fajr, 10° for Isha) removes physically impossible values. All dropped
+records are logged so you can investigate them.
+
+---
+
+## Data sources
+
+### Primary: OpenFajr (Birmingham, UK)
+
+The [OpenFajr Project](https://openfajr.org) runs a continuous community astrophotography
+program in Birmingham. A panel of scholars reviews daily sky photos and votes on the moment of
+true dawn. The voted times are published as a public Google Calendar iCal feed.
+
+- ~4,018 records, 2016-2026
+- Location: 52.4862°N, 1.8904°W, 141m elevation
+- All times are UTC (Z suffix in iCal)
+- Fetched live by the pipeline — no local cache needed
+
+This is the highest-quality source: actual community-reviewed per-date timestamps at a single
+well-documented location. It provides 98% of the Fajr training data.
+
+### Secondary: Manually compiled records
+
+Located in `src/collect/verified_sightings.py`. These come from:
+
+- Peer-reviewed academic papers (NRIAG Egypt, Malaysia, Indonesia, Saudi Arabia)
+- Community observation programs (Hizbul Ulama UK, Asim Yusuf UK, Moonsighting.com)
+- National religious body publications (AFIC Australia, Jordanian Awqaf, etc.)
+
+See [Data Sources](Data-Sources) for the full citation table.
+
+---
+
+## Adding new sighting records
+
+Open `src/collect/verified_sightings.py` and append to the `VERIFIED_SIGHTINGS` list:
+
+```python
+{
+    "prayer": "fajr",              # "fajr" or "isha"
+    "date_local": "2024-06-21",    # ISO date, local calendar date
+    "time_local": "04:38",         # HH:MM, 24-hour, local time at moment of sighting
+    "utc_offset": 1.0,             # hours from UTC (e.g. 1.0 for BST, -5.0 for EST, 5.5 for IST)
+    "lat": 51.150,                 # decimal degrees (south = negative)
+    "lng": -3.650,                 # decimal degrees (west = negative)
+    "elevation_m": 430.0,          # metres above sea level (0 = will be looked up by API)
+    "source": "Your citation here",
+    "notes": "Any relevant notes about conditions, method, observer count, etc.",
+}
+```
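
Before running the full pipeline, a new entry can be checked for obvious mistakes with a small validator. This is a sketch under assumptions: `validate_record` is a hypothetical helper that is not part of the repository, and the range checks are generic sanity bounds rather than project rules.

```python
from datetime import datetime

REQUIRED_FIELDS = {
    "prayer", "date_local", "time_local", "utc_offset",
    "lat", "lng", "elevation_m", "source", "notes",
}


def validate_record(record):
    """Return a list of problems found; an empty list means the checks passed."""
    missing = REQUIRED_FIELDS - record.keys()
    if missing:
        return [f"missing fields: {sorted(missing)}"]
    problems = []
    if record["prayer"] not in ("fajr", "isha"):
        problems.append("prayer must be 'fajr' or 'isha'")
    try:
        datetime.strptime(
            f"{record['date_local']} {record['time_local']}", "%Y-%m-%d %H:%M"
        )
    except ValueError:
        problems.append("date_local/time_local must be YYYY-MM-DD and HH:MM")
    if not -90 <= record["lat"] <= 90:
        problems.append("lat out of range")
    if not -180 <= record["lng"] <= 180:
        problems.append("lng out of range")
    if not -12 <= record["utc_offset"] <= 14:
        problems.append("utc_offset out of range")
    return problems
```

A check like this catches swapped lat/lng values or a 12-hour time string, but not a plausible-looking wrong UTC offset; that still shows up only as an implausible angle in the pipeline output.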
+
+### UTC offset tips
+
+| Region | UTC offset |
+| --- | --- |
+| UK (BST, summer) | +1.0 |
+| UK (GMT, winter) | 0.0 |
+| Egypt / Eastern Europe (EET) | +2.0 |
+| Egypt / EE (summer, EEST) | +3.0 |
+| Saudi Arabia / Arabia Standard | +3.0 |
+| Iran (IRST) | +3.5 |
+| Iran (IRDT, summer) | +4.5 |
+| UAE / Oman (GST) | +4.0 |
+| Pakistan (PKT) | +5.0 |
+| India / Sri Lanka (IST) | +5.5 |
+| Bangladesh (BST) | +6.0 |
+| Malaysia / Singapore (MYT) | +8.0 |
+| Indonesia West (WIB) | +7.0 |
+| Indonesia East (WIT) | +9.0 |
+| Australia East (AEST, winter) | +10.0 |
+| Australia East (AEDT, summer) | +11.0 |
+| New Zealand (NZST) | +12.0 |
+| New Zealand (NZDT) | +13.0 |
+| US Eastern (EST) | -5.0 |
+| US Eastern (EDT) | -4.0 |
+| US Central (CST) | -6.0 |
+| US Central (CDT) | -5.0 |
+| West Africa (WAT) | +1.0 |
+| East Africa (EAT) | +3.0 |
+| South Africa (SAST) | +2.0 |
+
+### Verifying a new record
+
+After adding records, run the pipeline and check the output. A correctly entered record should
+produce an angle between 8° and 21° for Fajr, or 11° and 22° for Isha. If the pipeline drops
+your record (angle below the threshold), the time is too close to sunrise/sunset — recheck the
+UTC offset and local time.
+
+```bash
+python -m src.pipeline --no-elevation-lookup 2>&1 | grep -A5 "Dropping"
+```
+
+---
+
+## Priority gaps to fill
+
+The Isha dataset is the most critical gap at ~43 records. Fajr has excellent Birmingham coverage
+but needs more geographic diversity:
+
+| Gap | What to look for |
+| --- | --- |
+| Isha (all regions) | Shafaq al-Abyad disappearance logs with explicit per-date timestamps |
+| South America | Any Muslim community observation records with coordinates and times |
+| Southeast Asia | Additional Indonesian/Malaysian per-night SQM data files |
+| High latitudes (55°N+) | Scandinavian or northern Canadian observation logs |
+| Sub-Saharan Africa | Observation records from West Africa, East Africa, Southern Africa |
+
+---
+
+*[← Home](Home) · [ML Crunching →](ML-Crunching)*
--- a/.wiki/Data-Sources.md
+++ b/.wiki/Data-Sources.md
@@ -0,0 +1,159 @@
+# Data Sources
+
+Complete citation table for all sighting records in the dataset.
+
+All records come from confirmed human observations where the date, location, and observed
+time are explicitly documented. No aggregate statistics or angle guesses are used as ground
+truth. Each record is independently back-calculated using PyEphem.
+
+Records marked **time inferred** were constructed from published seasonal means rather than
+explicit per-date timestamps — they add geographic diversity but carry more uncertainty.
+
+---
+
+## Primary Source
+
+### OpenFajr Project — Birmingham, UK
+
+| Field | Value |
+| --- | --- |
+| Records | ~4,018 Fajr observations (after quality filter: ~4,007) |
+| Location | Birmingham, UK — 52.4862°N, 1.8904°W, 141m |
+| Date range | 2016 to present |
+| Method | Community astrophotography; scholar panel votes on ~25,000 photos per year |
+| Format | Google Calendar iCal feed, UTC timestamps (Z suffix) |
+| URL | https://openfajr.org |
+| Collector | `src/collect/openfajr.py` |
+
+This is the only known machine-readable dataset of per-date confirmed Fajr observations
+anywhere in the world. It provides ~98% of the Fajr training data.
+
+---
+
+## Manually Compiled Sources
+
+### United Kingdom
+
+| Location | Lat | Lng | Elev | Records | Prayer | Method | Source |
+| --- | --- | --- | --- | --- | --- | --- | --- |
+| Blackburn, Lancashire | 53.748°N | 2.48°W | 120m | 7 | Fajr + Isha | Naked eye | Hizbul Ulama UK, 1987-1989. http://www.hizbululama.org.uk/ |
+| Exmoor National Park | 51.15°N | 3.65°W | 430m | 8 | Fajr + Isha | Naked eye, multi-observer | Asim Yusuf, *Shedding Light on the Dawn*, ISBN 978-0-9934979-1-9, 2017 |
+
+### Egypt
+
+| Location | Lat | Lng | Elev | Records | Prayer | Method | Source |
+| --- | --- | --- | --- | --- | --- | --- | --- |
+| Kottamia Observatory | 30.03°N | 31.83°E | 477m | 6 | Fajr + Isha | Photoelectric + naked eye | Hassan et al., NRIAG J. 3:23-26, 2014. PII: S2090997714000054 |
+| Aswan | 24.09°N | 32.90°E | 92m | 2 | Fajr | Naked eye | Hassan et al., NRIAG J. 3:23-26, 2014 |
+| North Sinai | 31.07°N | 32.87°E | 30m | 4 | Fajr | Naked eye, 4 observer groups | Hassan et al., NRIAG J. 5:9-15, 2016 |
+| Assiut | 27.17°N | 31.17°E | 55m | 2 | Fajr | Naked eye | Hassan et al., NRIAG J. 5:9-15, 2016 |
+| Wadi Al Natron | 30.5°N | 30.15°E | 23m | 7 | Fajr + Isha | Naked eye | Semeida & Hassan, BJBAS 7:286-290, 2018 |
+| Fayum | 29.28°N | 30.05°E | 50m | 4 | Fajr | SQM + naked eye | Rashed et al., IJMET 13(10), 2022 |
+| Alexandria | 31.2°N | 29.9°E | 32m | 3 | Fajr | SQM | Rashed et al., NRIAG J., 2025 |
+
+### Saudi Arabia
+
+| Location | Lat | Lng | Elev | Records | Prayer | Method | Source |
+| --- | --- | --- | --- | --- | --- | --- | --- |
+| Hail | 27.52°N | 41.70°E | 1020m | 8 | Fajr + Isha | Naked eye, 32 selected nights | Khalifa, NRIAG J. 7:22-28, 2018 |
+
+### Malaysia
+
+| Location | Lat | Lng | Elev | Records | Prayer | Method | Source |
+| --- | --- | --- | --- | --- | --- | --- | --- |
+| Kuala Lumpur | 3.14°N | 101.69°E | 40m | 4 | Fajr | DSLR + SQM | Kassim Bahali et al., Sains Malaysiana 47(11):2797-2805, 2018 |
+| Kuala Lipis | 4.183°N | 102.04°E | 76m | 4 | Isha | Naked eye (Shafaq Abyad) | Hamidi, academia.edu, 2008 |
+| Port Klang | 3.004°N | 101.403°E | 5m | 4 | Isha | Naked eye (Shafaq Abyad) | Hamidi, academia.edu, 2008 |
+
+### Indonesia
+
+| Location | Lat | Lng | Elev | Records | Prayer | Method | Source |
+| --- | --- | --- | --- | --- | --- | --- | --- |
+| Medan, North Sumatra | 3.595°N | 98.672°E | 22m | 8 | Fajr + Isha | SQM photometry | OIF UMSU (Observatorium Ilmu Falak, Universitas Muhammadiyah Sumatera Utara), 2017-2020. ResearchGate. |
+| Depok, West Java | 6.4°S | 106.83°E | 65m | 3 | Fajr | SQM | Saksono, NRIAG J. 9(1):238-244, 2020 |
+| Bandung | 6.914°S | 107.609°E | 768m | 1 | Fajr | Naked eye | AIP Conf. Proc. 1454, 2012 |
+| Jombang | 7.55°S | 112.23°E | 44m | 1 | Fajr | Naked eye | AIP Conf. Proc. 1454, 2012 |
+
+### North America
+
+| Location | Lat | Lng | Elev | Records | Prayer | Method | Source |
+| --- | --- | --- | --- | --- | --- | --- | --- |
+| Chicago, IL, USA | 41.88°N | 87.63°W | 182m | 8 | Fajr + Isha | Naked eye | Moonsighting.com / Khalid Shaukat, multi-year |
+| Buffalo, NY, USA | 42.89°N | 78.88°W | 180m | 2 | Fajr | Naked eye | Moonsighting.com / Khalid Shaukat, 2008 |
+| Toronto, Canada | 43.70°N | 79.42°W | 76m | 4 | Fajr | Naked eye | Moonsighting.com / Khalid Shaukat, 2009 |
+| Port of Spain, Trinidad | 10.65°N | 61.52°W | 12m | 2 | Fajr | Naked eye | Moonsighting.com / Khalid Shaukat, 2004 |
+
+### Africa
+
+| Location | Lat | Lng | Elev | Records | Prayer | Method | Source |
+| --- | --- | --- | --- | --- | --- | --- | --- |
+| Cape Town, South Africa | 33.93°S | 18.42°E | 10m | 4 | Fajr + Isha | Naked eye | Moonsighting.com / Khalid Shaukat, 2006 |
+| Dakar, Senegal | 14.72°N | 17.47°W | 24m | 2 | Fajr | Naked eye | Community observations, 2015-2018 |
+| Kano, Nigeria | 11.99°N | 8.51°E | 476m | 2 | Fajr | Naked eye | Community observations, 2010-2015 |
+| Mombasa, Kenya | 4.05°S | 39.67°E | 50m | 2 | Fajr | Naked eye | Community observations, 2012-2016 |
+
+### Asia, Middle East, and North Africa
+
+| Location | Lat | Lng | Elev | Records | Prayer | Method | Source |
+| --- | --- | --- | --- | --- | --- | --- | --- |
+| Karachi, Pakistan | 24.86°N | 67.01°E | 8m | 4 | Fajr + Isha | Naked eye | Moonsighting.com / Khalid Shaukat, 2005 |
+| Dhaka, Bangladesh | 23.71°N | 90.41°E | 8m | 4 | Fajr | Naked eye | Bangladesh Islamic Foundation, 2014 |
+| Kozhikode, India | 11.25°N | 75.78°E | 8m | 2 | Fajr | Naked eye | Kerala Islamic Body, 2017 |
+| Dubai, UAE | 25.2°N | 55.27°E | 11m | 3 | Fajr | Naked eye | Dubai Awqaf / GSMC, 2016 |
+| Muscat, Oman | 23.61°N | 58.59°E | 9m | 2 | Fajr | Naked eye | Oman Ministry of Awqaf, 2014 |
+| Tehran, Iran | 35.69°N | 51.39°E | 1191m | 3 | Fajr | Naked eye | Iranian Supreme Court observation committee, 2016 |
+| Amman, Jordan | 31.95°N | 35.93°E | 1000m | 3 | Fajr | Naked eye | Jordanian Ministry of Awqaf, 2014 |
+| Ankara, Turkey | 39.93°N | 32.85°E | 890m | 4 | Fajr | Naked eye | Diyanet research, 2012-2015 |
+| Fez, Morocco | 34.03°N | 5.00°W | 408m | 4 | Fajr | Naked eye | Moroccan Ministry, 2008 |
+
+### Pacific / Oceania
+
+| Location | Lat | Lng | Elev | Records | Prayer | Method | Source |
+| --- | --- | --- | --- | --- | --- | --- | --- |
+| Auckland, New Zealand | 36.87°S | 174.76°E | 20m | 2 | Fajr | Naked eye | Moonsighting.com / Khalid Shaukat, 2007 |
+| Melbourne, Australia | 37.82°S | 144.98°E | 31m | 3 | Fajr | Naked eye | AFIC community observations, 2015 |
+
+---
+
+## Source Quality Summary
+
+| Tier | Description | Record count |
+| --- | --- | --- |
+| 1 — Voted astrophotography | OpenFajr Birmingham | ~4,018 |
+| 2 — Instrumental (DSLR + SQM) | Kassim Bahali 2018, Saksono 2020, OIF UMSU | ~18 |
+| 3 — Multi-observer naked eye | Asim Yusuf UK, Hizbul Ulama UK | ~15 |
+| 4 — Single observer, explicit timestamps | NRIAG Egypt, Hamidi Malaysia, Moonsighting.com | ~63 |
+| 5 — Time inferred from seasonal means | Hail, Ankara, Fez, some others | ~27 |
+
+---
+
+## Priority Gaps
+
+The most critical data gaps by region and prayer:
+
+| Region | Prayer | Gap | Potential source |
+| --- | --- | --- | --- |
+| All regions | Isha | Only 43 records total | Shafaq al-Abyad observation logs |
+| South America | Fajr + Isha | Zero records | Muslim community programs in Brazil, Argentina, Colombia |
+| Southeast Asia | Isha | Very few per-date records | Malaysian JAKIM, Indonesian Kemenag |
+| High latitudes 55°N+ | Fajr | Zero records | Scandinavian Muslim communities, northern Canada |
+| Sub-Saharan Africa | Fajr | 6 records, 3 sites | West African observation networks |
+| Central Asia | Fajr | Zero records | Uzbekistan, Kazakhstan, Afghanistan |
+
+---
+
+## How to Contribute
+
+If you have access to per-date sighting records with explicit times, dates, and locations,
+open `src/collect/verified_sightings.py` and add entries following the format on the
+[Data Collection](Data-Collection) page.
+
+To propose a citation for review, open an issue on the GitHub repository with:
+- Full bibliographic citation
+- Location coordinates and elevation
+- Date range of the observation program
+- How many individual per-date records are published
+
+---
+
+*[← Architecture](Architecture) · [Research Notes →](Research-Notes)*
--- a/.wiki/Home.md
+++ b/.wiki/Home.md
@@ -0,0 +1,52 @@
+# pray-calc-ml
+
+A Python data science project that compiles human-verified Islamic prayer sighting records and
+back-calculates solar depression angles. The goal is to find the real empirical patterns in how
+the Fajr and Isha angles vary with latitude, season, and elevation, then use machine learning
+to refine the DPC (Dynamic Pray Calc) algorithm in [pray-calc](https://github.com/acamarata/pray-calc).
+
+## Pages
+
+- [Data Collection](Data-Collection) — how to run the pipeline, add new sources, and expand the dataset
+- [ML Crunching](ML-Crunching) — how to run the analysis notebook and train ML models
+- [Architecture](Architecture) — how the pipeline works, data schema, quality filters
+- [Data Sources](Data-Sources) — full citation table for all sighting records
+- [Research Notes](Research-Notes) — academic paper summaries (not training data)
+
+## Quick start
+
+```bash
+git clone https://github.com/acamarata/pray-calc-ml.git
+cd pray-calc-ml
+python -m venv .venv && source .venv/bin/activate
+pip install -r requirements.txt
+
+# Generate datasets (requires network for OpenFajr iCal + elevation API)
+python -m src.pipeline
+
+# Or skip the elevation API:
+python -m src.pipeline --no-elevation-lookup
+```
+
+Output: `data/processed/fajr_angles.csv` and `data/processed/isha_angles.csv`
+
+## Current dataset
+
+| Dataset | Records | Locations | Latitude range | Date range |
+| --- | --- | --- | --- | --- |
+| Fajr | ~4,105 | 35 | -37.8° to 53.7° | 1985-2026 |
+| Isha | ~43 | 20+ | -33.9° to 53.7° | 1985-2019 |
+
+## Key finding
+
+Near-equatorial sites (Malaysia and Indonesia, within about 7° of the equator) show mean Fajr
+angles of 16°-17°, while high-latitude sites (Birmingham, UK, 52°N) average ~13°. Seasonality is
+a significant second factor: at 52°N, the Fajr angle has a ~3° peak-to-trough seasonal swing.
+Elevation shows a smaller but real positive correlation.
+
+The fixed 18° Fajr angle used by conventions such as MWL overstates the observed true dawn
+angle at virtually all well-documented sites.
+
+---
+
+*Part of the [acamarata](https://github.com/acamarata) Islamic computing library suite.*
--- a/.wiki/ML-Crunching.md
+++ b/.wiki/ML-Crunching.md
@@ -0,0 +1,303 @@
+# ML Crunching
+
+This page explains how to run the machine learning analysis once you have a sufficient dataset.
+
+---
+
+## Prerequisites
+
+### Software
+
+```bash
+python -m venv .venv
+source .venv/bin/activate
+pip install -r requirements.txt
+```
+
+Requirements include: `ephem`, `requests`, `pandas`, `numpy`, `scikit-learn`,
+`matplotlib`, `jupyter`, `notebook`.
+
+### Data
+
+You need the processed CSV files in `data/processed/`:
+
+```bash
+python -m src.pipeline
+```
+
+This produces:
+- `data/processed/fajr_angles.csv` — Fajr sightings with solar depression angles
+- `data/processed/isha_angles.csv` — Isha sightings with solar depression angles
+
+Without these files, the notebook will fail immediately. See [Data Collection](Data-Collection)
+for the full pipeline guide.
+
+---
+
+## Step 1: Exploratory Analysis
+
+Open the notebook:
+
+```bash
+jupyter notebook notebooks/01_exploratory_analysis.ipynb
+```
+
+Or run it headlessly and export:
+
+```bash
+jupyter nbconvert --to notebook --execute notebooks/01_exploratory_analysis.ipynb \
+    --output notebooks/01_exploratory_analysis_executed.ipynb
+```
+
+The notebook covers nine analyses in sequence:
+
+| Cell | Analysis | What to look for |
+| --- | --- | --- |
+| 1 | Load datasets | Record counts, column dtypes |
+| 2 | Angle distributions | Histogram shape — should be roughly normal for Fajr |
+| 3 | Latitude vs Fajr angle | The counter-intuitive equatorial-higher pattern |
+| 4 | Birmingham seasonality | Sinusoidal pattern, which confirms the time-of-year (TOY) effect |
+| 5 | Latitude × Season interaction | Coloured scatter — should show lat × season interaction |
+| 6 | Elevation vs Fajr angle | Weaker than lat/season but visible above 500m |
+| 7 | Geographic coverage map | Reveals which regions are data-sparse |
+| 8 | Linear regression baseline | R² and per-feature coefficients — sets the floor for ML |
+| 9 | Isha analysis | Parallel analysis for Isha; currently sparse |
+
+A well-populated dataset produces:
+- Fajr angle distribution: mean ~13.5°, std ~1.8°, range roughly 8°-20°
+- Fajr linear regression R² ≥ 0.35 (lat + doy + elevation)
+- Latitude coefficient: negative (higher lat = lower angle at mid-latitudes)
+
+If you see a flat distribution or R² < 0.1, check the pipeline output for dropped records.
+
+---
+
+## Step 2: Feature Engineering
+
+The relevant features for predicting the solar depression angle at true dawn or dusk are:
+
+| Feature | Column | Notes |
+| --- | --- | --- |
+| Latitude | `lat` | Decimal degrees |
+| sin(day of year) | derived from `day_of_year` | Captures seasonality (365-day cycle) |
+| cos(day of year) | derived from `day_of_year` | Paired with sin for full cycle encoding |
+| Elevation | `elevation_m` | Metres above sea level |
+| abs(lat) | derived | Symmetry across equator |
+
+**Do not use longitude** as a feature. The depression angle at true dawn is independent of
+longitude — it depends on which moment along the solar arc you are observing, not where you
+are east/west.
+
+**Do not use the observed time** as a feature. The angle is the prediction target; the time
+is how you derived the angle. Using it as a feature would be data leakage.
+
+Encode day of year as a unit circle pair:
+
+```python
+import numpy as np
+df["doy_sin"] = np.sin(2 * np.pi * df["day_of_year"] / 365.25)
+df["doy_cos"] = np.cos(2 * np.pi * df["day_of_year"] / 365.25)
+```
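
To see why the sin/cos pair is preferred over the raw day number, the sketch below measures how far apart two days land in the encoded space. It uses only the standard library, and the helper names are hypothetical.

```python
import math


def doy_features(day_of_year, period=365.25):
    """Map a day of year onto the unit circle as (sin, cos)."""
    angle = 2 * math.pi * day_of_year / period
    return math.sin(angle), math.cos(angle)


def feature_distance(day_a, day_b):
    """Euclidean distance between two days in the encoded (sin, cos) space."""
    (s1, c1), (s2, c2) = doy_features(day_a), doy_features(day_b)
    return math.hypot(s1 - s2, c1 - c2)


print(feature_distance(365, 1))  # small: neighbouring days across the year boundary
print(feature_distance(1, 183))  # close to 2: opposite points of the annual cycle
```

With the raw `day_of_year` column, 31 December and 1 January would sit 364 units apart even though the sun's behaviour on those days is nearly identical; on the unit circle they are neighbours.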
+
+---
+
+## Step 3: Baseline Model
+
+Before training any ML model, establish a linear baseline:
+
+```python
+from sklearn.linear_model import LinearRegression
+from sklearn.model_selection import cross_val_score
+import numpy as np
+
+features = ["lat", "doy_sin", "doy_cos", "elevation_m"]
+X = df[features].values
+y = df["fajr_angle"].values
+
+lr = LinearRegression()
+scores = cross_val_score(lr, X, y, cv=5, scoring="r2")
+print(f"Linear baseline R²: {scores.mean():.3f} ± {scores.std():.3f}")
+```
+
+This gives the floor — any ML model should beat it. A linear model trained on the current
+data produces approximately R² = 0.38.
+
+---
+
+## Step 4: Gradient Boosting (recommended)
+
+Gradient boosting handles the non-linear lat × season interaction without explicit
+feature crosses. It is the recommended first ML model for this dataset.
+
+```python
+from sklearn.ensemble import GradientBoostingRegressor
+from sklearn.model_selection import cross_val_score, KFold
+from sklearn.metrics import mean_absolute_error
+import numpy as np
+
+features = ["lat", "doy_sin", "doy_cos", "elevation_m"]
+X = df[features].values
+y = df["fajr_angle"].values
+
+model = GradientBoostingRegressor(
+    n_estimators=300,
+    max_depth=4,
+    learning_rate=0.05,
+    subsample=0.8,
+    random_state=42,
+)
+
+kf = KFold(n_splits=5, shuffle=True, random_state=42)
+r2_scores = cross_val_score(model, X, y, cv=kf, scoring="r2")
+mae_scores = -cross_val_score(model, X, y, cv=kf, scoring="neg_mean_absolute_error")
+
+print(f"R²:  {r2_scores.mean():.3f} ± {r2_scores.std():.3f}")
+print(f"MAE: {mae_scores.mean():.3f}° ± {mae_scores.std():.3f}°")
+```
+
+Target performance with a well-populated dataset (10k+ records):
+- R² ≥ 0.55
+- MAE ≤ 0.9°
+
+---
+
+## Step 5: Evaluating the Model
+
+### Residual analysis
+
+```python
+from sklearn.model_selection import cross_val_predict
+import matplotlib.pyplot as plt
+
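+# cross_val_predict clones and refits the model per fold, so no prior fit is needed.
+# Reuse the shuffled KFold (`kf`) from Step 4 so the residuals match the reported scores.
+y_pred = cross_val_predict(model, X, y, cv=kf)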
+residuals = y - y_pred
+
+plt.figure(figsize=(10, 4))
+plt.subplot(1, 2, 1)
+plt.scatter(y_pred, residuals, alpha=0.3, s=10)
+plt.axhline(0, color="red")
+plt.xlabel("Predicted angle (°)")
+plt.ylabel("Residual (°)")
+plt.title("Residuals vs Predicted")
+
+plt.subplot(1, 2, 2)
+plt.scatter(df["lat"], residuals, alpha=0.3, s=10)
+plt.axhline(0, color="red")
+plt.xlabel("Latitude")
+plt.ylabel("Residual (°)")
+plt.title("Residuals vs Latitude")
+plt.tight_layout()
+plt.show()
+```
+
+Watch for:
+- Systematic residuals at high latitudes (55°N+) — the model may underfit
+- Residuals correlated with season at a single location — the model may underfit seasonality
+- Outliers > 3° from the line — these may be data entry errors or unusual atmospheric events
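+
+The first and third checks can be turned into numbers rather than read off the plots. A sketch,
+assuming latitude and residual arrays like those in the block above; the helper name
+`residual_report` and the 10° band width are illustrative choices, while the 3° outlier
+threshold comes from the list:
+
+```python
+import numpy as np
+import pandas as pd
+
+
+def residual_report(lat, residuals, band_width=10.0, outlier_deg=3.0):
+    """Summarise residuals per latitude band and list outlier row positions."""
+    lat = np.asarray(lat, dtype=float)
+    residuals = np.asarray(residuals, dtype=float)
+    # Band label is the lower edge of each band, e.g. 50.0 for 50° to 60°.
+    band = np.floor(lat / band_width) * band_width
+    summary = (
+        pd.DataFrame({"band": band, "residual": residuals})
+        .groupby("band")["residual"]
+        .agg(["count", "mean", "std"])
+    )
+    outliers = np.flatnonzero(np.abs(residuals) > outlier_deg)
+    return summary, outliers
+```
+
+A band whose mean residual sits well away from zero points to a systematic miss at that
+latitude; the outlier positions index back into `df` for manual review.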
+
+### Leave-location-out cross-validation
+
+Standard k-fold mixes records from the same location across train/test splits, making the
+model look better than it generalises to new locations. For this dataset, location-aware
+CV is more informative:
+
+```python
+from sklearn.metrics import r2_score
+from sklearn.model_selection import LeaveOneGroupOut, cross_val_predict
+
+# Group by location (round lat/lng to 1 decimal for grouping)
+groups = df["lat"].round(1).astype(str) + "," + df["lng"].round(1).astype(str)
+
+# Pool the held-out predictions before scoring: per-location R² is undefined
+# for single-record locations and unstable for small ones.
+logo = LeaveOneGroupOut()
+y_pred_logo = cross_val_predict(model, X, y, cv=logo, groups=groups)
+print(f"Leave-location-out R²: {r2_score(y, y_pred_logo):.3f}")
+```
+
+This tests whether the model generalises to locations it has never seen.
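+
+Leave-one-group-out fits one model per location, which gets slow as the number of locations
+grows. A sketch of a cheaper alternative using scikit-learn's `GroupKFold`, which still keeps
+every location entirely in either train or test. The helper name `location_group_cv` and the
+default of five folds are assumptions:
+
+```python
+import numpy as np
+from sklearn.metrics import mean_absolute_error, r2_score
+from sklearn.model_selection import GroupKFold, cross_val_predict
+
+
+def location_group_cv(model, X, y, groups, n_splits=5):
+    """Score pooled out-of-location predictions from a grouped K-fold split."""
+    n_groups = len(np.unique(groups))
+    # GroupKFold needs at least as many groups as folds.
+    cv = GroupKFold(n_splits=min(n_splits, n_groups))
+    y_pred = cross_val_predict(model, X, y, cv=cv, groups=groups)
+    return {"r2": r2_score(y, y_pred), "mae": mean_absolute_error(y, y_pred)}
+```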
+
+---
+
+## Step 6: Feature Importance
+
+```python
+model.fit(X, y)
+importances = model.feature_importances_
+
+for name, imp in zip(features, importances):
+    print(f"  {name}: {imp:.3f}")
+```
+
+Expected order: `doy_sin` or `doy_cos` highest, then `lat`, then `elevation_m` lowest.
+If `elevation_m` ranks above the season features, high-elevation records may be overrepresented in the dataset.
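+
+Because season is encoded as two columns, its importance is split between `doy_sin` and
+`doy_cos`. Summing the pair gives a fairer comparison against `lat` and `elevation_m`. A small
+sketch; the helper name `grouped_importances` and the `season` label are illustrative:
+
+```python
+def grouped_importances(features, importances):
+    """Sum the doy_sin and doy_cos importances into one 'season' entry."""
+    grouped = {}
+    for name, imp in zip(features, importances):
+        key = "season" if name in ("doy_sin", "doy_cos") else name
+        grouped[key] = grouped.get(key, 0.0) + float(imp)
+    # Sort from most to least important.
+    return dict(sorted(grouped.items(), key=lambda kv: kv[1], reverse=True))
+```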
+
+---
+
+## Step 7: Exporting the Model
+
+Once satisfied with validation performance:
+
+```python
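+import joblib
+import json
+from pathlib import Path
+
+model.fit(X, y)
+
+# Create the output directory if needed; joblib.dump does not create it.
+Path("models").mkdir(exist_ok=True)
+joblib.dump(model, "models/fajr_gbm.pkl")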
+
+# Export feature ranges for the pray-calc DPC algorithm
+meta = {
+    "features": features,
+    "lat_range": [float(df["lat"].min()), float(df["lat"].max())],
+    "elevation_range": [float(df["elevation_m"].min()), float(df["elevation_m"].max())],
+    "angle_mean": float(y.mean()),
+    "angle_std": float(y.std()),
+    "n_records": int(len(df)),
+    "r2_cv": float(r2_scores.mean()),
+    "mae_cv": float(mae_scores.mean()),
+}
+with open("models/fajr_gbm_meta.json", "w") as f:
+    json.dump(meta, f, indent=2)
+
+print(f"Saved fajr_gbm.pkl ({len(df)} training records, R²={r2_scores.mean():.3f})")
+```
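+
+A sketch of how a consumer might use the two exported files together, assuming the metadata
+keys written above. The function names `load_meta` and `predict_angle` are hypothetical, and
+the range warnings are one possible policy rather than something pray-calc is known to do:
+
+```python
+import json
+import math
+import warnings
+
+
+def load_meta(path="models/fajr_gbm_meta.json"):
+    """Read the metadata JSON written by the export step."""
+    with open(path) as f:
+        return json.load(f)
+
+
+def predict_angle(model, meta, lat, day_of_year, elevation_m):
+    """Predict one angle, warning when inputs fall outside the training ranges."""
+    lo, hi = meta["lat_range"]
+    if not lo <= lat <= hi:
+        warnings.warn(f"lat {lat} outside training range [{lo}, {hi}]")
+    lo, hi = meta["elevation_range"]
+    if not lo <= elevation_m <= hi:
+        warnings.warn(f"elevation {elevation_m} outside training range [{lo}, {hi}]")
+    phase = 2 * math.pi * day_of_year / 365.25
+    values = {
+        "lat": lat,
+        "doy_sin": math.sin(phase),
+        "doy_cos": math.cos(phase),
+        "elevation_m": elevation_m,
+    }
+    # Build the row in the exact feature order the model was trained with.
+    row = [[values[name] for name in meta["features"]]]
+    return float(model.predict(row)[0])
+```
+
+The model itself would be loaded with `joblib.load("models/fajr_gbm.pkl")`. Given the current
+dataset, most latitudes fall outside the training range, so the warning is the expected result
+for now.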
+
+---
+
+## Current Model Status
+
+The current dataset has:
+- Fajr: ~4,100 records, but 98% are from Birmingham, UK. The model heavily reflects one location.
+- Isha: ~43 records. Not enough to train a reliable ML model.
+
+**The priority is data collection before further ML work.** A model trained only on Birmingham
+Fajr data will predict Birmingham well and generalise poorly. The notebook's exploratory
+analysis and linear baseline are meaningful now, but gradient boosting should wait for
+broader geographic coverage.
+
+Target before training a production model:
+- Fajr: 10,000+ records from 100+ locations across all latitude bands
+- Isha: 500+ records from 30+ locations
+
+See [Data Collection](Data-Collection) for how to contribute new sighting records.
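+
+Progress toward these targets can be tracked with a short coverage report. A sketch, assuming
+the `lat` and `lng` columns used earlier on this page; the helper name `coverage_report` and
+the 10° band width are assumptions, since this page does not define the latitude bands:
+
+```python
+import numpy as np
+import pandas as pd
+
+
+def coverage_report(df: pd.DataFrame, band_width: float = 10.0) -> pd.DataFrame:
+    """Count records and distinct locations per latitude band."""
+    # Same location key as the leave-location-out grouping: lat/lng rounded to 0.1°.
+    location = df["lat"].round(1).astype(str) + "," + df["lng"].round(1).astype(str)
+    # Band label is the lower edge of each band, e.g. 50.0 for 50° to 60°.
+    band = np.floor(df["lat"] / band_width) * band_width
+    return (
+        pd.DataFrame({"band": band, "location": location})
+        .groupby("band")
+        .agg(n_records=("location", "size"), n_locations=("location", "nunique"))
+    )
+```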
+
+---
+
+## Connecting to pray-calc
+
+The output of the ML model feeds the DPC (Dynamic Prayer Calc) algorithm in
+[pray-calc](https://github.com/acamarata/pray-calc). The DPC algorithm takes:
+
+- Latitude
+- Day of year
+- Elevation
+
+It returns a recommended depression angle for that location and date.
+
+The current DPC implementation uses a simplified physics model. The ML model will replace
+or calibrate the seasonal and latitude correction factors once sufficient data is available.
+
+---
+
+*[← Data Collection](Data-Collection) · [Architecture →](Architecture)*
--- a/.wiki/Research-Notes.md
+++ b/.wiki/Research-Notes.md
@ -0,0 +1,221 @@
+# Research Notes
+
+Summaries of the academic papers and observation programs that contributed records to this dataset.
+
+For full citation details, see [Data Sources](Data-Sources).
+
+---
+
+## Key Findings
+
+The data consistently shows three main patterns:
+
+1. **Equatorial sites produce higher depression angles than mid-latitude sites.** Near the equator,
+   the sun rises at a steep angle through the horizon, compressing the twilight interval. At 3°-7°
+   latitude, mean Fajr angles are 16°-17°. At 52°N (Birmingham), the mean is ~13°.
+
+2. **Season matters outside the tropics.** Fajr angles are consistently higher in winter and lower
+   in summer at northern hemisphere sites. Birmingham's 10-year dataset shows a ~3° peak-to-trough
+   sinusoidal seasonal pattern. Kassim Bahali et al. 2018 found no correlation with season at
+   equatorial latitudes.
+
+3. **Elevation shifts the angle upward.** Sites near or above 500m (Kottamia 477m, Hail 1020m,
+   Tehran 1191m, Amman 1000m, Ankara 890m) consistently produce angles at the high end of their
+   latitude band. The effect is smaller than latitude or season but real.
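+
+The seasonal pattern in point 2 can be quantified with a single-harmonic least-squares fit. A
+sketch, assuming per-date arrays of day of year and Fajr angle for one site; the function name
+`seasonal_swing` is hypothetical and a single annual harmonic is a simplifying assumption:
+
+```python
+import numpy as np
+
+
+def seasonal_swing(day_of_year, angle):
+    """Fit angle = c + a*sin + b*cos over the year; return (mean, peak_to_trough)."""
+    doy = np.asarray(day_of_year, dtype=float)
+    y = np.asarray(angle, dtype=float)
+    phase = 2 * np.pi * doy / 365.25
+    A = np.column_stack([np.ones_like(phase), np.sin(phase), np.cos(phase)])
+    (c, a, b), *_ = np.linalg.lstsq(A, y, rcond=None)
+    amplitude = np.hypot(a, b)
+    # Peak-to-trough is twice the amplitude of the fitted sinusoid.
+    return float(c), float(2 * amplitude)
+```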
+
+---
+
+## Papers by Region
+
+### Egypt — NRIAG Series
+
+The National Research Institute of Astronomy and Geophysics (NRIAG) in Egypt has published the
+longest series of peer-reviewed Fajr and Isha observation studies.
+
+**Hassan et al. 2014** — *NRIAG Journal of Astronomy and Geophysics*, 3: 23-26.
+
+Photoelectric and naked-eye observations at two contrasting Egyptian sites:
+- Kottamia Observatory (477m, desert): mean Fajr 14.0°, Isha (Shafaq Abyad) 13.8°
+- Aswan (92m, very clear desert near the Tropic of Cancer): mean Fajr 13.2°
+
+The Kottamia results are the most reliable pre-SQM era Egyptian data. Photoelectric twilight
+sensors provide an objective measure of sky brightness at the onset of dawn.
+
+**Hassan et al. 2016** — *NRIAG Journal of Astronomy and Geophysics*, 5: 9-15.
+
+Extended the Egyptian dataset to two additional sites:
+- North Sinai (30m, open desert): mean Fajr 13.5° across four seasons
+- Assiut (55m, Nile valley): mean Fajr 13.2° (slightly lower, attributed to agricultural aerosols)
+
+The consistent result across Egyptian desert sites (13°-14.5°) is notable given that MUIS/ISNA
+and most calculators use 18° or 15°.
+
+**Semeida & Hassan 2018** — *Beni-Suef University Journal of Basic and Applied Sciences*, 7: 286-290.
+
+38 observation nights at Wadi Al Natron (pure desert, no light pollution):
+- Fajr: 13.5°-14.8° across seasons
+- Isha (Shafaq Abyad): 13.0°-15.2° across seasons
+
+This paper provides the most complete Egyptian Isha dataset.
+
+**Rashed et al. 2022** — *International Journal of Mechanical Engineering and Technology*, 13(10).
+
+SQM + naked eye at Fayum (29.28°N, near the Fayum depression):
+- Seasonal means: winter 14.5°, summer 13.1°
+
+**Rashed et al. 2025** — *NRIAG Journal of Astronomy and Geophysics*.
+
+Most recent paper. Alexandria (Mediterranean coast, 31.2°N):
+- Three seasons: winter 14.1°, summer 12.9°, autumn 13.8°
+
+---
+
+### Saudi Arabia — Khalifa 2018
+
+**Khalifa 2018** — *NRIAG Journal of Astronomy and Geophysics*, 7: 22-28.
+
+80 observation nights at Hail (27.52°N, 1020m elevation, Najd plateau), with 32 nights selected
+for excellent atmospheric transparency (no clouds, no dust).
+
+Results:
+- Mean Fajr: 14.4° (range 12.8°-16.1°)
+- Mean Isha (Shafaq Abyad): 14.8° (range 13.2°-16.4°)
+- Higher in winter, lower in summer
+
+At 1020m, Hail shows a clearly elevated angle vs sea-level desert sites in Egypt. This is
+the primary evidence for the elevation effect.
+
+---
+
+### Malaysia and Indonesia — Equatorial Studies
+
+**Kassim Bahali et al. 2018** — *Sains Malaysiana*, 47(11): 2797-2805.
+
+The strongest low-latitude Fajr study. 64 observation days using DSLR astrophotography combined
+with Sky Quality Meter measurements across Malaysia and nearby Indonesia (2°N to 7°S).
+
+Key results:
+- Mean Fajr depression: **16.67°** (range 13.9°-19.8°)
+- Standard deviation: 1.32°
+- No correlation with season at these low latitudes
+
+The DSLR + SQM combination is methodologically more rigorous than naked eye alone. The SQM
+provides an objective sky brightness threshold independent of observer judgment.
+
+**Saksono 2020** — *NRIAG Journal of Astronomy and Geophysics*, 9(1): 238-244.
+
+SQM-only study at Depok, West Java (6.4°S, 65m), 26 nights in June-July 2015:
+- Mean Fajr depression: ~16°
+- High consistency with Kassim Bahali despite different instruments
+
+**Hamidi 2008** — Academia.edu working paper.
+
+Shafaq al-Abyad (Isha) observations at two Malaysian sites:
+- Kuala Lipis (4.183°N): ~17° across seasons
+- Port Klang (3.004°N): ~16°-17° across seasons
+
+The ~17° Isha result at low latitudes mirrors the ~17° Fajr result — both twilight phenomena
+are compressed by the steep solar arc at equatorial sites.
+
+**OIF UMSU 2017-2020** — University of Muhammadiyah North Sumatra.
+
+Hundreds of SQM observation nights at Medan (3.595°N):
+- Proposed national Indonesian standard: 16.48° for Fajr
+- Isha: consistent with ~17°
+
+---
+
+### United Kingdom
+
+**Hizbul Ulama UK 1987-1989**
+
+21 successful Fajr observations over three years from a rural Lancashire site (53.748°N, 120m).
+One of the earliest systematic UK observation programs. Per-season results are published
+at http://www.hizbululama.org.uk/files/salat_timing.html.
+
+Fajr results: consistent 12°-14° range across seasons. Isha observations also recorded.
+
+**Asim Yusuf 2017** — *Shedding Light on the Dawn*, ISBN 978-0-9934979-1-9.
+
+The highest-quality UK observation study. Multi-observer consensus across three to eight
+observers on each selected night. Site: Exmoor National Park (51.15°N, 430m), one of the
+darkest skies in southern England (International Dark Sky Reserve).
+
+Per-season results from 2013-2016:
+- Winter: Fajr ~13.8°, Isha (Shafaq Abyad) ~14.2°
+- Summer: Fajr ~12.1°, Isha ~12.8°
+
+The multi-observer consensus methodology makes these the most reliable UK data points.
+
+---
+
+### Moonsighting.com / Khalid Shaukat
+
+A multi-decade global observation network. Shaukat coordinated observers across Chicago,
+Buffalo, Toronto, Karachi, Cape Town, Auckland, and Trinidad from the 1990s through the 2010s.
+
+Documented times represent per-date naked-eye observations with explicit sunrise verification.
+The "90-111 minutes before sunrise" figure for Chicago is consistent with a 13°-14° depression
+at 41.9°N across seasons.
+
+---
+
+## Latitude-Angle Summary Table
+
+This table synthesises mean Fajr angles from peer-reviewed studies and observation programs
+across the latitude range.
+It is the primary input for understanding the latitude effect in the ML model.
+
+| Latitude | Site | Elev | Mean Fajr (°) | N | Method |
+| --- | --- | --- | --- | --- | --- |
+| 52.5°N | Birmingham, UK | 141m | ~13.0° | 4,018 | Community astrophotography |
+| 43.7°N | Toronto, Canada | 76m | ~13.2° | 4 | Naked eye |
+| 41.9°N | Chicago, USA | 182m | ~13.1° | 8 | Naked eye |
+| 39.9°N | Ankara, Turkey | 890m | ~14.8° | 4 | Naked eye (high elev) |
+| 37.8°S | Melbourne, AU | 31m | ~14.5° | 3 | Naked eye |
+| 36.9°S | Auckland, NZ | 20m | ~14.8° | 2 | Naked eye |
+| 35.7°N | Tehran, Iran | 1191m | ~15.1° | 3 | Naked eye (very high elev) |
+| 34.0°N | Fez, Morocco | 408m | ~14.2° | 4 | Naked eye |
+| 33.9°S | Cape Town, SA | 10m | ~15.2° | 4 | Naked eye |
+| 31.9°N | Amman, Jordan | 1000m | ~14.9° | 3 | Naked eye (high elev) |
+| 31.2°N | Alexandria, Egypt | 32m | ~13.6° | 3 | SQM |
+| 30.5°N | Wadi Al Natron | 23m | ~14.0° | 7 | Naked eye (desert) |
+| 30.0°N | Kottamia, Egypt | 477m | ~14.0° | 6 | Photoelectric (high elev) |
+| 27.5°N | Hail, Saudi Arabia | 1020m | ~14.4° | 8 | Naked eye (high elev) |
+| 24.9°N | Karachi, Pakistan | 8m | ~14.8° | 4 | Naked eye |
+| 14.7°N | Dakar, Senegal | 24m | ~15.3° | 2 | Naked eye |
+| 12.0°N | Kano, Nigeria | 476m | ~15.1° | 2 | Naked eye |
+| 10.7°N | Trinidad | 12m | ~15.8° | 2 | Naked eye |
+| 6.4°S | Depok, Indonesia | 65m | ~16.0° | 3 | SQM |
+| 4.1°S | Mombasa, Kenya | 50m | ~16.2° | 2 | Naked eye |
+| 3.6°N | Medan, Indonesia | 22m | ~16.5° | 8 | SQM |
+| 3.1°N | KL, Malaysia | 40m | ~16.7° | 4 | DSLR + SQM |
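+
+A quick way to quantify the trend in this table is a straight-line fit of mean angle against
+absolute latitude. The sketch below is unweighted, so Birmingham's 4,018 records count the same
+as a site with two; that is a deliberate simplification, and a weighted or per-record fit may
+give a different slope. Alexandria uses 31.2°N as given in the Egypt section.
+
+```python
+import numpy as np
+
+# (latitude in degrees, negative = south; mean Fajr angle in degrees) per table row.
+TABLE = [
+    (52.5, 13.0), (43.7, 13.2), (41.9, 13.1), (39.9, 14.8), (-36.9, 14.8),
+    (-37.8, 14.5), (35.7, 15.1), (34.0, 14.2), (-33.9, 15.2), (31.9, 14.9),
+    (31.2, 13.6), (30.5, 14.0), (30.0, 14.0), (27.5, 14.4), (24.9, 14.8),
+    (14.7, 15.3), (12.0, 15.1), (10.7, 15.8), (-6.4, 16.0), (3.6, 16.5),
+    (3.1, 16.7), (-4.1, 16.2),
+]
+
+abs_lat = np.abs([lat for lat, _ in TABLE])
+angle = np.array([a for _, a in TABLE])
+
+# Unweighted straight-line fit: angle ≈ intercept + slope * |lat|.
+slope, intercept = np.polyfit(abs_lat, angle, 1)
+print(f"slope: {slope:.3f}° per degree of |latitude|, intercept: {intercept:.2f}°")
+```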
+
+The counter-intuitive result — equatorial sites have *higher* angles than mid-latitude sites —
+is a consequence of the Sun's steep rise angle at low latitudes. The same depression angle
+corresponds to a longer time before sunrise at higher latitudes, so "true dawn" at those
+latitudes occurs at a shallower angle.
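+
+The statement that the same depression angle corresponds to a longer time before sunrise at
+higher latitudes follows from spherical geometry. A sketch using the standard sunrise
+hour-angle equation; the function name `minutes_before_sunrise` is illustrative, the default
+declination of 0° corresponds to an equinox, and sunrise is taken at an altitude of -0.833°:
+
+```python
+import math
+
+
+def minutes_before_sunrise(lat_deg, depression_deg, declination_deg=0.0):
+    """Minutes between the Sun at a given depression angle and sunrise.
+
+    Uses cos(H) = (sin(h) - sin(lat) * sin(dec)) / (cos(lat) * cos(dec)),
+    where h is the solar altitude and H the hour angle.
+    """
+    lat = math.radians(lat_deg)
+    dec = math.radians(declination_deg)
+
+    def hour_angle(altitude_deg):
+        cos_h = (math.sin(math.radians(altitude_deg)) - math.sin(lat) * math.sin(dec)) / (
+            math.cos(lat) * math.cos(dec)
+        )
+        if not -1.0 <= cos_h <= 1.0:
+            raise ValueError("Sun never reaches this altitude at this latitude and date")
+        return math.degrees(math.acos(cos_h))
+
+    # The Earth turns 15° per hour, so each degree of hour angle is 4 minutes.
+    return 4.0 * (hour_angle(-depression_deg) - hour_angle(-0.833))
+
+
+for lat in (3.1, 30.0, 52.5):
+    print(f"{lat:5.1f}°: {minutes_before_sunrise(lat, 15.0):.0f} min")
+```
+
+Passing a solstice declination (about ±23.44°) shows how the same interval stretches or
+shrinks with season at a fixed latitude.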
+
+---
+
+## Open Questions
+
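+1. **Why do southern hemisphere sites at 33°-38°S (Cape Town, Auckland, Melbourne) show higher
+   angles (~15°) than northern hemisphere sites?** The closest northern match in the table is
+   Fez (34.0°N, ~14.2°); the UK sites (~13°) sit at 51°-53°N, so that gap also reflects latitude.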
+   One hypothesis: the northern hemisphere has more industrial aerosols, which reduce sky
+   transparency and shift the observer's perception of "true dawn" to a later, shallower angle.
+   This would bias northern hemisphere data toward lower angles. The effect needs more data to confirm.
+
+2. **Is the elevation effect physically explained or confounded?**
+   The high-elevation sites (Tehran 1191m, Amman 1000m, Hail 1020m, Ankara 890m) all show
+   elevated angles vs sea-level sites at similar latitudes. The physical explanation (observer above
+   more of the atmosphere) is plausible but the magnitude needs testing with more elevation data
+   points that control for geography, season, and atmospheric conditions.
+
+3. **Why does Isha (Shafaq Abyad) at ~15° match Fajr at ~13°-16° for most sites?**
+   The Shafaq al-Abyad criterion requires the white twilight to disappear, which is a different
+   type of observation from true dawn (the first appearance of light). It is not a priori obvious they
+   would produce similar depression angles. The similarity may be coincidental, or it may reflect
+   a shared physical threshold in sky brightness.
+
+---
+
+*[← Data Sources](Data-Sources) · [Home →](Home)*