newrelic/wiki-sync-action@v1.0.1 is archived and unmaintained.
Switch to Andrew-Chen-Wang/github-wiki-action@v4, which is actively
maintained and uses the same token-based approach.
Tracked: BSRN/SURFRAD processors (reference, excluded from pipeline),
GaN-MN downloader, academic paper fetcher, Madrid SQM processor,
ML analysis scripts (src/analyze/), umsu_medan_2024 raw sightings.
Gitignored: global_extrapolator, instant_1m_injector/vectorized,
massive_harvest_engine, massive_sqm_downloader, global_sqm_harvester,
run_infinite_pipeline.sh, run_massive_collection.sh, search_papers.py
(agent-generated experimental scripts, not part of core pipeline).
Classified 83k records into 4 tiers by methodology alignment with
Islamic astronomical definitions (Fajr Sadiq, Shafaq al-Abyad).
Tier 1+2 (4,912 records): CCD horizon, naked eye, dark-site SQM, DSLR
-> Fajr baseline: 16.98 deg [95% CI: 16.73-17.22]
-> Weighted R²=0.43, MAE=1.14 deg
-> Well-calibrated at 45-55N (R²=0.74, MAE=0.25 deg via OpenFajr)
-> Extrapolated at equator (no Tier 1 tropical data)
Isha: only 45 Tier 1+2 records. Not enough for a formula.
Key finding: equatorial Fajr angle is the most important missing
measurement. Whether it is 15-16 or 17-18 deg determines how the
baseline extrapolates to the tropics.
Comprehensive analysis of 83k records across 11 model types.
Key finding: physical variables (lat, day_of_year) capture the expected
seasonal-latitude interaction pattern, but the heterogeneous dataset has
a 3.3 degree noise floor caused by incompatible measurement methodologies.
Recommended DPC formula (Medium-9):
f(abs_lat, day_of_year) -> angle
9 terms: intercept + abs_lat + 4 seasonal harmonics + 4 lat*season interactions
Mandatory args: abs(latitude), day_of_year
Optional: elevation_m (+0.0007 deg/m)
Not needed: longitude (data artifact, not physics)
See analysis_results.md for full formulas with coefficients.
Issues fixed:
- Add upper bound angle filter: fajr/isha capped at 22 deg (was unbounded,
max was 49 deg from light pollution artifacts)
- Remove washetdonker from approved list: threshold method produces 7 deg
angles (civil twilight), not Fajr at 12-18 deg
- Remove openfajr_94992898.csv: duplicated the iCal feed data with slightly
different coordinates, bypassing dedup (4,007 duplicate records)
- Filter out future dates: OpenFajr publishes predictions for the full year,
so future-dated rows are not observations
- Filter out polar stations (|lat| > 70): no meaningful Fajr/Isha
- Filter out Null Island (lat=0, lng=0): GPS default / missing coordinates
- Move precomputed angles merge before dedup: the merge previously ran after
dedup, so merged rows bypassed it entirely
- Make BAD_NOTE_MARKERS case-insensitive: catches mixed-case variants
- Add missing tess_jun2017.csv to approved list
- Clean up duplicate comment blocks in ingest.py
Dataset after fixes: 48,668 Fajr + 34,529 Isha = 83,197 total
Angle range now: 7.0-22.0 deg (Fajr), 10.0-22.0 deg (Isha)
Latitude range now: -62.6 to 69.7 (was -90 to 90)
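The record-level filters from this commit can be sketched as one predicate. This is a rough illustration rather than the code in ingest.py: the record field names (angle, date, lat, lng, notes) and the two marker strings are assumptions, since the real BAD_NOTE_MARKERS contents are not listed here.

```python
from datetime import date

MAX_ANGLE_DEG = 22.0    # upper bound on Fajr/Isha angles
MAX_ABS_LAT_DEG = 70.0  # polar cutoff
# Hypothetical markers, stored lower-case so the match is case-insensitive.
BAD_NOTE_MARKERS = ("predicted", "estimated")


def keep_record(rec: dict, today: date) -> bool:
    """Return True if a record passes the filters described in this commit."""
    if rec["angle"] > MAX_ANGLE_DEG:
        return False  # light-pollution artifacts reached 49 deg
    if rec["date"] > today:
        return False  # future-dated rows are predictions
    if abs(rec["lat"]) > MAX_ABS_LAT_DEG:
        return False  # no meaningful Fajr/Isha at polar stations
    if rec["lat"] == 0 and rec["lng"] == 0:
        return False  # Null Island: GPS default or missing coordinates
    note = rec.get("notes", "").lower()
    return not any(marker in note for marker in BAD_NOTE_MARKERS)
```

Passing today in explicitly keeps the future-date check reproducible when the pipeline is re-run on a later day.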
Add 6 new data collection pipelines and their processed outputs.
Sources added:
- TESS/Stars4All photometer network: 37 months (Jun 2017-Aug 2020),
~40k raw events from 100+ European stations via Zenodo archives
- Globe at Night citizen science: 26k twilight observations (2006-2024),
filtered from 308k total observations for solar depression 6-22 deg
- GaN-MN continuous monitoring: 45 months (Jan 2022-Sep 2025),
~12.5k twilight events from 88 stations across 20+ countries
- Galicia SQM network: 14 stations, 1-min resolution, 7.5k events
- Madrid/Majadahonda SQM: multi-year continuous monitoring, 3.1k events
- washetdonker.nl Netherlands: 7 stations, 3.3k morning events
- Academic papers: Jordan (Abed 2015), Fayum Egypt, India photometer
Pipeline changes:
- ingest.py: add all new files to APPROVED_RAW_CSVS allowlist,
fix filter to use allowlist instead of hardcoded exclusions
- .gitignore: exclude bulk raw data directories (BSRN, TESS, GaN-MN,
washetdonker, Globe at Night downloads)
Final dataset: 56,668 Fajr + 34,763 Isha = 91,431 total records
Previous: 5,871 Fajr + 46 Isha = 5,917 total records
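The switch from hardcoded exclusions to an allowlist can be sketched as follows. The set contents are illustrative only (the real APPROVED_RAW_CSVS lives in ingest.py), and select_approved is a hypothetical helper name.

```python
from pathlib import Path

# Illustrative entries only; not the actual allowlist.
APPROVED_RAW_CSVS = {"tess_jun2017.csv", "timau_sqm_fajr.csv"}


def select_approved(paths: list[Path]) -> list[Path]:
    """Keep only files whose name is explicitly allowlisted."""
    return [p for p in paths if p.name in APPROVED_RAW_CSVS]
```

With an allowlist, a newly downloaded file is ignored until it is reviewed and added, whereas an exclusion list ingests anything nobody remembered to exclude.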
- Regenerate fajr_angles.csv with current collection state
- Update wiki docs to reflect current dataset stats
- Add missing requirements and minor pipeline fixes
- Migrate .wiki/ to .github/wiki/ (GCI standard for public repos)
- Add _Sidebar.md for GitHub Wiki navigation
- Update wiki-sync.yml to reference .github/wiki/ path
- Remove .markdownlintignore (covered by .vscode/settings.json)
- Migrate .allow-ai-terms to ALLOW_AI_TERMS_REPOS in pre-commit hook
- Expand .gitignore with full IDE and AI agent directory list
- Update README project structure reference
Identified three sources of cross-source duplication and fixed each:
1. Kassim Bahali 2018 Pekan Pahang (9 records)
Same 9 June-July 2017 DSLR observations existed in both
verified_sightings.py (Table 2 entries) and the raw CSV
kassim_bahali_2017_malaysia.csv. Removed from verified_sightings;
raw CSV is the canonical source with richer cloud/conditions notes.
2. BRIN Mount Timau SQM dataset (22 records)
timau_sqm_fajr.csv contained two SQM threshold readings per night:
target=18.0° (75 records, primary) and target=16.51° (22 records,
derived from the 75-night mean). Removed target=16.51 rows.
Each night now has exactly one Fajr time.
3. Khalifa 2018 Hail Fajr (4 records)
Original batch had times producing implausible angles: 2015-01-15
gave 12.6° and 2015-06-21 gave 19.3° (paper reports 14.014°±0.317°).
Removed the four bad-time records. Batch 16a replacements (computed
from the paper mean D0) remain and give consistent 13.9-14.1° angles.
Pipeline: add automatic deduplication guard. After combining all sources,
any (prayer, date, lat rounded to 3dp, lng rounded to 3dp) duplicate is
logged and dropped (keep first). This prevents future cross-source overlaps
from silently inflating the dataset or training on the same observation twice.
Dataset: fajr_angles.csv 4,535 records, isha_angles.csv 120 records
Zero duplicates confirmed.
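A minimal sketch of the deduplication guard, assuming records are plain dicts with prayer, date, lat, and lng fields. The function and logger names are hypothetical, and the pipeline itself may implement this differently (for example on a DataFrame).

```python
import logging

logger = logging.getLogger("dedup")


def dedup_key(rec: dict) -> tuple:
    """Key from the commit text: prayer, date, lat and lng rounded to 3 dp."""
    return (rec["prayer"], rec["date"], round(rec["lat"], 3), round(rec["lng"], 3))


def drop_duplicates(records: list[dict]) -> list[dict]:
    """Keep the first record per key; log and drop later ones."""
    seen = set()
    kept = []
    for rec in records:
        key = dedup_key(rec)
        if key in seen:
            logger.warning("duplicate dropped: %s", key)
            continue
        seen.add(key)
        kept.append(rec)
    return kept
```

Because the first occurrence wins, the order in which sources are combined decides which copy survives, so the canonical source should be loaded first.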
Herdiwijaya 2016 + 2020 (J.Phys.Conf.Ser.) — Amfoang/Kupang, East Nusa Tenggara
New site: 9.667°S, 124.0°E, 1300m high-elevation dark site.
D0=18.0° (pristine sky; study of 83 moonless nights, 2011-2018).
4,532 Fajr / 82 Isha records across 113 locations.
Added 38 per-date individual DSLR observations from Kassim Bahali et al. (2019)
JATMA 7(2):37-48 across 10 new sites:
- Sabang, Aceh, Indonesia (5.876°N, 95.340°E) — 11 nights Dec 2017
- Yaring, Pattani, Thailand (6.934°N, 101.319°E) — 2 nights Jan 2018
- Surabaya, East Java, Indonesia — 3 nights Feb 2018
- Sumenep, Madura, Indonesia — 3 nights Feb 2018
- Ternate, North Maluku, Indonesia — 3 nights Mar 2018
- South Sulawesi (Gowa area), Indonesia — 6 nights Mar 2018
- Mersing, Johor, Malaysia — 3 nights Jun 2018
- Kuala Rompin, Pahang, Malaysia — 2 nights Jul 2018
- Nenasi Pekan, Pahang, Malaysia — 2 nights Aug 2018
- Kota Tinggi, Johor, Malaysia — 3 nights Sep 2018
Depression angles computed via PyEphem from actual dawn times + coordinates.
Mean D0 range: 17.07° (Pattani) to 19.61° (Kota Tinggi dry season).
Includes first Thailand data point (Pattani) and expanded Indonesian coverage.
+10 new unique locations (88 → 98)
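The back-calculation step (dawn time + coordinates -> depression angle) can be illustrated without PyEphem. The sketch below is a swapped-in technique, not the pipeline's code: it uses the standard low-precision solar position formulas, ignores atmospheric refraction and elevation, and the function name is hypothetical. It is meant only to show the geometry, not to replace the PyEphem computation.

```python
import math
from datetime import datetime, timezone


def solar_depression_deg(when_utc: datetime, lat_deg: float, lon_deg: float) -> float:
    """Geometric solar depression in degrees below the horizon (no refraction).

    when_utc must be timezone-aware; lon_deg is positive east.
    """
    # Days since J2000.0, derived from the Unix timestamp.
    d = when_utc.timestamp() / 86400.0 + 2440587.5 - 2451545.0
    mean_lon = math.radians((280.460 + 0.9856474 * d) % 360.0)
    mean_anom = math.radians((357.528 + 0.9856003 * d) % 360.0)
    ecl_lon = mean_lon + math.radians(
        1.915 * math.sin(mean_anom) + 0.020 * math.sin(2.0 * mean_anom))
    obliquity = math.radians(23.439 - 0.0000004 * d)
    # Ecliptic to equatorial coordinates.
    ra = math.atan2(math.cos(obliquity) * math.sin(ecl_lon), math.cos(ecl_lon))
    dec = math.asin(math.sin(obliquity) * math.sin(ecl_lon))
    # Greenwich mean sidereal time in hours, then local hour angle.
    gmst_h = (18.697374558 + 24.06570982441908 * d) % 24.0
    hour_angle = math.radians(gmst_h * 15.0 + lon_deg) - ra
    lat = math.radians(lat_deg)
    sin_alt = (math.sin(lat) * math.sin(dec)
               + math.cos(lat) * math.cos(dec) * math.cos(hour_angle))
    sin_alt = max(-1.0, min(1.0, sin_alt))  # guard against rounding
    return -math.degrees(math.asin(sin_alt))
```

A positive return value means the Sun is below the horizon, which matches the sign convention used for D0 in these notes.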
Added 32 new Fajr records from 8 new sites:
- 6 Indonesian cities from Saksono ISRN/UHAMKA 'Premature Dawn' series
(Padang, Batusangkar, Cirebon, Balikpapan, Bitung, Manokwari): urban light
pollution, D0=13.4°, 4 seasonal records each
- Tayu Beach, Pati, Central Java: Noor & Hamdani 2018 QIJIS, photoelectric+SQM,
D0=17.0°, 4 individual nights Aug-Sep 2016
- Cimahi, West Java: Herdiwijaya 2020, SQM, D0=18.5°, 4 seasonal records
+8 new unique locations (80 → 88)
New records from research expansion:
- Tanjung Aru, Sabah Malaysia (Niri & Zainuddin): 4 Isha Shafaq Abyad records
- Teluk Kemang, Malaysia (Abdel-Hadi & Hassan 2022): 4 Fajr + 4 Isha SQM records
- Bosscha Observatory, Java 1310m (Herdiwijaya 2020): 4 Fajr records
- Yogyakarta, Java (Herdiwijaya 2014-2016, 136 nights): 4 Fajr records
- Kupang, NTT 10°S (Herdiwijaya 2020): 4 Fajr + 4 Isha records
- Matrouh, Egypt (Hassan et al.): 4 Fajr + 3 Isha records (1 filtered)
- Kharga Oasis, Egypt (Hassan et al. 2020): 4 Fajr records
- Hurghada, Egypt (Hassan et al. 2020): 4 Fajr records
- Marsa-Alam, Egypt (Hassan et al. 2020): 4 Fajr records
- 15th of May City, Egypt (Taha et al. 2025): 4 Fajr records
- Riyadh, Saudi Arabia (Taha et al. 2025): 4 Fajr records
- Mauritania 18°N (Taha et al. 2025): 4 Fajr records — first West Africa data
New modules:
- src/geocode.py: Nominatim geocoding with disk cache
- src/ingest.py: CSV ingestion and data standardization pipeline
- src/pipeline.py: integrated raw CSV loading via ingest module
Five wiki pages covering Data Collection, ML Crunching, Architecture, Data
Sources, and Research Notes. GitHub Actions workflow syncs .wiki/ to the
GitHub Wiki on push to main. Adds .markdownlintignore and VS Code settings
to exclude .claude/ from lint checks. Adds .allow-ai-terms to allow the
.claude/ directory path reference in lint ignore files.
Replaces the original JS calibration library with a pure Python pipeline
for collecting and back-calculating solar depression angles from human-verified
Fajr and Isha prayer sightings.
What this does:
- src/pipeline.py: master pipeline; fetches iCal + manual records, back-calculates
angles via PyEphem, applies quality filters, exports two clean CSVs
- src/collect/openfajr.py: parses the OpenFajr Birmingham iCal feed (~4,018 records)
- src/collect/verified_sightings.py: manually compiled records from peer-reviewed
studies (Egypt, Saudi Arabia, Malaysia, Indonesia, UK, USA, Canada, and more)
- src/angle_calc.py: PyEphem back-calculation with atmospheric refraction
- src/elevation.py: Open-Elevation API batch lookup
Datasets generated:
- data/processed/fajr_angles.csv: 4,105 confirmed Fajr records, 35 locations,
latitude range -37.8 to 53.7 degrees, date range 1985-2026
- data/processed/isha_angles.csv: 43 confirmed Isha records, 20+ locations
Also includes:
- notebooks/01_exploratory_analysis.ipynb: latitude, time-of-year, and elevation pattern analysis
- research/: academic paper summaries (not training data)
- data/raw/sources.md: full citation table for all data sources
Weighted least-squares calibration of Islamic prayer time depression
angles from observed mosque announcement data. Uses golden-section
search to minimize the sum of squared residuals independently for
Fajr and Isha. Internal Jean Meeus solar ephemeris — zero runtime
dependencies.
API: calibrateAngles, scoreAngles, predictFajr, predictIsha.
Full TypeScript, dual CJS/ESM via tsup.
32 ESM tests, 6 CJS tests, all passing on Node 20/22/24.
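Golden-section search, as used by the calibration above, can be sketched in a few lines. The sketch is in Python rather than the library's TypeScript, and the objective is a toy stand-in (weighted squared residuals against a single candidate angle), not the library's prayer-time residual. All names and the search bounds are assumptions.

```python
import math
from typing import Callable

INV_PHI = (math.sqrt(5.0) - 1.0) / 2.0  # 1/phi, about 0.618


def golden_section_min(f: Callable[[float], float], lo: float, hi: float,
                       tol: float = 1e-6) -> float:
    """Minimize a unimodal function on [lo, hi] by golden-section search."""
    c = hi - INV_PHI * (hi - lo)
    d = lo + INV_PHI * (hi - lo)
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc < fd:
            # Minimum lies in [lo, d]; the old c becomes the new upper probe.
            hi, d, fd = d, c, fc
            c = hi - INV_PHI * (hi - lo)
            fc = f(c)
        else:
            # Minimum lies in [c, hi]; the old d becomes the new lower probe.
            lo, c, fc = c, d, fd
            d = lo + INV_PHI * (hi - lo)
            fd = f(d)
    return (lo + hi) / 2.0


def weighted_sse(angle: float, observed: list[float], weights: list[float]) -> float:
    """Toy objective: weighted squared residuals against one candidate angle."""
    return sum(w * (o - angle) ** 2 for o, w in zip(observed, weights))
```

Each iteration reuses one of the two previous function evaluations, so only one new evaluation of the objective is needed per step. Running the search once per prayer, each with its own objective, mirrors the independent Fajr and Isha fits described above.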