29 commits

Author SHA1 Message Date
Aric Camarata
8fc9c8ded3 refactor: split verified_sightings into data/models/analysis modules + tests (E8); apply standardization (E2/E11) 2026-05-30 18:39:49 -04:00
Aric Camarata
31cbadc2c4 docs: trigger wiki sync to verify ci 2026-05-29 15:52:58 -04:00
Aric Camarata
b59232a57c ci: replace deprecated wiki sync action
newrelic/wiki-sync-action@v1.0.1 is archived and unmaintained.
Switch to Andrew-Chen-Wang/github-wiki-action@v4, which is actively
maintained and uses the same token-based approach.
2026-05-29 15:52:36 -04:00
Aric Camarata
1011c8445c chore: untrack AGENTS.md (AI working memory, not source code) 2026-05-29 06:36:40 -04:00
Aric Camarata
7540ba0c2c chore(claude): add AGENTS.md symlink for OC parity 2026-05-25 15:47:24 -04:00
Aric Camarata
7c243547e7 chore: align repository structure with portfolio documentation standards 2026-05-15 15:27:28 -04:00
Aric Camarata
ee5d356f71 Add GitHub Sponsors funding config 2026-03-28 18:18:57 -04:00
Aric Camarata
3b8c665aca chore: add remaining processors and analysis scripts, gitignore experimental
Tracked: BSRN/SURFRAD processors (reference, excluded from pipeline),
GaN-MN downloader, academic paper fetcher, Madrid SQM processor,
ML analysis scripts (src/analyze/), umsu_medan_2024 raw sightings.

Gitignored: global_extrapolator, instant_1m_injector/vectorized,
massive_harvest_engine, massive_sqm_downloader, global_sqm_harvester,
run_infinite_pipeline.sh, run_massive_collection.sh, search_papers.py
(agent-generated experimental scripts, not part of core pipeline).
2026-03-23 06:44:01 -04:00
Aric Camarata
ad8f1c4bc7 analysis: academic best-fit formula from tiered source filtering
Classified 83k records into 4 tiers by methodology alignment with
Islamic astronomical definitions (Fajr Sadiq, Shafaq al-Abyad).

Tier 1+2 (4,912 records): CCD horizon, naked eye, dark-site SQM, DSLR
  -> Fajr baseline: 16.98 deg [95% CI: 16.73-17.22]
  -> Weighted R²=0.43, MAE=1.14 deg
  -> Well-calibrated at 45-55N (R²=0.74, MAE=0.25 deg via OpenFajr)
  -> Extrapolated at equator (no Tier 1 tropical data)
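
For reference, the two fit metrics quoted above can be computed as in the sketch below. It assumes the standard definition of weighted R² (weighted residual sum of squares over weighted total sum of squares around the weighted mean) and an unweighted MAE. The commit does not say which weights were used or whether the MAE was weighted, so the function names and the weighting scheme here are illustrative assumptions.

```python
def weighted_r2(observed, predicted, weights):
    """Weighted coefficient of determination (assumed standard definition)."""
    w_sum = sum(weights)
    # Weighted mean of the observed angles.
    mean_obs = sum(w * y for w, y in zip(weights, observed)) / w_sum
    ss_res = sum(w * (y - f) ** 2 for w, y, f in zip(weights, observed, predicted))
    ss_tot = sum(w * (y - mean_obs) ** 2 for w, y in zip(weights, observed))
    return 1.0 - ss_res / ss_tot


def mean_absolute_error(observed, predicted):
    """Unweighted mean absolute error, in the same unit as the inputs (degrees)."""
    return sum(abs(y - f) for y, f in zip(observed, predicted)) / len(observed)
```

A model that always predicts the weighted mean scores R² = 0 under this definition, which is a useful baseline when reading the 0.43 figure.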

Isha: only 45 Tier 1+2 records. Not enough for a formula.

Key finding: equatorial Fajr angle is the most important missing
measurement. Whether it's 15-16 or 17-18 deg changes everything.
2026-03-23 05:18:40 -04:00
Aric Camarata
9e537d0df9 analysis: ML formula discovery for Fajr/Isha depression angles
Comprehensive analysis of 83k records across 11 model types.

Key finding: physical variables (lat, day_of_year) explain the correct
seasonal-latitude interaction pattern, but the heterogeneous dataset has
a 3.3 degree noise floor from incompatible measurement methodologies.

Recommended DPC formula (Medium-9):
  f(abs_lat, day_of_year) -> angle
  9 terms: intercept + abs_lat + 4 seasonal harmonics + 4 lat*season interactions

Mandatory args: abs(latitude), day_of_year
Optional: elevation_m (+0.0007 deg/m)
Not needed: longitude (data artifact, not physics)

See analysis_results.md for full formulas with coefficients.
2026-03-23 05:00:47 -04:00
Aric Camarata
180e1bf38b chore: gitignore excluded/duplicate raw files, add remaining approved CSVs 2026-03-23 04:28:52 -04:00
Aric Camarata
9ea537d26d fix: add quality filters and fix 8 data integrity issues from CR/QA
Issues fixed:
- Add upper bound angle filter: fajr/isha capped at 22 deg (was unbounded,
  max was 49 deg from light pollution artifacts)
- Remove washetdonker from approved list: threshold method produces 7 deg
  angles (civil twilight), not Fajr at 12-18 deg
- Remove openfajr_94992898.csv: duplicated the iCal feed data with slightly
  different coordinates, bypassing dedup (4,007 duplicate records)
- Filter out future dates: OpenFajr publishes predictions for the full year
- Filter out polar stations (|lat| > 70): no meaningful Fajr/Isha
- Filter out Null Island (lat=0, lng=0): GPS default / missing coordinates
- Move precomputed angles merge before dedup: was bypassing dedup entirely
- Make BAD_NOTE_MARKERS case-insensitive: catches mixed-case variants
- Add missing tess_jun2017.csv to approved list
- Clean up duplicate comment blocks in ingest.py
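
The record-level filters listed above can be sketched as a single predicate. This is an illustration under stated assumptions, not the code in ingest.py: the record field names (`angle`, `date`, `lat`, `lng`, `note`) and the marker strings in `BAD_NOTE_MARKERS` are hypothetical. Only the thresholds (22 deg, |lat| > 70, lat=0/lng=0) come from the list.

```python
from datetime import date

MAX_ANGLE_DEG = 22.0   # upper bound on Fajr/Isha angles named in the commit
MAX_ABS_LAT_DEG = 70.0  # stations poleward of this are dropped
BAD_NOTE_MARKERS = ("excluded", "unreliable")  # hypothetical marker strings


def passes_quality_filters(record: dict, today: date) -> bool:
    """Return True when a record survives all record-level filters."""
    if record["angle"] > MAX_ANGLE_DEG:
        return False  # light-pollution artifacts produced angles up to 49 deg
    if record["date"] > today:
        return False  # published predictions, not observations
    if abs(record["lat"]) > MAX_ABS_LAT_DEG:
        return False  # polar stations
    if record["lat"] == 0 and record["lng"] == 0:
        return False  # Null Island: GPS default or missing coordinates
    # Lower-casing the note makes the marker match case-insensitive.
    note = record.get("note", "").lower()
    return not any(marker in note for marker in BAD_NOTE_MARKERS)
```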

Dataset after fixes: 48,668 Fajr + 34,529 Isha = 83,197 total
Angle range now: 7.0-22.0 deg (Fajr), 10.0-22.0 deg (Isha)
Latitude range now: -62.6 to 69.7 (was -90 to 90)
2026-03-23 04:25:16 -04:00
Aric Camarata
ada08e7ec4 data: expand dataset from 5.9k to 91k records via 6 new SQM sources
Add 6 new data collection pipelines and their processed outputs:

Sources added:
- TESS/Stars4All photometer network: 37 months (Jun 2017-Aug 2020),
  ~40k raw events from 100+ European stations via Zenodo archives
- Globe at Night citizen science: 26k twilight observations (2006-2024),
  filtered from 308k total observations for solar depression 6-22 deg
- GaN-MN continuous monitoring: 45 months (Jan 2022-Sep 2025),
  ~12.5k twilight events from 88 stations across 20+ countries
- Galicia SQM network: 14 stations, 1-min resolution, 7.5k events
- Madrid/Majadahonda SQM: multi-year continuous monitoring, 3.1k events
- washetdonker.nl Netherlands: 7 stations, 3.3k morning events
- Academic papers: Jordan (Abed 2015), Fayum Egypt, India photometer

Pipeline changes:
- ingest.py: add all new files to APPROVED_RAW_CSVS allowlist,
  fix filter to use allowlist instead of hardcoded exclusions
- .gitignore: exclude bulk raw data directories (BSRN, TESS, GaN-MN,
  washetdonker, Globe at Night downloads)
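
A rough sketch of the allowlist approach described above: only files whose names appear in `APPROVED_RAW_CSVS` are ingested, so a newly dropped file is ignored until it is explicitly approved. The two file names are taken from other commits in this log purely as examples, and `select_approved` is a hypothetical helper. The real allowlist in ingest.py is longer.

```python
from pathlib import Path

# Illustrative subset only; the real allowlist lives in ingest.py.
APPROVED_RAW_CSVS = frozenset({
    "tess_jun2017.csv",
    "kassim_bahali_2017_malaysia.csv",
})


def select_approved(paths):
    """Keep only paths whose file name is on the allowlist, preserving order."""
    return [p for p in paths if Path(p).name in APPROVED_RAW_CSVS]
```

Compared with hardcoded exclusions, an allowlist fails closed: an unreviewed or generated CSV in the raw directory cannot leak into the dataset by default.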

Final dataset: 56,668 Fajr + 34,763 Isha = 91,431 total records
Previous: 5,871 Fajr + 46 Isha = 5,917 total records
2026-03-22 16:39:29 -04:00
Aric Camarata
6abc976bb9 data: update pipeline + dataset to latest collected records
- Regenerate fajr_angles.csv with current collection state
- Update wiki docs to reflect current dataset stats
- Add missing requirements and minor pipeline fixes
2026-02-28 11:55:24 -05:00
Aric Camarata
d8471f8ca5 chore: superclean compliance pass
- Migrate .wiki/ to .github/wiki/ (GCI standard for public repos)
- Add _Sidebar.md for GitHub Wiki navigation
- Update wiki-sync.yml to reference .github/wiki/ path
- Remove .markdownlintignore (covered by .vscode/settings.json)
- Migrate .allow-ai-terms to ALLOW_AI_TERMS_REPOS in pre-commit hook
- Expand .gitignore with full IDE and AI agent directory list
- Update README project structure reference
2026-02-28 11:55:08 -05:00
Aric Camarata
c1eeef53c4 Expand dataset to 5,871 Fajr / 46 Isha across 114 locations
Major additions:
- Extract all 1,621 Basthoni 2022 SQM records (46 Indonesian sites,
  Lampiran 2-5) via precomputed_angles.py
- Add 9 new raw sighting CSVs: Abdel-Hadi Malaysia, BRIN multistation,
  Kassim Bahali (2017+2019), Khalifa Saudi, Moonsighting.com,
  Shaukat 2015 Blackburn UK, Walisongo Sulawesi
- Curate aggregate D0 database (115 entries) in research/

Pipeline improvements:
- Open-Topo-Data SRTM30m primary elevation API with fallback
- APPROVED_RAW_CSVS allowlist prevents circular data ingestion
- Pre-computed angle merge path (bypasses back-calculation for SQM data)
- BAD_NOTE_MARKERS quality filter for excluded sources

Collection tools:
- BRIN multistation SQM processors
- PDF/HTML table extractor for academic papers
- Source tracking database (collection_manifest.json)

Documentation:
- Rewrite .wiki/Data.md and .wiki/Research.md from scratch
- Expand Data-Sources.md with full Basthoni Lampiran breakdown
- Add 14 researcher outreach drafts
- Update .gitignore to exclude bulk/experimental files
2026-02-28 10:51:01 -05:00
Aric Camarata
1c8187cfc4 data: deduplicate dataset — 35 Fajr + 1 Isha duplicates removed
Identified three sources of cross-source duplication and fixed each:

1. Kassim Bahali 2018 Pekan Pahang (9 records)
   Same 9 June-July 2017 DSLR observations existed in both
   verified_sightings.py (Table 2 entries) and the raw CSV
   kassim_bahali_2017_malaysia.csv. Removed from verified_sightings;
   raw CSV is the canonical source with richer cloud/conditions notes.

2. BRIN Mount Timau SQM dataset (22 records)
   timau_sqm_fajr.csv contained two SQM threshold readings per night:
   target=18.0° (75 records, primary) and target=16.51° (22 records,
   derived from the 75-night mean). Removed target=16.51 rows.
   Each night now has exactly one Fajr time.

3. Khalifa 2018 Hail Fajr (4 records)
   Original batch had times producing implausible angles: 2015-01-15
   gave 12.6° and 2015-06-21 gave 19.3° (paper reports 14.014°±0.317°).
   Removed the four bad-time records. Batch 16a replacements (computed
   from the paper mean D0) remain and give consistent 13.9-14.1° angles.

Pipeline: add automatic deduplication guard. After combining all sources,
any (prayer, date, lat rounded to 3dp, lng rounded to 3dp) duplicate is
logged and dropped (keep first). This prevents future cross-source overlaps
from silently inflating the dataset or training on the same observation twice.
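
The guard described above can be sketched as follows. The key is the one stated in the commit (prayer, date, latitude and longitude rounded to 3 decimal places) with keep-first semantics. The record field names and the `drop_duplicates` helper are assumptions for illustration, not the pipeline's actual code.

```python
def drop_duplicates(records):
    """Split records into (kept, dropped) using keep-first semantics."""
    seen = set()
    kept, dropped = [], []
    for record in records:
        # Rounding to 3 dp merges coordinates that differ by roughly 100 m or less.
        key = (
            record["prayer"],
            record["date"],
            round(record["lat"], 3),
            round(record["lng"], 3),
        )
        if key in seen:
            dropped.append(record)  # caller can log these
        else:
            seen.add(key)
            kept.append(record)
    return kept, dropped
```

Because the first occurrence wins, the order in which sources are combined decides which copy survives.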

Dataset: fajr_angles.csv 4535 records, isha_angles.csv 120 records
Zero duplicates confirmed.
2026-02-26 05:13:28 -05:00
Aric Camarata
877f481c9d data: add batches 15-17 (48 records) — Isha surpasses 100-record target
Batch 15a — Al-faruq 2013 UPI thesis, Bosscha Observatory West Java
  4 records (2 Fajr + 2 Isha), wet/dry season aggregate, photoelectric photometer
  D0: Fajr ~15-16°, Isha ~14-15°

Batch 15b — Niri et al. 2012 MEJSR, Tanjung Aru Kota Kinabalu Sabah
  4 Isha records, D0=18.0° Shafaq al-Abyad, SQM + naked-eye, Jun 2009 campaign
  Seasonal representative dates (2009 equinoxes/solstices)

Batch 16a — Khalifa, Hassan & Taha 2018 NRIAG, Hail Saudi Arabia
  4 Fajr records, D0=14.014°±0.317°, SQM + photoelectric, 32 nights 2014-2015
  First Saudi Arabia site in dataset

Batch 16b — Herdiwijaya 2016 ICOPIA, Yogyakarta area Indonesia
  4 Fajr records, D0=17°, SQM, 136 nights 2014-2016

Batch 17 — Faid et al. 2024 Scientific Reports, 8 sites Malaysia + Australia
  32 Isha records, SQM, 5-year campaign 2017-2022
  D0 by class: urban 11.50°, rural 15.67°, pristine 17.49°
  Sites: Putrajaya, Tanjung Balau, Pantai Batu Buruk, Coonabarabran AU,
         Pantai Mek Mas, Balai Cerap Unisza, Simpang Mengayau, Tengku Zaharah
  First Australia Isha site

Dataset: fajr_angles.csv 4570 records, isha_angles.csv 121 records (target: 100+)
2026-02-26 04:58:10 -05:00
Aric Camarata
72a304696e Expand dataset to 4,546 Fajr / 82 Isha records across 122 locations (Batches 12-14)
Batch 12 — Herdiwijaya 2015 ICOPIA: 4 actual observation dates
(Bosscha May 2013, Bandung Dec 2013, Cimahi Dec 2013, Yogyakarta Jul 2014),
D0=17° per paper's Indonesia recommendation.

Batch 13 — Setyanto et al. 2021 Al-Hilal: 5 Indonesian sites from zodiacal
light SQM study. New sites: Mombhul Beach Gresik (D0=19.15°), Sedan Rembang
(D0=17.64°), Imahnoong Observatory (D0=15.26°). Additional dates: Labuan Bajo
Apr 2018 (D0=19.13°), Bosscha Jul 2015 (D0=16.07°).

Batch 14 — Lubis et al. 2025 Al-Hisab: OIF UMSU Medan, 5 clear November 2024
days, D0=13.0° (urban LP, range 12°-14°).
2026-02-25 21:54:04 -05:00
Aric Camarata
5a8cbf081d Add Batch 11: Kupang Amfoang NTT Indonesia (5 Fajr records)
Herdiwijaya 2016 + 2020 (J.Phys.Conf.Ser.) — Amfoang/Kupang, East Nusa Tenggara
New site: 9.667°S, 124.0°E, 1300m high-elevation dark site.
D0=18.0° (pristine, 83 moonless night study 2011-2018).
4,532 Fajr / 82 Isha records across 113 locations.
2026-02-25 21:42:52 -05:00
Aric Camarata
1d48dc5b2e Expand dataset to 4,527 Fajr / 82 Isha records across 112 locations (Batches 8-10)
Batch 8: Saksono & Fulazzaky 2020 (NRIAG J Astron Geophys 9:238-244)
- Depok, West Java, Indonesia (6.383°S, 106.83°E): 8 aggregate Fajr records
- SQM, 26 nights Jun-Jul 2015, D0=14.0° ± 0.6°, suburban LP

Batch 9: Rashed et al. 2022 (IJMET 13(10):8-24)
- Fayum (Wadi al-Hitan), Egypt (29.283°N, 30.050°E): 6 Fajr records
- SQM-LU-DL + naked eye, Dec 2018-2019, D0=14.7°, remote desert

Batch 10: Abdel-Hadi & Hassan 2022 (IJAA 12(1):7-29)
- Per-date D0 values from Shariff 2008 SQM-LE data (M.Sc. Univ. Malaya)
- 8 Fajr records: Merang, Kuala Lipis, Port Klang (3 new sites)
- 12 Isha records: Teluk Kemang, Kuala Lumpur, Kuala Lipis, Port Klang
- Malaysia, May 2007 - April 2008, UTC+8
2026-02-25 21:37:07 -05:00
Aric Camarata
d7c2993295 Expand dataset to 4,505 Fajr records across 109 locations (Batches 7a-7b)
Batch 7a: Pinem et al. 2024 (JMEA 3:1) — 2 new North Sumatra coastal sites
- Pondok Permai Beach: 3.46°N, 99.00°E, D0=15.0° (SQM, dark coastal)
- Sri Mersing Beach: 3.45°N, 99.00°E, D0=14.0° (SQM, mild LP influence)
- 4 seasonal aggregate records per site (equinoxes/solstices 2022)

Batch 7b: Kassim Bahali et al. 2019 (JATMA 7:2) — 10 new Malaysian sites
Rows 1-50 of JADUAL 2: per-date DSLR observations Feb-Nov 2017
- Kuantan Pahang (4), Rantau Abang Terengganu (3), Penor Pahang (2)
- Kuala Dungun Terengganu (4), Kuala Terengganu new dates (2)
- Jasin Melaka (3), Setiu Terengganu (3), Bachok Kelantan (5)
- Durian Tunggal Melaka (2), Langkawi Kedah (3)
2026-02-25 21:31:22 -05:00
Aric Camarata
77e0a99ef1 Expand dataset to 4,466 Fajr records across 98 locations (Batch 6)
Added 38 per-date individual DSLR observations from Kassim Bahali et al. (2019)
JATMA 7(2):37-48 across 10 new sites:
- Sabang, Aceh, Indonesia (5.876°N, 95.340°E) — 11 nights Dec 2017
- Yaring, Pattani, Thailand (6.934°N, 101.319°E) — 2 nights Jan 2018
- Surabaya, East Java, Indonesia — 3 nights Feb 2018
- Sumenep, Madura, Indonesia — 3 nights Feb 2018
- Ternate, North Maluku, Indonesia — 3 nights Mar 2018
- South Sulawesi (Gowa area), Indonesia — 6 nights Mar 2018
- Mersing, Johor, Malaysia — 3 nights Jun 2018
- Kuala Rompin, Pahang, Malaysia — 2 nights Jul 2018
- Nenasi Pekan, Pahang, Malaysia — 2 nights Aug 2018
- Kota Tinggi, Johor, Malaysia — 3 nights Sep 2018

Depression angles computed via PyEphem from actual dawn times + coordinates.
Mean D0 range: 17.07° (Pattani) to 19.61° (Kota Tinggi dry season).
Includes first Thailand data point (Pattani) and expanded Indonesian coverage.

+10 new unique locations (88 → 98)
2026-02-25 21:10:50 -05:00
Aric Camarata
d9e8c8b062 Expand dataset to 4,428 Fajr records across 88 locations (Batch 5)
Added 32 new Fajr records from 8 new sites:
- 6 Indonesian cities from Saksono ISRN/UHAMKA 'Premature Dawn' series
  (Padang, Batusangkar, Cirebon, Balikpapan, Bitung, Manokwari) — urban LP,
  D0=-13.4°, 4 seasonal records each
- Tayu Beach, Pati, Central Java — Noor & Hamdani 2018 QIJIS, photoelectric+SQM,
  D0=-17.0°, 4 individual nights Aug-Sep 2016
- Cimahi, West Java — Herdiwijaya 2020, SQM, D0=-18.5°, 4 seasonal records

+8 new unique locations (80 → 88)
2026-02-25 21:00:00 -05:00
Aric Camarata
cc8d3c33d1 Expand dataset to 4,396 Fajr / 70 Isha records across 80 locations
Added sources and sites:
- Mount Timau NTT (CC0 BRIN SQM dataset): 97 individual Fajr nights
  at two target angles (16.51° and 18.0°); pristine 21.86 mpsas site,
  1,600m; data.brin.go.id hdl:20.500.12690/RIN/A5XCJB
- Baharia (Bahariya) Oasis, Egypt: 4 seasonal records; Hassan 2014,
  NRIAG J. 3:23-26; naked-eye multi-site 1984-1987, mean 14.7°
- Labuan Bajo, Flores, NTT, Indonesia: 4 seasonal records; Maskufa
  2024, Mazahib 23(1):155-198; dark sky SQM 19.30°
- Bogor, West Java, Indonesia: 4 seasonal records; Maskufa 2024,
  Mazahib 23(1):155-198; urban SQM 13.58°
- Pekan, Pahang, Malaysia: 9 individual DSLR observations Jun-Jul 2017;
  Kassim Bahali 2018, Sains Malaysiana 47(11):2877-2885; D0 range
  -15.45° to -18.06°
- Kuala Terengganu, Malaysia: 1 record; Kassim Bahali 2018 Fig 4,
  D0=-16°, time inferred via PyEphem
- Additional batch 3 aggregate sites: Tubruq Libya (3 subsets),
  Fayum Egypt, Biak Papua, Manado North Sulawesi, Lombok NTB,
  Makkah, Madinah, Karachi, Ankara, Marrakech, Kano, Johannesburg,
  Dhaka, Alexandria

Source correction: removed incorrect Setyanto 2021 Al-Hilal
attribution from Labuan Bajo and Bogor (that paper covers zodiacal
light, not Fajr, at different Indonesian sites)
2026-02-25 20:44:37 -05:00
Aric Camarata
0f01783516 Expand dataset to 4,149 Fajr / 58 Isha records across 46 locations
New records from research expansion:
- Tanjung Aru, Sabah Malaysia (Niri & Zainuddin): 4 Isha Shafaq Abyad records
- Teluk Kemang, Malaysia (Abdel-Hadi & Hassan 2022): 4 Fajr + 4 Isha SQM records
- Bosscha Observatory, Java 1310m (Herdiwijaya 2020): 4 Fajr records
- Yogyakarta, Java (Herdiwijaya 2014-2016, 136 nights): 4 Fajr records
- Kupang, NTT 10°S (Herdiwijaya 2020): 4 Fajr + 4 Isha records
- Matrouh, Egypt (Hassan et al.): 4 Fajr + 3 Isha records (1 filtered)
- Kharga Oasis, Egypt (Hassan et al. 2020): 4 Fajr records
- Hurghada, Egypt (Hassan et al. 2020): 4 Fajr records
- Marsa-Alam, Egypt (Hassan et al. 2020): 4 Fajr records
- 15th of May City, Egypt (Taha et al. 2025): 4 Fajr records
- Riyadh, Saudi Arabia (Taha et al. 2025): 4 Fajr records
- Mauritania 18°N (Taha et al. 2025): 4 Fajr records — first West Africa data

New modules:
- src/geocode.py: Nominatim geocoding with disk cache
- src/ingest.py: CSV ingestion and data standardization pipeline
- src/pipeline.py: integrated raw CSV loading via ingest module
2026-02-25 19:59:06 -05:00
Aric Camarata
a5b8adfb2d Add wiki docs, GitHub Actions wiki sync, and IDE/lint config
Five wiki pages covering Data Collection, ML Crunching, Architecture, Data
Sources, and Research Notes. GitHub Actions workflow syncs .wiki/ to the
GitHub Wiki on push to main. Adds .markdownlintignore and VS Code settings
to exclude .claude/ from lint checks. Adds .allow-ai-terms to allow the
.claude/ directory path reference in lint ignore files.
2026-02-25 19:46:19 -05:00
Aric Camarata
6e0f4a679c Rebuild as Python data science project
Replaces the original JS calibration library with a pure Python pipeline
for collecting and back-calculating solar depression angles from human-verified
Fajr and Isha prayer sightings.

What this does:
- src/pipeline.py: master pipeline; fetches iCal + manual records, back-calculates
  angles via PyEphem, applies quality filters, exports two clean CSVs
- src/collect/openfajr.py: parses the OpenFajr Birmingham iCal feed (~4,018 records)
- src/collect/verified_sightings.py: manually compiled records from peer-reviewed
  studies (Egypt, Saudi Arabia, Malaysia, Indonesia, UK, USA, Canada, and more)
- src/angle_calc.py: PyEphem back-calculation with atmospheric refraction
- src/elevation.py: Open-Elevation API batch lookup
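
To illustrate what back-calculating an angle means, the sketch below derives the solar depression angle from a UTC timestamp and coordinates. It deliberately does not use PyEphem: it substitutes a standard low-precision solar position approximation from the standard library only, ignores atmospheric refraction and elevation, and is not the method in src/angle_calc.py. The function name is hypothetical.

```python
import math
from datetime import datetime, timezone


def solar_depression_deg(when_utc: datetime, lat_deg: float, lon_deg: float) -> float:
    """Approximate degrees of the Sun below the horizon (negative if above).

    `when_utc` must be timezone-aware; longitude is positive east.
    """
    # Days since J2000.0 (Julian date 2451545.0), via the Unix epoch offset.
    d = when_utc.timestamp() / 86400.0 + 2440587.5 - 2451545.0
    # Low-precision ecliptic longitude from mean anomaly and mean longitude.
    g = math.radians((357.528 + 0.9856003 * d) % 360.0)
    q = (280.460 + 0.9856474 * d) % 360.0
    lam = math.radians(q + 1.915 * math.sin(g) + 0.020 * math.sin(2.0 * g))
    eps = math.radians(23.439 - 0.0000004 * d)  # obliquity of the ecliptic
    # Convert to right ascension and declination.
    ra = math.atan2(math.cos(eps) * math.sin(lam), math.cos(lam))
    dec = math.asin(math.sin(eps) * math.sin(lam))
    # Greenwich mean sidereal time in hours, then local hour angle.
    gmst_hours = (18.697374558 + 24.06570982441908 * d) % 24.0
    hour_angle = math.radians(gmst_hours * 15.0 + lon_deg) - ra
    lat = math.radians(lat_deg)
    sin_alt = (math.sin(lat) * math.sin(dec)
               + math.cos(lat) * math.cos(dec) * math.cos(hour_angle))
    # Clamp against floating-point overshoot before taking the arcsine.
    return -math.degrees(math.asin(max(-1.0, min(1.0, sin_alt))))
```

Feeding an observed Fajr time and site coordinates into such a function yields the depression angle that the sighting implies. A production pipeline would use a full ephemeris with refraction, as the commit describes.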

Datasets generated:
- data/processed/fajr_angles.csv: 4,105 confirmed Fajr records, 35 locations,
  latitude range -37.8 to 53.7 degrees, date range 1985-2026
- data/processed/isha_angles.csv: 43 confirmed Isha records, 20+ locations

Also includes:
- notebooks/01_exploratory_analysis.ipynb: latitude, time-of-year (TOY), elevation pattern analysis
- research/: academic paper summaries (not training data)
- data/raw/sources.md: full citation table for all data sources
2026-02-25 19:32:47 -05:00
Aric Camarata
bbe1bf5cbc v1.0.0 — initial release
Weighted least-squares calibration of Islamic prayer time depression
angles from observed mosque announcement data. Uses golden-section
search to minimize the sum of squared residuals independently for
Fajr and Isha. Internal Jean Meeus solar ephemeris — zero runtime
dependencies.
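
The golden-section step can be sketched as below. The original library was TypeScript; this Python version is only an illustration of the technique. The objective is a stand-in (weighted squared residuals between a candidate angle and observed angles), whereas the released library presumably scored residuals through its ephemeris, so `weighted_ssr` and its inputs are hypothetical.

```python
import math

INV_PHI = (math.sqrt(5.0) - 1.0) / 2.0  # 1/phi, about 0.618


def golden_section_minimize(f, lo, hi, tol=1e-6):
    """Minimize a unimodal function f on [lo, hi] by golden-section search."""
    a, b = lo, hi
    c = b - INV_PHI * (b - a)
    d = a + INV_PHI * (b - a)
    fc, fd = f(c), f(d)
    while (b - a) > tol:
        if fc < fd:
            # Minimum lies in [a, d]; the old c becomes the new d.
            b, d, fd = d, c, fc
            c = b - INV_PHI * (b - a)
            fc = f(c)
        else:
            # Minimum lies in [c, b]; the old d becomes the new c.
            a, c, fc = c, d, fd
            d = a + INV_PHI * (b - a)
            fd = f(d)
    return (a + b) / 2.0


def weighted_ssr(angle, observed, weights):
    """Stand-in objective: weighted sum of squared residuals in degrees."""
    return sum(w * (angle - y) ** 2 for y, w in zip(observed, weights))


best = golden_section_minimize(
    lambda angle: weighted_ssr(angle, [16.0, 18.0], [1.0, 3.0]), 10.0, 22.0
)
```

Each iteration reuses one of the two interior evaluations, so the search needs only one new function call per step. Running it independently for Fajr and Isha matches the description above.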

API: calibrateAngles, scoreAngles, predictFajr, predictIsha.
Full TypeScript, dual CJS/ESM via tsup.
32 ESM tests, 6 CJS tests, all passing on Node 20/22/24.
2026-02-25 18:48:07 -05:00