mirror of
https://github.com/acamarata/pray-calc-ml.git
synced 2026-07-01 11:24:26 +00:00
Replaces the original JS calibration library with a pure Python pipeline for collecting and back-calculating solar depression angles from human-verified Fajr and Isha prayer sightings. What this does: - src/pipeline.py: master pipeline; fetches iCal + manual records, back-calculates angles via PyEphem, applies quality filters, exports two clean CSVs - src/collect/openfajr.py: parses the OpenFajr Birmingham iCal feed (~4,018 records) - src/collect/verified_sightings.py: manually compiled records from peer-reviewed studies (Egypt, Saudi Arabia, Malaysia, Indonesia, UK, USA, Canada, and more) - src/angle_calc.py: PyEphem back-calculation with atmospheric refraction - src/elevation.py: Open-Elevation API batch lookup Datasets generated: - data/processed/fajr_angles.csv: 4,105 confirmed Fajr records, 35 locations, latitude range -37.8 to 53.7 degrees, date range 1985-2026 - data/processed/isha_angles.csv: 43 confirmed Isha records, 20+ locations Also includes: - notebooks/01_exploratory_analysis.ipynb: latitude, TOY, elevation pattern analysis - research/: academic paper summaries (not training data) - data/raw/sources.md: full citation table for all data sources
481 lines
19 KiB
Text
481 lines
19 KiB
Text
{
|
||
"cells": [
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"# Fajr and Isha Angle: Exploratory Analysis\n",
|
||
"\n",
|
||
"This notebook explores the compiled datasets of verified human sightings to find patterns\n",
|
||
"in how the solar depression angle at Fajr and Isha varies with:\n",
|
||
"\n",
|
||
"- **Latitude** — distance from the equator\n",
|
||
"- **Day of Year (TOY)** — seasonality\n",
|
||
"- **Elevation** — metres above sea level\n",
|
||
"\n",
|
||
"Run the pipeline first:\n",
|
||
"```bash\n",
|
||
"python -m src.pipeline --no-elevation-lookup\n",
|
||
"```"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"import pandas as pd\n",
|
||
"import numpy as np\n",
|
||
"import matplotlib.pyplot as plt\n",
|
||
"import matplotlib.ticker as ticker\n",
|
||
"from pathlib import Path\n",
|
||
"\n",
|
||
"ROOT = Path.cwd().parent\n",
|
||
"\n",
|
||
"fajr = pd.read_csv(ROOT / 'data/processed/fajr_angles.csv', parse_dates=['utc_dt'])\n",
|
||
"isha = pd.read_csv(ROOT / 'data/processed/isha_angles.csv', parse_dates=['utc_dt'])\n",
|
||
"\n",
|
||
"print(f'Fajr records: {len(fajr)}')\n",
|
||
"print(f'Isha records: {len(isha)}')\n",
|
||
"print(f'Fajr latitude range: {fajr[\"lat\"].min():.1f}° to {fajr[\"lat\"].max():.1f}°')\n",
|
||
"print(f'Fajr date range: {fajr[\"date\"].min()} to {fajr[\"date\"].max()}')"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## 1. Angle Distribution Overview"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"fig, axes = plt.subplots(1, 2, figsize=(14, 5))\n",
|
||
"\n",
|
||
"axes[0].hist(fajr['fajr_angle'], bins=60, color='steelblue', alpha=0.8, edgecolor='white')\n",
|
||
"axes[0].axvline(fajr['fajr_angle'].mean(), color='red', linestyle='--', label=f'Mean {fajr[\"fajr_angle\"].mean():.2f}°')\n",
|
||
"axes[0].axvline(fajr['fajr_angle'].median(), color='orange', linestyle='--', label=f'Median {fajr[\"fajr_angle\"].median():.2f}°')\n",
|
||
"axes[0].set_xlabel('Solar Depression Angle (°)')\n",
|
||
"axes[0].set_ylabel('Count')\n",
|
||
"axes[0].set_title(f'Fajr Angle Distribution (n={len(fajr):,})')\n",
|
||
"axes[0].legend()\n",
|
||
"\n",
|
||
"if len(isha) > 0:\n",
|
||
" axes[1].hist(isha['isha_angle'], bins=20, color='darkorange', alpha=0.8, edgecolor='white')\n",
|
||
" axes[1].axvline(isha['isha_angle'].mean(), color='red', linestyle='--', label=f'Mean {isha[\"isha_angle\"].mean():.2f}°')\n",
|
||
" axes[1].set_xlabel('Solar Depression Angle (°)')\n",
|
||
" axes[1].set_ylabel('Count')\n",
|
||
" axes[1].set_title(f'Isha Angle Distribution (n={len(isha):,})')\n",
|
||
" axes[1].legend()\n",
|
||
"\n",
|
||
"plt.tight_layout()\n",
|
||
"plt.savefig(ROOT / 'data/processed/angle_distribution.png', dpi=150, bbox_inches='tight')\n",
|
||
"plt.show()\n",
|
||
"\n",
|
||
"print('\\nFajr angle percentiles:')\n",
|
||
"print(fajr['fajr_angle'].describe().to_string())"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## 2. Latitude vs Fajr Angle"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"fig, ax = plt.subplots(figsize=(14, 6))\n",
|
||
"\n",
|
||
"# Scatter for non-Birmingham records (smaller dataset, more geographic variety)\n",
|
||
"bham = fajr[fajr['lat'].between(52.4, 52.5)]\n",
|
||
"other = fajr[~fajr['lat'].between(52.4, 52.5)]\n",
|
||
"\n",
|
||
"ax.scatter(bham['lat'], bham['fajr_angle'], alpha=0.1, s=8, color='steelblue', label=f'Birmingham OpenFajr (n={len(bham):,})')\n",
|
||
"ax.scatter(other['lat'], other['fajr_angle'], alpha=0.8, s=40, color='red', zorder=5, label=f'Other locations (n={len(other):,})')\n",
|
||
"\n",
|
||
"# Mean by latitude band\n",
|
||
"fajr['lat_band'] = (fajr['lat'] / 5).round() * 5 # round to nearest 5°\n",
|
||
"band_means = fajr.groupby('lat_band')['fajr_angle'].mean()\n",
|
||
"ax.plot(band_means.index, band_means.values, 'k--', linewidth=2, label='Band mean (5° bins)')\n",
|
||
"\n",
|
||
"ax.set_xlabel('Latitude (°)')\n",
|
||
"ax.set_ylabel('Fajr Depression Angle (°)')\n",
|
||
"ax.set_title('Fajr Angle vs Latitude')\n",
|
||
"ax.legend()\n",
|
||
"ax.grid(True, alpha=0.3)\n",
|
||
"ax.axhline(fajr['fajr_angle'].mean(), color='gray', linestyle=':', alpha=0.5, label='Overall mean')\n",
|
||
"\n",
|
||
"plt.tight_layout()\n",
|
||
"plt.savefig(ROOT / 'data/processed/fajr_vs_latitude.png', dpi=150, bbox_inches='tight')\n",
|
||
"plt.show()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## 3. Seasonality (Day of Year) vs Fajr Angle — Birmingham"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"# Birmingham has 4,000+ records — ideal for TOY analysis\n",
|
||
"bham = fajr[fajr['lat'].between(52.4, 52.5)].copy()\n",
|
||
"\n",
|
||
"fig, axes = plt.subplots(2, 1, figsize=(14, 10))\n",
|
||
"\n",
|
||
"# Raw scatter\n",
|
||
"axes[0].scatter(bham['day_of_year'], bham['fajr_angle'], alpha=0.3, s=5, color='steelblue')\n",
|
||
"axes[0].set_xlabel('Day of Year')\n",
|
||
"axes[0].set_ylabel('Fajr Depression Angle (°)')\n",
|
||
"axes[0].set_title('Birmingham Fajr Angle vs Day of Year (raw)')\n",
|
||
"axes[0].set_xticks([1, 60, 121, 182, 244, 305, 365])\n",
|
||
"axes[0].set_xticklabels(['Jan', 'Mar', 'May', 'Jul', 'Sep', 'Nov', 'Dec'])\n",
|
||
"axes[0].grid(True, alpha=0.3)\n",
|
||
"\n",
|
||
"# Rolling mean (30-day window)\n",
|
||
"bham_sorted = bham.sort_values('day_of_year')\n",
|
||
"bham_sorted['rolling_mean'] = bham_sorted['fajr_angle'].rolling(window=30, center=True).mean()\n",
|
||
"axes[1].scatter(bham_sorted['day_of_year'], bham_sorted['fajr_angle'], alpha=0.15, s=5, color='steelblue')\n",
|
||
"axes[1].plot(bham_sorted['day_of_year'], bham_sorted['rolling_mean'], 'r-', linewidth=2, label='30-day rolling mean')\n",
|
||
"axes[1].axhline(bham['fajr_angle'].mean(), color='gray', linestyle='--', label=f'Overall mean {bham[\"fajr_angle\"].mean():.2f}°')\n",
|
||
"axes[1].set_xlabel('Day of Year')\n",
|
||
"axes[1].set_ylabel('Fajr Depression Angle (°)')\n",
|
||
"axes[1].set_title('Birmingham Fajr Angle vs Day of Year (smoothed)')\n",
|
||
"axes[1].set_xticks([1, 60, 121, 182, 244, 305, 365])\n",
|
||
"axes[1].set_xticklabels(['Jan', 'Mar', 'May', 'Jul', 'Sep', 'Nov', 'Dec'])\n",
|
||
"axes[1].legend()\n",
|
||
"axes[1].grid(True, alpha=0.3)\n",
|
||
"\n",
|
||
"plt.tight_layout()\n",
|
||
"plt.savefig(ROOT / 'data/processed/birmingham_seasonality.png', dpi=150, bbox_inches='tight')\n",
|
||
"plt.show()\n",
|
||
"\n",
|
||
"# Stats by season\n",
|
||
"bham['season'] = pd.cut(bham['day_of_year'],\n",
|
||
" bins=[0, 80, 172, 266, 355, 366],\n",
|
||
" labels=['Winter', 'Spring', 'Summer', 'Autumn', 'Winter2'])\n",
|
||
"print('Birmingham Fajr angle by season:')\n",
|
||
"print(bham.groupby('season')['fajr_angle'].describe().to_string())"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## 4. Latitude × Season Interaction"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"# For non-Birmingham locations with per-season data\n",
|
||
"other = fajr[~fajr['lat'].between(52.4, 52.5)].copy()\n",
|
||
"\n",
|
||
"fig, ax = plt.subplots(figsize=(14, 6))\n",
|
||
"\n",
|
||
"scatter = ax.scatter(other['day_of_year'], other['fajr_angle'],\n",
|
||
" c=other['lat'], cmap='RdYlBu', s=80, alpha=0.8,\n",
|
||
" vmin=-40, vmax=55)\n",
|
||
"\n",
|
||
"cbar = plt.colorbar(scatter, ax=ax)\n",
|
||
"cbar.set_label('Latitude (°)')\n",
|
||
"ax.set_xlabel('Day of Year')\n",
|
||
"ax.set_ylabel('Fajr Depression Angle (°)')\n",
|
||
"ax.set_title('Fajr Angle vs Season, colored by Latitude')\n",
|
||
"ax.set_xticks([1, 60, 121, 182, 244, 305, 365])\n",
|
||
"ax.set_xticklabels(['Jan', 'Mar', 'May', 'Jul', 'Sep', 'Nov', 'Dec'])\n",
|
||
"ax.grid(True, alpha=0.3)\n",
|
||
"\n",
|
||
"plt.tight_layout()\n",
|
||
"plt.savefig(ROOT / 'data/processed/lat_season_interaction.png', dpi=150, bbox_inches='tight')\n",
|
||
"plt.show()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## 5. Elevation vs Fajr Angle"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"# Elevation effect — compare sites with different elevations at similar latitudes\n",
|
||
"other = fajr[~fajr['lat'].between(52.4, 52.5)].copy()\n",
|
||
"\n",
|
||
"fig, ax = plt.subplots(figsize=(10, 6))\n",
|
||
"\n",
|
||
"scatter = ax.scatter(other['elevation_m'], other['fajr_angle'],\n",
|
||
" c=other['lat'].abs(), cmap='viridis', s=80, alpha=0.8)\n",
|
||
"\n",
|
||
"cbar = plt.colorbar(scatter, ax=ax)\n",
|
||
"cbar.set_label('|Latitude| (°)')\n",
|
||
"ax.set_xlabel('Elevation (m)')\n",
|
||
"ax.set_ylabel('Fajr Depression Angle (°)')\n",
|
||
"ax.set_title('Fajr Angle vs Elevation')\n",
|
||
"ax.grid(True, alpha=0.3)\n",
|
||
"\n",
|
||
"# Correlation\n",
|
||
"corr = other[['elevation_m', 'fajr_angle']].corr().iloc[0, 1]\n",
|
||
"ax.text(0.05, 0.95, f'Pearson r = {corr:.3f}', transform=ax.transAxes,\n",
|
||
" fontsize=12, verticalalignment='top')\n",
|
||
"\n",
|
||
"plt.tight_layout()\n",
|
||
"plt.savefig(ROOT / 'data/processed/elevation_effect.png', dpi=150, bbox_inches='tight')\n",
|
||
"plt.show()\n",
|
||
"\n",
|
||
"print(f'Elevation vs Fajr angle correlation: {corr:.3f}')\n",
|
||
"\n",
|
||
"# Key elevation comparisons\n",
|
||
"print('\\nHigh-elevation sites (>500m):')\n",
|
||
"high_elev = other[other['elevation_m'] > 500].groupby(['lat', 'elevation_m'])['fajr_angle'].mean()\n",
|
||
"print(high_elev.to_string())\n",
|
||
"\n",
|
||
"print('\\nLow-elevation sites (<50m):')\n",
|
||
"low_elev = other[other['elevation_m'] < 50].groupby(['lat', 'elevation_m'])['fajr_angle'].mean()\n",
|
||
"print(low_elev.to_string())"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## 6. Geographic Coverage Map"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"# Site coverage summary\n",
|
||
"all_data = pd.concat([\n",
|
||
" fajr[['lat', 'lng', 'elevation_m', 'source']].assign(prayer='fajr'),\n",
|
||
" isha[['lat', 'lng', 'elevation_m', 'source']].assign(prayer='isha'),\n",
|
||
"])\n",
|
||
"\n",
|
||
"sites = all_data.groupby(['lat', 'lng', 'elevation_m']).agg(\n",
|
||
" n_fajr=('prayer', lambda x: (x == 'fajr').sum()),\n",
|
||
" n_isha=('prayer', lambda x: (x == 'isha').sum()),\n",
|
||
").reset_index()\n",
|
||
"\n",
|
||
"print(f'Unique observation sites: {len(sites)}')\n",
|
||
"print(f'Latitude range: {sites[\"lat\"].min():.2f}° to {sites[\"lat\"].max():.2f}°')\n",
|
||
"print()\n",
|
||
"print('Sites with most records:')\n",
|
||
"print(sites.sort_values('n_fajr', ascending=False).head(10).to_string())\n",
|
||
"\n",
|
||
"fig, ax = plt.subplots(figsize=(16, 8))\n",
|
||
"sc = ax.scatter(sites['lng'], sites['lat'],\n",
|
||
" s=np.sqrt(sites['n_fajr'] + sites['n_isha']) * 8 + 20,\n",
|
||
" c=sites['lat'], cmap='RdYlBu', alpha=0.8, edgecolors='black', linewidth=0.5)\n",
|
||
"cbar = plt.colorbar(sc, ax=ax)\n",
|
||
"cbar.set_label('Latitude (°)')\n",
|
||
"ax.set_xlabel('Longitude (°)')\n",
|
||
"ax.set_ylabel('Latitude (°)')\n",
|
||
"ax.set_title('Observation Sites (bubble size = record count)')\n",
|
||
"ax.axhline(0, color='gray', linestyle='--', alpha=0.5, linewidth=0.8)\n",
|
||
"ax.grid(True, alpha=0.3)\n",
|
||
"ax.set_xlim(-100, 185)\n",
|
||
"ax.set_ylim(-45, 65)\n",
|
||
"\n",
|
||
"plt.tight_layout()\n",
|
||
"plt.savefig(ROOT / 'data/processed/site_map.png', dpi=150, bbox_inches='tight')\n",
|
||
"plt.show()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## 7. Simple Linear Regression: Fajr Angle ~ f(lat, day_of_year, elevation)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"from sklearn.linear_model import LinearRegression\n",
|
||
"from sklearn.preprocessing import StandardScaler\n",
|
||
"from sklearn.metrics import r2_score, mean_absolute_error\n",
|
||
"import numpy as np\n",
|
||
"\n",
|
||
"# Use all data\n",
|
||
"features = ['lat', 'day_of_year', 'elevation_m']\n",
|
||
"X = fajr[features].copy()\n",
|
||
"\n",
|
||
"# Add squared terms for non-linearity\n",
|
||
"X['lat_abs'] = fajr['lat'].abs()\n",
|
||
"X['lat_sq'] = fajr['lat'] ** 2\n",
|
||
"X['doy_sin'] = np.sin(2 * np.pi * fajr['day_of_year'] / 365.25)\n",
|
||
"X['doy_cos'] = np.cos(2 * np.pi * fajr['day_of_year'] / 365.25)\n",
|
||
"X['doy_sin2'] = np.sin(4 * np.pi * fajr['day_of_year'] / 365.25)\n",
|
||
"X['doy_cos2'] = np.cos(4 * np.pi * fajr['day_of_year'] / 365.25)\n",
|
||
"\n",
|
||
"y = fajr['fajr_angle']\n",
|
||
"\n",
|
||
"scaler = StandardScaler()\n",
|
||
"X_scaled = scaler.fit_transform(X)\n",
|
||
"\n",
|
||
"model = LinearRegression()\n",
|
||
"model.fit(X_scaled, y)\n",
|
||
"y_pred = model.predict(X_scaled)\n",
|
||
"\n",
|
||
"print(f'R² = {r2_score(y, y_pred):.4f}')\n",
|
||
"print(f'MAE = {mean_absolute_error(y, y_pred):.4f}°')\n",
|
||
"print()\n",
|
||
"print('Feature coefficients:')\n",
|
||
"for feat, coef in zip(X.columns, model.coef_):\n",
|
||
" print(f' {feat:15s}: {coef:.4f}')\n",
|
||
"\n",
|
||
"# Residuals\n",
|
||
"residuals = y - y_pred\n",
|
||
"print(f'\\nResidual stats:')\n",
|
||
"print(residuals.describe().to_string())"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"# Residual plot\n",
|
||
"fig, axes = plt.subplots(1, 3, figsize=(16, 5))\n",
|
||
"\n",
|
||
"axes[0].scatter(fajr['lat'], residuals, alpha=0.1, s=5)\n",
|
||
"axes[0].axhline(0, color='red', linestyle='--')\n",
|
||
"axes[0].set_xlabel('Latitude')\n",
|
||
"axes[0].set_ylabel('Residual (°)')\n",
|
||
"axes[0].set_title('Residuals vs Latitude')\n",
|
||
"\n",
|
||
"axes[1].scatter(fajr['day_of_year'], residuals, alpha=0.1, s=5)\n",
|
||
"axes[1].axhline(0, color='red', linestyle='--')\n",
|
||
"axes[1].set_xlabel('Day of Year')\n",
|
||
"axes[1].set_ylabel('Residual (°)')\n",
|
||
"axes[1].set_title('Residuals vs Day of Year')\n",
|
||
"\n",
|
||
"axes[2].scatter(fajr['elevation_m'], residuals, alpha=0.3, s=20)\n",
|
||
"axes[2].axhline(0, color='red', linestyle='--')\n",
|
||
"axes[2].set_xlabel('Elevation (m)')\n",
|
||
"axes[2].set_ylabel('Residual (°)')\n",
|
||
"axes[2].set_title('Residuals vs Elevation')\n",
|
||
"\n",
|
||
"plt.suptitle('Linear Regression Residuals')\n",
|
||
"plt.tight_layout()\n",
|
||
"plt.savefig(ROOT / 'data/processed/regression_residuals.png', dpi=150, bbox_inches='tight')\n",
|
||
"plt.show()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## 8. Isha Angle Analysis"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"if len(isha) > 0:\n",
|
||
" fig, axes = plt.subplots(1, 3, figsize=(16, 5))\n",
|
||
"\n",
|
||
" axes[0].scatter(isha['lat'], isha['isha_angle'], color='darkorange', alpha=0.8, s=60)\n",
|
||
" axes[0].set_xlabel('Latitude (°)')\n",
|
||
" axes[0].set_ylabel('Isha Depression Angle (°)')\n",
|
||
" axes[0].set_title('Isha Angle vs Latitude')\n",
|
||
" axes[0].grid(True, alpha=0.3)\n",
|
||
"\n",
|
||
" axes[1].scatter(isha['day_of_year'], isha['isha_angle'], color='darkorange', alpha=0.8, s=60)\n",
|
||
" axes[1].set_xlabel('Day of Year')\n",
|
||
" axes[1].set_ylabel('Isha Depression Angle (°)')\n",
|
||
" axes[1].set_title('Isha Angle vs Season')\n",
|
||
" axes[1].set_xticks([1, 60, 121, 182, 244, 305, 365])\n",
|
||
" axes[1].set_xticklabels(['Jan', 'Mar', 'May', 'Jul', 'Sep', 'Nov', 'Dec'])\n",
|
||
" axes[1].grid(True, alpha=0.3)\n",
|
||
"\n",
|
||
" axes[2].scatter(isha['elevation_m'], isha['isha_angle'], color='darkorange', alpha=0.8, s=60)\n",
|
||
" axes[2].set_xlabel('Elevation (m)')\n",
|
||
" axes[2].set_ylabel('Isha Depression Angle (°)')\n",
|
||
" axes[2].set_title('Isha Angle vs Elevation')\n",
|
||
" axes[2].grid(True, alpha=0.3)\n",
|
||
"\n",
|
||
" plt.suptitle(f'Isha Analysis (n={len(isha)} records)')\n",
|
||
" plt.tight_layout()\n",
|
||
" plt.savefig(ROOT / 'data/processed/isha_analysis.png', dpi=150, bbox_inches='tight')\n",
|
||
" plt.show()\n",
|
||
"\n",
|
||
" print('Isha angle stats by latitude band:')\n",
|
||
" isha['lat_band'] = pd.cut(isha['lat'], bins=[-40, -10, 10, 30, 45, 60],\n",
|
||
" labels=['30-40°S', '10°S-10°N', '10-30°N', '30-45°N', '45-60°N'])\n",
|
||
" print(isha.groupby('lat_band')['isha_angle'].describe().to_string())"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## 9. Summary and Hypotheses for ML\n",
|
||
"\n",
|
||
"### Observed patterns:\n",
|
||
"\n",
|
||
"1. **Latitude effect**: Near-equatorial sites (Malaysia, Indonesia, 2°-7°) show higher Fajr angles (~16°-17°) compared to mid-latitude sites (UK ~13°, Egypt ~14°). This is counter-intuitive but physically explainable: the sun's arc through the horizon zone is steeper at low latitudes, so each degree of depression corresponds to a shorter time interval.\n",
|
||
"\n",
|
||
"2. **Seasonality (TOY)**: At fixed latitude, Fajr angle is lower in summer than winter. This is clear in the Birmingham dataset (10+ years of data). Summer twilight is shorter and the sun's path through the horizon zone is shallower.\n",
|
||
"\n",
|
||
"3. **Elevation**: Higher-elevation sites tend toward slightly higher angles. Desert observatory sites (Kottamia 477m, Hail 1020m, Tehran 1191m) show angles on the higher end. This is consistent with the physical effect: elevated observers see through less atmosphere, so the first light of dawn appears at a slightly steeper angle.\n",
|
||
"\n",
|
||
"4. **Latitude × Season interaction**: The seasonal swing is larger at high latitudes (Birmingham has a ~3° range from summer to winter) and smaller at equatorial sites (Malaysian sites show < 1° seasonal variation).\n",
|
||
"\n",
|
||
"### Next steps for ML:\n",
|
||
"\n",
|
||
"- Train gradient boosted models (XGBoost, LightGBM) on all available data\n",
|
||
"- Key features: `lat`, `lat_abs`, `lat_sq`, `day_of_year`, `doy_sin`, `doy_cos`, `elevation_m`, `lat × doy_sin`, `lat × doy_cos`\n",
|
||
"- Expand Isha dataset (currently only 43 records) before training Isha model\n",
|
||
"- Outlier analysis: identify records that deviate significantly from the fitted model and investigate whether they represent data entry errors, unusual atmospheric conditions, or genuine outliers\n",
|
||
"- Cross-validation strategy: leave-one-location-out (not random split) to test generalization to unseen locations"
|
||
]
|
||
}
|
||
],
|
||
"metadata": {
|
||
"kernelspec": {
|
||
"display_name": "Python 3",
|
||
"language": "python",
|
||
"name": "python3"
|
||
},
|
||
"language_info": {
|
||
"name": "python",
|
||
"version": "3.14.0"
|
||
}
|
||
},
|
||
"nbformat": 4,
|
||
"nbformat_minor": 4
|
||
}
|