mirror of https://github.com/acamarata/pray-calc-ml.git
synced 2026-07-02 03:40:39 +00:00
Major additions:
- Extract all 1,621 Basthoni 2022 SQM records (46 Indonesian sites, Lampiran 2-5) via precomputed_angles.py
- Add 9 new raw sighting CSVs: Abdel-Hadi Malaysia, BRIN multistation, Kassim Bahali (2017+2019), Khalifa Saudi, Moonsighting.com, Shaukat 2015 Blackburn UK, Walisongo Sulawesi
- Curate aggregate D0 database (115 entries) in research/

Pipeline improvements:
- Open-Topo-Data SRTM30m primary elevation API with fallback
- APPROVED_RAW_CSVS allowlist prevents circular data ingestion
- Pre-computed angle merge path (bypasses back-calculation for SQM data)
- BAD_NOTE_MARKERS quality filter for excluded sources

Collection tools:
- BRIN multistation SQM processors
- PDF/HTML table extractor for academic papers
- Source tracking database (collection_manifest.json)

Documentation:
- Rewrite .wiki/Data.md and .wiki/Research.md from scratch
- Expand Data-Sources.md with full Basthoni Lampiran breakdown
- Add 14 researcher outreach drafts
- Update .gitignore to exclude bulk/experimental files
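The commit message names an APPROVED_RAW_CSVS allowlist and a BAD_NOTE_MARKERS quality filter but does not show them. The following is only a minimal sketch of how such a pair might work. The file names, marker strings, and both helper functions (`select_raw_csvs`, `passes_quality_filter`) are hypothetical and are not taken from the repository.

```python
from pathlib import Path

# Hypothetical values: the commit names these constants but not their contents.
APPROVED_RAW_CSVS = {"basthoni_2022.csv", "khalifa_saudi.csv"}
BAD_NOTE_MARKERS = ("excluded", "back-calculated")


def select_raw_csvs(paths):
    """Keep only paths whose file name is on the allowlist.

    Anything not explicitly approved (for example a CSV that the pipeline
    itself generated) is skipped, which is one way to avoid re-ingesting
    derived data as if it were a raw source.
    """
    return [p for p in paths if Path(p).name in APPROVED_RAW_CSVS]


def passes_quality_filter(note):
    """Return False when a record's note contains any bad marker (case-insensitive)."""
    lowered = note.lower()
    return not any(marker in lowered for marker in BAD_NOTE_MARKERS)


if __name__ == "__main__":
    candidates = ["data/raw/basthoni_2022.csv", "data/processed/derived.csv"]
    print(select_raw_csvs(candidates))
    print(passes_quality_filter("EXCLUDED: duplicate of another source"))
```

An explicit allowlist fails closed: a new CSV dropped into the raw directory is ignored until someone adds it by name, whereas a denylist would ingest it by default.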
57 lines
1.2 KiB
Text
__pycache__/
*.py[cod]
*.egg-info/
.eggs/
dist/
build/
.venv/
venv/
env/
.env
*.log
.DS_Store
.ipynb_checkpoints/
.jupyter/
.claude/

# Raw scraped/downloaded files
data/raw/*.pdf

# Generated notebook outputs
data/processed/*.png
data/processed/*.svg

# Bulk data directories (too large for git, not curated)
data/cache/
data/raw/crawled/
data/raw/excluded/
data/raw/brin_multistation_raw/

# Collection session logs and scratch
data/raw/collection_log*.txt
data/raw/sources_crawled.md

# Research scratch files (superseded by aggregate_d0_values.csv)
research/aggregate_d0_database.csv
research/aggregate_analysis.md
research/candidate_papers.json
research/mine_*.py

# Experimental collection scripts (not part of core pipeline)
src/autonomous_collect.py
src/compute_aggregate_times.py
src/collect/aladhan.py
src/collect/autonomous_collector.py
src/collect/aggregate_to_records.py
src/collect/bulk_generator.py
src/collect/bulk_runner.py
src/collect/cities.py
src/collect/collect_agent.py
src/collect/harvest.py
src/collect/jakim.py
src/collect/morocco.py
src/collect/muis_singapore.py
src/collect/openalex_harvester.py
src/collect/source_tracker.py
src/collect/waktusolat.py
src/collect/web_harvester.py