Equinox Haber is an editorial-first, AI-assisted Turkish news publishing system for haber.sametbasbug.dev.
The project explores a practical question: can a small public news surface stay fast, readable, transparent, and auditable when software handles ingestion, deduplication, queueing, audits, image generation, static builds, and deployment discipline — while an editorial agent/human-in-the-loop keeps the final news judgment?
The reusable part is the workflow. The live output is a global-focused Turkish news stream.
This repository is intentionally split into two layers:
- Source code and workflow tooling are MIT licensed. This includes the Astro site, Python news pipeline, CLI, audits, queue logic, tests, and build/deploy workflow.
- Editorial content, generated/published images, brand identity, and protected media are not included in the MIT license. See CONTENT_LICENSE.md.
In short: the system can be studied, reused, forked, and adapted; the published news archive and brand layer should not be treated as open content.
Most small publishing projects eventually hit the same maintenance wall:
- collecting sources is easy, but selecting responsibly is hard;
- automation is fast, but blind autopublish is risky;
- duplicate stories and repeated angles quietly lower quality;
- AI-generated drafts can help, but they must not become the editor;
- image generation needs strict size/licensing/visual guardrails;
- publishing needs boring safeguards: audits, build checks, narrow commits, CI, and rollback-friendly history.
Equinox Haber is a working experiment around those constraints.
Many maintainers now have to review AI-assisted changes, generated summaries, bot-authored PRs, or automated release notes. The hard part is not calling a model; it is keeping the workflow auditable when automation touches public output.
This repo is useful as a small, concrete reference for:
- separating ingestion/automation from final editorial authority;
- keeping provider-dependent steps out of CI;
- writing tests for stale-source, duplicate, manual-review, image, and leak-prevention gates;
- documenting where automation must stop and a human/maintainer must decide;
- preserving a clear license boundary between reusable tooling and protected published content.
The current live site is a standalone GitHub Pages news surface at haber.sametbasbug.dev.
Key routes:
- /: live news homepage with category tabs and automatic featured signals
- /?kategori=siyaset, /?kategori=ekonomi, /?kategori=teknoloji, /?kategori=bilim: category-filtered homepage views
- /<slug>/: article pages with canonical news metadata
- /sayfa/<n>/: paginated public archive
- /icerik-paneli/: expanded content-panel stream
- /rss.xml: RSS feed for the latest public items
- /news-sitemap.xml: Google News sitemap for the recent publication window
- /sitemap-index.xml: sitemap index generated by Astro
Transparency and policy pages:
- /hakkimizda/
- /yazarlar/
- /iletisim/
- /editorial-ilkeler/
- /duzeltme-politikasi/
- /yapay-zeka-ve-yayin-sureci/
- /gizlilik-politikasi/
The header is intentionally category-focused. Trust, policy, RSS, and ecosystem links live in the footer so the news reading surface stays clean.
The homepage has two editorial layers:
- Canlı akış — the main chronological stream.
- Öne çıkanlar / Akıştan seçilenler — an automatic signal block, not a fake manual editor pick.
The automatic featured model scores recent items using recency, source count, breaking/global signal terms, and optional editorPick as a boost. Old manual flags are penalized so stale stories do not stay pinned forever.
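The scoring idea above can be sketched roughly as follows. This is a minimal illustration, not the project's actual scoring code: the `Item` fields, the weights, the 48-hour decay, the 24-hour cutoff for `editor_pick`, and the `SIGNAL_TERMS` list are all assumptions chosen for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Hypothetical signal terms; the real list is not documented here.
SIGNAL_TERMS = ("son dakika", "breaking", "küresel")


@dataclass
class Item:
    title: str
    published: datetime
    source_count: int = 1
    editor_pick: bool = False


def featured_score(item: Item, now: datetime) -> float:
    """Combine recency, source count, signal terms, and editor pick into one score."""
    age_hours = max((now - item.published).total_seconds() / 3600, 0.0)
    recency = max(0.0, 1.0 - age_hours / 48)      # linear decay to zero over 48 hours (assumed)
    sources = min(item.source_count, 5) * 0.2     # capped so one story cannot dominate
    title = item.title.lower()
    signal = 0.5 if any(term in title for term in SIGNAL_TERMS) else 0.0
    score = recency + sources + signal
    if item.editor_pick:
        # A fresh pick is a boost; an old manual flag becomes a penalty.
        score += 0.5 if age_hours <= 24 else -0.5
    return score


if __name__ == "__main__":
    now = datetime(2025, 1, 2, 12, 0, tzinfo=timezone.utc)
    fresh = Item("Son dakika: örnek başlık", now - timedelta(hours=1), source_count=3)
    stale_pick = Item("Eski haber", now - timedelta(hours=72), editor_pick=True)
    print(featured_score(fresh, now) > featured_score(stale_pick, now))
```

Turning the manual flag into a penalty after a cutoff is what keeps a stale story from staying pinned: it drops below an otherwise identical unflagged item.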
When a category tab is active, the side block becomes category-specific, for example Teknoloji akışından, and keeps its items inside that category. In the default Tümü view, it prefers category diversity.
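One way to express the two side-block modes is a single selection function: filter by category when a tab is active, otherwise rotate across categories. The sketch below assumes already-scored `(score, category, slug)` tuples and a hypothetical `pick_featured` helper; the site's real component may work differently.

```python
from collections import defaultdict


def pick_featured(scored, limit=4, category=None):
    """Pick featured slugs from (score, category, slug) tuples.

    With a category, stay inside it; without one, prefer category diversity.
    """
    ranked = sorted(scored, key=lambda row: row[0], reverse=True)
    if category is not None:
        return [slug for _, cat, slug in ranked if cat == category][:limit]

    # Default view: round-robin over categories, best item of each first.
    by_category = defaultdict(list)
    for _, cat, slug in ranked:
        by_category[cat].append(slug)

    picks = []
    while len(picks) < limit and any(by_category.values()):
        for cat in list(by_category):
            if by_category[cat] and len(picks) < limit:
                picks.append(by_category[cat].pop(0))
    return picks


if __name__ == "__main__":
    scored = [
        (0.9, "Teknoloji", "a"),
        (0.8, "Teknoloji", "b"),
        (0.7, "Ekonomi", "c"),
        (0.6, "Bilim", "d"),
        (0.5, "Teknoloji", "e"),
    ]
    print(pick_featured(scored, limit=3))                        # ['a', 'c', 'd']
    print(pick_featured(scored, limit=2, category="Teknoloji"))  # ['a', 'b']
```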
The repository has two main parts:
- Astro publishing surface — static news pages, category UI, article templates, RSS, Google News sitemap, sitemap index, transparency pages, author/site shell, and GitHub Pages deployment.
- Python news pipeline — RSS collection, normalization, duplicate reduction, editorial queue, scoring/filtering, quality gates, AI hero generation handoff, markdown/frontmatter generation, audits, and controlled publish workflow.
haber-project/
  src/
    components/news/          # News UI components
    content/equinoxHaber/     # Published markdown news items
    data/                     # Site/category helpers
    layouts/                  # News/article/info page shells
    pages/                    # Astro routes, RSS, news sitemap
  public/images/generated/    # Generated hero images
  news_pipeline/
    news_pipeline/
      collectors/             # RSS/source ingestion
      normalize/              # Raw item cleanup
      dedupe/                 # Similarity and duplicate checks
      editorial/              # Filtering, scoring, autonomy gates
      queue/                  # Editorial queue state
      publish/                # Markdown/frontmatter/body/hero helpers
      cli/                    # Typer CLI commands and audits
    data/                     # Runtime data, ignored except placeholders/docs
The project is editorial-first.
Python is not the editor. It is the technical rail: it collects, normalizes, scores, queues, audits, builds, and publishes only after the editorial handoff is present.
Asteria AI is the narrow editorial agent used in this project. Its role is to inspect candidates, read selected source URLs, write the Turkish article body, prepare facts/tags, and produce the hero image brief. The pipeline then validates and carries that work into the Astro site.
Current category set:
- Siyaset
- Ekonomi
- Teknoloji
- Bilim
Turkey-related stories are included only when the global context is strong enough for one of those categories.
# Collect source items
news-pipeline collect
# Normalize raw items and update the queue
news-pipeline process
# Prepare a compact editorial board for Asteria
news-pipeline heartbeat prepare-one --json
# Asteria applies the editorial handoff
news-pipeline queue polish <QUEUE_ID> \
  --title 'Türkçe başlık' \
  --description 'Türkçe açıklama' \
  --category 'Teknoloji' \
  --facts-json '["...", "..."]' \
  --body 'Haber gövdesi...' \
  --hero-prompt 'AI hero brief...' \
  --hero-alt 'Türkçe alt metin' \
  --tags-json '["haber", "teknoloji"]' \
  --json
# Publish through the technical rail
news-pipeline heartbeat publish-one --execute --no-collect --json
# Local quality gates
news-pipeline audit-content
news-pipeline audit-images
npm run build
The direct publish/autopublish path is intentionally disabled for production use. Production publishing should go through the heartbeat/editorial-polish route so the editorial handoff, hero brief, audits, and build checks are preserved.
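A handoff gate of this kind can be as small as a completeness check that runs before anything is published. The sketch below is an assumption-laden illustration: the field names mirror the `queue polish` flags, but the `handoff_problems` function and the dictionary shape of a queue entry are hypothetical, not the pipeline's actual API.

```python
ALLOWED_CATEGORIES = {"Siyaset", "Ekonomi", "Teknoloji", "Bilim"}

# Hypothetical field names, modelled on the `queue polish` flags.
REQUIRED_FIELDS = (
    "title",
    "description",
    "category",
    "facts",
    "body",
    "hero_prompt",
    "hero_alt",
    "tags",
)


def handoff_problems(entry: dict) -> list[str]:
    """Return a list of reasons the entry is not ready; empty means publishable."""
    problems = [f"missing {name}" for name in REQUIRED_FIELDS if not entry.get(name)]
    category = entry.get("category")
    if category and category not in ALLOWED_CATEGORIES:
        problems.append(f"unknown category {category!r}")
    return problems


if __name__ == "__main__":
    draft = {"title": "Türkçe başlık", "category": "Teknoloji"}
    # A draft without body, facts, or hero brief is refused rather than published.
    print(handoff_problems(draft))
```

Returning reasons instead of a bare boolean keeps the refusal auditable: the same list can be logged or shown to the editor.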
The pipeline is designed to reduce common publishing failure modes:
- source age checks;
- URL/title/description/topic duplicate guards;
- recent repeated-company/product signal penalties;
- manual-review separation for sensitive legal/political/personal-risk stories;
- required Asteria editorial polish before production publish;
- Turkish title/body/fact checks;
- required hero prompt and alt text;
- AI hero generation with WebP normalization;
- generated local hero guard: .webp, 1200×675, max 400 KB;
- RSS media enclosures with absolute URLs;
- Google News sitemap for the recent publication window;
- image/content audits before build;
- Astro build before deploy;
- narrow commit/push scope for published items.
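Two of the guards above, source age and URL/title duplicates, can be illustrated with the standard library alone. This is a sketch under stated assumptions: the 36-hour age limit, the 0.85 similarity threshold, and the helper names are invented for the example, and `difflib.SequenceMatcher` stands in for whatever similarity measure the pipeline actually uses.

```python
from datetime import datetime, timedelta, timezone
from difflib import SequenceMatcher
from urllib.parse import urlsplit


def canonical_url(url: str) -> str:
    """Reduce a URL to host + path so tracking parameters do not hide duplicates."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower().removeprefix("www.")
    return f"{host}{parts.path.rstrip('/')}"


def is_stale(published: datetime, now: datetime, max_age=timedelta(hours=36)) -> bool:
    """True when the source item is older than the allowed window (assumed 36 h)."""
    return now - published > max_age


def is_duplicate(candidate: dict, published_items: list, title_threshold: float = 0.85) -> bool:
    """True when the candidate matches an existing item by URL or near-identical title."""
    cand_url = canonical_url(candidate["url"])
    cand_title = candidate["title"].casefold()
    for item in published_items:
        if canonical_url(item["url"]) == cand_url:
            return True
        ratio = SequenceMatcher(None, cand_title, item["title"].casefold()).ratio()
        if ratio >= title_threshold:
            return True
    return False


if __name__ == "__main__":
    existing = [{"url": "https://example.com/a", "title": "OpenAI yeni modelini duyurdu"}]
    repeat = {"url": "https://www.example.com/a/?utm_source=x", "title": "Başka bir başlık"}
    print(is_duplicate(repeat, existing))  # True: same host and path
```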
CI runs provider-free checks only: Python compile, pipeline tests with pytest, deprecated CLI guardrails, pipeline audits, and Astro build. It does not call external source collection, AI providers, or production publish commands.
The public release includes the surfaces expected by news/feed consumers:
- rss.xml is limited to the latest 50 items and uses absolute image enclosure URLs.
- news-sitemap.xml lists recently published items with Google News metadata.
- sitemap-index.xml is generated for the static site.
- Article pages include canonical URLs and structured news metadata.
- Public transparency pages explain ownership, authorship, corrections, privacy, editorial principles, and AI use.
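The feed rules above (latest 50 items, absolute enclosure URLs, a recent-publication window for the news sitemap) amount to a few lines of selection logic. The site itself generates these files from Astro routes; the Python below only illustrates the rules, and the `https` scheme, the item dictionary shape, and the two-day window default are assumptions.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urljoin

SITE = "https://haber.sametbasbug.dev"  # scheme assumed for the example


def feed_items(items: list, limit: int = 50) -> list:
    """Newest first, capped at `limit`, with site-relative image paths made absolute."""
    latest = sorted(items, key=lambda item: item["published"], reverse=True)[:limit]
    return [{**item, "enclosure": urljoin(SITE, item["image"])} for item in latest]


def news_sitemap_items(items: list, now: datetime, window=timedelta(days=2)) -> list:
    """Keep only items published inside the recent window (two days assumed)."""
    return [item for item in items if now - item["published"] <= window]


if __name__ == "__main__":
    now = datetime(2025, 1, 10, tzinfo=timezone.utc)
    items = [
        {
            "slug": f"haber-{n}",
            "published": now - timedelta(hours=n),
            "image": f"/images/generated/equinox-haber/haber-{n}.webp",
        }
        for n in range(60)
    ]
    print(len(feed_items(items)), feed_items(items)[0]["enclosure"])
    print(len(news_sitemap_items(items, now)))
```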
Generated hero images are intentionally local and normalized:
- final path: public/images/generated/equinox-haber/<slug>.webp
- dimensions: 1200×675
- format: WebP
- default quality target: 82
- audit maximum: 400 KB
The image generator writes to a raw temporary output first; the pipeline then resizes, crops, strips metadata, and writes the final WebP. This prevents provider output dimensions from silently bloating the site.
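The resize-and-crop step depends on one piece of arithmetic: finding the largest centered region of the raw output that has the 1200×675 aspect ratio. The sketch below shows that calculation and a matching audit check using only the standard library. The function names are hypothetical, treating 400 KB as 400 × 1024 bytes is an assumption, and the real pipeline would pass the box to an image library to crop, resize, and encode.

```python
TARGET_W, TARGET_H = 1200, 675
MAX_BYTES = 400 * 1024  # assumes "400 KB" means 400 * 1024 bytes


def cover_crop_box(width: int, height: int, target_w: int = TARGET_W, target_h: int = TARGET_H):
    """Return (left, top, right, bottom) of the largest centered box with the target ratio."""
    if width * target_h > height * target_w:
        # Source is wider than the target ratio: trim the sides.
        crop_w = height * target_w // target_h
        left = (width - crop_w) // 2
        return (left, 0, left + crop_w, height)
    # Source is taller than (or equal to) the target ratio: trim top and bottom.
    crop_h = width * target_h // target_w
    top = (height - crop_h) // 2
    return (0, top, width, top + crop_h)


def passes_hero_guard(filename: str, width: int, height: int, size_bytes: int) -> bool:
    """Audit rule: WebP extension, exact dimensions, and size within the limit."""
    return (
        filename.lower().endswith(".webp")
        and (width, height) == (TARGET_W, TARGET_H)
        and size_bytes <= MAX_BYTES
    )


if __name__ == "__main__":
    print(cover_crop_box(1024, 1024))   # (0, 224, 1024, 800)
    print(cover_crop_box(2400, 1000))   # (311, 0, 2088, 1000)
    print(passes_hero_guard("ornek.webp", 1200, 675, 350_000))  # True
```

Cropping before resizing means a square or ultra-wide provider output never gets stretched; it only loses its margins.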
Source/RSS/OG images from publishers are intentionally blocked for Equinox Haber hero use because they create licensing and editorial reuse risk.
Requirements:
- Python 3.12+
- Node.js 24+
- npm
python3 -m venv news_pipeline/.venv
source news_pipeline/.venv/bin/activate
pip install -e "news_pipeline[test]"
npm install
# Run the Astro dev server
npm run dev
# Build the static site
npm run build
# Run provider-free local quality checks
npm run quality
# Compile Python pipeline
python -m compileall news_pipeline/news_pipeline
# Run pipeline unit tests
news_pipeline/.venv/bin/python -m pytest news_pipeline/tests
# Explore the synthetic demo dataset (dry-run walkthrough; no providers/push)
news-pipeline demo seed --force
news-pipeline demo walkthrough
Useful npm wrappers:
npm run news:prepare
npm run news:publish-one
npm run news:audit
npm run test:pipeline
Related documentation:
- news_pipeline/README.md: pipeline CLI and workflow details
- news_pipeline/OPERATIONS.md: operational runbook
- news_pipeline/HEARTBEAT_RUNBOOK.md: heartbeat cycle notes
- news_pipeline/AUTONOMOUS_PUBLISH_POLICY.md: autonomy boundaries
- news_pipeline/news_pipeline/demo/synthetic/README.md: tiny provider-free demo dataset
- CONTRIBUTING.md: contribution guidelines
- ROADMAP.md: maintainability roadmap
- SECURITY.md: security reporting and scope
This is a working public project, not a polished framework package. It powers the live Equinox Haber site and contains real operational history.
The current maintenance priority is to keep the production surface boringly reliable:
- keep the category tabs, feeds, sitemap, and transparency pages current;
- preserve the Asteria editorial handoff instead of sliding into blind autopublish;
- keep generated images small and locally auditable;
- maintain provider-free CI checks;
- document changes as the site and pipeline evolve;
- keep this README as the top-level map whenever routes, feeds, pipeline gates, or release policy change.
- Code and workflow tooling: MIT License
- Content, images, media, and brand layer: CONTENT_LICENSE.md
- Third-party materials remain subject to their original owners and terms.