sametbasbug/haber.sametbasbug.dev

Anlık Haber

Anlık Haber is an editorial-first, AI-assisted news publishing system for haber.sametbasbug.dev.

The project explores a practical question: can a small public news surface be maintained with a transparent pipeline where software handles ingestion, deduplication, queueing, audits, build/deploy discipline, and image generation, while an editorial agent/human-in-the-loop keeps the final news judgment?

The live site is in Turkish and covers global news. The code and pipeline are documented in English and Turkish because the reusable part is the workflow, not only the content.

What is open source here?

This repository is intentionally split into two layers:

  • Source code and workflow tooling are MIT licensed. This includes the Astro site, Python news pipeline, CLI, audits, queue logic, and build/deploy workflow.
  • Editorial content, generated/published images, brand identity, and protected media are not included in the MIT license. See CONTENT_LICENSE.md.

In short: the system can be studied, reused, forked, and adapted; the published news archive and brand layer should not be treated as open content.

Why this repository exists

Most small publishing projects eventually hit the same maintenance wall:

  • collecting sources is easy, but selecting responsibly is hard;
  • automation is fast, but blind autopublish is risky;
  • duplicate stories and repeated angles quietly lower quality;
  • AI-generated drafts can help, but they must not become the editor;
  • publishing needs boring guardrails: audits, build checks, narrow commits, and rollback-friendly history.

Anlık Haber is a working experiment around those constraints.

Why this matters for OSS maintainers

Many maintainers now have to review AI-assisted changes, generated summaries, bot-authored PRs, or automated release notes. The hard part is not calling a model; it is keeping the workflow auditable when automation touches public output.

This repo is useful as a small, concrete reference for:

  • separating ingestion/automation from final editorial authority;
  • keeping provider-dependent steps out of CI;
  • writing tests for stale-source, duplicate, manual-review, and leak-prevention gates;
  • documenting where automation must stop and a human/maintainer must decide;
  • preserving a clear license boundary between reusable tooling and protected published content.

Architecture

The repository has two main parts:

  • Astro publishing surface: static news pages, RSS, sitemap, category UI, author/site shell, and GitHub Pages deployment.
  • Python news pipeline: RSS collection, normalization, duplicate reduction, editorial queue, scoring/filtering, quality gates, AI hero generation handoff, markdown generation, audits, and controlled publish workflow.

haber-project/
  src/
    components/news/          # News UI components
    content/anlikHaber/       # Published markdown news items
    pages/                    # Astro routes, RSS, sitemap
  public/images/generated/    # Generated hero images
  news_pipeline/
    news_pipeline/
      collectors/             # RSS/source ingestion
      normalize/              # Raw item cleanup
      dedupe/                 # Similarity and duplicate checks
      editorial/              # Filtering, scoring, autonomy gates
      queue/                  # Editorial queue state
      publish/                # Markdown/frontmatter/hero helpers
      cli/                    # Typer CLI commands
    data/                     # Runtime data, ignored except placeholders/docs
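
One way to picture how an item moves through these packages is as a small state machine. The sketch below is illustrative only: the state names and the transition table are assumptions made for this example, not the actual contents of the queue/ module.

```python
from enum import Enum


class QueueState(Enum):
    """Hypothetical lifecycle states for one news item (not the real schema)."""
    COLLECTED = "collected"
    NORMALIZED = "normalized"
    QUEUED = "queued"
    MANUAL_REVIEW = "manual_review"
    POLISHED = "polished"
    PUBLISHED = "published"
    REJECTED = "rejected"


S = QueueState

# Assumed transition table: an item can only be published from POLISHED,
# so nothing skips the editorial step.
ALLOWED_TRANSITIONS = {
    S.COLLECTED: {S.NORMALIZED, S.REJECTED},
    S.NORMALIZED: {S.QUEUED, S.REJECTED},  # e.g. rejected as a duplicate
    S.QUEUED: {S.POLISHED, S.MANUAL_REVIEW, S.REJECTED},
    S.MANUAL_REVIEW: {S.QUEUED, S.REJECTED},
    S.POLISHED: {S.PUBLISHED, S.REJECTED},
    S.PUBLISHED: set(),
    S.REJECTED: set(),
}


def advance(current: QueueState, target: QueueState) -> QueueState:
    """Return the new state, or raise if the transition is not allowed."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target
```

With a table like this, an attempt to move an item straight from the queue to published fails loudly instead of silently bypassing the editorial step.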

Editorial model

The project is editorial-first.

Python is not the editor. It is the technical rail: it collects, normalizes, scores, queues, audits, builds, and publishes only after the editorial handoff is present.

Asteria AI is the narrow editorial agent used in this project. Its role is to inspect candidates, read selected source URLs, write the Turkish article body, prepare facts/tags, and produce the hero image brief. The pipeline then validates and carries that work into the Astro site.

Current category set:

  • Siyaset
  • Ekonomi
  • Teknoloji
  • Bilim

Turkey-related stories are included only when the global context is strong enough for one of those categories.
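
The idea that the technical rail publishes only when the editorial handoff is present can be sketched as a simple completeness check. This is a minimal illustration, not the pipeline's real code: the `EditorialHandoff` class, its field names, and `handoff_problems` are hypothetical; only the four category names come from the list above.

```python
from dataclasses import dataclass, field

# Category names are taken from the list above; everything else is hypothetical.
ALLOWED_CATEGORIES = {"Siyaset", "Ekonomi", "Teknoloji", "Bilim"}


@dataclass
class EditorialHandoff:
    """Assumed shape of what the editorial agent hands to the pipeline."""
    title: str = ""
    body: str = ""
    category: str = ""
    facts: list[str] = field(default_factory=list)
    tags: list[str] = field(default_factory=list)
    hero_brief: str = ""


def handoff_problems(handoff: EditorialHandoff) -> list[str]:
    """Return the reasons an item must not be published yet (empty if none)."""
    problems = []
    for name in ("title", "body", "hero_brief"):
        if not getattr(handoff, name).strip():
            problems.append(f"missing {name}")
    if not handoff.facts:
        problems.append("missing facts")
    if not handoff.tags:
        problems.append("missing tags")
    if handoff.category not in ALLOWED_CATEGORIES:
        problems.append(f"unknown category: {handoff.category!r}")
    return problems
```

A publish step built this way would refuse to continue whenever `handoff_problems(...)` returns a non-empty list, which keeps the decision about content with the editor and leaves Python to enforce only that the decision was made.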

Core workflow

# Collect source items
news-pipeline collect

# Normalize raw items and update the queue
news-pipeline process

# Prepare a board for editorial selection
news-pipeline heartbeat prepare-one --json

# After editorial review/polish, publish through the technical rail
news-pipeline heartbeat publish-one --execute --no-collect --json

# Local quality gates
news-pipeline audit-content
news-pipeline audit-images
npm run build

The direct publish path is intentionally disabled in public CLI help. Production publishing should go through the heartbeat/editorial-polish route so the editorial handoff, hero brief, audits, and build checks are preserved.
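
For local use, the three quality-gate commands listed above can be chained so that the first failure stops the run. The following is a sketch under stated assumptions: the command lines are copied from this README, while `run_command` and `first_failing_gate` are hypothetical helper names and not part of the news-pipeline CLI.

```python
import subprocess
from typing import Callable, Optional, Sequence

# Commands copied from the "Local quality gates" step above, in order.
QUALITY_GATES = (
    ("news-pipeline", "audit-content"),
    ("news-pipeline", "audit-images"),
    ("npm", "run", "build"),
)


def run_command(cmd: Sequence[str]) -> int:
    """Run one command and return its exit code (0 means success)."""
    return subprocess.run(list(cmd), check=False).returncode


def first_failing_gate(
    gates: Sequence[Sequence[str]] = QUALITY_GATES,
    runner: Callable[[Sequence[str]], int] = run_command,
) -> Optional[Sequence[str]]:
    """Run gates in order; return the first failing command, or None."""
    for cmd in gates:
        if runner(cmd) != 0:
            return cmd  # stop here so later gates do not mask the failure
    return None
```

Calling `first_failing_gate()` from the repository root, with the virtual environment active, would run the real commands. Passing a custom `runner` makes the ordering logic testable without invoking the CLI or npm.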

Quality and safety gates

The pipeline is designed to reduce common publishing failure modes:

  • source age checks;
  • URL/title/description/topic duplicate guards;
  • manual-review separation for sensitive legal/political/personal-risk stories;
  • required editorial polish before production publish;
  • Turkish title/body/fact checks;
  • required hero prompt and alt text;
  • AI hero generation with WebP optimization;
  • image/content audits before build;
  • Astro build before deploy;
  • narrow commit/push scope for published items.
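
To make the first two gates concrete, here is a minimal sketch of a source age check and a URL/title duplicate guard. It is not the pipeline's actual implementation: the 48-hour limit, the 0.8 similarity threshold, the token-overlap measure, and all function names are assumptions chosen for illustration.

```python
import re
from datetime import datetime, timedelta
from urllib.parse import urlsplit, urlunsplit

MAX_AGE = timedelta(hours=48)  # assumed limit, not the project's real value


def is_fresh(published_at: datetime, now: datetime, max_age: timedelta = MAX_AGE) -> bool:
    """Source age check: accept items published within max_age of now."""
    return now - published_at <= max_age


def normalize_url(url: str) -> str:
    """Lowercase scheme/host, drop query and fragment, strip trailing slash.

    Dropping the whole query string is a simplification; sites that identify
    articles by query parameter would need a narrower rule.
    """
    parts = urlsplit(url.strip())
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(),
                       parts.path.rstrip("/"), "", ""))


def title_similarity(a: str, b: str) -> float:
    """Jaccard overlap of case-folded word sets, between 0.0 and 1.0."""
    ta = set(re.findall(r"\w+", a.casefold()))
    tb = set(re.findall(r"\w+", b.casefold()))
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def is_duplicate(url: str, title: str, seen: list[tuple[str, str]],
                 threshold: float = 0.8) -> bool:
    """True if the URL matches, or the title is near-identical to, a seen item."""
    key = normalize_url(url)
    for seen_url, seen_title in seen:
        if normalize_url(seen_url) == key:
            return True
        if title_similarity(title, seen_title) >= threshold:
            return True
    return False
```

Word overlap only catches near-identical headlines. Catching repeated angles on the same story, as the description/topic guards imply, would need a richer similarity measure than this sketch provides.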

CI runs provider-free checks only: Python compile, pipeline tests with pytest, pipeline audits, and Astro build. It does not call external source collection, AI providers, or production publish commands.

Installation

Requirements:

  • Python 3.12+
  • Node.js 24+
  • npm

python3 -m venv news_pipeline/.venv
source news_pipeline/.venv/bin/activate
pip install -e "news_pipeline[test]"
npm install

Development commands

# Run the Astro dev server
npm run dev

# Build the static site
npm run build

# Run provider-free local quality checks
npm run quality

# Compile Python pipeline
python -m compileall news_pipeline/news_pipeline

# Run pipeline unit tests
news_pipeline/.venv/bin/python -m pytest news_pipeline/tests

# Explore the synthetic demo dataset (dry-run walkthrough; no providers/push)
news-pipeline demo seed --force
news-pipeline demo walkthrough

Documentation

Current status

This is a working public project, not a polished framework package. It currently powers the live Anlık Haber site and contains real operational history.

The near-term OSS goal is to make the reusable parts easier for others to study and adapt:

  • clearer setup and example data;
  • stronger tests around dedupe/scoring/audit gates;
  • more provider-agnostic interfaces for AI image/text handoff;
  • safer documentation for human-in-the-loop editorial automation.

License

  • Code and workflow tooling: MIT License
  • Content, images, media, and brand layer: CONTENT_LICENSE.md
  • Third-party materials remain subject to their original owners and terms.
