
Show HN: Cordon – Reduce large log files to anomalous sections
12 points | 0 comments | 12 hours ago | by calebevans
Cordon uses transformer embeddings and density scoring to identify what's semantically unique in log files, filtering out repetitive noise.

The core insight: a critical error repeated 1000x is "normal" (semantically dense). A strange one-off event is anomalous (semantically isolated).
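That insight can be sketched with a k-nearest-neighbour density score over line embeddings: lines whose embeddings sit close to many others score high (dense, "normal"), while isolated one-offs score low. This is a minimal numpy illustration of the general technique, not Cordon's actual code; in practice the vectors would come from a transformer sentence encoder rather than being hand-made.

```python
import numpy as np

def density_scores(embeddings: np.ndarray, k: int = 3) -> np.ndarray:
    """Score each log line by the mean cosine similarity to its k nearest
    neighbours. High score = semantically dense ("normal", even if it's an
    error repeated 1000x); low score = semantically isolated (anomalous)."""
    # Normalise rows so dot products are cosine similarities.
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = normed @ normed.T
    np.fill_diagonal(sims, -np.inf)  # exclude self-similarity
    # Mean similarity to the k most similar *other* lines.
    topk = np.sort(sims, axis=1)[:, -k:]
    return topk.mean(axis=1)

# Toy example: five near-identical lines plus one unrelated one-off.
emb = np.array([[1.0, 0.0]] * 5 + [[0.0, 1.0]])
scores = density_scores(emb, k=3)
# The repeated lines score ~1.0; the one-off scores ~0.0 and is flagged.
```

The repeated lines reinforce each other's density, so a critical error spammed thousands of times is treated as background, exactly the trade-off described above.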

Outputs XML-tagged blocks with anomaly scores. Designed to shrink large logs as a pre-processing step before LLM analysis.

Architecture: https://github.com/calebevans/cordon/blob/main/docs/architec...

Benchmark: https://github.com/calebevans/cordon/blob/main/benchmark/res...

Trade-offs: it intentionally ignores repetitive patterns (even repeated errors), and its percentile-based thresholds are relative to each file rather than absolute, so it always surfaces the least-dense fraction of the input.
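A percentile-based cutoff like the one described can be sketched in a few lines; again, this is an illustrative stand-in, not Cordon's API, and the 10% default is an assumed value.

```python
import numpy as np

def select_anomalies(scores, percentile: float = 10.0):
    """Return indices whose density score falls at or below the given
    percentile. The cutoff is relative: it flags the least-dense slice of
    *this* file, regardless of the absolute score values."""
    cutoff = np.percentile(scores, percentile)
    return [i for i, s in enumerate(scores) if s <= cutoff]

# Uniformly spread scores: the bottom ~10% are selected.
flagged = select_anomalies(np.arange(100.0), percentile=10.0)
```

A relative threshold means a uniformly noisy file still yields a "top anomalies" slice, which is the stated trade-off: useful for triage, but not an absolute anomaly detector.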
