7 Ways to Extract Signal

You have 6,792 articles, 2,440 entities, 4,007 insights, and 2,215 relationships. The daily digest works. The raw entity list doesn't. Here are seven ideas for turning your data into decisions — ranked by value-to-effort.

Read: 96 (1.4%) | Skim: 639 (9.4%) | Save: 631 (9.3%) | Skip: 5,426 (79.9%)
Key finding: 80% of your articles are skipped. Only 1.4% get a "read" action (avg score 8.2). Half your feeds produce zero signal. Your data is heavily Anthropic-centric: Anthropic is the most-mentioned of your 2,440 entities at 486 mentions, driven by the Google News "Anthropic" feed (885 articles, 13% of total pipeline). This is a coverage bias, not a market signal.
CONCEPT 01
📈

Entity Momentum Board

Bloomberg terminal for your tech landscape. Anthropic at 486 mentions (+834% 7d), Google surging +2300%, 5 entities brand new this week. See what's accelerating before it trends.

Medium effort | High value | Uses existing data
CONCEPT 02
📡

Weak Signal Radar

14 entities appeared for the first time this week with 2+ mentions. Broadcom (65 mentions), CoreWeave (52), Claude Mythos (35) — all brand new in your graph. The periphery is where alpha lives.

Medium effort | Highest alpha | first_seen exists in schema
CONCEPT 03
📊

Feed ROI Analyzer

VentureBeat delivers 62% signal (A+). Seeking Alpha delivers 4% (F). 7 feeds produce literally 0% read+skim. You're burning API quota on pure noise.

Low effort | Immediate value | Uses existing data
CONCEPT 04
🎬

Narrative Arcs

Anthropic went from "3.5GW TPU deal" to "revenue could triple" to "limiting Mythos to enterprise security partners" in one week. Track the narrative arc over time.

High effort | Unique insight | New LLM pass needed
CONCEPT 05
🔍

Intelligence Diff

14 new entities this week. "Secured 3.5GW of TPU capacity" reinforced at 0.9 confidence. OpenClaw is the only decliner (-6%). The delta is the signal.

Low effort | High value | Uses existing data
CONCEPT 06
🗺

Coverage Heatmap

Topic 1 (AI/LLM) has 67 reads, 691 total. Topic 0 is 618 pure skip — uncategorized junk. Topics 6-13 are nearly invisible. Where are your blind spots?

Low effort | Feed planning | Uses existing data
CONCEPT 07

Contrarian Detector

51 contradicts relationships in your graph. Anthropic <-> OpenAI competing across 47 co-mentions. When your sources disagree, something interesting is happening.

High effort | Highest alpha potential | Needs sentiment extraction on insights
Concept 01

Entity Momentum Board

Think Bloomberg terminal for your tech landscape. Every entity gets a ticker row with a 7-day sparkline, momentum score, velocity arrow, and one-line signal summary. Sort by momentum to see what's accelerating before it hits your digest as a spike alert. This replaces the useless entity list with something you'd actually check daily.

Medium effort | Replaces entity tab | All data exists in D1
stillhouse-dashboard / entities
Sort: Momentum | Mentions | New | Declining    (2,440 entities tracked)
Entity (type) | 7d mentions | Velocity | Signal
Google (company) | 96 | +2300% | TPU capacity deal with Anthropic. 68 co-mentions with Anthropic. Broadcom partnership.
Meta (company) | 42 | +2000% | From 2 to 42 mentions. AI infrastructure investment narrative.
Microsoft (company) | 35 | +1067% | Cloud infrastructure economics. AI partnership moves.
AI Infrastructure (theme) | 77 | +1000% | Hidden theme connecting Anthropic, Google, TPU, CoreWeave. Economics reshaping.
MCP (technology) | 11 | +1000% | Model Context Protocol. Low absolute volume but explosive growth.
Platform Eng. (theme) | 30 | +900% | 3 to 30 mentions. Enterprise platform stories converging with AI agents theme.
Anthropic (company) | 439 | +834% | 486 total. 82 co-mentions with Claude. TPU deal, revenue tripling, Mythos launch.
Broadcom (company) | 65 | NEW | Brand new. Stock +13% on Anthropic deal. 62 co-mentions with Anthropic. First seen Apr 6.
CoreWeave (company) | 52 | NEW | Brand new Apr 9. 40 co-mentions with Anthropic. AI infrastructure pure-play.
Kubernetes (technology) | 9 | +350% | Infrastructure plumbing. Resurgence tied to AI workload orchestration.
OpenClaw (product) | 16 | -6% | Only declining entity in top 20. 17 to 16 mentions. Losing narrative momentum.
Key insight: Anthropic dominates at 486 mentions vs #2 Google at 100. This isn't a trend — it's a coverage bias from your feed selection. The Google News "Anthropic" search feed alone generates 885 articles. AI Infrastructure (77 mentions) is the hidden connective thread between your top stories: Anthropic, Google, TPU, CoreWeave, Broadcom.
Where it lives: Replaces the current entity tab in the dashboard. New compiler endpoint GET /momentum?sort=velocity&limit=50 returns pre-computed momentum data. Sparkline data from a new entity_daily_mentions materialized table (daily cron aggregation).
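The velocity figures in the table are consistent with a plain week-over-week percent change: Anthropic going from 47 to 439 mentions gives +834%. A minimal sketch of that arithmetic follows. The function names (`velocity_pct`, `momentum_rows`) are hypothetical, and treating an entity with no prior-week mentions as NEW (and sorting it first) is an assumption, not something the endpoint spec defines.

```python
from typing import Optional


def velocity_pct(prev_7d: int, cur_7d: int) -> Optional[int]:
    """Week-over-week change in percent, or None when there is no baseline."""
    if prev_7d == 0:
        return None  # no prior mentions: display "NEW" instead of a percentage
    return round((cur_7d - prev_7d) / prev_7d * 100)


def momentum_rows(counts: dict[str, tuple[int, int]]) -> list[tuple[str, int, str]]:
    """Turn {entity: (prev_7d, cur_7d)} into (entity, cur_7d, label) rows.

    Rows are sorted by velocity, highest first; NEW entities sort to the top
    (an assumed ordering).
    """
    keyed = []
    for name, (prev, cur) in counts.items():
        v = velocity_pct(prev, cur)
        label = "NEW" if v is None else f"{v:+d}%"
        sort_key = float("inf") if v is None else float(v)
        keyed.append((sort_key, name, cur, label))
    keyed.sort(key=lambda r: r[0], reverse=True)
    return [(name, cur, label) for _, name, cur, label in keyed]


if __name__ == "__main__":
    sample = {
        "Anthropic": (47, 439),
        "Google": (4, 96),
        "OpenClaw": (17, 16),
        "Broadcom": (0, 65),
    }
    for row in momentum_rows(sample):
        print(row)
```

The same function could run inside the daily cron that fills entity_daily_mentions, so the endpoint only reads pre-computed rows.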
Concept 02

Weak Signal Radar

The most valuable signal is what just appeared. Not the entity with 486 mentions — the one that appeared for the first time this week with 15+ mentions across multiple sources. A radar visualization shows entities by age and momentum: the outer ring is brand new, the inner ring is established. 14 entities appeared this week with 2+ mentions. The periphery is where alpha lives.

Medium effort | Early warning system | first_seen exists in schema
stillhouse-dashboard / radar
Rings: First seen < 72h | First seen 3-7d | Established > 7d
Radar labels: NASA, Artemis II, Orion spacecraft, Muse Spark, Broadcom, CoreWeave, Claude Mythos, Enterprise AI, Project Glasswing, TPU, Anthropic, OpenAI, Claude
Newly Detected (72h)
NASA
16 mentions / first seen Apr 10
Type: company / Related: Artemis II, Orion
Space/tech crossover story
Artemis II
15 mentions / first seen Apr 10
Type: product / Related: NASA, Orion spacecraft
Appeared same day as NASA — linked story
Orion spacecraft
9 mentions / first seen Apr 11
Type: technology / Related: NASA, Artemis II
Newest entity in the cluster
Accelerating This Week
Broadcom
65 mentions / first seen Apr 6
Stock +13% on Anthropic deal. 62 co-mentions.
CoreWeave
52 mentions / first seen Apr 9
AI infrastructure pure-play. 40 co-mentions with Anthropic.
Claude Mythos
35 mentions / first seen Apr 7
New product. Limited to enterprise security partners.
The alpha: 14 entities appeared for the first time this week with 2+ mentions. NASA and Artemis II appeared together on Apr 10 — a space/tech crossover story that's unusual for your AI-heavy feed mix. Broadcom and CoreWeave are the biggest new signals by volume: both closely tied to Anthropic's infrastructure narrative. Cloud Infrastructure (13 mentions, first seen Apr 7) is a meta-theme connecting many of these new arrivals.
Delivery options: This could be a new dashboard tab, or a weekly Telegram message ("14 new entities detected this week: Broadcom (65), CoreWeave (52), Claude Mythos (35)..."). The radar viz is great for dashboard; the weekly summary is great for Telegram. Both use the same underlying query on first_seen.
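The shared first_seen query mentioned above could look roughly like this. It is a sketch using Python's built-in sqlite3 (D1 is SQLite-based, so the SQL should carry over). The table name `entities` and the columns `name` and `mention_count` are assumptions; only `first_seen` is confirmed by the text. The ring thresholds follow the radar legend, with "under 72h" approximated as up to 3 calendar days because the sketch assumes dates are stored without a time component.

```python
import sqlite3
from datetime import date


def ring_for(first_seen: date, today: date) -> str:
    """Map an entity's age to a radar ring (thresholds from the legend)."""
    age_days = (today - first_seen).days
    if age_days <= 3:  # day-granular stand-in for "< 72h" (assumption)
        return "new (<72h)"
    if age_days <= 7:
        return "recent (3-7d)"
    return "established (>7d)"


def weak_signals(conn: sqlite3.Connection, today: date, min_mentions: int = 2):
    """Entities first seen in the last 7 days with at least min_mentions."""
    rows = conn.execute(
        """
        SELECT name, mention_count, first_seen
        FROM entities                      -- hypothetical table/column names
        WHERE first_seen >= date(?, '-7 days')
          AND mention_count >= ?
        ORDER BY mention_count DESC
        """,
        (today.isoformat(), min_mentions),
    ).fetchall()
    return [
        (name, count, ring_for(date.fromisoformat(seen), today))
        for name, count, seen in rows
    ]
```

Calling `weak_signals(conn, date.today())` would feed both the radar view and the weekly Telegram summary; only the rendering differs.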
Concept 03

Feed ROI Analyzer

You're about to add more feeds. Before you do, know which ones earn their keep. Every feed gets a stacked bar showing what % of its articles you'd read, skim, save, or skip. A feed that's 95% skip is noise. A feed that's 62% read+skim is gold. Use this to prune dead weight and find gaps where you need more coverage.

Low effort | Immediate feed planning value | All data exists: just GROUP BY feed_id
stillhouse-dashboard / feed-health
Period: All time | Articles: 6,792 | Legend: Read, Skim, Save, Skip
Feed | Grade | Articles
VentureBeat | A+ | 24
MarkTechPost | A | 30
InfoQ | A | 42
The New Stack | A | 30
DevOps.com | B | 24
r/artificial | B | 218
GNews: "Anthropic" | C | 885
HN score>50 | D | 399
Product Hunt | D | 256
Forbes Innovation | D | 416
Seeking Alpha | F | 523
TechCrunch | F | 121
FlowingData | F | 11
Dean Blundell | F | 32
MIT Tech Review | F | 18
Your biggest volume sink: Google News "Anthropic" search generates 885 articles (13% of all articles) but only 17% signal. It's your biggest volume source AND your biggest noise source. Meanwhile, 7 feeds produce literally 0% read+skim rate: FlowingData, Dean Blundell, Philip Greenspun, The Economist Finance, Private Equity International, r/private_equity, MIT Tech Review. They're pure noise burning API quota, LLM scoring tokens, and D1 storage.
Volume vs. Precision Tradeoff
r/artificial (218 articles, B grade, 31% signal) delivers ~68 useful articles in absolute terms. VentureBeat (24 articles, A+ grade, 62% signal) delivers ~15 useful articles. The high-volume B-grade feed delivers roughly 4.5x the absolute signal of the low-volume A+ feed. Show both views: % quality AND absolute signal yield.
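Both views (signal rate and absolute yield) can come out of a single GROUP BY. The sketch below assumes an `articles` table with a `feed_id` column (named in the text) and an `action` column holding lowercase values 'read', 'skim', 'save', 'skip' (an assumption about how actions are stored). "Signal" is counted as read + skim, matching the definition used above.

```python
import sqlite3

# One row per feed: volume, absolute signal yield, and signal rate.
FEED_ROI_SQL = """
SELECT
    feed_id,
    COUNT(*)                                   AS total,
    SUM(action IN ('read', 'skim'))            AS signal_count,
    ROUND(100.0 * SUM(action IN ('read', 'skim')) / COUNT(*), 1) AS signal_pct
FROM articles
GROUP BY feed_id
ORDER BY signal_pct DESC, total DESC
"""


def feed_roi(conn: sqlite3.Connection):
    """Return (feed_id, total, signal_count, signal_pct) tuples, best rate first."""
    return conn.execute(FEED_ROI_SQL).fetchall()


def zero_signal_feeds(rows):
    """Feeds whose read+skim count is zero: candidates for dropping."""
    return [feed_id for feed_id, _total, signal_count, _pct in rows if signal_count == 0]
```

Sorting the same rows by `signal_count` instead of `signal_pct` gives the absolute-yield view, where a high-volume B-grade feed can outrank a low-volume A+ feed.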
Recommended Actions
Drop: 7 zero-signal feeds (save ~130 articles/month of processing).
Investigate: GNews Anthropic feed — 885 articles at 17% signal. Consider filtering or reducing frequency.
Keep: VentureBeat, MarkTechPost, InfoQ, The New Stack — your A-tier signal sources.
Concept 04

Narrative Arcs

Entities don't just have mention counts — they have stories. Anthropic went from "TPU capacity deal" to "revenue tripling" to "launching Mythos" to "limiting access to enterprise security" in one week. Tracking the framing shift over time tells you more than raw volume. This is where multi-year retention becomes a superpower.

High effort | Unique multi-year insight | Needs weekly LLM summarization
stillhouse / entity / anthropic — narrative arc
Anthropic (company)
486 mentions / 82 co-mentions with Claude / 68 with Google / 62 with Broadcom
Date | Phase | Metric | Summary
Apr 6 | Growth | 47 prev_7d | Secured 3.5GW of TPU capacity from Google/Broadcom starting 2027. Revenue tripling narrative.
Apr 7 | Peak | +439 7d | Claude Mythos + Project Glasswing announced. Enterprise security focus. Staggering demand from AWS.
Apr 9 | Expansion | +834% vel. | CoreWeave partnership emerges. Custom silicon reshaping cloud economics. Stock rose 13% on Broadcom deal.
Apr 10 | Scrutiny | 82 w/ Claude | Competitive pressure: OpenAI (47 co-mentions). Faces threat from model providers moving up the stack.
Apr 11 | Maturing | 486 total | CyberGym 83.1% benchmark. Limiting Mythos to enterprise security partners. European AI infrastructure diverging.
Apr 13 | Dominance | 439 this wk | Most rapid growth in American corporate history. AI Infrastructure Economics emerging as dominant theme.
The multi-year play: With 12+ months of data, you'll see patterns across entities. Anthropic's arc this week alone shows: infrastructure investment → product launch → partnership expansion → competitive pressure → enterprise maturation. Pattern recognition across arcs is the real insight — does every major AI company follow this same trajectory?
Implementation: Weekly cron generates a 1-2 sentence summary per entity per week using the entity's insight claims + article titles from that period. Stored in a new entity_arc_weeks table. LLM also classifies phase: emergence / growth / peak / scrutiny / decline / recovery. The timeline renders directly from these rows — no real-time LLM needed.
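One possible shape for the entity_arc_weeks table, sketched with Python's sqlite3. Only the table name and the six phase labels come from the text; the column names and the one-row-per-entity-per-week primary key are assumptions. The upsert lets the weekly cron re-run safely (it relies on SQLite's ON CONFLICT clause, available since SQLite 3.24).

```python
import sqlite3

# Phase labels taken from the implementation note above.
PHASES = ("emergence", "growth", "peak", "scrutiny", "decline", "recovery")
_PHASE_LIST = ", ".join(f"'{p}'" for p in PHASES)

# Hypothetical columns; one row per entity per week.
SCHEMA = f"""
CREATE TABLE IF NOT EXISTS entity_arc_weeks (
    entity_id  INTEGER NOT NULL,
    week_start TEXT    NOT NULL,  -- ISO date of the first day of the week
    summary    TEXT    NOT NULL,  -- 1-2 sentence LLM summary
    phase      TEXT    NOT NULL CHECK (phase IN ({_PHASE_LIST})),
    PRIMARY KEY (entity_id, week_start)
)
"""


def upsert_arc_week(conn, entity_id, week_start, summary, phase):
    """Insert or overwrite the arc row for one entity and week."""
    conn.execute(
        """
        INSERT INTO entity_arc_weeks (entity_id, week_start, summary, phase)
        VALUES (?, ?, ?, ?)
        ON CONFLICT (entity_id, week_start)
        DO UPDATE SET summary = excluded.summary, phase = excluded.phase
        """,
        (entity_id, week_start, summary, phase),
    )


def timeline(conn, entity_id):
    """Rows the dashboard would render, oldest week first."""
    return conn.execute(
        "SELECT week_start, phase, summary FROM entity_arc_weeks "
        "WHERE entity_id = ? ORDER BY week_start",
        (entity_id,),
    ).fetchall()
```

The CHECK constraint rejects any phase label outside the six listed, so a drifting LLM classification fails loudly at write time instead of leaking into the timeline.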
Concept 05

Intelligence Diff

Your digest tells you what happened. This tells you what changed. 14 new entities appeared this week. An insight about TPU capacity got reinforced at 0.9 confidence. OpenClaw is the only entity declining in your top 20. The delta between weeks is the signal. Think git diff for your knowledge graph.

Low effort | High-value weekly delivery | Pure D1 queries, no new extraction
stillhouse / intelligence diff — week of Apr 7
Compared: Apr 7-13 vs Mar 31-Apr 6 | +14 new entities (2+ mentions) | 1 declining (OpenClaw) | 5 entities brand NEW | 10 insights reinforced at 0.8+
New Entities This Week +14
+ Broadcom (company), 65 mentions
First seen Apr 6. Stock +13% on Anthropic TPU deal. 62 co-mentions with Anthropic, 59 with Google.
+ CoreWeave (company), 52 mentions
First seen Apr 9. AI infrastructure pure-play. 40 co-mentions with Anthropic. Massive entry.
+ Claude Mythos (product), 35 mentions
First seen Apr 7. Security-focused product. Limited to enterprise security partners. 34 co-mentions with Anthropic.
+ NASA (company), 16 mentions | Artemis II (product), 15 mentions
Both first seen Apr 10. Space/tech crossover. Orion spacecraft (9 mentions) followed on Apr 11. Unusual cluster for your AI-heavy feeds.
Declining 1
OpenClaw product
Only declining entity in top 20. Went from 17 to 16 mentions (-6%). Everything else is growing or brand new.
Massive Velocity Shifts 5
Google: 4 → 96 mentions (+2300%)
TPU capacity deal, Broadcom partnership, AI infrastructure investment. 68 co-mentions with Anthropic.
Meta: 2 → 42 mentions (+2000%)
From near-invisible to major presence. AI infrastructure investment narrative driving coverage.
Microsoft: 3 → 35 mentions (+1067%)
Cloud infrastructure economics. Partnership and competitive dynamics with AI companies.
AI Infrastructure: 7 → 77 mentions (+1000%)
The hidden connective theme. Appears in context with Anthropic, Google, TPU, CoreWeave, Broadcom. This is the thread linking your top stories.
Platform Engineering: 3 → 30 mentions (+900%)
Converging with AI agents theme. Enterprise platform stories dominating.
Insights Reinforced (0.8+ confidence) 10
"Secured 3.5GW of TPU capacity from Google/Broadcom starting in 2027" confidence: 0.9
High-confidence infrastructure claim. Reinforced across multiple sources.
"Launching a product with AWS due to 'staggering' enterprise demand" confidence: 0.9
Enterprise adoption signal. Multiple source confirmation.
"European enterprises adopting distinct AI infrastructure strategies" confidence: 0.9
Geo-divergence signal. Europe vs US AI infrastructure paths splitting.
"Revenue could easily triple this year" confidence: 0.8
Growth claim. Multiple analyst sources converging.
"Achieved 83.1% on CyberGym benchmark for vulnerability analysis" confidence: 0.9
Security capability claim. Specific benchmark result.
"Limiting Claude Mythos access to enterprise security partners" confidence: 0.9
Distribution strategy signal. Gated rollout to security-focused enterprises.
Delivery: This is a natural addition to the weekly organizer run or as a new Telegram message. The "Insights Reinforced" section is especially powerful — it uses the existing last_reinforced timestamp and confidence scoring that's already in your insights table. You're sitting on a claim-verification engine and not surfacing it. The velocity shifts section surfaces what raw mention counts hide — Google went from invisible to dominant in one week.
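The entity half of the diff reduces to comparing two weekly mention-count snapshots. The sketch below is one way to do that in plain Python. The function name `diff_weeks` and the snapshot shape are hypothetical, and "new" is approximated here as "absent from the previous week"; a production query would more likely filter on first_seen.

```python
def diff_weeks(prev: dict[str, int], cur: dict[str, int], min_mentions: int = 2):
    """Compare {entity: mentions} snapshots for two consecutive weeks.

    Returns a dict with:
      new       -> [(entity, cur)] absent last week, at least min_mentions now
      declining -> [(entity, prev, cur)] with fewer mentions than last week
      rising    -> [(entity, prev, cur)] sorted by relative growth, largest first
    """
    new = sorted(
        ((name, count) for name, count in cur.items()
         if prev.get(name, 0) == 0 and count >= min_mentions),
        key=lambda item: item[1],
        reverse=True,
    )
    declining = sorted(
        (name, before, cur.get(name, 0))
        for name, before in prev.items()
        if before > 0 and cur.get(name, 0) < before
    )
    rising = sorted(
        ((name, before, cur[name]) for name, before in prev.items()
         if before > 0 and cur.get(name, 0) > before),
        key=lambda item: (item[2] - item[1]) / item[1],
        reverse=True,
    )
    return {"new": new, "declining": declining, "rising": rising}


if __name__ == "__main__":
    last_week = {"Google": 4, "OpenClaw": 17, "Anthropic": 47}
    this_week = {"Google": 96, "OpenClaw": 16, "Anthropic": 439, "Broadcom": 65}
    for section, rows in diff_weeks(last_week, this_week).items():
        print(section, rows)
```

The returned sections map directly onto the mockup's "New Entities", "Declining", and "Massive Velocity Shifts" blocks, so the same structure can be rendered in the dashboard or formatted as a Telegram message.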
Concept 06

Coverage Heatmap

Your articles fall into topic clusters. Are your feeds actually covering what you care about? A heatmap by action type shows which topics get read vs. skipped. Topic 1 (AI/LLM) dominates your reads. Topic 0 is 618 articles of pure skip — uncategorized junk. Topics 6-13 are nearly invisible. This drives feed acquisition decisions.

Low effort | Feed planning | primary_topic field exists on articles
stillhouse-dashboard / coverage
Topic | Read | Skim | Save | Skip
T1: AI / LLM | 67 | 117 | 144 | 363
T4: Broad Tech | 13 | 180 | 331 | 421
T5: Misc Signal | 7 | 37 | 40 | 29
T2: Niche Tech | 3 | 25 | 24 | 29
T3: Dev/Platform | 4 | 10 | 21 | 22
T8: Low Signal | 0 | 1 | 7 | 70
T0: Uncategorized | 0 | 0 | 0 | 618
T6: Edge | 0 | 3 | 8 | 14
T7: Edge | 0 | 2 | 9 | 10
The story this tells: Topic 1 (AI/LLM) is where 70% of your reads come from: 67 out of 96 total reads. Topic 4 (Broad Tech) is your volume leader at 945 total articles but only 13 reads; it's skim/save territory, not deep-read material. Topic 0 is a red flag: 618 articles that are 100% skip. These are uncategorized or junk articles consuming pipeline resources for zero signal. Topics 6-13 barely register. Either they represent domains you don't care about, or your feeds don't cover them. This tells you: improve Topic 0 classification, and ask whether Topics 6-13 need dedicated feeds or should be dropped entirely.
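The heatmap grid is a topic-by-action pivot, which SQLite can produce with conditional sums. This sketch assumes the `primary_topic` field named above lives on an `articles` table alongside an `action` column with lowercase values (the `action` column and its values are assumptions).

```python
import sqlite3

# One row per topic: counts per action plus the row total.
COVERAGE_SQL = """
SELECT
    primary_topic,
    SUM(action = 'read') AS n_read,
    SUM(action = 'skim') AS n_skim,
    SUM(action = 'save') AS n_save,
    SUM(action = 'skip') AS n_skip,
    COUNT(*)             AS total
FROM articles
GROUP BY primary_topic
ORDER BY n_read DESC, total DESC
"""


def coverage(conn: sqlite3.Connection):
    """Return (topic, read, skim, save, skip, total) rows for the heatmap."""
    return conn.execute(COVERAGE_SQL).fetchall()


def pure_skip_topics(rows):
    """Topics where every article was skipped (like Topic 0 above)."""
    return [topic for topic, _r, _sk, _sv, n_skip, total in rows if n_skip == total]
```

Dividing each row's `n_read` by the sum of all reads gives the "share of reads" figure quoted for Topic 1.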
Concept 07

Contrarian Detector

When your sources disagree, that's the most interesting signal of all. Your graph has 51 "contradicts" relationships and 107 "competes_with" relationships. Anthropic and OpenAI co-appear in 47 articles, with both partnership and competition narratives. Entities with high tension between bullish and bearish coverage deserve your attention: when coverage splits like this, at least one side is likely wrong.

High effort | Highest alpha potential | Needs sentiment extraction on insights
stillhouse / contrarian signals — this week
Entities with highest bull/bear tension. Your graph: 2,215 relationships (885 uses, 582 related_to, 442 supports, 107 competes_with, 74 partners_with, 56 invested_in, 51 contradicts, 18 acquired).
Anthropic company Tension: 0.82
Bullish signals | Bearish signals
Bull case
"Revenue could easily triple this year" (conf: 0.8)
"Experiencing most rapid growth in American corporate history" (conf: 0.8)
"Launching with AWS due to 'staggering' enterprise demand" (conf: 0.9)
"Secured 3.5GW TPU capacity" (conf: 0.9)
Bear case
"Faces competitive threat from model providers moving up the stack" (conf: 0.8)
"Limiting Claude Mythos to enterprise security partners" — gated access implies caution
47 co-mentions with OpenAI (competes_with relationship)
CoreWeave company Tension: 0.88
Bullish (infrastructure demand) | Bearish (debt concerns)
Bull case
52 mentions in first week — massive market attention
"Stock rose 13% following deal with Anthropic" (conf: 0.9)
40 co-mentions with Anthropic — deep partnership signal
AI infrastructure pure-play at scale
Bear case
"Major AI companies exploring custom silicon could reshape cloud infrastructure economics" (conf: 0.8) — threatens CoreWeave's GPU model
Brand new entity (first seen Apr 9) — no track record in your data
High co-mention with Broadcom who supplies competitors too
AI Infrastructure Economics theme Tension: 0.75
Spend is justified (demand) | Spend is unsustainable (bubble)
Bull case
"European enterprises adopting distinct AI infrastructure strategies" (conf: 0.9) — global demand
77 mentions this week (+1000%) — acceleration signal
Connects Anthropic, Google, TPU, CoreWeave, Broadcom — the economic backbone
Bear case
"Custom silicon could reshape cloud infrastructure economics" — disruption risk to current players
51 contradicts relationships in your graph suggest real disagreement
Massive capital commitments (3.5GW TPU) are bets on future demand, not current revenue
Implementation: This requires a new extraction dimension — sentiment/stance per insight. Add a stance field (bullish/bearish/neutral) to the insights table, extracted during the existing LLM pass. Then tension = normalized distance between bull and bear insight counts for an entity. You already have 51 "contradicts" and 107 "competes_with" relationships — those are natural tension signals that don't require new extraction. Start there.
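The text leaves the exact tension formula open. One possible reading is a balance score that is highest when bullish and bearish evidence carry equal weight and zero when only one side is present. The sketch below implements that reading, with each insight weighted by its confidence. The function names and the stance labels are hypothetical, and it is not claimed to reproduce the scores shown in the mockup.

```python
def tension(bull: float, bear: float) -> float:
    """Balance score in [0, 1].

    1.0 when both sides carry equal weight, 0.0 when only one side
    (or neither) has any evidence.
    """
    total = bull + bear
    if total == 0:
        return 0.0
    return 1.0 - abs(bull - bear) / total


def entity_tension(insights: list[tuple[str, float]]) -> float:
    """Tension for one entity from (stance, confidence) pairs.

    Stance is 'bullish', 'bearish', or 'neutral' (the proposed stance
    field); neutral insights are ignored. Weighting by confidence is an
    assumption of this sketch.
    """
    bull = sum(conf for stance, conf in insights if stance == "bullish")
    bear = sum(conf for stance, conf in insights if stance == "bearish")
    return tension(bull, bear)


if __name__ == "__main__":
    sample = [
        ("bullish", 0.8), ("bullish", 0.9), ("bullish", 0.9),
        ("bearish", 0.8), ("neutral", 0.5),
    ]
    print(round(entity_tension(sample), 2))
```

A v1 without stance extraction could feed the same function with counts of supports versus contradicts relationships per entity, since those already exist in the graph.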
Quick Wins (build first)
Feed ROI + Coverage Heatmap + Intelligence Diff

All three use existing data with zero new extraction. Pure SQL queries + dashboard rendering. The diff can also be a Telegram/Telegraph weekly message alongside the digest. Immediate action: drop the 7 zero-signal feeds, investigate the 885-article Anthropic volume sink, and classify the 618 Topic 0 skip articles.
Medium Investment (build second)
Momentum Board + Weak Signal Radar

Need a daily aggregation cron (entity_daily_mentions table) and a new dashboard view. The momentum board replaces the entity tab — 2,440 entities need velocity sorting, not alphabetical listing. The radar surfaces the 14 new entities/week that would otherwise drown in the noise.
Big Bets (build when data is deep)
Narrative Arcs + Contrarian Detector

Both need new LLM extraction dimensions (weekly summaries, sentiment/stance). The contrarian detector can start with your existing 51 contradicts + 107 competes_with relationships — no new extraction needed for v1. Narrative arcs are the multi-year play that makes your data retention decision pay off. With 6,792 articles and growing, you're approaching the scale where arcs become visible.
Coverage Bias
Anthropic is the most-mentioned of your 2,440 entities, at 486 mentions. The Google News "Anthropic" feed (885 articles, 13% of pipeline) creates a coverage bias. Your view of the AI landscape is Anthropic-shaped, not market-shaped.
80% Skip Rate
Only 96 articles out of 6,792 get "read" action (1.4%, avg score 8.2). Your scoring is extremely selective. The question: are you over-filtering signal, or is 80% of what your feeds deliver actually noise?
Entity Graph Exploding
14 new entities this week with 2+ mentions. Broadcom (65), CoreWeave (52), Claude Mythos (35), NASA (16), Artemis II (15). Your entity graph is growing fast — the momentum board becomes essential for triage.