KWL Data Book — Source Catalogue

01

Spotify

Mood · the anchor · via NPILABS Athena

features frozen Nov 2024

Taxonomy

Chart
├─ top_50 (mainstream demand)
└─ viral_50 (early/emergent demand)
per market · per day · positions 1–50
Each charting track carries:
├─ Identity  track_id · artist_id · isrc
├─ Timing    release_date → catalogue age
├─ Audio     valence · energy · danceability
│            tempo · acousticness · mode
└─ Origin    p_line · c_line · label · isrc

Attributes we hold

From france_poc_v1.combined: chart_type, date, position, track_id, artist_id, isrc, title, release_date, and the six audio features — valence, energy, danceability, tempo, acousticness, mode — plus origin metadata p_line, c_line, label. Derived in analysis: mode_major, catalogue_age, local_share, position-weight.

scale: 150M songs · 15M artists · 8+ yrs daily charts
features: complete through Nov 2024 (Spotify deprecated the endpoint)
access: AWS Athena, licensed via NPILABS — aggregate-only in public output
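The derived fields listed above (mode_major, catalogue_age, local_share, position-weight) can be sketched for a single chart and day as follows. This is a minimal illustration, not the analysis code: the linear weight in `position_weight`, the boolean `is_local` field standing in for the origin classification, and every value in `ROWS` are assumptions made for the example. The only external fact relied on is that Spotify's `mode` feature uses 1 for major and 0 for minor.

```python
from datetime import date

# Hypothetical rows shaped like france_poc_v1.combined for one chart and day.
# All values are invented; is_local stands in for the origin classification.
ROWS = [
    {"position": 1, "tempo": 120.0, "mode": 1,
     "release_date": date(2024, 1, 1), "is_local": True},
    {"position": 2, "tempo": 90.0, "mode": 0,
     "release_date": date(2014, 1, 31), "is_local": False},
    {"position": 50, "tempo": 100.0, "mode": 1,
     "release_date": date(2023, 1, 31), "is_local": True},
]


def position_weight(position, chart_size=50):
    """Assumed linear weight: position 1 counts chart_size, the last position counts 1."""
    return chart_size + 1 - position


def summarise(rows, chart_date):
    """Return position-weighted tempo plus the share and age fields for one chart day."""
    if not rows:
        raise ValueError("no chart rows for this day")
    weights = [position_weight(r["position"]) for r in rows]
    ages = [(chart_date - r["release_date"]).days for r in rows]
    return {
        "tempo_weighted": sum(w * r["tempo"] for w, r in zip(weights, rows)) / sum(weights),
        "mode_major_share": sum(r["mode"] == 1 for r in rows) / len(rows),
        "local_share": sum(r["is_local"] for r in rows) / len(rows),
        "mean_catalogue_age_days": sum(ages) / len(ages),
    }


summary = summarise(ROWS, chart_date=date(2024, 1, 31))
```

Weighting by position lets the top of the chart dominate the daily read; an unweighted mean would treat position 50 the same as position 1.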

Key attributes → insights

Tempo & mode — the validated mood proxy. Slowing tempo / falling major-key share read as a market turning anxious. value · the only mood read that is behavioural, daily and cross-border — not a survey six weeks stale

Local share (origin-classified) — what fraction of the chart is home-grown. Rising local share = cultural inward turn. value · the single most defensible number in the France report; maps to "speak heritage, not aspiration"

Catalogue age — new vs back-catalogue. A market reaching for the old is a market not reaching for the new. value · a nostalgia gauge that front-runs the "comfort spend" pattern brands plan around

Top-50 vs Viral-50 gap — mainstream vs emergent. The lead time between what's bubbling and what's arrived. value · A&R / sync timing; the early-warning layer of the music edition

Unique in combination: alone, audio features are an interesting curiosity. Set against Eurostat they become a leading mood proxy (tempo × savings-intent r = −0.52, validated). Set against Wikipedia they show whether the mood has a name yet. Set against Netflix they confirm or break the inward-turn story. Spotify is the anchor precisely because the other layers calibrate it — and because no rival in cultural intelligence holds the unbackfillable 8-year history.
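A coefficient such as the tempo × savings-intent r quoted above is a Pearson correlation between two series on the same cadence. The sketch below shows one plausible way to get there: average daily chart tempo into months, then correlate with a monthly series. The helper names and all numbers are invented for illustration; they do not reproduce the validated figure, and the actual validation may use lags or controls not shown here.

```python
from collections import defaultdict
from datetime import date
from math import sqrt


def monthly_mean(daily):
    """Average a {date: value} mapping into {(year, month): mean}."""
    buckets = defaultdict(list)
    for day, value in daily.items():
        buckets[(day.year, day.month)].append(value)
    return {key: sum(vals) / len(vals) for key, vals in sorted(buckets.items())}


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / sqrt(var_x * var_y)


# Invented monthly values, chosen only to show a negative relationship.
monthly_tempo = [122.0, 121.0, 119.5, 118.0, 118.5, 117.0]
savings_intent = [10.0, 11.0, 13.0, 15.0, 14.0, 16.0]
r = pearson_r(monthly_tempo, savings_intent)
```

Aggregating to the slower cadence first keeps the comparison honest: a daily series correlated against a monthly value repeated thirty times would overstate the sample size.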

02

Apple Music

Mood · cross-DSP corroboration

snapshot-shallow

Taxonomy

Top Songs chart
└─ positions 1–100
per market · per day
Each entry:
├─ Identity  track_id · artist_id
├─ Title     track_title
├─ Genre     genres[] (JSON array)
└─ Timing    release_date

Attributes we hold

In music_charts (source = apple_music_top_songs): country, chart_date, rank, track_id, track_title, artist_name, artist_id, genres (JSON), release_date, url. No audio features — Apple gives chart position and metadata, not valence/energy.

coverage: 34 markets — far wider than Spotify's active France read
depth: daily snapshots from Apr 2026; no historical backfill (RSS is current-only)
genre: Apple's own genre labels — coarse but free taxonomy

Key attributes → insights

34-market breadth — the same artist's rank across markets. Cross-border flow, where a track is travelling. value · the multi-market spine the music edition needs; Spotify depth + Apple breadth

Genre labels — free genre tagging Spotify charts don't expose at chart level. value · genre-velocity reads ("Francophone afrobeat rising") without a paid taxonomy

Rank agreement vs Spotify — where two DSPs agree, the signal is real; where they split, demographics differ. value · a second opinion that hardens the mood read and exposes cohort skew

Unique in combination: Apple's value is almost entirely relational. On its own it's a chart anyone can read. Against Spotify it converts a France-deep read into a 34-market one and provides a free corroboration layer; against Amazon it separates streaming taste from purchase intent. It is the breadth multiplier on the mood anchor.
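Rank agreement between the two DSPs can be sketched as below. Because the Apple rows listed above carry no isrc, the example joins on a normalised (title, artist) key; that join rule, the function names, and the sample tracks are all assumptions for illustration, and a real pipeline would likely need fuzzier matching (featured artists, remix suffixes).

```python
def norm_key(title, artist):
    """Lower-case and collapse whitespace so trivially different spellings match."""
    return (" ".join(title.lower().split()), " ".join(artist.lower().split()))


def rank_agreement(spotify, apple):
    """Compare two charts given as lists of (rank, title, artist) tuples."""
    s_ranks = {norm_key(t, a): r for r, t, a in spotify}
    a_ranks = {norm_key(t, a): r for r, t, a in apple}
    shared = s_ranks.keys() & a_ranks.keys()
    if not shared:
        return {"shared": 0, "overlap_share": 0.0, "mean_abs_rank_gap": None}
    gaps = [abs(s_ranks[k] - a_ranks[k]) for k in shared]
    return {
        "shared": len(shared),
        # Share of the shorter chart that also appears on the other chart.
        "overlap_share": len(shared) / min(len(s_ranks), len(a_ranks)),
        "mean_abs_rank_gap": sum(gaps) / len(gaps),
    }


# Invented sample charts for one market and day.
spotify_chart = [(1, "Song A", "Artist X"), (2, "Song B", "Artist Y"), (3, "Song C", "Artist Z")]
apple_chart = [(1, "song a ", "artist x"), (4, "Song C", "Artist Z"), (2, "Song D", "Artist W")]
agreement = rank_agreement(spotify_chart, apple_chart)
```

A high overlap with a small rank gap supports the corroboration reading; a low overlap is where the cohort-skew question starts.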

03

Amazon Music

Commerce · the only purchase-side music chart

beta · artist enrich pending

Taxonomy

Retail digital bestsellers
└─ rank 1–60 (top 2 pages)
per marketplace · per day
Each entry:
├─ Identity  track_id (ASIN)
├─ Title     track_title
├─ Link      url (/dp/ASIN)
└─ Artist    pending v2 enrichment

Attributes we hold

In music_charts (source = amazon_retail_dmusic_songs): country, chart_date, rank, track_id (ASIN), track_title, url. artist_name is currently empty; it is to be resolved via a follow-up /dp/ lookup in v2. The distinction that matters: this is purchase, not stream.

coverage: 5 markets (US, GB, DE, FR, JP)
signal type: commerce — someone paid, not just pressed play
exclusivity: the only commerce-side music chart from any DSP

Key attributes → insights

Purchase vs stream gap — what a market buys skews older, wealthier, more deliberate than what it streams. value · a willingness-to-pay signal; the closest music data gets to the commercial outcome brands care about

Catalogue dominance — Amazon's purchase chart leans heavily to evergreens, revealing the comfort/gift economy. value · the "DSP edition" hook; reads the 35+ cohort the streaming charts miss

Unique in combination: framed as "Amazon Music Demand" not "charts," this is the bridge from culture to commerce. Set against Spotify (streaming taste) it isolates the act of paying; set against Google Trends it pairs search-intent with purchase-intent. It is the only source in the stack that touches a wallet directly — which is why it anchors the commerce edition despite being the thinnest.

04

Netflix

Narrative · what the market is watching

live

Taxonomy

Top 10 (chart captured to 100)
├─ Films
└─ TV
per market · per week
Each entry:
├─ show_title
├─ season_title (TV; null for film)
├─ rank (1–100)
└─ cumulative_weeks (staying power)
NOTE: origin (local vs intl) NOT yet captured; TMDB enrich pending.

Attributes we hold

In netflix_top10: country, week, category (= "Films" or "TV"), rank, show_title, season_title, cumulative_weeks. The films/TV split is native; local-vs-international origin is not — it needs TMDB enrichment (a flagged roadmap item). cumulative_weeks is an under-used staying-power metric.

coverage: 9 markets (BR, DE, ES, FR, GB, IT, JP, KR, US)
depth: 2021 → present, weekly, full history (Tudum dump)
gap: no genre, no origin yet — both addable via TMDB join

Key attributes → insights

Films vs TV mix — a market leaning to TV is in long-form comfort; spiking on films is event-led. value · attention-budget read; pairs with the mood layer for the "defensive vs reaching" call

Local-language share (once origin is enriched) — does the screen confirm the inward turn the charts show? value · the cross-medium corroboration that makes the cultural thesis credible, not anecdotal

Cumulative weeks — staying power vs churn. A market with sticky #1s is settled; high churn is restless. value · a volatility read on collective attention — novel, nobody reports it

Unique in combination: Netflix is the second medium that lets us say "and the screen agrees." On its own it's a watch-list; against Spotify it turns a music-only mood read into a cross-medium cultural claim (six-of-seven-signals-agree); against Wikipedia it shows whether a hit show is also driving curiosity. Origin enrichment would make it the strongest corroborator in the stack.
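The staying-power versus churn read described above can be sketched as the share of week-to-week transitions in which the #1 title changes. The definition is an assumption for illustration (the source does not specify how churn is computed), and the rows are invented; a fuller version would also use cumulative_weeks directly.

```python
def number_one_churn(rows):
    """Share of consecutive weeks in which the #1 title changed.

    rows: iterable of (week, rank, show_title), with week as an ISO date string
    so that plain string sorting gives chronological order.
    Returns None when fewer than two weeks of #1 data are present.
    """
    top = {week: title for week, rank, title in rows if rank == 1}
    weeks = sorted(top)
    if len(weeks) < 2:
        return None
    changes = sum(1 for prev, cur in zip(weeks, weeks[1:]) if top[prev] != top[cur])
    return changes / (len(weeks) - 1)


# Invented rows for one market and category.
netflix_rows = [
    ("2024-01-07", 1, "Show A"), ("2024-01-07", 2, "Show B"),
    ("2024-01-14", 1, "Show A"), ("2024-01-14", 2, "Show C"),
    ("2024-01-21", 1, "Show C"), ("2024-01-21", 2, "Show A"),
    ("2024-01-28", 1, "Show C"), ("2024-01-28", 2, "Show D"),
]
churn = number_one_churn(netflix_rows)
```

A value near 0 means the same title holds #1 week after week (the "settled" reading); a value near 1 means the top slot turns over almost every week.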

05

Google Trends

Intent · revealed demand

live · FR only

Taxonomy

Search interest index
└─ per term · per market · per week
index 0–100 (normalised)
Terms today (France):
├─ Carrefour (grocery)
├─ Hermès (luxury)
├─ Renault (automotive)
├─ Club Med (leisure)
└─ Boursorama (finance)

Attributes we hold

In trends: term, country, week_start, index_value (0–100). Deliberately brand/category-anchored — these are the names the report's category playbooks are built on. The taxonomy is the term list; it is fully parameterisable per market.

coverage: France · 5 brand terms · 2019–2025 weekly
nature: relative index, not absolute volume (pytrends normalisation)
lever: term list is config — widening to new brands/markets is trivial

Key attributes → insights

Per-brand trajectory — Carrefour +16% / Boursorama −16% YoY: grocery up, retail-finance fading. value · the brand-level mover list that makes a report actionable, not just atmospheric

Category divergence — when one category climbs while the basket falls, that's the defensive-wallet signature. value · turns mood into spend-direction; the bridge the whole thesis is built on

Unique in combination: search intent alone is a marketer's commonplace. Against the mood layer it becomes the downstream test — does a mood shift show up in demand? Against GDELT it separates intent from noise (searching because of news vs organic demand). It is the layer that earns the word "commercial" in cultural-intelligence.
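A year-on-year mover figure of the kind quoted above can be sketched as the percentage change in the mean weekly index between two calendar years. Comparing calendar-year means is an assumption; the source does not say which windows its YoY figures use, and the sample values are invented. Because the Trends index is relative, the result describes a change in share of interest, not in absolute search volume.

```python
from datetime import date


def yoy_change_pct(series, year):
    """Percent change in mean value from year - 1 to year.

    series: iterable of (date, value) pairs, for example week_start and index_value.
    """
    current = [v for d, v in series if d.year == year]
    previous = [v for d, v in series if d.year == year - 1]
    if not current or not previous:
        raise ValueError("both years need at least one observation")
    prev_mean = sum(previous) / len(previous)
    if prev_mean == 0:
        raise ValueError("previous-year mean is zero; change is undefined")
    return (sum(current) / len(current) / prev_mean - 1) * 100


# Invented weekly index values for one term.
trend_series = [
    (date(2023, 3, 6), 38.0), (date(2023, 9, 4), 42.0),
    (date(2024, 3, 4), 48.0), (date(2024, 9, 2), 52.0),
]
change = yoy_change_pct(trend_series, year=2024)
```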

06

Wikipedia

Attention · where curiosity concentrates

live · FR only

Taxonomy

Pageviews per article
└─ per article · per day (fr.wikipedia edition)
Articles today:
├─ Brands   Carrefour · Hermès · Renault
│           Club Med · Boursorama
└─ Culture  Aya Nakamura · Werenoi (artist attention controls)

Attributes we hold

In wikipedia_pageviews: project (= fr.wikipedia), article, date, views (daily integer count). Absolute counts, not a normalised index — unlike Trends, this is a true volume. Daily granularity makes it the most responsive attention signal we hold.

coverage: France · 7 articles · 2019–2025 daily
nature: absolute pageviews — real magnitude, spikes are datable
per-language: extends per Wikipedia edition (de, ja, ko…) at no cost

Key attributes → insights

Artist attention surges — Werenoi +174% / Aya Nakamura +68% YoY. The encyclopedia confirms the playlist. value · names the cultural moment with a hard, datable number — receipts-grade attention

Spike timing — daily granularity dates the exact day curiosity moved (a release, a sync, a scandal). value · event attribution; the timeline that explains why a mood shifted

Unique in combination: Wikipedia answers a question the other layers can't — "does this have a name yet?" Against Spotify it shows whether a rising sound has crossed into public consciousness; against Netflix it shows whether a hit show is also driving lookups; against GDELT it separates organic curiosity from media-pushed attention. It is the cheapest, most responsive corroborator in the stack.
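Spike timing on daily pageviews can be sketched with a trailing-median baseline: a day is flagged when its views exceed a multiple of the median of the preceding days. The window length, the factor of 3, and the sample counts are assumptions for illustration, not thresholds taken from the pipeline.

```python
from datetime import date, timedelta
from statistics import median


def spike_days(views, window=7, factor=3.0):
    """Return dates whose count exceeds factor x the median of the prior `window` days.

    views: list of (date, count) pairs sorted by date, one entry per day.
    """
    flagged = []
    for i in range(window, len(views)):
        baseline = median([count for _, count in views[i - window:i]])
        if baseline > 0 and views[i][1] > factor * baseline:
            flagged.append(views[i][0])
    return flagged


# Invented daily counts: flat at 100 views with one jump to 500 on the ninth day.
start = date(2025, 1, 1)
counts = [100, 100, 100, 100, 100, 100, 100, 100, 500, 100]
daily_views = [(start + timedelta(days=i), c) for i, c in enumerate(counts)]
spikes = spike_days(daily_views)
```

A median baseline is used rather than a mean so that one spike does not inflate the baseline for the days that follow it.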

07

GDELT

Discourse · the media weather

live · FR only

Taxonomy

News tone
└─ per query · per day
tone ≈ −10 … +10
Queries today (France):
├─ FR_ALL_NEWS_TONE (national)
├─ FR_CARREFOUR_TONE
├─ FR_HERMES_TONE
├─ FR_RENAULT_TONE
└─ FR_BOURSORAMA_TONE

Attributes we hold

In gdelt_tone: query_id, country, date, tone (position-weighted sentiment, roughly −10 to +10). One national tone series plus one per tracked brand — so brand discourse and national discourse can be read separately. Daily, 2019–2025 complete.

coverage: France · 5 queries · 2019–2025 daily
nature: sentiment of coverage, not volume — the mood of the news
global-ready: GDELT is worldwide; queries extend to any market/brand free

Key attributes → insights

National tone drift — flat-negative all quarter = no shock, no relief. The story is in behaviour, not headlines. value · the control that proves the mood read isn't just echoing the news cycle

Brand tone vs brand search — coverage souring while search holds = reputation risk before demand moves. value · an early-warning reputational layer for the enterprise/brand buyer

Unique in combination: GDELT's whole job is to be the foil. The product's signature line — "the mood is in behaviour, not headlines" — is only sayable because GDELT lets us check the headlines and show they didn't move. Against Spotify/Trends it distinguishes felt mood from reported mood; that separation is the credibility of the entire read.

08

Eurostat

Macro · context & ground truth

live

Taxonomy

Official indices (monthly)
├─ Consumer Confidence (CCI)
│  ├─ headline
│  ├─ major-purchases sub-index
│  └─ savings-intent sub-index
├─ Retail trade volume
├─ Unemployment rate
└─ Inflation (YoY)

Attributes we hold

In macro_series (source = eurostat): series_id (8 series incl. 3 CCI variants), country, period, frequency (monthly), value, unit (balance_pct, index_2021_100, pct, pct_yoy). The sub-indices are the prize — savings intent and major-purchases are where the music proxy validated.

coverage: France · 6 Eurostat series · 2019–2026 monthly
role: dual, both the report's "commercial weather" and the validation ground truth
EU-ready: same API serves all EU markets — widest free expansion

Key attributes → insights

CCI sub-indices — savings intent and major-purchase appetite are more responsive than headline confidence. value · the ground truth the mood proxy is validated against (tempo × savings r = −0.52)

The macro gap — calm prices + steady jobs + sunk confidence = the space behavioural signal fills. value · frames why behavioural data is needed at all; the report's opening move

Unique in combination: Eurostat is the truth the behavioural layers are measured against — without it, the music read is an assertion; with it, tempo becomes a validated proxy with a published coefficient. It is also the context that makes everything legible: the mood read only means something relative to where confidence actually sits. The anchor for the receipts.

09

Yahoo Finance

Trade · markets (today: control) · see also #11

live

Taxonomy

Daily closes
├─ CAC 40 (FR equity index)
└─ EUR/USD (FX rate)
2019 → present, daily

Attributes we hold

In macro_series (source = yfinance): CAC40_CLOSE (eur_points) and EURUSD_CLOSE (ratio), daily, 2019–2025. Sits in the same table as Eurostat, so macro context is one query. Primary role today is a control variable in the backtests, not a headline report figure.

coverage: France equity + EUR/USD · daily · 2019–2025
role: market context + backtest control (absorbs macro variance)
extensible: any ticker/index per market, trivially

Key attributes → insights

Daily granularity — the only daily macro we hold; it dates market mood between monthly Eurostat prints. value · fills the cadence gap; lets the report move at the speed of behaviour, not the statistics office

Control role — included so the music signal must prove value beyond what markets already explain. value · methodological honesty — it's what makes the validation conservative and credible

Unique in combination: the quietest source — rarely a headline, always a control. Its uniqueness is methodological: by absorbing macro-financial variance in the backtests, it forces the behavioural layers to earn their place. It also bridges the monthly Eurostat cadence with a daily market read, keeping the macro context honest between prints.

11

Markets & brand equity

Trade · the liquid wealth tier · build-out

CAC/EURUSD live · rest roadmap

Taxonomy

Today (live, control)
├─ CAC 40 close
└─ EUR/USD
Build-out (the trade signal)
├─ Sector rotation
│  └─ discretionary vs staples
└─ Brand equity (listed brands)
   LVMH · Hermès · Kering · Renault · Carrefour · BNP · Sanofi …

Why promote it from control to signal

The academic anchor (Edmans et al.) is literally about music sentiment predicting stock returns — stocks are the original downstream variable in the bridge thesis, not just a nuisance to regress out. Two cuts carry real cultural signal: sector rotation (consumer-discretionary vs staples = a clean risk-on / risk-off mood read) and brand-level equity for the listed brands we already track on search, tone and mood.

today: CAC 40 + EUR/USD, daily, 2019–25 — used as backtest control
build: sector indices + ~15 listed brand tickers per market via yfinance — trivial extension
cadence: daily — the only daily macro, fills the gap between monthly Eurostat prints

Unique in combination: pair a listed brand's share price with its search (Trends), its news tone (GDELT), its cultural mood (charts) and you have a full brand picture across attention, sentiment and valuation — a cross-asset read no cultural-intelligence vendor offers. Sector rotation gives the macro-mood layer a daily pulse the monthly CCI can't.

12

Auctions — Salle.art

Trade · the elite wealth tier · concept-stage

concept · sibling project

Taxonomy

Auction results (ART project)
├─ House     Christie's · Sotheby's
│            Phillips · Bonhams
├─ Lot       artist · movement · medium
├─ Estimate  low / high
└─ Hammer    realised price · sold?
source: barnebys_*.db (ART/Salle)

Why it belongs in the read

The wealth pyramid has three rungs and the stack reads two: Spotify is mass mood (18–35), Amazon is the older/wealthier purchaser — auctions are the very top. For luxury especially, "are collectors paying records?" is a direct read on high-net-worth confidence — exactly what a maison wants. The real data lives in the sibling ART / Salle.art project (Christie's/Sotheby's via Barnebys); it is not in the Cadence pipeline today — it was illustrative in the v5 mockup and is honestly a concept-stage integration, not a current source.

status: real data exists in ART project; not integrated into Cadence
cadence: per-sale (irregular) — the "slow" signal against music's "fast"
role: the elite altitude; completes mass → mid → elite wealth triangulation

Unique in combination: alone, auction results are an art-market curiosity. Stacked with Spotify (mass) and Amazon (mid), they become a wealth-tier triangulation: the only view that reads mood at three income altitudes at once. It is the single most differentiating thing the stack could add for the luxury buyer, and the literal embodiment of the "trade" verb at the top of the market.

10

Open-Meteo

Control · never published, always present

live · control only

Taxonomy

Daily weather (Paris proxy)
├─ temperature mean · max · min
├─ precipitation_sum
├─ sunshine_duration
└─ wind_speed_10m_max
6 variables · daily

Attributes we hold

In weather_daily: country, city (Paris), date, variable (6 met variables), value, unit. A pure control layer — weather confounds mood (sunshine lifts everything), so it is regressed out, never reported. The discipline of holding it is itself a quality signal.

coverage: France (Paris) · 6 variables · 2019–2025 daily
role: confound control — removed from the read, never sold
extensible: any city worldwide, free, instantly

Key attributes → insights

Sunshine / temperature — the obvious mood confound. Held precisely so a "happy" reading can be shown not to be just good weather. value · it's the source that lets the report say "this isn't the sunshine talking" — credibility, not content

Unique in combination: the only source whose value is in being subtracted. It exists purely to make the other ten honest — a mood lift that survives the weather control is a real mood lift. No incumbent bothers; holding it is part of the "shows its working" wedge.
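"Regressed out" can be illustrated with a single-control ordinary least squares fit: model the mood series as a linear function of the weather variable and keep only the residuals. This is a deliberately small sketch under stated assumptions. The source does not specify the model, a real control set would include all six weather variables (and the market controls), and the sunshine and valence numbers below are invented.

```python
def residualise(y, x):
    """Residuals of y after fitting y = a + b * x by ordinary least squares."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two series of equal length, at least 2 points")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("control series is constant; nothing to regress on")
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


# Invented daily values: hours of sunshine and a chart-level valence reading.
sunshine_hours = [2.0, 8.0, 5.0, 10.0, 1.0]
chart_valence = [0.40, 0.55, 0.50, 0.62, 0.41]
valence_net_of_weather = residualise(chart_valence, sunshine_hours)
```

By construction the residuals are uncorrelated with the control, so any movement left in `valence_net_of_weather` cannot be attributed to a linear sunshine effect.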

Every source, opened up.
