§ The expansion roadmap
Every candidate below is public-domain or scrapable — no licence cost. They are ranked by (value to the read) × (ease of access), each with a status, a priority, and the reason it earns a place. The rule that governs the whole list: a source only ships if it can be named and dated in the report's receipts table. Four strategic threads run through it — close the France-only gap, harden the mood anchor, complete the trade family (stocks + auctions), and open the unclaimed city-level altitude.
Close the France-only gap. Seven of ten live sources are France-only. OECD (macro), per-language Wikipedia, and per-market Trends/GDELT queries make multi-market real, which is the biggest single lever on enterprise value.
Harden the mood anchor. Spotify froze audio features in Nov 2024. Spotify Charts CSV, YouTube Charts, Deezer, and in-house Essentia features remove the single point of failure.
Complete the trade family. The hero promises six verbs; the data keeps five. Promote stocks to a signal (sector rotation + brand equity) and integrate the ART/Salle auction data, the wealth tier. See the Data Book catalogue, sources 11 & 12.
Open the city-level altitude. Shazam cities, Ticketmaster venues, and radio playlists let Cadence publish sub-national reads. "Lyon vs Paris" is a sentence no incumbent can write.
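The ranking rule stated at the top of this roadmap, (value to the read) × (ease of access), can be sketched in a few lines. This is a minimal illustration, not the scoring actually used: the 1-to-5 scales, the `Candidate` class, and the example scores are all assumptions made up for the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    value: int  # hypothetical 1-5 score: value to the read
    ease: int   # hypothetical 1-5 score: ease of access

    @property
    def score(self) -> int:
        # The roadmap's stated ranking rule: value multiplied by ease.
        return self.value * self.ease


def rank(candidates):
    """Sort by score, highest first; ties are broken alphabetically by name."""
    return sorted(candidates, key=lambda c: (-c.score, c.name))


# Illustrative scores only; they are not taken from the roadmap.
candidates = [
    Candidate("Lyst Index", value=2, ease=2),
    Candidate("Essentia / Musicnn", value=5, ease=2),
    Candidate("Spotify Charts CSV", value=5, ease=5),
]
ranked = rank(candidates)
for c in ranked:
    print(f"{c.score:>2}  {c.name}")
```

Multiplying rather than adding means a source that scores very low on either axis drops down the list, which matches the way high-effort items are held back for specific deals or editions.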
High value, low friction, mostly official APIs or RSS that reuse code we already have. Two of these — MusicBrainz and TMDB — are not new signal but enrichment that unblocks insights already half-built (origin classification, Netflix genre/origin).
| Source | Family | What it adds | How | Priority | Status | Why now |
|---|---|---|---|---|---|---|
| Spotify Charts CSV | Mood | Official daily/weekly Top 200 + Viral 50, 70+ markets | CSV download | P1 | Not started | Removes single-point dependency on Athena; own the anchor directly |
| MusicBrainz | Enrichment | Artist origin/area | Free API | P1 | Not started | Unblocks origin classifier v3 — fixes the weakest receipt in the France report, no IAM dependency |
| TMDB | Enrichment | Genre + origin for Netflix titles | Free key | P1 | Pending in fetcher | Turns Netflix from films/TV into local-vs-international — the cross-medium corroboration the thesis needs |
| OECD data API | Macro | Consumer confidence for non-EU markets (JP, KR, US…) | Free SDMX | P1 | Not started | Extends the Eurostat ground-truth pattern worldwide — the macro half of multi-market |
| Shazam charts | Discovery | Top 200 per country and per city | Public JSON | P1 | Not started | The only city-level music demand signal — the unclaimed sub-national altitude |
| Deezer API | Mood | Public chart API, strong French depth | Free REST | P2 | Not started | Anchor redundancy + a France-flagship corroborator |
| YouTube Charts | Mood | Weekly top songs/artists; strong in EM markets | Scrape (JSON) | P2 | Not started | Covers markets where Spotify is weak (India, Brazil depth) |
| Apple Podcasts | Narrative | Top podcasts per country/genre — what markets think about | RSS (reuse) | P2 | Not started | Reuses the Apple fetcher pattern; a new "discourse" angle near-free |
| Apple App Store charts | Intent | Top apps per country — finance/dating/game surges | RSS (reuse) | P2 | Not started | App-category surges are behavioural mood data; same fetcher family |
| Ticketmaster Discovery | Live culture | Events, on-sales, price ranges per market | Free key | P2 | Not started | Revealed discretionary-spend appetite + the venue half of city-level |
| TikTok Creative Center | Virality | Trending sounds/hashtags — where culture starts | Scrape (JSON) | P2 | Not started | The youth-culture leading edge incumbents charge a fortune for |
| Spotify Podcast Charts | Narrative | Top/trending podcasts per market | Public JSON | P3 | Not started | Pairs with Apple Podcasts for a full attention-of-listening layer |
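As an illustration of the MusicBrainz enrichment row above, the sketch below classifies an artist as local or international for a given market. It is not the origin classifier v3 mentioned in the table. It assumes an already-fetched artist record shaped like MusicBrainz's JSON, with a top-level `country` code and an `area` object carrying `iso-3166-1-codes`; treat those field names as an assumption to verify against the live API. The function names and sample records are hypothetical.

```python
from typing import Optional


def artist_country(artist: dict) -> Optional[str]:
    """Return a two-letter country code from a MusicBrainz-style record, if any."""
    code = artist.get("country")
    if code:
        return code.upper()
    # Fall back to the ISO codes on the artist's area (assumed field name).
    area = artist.get("area") or {}
    codes = area.get("iso-3166-1-codes") or []
    return codes[0].upper() if codes else None


def classify_origin(artist: dict, market: str) -> str:
    """Label an artist 'local', 'international', or 'unknown' for a market code."""
    country = artist_country(artist)
    if country is None:
        return "unknown"
    return "local" if country == market.upper() else "international"


# Hypothetical sample records, not real MusicBrainz responses.
samples = [
    {"name": "Example Artist A", "country": "FR"},
    {"name": "Example Artist B", "area": {"iso-3166-1-codes": ["US"]}},
    {"name": "Example Artist C"},
]
labels = [classify_origin(a, "FR") for a in samples]
print(labels)
```

Returning an explicit `unknown` rather than guessing keeps the classification traceable: a chart share computed from it can state how many artists had no recorded origin.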
Strong, specific additions that earn their place when a particular edition or market calls for them. Several open whole new signal families the current stack lacks: gaming, reading, citizen discourse.
| Source | Family | What it adds | How | Priority | Why |
|---|---|---|---|---|---|
| National box office (CNC, KOFIC) | Narrative | Cinema attendance — paid attention Netflix can't give | Public CSV/pages | P2 | France's CNC is clean and weekly; the paid-screen complement |
| Steam / SteamDB | Gaming | Concurrent players, top sellers per region | Public JSON | P2 | Gaming is the biggest cultural blind spot in every incumbent report |
| Twitch | Gaming | Top categories/streamers, viewer counts | Free API | P3 | Pairs with Steam for the gaming-culture layer |
| Reddit (country subs) | Discourse | Citizen topic + sentiment, r/france etc. | Free OAuth | P2 | Complements GDELT's news tone with bottom-up citizen tone |
| Bluesky / Mastodon | Discourse | Open social streams, post-Twitter-API | Free firehose | P3 | Genuinely open; good for event attribution |
| Onlineradiobox | Mood (45+) | Radio airplay per station/country | Scrape | P2 | The older cohort streaming charts miss — answers the Spotify-skew critique |
| Eurobarometer / ECB survey | Validation | Population-mood ground truth beyond CCI | Public download | P2 | Strengthens the proxy-validation story; another receipt for the method |
| Football and rugby results | Mood events | National-team results, datable mood shocks | Free API | P3 | Cheap narrative receipts ("the Six Nations weekend moved the charts") |
| NYT Books / bestsellers | Reading | Bestseller lists; FNAC/Amazon books | API + scrape | P3 | Opens the "reading" family; reuses the Amazon harness |
| Google Play charts | Intent | Android app charts (EM markets) | Scrape | P3 | Pairs with Apple App Store for the full mobile picture |
| IMDb datasets | Enrichment | Title metadata dumps | Free TSV | P3 | Joins box office ↔ Netflix; supporting layer |
| YouTube Trending | Attention | Daily trending videos — memes, news, sport | Free quota | P3 | Broader than music; a general attention pulse |
Higher effort, brittle, or licence-sensitive — pursued only when a deal or an edition justifies the cost. One of these, in-house audio features, is strategically important (it's the answer to the Spotify freeze) but is real engineering, not a fetcher.
| Source | Signal | Priority | Effort / risk | Why it's here |
|---|---|---|---|---|
| Essentia / Musicnn (in-house features) | Mood | P1* | 3–4 wks eng | The post-Nov-2024 audio-features replacement: strategically the most important item on this list, but it is a build, not a scrape |
| Amazon Music streaming charts | Music | P3 | Playwright, brittle | v2 of the live purchase fetcher; SPA-gated, 3–5 days, or pursue partnership |
| Genius lyrics + NLP | Mood (semantic) | P3 | In-house NLP | Adds what they're singing about on top of how it sounds — a new semantic mood layer |
| Setlist.fm / Bandsintown | Live culture | P3 | Partner keys | Tour density per city — the live-music half of the city altitude |
| OpenTable / TheFork | Dining out | P3 | Fiddly scrape | Bookable-slot density as a discretionary-spend proxy — original, but awkward |
| Vinted / eBay trending | Resale | P3 | ToS-sensitive | Secondhand category heat = value-mood signal; legally careful |
| Lyst Index | Fashion | P3 | PDF-bound | Cite rather than ingest — quarterly, not a feed |
A source ships only if every figure derived from it can be named and dated in the report's methodology table. The pitch is traceability — a source that can't carry a receipt doesn't go in, however interesting. That discipline is why the list is short and why the new sources are mostly official APIs, not grey scrapes.
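One way to picture the receipt rule is as a gate that every figure must pass before its source ships. The sketch below is an assumption about how such a check could look, not the report's actual pipeline: the `Receipt` and `Figure` classes, their fields, and the sample values are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Sequence


@dataclass(frozen=True)
class Receipt:
    source: Optional[str]      # the "named" half of the rule
    retrieved: Optional[date]  # the "dated" half of the rule


@dataclass(frozen=True)
class Figure:
    label: str
    value: float
    receipt: Optional[Receipt]


def has_receipt(figure: Figure) -> bool:
    """A figure carries a receipt only if its source is both named and dated."""
    r = figure.receipt
    return r is not None and bool(r.source) and r.retrieved is not None


def can_ship(figures: Sequence[Figure]) -> bool:
    """A source ships only if it yields figures and every one has a receipt."""
    return bool(figures) and all(has_receipt(f) for f in figures)


# Hypothetical figures for illustration only.
traceable = [Figure("chart share", 0.42, Receipt("Example Charts CSV", date(2025, 1, 6)))]
untraceable = traceable + [Figure("buzz index", 7.0, Receipt("Example scrape", None))]

print(can_ship(traceable))
print(can_ship(untraceable))
```

The gate is all-or-nothing by design: one undated figure blocks the whole source, which mirrors the rule that a source that cannot carry a receipt does not go in, however interesting.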