Receipt · Live since May 2026

I called the JSON a listings site feeds itself, so the posts run free

Python · Crawl4AI · public JSON API · vault markdown

Pat’s Film Club, which I curate, posts a Film-of-the-Week to Instagram and Facebook every week. The source it needs is UK free-to-air HD film listings for the week ahead. The usual options were paying for a TV-data API every month, or scraping rendered HTML, which is slow and breaks often.

Neither was necessary. The listings site is a single-page app that fetches its own data lazily from a public JSON endpoint as you scroll. The pipeline skips the page entirely and calls that endpoint directly: about a thousand films for the rolling week, with ratings, posters and synopses. It filters to the free-to-air HD channels, then enriches each pick in parallel by reading the structured film data off each title’s page.
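The fetch, filter and enrich steps could be sketched roughly as below. This is a minimal illustration, not the actual pipeline: the endpoint URL, the channel names and the `channel` field are placeholders, since none of them are given here, and the per-title enrichment is left as a function the caller passes in.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from urllib.request import Request, urlopen

# Placeholders: the real endpoint, channel list and field names are assumptions.
LISTINGS_URL = "https://listings.example/api/films?days=7"
FREE_TO_AIR_HD = {"BBC One HD", "Film4 HD"}


def fetch_json(url, timeout=10):
    """GET a URL and decode the JSON body."""
    req = Request(url, headers={"Accept": "application/json"})
    with urlopen(req, timeout=timeout) as resp:
        return json.load(resp)


def filter_free_to_air(films, channels=FREE_TO_AIR_HD):
    """Keep only films whose (assumed) 'channel' field is in the allowed set."""
    return [film for film in films if film.get("channel") in channels]


def enrich_all(films, enrich_one, max_workers=8):
    """Run enrich_one over every film in a thread pool; input order is kept."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(enrich_one, films))


def run(fetch=fetch_json, enrich_one=lambda film: film):
    """Call the endpoint, filter to free-to-air HD, then enrich the picks."""
    films = fetch(LISTINGS_URL)
    return enrich_all(filter_free_to_air(films), enrich_one)
```

Passing `fetch` and `enrich_one` in as arguments keeps the sketch runnable offline: a test can hand in canned listings, while a real run would supply a function that reads the structured data from each title's page.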

In short: reject the two obvious routes and call the JSON endpoint the app feeds itself; filter, enrich each pick in parallel, then draft.

The result is a weekly shortlist and post drafts written to disk, on free public data with no recurring cost. The method, finding the JSON an app feeds itself from rather than fighting the app, is reusable for any site built the same way.
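The write-to-disk step might look something like the sketch below, assuming the shortlist is saved as a markdown note in a vault folder. The file naming scheme and the `title`, `channel` and `start` fields are hypothetical, chosen only for illustration.

```python
from datetime import date
from pathlib import Path


def render_shortlist(films, week_start):
    """Build a markdown shortlist; the film field names are assumed."""
    lines = [f"# Shortlist, week of {week_start.isoformat()}", ""]
    for film in films:
        lines.append(f"- **{film['title']}** ({film['channel']}, {film['start']})")
    return "\n".join(lines) + "\n"


def write_shortlist(films, week_start, vault_dir):
    """Write the shortlist into vault_dir and return the path of the new note."""
    out_dir = Path(vault_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"shortlist-{week_start.isoformat()}.md"
    path.write_text(render_shortlist(films, week_start), encoding="utf-8")
    return path
```

Keeping rendering separate from writing means the markdown can be checked as a plain string before anything touches the vault.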