Fix news index date extraction with JSON-LD fallback #282

Merged
pethers merged 2 commits into main from copilot/investigate-metadata-issues
Feb 18, 2026

Copilot AI commented Feb 18, 2026

Articles without an article:published_time meta tag fell back to filename extraction and, when no date could be read from the filename, defaulted to today's date. As a result, 2026-02-parliament-agenda-*.html (published 2026-02-07) was displayed as 2026-02-18 in the news indexes.

Changes

Enhanced date extraction in generate-news-indexes.js:

  • Added extractDateFromJSONLD() to parse datePublished from JSON-LD structured data
  • Added normalizeDateString() to convert ISO timestamps to YYYY-MM-DD
  • Expanded fallback chain from 2 levels to 5:
    1. article:published_time (Open Graph)
    2. name="date" (simple meta tag)
    3. datePublished from JSON-LD ← new
    4. Filename pattern (YYYY-MM-DD)
    5. Current date (last resort)
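The JSON-LD step (level 3) can be sketched as follows. This is a minimal illustration, not the code merged in this PR: only the function name extractDateFromJSONLD() and the datePublished field come from the description above. The regex-based script matching, the handling of arrays and @graph wrappers, and the null return value are all assumptions.

```javascript
// Hypothetical sketch of extractDateFromJSONLD(); the merged implementation may differ.
function extractDateFromJSONLD(html) {
  // Find every <script type="application/ld+json"> block in the page source.
  const scriptPattern =
    /<script[^>]*type=["']application\/ld\+json["'][^>]*>([\s\S]*?)<\/script>/gi;

  for (const match of html.matchAll(scriptPattern)) {
    let data;
    try {
      data = JSON.parse(match[1]);
    } catch {
      continue; // Assumption: malformed JSON-LD is skipped instead of aborting the build.
    }

    // A block may hold a single object, an array of objects, or an @graph wrapper.
    const candidates = Array.isArray(data) ? data : [data];
    const nodes = candidates.flatMap((item) =>
      item && Array.isArray(item['@graph']) ? [item, ...item['@graph']] : [item]
    );

    for (const node of nodes) {
      if (node && typeof node.datePublished === 'string' && node.datePublished.trim()) {
        return node.datePublished.trim();
      }
    }
  }

  // Assumption: a falsy result lets the caller's || chain continue to the next level.
  return null;
}
```

Returning null rather than throwing keeps the function usable inside an `||` chain, where any falsy value hands control to the next fallback.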

Before:

date: extractMetaContent(content, 'article:published_time') || extractFromFilename(fileName)

After:

date: normalizeDateString(
  extractMetaContent(content, 'article:published_time') || 
  extractMetaContent(content, 'date') || 
  extractDateFromJSONLD(content) || 
  extractFromFilename(fileName)
)
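To make the normalization and the last two fallback levels concrete, here is a hedged sketch of normalizeDateString() and extractFromFilename(). Only the function names and the YYYY-MM-DD target format come from this PR. The regexes, the optional `now` parameter, and the assumption that the current-date fallback (level 5) lives inside extractFromFilename() are illustrative guesses.

```javascript
// Hypothetical sketches; the merged implementations may differ.

// Reduce an ISO timestamp (or a plain date) to YYYY-MM-DD.
function normalizeDateString(value) {
  if (typeof value !== 'string') return null;
  // Keep the calendar date exactly as written, so no timezone conversion can shift the day.
  const match = value.trim().match(/^(\d{4})-(\d{2})-(\d{2})/);
  return match ? `${match[1]}-${match[2]}-${match[3]}` : null;
}

// Level 4: a full YYYY-MM-DD in the filename.
// Level 5 (assumed to live here): the current date as a last resort.
function extractFromFilename(fileName, now = new Date()) {
  const match = fileName.match(/(\d{4}-\d{2}-\d{2})/);
  return match ? match[1] : now.toISOString().slice(0, 10);
}

// An ISO timestamp from a meta tag or JSON-LD collapses to a plain date.
console.log(normalizeDateString('2026-02-07T08:00:00+01:00'));

// A filename carrying only year and month gives no full date, so the sketch
// falls through to "today", which is the symptom this PR describes.
console.log(
  extractFromFilename('2026-02-parliament-agenda-en.html', new Date('2026-02-18T05:41:00Z'))
);
```

Wrapping the whole chain in normalizeDateString(), as the "After" snippet does, means every source produces the same format regardless of which level supplied the value.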

Impact

  • Corrects dates for parliament-agenda (2026-02-07), committee-reports (2026-02-06), and similar articles
  • Normalizes all dates to YYYY-MM-DD format (removes ISO timestamps)
  • All 14 language index files regenerated with accurate dates
  • Backward compatible with existing articles

- Add extractDateFromJSONLD() to extract dates from JSON-LD structured data
- Add normalizeDateString() to handle ISO timestamps and date formats
- Update date extraction to use multi-level fallback:
  1. article:published_time (Open Graph)
  2. name="date" (simple meta tag)
  3. datePublished from JSON-LD
  4. Filename extraction
  5. Current date (last resort)
- Fixes parliament-agenda articles showing today's date instead of 2026-02-07
- Normalizes all dates to YYYY-MM-DD format for consistency
- Regenerated all 14 language news index files with correct dates

Co-authored-by: pethers <[email protected]>
Copilot AI changed the title from "[WIP] Investigate incorrect metadata for parliament agenda page" to "Fix news index date extraction with JSON-LD fallback" Feb 18, 2026
@pethers pethers requested a review from Copilot February 18, 2026 05:41
@github-actions github-actions bot added the dependencies (Dependency updates) label Feb 18, 2026
Copilot AI requested a review from pethers February 18, 2026 05:41
@github-actions github-actions bot added the html-css (HTML/CSS changes), javascript (JavaScript code changes), i18n (Internationalization/localization), refactor (Code refactoring), and news (News articles and content generation) labels Feb 18, 2026
@github-actions github-actions bot added the size-xl (Extra large change, > 1000 lines) label Feb 18, 2026
@github-actions

🔍 Lighthouse Performance Audit

Category        Score    Status
Performance     85/100   🟡
Accessibility   95/100   🟢
Best Practices  90/100   🟢
SEO             95/100   🟢

📥 Download full Lighthouse report

Budget Compliance: Performance budgets enforced via budget.json

Copilot AI left a comment

Pull request overview

This PR fixes a critical bug where news articles without article:published_time meta tags were showing incorrect dates (defaulting to today's date instead of the actual publication date). The fix adds JSON-LD structured data as a fallback source and normalizes all dates to YYYY-MM-DD format.

Changes:

  • Added normalizeDateString() function to convert ISO timestamps to YYYY-MM-DD format
  • Added extractDateFromJSONLD() function to extract dates from JSON-LD structured data
  • Extended the date extraction fallback chain from 2 to 5 levels
  • Regenerated all 14 language index files with corrected dates

Reviewed changes

Copilot reviewed 15 out of 16 changed files in this pull request and generated no comments.

File                              Description
scripts/generate-news-indexes.js  Added date normalization and JSON-LD extraction functions, enhanced fallback chain
package-lock.json                 Added peer dependency flags (expected npm behavior)
news/index_*.html (14 files)      Normalized dates from ISO timestamps to YYYY-MM-DD format, corrected article ordering

@pethers pethers marked this pull request as ready for review February 18, 2026 05:45
@pethers pethers merged commit 24f8b8c into main Feb 18, 2026
29 checks passed
@pethers pethers deleted the copilot/investigate-metadata-issues branch February 18, 2026 05:45