Backend Engineer Task

The document outlines the design and implementation of a backend system for retrieving and organizing news articles, incorporating LLM-generated insights. It details technical requirements including LLM interaction, data retrieval, API endpoints, and data processing, ranking, and enrichment. The system aims to provide tailored news articles based on user queries, location, and engagement metrics, while ensuring a consistent JSON output format across various API functionalities.

Uploaded by

Aniraj Kumar

Task: Building a Contextual News Data Retrieval System

Objective:

Design and implement a backend system that can fetch and organize news articles from a
data source, simulating different API functionalities, and enrich these articles with
LLM-generated insights. This system should demonstrate the ability to:
● Understand the nuances of a user's query, including their location.
● Utilize an LLM to identify key entities, concepts, and the user's intent to refine data
retrieval.
● Retrieve data from a pre-defined data source, simulating different API endpoints.
● Process, rank, and enrich the results for relevance and comprehensiveness, considering
the user's location.
● Return a JSON response with the most relevant news articles.

Technical Requirements:
1. LLM Interaction:
○ Use a publicly available LLM API (e.g., OpenAI, Google Cloud Language API) to
process the user's news query.
○ Extract relevant entities (e.g., people, organizations, locations, events) and key
concepts from the query.
○ Determine the user's intent to select the most appropriate data retrieval strategy
(simulated API endpoint).
○ Example:
■ Input Query: "Latest developments in the Elon Musk Twitter acquisition near Palo
Alto"
■ Expected LLM Output:
■ Entities: "Elon Musk", "Twitter", "Palo Alto"
■ Intent: "nearby"
■ Input Query: "Top technology news from the New York Times"
■ Expected LLM Output:
■ Entities: "New York Times"
■ Intent: "category", "source"
○ While the application can accept the entities, concepts, and intent directly
through the API, we prefer that you showcase your skills by using an LLM for
this purpose.
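
A minimal sketch of the query-analysis step, assuming the LLM is asked to reply with a JSON object containing "entities" and "intent". The call_llm parameter is a hypothetical stand-in for whichever provider client is used, and the prompt wording and function names are illustrative assumptions rather than part of the task.

```python
import json
from typing import Callable

# The five simulated endpoints named in the task.
ALLOWED_INTENTS = {"category", "score", "search", "source", "nearby"}

# Hypothetical prompt; doubled braces survive str.format as literal braces.
PROMPT_TEMPLATE = (
    "Extract the named entities and the retrieval intent from the news query.\n"
    "Reply with JSON only: {{\"entities\": [...], \"intent\": [...]}}.\n"
    "Allowed intents: category, score, search, source, nearby.\n"
    "Query: {query}"
)


def parse_llm_reply(reply: str) -> dict:
    """Validate the LLM's JSON reply; fall back to a plain search on bad output."""
    fallback = {"entities": [], "intent": ["search"]}
    try:
        data = json.loads(reply)
    except json.JSONDecodeError:
        return fallback
    if not isinstance(data, dict):
        return fallback

    raw_entities = data.get("entities")
    if not isinstance(raw_entities, list):
        raw_entities = []
    entities = [e for e in raw_entities if isinstance(e, str)]

    raw_intents = data.get("intent")
    if isinstance(raw_intents, str):
        raw_intents = [raw_intents]  # accept a single intent given as a string
    if not isinstance(raw_intents, list):
        raw_intents = []
    intents = [i for i in raw_intents if isinstance(i, str) and i in ALLOWED_INTENTS]

    return {"entities": entities, "intent": intents or ["search"]}


def analyze_query(query: str, call_llm: Callable[[str], str]) -> dict:
    """Send the prompt through the injected LLM client and parse its reply."""
    return parse_llm_reply(call_llm(PROMPT_TEMPLATE.format(query=query)))
```

Injecting call_llm keeps the parsing logic testable without network access, and the fallback to "search" means a malformed LLM reply degrades to a text search instead of an error.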
2. Data Retrieval:

○ The news article data will be provided as a file attached to the email. Load the data
and store it in a database of your choice (SQL or NoSQL).
3. Data Format

○ The data in the news_data file is organized as JSON files. Each file contains an
array of news articles. The format of each article is:
{
"id": "b1793e11-85f1-47f4-b836-ddc21dd8991e",
"title": "Paris police chase over running of red light ends in pileup | DW News",
"description": "Paris police chase over running of red light ends in pileup | DW News...",
"url": "https://www.youtube.com/@dwnews",
"publication_date": "2025-03-24T11:08:11",
"source_name": "DW",
"category": [
"General"
],
"relevance_score": 0.51,
"latitude": 21.754075,
"longitude": 80.560129
}
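
As one possible way to store articles in this format, the sketch below loads them into SQLite using Python's standard library. The table layout, the choice of SQLite, and the load_articles name are assumptions made for illustration; the task leaves the database choice open. The "category" list is stored as JSON-encoded text to keep the example short.

```python
import json
import sqlite3

# Assumed schema mirroring the article fields shown above.
SCHEMA = """
CREATE TABLE IF NOT EXISTS articles (
    id TEXT PRIMARY KEY,
    title TEXT NOT NULL,
    description TEXT,
    url TEXT,
    publication_date TEXT,
    source_name TEXT,
    category TEXT,
    relevance_score REAL,
    latitude REAL,
    longitude REAL
)
"""


def load_articles(conn: sqlite3.Connection, articles: list) -> int:
    """Insert (or overwrite by id) a list of article dicts; return how many were written."""
    conn.execute(SCHEMA)
    rows = [
        (
            a["id"],
            a["title"],
            a.get("description"),
            a.get("url"),
            a.get("publication_date"),
            a.get("source_name"),
            json.dumps(a.get("category", [])),  # list stored as JSON text
            a.get("relevance_score"),
            a.get("latitude"),
            a.get("longitude"),
        )
        for a in articles
    ]
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR REPLACE INTO articles VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",
            rows,
        )
    return len(rows)
```

Because rows are keyed by "id", loading the same file twice does not create duplicates.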

4. API Endpoints

○ "category": Retrieve articles from a specific category (e.g., "Technology",
"Business", "Sports"). The category will be provided as part of the user query.
○ "score": Retrieve articles based on a relevance score. The data will contain a
"relevance_score" field. Retrieve articles with a score above a threshold (e.g., 0.7).
○ "search": Retrieve articles based on a search query. Perform a text search within the
article title and description. The search query will be derived from the user's input.
○ "source": Retrieve articles from specific sources (e.g., "New York Times", "Reuters").
The source will be provided as part of the user query.
○ "nearby": Retrieve articles published within a specified radius (e.g., 10km) of a given
location (latitude and longitude). The user's location will be included in the query.
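
The first four retrieval strategies above could be sketched as simple filters over article dicts, as below. This is an in-memory illustration only; in practice the same conditions would likely be pushed into database queries. The function names and the FILTERS dispatch table are hypothetical, and "nearby" is left out here because it additionally needs a distance computation.

```python
def by_category(articles, category):
    """Keep articles whose category list contains the requested category."""
    wanted = category.lower()
    return [a for a in articles
            if wanted in (c.lower() for c in a.get("category", []))]


def by_score(articles, threshold=0.7):
    """Keep articles whose relevance_score is above the threshold."""
    return [a for a in articles if a.get("relevance_score", 0.0) > threshold]


def by_source(articles, source):
    """Keep articles from the named source (case-insensitive exact match)."""
    wanted = source.lower()
    return [a for a in articles if a.get("source_name", "").lower() == wanted]


def by_search(articles, query):
    """Keep articles whose title or description contains any query term."""
    terms = query.lower().split()

    def matches(article):
        text = f"{article.get('title', '')} {article.get('description', '')}".lower()
        return any(term in text for term in terms)

    return [a for a in articles if matches(a)]


# Maps an intent name to its filter, so the chosen intent selects the strategy.
FILTERS = {
    "category": by_category,
    "score": by_score,
    "source": by_source,
    "search": by_search,
}
```

A dispatch table like FILTERS lets the intent chosen by the LLM (or passed directly by the caller) pick the retrieval function without a chain of conditionals.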
Bonus API – Trending News by Location:
Trending News Feed (Location-Based)
Implement an additional API endpoint that returns a “Trending News Feed” tailored to
a user's location. This should simulate a feed similar to “what’s trending near me”
based on recent user engagement with articles.
Key Requirements:
● Simulate a Stream of User Events:
Introduce a simulated stream of user activity events (e.g., views, clicks) that tie
users to specific articles and locations.
The candidate is expected to design the data model and event structure.
● Trending Logic:
Process user interaction data to compute a trending score for articles.
Consider factors such as:
○ Volume and type of interactions.
○ Recency of interactions.
○ Geographical relevance (proximity to user's location).
● Trending Feed Endpoint:
○ Endpoint: GET /trending
○ Query Parameters: lat, lon, limit
The response should return top trending articles within the predefined/dynamic
location radius, ranked by computed trending score.
Caching Requirement:
Implement caching for trending feeds by location to improve response times. You may
use a geospatial segmentation approach (e.g., segmenting locations into clusters).
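
One possible shape for the trending logic and its cache is sketched below. Everything here is an assumption chosen for illustration, since the task leaves the design to the candidate: the event fields (article_id, type, ts, lat, lon), the interaction weights, the six-hour half-life, and the 0.1-degree grid used both for geographic relevance and as the cache key.

```python
import math
from collections import defaultdict

EVENT_WEIGHTS = {"view": 1.0, "click": 3.0}  # assumed: clicks count more than views
HALF_LIFE_S = 6 * 3600                       # assumed: an event's weight halves every 6 hours
CELL_DEG = 0.1                               # assumed grid size (about 11 km of latitude)


def cell_key(lat, lon, cell_deg=CELL_DEG):
    """Map a coordinate to a grid cell; nearby users share a cell and a cache entry."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def trending_scores(events, now, lat, lon):
    """Sum weighted, time-decayed interactions for events in the user's grid cell."""
    user_cell = cell_key(lat, lon)
    scores = defaultdict(float)
    for event in events:
        if cell_key(event["lat"], event["lon"]) != user_cell:
            continue  # geographic relevance: only events from the same cell count
        age_s = max(0.0, now - event["ts"])
        decay = 0.5 ** (age_s / HALF_LIFE_S)  # exponential recency decay
        scores[event["article_id"]] += EVENT_WEIGHTS.get(event["type"], 0.0) * decay
    return dict(scores)


class TrendingCache:
    """Per-cell cache of ranked feeds with a time-to-live."""

    def __init__(self, ttl_s=60.0):
        self.ttl_s = ttl_s
        self._store = {}

    def get(self, lat, lon, now):
        entry = self._store.get(cell_key(lat, lon))
        if entry is not None and now - entry[0] < self.ttl_s:
            return entry[1]
        return None

    def put(self, lat, lon, now, feed):
        self._store[cell_key(lat, lon)] = (now, feed)


def trending_feed(events, now, lat, lon, limit, cache):
    """Return up to `limit` (article_id, score) pairs, best first, using the cache."""
    cached = cache.get(lat, lon, now)
    if cached is not None:
        return cached[:limit]
    scores = trending_scores(events, now, lat, lon)
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    cache.put(lat, lon, now, ranked)
    return ranked[:limit]
```

Matching on a single grid cell is the simplest form of geospatial segmentation; a user near a cell border will miss activity in the neighbouring cell, so a fuller design might also score adjacent cells or weight events by distance.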
5. Data Processing, Ranking, and Enrichment:

○ Process the articles retrieved from the database based on the simulated API
endpoint selected by the LLM or through user input.
○ Rank the articles for relevance to the user's query.
■ For "category" and "source", rank by publication date (most recent first).
■ For "score", rank by relevance_score (highest first).
■ For "search", rank by a combination of relevance_score and text matching score
(how well the title/description matches the search query).
■ For "nearby", rank by distance from the user's location (closest first). You may
use the Haversine formula or another appropriate method to calculate the
distance between two points given their latitude and longitude.
○ Enrich the news articles with data from the LLM:
■ Include a summary of the article's content (generated by the LLM).
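
The ranking rules above can be sketched as follows. The Haversine function uses a mean Earth radius of 6371 km. The equal weighting between relevance_score and the text-match score, the term-overlap definition of that text-match score, and the function names are all assumptions, since the task does not fix them. Publication dates are compared as strings, which is only valid if they all share the same ISO 8601 format.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))  # clamp rounding error


def text_match_score(article, query):
    """Fraction of query terms found in the title or description (0.0 to 1.0)."""
    terms = set(query.lower().split())
    if not terms:
        return 0.0
    text = f"{article.get('title', '')} {article.get('description', '')}".lower()
    return sum(term in text for term in terms) / len(terms)


def rank(articles, intent, query="", lat=None, lon=None, text_weight=0.5):
    """Order articles according to the rule for the given intent."""
    if intent in ("category", "source"):
        return sorted(articles, key=lambda a: a["publication_date"], reverse=True)
    if intent == "score":
        return sorted(articles, key=lambda a: a["relevance_score"], reverse=True)
    if intent == "search":
        def combined(a):
            return ((1 - text_weight) * a["relevance_score"]
                    + text_weight * text_match_score(a, query))
        return sorted(articles, key=combined, reverse=True)
    if intent == "nearby":
        if lat is None or lon is None:
            raise ValueError("nearby ranking needs the user's lat and lon")
        return sorted(
            articles,
            key=lambda a: haversine_km(lat, lon, a["latitude"], a["longitude"]),
        )
    raise ValueError(f"unknown intent: {intent}")
```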
6. Output:

○ Return the top 5 most relevant news articles in a JSON format. The structure should
be consistent across all simulated API endpoints. Each article should include:
■ Title
■ Description
■ URL
■ Publication Date
■ Source Name
■ Category
■ Relevance Score
■ LLM-Generated Summary

Example Output:

{
"articles": [
{
"title": "Article Title 1",
"description": "Article Description 1",
"url": "https://www.example.com/article1",
"publication_date": "2024-04-28T10:00:00Z",
"source_name": "Example News",
"category": ["Technology"],
"relevance_score": 0.92,
"llm_summary": "This article discusses the latest developments in...",
"latitude": 37.4220,
"longitude": -122.0840
},
{
"title": "Article Title 2",
"description": "Article Description 2",
"url": "https://www.example.com/article2",
"publication_date": "2024-04-27T11:00:00Z",
"source_name": "Another News Source",
... (3 more articles)]}
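
A small sketch of how the response could be assembled so that every endpoint returns the same shape. The field tuple follows the required list above; the "llm_summary" key matches the example output, while the function name and the use of None for missing values are assumptions. The "latitude" and "longitude" keys shown in the example could be added to the tuple in the same way.

```python
# Output keys, in the order listed in the task.
OUTPUT_FIELDS = (
    "title",
    "description",
    "url",
    "publication_date",
    "source_name",
    "category",
    "relevance_score",
    "llm_summary",
)


def build_response(ranked_articles, limit=5):
    """Project already-ranked articles onto the common output shape, keeping the top `limit`."""
    return {
        "articles": [
            {field: article.get(field) for field in OUTPUT_FIELDS}  # missing values become None
            for article in ranked_articles[:limit]
        ]
    }
```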

Additional Guidance:

● The above task is designed to be flexible, allowing you to demonstrate your skills at
different levels of depth. While completing the entire task will showcase a strong
understanding of backend engineering, even completing the initial stages can provide
valuable insights into your abilities. Feel free to take the task as far as you can do
well. At a minimum, provide APIs to load the data from the file into your preferred DB
and create 2-3 API endpoints to fetch the data.
● Database Setup: You'll need to have a database server set up and running (either a
local instance or a cloud-managed database).
● Dependencies: Make sure you have the necessary libraries installed.

API Design Considerations:

● RESTful Principles: Design the API following RESTful principles. Use appropriate HTTP
methods (e.g., GET, POST) and status codes.
● Endpoint Structure: Consider a base URL (e.g., /api/news) and then specific endpoints
for each simulated API function (e.g., /api/news/category, /api/news/search,
/api/news/nearby).
● Versioning: Include API versioning in the URL (e.g., /api/v1/news) to allow for future
updates.

Input Handling:
● Query Parameters: Use query parameters to pass user input to the API (e.g., GET
/api/news/search?query=Elon+Musk&location=Palo+Alto).
● Location Input: For the "nearby" endpoint, expect latitude and longitude parameters
(e.g., GET /api/news/nearby?lat=37.4220&lon=-122.0840&radius=10).
● Error Handling: Implement robust error handling. Return appropriate HTTP error codes
and informative error messages in the response body.
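
As an illustration of input and error handling for the "nearby" endpoint, the sketch below validates the lat, lon, and radius query parameters without depending on any web framework. The function names, the 10 km default radius, the "radius_km" key, and the error body shape are assumptions. It returns an HTTP status code together with either the parsed values or an error message.

```python
import math

DEFAULT_RADIUS_KM = 10.0  # assumed default, matching the 10 km example radius


def _to_float(params, name, default=None):
    """Read a query parameter as a finite float or raise ValueError with a clear message."""
    raw = params.get(name, default)
    if raw is None:
        raise ValueError(f"missing required parameter '{name}'")
    try:
        value = float(raw)
    except (TypeError, ValueError):
        raise ValueError(f"parameter '{name}' must be a number") from None
    if not math.isfinite(value):
        raise ValueError(f"parameter '{name}' must be a finite number")
    return value


def parse_nearby_params(params):
    """Return (200, parsed values) on success or (400, {"error": message}) on bad input."""
    try:
        lat = _to_float(params, "lat")
        lon = _to_float(params, "lon")
        radius = _to_float(params, "radius", DEFAULT_RADIUS_KM)
        if not -90 <= lat <= 90:
            raise ValueError("parameter 'lat' must be between -90 and 90")
        if not -180 <= lon <= 180:
            raise ValueError("parameter 'lon' must be between -180 and 180")
        if radius <= 0:
            raise ValueError("parameter 'radius' must be greater than 0")
    except ValueError as exc:
        return 400, {"error": str(exc)}
    return 200, {"lat": lat, "lon": lon, "radius_km": radius}
```

Returning a status and body pair keeps the validation reusable: a route handler in any framework can pass the pair straight through as its response.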

Output Formatting:
● JSON Format: Return the news articles in JSON format, as specified in the task
description.
● Consistent Structure: Maintain a consistent output structure across all API endpoints.
● Metadata: Consider including metadata in the response, such as the total number of
results, the page number, and the query that was used.
