Problem Statement Interns Hiring Drive Dec2025

The document outlines a structured assessment for machine learning problem-solving, focusing on data preparation, feature engineering, pattern discovery, model development, API integration, and cloud deployment architecture. It specifies tasks to be completed using a customer-level credit card dataset for churn analysis, including data understanding, feature creation, and visualization of patterns. Deliverables include executable code, documentation, architecture diagrams, and a report detailing insights and model evaluation.

Uploaded by

premasahana5279

Problem statement: ML-based Problem-Solving

Duration: 2 hours 30 minutes


Assessment Topic: Data Preparation, ML, APIs, Front-end/UI development
Tech Stack: Python
Goal: To introduce structured reasoning in ML coding

Task 1: Data Understanding Under Constraints


Choose a dataset with at least 8 columns and mixed data types.
Produce a summary showing:
Type breakdown
Missing-value breakdown
Estimated model difficulty
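
A minimal sketch of such a summary using pandas. The column names and values in `demo` are made up for illustration, and the "difficulty" rule (more than 5% of cells missing, or more than half the columns non-numeric) is only an assumed heuristic, not something the problem statement prescribes.

```python
import numpy as np
import pandas as pd


def summarize(df: pd.DataFrame) -> dict:
    """Return a type breakdown, a missing-value breakdown, and a rough difficulty label."""
    # Number of columns per dtype, e.g. {"float64": 2, "object": 1}.
    types = df.dtypes.astype(str).value_counts().to_dict()

    # Missing values per column, as a count and as a percentage of rows.
    missing_count = df.isna().sum()
    missing = pd.DataFrame({
        "missing_count": missing_count,
        "missing_pct": (missing_count / len(df) * 100).round(1),
    })

    # ASSUMED heuristic: one point for heavy missingness, one point for
    # a mostly non-numeric table. Adjust the thresholds to your own judgment.
    share_missing = df.isna().to_numpy().mean()
    n_non_numeric = df.select_dtypes(exclude="number").shape[1]
    score = int(share_missing > 0.05) + int(n_non_numeric > df.shape[1] / 2)
    difficulty = ["low", "medium", "high"][score]

    return {"types": types, "missing": missing, "difficulty": difficulty}


# Hypothetical toy data; replace with the dataset from the assessment kit.
demo = pd.DataFrame({
    "age": [34, 45, np.nan, 29],
    "gender": ["F", "M", "M", None],
    "credit_limit": [5000.0, 12000.0, 8000.0, 3000.0],
})
report = summarize(demo)
print(report["types"])
print(report["missing"])
print("Estimated difficulty:", report["difficulty"])
```

Whatever rule is used for the difficulty estimate, it helps to state the reasoning behind it in the report, since that judgment is what the task asks for.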

Task 2: Feature engineering


Expected tasks:
Identify the features, their relevance, and whether they are sufficient.
Create a column conditionally (for example, if salary > avg_salary then "High").
Create a grouped aggregation summary (mean, count).
Filter rows with compound boolean logic.
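
The three pandas operations listed above can be sketched as follows. The `salary` example comes from the task text; the other column names (`segment`, `months_inactive`) and all values are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; column names are assumptions for illustration.
df = pd.DataFrame({
    "segment": ["A", "A", "B", "B", "B"],
    "salary": [30_000, 70_000, 50_000, 90_000, 40_000],
    "months_inactive": [1, 4, 2, 5, 3],
})

# 1. Conditional column: "High" if salary is above the average, else "Low".
avg_salary = df["salary"].mean()
df["salary_band"] = np.where(df["salary"] > avg_salary, "High", "Low")

# 2. Grouped aggregation summary: mean and count of salary per segment.
summary = df.groupby("segment")["salary"].agg(["mean", "count"])

# 3. Compound boolean filter. Each condition needs its own parentheses
#    because & and | bind more tightly than comparison operators.
flagged = df[(df["salary"] > avg_salary) & (df["months_inactive"] >= 4)]

print(df)
print(summary)
print(flagged)
```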

Task 3: Pattern Discovery


Identify 3 interesting patterns using visualizations.
Provide short explanations of possible root causes.
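
One possible way to lay out three pattern plots side by side with matplotlib is sketched below. The columns (`churned`, `txn_count`, `months_inactive`, `segment`) and their values are invented for illustration; the plots to choose in practice depend on what the real dataset shows.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so the script also runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical toy data; replace with the dataset from the assessment kit.
df = pd.DataFrame({
    "segment": ["A", "A", "B", "B", "A", "B"],
    "churned": [0, 0, 1, 1, 0, 1],
    "txn_count": [80, 65, 30, 25, 70, 40],
    "months_inactive": [1, 2, 4, 5, 1, 3],
})

fig, axes = plt.subplots(1, 3, figsize=(12, 3.5))

# Pattern 1: churn rate per segment (bar chart).
df.groupby("segment")["churned"].mean().plot.bar(
    ax=axes[0], title="Churn rate by segment"
)

# Pattern 2: transaction counts for retained vs churned customers (box plot).
groups = [df.loc[df["churned"] == flag, "txn_count"] for flag in (0, 1)]
axes[1].boxplot(groups)
axes[1].set_xticks([1, 2])
axes[1].set_xticklabels(["retained", "churned"])
axes[1].set_title("Transactions by churn status")

# Pattern 3: inactivity vs transactions, coloured by churn flag (scatter).
axes[2].scatter(df["months_inactive"], df["txn_count"], c=df["churned"], cmap="coolwarm")
axes[2].set_xlabel("months_inactive")
axes[2].set_ylabel("txn_count")
axes[2].set_title("Inactivity vs transactions")

fig.tight_layout()
```

Calling `fig.savefig(...)` with a file name of your choice writes the figure to disk for the report; each plot should be paired with a sentence or two on the likely root cause.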

Task 4: Model selection, Model development, Model Evaluation and Performance tracking
Decide on the model to be developed.
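
As one illustration of how the selection, evaluation, and tracking steps could fit together, the sketch below compares two candidate classifiers with 5-fold cross-validation using scikit-learn. The synthetic data from `make_classification`, the two candidate models, and ROC AUC as the metric are all assumptions made for the example; the problem statement leaves these choices open.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic, imbalanced stand-in for a churn dataset (assumption for illustration).
X, y = make_classification(
    n_samples=400, n_features=8, weights=[0.8, 0.2], random_state=0
)

# Candidate models; the choice of these two is illustrative only.
candidates = {
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Simple performance tracking: one record per model with mean and spread of the score.
results = []
for name, model in candidates.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
    results.append({
        "model": name,
        "roc_auc_mean": scores.mean(),
        "roc_auc_std": scores.std(),
    })

best = max(results, key=lambda record: record["roc_auc_mean"])
for record in results:
    print(record)
print("Selected:", best["model"])
```

Keeping the `results` records (for example, by writing them to a CSV file) gives a basic performance log that can be cited in the report.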

Task 5: Application and API integration


Demonstrate the ability to serve ML predictions through an API and consume them through a
user-facing Streamlit application.
Implement a single prediction endpoint using FastAPI or Flask.
Build a simple Streamlit app that:
Displays the dataset shape and column names
Takes user input through simple widgets (for example, text input or number input)
Calls the API to fetch predictions
Displays the predictions, model evaluation, and performance metrics
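
A minimal sketch of the API side using Flask, one of the two frameworks the task allows. The `/predict` route, the JSON shape (`{"features": [...]}`), the four-feature input, and the stand-in model trained on synthetic data are all assumptions made for illustration.

```python
from flask import Flask, jsonify, request
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

N_FEATURES = 4  # assumed input size for this sketch

# Stand-in model trained on synthetic data; a real service would load
# the model artefact produced in Task 4 instead.
X, y = make_classification(n_samples=200, n_features=N_FEATURES, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X, y)

app = Flask(__name__)


@app.route("/predict", methods=["POST"])
def predict():
    """Single prediction endpoint: JSON in, churn probability out."""
    payload = request.get_json(silent=True) or {}
    features = payload.get("features")

    # Reject anything that is not a list of exactly N_FEATURES numbers.
    valid = (
        isinstance(features, list)
        and len(features) == N_FEATURES
        and all(isinstance(v, (int, float)) for v in features)
    )
    if not valid:
        return jsonify(error=f"'features' must be a list of {N_FEATURES} numbers"), 400

    proba = float(model.predict_proba([features])[0, 1])
    return jsonify(churn_probability=proba, churn=bool(proba >= 0.5))
```

Once the server is running (for example, via the `flask run` command), the Streamlit app can collect values with number-input widgets, send them as JSON in a POST request to this endpoint, and display the returned probability alongside the stored evaluation metrics.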

Task 6: Cloud deployment architecture Design


Draw a cloud deployment architecture diagram that includes:
ETL/Data ingestion layer
UI layer
API layer
Data storage and model artefact storage layer
The architecture should be:
Based on one cloud service provider of your choice (AWS, Azure, or GCP)
Clearly depict data flow and component responsibilities

Dataset to be used
Your assessment kit includes the dataset to be used during the assessment.
Details of the dataset follow.

This is a customer-level dataset from a credit card portfolio, primarily intended for churn
analysis and related tasks like segmentation and credit risk modeling. Each row corresponds
to an individual customer and includes demographic details, account attributes, financial
metrics, and behavioral indicators.
Intended use:
Churn prediction: Identifying early warning signs like declining transactions
and rising inactivity.

Deliverables:
Executable code base
Solution approach: documentation and any other auxiliary information, which can
include:
Brief report that includes:
Data quality
Transformation applied
Insights on patterns discovered
Model selection and evaluation
Implicit and non-obvious insights or inferences about data and/or model
Architecture diagrams
Screenshots of the UI
PS: Please add your personal details to the first slide or cover page of your report. Include
the following details:
Full Name
Email ID
College Name
Stream (e.g., Computer Science Engineering, Artificial Intelligence and Data
Science, Electronics and Communication Engineering)
