Problem statement: ML-based Problem Solving
Duration: 2:30 Hrs
Assessment Topic: Data Preparation, ML, APIs, Front-end/UI development
Tech Stack: Python
Goal: To introduce structured reasoning in ML coding
Task 1: Data Understanding Under Constraints
Choose a dataset with at least 8 columns and mixed data types.
Produce a summary showing:
Type breakdown
Missing-value breakdown
Estimated model difficulty
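The summary items above can be sketched with pandas as follows. This is a minimal illustration, not a prescribed solution: the toy DataFrame, its column names, and the `summarize` helper are all hypothetical stand-ins for the provided dataset. The minority-class share is only one possible input to a difficulty estimate, since the brief does not define how difficulty should be measured.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the real dataset; column names and values are made up.
df = pd.DataFrame({
    "customer_age": [45, 52, np.nan, 38, 61, 29],
    "income_band": ["Low", "High", None, "Mid", "High", "Low"],
    "total_trans_ct": [42, 17, 60, np.nan, 25, 75],
    "churned": [0, 1, 0, 0, 1, 0],
})

def summarize(frame: pd.DataFrame, target: str) -> dict:
    """Hypothetical helper: type breakdown, missing breakdown, class balance."""
    # Type breakdown: number of columns per dtype.
    type_breakdown = frame.dtypes.astype(str).value_counts().to_dict()
    # Missing breakdown: count and percentage of missing values per column.
    missing = pd.DataFrame({
        "missing_count": frame.isna().sum(),
        "missing_pct": (frame.isna().mean() * 100).round(1),
    })
    # One rough difficulty signal (an assumption): share of the rarest class.
    minority_share = frame[target].value_counts(normalize=True).min()
    return {
        "shape": frame.shape,
        "types": type_breakdown,
        "missing": missing,
        "minority_share": minority_share,
    }

summary = summarize(df, target="churned")
print(summary["types"])
print(summary["missing"])
print(f"Minority class share: {summary['minority_share']:.2f}")
```

A small minority share or a high missing percentage would both suggest a harder modelling problem, but the weighting between such signals is left to the candidate.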
Task 2: Feature Engineering
Expected tasks:
Identify features, their relevance, and feature sufficiency.
Create a column conditionally (for example, if salary > avg_salary then "High").
Create a grouped aggregation summary (mean, count).
Filter rows with compound boolean logic.
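The three pandas operations listed above might look like the sketch below. The data is a made-up toy frame. The `segment` and `months_inactive` columns and the "Low" label are assumptions added for illustration, since the brief only names `salary`, `avg_salary` and "High".

```python
import numpy as np
import pandas as pd

# Toy data for illustration only; not the assessment dataset.
df = pd.DataFrame({
    "salary": [30000, 85000, 52000, 120000, 41000, 67000],
    "segment": ["A", "B", "A", "B", "A", "B"],
    "months_inactive": [1, 4, 2, 5, 0, 3],
})

# Conditional column: "High" when salary exceeds the average, else "Low".
avg_salary = df["salary"].mean()
df["salary_band"] = np.where(df["salary"] > avg_salary, "High", "Low")

# Grouped aggregation summary: mean and count of salary per segment.
grouped = df.groupby("segment")["salary"].agg(["mean", "count"])

# Compound boolean filter: each condition is parenthesised and joined with &.
filtered = df[(df["salary"] > avg_salary) & (df["months_inactive"] >= 4)]

print(grouped)
print(filtered)
```

Note that pandas requires `&` and `|` (not `and` / `or`) for element-wise boolean logic, with parentheses around each comparison.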
Task 3: Pattern Discovery
Identify 3 interesting patterns using visualizations.
Provide short explanations of possible root causes.
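One way to surface a pattern is to plot an aggregated rate against a candidate driver. The sketch below assumes hypothetical `months_inactive` and `churned` columns and uses made-up values, so the chart it produces says nothing about the real data.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs headless
import matplotlib.pyplot as plt
import pandas as pd

# Toy data for illustration only; column names are assumptions.
df = pd.DataFrame({
    "months_inactive": [1, 1, 2, 2, 3, 3, 4, 4],
    "churned":         [0, 0, 0, 1, 0, 1, 1, 1],
})

# Churn rate per inactivity level (mean of a 0/1 flag is a rate).
churn_rate = df.groupby("months_inactive")["churned"].mean()

fig, ax = plt.subplots(figsize=(5, 3))
ax.bar([str(m) for m in churn_rate.index], churn_rate.values)
ax.set_xlabel("Months inactive")
ax.set_ylabel("Churn rate")
ax.set_title("Churn rate by inactivity (toy data)")
fig.tight_layout()
```

Calling `fig.savefig(...)` with a file name of your choice would write the chart to disk for the report. A plot like this only shows an association, so the accompanying explanation should state root causes as hypotheses rather than conclusions.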
Task 4: Model Selection, Development, Evaluation and Performance Tracking
Decide on the model to be developed.
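A minimal sketch of comparing candidate models on a held-out split is shown below, using scikit-learn. The two candidate models, the metrics, and the synthetic data from `make_classification` are assumptions chosen for illustration. The brief leaves the choice of model and metrics to the candidate.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced binary data standing in for the prepared features.
X, y = make_classification(
    n_samples=500, n_features=8, weights=[0.8, 0.2], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Candidate models to compare (an illustrative choice, not a requirement).
candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Performance tracking: one record of metrics per candidate.
results = []
for name, model in candidates.items():
    model.fit(X_train, y_train)
    proba = model.predict_proba(X_test)[:, 1]
    pred = model.predict(X_test)
    results.append({
        "model": name,
        "roc_auc": roc_auc_score(y_test, proba),
        "f1": f1_score(y_test, pred),
    })

# Pick the candidate with the highest ROC AUC on the held-out split.
best = max(results, key=lambda r: r["roc_auc"])
print(results)
print("Selected:", best["model"])
```

Keeping the metrics in a list of records makes it straightforward to turn them into a table for the report or to display them in the UI.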
Task 5: Application and API Integration
Demonstrate the ability to serve ML predictions through an API and consume them through a
user-facing Streamlit application.
Implement a single prediction endpoint using FastAPI/Flask.
Build a simple Streamlit app that:
Displays the dataset shape and column names
Takes user input using simple widgets (for example, text input or number input)
Calls the API to fetch predictions
Displays the predictions, model evaluation and performance metrics
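A single prediction endpoint could be sketched with Flask as below. The `/predict` route, the JSON field names, and the `predict_churn` function are hypothetical. In particular, `predict_churn` is a hard-coded placeholder standing in for a trained model and is not a real scoring rule.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

def predict_churn(features: dict) -> float:
    """Placeholder for a trained model (assumption): returns a fake score."""
    score = 0.1 + 0.15 * float(features.get("months_inactive", 0))
    return min(score, 1.0)

@app.route("/predict", methods=["POST"])
def predict():
    # silent=True returns None instead of raising on a missing or bad JSON body.
    payload = request.get_json(silent=True)
    if not isinstance(payload, dict):
        return jsonify({"error": "expected a JSON object"}), 400
    probability = predict_churn(payload)
    return jsonify({
        "churn_probability": probability,
        "churned": probability >= 0.5,
    })
```

The endpoint can be exercised without starting a server by using Flask's built-in test client, for example `app.test_client().post("/predict", json={"months_inactive": 4})`. A Streamlit page could then collect values through its input widgets, send them to the running endpoint with an HTTP POST, and display the returned JSON.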
Task 6: Cloud Deployment Architecture Design
Draw a cloud deployment architecture diagram that includes:
ETL/Data ingestion layer
UI layer
API layer
Data storage and model artefact storage layer
The architecture should:
Target any one cloud service provider of your choice (AWS/Azure/GCP)
Clearly depict data flow and component responsibilities
Dataset to be used
Your assessment kit includes the dataset to be used during the assessment.
The details of the dataset are as follows.
This is a customer-level dataset from a credit card portfolio, primarily intended for churn
analysis and related tasks like segmentation and credit risk modeling. Each row corresponds
to an individual customer and includes demographic details, account attributes, financial
metrics, and behavioral indicators.
Intended use:
Churn prediction: Identifying early warning signs like declining transactions
and rising inactivity.
Deliverables:
Executable code base
Solution approach: documentation and any other auxiliary information, which can include:
Brief report that includes:
Data quality
Transformations applied
Insights on patterns discovered
Model selection and evaluation
Implicit and non-obvious insights or inferences about data and/or model
Architecture diagrams
Screenshots of the UI
PS: Please add your personal details to the first slide/cover page of your report. Include
the following details:
Full Name
Email ID
College Name
Stream (e.g., Computer Science Engineering, Artificial Intelligence and Data
Science, Electronics and Communication Engineering)