Test File 3

Ananya Kukreti

Gurugram, Haryana, India | 7895448416 | [email protected] | www.linkedin.com/in/ananya-kukreti

SUMMARY
High-performing senior data engineer with over 3 years of experience in creating, deploying, refining and scheduling data
pipelines and ETL processes. Designed and optimized data pipelines, achieving up to a 40% increase in processing
efficiency. Thrives in high-pressure environments, applying expertise in Big Data, Python, Spark, Hadoop, SQL
and shell scripting to drive team success.

TECHNICAL SKILLS:
Programming: Python, SQL, Shell Scripting
Big Data: Spark, PySpark, Hadoop, Hive
Cloud: Amazon MSK, AWS Glue, Amazon EMR, AWS Lambda, Amazon RDS
Others: CA7 mainframe job scheduling

CERTIFICATIONS: Azure AZ900 Certification

PROFESSIONAL EXPERIENCE
Incedo Inc. Gurugram, Haryana / India

Project 1 September 2022 – Present


Senior Data Engineer
Python | Hive | Hadoop | Spark | Shell Scripting | PySpark

Working as a senior data engineer for a client that ranks among the top 5 banks in the USA.

● Enhanced and streamlined ETL processes focusing on data ingestion, enrichment, and analysis to improve data
accuracy and overall pipeline performance.
● Collaborated with Data Analysts and converted the STMs they prepared into optimized PySpark code to build robust data
pipelines.
● Crafted code tailored for ingesting more than 20 TB of data from diverse data sources and enriching it.
● Engineered more than 50 features from ingested data which generated leads that were pivotal in identifying prospective
mortgage customers for the client.
● Led a team of 3 data engineers in automating processes associated with data enrichment and ingestion of multiple projects
for the client using CA7 job scheduling.
● Achieved a 40% reduction in data processing time and query response time by optimizing and performance-tuning data
enrichment procedures.
● Contributed to the development of a control and validation framework using Python and Spark for data quality checks,
leading to the identification and resolution of 10+ data anomalies monthly.

Project 2 April 2021 – September 2022


Data Engineer
Python | SQL | Kafka | AWS

Worked for a client that is one of the top designers, manufacturers, and distributors of end-to-end networking, security and
connectivity products in the USA to develop a pipeline that detects anomalies in data gathered by IoT devices.

● Engineered an ETL process from the ground up using Python to efficiently analyze 500 GB of wired and wireless data
generated by the client's IoT devices.
● Utilised Amazon MSK for data ingestion from Kafka and for sending results back to Kafka topics, AWS Glue and Amazon EMR
for data processing, and Amazon RDS for data storage.
● Improved program performance by leveraging multithreading in Python, achieving a 60-70% reduction in runtime.
● Designed and implemented modules to collect real-time data from Kafka and analyze baseline patterns and anomalies across
network protocols: TCP, UDP, and ICMP.
● Pushed refined data to Kafka topics and MySQL tables which enabled data scientists to leverage it for visualization.

EDUCATION & OTHER
UNIVERSITY: Banasthali Vidyapith | GRADUATION YEAR: 2021
Bachelor of Technology (Computer Science) | CGPA: 7.8
LANGUAGES: Hindi (native), English (fluent), French (elementary proficiency)
INTERESTS: Reading, listening to podcasts
