Leaping Into The Career of Data Science: 4/27/20 Slide No: 1 Machine Learning by Sathish Yellanki

The document discusses the roles of data engineers, data analysts, and data scientists. It explains that data engineers build and optimize systems to allow data scientists and analysts to do their work. Data analysts review, analyze, and report on big data to find business insights. Data scientists use skills in computer science, modeling, and analytics to extract valuable insights from data and solve business problems.

Leaping into The Career of Data Science

4/27/20 Machine Learning By Sathish Yellanki Slide No : 1


Data Engineer Versus Data Scientist



Let Us Understand Who is a Data Engineer?



• Data Engineers Build And Optimize The Systems Allowing Data Scientists
And Analysts To Perform Their Work.
• Data Engineer Should Ensure That Any Data is Properly
• Received
• Transformed
• Stored
• Made Accessible
Data Engineer Responsibilities
• Establish The Foundation Architecture For Data Analysts and Data Scientists
• Take Responsibility To Construct The Data Pipelines, To Handle Huge Data
• Should Understand The Entire Software Development Life Cycle
• Should Keep Focus on
• Leveraging Data Tools
• Maintaining Databases
• Creating and Managing Data Pipelines
• Should Develop a Mindset of Building and Optimizing Applications
What are The Tasks of a Data Engineer?
• Building APIs For Data Consumption.
• Integrating External OR New Datasets into Existing Data Pipelines.
• Applying Feature Transformations For Machine Learning Models on New Data.
• Continuous Monitoring & Testing of The System To Ensure Optimized Performance.
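As an illustration of applying feature transformations to new data, here is a minimal Python sketch. It assumes one numeric feature and a simple mean/standard-deviation scaling; the class name MeanStdScaler and the sample values are hypothetical and not from the slides. The point it shows is that the statistics are learned once from training data and then reused unchanged on every new batch.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class MeanStdScaler:
    """Hypothetical helper: holds statistics learned from training data."""
    mu: float = 0.0
    sigma: float = 1.0

    def fit(self, values):
        # Learn the statistics once, from the training data only.
        self.mu = mean(values)
        # Fall back to 1.0 when all values are identical (zero spread).
        self.sigma = pstdev(values) or 1.0
        return self

    def transform(self, values):
        # Reuse the stored statistics so new data is scaled the same way.
        return [(v - self.mu) / self.sigma for v in values]


# Illustrative values only.
training_ages = [20.0, 30.0, 40.0]
scaler = MeanStdScaler().fit(training_ages)

# A batch that arrives later through the pipeline.
new_batch = scaler.transform([30.0, 50.0])
print(new_batch)
```

Keeping the fitted statistics separate from the transform step is what lets the pipeline treat new data consistently with the data the model was trained on.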
Finally What is Data Engineering?

• Software Engineering
• Business Intelligence
• Big Data Abilities

Services Provided By a Data Engineer


Data Ingestion
• “Scraping” Databases, Loading Logs, Fetching Data From External Stores OR APIs.
Metric Computation
• Frameworks To Compute & Summarize Engagement, Growth OR Segmentation Related Metrics.
Anomaly Detection
• Automating Data Consumption To Alert People on Anomalous Events OR Changing Trends.
Metadata Management
• Allow Generation & Consumption of Metadata, Making it Easy To Find Information in The Data Warehouse (DWH).
Experimentation
• A/B Testing And Experimentation Frameworks For The Company’s Analytics, With A Significant Data Engineering Component Integrated Into Them.
Instrumentation
• Log Events And The Attributes Related To Every Event, Making Sure That High-Quality Data is Captured Upstream.
Dependencies
• Establish Pipelines That Are Specialized in Understanding Series of Actions in Time, Allowing Analysts To Understand User Behaviors.
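Two of the services above, metric computation and anomaly detection, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: events arrive as (day, user_id) pairs, the metric is daily active users, and an anomaly is any day more than a chosen number of standard deviations from the mean. The function names, the threshold and the sample numbers are all hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def daily_active_users(events):
    """Count distinct users per day from (day, user_id) pairs."""
    users_by_day = defaultdict(set)
    for day, user_id in events:
        users_by_day[day].add(user_id)
    return {day: len(users) for day, users in sorted(users_by_day.items())}


def flag_anomalies(metric_by_day, threshold=2.0):
    """Return the days whose value lies more than `threshold`
    standard deviations away from the mean of all days."""
    values = list(metric_by_day.values())
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []  # every day is identical, nothing to flag
    return [day for day, value in metric_by_day.items()
            if abs(value - mu) / sigma > threshold]


# Illustrative event log: the same user seen twice counts once per day.
events = [("2020-04-01", "a"), ("2020-04-01", "b"),
          ("2020-04-01", "a"), ("2020-04-02", "a")]
dau = daily_active_users(events)

# Illustrative metric series with one sudden drop on the last day.
series = {"d1": 100, "d2": 102, "d3": 98, "d4": 101, "d5": 99, "d6": 10}
alerts = flag_anomalies(series)
print(dau, alerts)
```

A production framework would typically compare each day against a rolling history rather than the whole series, but the shape of the job is the same: compute the metric, then alert on values that break the expected pattern.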
Learning To Be a Data Engineer
• Data Engineers Must Focus More on Learning
• Data Modeling Techniques
• Relational And Non-Relational Database Theory And Practice
• Database Clustering Tools And Techniques
• ETL Design
• Architectural Projections
Salary Projections



Let Us Understand Who is a Data Analyst?



• Big Data Analyst Reviews, Analyzes And Reports on Big Data Stored And
Maintained by an Organization.
• Big Data Analysts Use
• Manual Techniques
• Automated Big Data Analysis/Analytics Software
• Big Data Analysts Analyze
• Large Amounts of Raw & Unstructured Data
• Big Data Analysts’ Main Intent is to Find
• Business Insight
• Intelligence
• Useful Information
Big Data Analyst Responsibilities
• Should be Well Versed in Big Data Concepts
• Possesses Knowledge & Skills in Using
• Database Querying Languages
• Big Data Analytics Software
• Should Have Good Understanding of
• Data Mining
• Data Extraction Techniques
• Should Usually Work in Coordination With
• Data Scientists
• Database Developers/Administrators
• Management Team
Big Data Analyst Skills
• A High Level of Mathematical Ability.
• Programming Languages, Such As
• Oracle SQL Or Any SQL Flavor
• Python
• R Language
• Java OR Scala
• Good Ability To
• Analyze The Data and Business
• Model The Data For Business
• Interpret The Data in The Business
• Problem-Solving Skills With Design of Algorithms
• A Methodical And Logical Approach
• Should Have Good Ability To
• Plan The Work
• Meet Deadlines
• Develop Good Accuracy and Attention To Detail
• Interpersonal Skills
• Team Working skills
• Written & Verbal Communication Skills
Let Us Understand Who is a Data Scientist?



• Data Science is a Study Which Involves Extracting Knowledge From Data
• A Data Scientist Should Have the Skill to Turn Raw Data into Valuable
Insights That An Organization Needs.
• A Data Scientist Should Find the Valuable Insights That Help the
Business Owner Grow And Compete in the Market.
• Data Scientist Should Have the Skill to Interpret And Analyze the Data From
Multiple Sources To Come Up With Imaginative Solutions To Problems.
• Data Scientist Should Use Their Strong Business Sense Along With An Ability
To Communicate Findings To Both Business And IT.
• Should Have the Leadership That Can Influence “How An Organization
Approaches A Business Challenge”.
• Data Scientists May Have Different Functions Depending on the
Industry/Sector in Which They Work.
• Should Have the Ability To Combine Practical Skills Such as Coding And
Mathematics With The Ability To Analyze Statistics.
• Should Have the Ability to Model the Data in the Interest of the Business
Growth and Targets.
• Data Scientist Should Eliminate the Noise and Identify the Canonical
Representative Data Points.
• Data Scientist Generalizes the Data Model to be Able to Make Useful
Statistical Predictions.
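The idea of generalizing a model so that it makes useful predictions can be illustrated with a small Python sketch. It assumes the simplest possible model, a straight line fitted by least squares, and checks it on data points that were held back from fitting. The helper names and all numbers are hypothetical, chosen only for illustration.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_abs_error(model, xs, ys):
    """Average absolute gap between predictions and observed values."""
    slope, intercept = model
    return sum(abs(slope * x + intercept - y) for x, y in zip(xs, ys)) / len(xs)


# Noisy observations used to fit the model (illustrative values).
train_x, train_y = [1, 2, 3, 4], [2.1, 3.9, 6.2, 7.8]
# Points held back from fitting, used only to check generalization.
test_x, test_y = [5, 6], [10.1, 11.9]

model = fit_line(train_x, train_y)
holdout_error = mean_abs_error(model, test_x, test_y)
print(model, holdout_error)
```

A low error on the held-back points is the evidence that the model has captured the underlying trend rather than the noise in the training data.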
Data Scientist Responsibilities
• Should Use Strong Business Acumen
• To Deliver Useful Insights, Should Have a Strong Ability To
• Communicate Findings
• Mine Vast Amounts of Data
• Use Insights To Influence How An Organization Approaches Business
Challenges
• To Solve Problems Use A Combined Knowledge of
• Computer Science And Applications
• Modeling
• Statistics
• Analytics
• Mathematics
• Extract Data From Multiple Sources, Which Can be
• Un-Structured
• Semi-Structured
• Structured
• Sift And Analyze Data From Multiple Angles, Looking For Trends That
Highlight Problems OR Opportunities
• Communicate Important Information & Insights To Business And IT Leaders
• Make Recommendations To Adapt Existing Business Strategies
Key Skills For Data Scientists (Non-Technical)
• Problem-Solving Skills
• Communication Skills
• Teamwork Skills
• Investigative Skills
• Interest in Statistics
• Interest in Predicting Trends and Identifying Patterns
• Innovative Thinking
• Observation Skills
• Critical Thinking
Key Skills For Data Scientists (Technical)
• Java OR Scala Coding
• Python Coding
• R Programming
• Understand Hadoop Platform
• SQL Database/Coding
• Apache Spark
• Machine Learning and AI
• Data Visualization With Reporting Tools
• Design of Algorithms
• Advanced Statistics
Let Us Get More Insights



Thank You Very Much
