Semester Project (Deep Learning)
M.Sc. Data Science (Weekend)
Project Title: Analyzing Research Trends in Computer Science
Summary:
In this project, you will analyze a large corpus of research papers in the field of Computer
Science published between 2001 and 2020 (20 years). The goal of this project
is to identify emerging research trends and key areas of interest within the field over the past
two decades. You will be tasked with gathering and processing data from a selection of
prominent Computer Science journals, then using advanced data analysis techniques,
specifically Deep Neural Networks (DNNs), to uncover insights into the evolving landscape
of research in Computer Science.
Project Overview:
Computer Science is a rapidly evolving field, with new advancements and subfields
constantly emerging. Over the years, the focus of research has shifted, with different areas
gaining prominence due to technological advancements, industry needs, and scientific
developments. Identifying these shifts can provide valuable insights into the future direction
of the field, as well as help researchers, educators, and industry professionals make
informed decisions about where to focus their efforts.
In this project, you will analyze a collection of research papers from major Computer
Science journals. You will focus on understanding how the focus areas of research have
changed from 2001 to 2020, with an emphasis on identifying emerging trends and key topics
in the field.
Data Collection:
Your first task will be to gather data from a selection of Computer Science journals.
Each group is assigned its own list of journals (attached), all recognized as major
sources of Computer Science research.
The data you will collect from each journal should include the following key information for
each published article between 2001 and 2020:
1. Title of the Paper: The full title of the paper, which will help you identify the topic of
research.
2. Abstract of the Paper: The abstract provides a concise summary of the research and
is essential for understanding the focus of each article.
3. Citation Count: The number of citations that the article has received, which can be
an indicator of the paper's impact and relevance in the field.
4. Publication Year: The year in which the paper was published, which will allow you to
track how the focus of research has shifted over time.
5. Authors & their Affiliation: The affiliation of authors for collaboration network
analysis. You can use only the country name as an affiliation of an author which can
be extracted from author’s addresses.
6. Key Words: The keywords of each article (if available).
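As a minimal sketch of the affiliation step in item 5, the Python snippet below matches
the last component of an author's address against a list of country names. The address
samples and the country list are assumptions for illustration; for real data you would
use a fuller gazetteer (for example, the pycountry package).

# Extract the country from an author's address string.
# Assumption: the country appears as the last comma-separated component,
# as it commonly does in journal affiliation strings.
addresses = [
    "Dept. of Computer Science, University of Oxford, Oxford, United Kingdom",
    "School of Computing, NUST, Islamabad, Pakistan",
]

# Small illustrative lookup; extend it (or use a gazetteer) for real data.
COUNTRIES = {"united kingdom", "pakistan", "usa", "china", "germany"}

def extract_country(address):
    last_part = address.rsplit(",", 1)[-1].strip().lower()
    return last_part if last_part in COUNTRIES else None

for addr in addresses:
    print(extract_country(addr))  # united kingdom, pakistan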
You will need to use web scraping techniques to extract this data from the journals. There
are various tools and programming languages you can use for web scraping, such as Python
with libraries like BeautifulSoup, Scrapy, or Selenium. The data you collect should be
organized in a structured format (such as CSV) to facilitate subsequent analysis.
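A minimal scraping sketch using requests and BeautifulSoup follows. The URL and the
HTML structure (an <article> tag per paper, with title/abstract/year/citations classes)
are purely hypothetical; inspect your assigned journals' pages and adapt the selectors
accordingly.

import csv
import requests
from bs4 import BeautifulSoup

# Hypothetical listing page of an assigned journal; replace with a real URL.
URL = "https://example.com/journal/volume-42/issue-1"

response = requests.get(URL, timeout=30)
response.raise_for_status()
soup = BeautifulSoup(response.text, "html.parser")

rows = []
for item in soup.select("article"):  # assumed page structure
    rows.append({
        "title": item.select_one(".title").get_text(strip=True),
        "abstract": item.select_one(".abstract").get_text(strip=True),
        "year": item.select_one(".year").get_text(strip=True),
        "citations": item.select_one(".citations").get_text(strip=True),
    })

# Store the records in a structured CSV, one row per article.
with open("articles_raw.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["title", "abstract", "year", "citations"])
    writer.writeheader()
    writer.writerows(rows)

Whatever tool you choose, respect each journal's terms of use and robots.txt, and add
delays between requests so you do not overload the publisher's servers.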
Data Processing:
Once you have gathered the data, the next step is to clean and preprocess it. This
includes tasks such as:
• Removing duplicates or irrelevant entries.
• Handling missing data or incomplete records.
• Standardizing text (e.g., handling variations in spelling, formatting, or abbreviations).
At this stage, you will also need to ensure that your dataset is structured properly for
analysis. For instance, you should create a table or a database with each record containing
the relevant data fields (title, abstract, citation count, etc.) so that it can be easily input into
your analysis tools.
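As one way to carry out these cleaning steps, the pandas sketch below removes
duplicates, handles missing fields, and standardizes the text. The file and column
names follow the hypothetical CSV from the scraping sketch above.

import pandas as pd

df = pd.read_csv("articles_raw.csv")

# Remove exact duplicates, including entries scraped twice under slightly
# different titles (case or whitespace variations).
df["title"] = df["title"].str.strip().str.lower()
df = df.drop_duplicates(subset=["title", "year"])

# Handle missing data: drop records without an abstract (needed for the
# text analysis) and treat a missing citation count as zero.
df = df.dropna(subset=["abstract"])
df["citations"] = pd.to_numeric(df["citations"], errors="coerce").fillna(0).astype(int)

# Standardize the abstract text: collapse whitespace and unify case.
df["abstract"] = df["abstract"].str.replace(r"\s+", " ", regex=True).str.strip().str.lower()

df.to_csv("articles_clean.csv", index=False)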
Applying Deep Neural Networks:
The final step of the project will involve applying Deep Neural Networks (DNNs) to the
cleaned data to identify emerging trends in the research. You will use DNNs to analyze
patterns in the text (such as topics and keywords) and in the citation data (to identify
highly influential research). The model will be trained to recognize patterns across
articles, time periods, and research topics, enabling it to track and predict trends in
Computer Science research over time.
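As a minimal sketch of this step, the Keras model below classifies abstracts by topic,
assuming you have assigned integer topic labels (for example, derived from the article
keywords). Once trained, predicting a topic for every abstract and counting topics per
publication year yields the trend curves. The layer sizes and labels are illustrative,
not prescribed.

import tensorflow as tf

# Assumption: parallel lists of abstracts and integer topic labels
# (e.g., 0 = machine learning, 1 = networks), loaded from your dataset.
texts = ["deep learning for image recognition ...",
         "energy-aware routing in wireless sensor networks ..."]
labels = [0, 1]
num_topics = 2

# Map raw text to fixed-length integer token sequences.
vectorizer = tf.keras.layers.TextVectorization(max_tokens=20000,
                                               output_sequence_length=200)
vectorizer.adapt(texts)
x = vectorizer(tf.constant(texts))
y = tf.constant(labels)

# A small feed-forward classifier over averaged word embeddings.
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(20000, 64, mask_zero=True),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(num_topics, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x, y, epochs=5)

# Trend analysis: assign a topic to every abstract, then count topics
# per publication year to see how the field's focus shifts.
topic_ids = model.predict(x).argmax(axis=1)

You are free to use richer architectures (for example, recurrent or transformer-based
encoders) if they better capture the patterns in your data.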
Expected Outcomes:
By the end of this project, your team will have developed a model and performed data
analysis including:
1. Exploratory Data Analysis: You will perform exploratory data analysis, producing a
variety of plots and summary statistics to gain insight into the data (see the sketch
after this list).
2. Identify Key Research Trends: The model will show how specific topics in Computer
Science have evolved, including which areas have gained more attention and which
have declined.
3. Highlight Emerging Areas of Interest: By analyzing the data from the last two
decades, you will be able to identify newly emerging fields or technologies that are
likely to shape the future of Computer Science.
4. Analyze Citation Impact: The model will also be able to identify which papers or
areas of research have had the most significant impact on the field, as measured by
citation counts.
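As a minimal sketch of the exploratory analysis in item 1, the snippet below plots the
number of papers and the mean citation count per year from the cleaned CSV; the file and
column names follow the earlier sketches and are assumptions.

import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("articles_clean.csv")

# Papers published per year: a first look at the growth of the corpus.
per_year = df.groupby("year").size()

# Mean citations per year: older papers have had more time to accumulate
# citations, so interpret this with the publication year in mind.
mean_citations = df.groupby("year")["citations"].mean()

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
per_year.plot(ax=ax1, title="Papers per year")
mean_citations.plot(ax=ax2, title="Mean citations per year")
plt.tight_layout()
plt.savefig("eda_overview.png")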
This analysis will provide valuable insights into the development of Computer Science and
help predict future trends. It will also contribute to a better understanding of how research
in this field is interconnected, which areas have been neglected, and where more resources
may be needed.
Final Deliverables:
At the end of the project, each group will be required to:
1. Submit a report summarizing the findings, including detailed analysis and
visualizations of the trends identified.
2. Provide datasets of the collected articles, including titles, abstracts, citation counts,
and any other relevant metadata (both raw and cleaned datasets).
3. Submit all source code files, including the code for scraping, cleaning, exploratory
data analysis, and the DNN implementation (source files only).
4. Present your findings to the class, demonstrating the trends you identified, the
methods used, and the impact of your findings.