Data Acquisition Notes

The document outlines the Data Acquisition stage of the AI Project Cycle, emphasizing the importance of quality data for AI projects to function intelligently. It details characteristics of quality data, types of data (structured and unstructured), and various methods for finding reliable data sources such as interviews, surveys, observations, APIs, web scraping, sensors, and cameras. The document highlights that acquiring the correct data in the right format is crucial for training, testing, and deploying AI systems.

Uploaded by

donrps52

Data acquisition

The next stage of the AI Project Cycle, after Problem Scoping, is Data
Acquisition, where the data required for the project is acquired in
specific forms and formats. To work on a problem and produce outcomes, an
AI project must be fed correct data in the right form.

SIGNIFICANCE OF DATA

An AI project is an (artificially) intelligent project that is capable of
making decisions or performing some intelligent tasks.

Data plays a crucial role in making an AI project behave intelligently,
because the AI project is trained on data to behave in a specific way.

To build an AI system, you need to source large amounts of data and create
data sets for training, testing and evaluation, and then for deployment of
the AI project. This process is repeated through several rounds of
training, testing and evaluation.
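
As an illustration of creating separate data sets, the following is a minimal sketch, assuming the acquired data is already a list of records. The function name `split_dataset` and the 70/15/15 proportions are assumptions chosen for the example, not values given in these notes.

```python
import random

def split_dataset(records, train_frac=0.7, test_frac=0.15, seed=0):
    """Shuffle the records and split them into training, testing and evaluation sets."""
    shuffled = list(records)
    # A fixed seed keeps the split repeatable across rounds.
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_test = int(len(shuffled) * test_frac)
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    # Whatever remains is held back for evaluation.
    evaluation = shuffled[n_train + n_test:]
    return train, test, evaluation

# Hypothetical data: 100 numbered records.
train, test, evaluation = split_dataset(range(100))
print(len(train), len(test), len(evaluation))
```

Shuffling before splitting keeps each set representative of the whole, and no record appears in more than one set.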

Quality Data Characteristics

As data is crucial for the success of any AI project, it is important to
ensure that it is quality data. Quality data has these characteristics:

1) Accuracy (i.e., Does the data correctly reflect the real-world values?)

2) Relevance (i.e., Do you really need this information?)

3) Completeness (i.e., How full (comprehensive) is the information?)

4) Timeliness (i.e., How up-to-date is the information?)

5) Reliability (i.e., Does the information contradict other trusted
resources?)

6) Validity (i.e., Is the information compliant with requirements?)
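
Some of these characteristics can be checked automatically. The sketch below covers only completeness, validity and timeliness. The records, the field names, the 0 to 120 age range and the one-year freshness limit are all assumptions made up for the example.

```python
from datetime import date

# Hypothetical records; field names and values are illustrative only.
records = [
    {"name": "Asha", "age": 14, "updated": date(2024, 5, 1)},
    {"name": "Ravi", "age": None, "updated": date(2024, 4, 20)},
    {"name": "Meera", "age": 230, "updated": date(2019, 1, 5)},
]

REQUIRED_FIELDS = ("name", "age", "updated")

def quality_issues(record, today, max_age_days=365):
    """Return a list of quality problems found in one record."""
    issues = []
    # Completeness: every required field must have a value.
    missing = [f for f in REQUIRED_FIELDS if record.get(f) is None]
    if missing:
        issues.append("incomplete: " + ", ".join(missing))
    # Validity: the value must comply with the stated requirement (assumed range).
    age = record.get("age")
    if age is not None and not 0 <= age <= 120:
        issues.append("invalid age")
    # Timeliness: the record must have been updated recently enough.
    updated = record.get("updated")
    if updated is not None and (today - updated).days > max_age_days:
        issues.append("out of date")
    return issues

today = date(2024, 6, 1)  # fixed date so the example is repeatable
report = {r["name"]: quality_issues(r, today) for r in records}
print(report)
```

Accuracy, relevance and reliability usually need a human judgement or a trusted second source, so they are left out of this sketch.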

Data is broadly of two types:

Structured Data

Structured data is data that has a purposely designed, pre-defined
structure as per some existing data model, such as simple 2D spreadsheet
arrays, complex relational databases or knowledge graphs. Structured data
has well-defined relationships among its elements.
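
A small relational database shows what a pre-defined structure with well-defined relationships can look like. This is a minimal sketch using Python's built-in `sqlite3` module; the `students` and `marks` tables and their contents are hypothetical.

```python
import sqlite3

# An in-memory database: nothing is written to disk.
conn = sqlite3.connect(":memory:")

# The structure (columns and types) is defined before any data is stored.
conn.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE marks ("
    " student_id INTEGER REFERENCES students(id),"
    " subject TEXT NOT NULL,"
    " score INTEGER NOT NULL)"
)

conn.executemany("INSERT INTO students VALUES (?, ?)", [(1, "Asha"), (2, "Ravi")])
conn.executemany(
    "INSERT INTO marks VALUES (?, ?, ?)",
    [(1, "Maths", 88), (1, "Science", 92), (2, "Maths", 75)],
)

# The relationship between the two tables lets us combine them reliably.
rows = conn.execute(
    "SELECT s.name, m.subject, m.score"
    " FROM students s JOIN marks m ON m.student_id = s.id"
    " ORDER BY s.name, m.subject"
).fetchall()
print(rows)
```

Because every row follows the same model, a query can join the tables without inspecting each record first.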

Unstructured Data

Unstructured data is data that is not organised according to any
pre-existing data model. Unstructured data is unprocessed and is often
generated by machine-led systems, for example, social media posts,
surveillance camera footage or satellite imagery. Unstructured data can
have its own internal structure, which may not fit into any well-defined
format. For example, in an AI system for analysing the most popular social
media posts, the data (a social media post) does not have a predefined
structure; it can be text, a video, a link, an image or even some other
undefined structure.
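
The social media example can be sketched in code. The posts below are hypothetical, and the `describe` helper is an assumption made up for illustration; the point is only that each item has to be inspected individually because no common structure exists.

```python
# Hypothetical posts: each one arrives in a different shape.
posts = [
    "Loved the match today!",                         # plain text
    {"video": "clip_001.mp4", "caption": "Goal!"},    # video with a caption
    {"link": "https://example.com/article"},          # shared link
    b"\x89PNG",                                       # raw bytes, e.g. part of an image
]

def describe(post):
    """Guess what kind of content a post holds by inspecting it."""
    if isinstance(post, str):
        return "text"
    if isinstance(post, bytes):
        return "binary (possibly an image)"
    if isinstance(post, dict):
        for key in ("video", "link", "image"):
            if key in post:
                return key
    return "unknown"

kinds = [describe(p) for p in posts]
print(kinds)
```

With structured data this kind of type-by-type inspection is unnecessary, since every record is known to follow the same model.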

Finding reliable data sources

1. Interview

It is one of the most effective methods of data gathering. In this method,
an analyst talks to the users and clients who know about the system, its
functions and flaws.

An interview refers to a one-on-one conversation between an analyst and
the users and clients to find out about the system, its functions,
shortcomings and flaws.

2. Survey

In surveys, the goal of the survey is ascertained first and the
questionnaires are then framed accordingly.

A survey refers to a study of the opinions, responses, etc. of a group of
stakeholders.
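
Once survey responses are collected, they usually need light cleaning before they can be counted. The following is a minimal sketch, assuming a single question ("Which game do you prefer?") with free-text answers; the responses are made up for the example.

```python
from collections import Counter

# Hypothetical answers to one survey question, typed in by respondents.
responses = ["Cricket", "Football", "cricket ", "Chess", "Football", "Cricket"]

# Normalise spacing and capitalisation so the same answer is counted once.
cleaned = [r.strip().title() for r in responses]

tally = Counter(cleaned)
most_common = tally.most_common(1)[0]
print(tally)
print(most_common)
```

Without the cleaning step, "Cricket" and "cricket " would be counted as two different answers.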

3. Observation

Under the observation method, the responsible person observes the team in
a real working environment, gets ideas about the required data and its
form, and subsequently documents the observations.

The observation method refers to human or mechanical watching, noticing
or perceiving of what people actually do or what events take place in a
specific working environment.

4. Application Programming Interface (API)

An API is a specialized technique in which a specific type of data is
collected through a programming interface. For example, through the
programming interface of a social media program, data such as people's
most preferred game, most liked post or most used time may be gathered.

An API (Application Programming Interface) refers to an interface that
works behind a popular software program or game, through which specific
types of data pertaining to how users use that program can be collected.
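
Collecting data through an API typically means requesting a URL and parsing the structured reply. The sketch below uses only Python's standard library. The endpoint `https://api.example.com/v1/posts`, its `user` and `limit` parameters and the shape of the JSON reply are all hypothetical; a real service documents its own.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://api.example.com/v1/posts"  # hypothetical endpoint

def build_url(user_id, limit=10):
    """Build the request URL with its query parameters."""
    return BASE_URL + "?" + urlencode({"user": user_id, "limit": limit})

def most_liked(payload):
    """Parse a JSON reply and return the post with the most likes."""
    posts = json.loads(payload)["posts"]
    return max(posts, key=lambda p: p["likes"])

def fetch_most_liked(user_id):
    """Network call: works only against a real service with this reply shape."""
    with urlopen(build_url(user_id), timeout=10) as response:
        return most_liked(response.read().decode("utf-8"))

# Offline demonstration with a made-up reply in the assumed format.
sample = '{"posts": [{"id": 1, "likes": 12}, {"id": 2, "likes": 40}]}'
top = most_liked(sample)
print(build_url("u42", limit=5))
print(top)
```

Keeping the parsing separate from the network call makes it possible to try the logic on sample data before contacting any service.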

5. Web Scraping

Web scraping, web harvesting, or web data extraction is data scraping
used for extracting data from websites. A web scraper is a specialized
tool designed to carry out web scraping.

Web scraping refers to a data collection technique that uses a tool called
a web scraper to extract data from websites.
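
A very small web scraper can be sketched with Python's built-in `html.parser` module. The page content below is a made-up string standing in for a downloaded web page, and the `HeadlineScraper` class is an assumption written for this example; it collects the text of every `<h2>` heading.

```python
from html.parser import HTMLParser

class HeadlineScraper(HTMLParser):
    """Collect the text inside every <h2> element of a page."""

    def __init__(self):
        super().__init__()
        self.in_h2 = False
        self.headlines = []

    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            self.in_h2 = True
            self.headlines.append("")  # start a new headline

    def handle_endtag(self, tag):
        if tag == "h2":
            self.in_h2 = False

    def handle_data(self, data):
        # Text is kept only while we are inside an <h2> element.
        if self.in_h2:
            self.headlines[-1] += data

# Hypothetical page content; a real scraper would download this first.
page = (
    "<html><body><h2>Rain expected</h2><p>Details...</p>"
    "<h2>Match <b>won</b></h2></body></html>"
)

scraper = HeadlineScraper()
scraper.feed(page)
scraper.close()
headlines = [h.strip() for h in scraper.headlines]
print(headlines)
```

Before scraping a real website, check its terms of use, since not every site permits automated extraction.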

6. Sensors

Sensors or electronic sensors can measure various parameters such as
weather, humidity, body temperature, blood pressure, heartbeat, weight
and many more. For instance, modern medical diagnosis equipment and
wearables like Fitbit and Apple Watch make good use of sensors.

The Internet of Things (IoT) cannot function without sensors.

Sensors are mini devices that can collect data about an environment, a
body or a specific task.
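
Sensor data usually arrives as a stream of timed readings. The sketch below simulates a body-temperature sensor, since no real device is assumed here: `read_temperature` is a hypothetical stand-in for a sensor driver, and the 38.0 °C alert threshold is an assumption for the example.

```python
import random
import statistics

def read_temperature(rng):
    """Stand-in for a real sensor driver: returns a simulated reading in Celsius."""
    return round(rng.uniform(36.0, 38.5), 1)

rng = random.Random(42)  # fixed seed so the simulated run is repeatable

# Take one reading per second for ten seconds and store it with its time.
readings = [{"t": second, "celsius": read_temperature(rng)} for second in range(10)]

average = statistics.mean(r["celsius"] for r in readings)
fever_alerts = [r for r in readings if r["celsius"] >= 38.0]

print(round(average, 2))
print(len(fever_alerts), "reading(s) at or above 38.0")
```

Storing the time alongside each value is what lets later stages study how a measurement changes.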

7. Cameras

Cameras, because of their video recording and image capturing features,
have proven to be good data collection tools in various situations, such
as detecting traffic rule violations and automatically detecting flaws in
the design and appearance of products, places, buildings etc.
