Big Data
TOPIC 4. Artificial Intelligence
4.1. WHAT IS ARTIFICIAL INTELLIGENCE?
TOPIC 1
Introduction to big data
Big data
Topic 1. Introduction to big data
OBJECTIVES
• Acquire basic knowledge about big data and its characteristics to understand what it
represents.
• Learn about examples of technologies and devices derived from big data that help
carry out activities of various kinds as efficiently as possible.
1.1. WHY DO WE NEED BIG DATA?
The term big data refers to massive data, i.e., data sets of such magnitude that they far
exceed the capacity of traditional computer programs and applications, yet must be
processed in a reasonable time to achieve the intended objective.
DID YOU KNOW THAT...?
According to the European Union, 1700 billion bytes per minute are generated today,
equivalent to 360,000 DVDs. Each person generates 6 megabytes of information in a
single day, the same amount as was created in a lifetime by an individual in the 16th
century. In addition, 90% of all available data has been created in the last two years.
It can be said that big data requires the development of tools and mechanisms that are able to
manage and process a large amount of data from many sources, in order to find repetitive
patterns, predictive models or more concrete statistics in them.
The purpose of this processing is therefore to transform all this data into understandable
information that people can interpret and that facilitates decision making in various fields.
Every situation experienced and every action taken generates relevant information that can
be managed and analyzed with big data tools.
Continuous technological advances mean that the sources of data supply are increasing
exponentially, so that they can be of multiple types, for example:
• People's daily actions. The most relevant are those that occur in social networks, such
as sending WhatsApp messages, visiting web pages, viewing advertisements and
reacting to them, etc.
• Machines. A large number of devices now generate and continuously supply vast
amounts of data: cell phones with geolocation mechanisms; temperature, sound and
light sensors; and smart watches that report sleep quality, calories, kilometers traveled
or steps walked, and that display physical measurements such as weight, heart rate,
blood pressure, etc.
• Biometric devices. This group includes fingerprint readers, DNA, retina and iris
scanners, facial recognition cameras, etc., which are integrated into security and defense
systems.
Once this large amount of data from applications, web pages, software, etc. has been
generated, it is stored in the cloud.
By 2020, at least a third of all data generated will pass through the cloud,
a network of servers connected via the Internet.
Then, each organization or big data specialist develops a series of algorithms that are
capable of processing this information, and then cross-checking all the data obtained that are
of interest at any given time.
Afterwards, behavioral patterns are identified and, once they are determined, predictive
models are generated from historical data almost immediately.
These results are acquired by the entities or individuals concerned to facilitate decision
making.
It all started in 2006, when the Netflix Prize, worth one million dollars, was launched for the
person who invented an algorithm to improve recommendations based on users' previous
opinions of the platform's movies and series. Today, 80% of what is viewed on it originates
from recommendations.
Netflix combines various traditional business intelligence tools (such as Teradata and
MicroStrategy) with modern big data technologies such as Hadoop or Hive. The result is an
algorithm that determines which content users are most likely to watch on the popular
streaming platform.
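To make the idea of recommending content from users' previous opinions more concrete, the following is a minimal sketch of user-based collaborative filtering. It is not Netflix's actual algorithm: the ratings, user names, titles and the functions cosine_similarity and recommend are all hypothetical and chosen only for illustration.

```python
from math import sqrt

# Hypothetical ratings (1-5) given by users to titles; all names are made up.
ratings = {
    "ana":   {"Title A": 5, "Title B": 4, "Title C": 1},
    "ben":   {"Title A": 4, "Title B": 5, "Title D": 4},
    "carla": {"Title A": 1, "Title C": 5, "Title E": 5},
}


def cosine_similarity(a, b):
    """Cosine similarity between two users over the titles both have rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[t] * b[t] for t in shared)
    norm_a = sqrt(sum(a[t] ** 2 for t in shared))
    norm_b = sqrt(sum(b[t] ** 2 for t in shared))
    return dot / (norm_a * norm_b)


def recommend(user, ratings, top_n=2):
    """Rank titles the user has not rated by similarity-weighted ratings of others."""
    scores = {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        for title, rating in other_ratings.items():
            if title not in ratings[user]:
                # Opinions of similar users count more than those of dissimilar users.
                scores[title] = scores.get(title, 0.0) + sim * rating
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


print(recommend("ana", ratings))
```

With this toy data, "Title D" ranks first for ana because it was rated highly by ben, whose past ratings are closest to hers. Production systems work with far larger data sets and more elaborate models.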
MicroStrategy is a business intelligence application that performs a wide range of data
analyses. Among other things, it helps answer business questions through data functions,
data visualization, integrated intelligence, reports, etc.
Hadoop is an open source software framework for collecting and storing large amounts of
data and running applications on clusters of commodity hardware.
Hive, finally, is software designed and built on top of Hadoop. It runs queries and analyses
on large amounts of data.
In this way, the basis of Netflix's success is personalization, in which big data plays a very
important role, as it makes it possible to provide a unique experience for each customer
according to their particular tastes.
Image 1. The use of platforms such as Netflix is the result of many years of development and evolution.
There are certain sectors in which big data has had a great impact and influence:
• Health. Thanks to the monitoring of vital signs, it is possible to improve the quality of life
of a large number of people. Through devices such as wristbands, patients' lifestyles can
be analyzed, allowing anomalies in their pulse, sleep patterns, etc. to be detected. These
data help researchers to make predictions about the health of society as a whole.
• Banking and insurance. Banks, savings banks, credit cooperatives and financial
institutions in general, as well as insurance companies, use the monitoring and cross-
referencing of data from their customers' transactions to segment them, predict their
behavior and determine their level of risk. In this way they can offer them products and
services suited to their characteristics and needs, while minimizing the risk of non-
payment.
• Human resources. In the field of human resources, big data and intelligence
can be of great help.
• Marketing and sales. Every time a person browses the Internet, they leave a trace that
can be analyzed with big data tools to achieve different objectives, such as improving the
design of web pages, optimizing the profitability of sales channels, carrying out market
research to help create targeted advertising campaigns through SEM positioning, etc.
• Detection of criminal activity. Thanks to big data, it is possible to detect future criminal
acts or the exchange of criminal messages, as well as to analyze psychological profiles,
something unfeasible with traditional methods.
• Politics. By analyzing the behavior of social network users, it is possible to identify what
concerns the citizens of a particular area, what their interests and needs are, etc. This
information is used by political parties to create tailored campaigns and win votes.
• Education / training. By analyzing data from the education sector, the aim is to obtain
relevant information to improve educational processes and models, as well as the
learning of teachers and students.
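As an illustration of the health example in the list above, detecting an anomaly in a patient's pulse can be sketched with a simple z-score rule: a reading is flagged when it lies more than a chosen number of standard deviations from the mean. The readings, the threshold and the function name find_anomalies are assumptions made for this sketch; real monitoring devices use more robust methods.

```python
from statistics import mean, stdev


def find_anomalies(readings, threshold=2.0):
    """Return (index, value) pairs whose z-score magnitude exceeds the threshold."""
    mu = mean(readings)
    sigma = stdev(readings)
    if sigma == 0:
        return []  # all readings identical, nothing stands out
    return [(i, x) for i, x in enumerate(readings) if abs(x - mu) / sigma > threshold]


# Hypothetical resting heart-rate samples (beats per minute) from a wristband.
pulse = [62, 64, 63, 61, 65, 63, 62, 118, 64, 63]
print(find_anomalies(pulse))
```

Here only the reading of 118 beats per minute is flagged, since it is the only value far from the rest of the series.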
In conclusion, organizations of all types can benefit from the use of big data, as it can identify
problems and needs of potential customers, generate business opportunities, prioritize
processes, reduce costs and increase revenues, as well as increase the quality of life, prevent
diseases or cure them more effectively, improve the user experience of web pages or optimize
the job search process, among many other applications.
WHAT CHALLENGES DOES BIG DATA FACE?
We live in the information age, with all that it entails; citizens are increasingly using digital media
and, consequently, the usefulness and value of big data is growing. Programs, applications
and sources of data are growing exponentially.
In this context, one of the most important challenges facing society is the creation of a
regulatory framework to govern everything that big data entails. In addition, citizens need to
be aware of how the data they generate is used, who it reaches, how and for what purpose.
The impact of big data is not only felt in economic activity, but also in public administrations,
healthcare, etc. The analysis of this continuously generated data is a challenge of enormous
magnitude. In this regard, it is worth noting that one of the most in-demand professions will be
that of big data and business intelligence expert, in which there is still much to be researched
and discovered. But big data is the future of all sectors.
As is well known, an enormous amount of data is generated which, when well analyzed,
makes it possible to know the customer in detail. This enables companies and businesses
to create processes that better satisfy users, meet their needs and tastes, improve the
perception of their services, and obtain greater profits.
One of the many objectives of companies is to really know what users are like with the help
of the buyer persona tool, an exhaustive study of the profile of the ideal buyer. With the data
collected, their behavior patterns can be profiled in order to offer better products and services
to all customers.
Big data poses an interesting scenario. The changes that are currently being experienced at
the technological level are overwhelming and make us foresee that data analysis will go much
further. Companies know that without a digitalization process they are doomed to disappear.
They are also aware that the digital trail of users is the key to success.
What does big data have in store for us? How far can the development of data analysis
go?
Data reception and analysis systems will bring about a technological revolution, as they will be able to
improve all procedures and enable better decisions to be made more quickly.
Another aspect to bear in mind is that the accumulation of data will feed what is called
"machine learning", which will yield much more complex algorithms. Machine learning is
a subfield of artificial intelligence that seeks to develop systems that learn or improve their
performance based on the data they consume.
As can be seen, big data is expanding into structural sectors that go beyond the economic
sector. One example is the healthcare sector, where millions of data are generated every
day. Big data is a tool that makes it possible to improve treatments and increase their
efficiency, and offers the possibility of gaining in-depth knowledge of certain pathologies or
patient behaviors. It is therefore essential to know how to interpret clinical data.
DID YOU KNOW THAT...?
Ingestible sensors are now a reality. Proteus Digital Health has created a model the size
of a grain of sand. This device sends electrical signals to a patch that the patient wears
on the skin or has implanted under it. The sensor monitors various parameters and the
data can be sent to the user's smartphone and/or the physician's computer.
Image 2. Today, the use of big data is essential in all sectors of society.
1.2. CHARACTERISTICS OF BIG DATA
Data analysis is the detailed study of large amounts of information in order to facilitate
decision making and problem solving.
During analysis, these data can be subjected to operations to obtain statistical indicators that
help to understand certain behaviors. This is a characteristic process of "data science", which
takes place after the information has been collected.
This analysis comprises a set of tools that are commonly used in databases, including
graphical representations such as histograms (statistical data grouped into numerical
intervals or according to absolute values), bar charts, pie charts, etc.
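A histogram of the kind mentioned above can be sketched in a few lines: values are grouped into numerical intervals of equal width and the number of values in each interval is counted. The ages and the function name histogram are hypothetical and serve only as an illustration.

```python
from collections import Counter


def histogram(values, bin_width):
    """Count how many values fall into each interval [start, start + bin_width)."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


# Hypothetical customer ages.
ages = [18, 22, 25, 31, 34, 35, 38, 41, 47, 52, 58, 63]

# Print a simple text version of the chart, one row per ten-year interval.
for start, count in histogram(ages, 10).items():
    print(f"{start:>2}-{start + 9:<2} {'#' * count}")
```

Each printed row represents one interval, and the length of the bar shows how many values fall inside it.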
DATA CHARACTERISTICS
The data under analysis have both quantitative and qualitative characteristics:
• Quantitative. This is numerical information, from which statistics can be generated for
decision making. An example of this is the grades of the students in a class in a term,
from which average grades, the percentage of students with higher grades, etc., can be
extracted.
To analyze the data, there are numerous tools derived from statistics, econometrics and,
of course, mathematics.
Statistical measures such as the arithmetic mean, standard deviation, mode, median and
range are used to obtain relevant information about the behavior of a variable.
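The measures just listed can be computed directly with Python's standard statistics module. The sketch below applies them to the example of a class's term grades; the grades themselves are invented for illustration.

```python
from statistics import mean, median, mode, stdev

# Hypothetical term grades (0-10) for a class of ten students.
grades = [5, 6, 6, 7, 7, 7, 8, 8, 9, 10]

summary = {
    "mean": mean(grades),
    "median": median(grades),
    "mode": mode(grades),
    "range": max(grades) - min(grades),
    "stdev": round(stdev(grades), 2),  # sample standard deviation
}

# Percentage of students whose grade is above the class average.
above_average = 100 * sum(g > summary["mean"] for g in grades) / len(grades)

print(summary)
print(f"{above_average:.0f}% of students are above the average")
```

For this class the average grade is 7.3 and four of the ten students (40%) score above it, which is the kind of summary a teacher could use for decision making.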
On the other hand, econometrics provides essential tools such as regression analysis. Along
the same lines, it is also possible to make use of graphs that provide visual information, such
as histograms.
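As a minimal sketch of regression analysis, the code below fits a straight line to a handful of points using ordinary least squares and uses it to make a forecast. The data (advertising spend versus units sold) and the function name fit_line are assumptions made for this example.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums of the centered data.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: advertising spend (thousands) vs. units sold.
spend = [1, 2, 3, 4, 5]
units = [12, 14, 16, 18, 20]

slope, intercept = fit_line(spend, units)
forecast = slope * 6 + intercept  # predicted units for a spend of 6
print(slope, intercept, forecast)
```

Because the invented data lie exactly on a line, the fit recovers a slope of 2 and an intercept of 10; real data are noisier, and the fitted line then describes the trend rather than every point.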
It should be noted that data analysis is not without its limitations, as there are certain factors
that are difficult to quantify precisely. For this reason, terms related to probability are
frequently used in this field.
Data analysis has multiple applications and areas of use, from private companies and
organizations whose objective is to obtain economic benefits to official organizations that
have non-profit purposes.
For example, an organization seeking to combat child malnutrition in Africa will constantly
study and analyze the anemia values of the population in a given age range.
Similarly, a for-profit business can study customer satisfaction data by conducting surveys
among the purchasers of its products over a given period of time in order to make strategic
decisions regarding the company's future.
Today, companies have at their disposal huge databases, accessible by means of software
programs and computer applications to members of the public seeking information.
DATA SCIENCE
Data science is a branch of science that is closely associated with the management
of databases, large digital files from which relevant information is extracted such as ratios
and statistical indicators whose correct assessment and interpretation can help all
types of entities to make decisions that affect their future.
Likewise, data science offers tools that, in addition to providing and interpreting
information, manage to represent the results of this analysis visually.
It follows that data science is interdisciplinary, as it encompasses knowledge and skills
that can be applied to a wide range of fields, including computer science.
The initiator of data science was arguably the American statistician John Wilder Tukey, who
in the 1960s stressed the importance of conducting data analysis rather than testing
statistical models.
However, the term "data science" was not used until 1996, when it appeared for the first time
in the title of a conference at a meeting of members of the International Federation of
Classification Societies (IFCS) in Kobe, Japan: "Data science, classification and related
methods".
Moving forward in time, in 2005 the National Science Board published Long-Lived Digital
Data Collections: Enabling Research and Education in the 21st Century, which describes
data scientists as computer scientists, database and software programmers, vital to the
efficient management and analysis of any digital data collection.
Although enormous progress has been made in this field, it is still under development.
QUOTE
"Too often we forget that genius also depends on the data at hand; that
Archimedes could not have devised Edison's inventions." Ernest
Dimnet.
• Structured. Data that are organized in tables with rows and columns, with defined
categories and fields, such as name, surname, address, e-mail, telephone,
municipality, province, etc.
• Unstructured. Data that do not have a predefined format and, therefore, cannot be
classified directly. An example of this is a freely written text: a natural language
processing application must be used to understand the content and, from this, analyze
and process everything that is useful.
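The difference between the two kinds of data can be sketched as follows. Structured records can be filtered directly by field, whereas free text must first be broken down before anything can be counted or analyzed. The records, the review text and the variable names are hypothetical, and counting words is only a crude stand-in for real natural language processing.

```python
import re
from collections import Counter

# Structured: every record has the same named fields, so it can be queried directly.
customers = [
    {"name": "Ana", "surname": "Lopez", "municipality": "Getafe", "province": "Madrid"},
    {"name": "Ben", "surname": "Ruiz", "municipality": "Reus", "province": "Tarragona"},
]
from_madrid = [c["name"] for c in customers if c["province"] == "Madrid"]

# Unstructured: free text has no fields; it must be processed before analysis.
review = "Great phone, great battery. The camera is not great, but the price is fair."
words = re.findall(r"[a-z]+", review.lower())  # split the text into lowercase words
word_counts = Counter(words)

print(from_madrid)
print(word_counts.most_common(1))
```

Note that the word count alone cannot tell that "not great" is negative, which is why unstructured data calls for dedicated language processing tools.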
IMPORTANT
Considering the above, professionals specialized in data science must
not only have analytical skills, but also the ability to communicate the
content of the information that has been processed.
For all these reasons, data science is closely related to big data, which develops
mechanisms capable of managing, analyzing and processing large amounts of data from
different sources (financial activity transactions, internet searches, geo-location, biometric
information, etc.). The goal is to convert them into useful information that can be interpreted
by people and help in decision making.
• Efficiency. Processes must be carried out without consuming more resources than
necessary. For example, this type of technology helps to determine at what times of the
year it is safer and more efficient to sow a certain type of seed, thus saving a great deal
of water, labor and energy.
• Effectiveness. The result must match the proposed purpose, bearing in mind that it
should be the closest possible solution to perfection. For example, various intelligent
technologies make it possible to determine the best time to eat, sleep or have a baby
based on the personal data of each individual.
• Avoidance of human error. Many processes can be carried out manually, but at an
unbearable cost in time (and therefore greater inefficiency) and with a much greater
likelihood of errors being made in the chain, leading to the repetition of each procedure
until the desired result is obtained. The use of intelligent technology contributes greatly to
avoiding human error and maintaining efficiency and effectiveness.
However, smart technology still needs to evolve away from its dependence on human
action. Clear examples of this are voice assistants or intelligent cars.
IMPORTANT
The terms "automatic" and "intelligent" should not be confused.
It should be noted that smart technology is not a specific branch of technology, but rather a
set of them, which interact and use automation processes, artificial intelligence or big data,
among many others. Some of its derivatives are:
• Smart cities. These are cities that aspire to use resources and information to make the
lives of their citizens as sustainable and simple as possible.
• Smart homes. Home automation is key to achieving energy efficiency and, thus, the
large-scale sustainability of houses, housing developments, towns, etc.
The use of home automation in smart homes helps many families to carry out tasks in the most efficient way.
According to this technology, depending on whether event "A" occurs or not, a response of
type "B" or "C" will be obtained. For example, if the rent is paid, the intelligent lock will accept
the tenant's entry into the property; on the other hand, if the rent has not been paid, the lock
will prevent the tenant from entering.
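The rule described above is a plain conditional: one event is checked and one of two responses follows. A minimal sketch, with the hypothetical function name lock_response, could look like this:

```python
def lock_response(rent_paid: bool) -> str:
    """Event A (rent paid or not) selects response B (grant) or response C (deny)."""
    if rent_paid:
        return "access granted"
    return "access denied"


print(lock_response(True))   # tenant is up to date with the rent
print(lock_response(False))  # rent has not been paid
```

A real smart lock would obtain the payment status from an external system; here it is simply passed in as a parameter.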
As can be seen, the programs and applications of the new intelligent technologies offer a
wide and varied range of options, and there are currently no restrictions on their development
and application.
The following are just a few of the many examples of smart technology that exist today, most
of which are in everyday use, but whose functions we often do not take full advantage of:
• Smartphones, smart watches and smart TVs. These objects make life easier and, on
many occasions, provide a wealth of information that is initially unsolicited, but can
indeed be very useful.
The downside is the provision of personal data. When a smart device provides us with
statistics on our hours of sleep, indicates the series that best suit our tastes or shows us
the advertising that seems most relevant to us, it is not random, but rather this level of
personalization is the product of data collection on our day-to-day activities.
• Chatbots. This term refers to the figure of the virtual assistant, increasingly present on
websites, who offers help and answers visitors' questions without the need for human
intervention.
• Intelligent sprinklers and irrigation pipes. These mechanisms are designed to adapt
to the time of year and the weather, among other aspects, in order to deliver water to the
crops as efficiently as possible, minimizing waste and optimizing the use of this
sometimes scarce resource.
DID YOU KNOW THAT...?
It is estimated that in the next few years the number of connected smart devices in the
world will exceed 50 billion, all able to collect, analyze and share all kinds of data. This
massive increase will affect the development of big data tools.
As the field of digital software and applications evolves, innovation in other sectors will
increase, moving from the R&D phase to the implementation of real products and services of
great use to society.
KEY IDEAS
• Big data requires the development of tools and mechanisms that are capable of
managing and processing a large amount of data from many sources, in order to find
repetitive patterns, predictive models or more concrete statistics in them.
• Continuous technological advances mean that the sources of data supply are increasing
exponentially, so that they can be of multiple types.
• Entities of all kinds can benefit from the use of big data, as it can identify problems and
needs of potential customers, generate business opportunities, prioritize
processes, reduce costs and increase revenues, as well as increase the quality of
life, prevent diseases or cure them more effectively, improve the user experience
of web pages or optimize the job search process, among many other applications.
• Numerous tools derived from statistics, econometrics and mathematics are available
for data analysis.
• Data science is a branch of science that is closely associated with the management of
databases, large digital files from which relevant information is extracted such as ratios
and statistical indicators whose correct assessment and interpretation can help all types
of entities to make decisions that affect their future.
GLOSSARY
— Big data. Massive set of data that greatly exceeds the capacity of traditional software
and applications, and that needs to be processed in a relatively reasonable time to
reach the proposed objective.
— Chatbots. Virtual assistants that offer help and answer the questions of web page
visitors without the need for human intervention.
— Smart cities. Cities that aspire to use resources and information to make life as
sustainable and simple as possible for their citizens.
TOPIC 2
Big data in projects
Big data
Topic 2. Big data in projects
OBJECTIVES
• To enable the student to understand how a work team is structured and the positions
available.
• To help the student develop ideas to meet the needs of the target market.
2.1. STRUCTURE OF THE WORK TEAM
The structure of a big data work team is defined by the planning, management and
organization of the activities that prepare an organization's human resources for big data
tasks.
All of this is geared towards achieving a final goal shared by all the members.
In short, as in the organization of any project, the structure of the human team in a big data
project should be aimed at managing a team of individuals, assigning each of them
specific tasks according to their abilities.
QUOTE
"We have to stop interrupting what people are interested in and be what people are
interested in." (Craig Davis).
The visible head and main person responsible for any project is the director, whose mission
is to create the structure of the work team.
Thus, it is safe to say that project management and project organization are closely related.
Despite the different possibilities when it comes to carrying out a big data project, data
integration initiatives should include, at a minimum, a standard set of roles: project
manager, integration consultant and architect.
• Project manager. This is the person who leads the project and bears the greatest
responsibility for it, from start to finish. The role covers organization, execution and
control. The mission is to achieve the project objective within the stipulated time and
with the required level of quality, following a methodology and managing risks.
A fundamental aspect is the interaction of roles. The project manager will define the
necessary roles and their responsibilities. For example, he will consult and supervise
numerous situations with the architect. When all is well, they will start the implementation
and carry out the different roles.
In order to manage and administer a project in the best possible way, it is necessary to plan
and organize as efficiently as possible.
It may seem simple in principle, but it is a major problem associated with this type of project
in different organizations.
If you do not know what you want to do and how you want to do it, using the resources in the
best possible way, the project will be destined to fail. That is why good organization is of vital
importance.
For all these reasons, the figure of the big data project manager is essential.
It is also essential to build a strong organization and manage the team in order to lead it.
A good organizational plan is an important means of distributing tasks and responsibilities
among the resources in each area.
The plan developed by the project management must be focused on all the tasks that the
company undertakes.
The organization of the project focuses, above all, on the people factor so that the team
works together smoothly, since the main thing is to place each individual in the most
appropriate function.
IMPORTANT
The plan developed by the project management should focus on all tasks
performed by the company.
For all these reasons, it is important to carry out the following phases:
• Keep in mind who you have. The first thing to do is to make an inventory of existing
human resources, analyzing their strengths and weaknesses. It is necessary to know
which functions are best performed by each person.
• Do the same with the tasks. The big data project must be broken down into small
parts. This requires the collaboration of all the individuals in the project. Once this is
achieved, the tasks can be matched to people by drawing up a double-entry table with
both variables.
• Assign each person his or her task. To decide which task is most suitable for each
person, the training and experience of each one must be taken into account, as well as
his or her personal motivations in the project.
• Finally, it is necessary to specify deadlines for the completion of the tasks, some slack
to absorb delays, and adequate control. After each task has been completed, a check
should be made to see if there has been any deviation from the established objectives.
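The double-entry table of people and tasks mentioned in the phases above can be sketched as a nested dictionary of suitability scores, from which one task is assigned to each person. The names, tasks, scores and the greedy rule in assign_tasks are all assumptions made for illustration; a greedy choice is simple but does not always give the best overall assignment.

```python
# Hypothetical suitability scores (0-10): rows are people, columns are tasks.
suitability = {
    "Ana":   {"data ingestion": 9, "modelling": 6, "reporting": 4},
    "Ben":   {"data ingestion": 5, "modelling": 9, "reporting": 6},
    "Carla": {"data ingestion": 4, "modelling": 7, "reporting": 8},
}


def assign_tasks(table):
    """Greedy assignment: repeatedly take the highest remaining person/task score."""
    pairs = sorted(
        ((score, person, task)
         for person, row in table.items()
         for task, score in row.items()),
        reverse=True,
    )
    assignment, busy = {}, set()
    for score, person, task in pairs:
        # Skip pairs whose task is already covered or whose person is already assigned.
        if task not in assignment and person not in busy:
            assignment[task] = person
            busy.add(person)
    return assignment


print(assign_tasks(suitability))
```

In this example each person ends up with the task for which they have the highest score. In practice the scores would come from the inventory of strengths, training and motivations described in the phases.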
WHAT IS AN ORGANIZATION CHART?
An organization chart is a diagram that makes it possible to evaluate and study the structure
of the project and, if necessary, to identify who can carry out the big data project on the best
terms.
The project manager is obliged to know each of the project employees, as well as the tasks
and functions assigned to them, in order to organize the work in the best way.
• Vertical.
• Hierarchical.
• Linear.
These types of organization chart are used when authority is well defined and decision
making is centralized.
In addition, there are horizontal and circular organization charts, which are used in projects
with more dispersed authority.
Types of organization charts: vertical, hierarchical, linear, horizontal, circular, analytical
and matrix.
Let us assume the formalization and implementation of a big data project in a company
such as, for example, a computer engineering consulting firm.
The first step for the project manager is to meet with the project partners. Next, the objectives
to be achieved and the work needed to achieve them are set.
This meeting should be recorded in writing in as much detail as possible. A common
technique is "brainstorming", in which all ideas are put forward so that the most interesting
ones can be chosen afterwards.
Image 1. The project manager assigns tasks to each team for the proper development of the work.
Next, it is necessary to know who is available, in addition to the project's promoters. That is
to say, you must determine which jobs will be performed by which members and which
employees will be hired.
The type of contract to be entered into with the employees must also be determined, as well
as their salaries.
The next step will be to distribute the work. Software tools make it possible to design the
schedule and the final result. PERT diagrams can be a very useful tool for this.
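PERT scheduling rests on two ideas: each task's duration is estimated as a weighted average of an optimistic, a most likely and a pessimistic value, (O + 4M + P) / 6, and a task cannot start until the tasks it depends on have finished. The sketch below applies both to a made-up plan; the task names, estimates and function names are hypothetical.

```python
# Hypothetical tasks: (optimistic, most likely, pessimistic) durations in days,
# plus the tasks that must finish first.
tasks = {
    "collect data":  {"estimates": (2, 4, 6), "depends_on": []},
    "clean data":    {"estimates": (1, 2, 9), "depends_on": ["collect data"]},
    "build model":   {"estimates": (3, 5, 7), "depends_on": ["clean data"]},
    "design report": {"estimates": (1, 2, 3), "depends_on": ["collect data"]},
    "present":       {"estimates": (1, 1, 1), "depends_on": ["build model", "design report"]},
}


def expected_duration(optimistic, likely, pessimistic):
    """Classic PERT weighted average of the three estimates."""
    return (optimistic + 4 * likely + pessimistic) / 6


def earliest_finish(name, tasks, cache=None):
    """Earliest finish of a task: its own duration plus its slowest prerequisite."""
    if cache is None:
        cache = {}
    if name not in cache:
        start = max(
            (earliest_finish(dep, tasks, cache) for dep in tasks[name]["depends_on"]),
            default=0.0,
        )
        cache[name] = start + expected_duration(*tasks[name]["estimates"])
    return cache[name]


# The project ends when the last task on the longest chain of dependencies ends.
project_length = max(earliest_finish(name, tasks) for name in tasks)
print(project_length)
```

With these invented estimates the longest chain is collect data, clean data, build model and present, giving an expected project length of 13 days. Tasks outside that chain, such as the report design, have slack that can absorb delays.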
The resulting plan puts in writing everything detailed above. It also details how the project
will be controlled through feedback and how possible deviations will be corrected, and it
includes a contingency plan for unforeseen events that may arise in the project.
2.1.1. DIGITAL PRODUCT DEVELOPMENT
First, we must answer the following question: what are digital products?
Digital products, as opposed to traditional physical products, are all those that have a digital
format and are distributed through online sales.
For this reason, to start developing a digital product, such as an app, an image bank, a
website, e-books or video classes, it is not necessary to make large investments or create a
large-scale logistics system.
IMPORTANT
Due to the ease with which these products can be developed, digital entrepreneurship has
advanced considerably worldwide. Many digital products have become real successes,
such as WhatsApp, Uber or Facebook, among others.
Being able to manage and develop a digital product requires a minimum of digital skills.
EXAMPLE
• Ebook. The ebook is one of the digital products par excellence and among the most
popular today.
The ebook is characterized by its exclusivity, since customers can access the content.
In order to obtain an ebook, the customer must fill out a form and will then receive the
content by email.
• Virtual classes. This type of digital product or infoproduct is one of the most traditional
on the Internet.
Thanks to its applications for virtual interaction, this infoproduct has gained ground over
time, becoming an almost indispensable tool for delivering online courses.
Virtual classrooms are particularly suited to teaching content and training that require
practical application.
The use of explanatory videos improves student learning by making what is taught more
concrete.
YouTube is currently one of the most sought-after resources for videos and virtual
classrooms.
• Have a good idea. You don't have to invent the light bulb or the wheel, but you have to
come up with an idea that meets people's needs or adds value.
Sometimes no idea seems ideal at first, so it is necessary to keep thinking and stay alert to situations that your ideas could solve and to the needs they could meet.
• Study the market. Once we have generated those interesting ideas, or the one important idea, it is time to carry out a market study. Market research is fundamental to know our competitors and the tastes of our customers.
In the digital world, a market study can be carried out through online research, which shows what companies and competitors exist and whether our idea is already present in the market.
At the same time, other similar applications can be tested in order to identify improvements to be made and practices that can be incorporated into our key idea for the development of our digital product.
• Minimum viable product. The next step will be to develop a minimum viable product
(MVP) so that it can be tested by a group of potential customers and to study the
possible improvements to be made.
Example
– Podcast. This infoproduct delivers its content as audio and is typically distributed through streaming services, such as Spotify.
Podcasts offer the audio of programs on a wide variety of topics, which the customer can replay at any time.
Because they are offered on various platforms, such as radio stations, the user can access podcasts at any time and listen to his or her favorite news, sports or science program.
The customer can learn and be informed 24 hours a day, 365 days a year.
From the creators' point of view, producing a podcast requires scripts, computer equipment, microphones and a quiet place to record without external interruptions.
• Skills. As in any other project or business, making a product and developing it requires
certain skills in technology, user experience (UX), digital business projects, as well as
agile programming.
In addition to the development of the digital product, the marketing plan must be defined
(with a communication strategy, positioning the digital product, as well as a plan for
monetization).
• Networking. One of the most important things you can do to acquire more knowledge is
to attend talks, events and meetings with professionals in the digital sector, because, in
addition to being able to learn from them and be updated, in these activities you can find
collaborators and even generate new ideas with which to create new digital products.
• How to monetize it. As with any other type of product, establishing a market price requires investigating the prices set by competitors, managing production costs, considering the circumstances and needs of the target customers, calculating the value the public places on the product, and forecasting future income and expenses.
At this point, studies can be carried out to determine the best strategy. For example, the digital product can first be launched free of charge or at a fairly low price or, on the contrary, at a high price.
If the product is offered at no cost, it can be monetized through advertising, as is done with many mobile applications that are downloaded for free.
Example
- Webinar. This digital product, based on offering a lecture, course or seminar, is essential
for dissemination and attracting future customers.
Thus, the webinar is a great format to sell courses and trainings, in addition to
other digital products.
• Where can it be sold? As mentioned above, digital products are sold through the Internet, so you must have a website and also run and maintain social network accounts that advertise the product.
In the case of a mobile application, it must be available in the app stores, such as Google
Play or App Store.
• How can it be promoted? Promoting a digital product well requires some knowledge of digital marketing, since it will be necessary to carry out email marketing actions, establish advertising strategies on social networks and work on the search positioning of the website or blog.
IMPORTANT
• How to perform the analysis. Once the digital product is launched, there is still a lot of
work to be done.
This phase should focus on understanding how the sales cycle works, in addition to
knowing the sales pattern that customers take into account and what objections they
have when buying the product.
Example
– Ezines.
- In PDF.
- Private content.
This product is published periodically, like a paper publication, and the customer can purchase each edition online as it appears.
The success of this type of product is leading paper magazines to offer online versions.
It is therefore very important to take into account the indicators that provide the information needed to evaluate, study and make the best possible decisions for the project.
• How can you stay up to date? It must be taken into account that both technologies and
consumer habits are evolving by leaps and bounds.
Therefore, it is essential to be aware and updated with new trends and adapt to them.
Those who have created a successful digital product must also remain attentive, since
sales strategies and marketing methods are changing and it is necessary to know how to
adapt to new situations.
• Factors for success. Developing a successful digital product is not impossible, but it takes a lot of knowledge and dedication: attend events and lectures, surround yourself with the best and, of course, always stay up to date.
QUOTE
"Always focus on growing your database: new subscribers are more reactive and have healthier open rates and ROI values." (Karl Murray)
The following is a brief overview of how to create a digital product with some degree of
success:
• Think about the idea on which the digital product will be developed.
The purpose of digital products is to make information available in an accessible way, so that Internet users can acquire knowledge about a particular subject through online workshops, free courses, etc.
This type of product is also known as an infoproduct and usually meets many needs of potential customers on the Internet, such as acquiring knowledge through free courses in virtual media.
Thanks to all this, it has become one of the main foundations for anyone who wants to start a digital business and earn money through the Internet.
Infoproducts are one of the best strategies in commerce, as well as one of the biggest new opportunities in the online world today.
However, as this sector is still relatively new, many doubts surround it.
The digital product comes in many formats. These products have the mission of providing information in a simple and clear way, so that customers on the network can obtain knowledge on a specific topic through courses, webinars, etc. In other words, their mission is to solve specific needs.
Therefore, the most important mission of any digital product or infoproduct will be to educate, to
facilitate the life of customers and, above all, to solve problems in the clearest and simplest way.
WHAT ARE THE BENEFITS OF INFOPRODUCTS?
When studying the pros and cons of a digital project, many factors must be taken into account.
• Minimum investment cost. The person who enters into this world should invest mainly
in the disclosure and promotion of their product.
• Promotes social inclusion. The fact of carrying out a business with these characteristics
promotes the accessibility of this type of products to a wide range of the population.
• Great solution for busy people. When a person is very busy in their day-to-day life,
being able to access training quickly provides great solutions.
• It is a scalable business. Digital products can be offered at home and abroad, since the worldwide availability of the Internet makes it possible to sell these products in all countries.
So getting started and building your image with digital products can be a great decision.
But as with any business, starting out requires perseverance, a lot of patience and a
great strategy.
As a last point to highlight, within a long list of benefits brought by the digital sector, it is
important to emphasize the significance of digital business.
It should be clarified that digital business is not the same as digital product.
The digital business is that way of working in which many companies are starting to offer
completely digital solutions, such as the online management of financial institutions.
In this way, the difference between the two concepts can be better understood. A digital
product can lead to the creation of a digital business, such as Uber, which started as an app-
based transportation service and now has an online service base.
Similarly, starting a digital business can lead to the creation of a digital product.
Consequently, when a company works with digital solutions, as it evolves it can add new digital products to its catalog, taking into account new needs.
IMPORTANT
Digital businesses can be created from a digital product. Amazon is a current and very successful organization that, in addition to marketing physical products through an online store, is an e-commerce model.
KEY IDEAS
• The organization of any project and, therefore, the structure of the human team in a big data project, should be aimed at managing a team of individuals, assigning each of them specific tasks according to their qualities.
• The organization of the project aims to place each individual in the most appropriate role.
• After each task has been performed, check for any deviations
in relation to the established objectives.
• To start developing a digital product, such as apps, image banks, websites, ringtones, e-books or video classes, it is not necessary to make very expensive investments or create a huge logistics system.
• You don't have to invent the light bulb or the wheel, but you have to come up with an
idea that meets people's needs or adds value.
GLOSSARY
— Structure of a big data work team. It is the planning, management and organization
generated by any activity that makes it possible to train the human resources of an
organization in the described big data tasks.
— Digital business. This is the way of working in which many companies are starting to
offer fully digital solutions.
— Organizational chart. It is a chart that allows to assess and study the structure of the
project and, thus, to know who is who within the big data project in order to carry it out on the best terms.
— Digital products. They are all those that present a digital format, which means that all
products of this type are distributed through online sales.
TOPIC 3
Agile methodologies
Big data
Topic 3. Agile Methodologies
OBJECTIVES
• Understand the advantages and peculiarities of each of the existing project development
methodologies.
• Have an overview of digital project management tools and the advantages of each one.
3.1. DEVELOPMENT METHODOLOGIES
Development methodologies are defined as a set of guidelines to be followed to carry out a software development project, using different tools, standards or phases.
The project management methodologies available differ in many ways. Knowing the distinctions between them, and which methodology should be used in each project, can be quite complicated.
Different methodologies are shown below, explaining which of them is the best option for different types of projects.
• Agile. An important methodology for carrying out projects in an iterative way for any kind of work.
The Agile methodology is an approach to decision making in software projects, based on iterative and incremental development, where specifications and solutions change over time depending on the needs of the project.
EXAMPLE
Zara makes products that are difficult to forecast. They have a relatively short life cycle and it is not known whether they will be liked or not. The company usually makes conservative forecasts and, if in the end there is a shortage of products, it sets in motion its express production and shipping process, which is kept permanently ready, and within 15 days at most it has the stock ready.
• Scrum. Scrum is a work method in which a set of good practices is applied regularly to work as a team and thus generate better results in projects.
– Priority is given to the quality of work based on the tacit knowledge of individuals in self-organized teams.
An example of the Scrum methodology is when groups of employees have to carry out complex jobs.
One advantage of this method is that it is a participatory and collaborative methodology, both among group members and with the client.
To work with this methodology, the team must be competent and have the necessary skills, knowledge and experience. The tasks must be clear and the customer must be able to collaborate with the team. From this point on, the team will hold a series of short daily meetings to review the evolution of the project and allow the client to check the result of the work.
• Kanban. It helps increase the speed and quality of project delivery, creating increased
visibility.
• Scrumban. The Scrumban development methodology limits the work in progress and condenses it into simpler steps, so that the project is completed in very specific phases.
• Waterfall. It plans the project as a whole in detail and then carries out the phases in sequence.
• PMI PMBOK. This methodology establishes general guidelines for project management.
Each methodology is best suited to the requirements and specifications of the project to be
carried out. Hence the importance of knowing the characteristics of each methodology to
work in the best conditions.
Certain project management methodologies specify only principles, as is the case with Agile.
However, others define a methodology framework, such as themes, processes, etc. to
establish the project, as is the case with Prince2.
Others are defined by a large list of standards with some process, as is the case with PMI's
PMBOK, and some are fairly lightweight and just lay out the process, such as Scrum.
But instead of discussing and specifying what is a project management methodology and
what is not, it can be established that project management methodologies are only
frameworks of methods to carry out the project in question.
But before continuing, it should be pointed out that, despite the existence of many different methodologies, no single methodology is right for all types of projects.
Ultimately, the best methodology is the one that adapts to and makes sense for each project, group and end user.
• Individuals and interactions. The members of the group must work together to achieve
the success of the project, encouraging interactions among all of them and working as a
team.
• User collaboration. Beyond the establishment of contracts, regular meetings between the work team and the client must take place to facilitate collaboration.
• Responding to change. It goes beyond carrying out a plan. During the regular meetings, the client will specify aspects to be improved or modified, so these meetings are a good opportunity to respond to the required changes.
Rather than a methodology that is applied to a given project, Agile is better understood as a mindset and a set of principles to be adopted.
When people discuss the agile project management methodology, the most important thing to be clear about is that it is an adaptive design and construction process.
Agile projects follow a set of guidelines that are managed, established and adapted to the needs of each project, rather than a preset method.
Agile methodology helps project managers to be adaptable through incremental and iterative work processes.
EXAMPLE The fashion brand Finery London established that some people spend two and a half years deciding what they are going to wear and eight years shopping, and it decided to change that.
And the results? It gained 100,000 subscribers in its first year of operation.
Just as a good builder checks the work against the finished project, an agile project management process needs groups of professionals to plan thoroughly and, as progress is made, to evaluate the project regularly.
• Kanban methodology. This project management methodology improves the speed and quality of delivery by giving better visibility of the work being done and by putting limits on multitasking.
The Kanban methodology is a form of project management based on Lean principles and a strict set-up to increase efficiency.
It is similar to Scrum in many points, since it tries to carry out projects quickly with a self-managed team in which all members collaborate with each other.
Unlike the Scrum methodology, Kanban's way of working is more flexible and simpler, since it is less prescriptive.
What is most commonly done with Kanban is to visualize the workflow, measure delivery times, make process rules explicit and see what can really be improved.
Kanban is usually used in operational environments where priorities can change very often. It focuses on knowing what the lead time is and how much it is delayed.
Project managers work with sticky notes on a Kanban board, or with an online tool such as Trello, to display the workflow in columns such as "to do", "in progress" and "done".
In this way, you can see at a glance what needs to be done and limit the work in progress (WIP) so that the work flows better, while measuring and optimizing the average time to complete the items.
For the agency sector it can be an important option, since it is more adaptive to change. If Scrum seems too strict a methodology for project management, Kanban would be the better choice.
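The two Kanban mechanics described above, limiting the work in progress and measuring the average time to complete items, can be sketched in a few lines. This is only an illustrative model, not the behavior of Trello or any other tool: the class names, the column names and the rule of rejecting a card once the limit is reached are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Card:
    title: str
    created: datetime
    finished: Optional[datetime] = None


class KanbanBoard:
    """Hypothetical three-column board with a WIP limit on 'in progress'."""

    def __init__(self, wip_limit: int):
        self.wip_limit = wip_limit
        self.columns = {"to do": [], "in progress": [], "done": []}

    def add(self, title: str, now: datetime) -> None:
        self.columns["to do"].append(Card(title, now))

    def start(self, title: str) -> None:
        # Refuse new work while the WIP limit is reached.
        if len(self.columns["in progress"]) >= self.wip_limit:
            raise RuntimeError("WIP limit reached: finish something first")
        self.columns["in progress"].append(self._take("to do", title))

    def finish(self, title: str, now: datetime) -> None:
        card = self._take("in progress", title)
        card.finished = now
        self.columns["done"].append(card)

    def _take(self, column: str, title: str) -> Card:
        for card in self.columns[column]:
            if card.title == title:
                self.columns[column].remove(card)
                return card
        raise KeyError(f"{title!r} is not in column {column!r}")

    def average_lead_time(self) -> Optional[timedelta]:
        """Average time from card creation to completion, or None if nothing is done."""
        done = self.columns["done"]
        if not done:
            return None
        total = sum((c.finished - c.created for c in done), timedelta())
        return total / len(done)
```

With a limit of one, a second call to `start` fails until the first card is finished, which is exactly the pressure that pushes a team to complete work before taking on more.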
• Scrumban methodology. It takes the adaptability of Kanban and part of the Scrum framework to form a new project management methodology.
Instead of managing projects in iterations with strict timelines and rules, Scrumban uses a project-based scheduling rule to work through the backlog. Tasks are taken on and completed by group members, as in Kanban.
As a result, work in progress is limited and members focus on priority work. However, it is not the same as Kanban. Scrumban keeps a touch of the daily Scrum routine, with reviews to improve the work in progress. Additionally, without the constraint of iterations, work is carried out on an as-needed basis rather than around an iteration.
It should be noted that Scrumban adds adaptability to Scrum, removing fixed iterations and focusing on planning. In addition, it inserts a structure with meetings to help optimize the process.
• Lean methodology. Lean methodology eliminates what is not necessary in order to deliver what is really necessary. It is a form of project management whose main objective is efficiency. It is closely related to Agile and tries to "do more with less".
The first thing it does is to be clear about what brings value. It then enhances it through continuous improvement, maximizing the flow of value and leaving aside what is worthless.
Instead of establishing processes and guidelines, this methodology highlights what can be done using fewer resources. It does this through the 3Ms: Muda, Mura and Muri.
– Muda. Seeks to remove unneeded items or items that do not add value to the user. In the virtual world, these can be eliminated through scheduled reviews.
– Mura. Seeks to remove variations, and with them the overload caused by variations in management.
– Muri. Seeks to eliminate overload. Teams work best at around 60-70% of their capacity, which means limiting the number of projects taken on.
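The 60-70% guideline for Muri can be checked with simple arithmetic. The sketch below is a hypothetical helper, not part of any Lean standard: it divides the hours assigned to a team by the hours available and reports whether the load falls inside the target band, with the band limits taken from the figures in the text.

```python
def utilization(assigned_hours: float, available_hours: float) -> float:
    """Share of the available capacity that is already committed."""
    if available_hours <= 0:
        raise ValueError("available_hours must be positive")
    return assigned_hours / available_hours


def muri_check(assigned_hours: float, available_hours: float,
               low: float = 0.60, high: float = 0.70) -> str:
    """Classify the load against the 60-70% band mentioned in the text."""
    load = utilization(assigned_hours, available_hours)
    if load > high:
        return "overloaded"
    if load < low:
        return "spare capacity"
    return "within target"


# Example: a 40-hour week with 36 hours already assigned.
print(muri_check(36, 40))
```

A result of "overloaded" suggests postponing or dropping projects rather than asking the team to absorb the extra work.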
Lean methodology modifies the way work is done to focus on value delivery. It attempts to
make a change in the approach to optimizing all the elements used: technologies,
departments, etc.
Lean can change the mindset when performing project delivery reviews.
• Extreme Programming (XP). It is quite similar to the Scrum methodology in that it is characterized by its simplicity. Where it differs is in defining prescriptive guidelines or standards.
Some of these rules are the same as in Scrum, but others, related to design practices, testing, etc., are different: user stories must be introduced, and the project must evolve through testing, pair programming, etc.
• Waterfall methodology. Once the proposal is approved, nothing can be improvised. Any changes that become necessary must be requested and approved before they are made.
The work is carried out according to established standards, including design, testing, implementation and maintenance. With the Waterfall methodology there is little room for improvisation.
By the testing phase, it is very difficult to modify what has already been defined. It should be emphasized that there is no obligation to show progress to the client. In other words, the project is completed in its entirety and the client reviews it afterwards.
The Waterfall methodology is viewed with some suspicion within the project management world, as it has a traditional and somewhat old-fashioned approach. However, if the rules are established, the guidelines are clear and the technology is well understood, Waterfall can be quite useful.
Waterfall achieves better results when the objectives are clearly defined.
[Image 3. The Prince2 methodology prioritizes knowing the real need of the project. The diagram shows the Prince2 principles, processes and themes: Business Case, Organization, Quality, Plan, Risks, Change and Progress.]
Prince2, created in 1996 in the United Kingdom to carry out information technology (IT) projects, is a methodology that divides the project into multiple stages. It defines inputs and outputs for each stage so that everything is well established.
It focuses on knowing the real need for the project, who the client requesting it is, and what the project will cost.
Prince2 is characterized by eight high-level cycles and enables great control of
resources, in addition to having a high percentage of risk minimization.
• PMI PMBOK. It cannot be said that it is a methodology in itself, but rather a way of working through five phases to make the project a reality: initiate, plan, execute, control and close.
This approach is more theoretical and, due to its rigidity, does not adapt to all types of projects and clients. Nevertheless, its standards can be used to establish a common language and the best guidelines for setting up a project.
Compared with Prince2, the two may seem complementary, but they are better understood as two different ways of working in cascade.
Establishing a methodology for project management is one of the most important decisions,
since this decision establishes the method of working.
The different project management methodologies generate guidelines for carrying out a
project. When deciding on the methodology to work with, the nature of the project, the client,
the available resources, the deadlines, etc. must be very clear.
There is no ideal project methodology that will achieve success for all types of business
organizations, as each has its strengths and weaknesses. The management methodology
chosen must culminate in a project that meets the client's needs. The best project
management methodology is one that is clearly and concisely adapted to the project, as well
as being liked and trusted by the client.
Scrum is a project management methodology that focuses on standards and guidelines to improve project delivery.
In software development, Scrum is one of the best-known methodologies and follows the principles of Agile.
The goal of Scrum is to improve communication, speed and teamwork in the project.
The Scrum methodology is defined as a simple set of roles and guidelines for delivering valuable and consistent functionality efficiently.
In addition, this methodology tries to develop a self-managed team that defines its own key points. In this way, responsibilities are defined and an acceptable level of pressure is reached in order to produce quality work, correctly and on time.
DID YOU KNOW THAT...? Scrum is a way of working and collaborating as a team, in which the result is produced incrementally.
To achieve this, relatively short work periods are planned, in which the same rules are always followed.
The Scrum methodology relies on a small team, with a maximum of nine individuals, working on the items in a backlog. The backlog reflects what the user needs and has been defined beforehand by the product owner.
The project is divided into iterations (sprints), each of which takes two to four weeks to complete. During this time, daily scrums (meetings) are held in which team members report on progress and on problems as they arise. Each iteration repeats the same process in pursuit of a goal, and its result is used as the starting point for the next iteration.
At the end of each iteration, the work performed is reviewed at a demonstration meeting to establish, together with the product owner, whether it meets the definition of done.
Scrum is facilitated and supervised by a "facilitator" (Scrum master), who monitors the requirements and the reviews carried out, which helps the team perform its work better. The Scrum master also leads a retrospective to examine how the work has developed, so that it is continuously improved and optimized.
[Figure: Phases of the Scrum process: Home, Planning, Implementation, Review and retrospective, Launch.]
• Project planning.
• Control costs.
It is a cloud ERP that automates business areas thanks to its many functionalities, such as
project management.
The project management part is designed to increase the productivity of the organization. It has both paid and free versions.
It helps teams plan collaboratively and track projects down to each team member and each activity performed.
Work teams build projects from templates, without having to program, on a platform that adapts automatically and connects the work teams digitally.
Its very visual interface makes it simple and easy to use.
It supports projects from the beginning, following up on them and organizing the final results.
In addition, reports can be generated to determine the final outcome of the project.
It is one of the most popular software tools, and helps organize the work so that each team member knows what to do, how to do it and when to do it.
KEY IDEAS
• The development of project methodologies is continually evolving, with Agile, Kanban, Scrum, Waterfall, Prince2, etc. standing out.
• The goal of Scrum is to innovate and develop communication, speed and teamwork.
• The Kanban methodology is a form of project management based on Lean principles and
a rigid set-up to increase efficiency.
• Waterfall, also called "software development lifecycle", is a very simple way of working
that focuses on strong planning and getting the job done right.
• The different project management methodologies generate guidelines to carry out the project.
GLOSSARY
— Agile. An important methodology for carrying out projects in an iterative way for any kind of work.
— Kanban. It helps to increase the speed and quality of project delivery, creating increased visibility.
— Waterfall. It plans the project as a whole in detail and then carries out the phases.
TOPIC 4
Artificial intelligence
Big data
Topic 4. Artificial Intelligence
OBJECTIVES
With the study of this lesson, the student will be able to:
• Understand what Machine Learning and Deep Learning are, including their differences, limitations and advantages.
4.1. WHAT IS ARTIFICIAL INTELLIGENCE?
Artificial intelligence is a science built on a series of systems, algorithms and processes designed to simulate human intelligence in order to carry out tasks and skills normally performed by people.
• Machine learning. This type of artificial intelligence is used when a large amount of data has to be studied and analyzed. Algorithms identify patterns in the data in order to anticipate or prevent behaviors, which makes it possible to handle thousands of data points.
• Deep learning. This type of artificial intelligence has the mission to simulate people's
abilities, such as being able to identify images.
The use of artificial intelligence has among its missions to simulate human behavior.
Artificial intelligence (AI) brings together a very large number of computer applications, which use a series of patterns to make inferences.
For example, in commercial applications, artificial intelligence functions are coupled with a
multitude of systems to generate benefits for companies, such as controlling and monitoring
their inventory, controlling the manufacturing process or keeping an updated database.
• 1921. In his play R.U.R., Karel Čapek introduces the term "robot", which comes from the Slavic word robota, "hard work".
• 1941. Isaac Asimov defines the laws of robotics in his short story Vicious Circle.
• 1956. John McCarthy coins the term "artificial intelligence" at the Darmouth conference.
• 1966. Joseph Weizenbaum of MIT develops ELIZA, a program that incorporates human
natural language processing into computers so that it communicates with our language
rather than with programming code.
• 1969. Perceptrons, a work by Marvin Minsky and Seymour Papert considered fundamental for
the analysis of artificial neural networks, appears.
• 1996. The world chess champion Garry Kasparov loses a game to Deep Blue, an IBM
supercomputer, which goes on to defeat him in a full match in 1997.
• 2005. Raymond Kurzweil, using Moore's law, predicts that machines would reach the
level of human intelligence by 2029 and surpass it by a trillion times by 2045.
• 2014. The chatbot Eugene Goostman makes 30 of the 150 judges of the Turing test it was
subjected to believe they are talking to a 13-year-old Ukrainian boy.
QUOTE
Researchers want to design computers that can process human languages and reasoning.
When computers are able to process languages such as French or Spanish, people can give
directions and ask questions without having to know computer languages.
With this perspective, computers can learn from experience and apply it to solve
problems.
Researchers still have to continue to investigate and develop the various computer systems,
but the invention of "fuzzy logic" may be a step in that direction.
Computers operate on a "yes" or "no" basis, so they cannot reason about anything in
between. The most modern computers, capable of solving millions of calculations per
second, cannot know what a "maybe" is.
This simple limitation raised many questions for scientists for years, until Dr. Lotfi A.
Zadeh proposed an approach called "fuzzy logic".
Fuzzy logic focuses on giving the computer "fuzzy sets" with concrete and relative
information. For example, in a fuzzy set for industrial machinery, a temperature of about
1500 degrees can have a "membership" (relative value) of 0.95, while a temperature of
about 800 degrees can have a membership of 0.50.
A computer application can generate instructions such as "the higher the temperature, the
lower the pressure". In this way, scientists can make computers calculate with words, not
numbers.
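The idea of a membership value can be sketched in a few lines of Python. This is only an illustration: the two points (800, 0.50) and (1500, 0.95) come from the example above, while the end points of the curve, the straight-line interpolation between points and the names HOT_POINTS and membership are assumptions made for the sketch.

```python
# Assumed breakpoints (temperature, membership) for a hypothetical fuzzy set
# "hot" for industrial machinery. Only (800, 0.50) and (1500, 0.95) come from
# the text; the two end points are illustrative.
HOT_POINTS = [(0.0, 0.0), (800.0, 0.50), (1500.0, 0.95), (1600.0, 1.0)]


def membership(temperature, points=HOT_POINTS):
    """Return the degree (0.0 to 1.0) to which a temperature belongs to the set."""
    if temperature <= points[0][0]:
        return points[0][1]
    if temperature >= points[-1][0]:
        return points[-1][1]
    # Interpolate linearly between the two neighbouring breakpoints.
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= temperature <= x1:
            return y0 + (y1 - y0) * (temperature - x0) / (x1 - x0)


for t in (800, 1150, 1500):
    print(t, round(membership(t), 3))
```

Instead of answering "hot" or "not hot", the function returns a degree between 0 and 1, which is the "maybe" that a plain yes/no test cannot express.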
EXPERT SYSTEMS
Among the very many artificial intelligence programs and applications available, there is
one called the "expert system".
An expert system is software that applies the core knowledge of a specialist in a given
sector (law, medicine, mathematics, etc.) to solve complex situations without human
intervention.
To build them, researchers collaborate with specialists to develop systems and
guidelines that specify the information and decision rules (heuristics) needed to solve problems.
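As a minimal sketch of how such decision rules can be applied by software, the following Python example chains simple "if these facts hold, conclude this" rules. The rules, the fact names and the function infer are hypothetical and are not taken from any real expert system.

```python
# Hypothetical decision rules, as a specialist might dictate them:
# (set of conditions that must all hold, conclusion to add).
RULES = [
    ({"engine_overheats", "coolant_low"}, "coolant_leak"),
    ({"coolant_leak"}, "inspect_hoses"),
    ({"engine_overheats", "fan_stopped"}, "replace_fan_belt"),
]


def infer(facts, rules=RULES):
    """Forward chaining: keep applying rules until no new fact appears."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


print(sorted(infer({"engine_overheats", "coolant_low"})))
```

The knowledge lives in the RULES list rather than in the program logic, so a specialist's rules can be added or corrected without rewriting the inference code.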
Expert systems acquire the knowledge and experience of human beings, which has many
advantages. For example, they make people's real-life experience available in any sector,
since artificial intelligence is being developed in every field.
But expert systems do not have the typical problems that people have, such as illness or
fatigue, and, if well built, are less prone to inconsistencies and errors. All this makes
them very attractive to large enterprise organizations.
Businesses also use such expert systems for business analysis. For example, General
Electric developed a system called Delta that helped maintenance employees detect and
repair locomotive faults.
When an expert system is installed effectively, there are huge savings in human labor hours
and costs caused by worker errors.
It is estimated that large corporate organizations achieve savings in the region of tens of
millions of euros per year.
A medium-sized system, with around 300 decision rules, would cost
250,000 to 500,000 euros.
EXPERIMENTAL GAMES
To demonstrate that a machine could think for itself, without human intervention, researchers
in the 1960s generated computers capable of playing chess.
In the 1960s, computers were created to simulate human behavior in order to play chess.
As the time needed to evaluate the possible alternatives fell, computers were able to play
at the same level as chess masters. To simulate the thought process, computers began to
process large amounts of information about many moves.
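The text does not say which method these early programs used. One classic technique for looking ahead through many moves is minimax search, sketched below on a tiny made-up game tree; the tree and its scores are purely illustrative and have nothing to do with real chess positions.

```python
def minimax(node, maximizing=True):
    """Return the best score reachable from `node` if both sides play well.

    A node is either a number (the score of a final position) or a list of
    child nodes (the positions reachable in one move).
    """
    if isinstance(node, (int, float)):
        return node
    scores = [minimax(child, not maximizing) for child in node]
    return max(scores) if maximizing else min(scores)


# Hypothetical two-move game: we choose a branch, then the opponent replies.
game_tree = [[3, 5], [2, 9], [0, 7]]
print(minimax(game_tree))
```

The program assumes the opponent always picks the reply that is worst for us, and chooses the branch whose worst case is best. The number of positions grows very quickly with each extra move examined, which is why processing speed mattered so much.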
NEURAL NETWORKS
Neural networks are more advanced than expert systems in their ability to solve problems.
They work with patterns in information to gain new skills and learn to perform activities
for the user: the system is trained by feeding it data, in which it then searches for patterns.
This way of working proves to be more efficient in different sectors, such as finance,
economy, health sector, etc.
An example would be a neural network used to find out which financial operations are the
riskiest. The neural network studies and evaluates all the data in depth and generates its
own evaluation criteria. Then, as new applications arrive, the machine applies all that
knowledge to predict the associated risks.
As time progresses, the neural network is trained with new data and information, perfecting
its way of acting in relation to new trends.
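The risk example can be sketched with the smallest possible network, a single artificial neuron trained by gradient descent. Everything specific here is an assumption for illustration: the two input features, the six made-up training samples, the learning rate and the function names.

```python
import math

# Made-up training data: (debt ratio, share of late payments) -> 1 if the
# operation turned out to be risky, 0 otherwise.
SAMPLES = [(0.1, 0.0), (0.2, 0.1), (0.3, 0.2), (0.7, 0.8), (0.8, 0.6), (0.9, 0.9)]
LABELS = [0, 0, 0, 1, 1, 1]


def predict(weights, bias, features):
    """Risk estimate between 0 and 1 from a weighted sum of the inputs."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def train(samples, labels, epochs=2000, rate=0.5):
    """Fit the weights by repeatedly nudging them against the prediction error."""
    weights, bias = [0.0] * len(samples[0]), 0.0
    for _ in range(epochs):
        for features, label in zip(samples, labels):
            error = predict(weights, bias, features) - label
            weights = [w - rate * error * x for w, x in zip(weights, features)]
            bias -= rate * error
    return weights, bias


weights, bias = train(SAMPLES, LABELS)
print(round(predict(weights, bias, (0.15, 0.05)), 2))  # profile similar to the safe samples
print(round(predict(weights, bias, (0.80, 0.85)), 2))  # profile similar to the risky samples
```

Nobody writes the evaluation criteria by hand: they end up encoded in the learned weights, and calling train again with newer data is how the model would follow new trends.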
IMPORTANT
Neural networks perform a complete study of the situation and create
their own method with their own criteria. In addition, they manage to
carry out new activities for the user once they are fed the data of the problem.
Artificial intelligence is actively used in the business world, for example, in the fields of
management and administration, economics and finance, law, medicine, military industry,
etc.
Management is increasingly relying on artificial intelligence systems and using them to
help engineers, architects, doctors, etc., to create new knowledge systems.
Managers in many companies use artificial intelligence systems to plan their work, perform
competitive analysis, allocate resources, etc.
In addition, they use computer applications to help in the design of equipment, production of
goods, assessment and evaluation of workers, etc. In this way, artificial intelligence helps
companies in many aspects of their management and economic activity.
In addition, artificial intelligence also works in fields such as science and engineering.
Applications in these fields are used to organize and manage ever-increasing amounts of
data. Artificial intelligence supports complex processes such as mass spectrometry,
semiconductor circuit design and the design of vehicle components.
Artificial intelligence is increasingly being used in image analysis, robotics, power plant
design, etc. The largest use of artificial intelligence is in robotics.
In 1990, more than 200,000 robots were in use in companies in the United States.
Leading specialists believe that by the year 2025 robots will be able to replace humans in
many manufacturing and service jobs (shearing sheep, polishing walls, etc.). Nevertheless,
the presence of humans will always be necessary to design and create robots. However, as
robots are able to think and perform, human presence will be less necessary.
Large amounts of data can be managed and, in this way, business organizations have
benefited greatly: by collecting a lot of data and managing this information, companies
get to know the tastes and needs of their customers and can offer them better services
and products.
For example, in the case of Netflix, users search for a type of series or movie. Based
on those searches, the platform offers them series and movies that match their
tastes.
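A very simplified sketch of this kind of recommendation is shown below. It is not how Netflix actually works; the catalogue, the genre labels and the function recommend are invented for the example, which simply suggests titles that share genres with what the user searched for.

```python
# Hypothetical catalogue: title -> set of genres.
CATALOGUE = {
    "Space Patrol": {"sci-fi", "action"},
    "Galaxy Wars": {"sci-fi", "adventure"},
    "Love in Paris": {"romance", "comedy"},
    "Robot Uprising": {"sci-fi", "thriller"},
    "Laugh Factory": {"comedy"},
}


def recommend(searched_titles, catalogue=CATALOGUE, top_n=2):
    """Suggest unseen titles that share the most genres with past searches."""
    liked = set()
    for title in searched_titles:
        liked |= catalogue.get(title, set())
    scores = {
        title: len(genres & liked)
        for title, genres in catalogue.items()
        if title not in searched_titles
    }
    # Highest score first; ties are broken alphabetically.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return [t for t in ranked if scores[t] > 0][:top_n]


print(recommend(["Space Patrol"]))
```

Real platforms combine many more signals (viewing time, ratings, what similar users watched), but the principle of turning past behavior into a ranking is the same.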
• Anticipation. Artificial intelligence favors the automation of activities and organizes the
ways of using the flows of information generated. As a result, companies can
manufacture new goods and improve existing ones in order to be more competitive and
anticipate market tastes.
• Imitation of people. The goal of artificial intelligence is to get to know the human being
and manage to function and work like him. The goal is to help and solve problems in all
possible areas.
In addition, artificial intelligence can reproduce many human behaviors and activities in
society, since the new technology is able to carry out complex human activities,
generating artificial patterns and thus avoiding the mistakes that humans usually make.
• Suppression of monotonous activities. Artificial intelligence eliminates the
performance of recurring activities that used to be performed by people. In business
organizations, machines are replacing many of the tasks performed by employees. In
this way, processes are perfected so that machines can perform tasks repetitively,
unlimitedly and without errors or breaks, achieving great results.
Many advances have been made thanks to artificial intelligence. In the healthcare sector,
it has made it possible to optimize the management and operation of hospitals and
healthcare centers by improving the organization of the tasks of healthcare professionals
and the recording of admissions.
If we look at the field of video games, artificial intelligence has improved the graphics of
games and the way players interact and can compete individually against the video
game itself.
Artificial intelligence has made many advances in recent times. Society is becoming
increasingly receptive to the advantages it offers, leading more investors to back the
development of applications for many sectors. For example, customer ordering systems,
vehicle engine designs and self-diagnostic systems, among other applications, are developed
through artificial intelligence.
Machine Learning and Deep Learning are both encompassed within artificial intelligence,
which is being developed to make machines even more intelligent than humans.
Although Machine Learning and Deep Learning are treated as different aspects, they are
closely related.
• Weak applied artificial intelligence (narrow AI or applied AI). This kind of AI is built
through algorithms that learn by means of Machine Learning and Deep Learning.
Computer scientists know that they have to refine algorithms on a set of variables to
achieve reliable and concrete tasks.
Since the beginning of the development of artificial intelligence, algorithms have evolved
through decision trees and inductive logic programming (ILP) to handle large amounts of
data.
IMPORTANT
Computer professionals continue to work and refine algorithms to
achieve more reliable and secure methods.
The development of Machine Learning in the last few years has generated a new technique
known as Deep Learning.
Deep Learning is a subset within Machine Learning that works with the idea of learning
by example.
Instead of teaching the machine a huge number of rules for solving problems, Deep
Learning provides it with a model for evaluating examples and a few instructions for
adjusting that model when errors occur. Over time, these models are able to solve
problems more efficiently, as the system creates new patterns.
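The principle of "adjust the model only when it makes an error" can be illustrated with the perceptron, a much older and simpler learner than a deep network. The sketch below is an assumption-laden toy: the examples (the logical OR of two inputs), the learning rate and the function names are chosen only for illustration.

```python
def classify(weights, bias, inputs):
    """Output 1 if the weighted sum of the inputs is positive, else 0."""
    return 1 if bias + sum(w * x for w, x in zip(weights, inputs)) > 0 else 0


def train_perceptron(examples, epochs=20, rate=1.0):
    """Learn from (inputs, expected) pairs, changing the model only on errors."""
    weights, bias = [0.0] * len(examples[0][0]), 0.0
    for _ in range(epochs):
        for inputs, expected in examples:
            error = expected - classify(weights, bias, inputs)
            if error:  # the example was misclassified, so correct the model
                weights = [w + rate * error * x for w, x in zip(weights, inputs)]
                bias += rate * error
    return weights, bias


# Illustrative examples only: the logical OR of two inputs.
EXAMPLES = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
weights, bias = train_perceptron(EXAMPLES)
print([classify(weights, bias, inputs) for inputs, _ in EXAMPLES])
```

No rule for OR is ever written down. The program is given only examples and a correction step, and the correct behavior emerges after a few passes over the data.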
It must be said that there are different techniques to develop Deep Learning. One of the
most famous is to simulate a system of artificial neural networks within the software.
This artificial neural network consists of different layers of neurons (the "bodies" of the
network), the connections between them, and a direction in which data propagates through
each layer, each with its own analysis task. The aim is to feed enough data through the
layers for them to recognize patterns and classify them.
A great advantage is the ability to work with unlabeled data and evaluate its patterns of
behavior.
An example would be an image fed as input to the first layer. There it is divided into
thousands of pieces that each neuron studies separately, looking at shape, behavior, etc.,
so that each layer specializes in a particular characteristic.
Finally, the last layers of neurons combine all the information and generate the result.
Each neuron assigns a weight to its input, reflecting how correct or incorrect it is
relative to its function. The output is determined by the sum of these weights.
For example, the image of a cup can be studied as the shape it represents, the
background, its handle, etc.
The neural network ends by deciding whether or not the image shows the object in question,
and with training over time it reaches that decision with a better chance of success.
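The weighted sums described above can be sketched as a tiny two-stage network. The weights below are hand-picked for illustration (a trained network would learn them), and the three input features and the name is_cup_score are hypothetical.

```python
import math


def sigmoid(z):
    """Squash any number into the range 0..1."""
    return 1.0 / (1.0 + math.exp(-z))


def layer(inputs, weight_rows, biases):
    """Each row of weights defines one neuron; return every neuron's output."""
    return [
        sigmoid(b + sum(w * x for w, x in zip(row, inputs)))
        for row, b in zip(weight_rows, biases)
    ]


# Hypothetical, hand-picked weights. Input features are
# [has_handle, round_shape, is_flat]; the first stage has one neuron that
# reacts to the handle and one that reacts to the round shape.
HIDDEN_W = [[4.0, 0.0, 0.0], [0.0, 4.0, 0.0]]
HIDDEN_B = [-2.0, -2.0]
OUTPUT_W = [[3.0, 3.0]]  # the final neuron sums the two partial judgements
OUTPUT_B = [-4.0]


def is_cup_score(features):
    hidden = layer(features, HIDDEN_W, HIDDEN_B)
    return layer(hidden, OUTPUT_W, OUTPUT_B)[0]


print(round(is_cup_score([1, 1, 0]), 2))  # handle and round shape present
print(round(is_cup_score([0, 0, 1]), 2))  # neither present
```

Each neuron in the first stage specializes in one characteristic, and the final neuron turns the sum of their weighted outputs into a single score between 0 and 1.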
Today, technology has advanced sufficiently to make the practical use of Deep Learning
possible.
Deep Learning is advancing in such a way that progress is taking us to another reality, being
able to interpret the world in different ways through language analysis and image recognition.
There are many business organizations that are currently developing large applications.
In 2012, Deep Learning reached a milestone at Google, where a team led by Andrew Ng trained
a system to recognize cats in images from more than 10 million YouTube videos. At that
time, some 16,000 computer processors were needed.
There are currently examples such as Facebook, where photos uploaded to the network can
be tagged, or Uber, which uses artificial intelligence to optimize the trips made by its
customers by taking different logistics data as a reference, instead of using urban
transportation variables.
Another important area of Deep Learning is speech recognition. Google has been working for
many years using techniques such as Long Short-term Memory Recurrent Neural Networks
to improve its functions.
In recent times, GPUs have been used to achieve these functions more efficiently, saving the
need for large numbers of machines to perform the calculations.
Nvidia is one of the leading technology companies in this area. Many of its components are
taken as a reference, and it works on processors that run artificial intelligence
independently on the device itself, as in the case of drones.
The term insurtech combines "insurance" and "technology". Thus, insurtech refers to the new
big data technologies used in the insurance sector.
In recent years, insurtech tools have been applied in the insurance sector to offer
customers the best solutions.
The term fintech refers to recently created companies, i.e., startups that market financial
services with technological elements. In other words, they are financial companies that use
artificial intelligence technology in their marketing and activity.
These two terms have become increasingly important in recent years as technology has
evolved.
In this way, new methods are being implemented to optimize the operation and analysis of
this type of company in order to offer its products and services in the best conditions.
• The cloud. It is a very large group of servers spread all over the world and capable of
storing large amounts of information. In this way, all this information does not have to be
stored on a disk or a USB stick.
• Big data. It is a technology that develops methods for processing large amounts of
data. From there, more efficient models are created to achieve more accurate and
error-free operations.
DID YOU KNOW
THAT...?
Big data is a phenomenon that can benefit everyone if we know how to
deal with it, but we must not forget that this massive generation of data
can affect us personally, so increasing its security should be essential.
EXAMPLE OF INSURTECH
An example of this type of technology could be a computer program or application that offers
clients the possibility of generating an insurance policy in relation to their particular needs, in
addition to being able to initiate the contracting of the policy through the application on their
cell phone.
This application would facilitate customer transactions and, in addition, would allow
insurance companies to gather information about their customers and the general public.
BUT WHAT ARE THE MOST IMPORTANT BIG DATA TOOLS TODAY?
• Python. Python is a well-known, advanced programming language that is quite simple to
use compared with others.
It is used in big data because it makes the study and analysis of data simple.
Python is open source, which makes it a very collaborative tool, as users share their work
so that others can use it under better conditions.
In this way, it offers users access to a large amount of information from a wide range of
community libraries.
• R. A highlight here is the RStudio tool, which offers an editor with syntax support for
code execution, as well as tools for plotting, among others.
• Hadoop. The Hadoop tool is also licensed as open source and is considered the ideal
framework for storing large amounts of information.
– Fault tolerance. If a node fails, its work is moved to other nodes to ensure that the
job succeeds.
– Ability to store and process immense amounts of information at the same time.
– High-speed processing.
• Apache Spark. This tool is one of the key pieces in the fastest data processing today.
It is open source licensed, offering better user-generated solutions. In this way, the
community facilitates problem solving for new processes.
The great advantage of Apache Spark is that it can work with several programming
languages. Thus, users can program in the language they prefer, be it R, Scala or
Python.
• MongoDB. What sets MongoDB apart is its specialization: unlike relational databases,
it is a document-oriented (NoSQL) database.
• Apache Cassandra. It is one of the most widely used tools today. It is a
database that offers high performance in data input and output.
• Elasticsearch. This application is one of the most powerful big data tools for searching
large amounts of information. It is a software you can use, even if you work with complex
data.
What makes this tool relevant is its ability to index and analyze large amounts of
information in real time, allowing queries to be performed on it.
• Apache Storm. Apache Storm is a big data tool with a great capacity to
process large amounts of data in real time.
This software is very efficient in monitoring processes. That is, it can identify information
from social networks with high volatility.
– DDFS.
– MongoDB.
– HBase.
– MapR-DB.
• Apache Oozie. Apache Oozie gives cluster administrators the ability to build complex
data transformations out of multiple component tasks.
It is a workflow scheduler, and in this way it is used to manage Hadoop jobs.
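Several of the tools above, Hadoop and Spark in particular, process large data sets by splitting them into chunks, working on each chunk separately and then combining the partial results. The sketch below imitates that map and reduce pattern in plain Python on a single machine. It does not use the Hadoop or Spark APIs, and the data and function names are made up for illustration.

```python
from collections import Counter
from functools import reduce


def map_chunk(lines):
    """Map step: count the words in one chunk of the data."""
    return Counter(word.lower() for line in lines for word in line.split())


def merge(left, right):
    """Reduce step: combine two partial counts into one."""
    return left + right


# Hypothetical data set, already split into chunks as if stored on separate nodes.
CHUNKS = [
    ["big data needs big storage"],
    ["data moves fast", "storage is cheap"],
]

partial_counts = [map_chunk(chunk) for chunk in CHUNKS]  # could run in parallel
total = reduce(merge, partial_counts, Counter())
print(total.most_common(3))
```

Because each chunk is counted independently, the map step can run on many nodes at once, and a failed chunk can simply be recomputed elsewhere, which is the basis of the fault tolerance mentioned for Hadoop.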
KEY IDEAS
• Modern artificial intelligence, collaborating with current technologies, has diverse
functionalities and creates many types of benefits to different users.
• Computers operate on a "yes" or "no" basis, so they cannot reason about anything in
between. Moreover, those more modern computers capable of solving millions of
calculations per second cannot know what a "maybe" is.
• Expert systems do not have the typical problems that people have (illness or fatigue) and
are less prone to inconsistencies and errors.
• Managers in many business organizations use systems to plan their work, assisting in
functions such as competitive analysis, resource allocation, etc.
• Although Machine Learning and Deep Learning are treated as different aspects, they are
closely related. The importance they are having in current and future technological
advances, both in business sectors and in people's daily lives, should be emphasized.
• Deep Learning is advancing in such a way that progress is taking us to another reality,
being able to interpret the world in a different way, through language analysis, image
recognition, and thus anticipate many existing problems.
GLOSSARY
— Deep Learning. A subset of Machine Learning that works with the idea of learning by
example.
— Fintech. This refers to recently created companies, i.e., startups that market financial
services with technological elements. These are financial companies that use artificial
intelligence technology in their marketing and activity.
— Fuzzy logic. Traditionally, computers have worked with absolute decisions, in which
everything is either a "yes" or a "no". Fuzzy logic is a middle ground in which a
"maybe" can be accepted.
— Neural networks. Systems that find patterns in information in order to gain new skills.
They learn to perform activities for the user by being fed data, in which they then look
for patterns.
TOPIC 5
Industry use cases
Big data
Topic 5. Use cases in industry
OBJECTIVES
With the study of this lesson, the student will be able to:
• Understand the benefits that big data can bring to the medical sector.
• Understand the benefits that big data can bring to the civil engineering sector.
• Understand the benefits that big data can bring to the service sector.
5.1. MEDICAL USE CASES
Big data can be used in medicine, and in the healthcare sector in general, in
different areas such as clinical trials, genomics, clinical operations, telecare,
administrative management, etc.
The immediate future of the medical sector is closely linked to the use of digital tools to
analyze and consult, in a secure and organized manner, the large amount of data and
information that are being created with the technological evolution.
A multitude of data is generated by all these big data techniques, such as preclinical
reports, clinical reports, etc.
All this information makes medicine more effective, yielding data-driven, personalized,
predictive and organized medicine.
IMPORTANT
The use of big data in the healthcare sector is carried out in many areas
such as telecare, administrative management or clinical trials, among
others.
Traditionally, health analysis and research was carried out by taking as a reference a set
of individuals who represented the population, and then extrapolating these data to the
rest of the people.
This approach has become obsolete due to the evolution of technologies such as the
analysis of the human genome.
Individuals in a society can benefit from the use of techniques to analyze data in
medicine, for example to:
• Evaluate the efficacy and side effects of any drug or specific treatment.
• Segment the population into different groups according to different health characteristics.
• Classify individuals in order to select suitable participants and so improve the
efficiency of clinical trials.
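As a minimal sketch of the segmentation idea in the list above, the following Python example groups patient records by two health characteristics. The records, field names and the age threshold are hypothetical; a real study would choose its criteria on clinical grounds.

```python
from collections import defaultdict

# Hypothetical patient records; field names and thresholds are illustrative.
PATIENTS = [
    {"id": "p1", "age": 34, "smoker": False},
    {"id": "p2", "age": 71, "smoker": True},
    {"id": "p3", "age": 68, "smoker": False},
    {"id": "p4", "age": 45, "smoker": True},
]


def segment_of(patient):
    """Return the (age band, habit) segment a patient falls into."""
    age_band = "65+" if patient["age"] >= 65 else "under 65"
    habit = "smoker" if patient["smoker"] else "non-smoker"
    return (age_band, habit)


def segment(patients):
    """Group patient ids by segment."""
    groups = defaultdict(list)
    for patient in patients:
        groups[segment_of(patient)].append(patient["id"])
    return dict(groups)


for key, ids in segment(PATIENTS).items():
    print(key, ids)
```

Once the population is split this way, a clinical trial could, for example, recruit only from the segment it is designed to study.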
Due to new technologies in big data, the information can be analyzed and different situations
can be evaluated, showing the best possibility or treatment for each patient in question.
Being able to anticipate, and thus meet the needs of sick individuals, is one of the objectives of
big data in the use of medicine, in addition to generating better performance in laboratories
and healthcare centers, applying new big data technologies.
IMPORTANT
New technologies associated with big data make it possible to analyze
information, showing the best possibility to tackle any problem of the patient
in question.
Evaluating and studying social and historical data with these new technologies
can generate concrete preventive policies:
• Environmental health campaigns.
• Preventive medicine begins with a series of data, and from the study of these data, an
increase in social welfare is generated, since it provides us with relevant information
about our mental and physical health.
The evolution of the internet and our dependence on it offers both individuals and healthcare
professionals the ability to create and participate directly in online participation groups,
highlighting concepts such as e-patient, as well as developing approaches to big data such
as free text analysis, sentiment analysis, and automatic coding of clinical data to obtain:
• Personalized service.
• Adherence to treatment.
By studying the data obtained, people can make decisions about their health and illnesses.
In addition, a higher quality of care is achieved, with close, direct and personalized
treatment.
This is why, with the big data techniques used in the medical sector, it is possible to:
• Take advantage of all the information generated through social networks, social media
and health applications, etc.
• Optimize teleassistance.
Personalized medicine leads to closer and more personalized relationships between the
various healthcare professionals and patients, leading to greater involvement of all of them.
The realization of data analysis leads to the modeling of the individual and enables each
individual to do what is best for him or her, increasing the efficiency of the corresponding
system.
Taking the existing model as a reference, and all the data available, algorithmic models can
be created in the health sector, allowing to improve information, manage and study
individuals and their needs in better conditions, supporting health professionals:
• Epidemic control.
• Detection of high-frequency patients (those who visit the doctor's office twelve or more
times a year).
• Prediction of readmissions.
• Early detection of diseases.
• Evaluation of expenses.
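One item in the list above has a precise definition: high-frequency patients are those with twelve or more visits in a year. A minimal sketch of that check is shown below; the visit log and the function name are hypothetical, and only the threshold comes from the text.

```python
from collections import Counter
from datetime import date

FREQUENT_VISITS_PER_YEAR = 12  # threshold taken from the text


def high_frequency_patients(visits, year):
    """Return the ids of patients with twelve or more visits in `year`.

    `visits` is an iterable of (patient_id, visit_date) pairs.
    """
    counts = Counter(pid for pid, day in visits if day.year == year)
    return sorted(pid for pid, n in counts.items() if n >= FREQUENT_VISITS_PER_YEAR)


# Hypothetical visit log: patient "a" comes every month, patient "b" twice.
VISITS = [("a", date(2023, month, 5)) for month in range(1, 13)]
VISITS += [("b", date(2023, 3, 1)), ("b", date(2023, 9, 1))]
print(high_frequency_patients(VISITS, 2023))
```

The other items in the list, such as predicting readmissions, need statistical models rather than a fixed threshold, but they start from the same kind of visit records.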
In addition, there are countless possibilities to improve the healthcare system, the general
well-being of people, etc.
IMPORTANT
These new technologies offer infinite possibilities that will soon revolutionize how the
healthcare system is managed.
For example, the different analytical techniques (prescriptive, predictive and
descriptive) can be applied to medicine in general to study process risks efficiently, to
identify the most effective treatment for each individual, to understand the needs of
organizations and to know the availability of beds in a hospital, among many other
existing problems.
Another of the successes of the application of these new technologies is the ability to reduce the
cost of health services and mortality numbers, using big data techniques in medicine.
Data analysis in health, sports, and general wellness increases the likelihood of
success of all systems operating in the health sector.
The goal is to find successful software that takes advantage of the information obtained
from the data in order to make the best decisions.
The following are some of the applications in the engineering world in which big data plays a
key role:
However, although many cities in the world call themselves smart, in reality there are
very few that make use of new big data technologies, and that is why the efficiency
achieved by these cities is far from what can be considered a real smart city.
For a city to be considered smart, all the variables that may affect it must be known and
coordinated to achieve efficiency in areas such as urban mobility, resource management,
public safety, etc.
One of the most important aspects of smart cities that new big data technologies can
contribute to is traffic management.
With enough information, and using a series of appropriate machine learning techniques, it is
possible to know in advance how something is going to behave or what is going to happen, in
order to make decisions.
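A very small illustration of predicting behavior from past data is fitting a straight line to hourly traffic counts and extending it one hour ahead. The counts below are invented, and the straight-line assumption is a deliberate simplification; real traffic models use far richer data and techniques.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x


# Hypothetical counts: vehicles per hour on one street at 6, 7, 8 and 9 a.m.
hours = [6, 7, 8, 9]
vehicles = [200, 410, 590, 800]

slope, intercept = fit_line(hours, vehicles)
forecast = slope * 10 + intercept  # estimate for 10 a.m.
print(round(forecast))
```

With an estimate like this available in advance, a city could, for instance, adjust traffic light timing before congestion builds up.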
EXAMPLE
Amazon Polly is a tool that converts text into speech, allowing you to
create applications that speak, as well as new product categories with
that capability.
• Study what happens when major events such as concerts or soccer matches take place.
It is, therefore, one of the tools that will have the greatest impact in the future within
modern cities, being essential in the development and evolution of these in the coming years.
Today, people are not aware of the importance of logistics in real life, and that it plays a
major role in the world as we know it.
This is why, due to its importance and the constant evolution of the sector, work must be
done to make logistics more efficient, and consequently increase management and planning
in all senses of the word.
IMPORTANT
Big data applied to new information and communication technologies
generates great benefits in the logistics sector, as it allows, among other
functions, to facilitate better transport routes, study cost data, and thus
improve customer experiences.
It should be noted that one aspect that revolutionized sea transportation has been the
standardization of the container, and it is through new big data technologies that the industry
continues to evolve.
Nowadays, and due to the use of big data applications, business organizations in the sector
can consult in real time the transports and shipments in order to control all the key aspects of
the business.
In this way, big data applications in relation to logistics lead to a great increase in efficiency,
in addition to improving the customer experience and even creating new business models.
In relation to road transport, the ORION (on-road integrated optimization and navigation)
system improves the planning and management of commercial routes, due to the study and
evaluation of routes through a system of recommendations, improving the efficiency of road
transport.
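The text does not describe how ORION computes its recommendations. Purely to illustrate what route optimization means, the sketch below uses a simple nearest-neighbour heuristic on made-up delivery points; it is a stand-in technique, not the method used by ORION.

```python
import math


def nearest_neighbour_route(depot, stops):
    """Visit every stop by always driving to the closest one not yet visited."""
    route, current, pending = [depot], depot, list(stops)
    while pending:
        nearest = min(pending, key=lambda stop: math.dist(current, stop))
        pending.remove(nearest)
        route.append(nearest)
        current = nearest
    return route


def route_length(route):
    """Total straight-line distance travelled along the route."""
    return sum(math.dist(a, b) for a, b in zip(route, route[1:]))


# Hypothetical delivery points as (x, y) coordinates in km.
DEPOT = (0, 0)
STOPS = [(5, 5), (1, 0), (1, 3), (4, 4)]

route = nearest_neighbour_route(DEPOT, STOPS)
print(route, round(route_length(route), 2))
```

Comparing route_length for the order in which the stops were received with the length of the optimized route shows where the savings in distance, fuel and time come from.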
OTHER APPLICATIONS
As is well known, the use and development of big data technologies is mainly focused on
management and planning.
In previous points we have seen the use of big data in smart cities and logistics
management, but it can also be used in airports, road safety, etc.
It should be added that the use of big data is not limited to the management and planning of
any of the sectors. On the contrary. In the near future, it will be used and implemented in any
use or sector of civil engineering.
We just have to wait and appreciate the greatness of the new technologies and the successes
that they will achieve in the future.
The same is true for movie and series platforms such as HBO or
Disney+.
The markets are continuously functioning and operating, and the operations that are carried
out cause a constant flow that generates millions of movements, from which a great deal of
information can be extracted:
• Minimize risks.
• Detect illegalities.
PUBLIC SECTOR
It is well known that public administrations are the ones that manage and carry out those tasks
that must be solved in quite problematic situations in the following areas:
• Economy.
• Transportation.
• Safety.
• Environment.
This is why solutions are continually being sought to solve problems and thus
to meet the corresponding needs:
• Improve services.
• Knowledge management.
• Cybersecurity.
• Infrastructure.
• Evaluation of data for better management of complaints and claims filed by citizens.
As is well known, in recent years, the telecommunications media sector has evolved and is
evolving by leaps and bounds.
This is why the different media in this sector should focus on:
• Obtaining predictions from the information collected in order to make the right
decisions.
• Detecting the different incidents that may occur in real time, and thus
providing a response as soon as possible.
RETAIL
The retail sector, in its various forms, continues to undergo continuous transformation at
a dizzying speed.
IMPORTANT
Big data applications in the retail sector provide important customer
information reflecting tastes, trends and payment methods in purchasing
decisions.
This is why this sector is taking advantage of the different big data technologies to
achieve better experiences for its customers:
Thus, the industry should use the different big data technologies for the following actions:
• Segmentation by profile.
• Inventories.
• Expense management.
• Revenue management.
• Market research.
• Customer-oriented approach.
Big data is a trendy term, and even today it raises many questions every time it comes up.
More and more applications and examples of big data are being created in different sectors.
For example, when we browse Netflix or Spotify, they suggest series or songs based on
previous searches or plays already made in the applications.
Although the speed and amount of information used by these applications is not immediately
apparent, there are a myriad of machine learning technologies working to improve customer
experiences.
In some areas, big data is proving to be a major differentiator.
We have already talked about different sectors where big data improves people's content
and experiences. In addition, these applications are concentrated in certain economic
sectors, where they lead to better results:
• Big data in digital marketing. Traditional marketing relied on
customer relations and surveys to do its work.
In the past, organizations would run advertisements in different media, and the
impact of these advertisements on individuals was not really known.
But along came the internet and the development of big data technologies, and digital
marketing was transformed.
Today, big data can generate enormous amounts of information and data in a short period
of time, thus providing insight into customer behaviors and needs.
This study of data and information helps professionals in the sector to run
campaigns and communications aimed directly at their customers, according to their
preferences and tastes, in order to give those customers better experiences.
One of the biggest success stories of big data in this sector is Amazon,
which analyzed the millions of purchases made by its customers around the
world, including their tastes, purchasing preferences and payment
methods, in order to generate offers and advertisements tailored to those tastes and
ways of acting.
• Big data in business insights. One of the biggest applications of big data is the way in
which it can be used to generate business information.
If these data are used correctly, many current problems can be solved, generating
greater profits, better product development and greater customer satisfaction.
More and more organizations are realizing the importance
of big data and of using these technologies to analyze all their information and generate
great business results.
One example is the success of Netflix, which uses the data it obtains to learn how
customers behave, which series and movies they like, etc.
• Big data in the banking sector. The enormous amount of information and data that the
banking sector continuously receives has skyrocketed with the outbreak of the pandemic.
With the information received, studies and analyses can be carried out in order to help,
for example, to detect illicit activities, such as:
– Money laundering.
– Alteration of data.
Currently, there are different applications to detect money laundering, such as SAS AML,
which uses data analytics in banking to detect suspicious transactions and study the data
associated with this type of movement.
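The idea of detecting suspicious transactions can be illustrated with a minimal sketch in
Python. It does not reproduce how SAS AML or any other commercial tool works: the function
name flag_suspicious, the sample amounts and the rule itself (flag any amount more than ten
times the customer's median amount) are assumptions made only for this illustration.

```python
from statistics import median


def flag_suspicious(amounts, factor=10.0):
    """Return the positions of amounts far above the customer's typical amount.

    Hypothetical rule for illustration only: an amount is flagged when it is
    more than `factor` times the median of all the amounts supplied.
    """
    if not amounts:
        return []
    typical = median(amounts)
    return [i for i, amount in enumerate(amounts) if amount > factor * typical]


# Invented sample data: one customer's recent transfers.
transactions = [120.0, 80.0, 95.0, 15000.0, 110.0, 70.0]
print(flag_suspicious(transactions))  # prints [3]
```

In practice, a flagged transaction would only be a candidate for review by an analyst, not
proof of illicit activity.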
EXAMPLE Nowadays, all banking entities work with computer applications that
build and maintain files and statistics on each and every one of their
customers in order to offer them, based on their contracted products and
services, other products or services that may be of interest to them.
For example, if a client has a certain amount of money in one of their
accounts at a certain bank, the bank's IT tools will show on screen
investment possibilities, such as mutual funds or fixed-term deposits,
for that money.
The employees of that entity will obtain that information and will then sit
down with the client to offer that investment possibility, so that the
client can obtain a return on their savings.
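The banking example can be sketched as a simple rule in Python. This is only an
illustration under stated assumptions: the Customer class, the suggest_products function,
the 20,000 balance threshold and the product names are hypothetical and do not describe
any real bank's system.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    name: str
    balance: float   # money currently held in the account
    products: set    # products the customer has already contracted


def suggest_products(customer, min_balance=20000.0):
    """Suggest investment products the customer does not hold yet.

    Hypothetical rule: suggestions are made only when the balance reaches
    `min_balance`; the threshold and product list are invented.
    """
    if customer.balance < min_balance:
        return []
    candidates = ["mutual fund", "fixed-term deposit"]
    return [p for p in candidates if p not in customer.products]


client = Customer("Ana", 35000.0, {"current account", "mutual fund"})
print(suggest_products(client))  # prints ['fixed-term deposit']
```

The employee would see this list on screen and decide whether to raise it with the client.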
• Big data in the media and entertainment industry. When customers access different
services through different devices, a large amount of data is created.
This is why a large amount of information is generated today and must be analyzed by big
data applications in the media and entertainment sector.
EXAMPLE An example of the use of big data in the media and entertainment
industry is Amazon Prime, the streaming platform for watching movies
and series.
When users search this platform for some type of series (adventure,
comedy, science fiction), the platform collects information about
their tastes, and thus knows what content to show them.
For example, if a person has recently searched for
science fiction movies, the platform recognizes this and will again show
results or movies related to that topic.
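The behavior described in this example can be approximated with a very small sketch in
Python. It is not the algorithm used by Amazon Prime or any other platform: the catalog
titles, the recommend function and the rule (show titles from the most searched genre) are
assumptions made for illustration.

```python
from collections import Counter

# Invented catalog: title -> genre.
CATALOG = {
    "Star Voyage": "science fiction",
    "Laugh Factory": "comedy",
    "Orbit Nine": "science fiction",
    "Jungle Trail": "adventure",
}


def recommend(search_history, catalog, limit=3):
    """Return up to `limit` titles from the genre the user searched most often."""
    if not search_history:
        return []
    favorite, _ = Counter(search_history).most_common(1)[0]
    return [title for title, genre in catalog.items() if genre == favorite][:limit]


history = ["science fiction", "comedy", "science fiction"]
print(recommend(history, CATALOG))  # prints ['Star Voyage', 'Orbit Nine']
```

Real platforms combine many more signals (plays, ratings, time of day), but the principle
of turning past searches into future suggestions is the same.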
In addition, social media platforms generate a lot of information as users
interact with them.
The organizations in this sector have seen a great opportunity
to grow by processing this data with the different big data applications.
Here are some of the many benefits of using big data in social networks:
– Predicting audiences.
– Optimized programming.
• Big data in the government sector. Nowadays, governments collect a large amount of
information and data on a daily basis. This means that they must manage and analyze
citizens' trends and tastes, as well as different variables such as geographic cases,
energy resources, etc.
This study and evaluation by governments helps them in many different ways, such as:
– Know how to study and evaluate different circumstances in order to be able to make
immediate and appropriate decisions.
Today, big data relies on countless tools and applications that cover countless
areas and handle data in continuous movement and growth.
Traditional software applications and programs use databases that are not
the most appropriate for this kind of data.
DID YOU KNOW THAT...? The media industry has always generated data, whether it's research,
sales, customer databases, log files, etc. The technical solutions,
strategies, and data sets of big data offer the ability to manage and
disseminate data at speeds and scales that have never been seen
before.
In addition, current analysis and research applications are urgently needed to be able to
retrieve large amounts of information and data to achieve business objectives.
EXAMPLE
One example of the use of big data today is Amazon Translate. This is a
neural machine translation service that translates languages easily and
simply.
The following are some of the programs used to manage big data:
• Hadoop. It is an open source tool that helps to process large amounts of data and, once
the data has been retrieved, to study and evaluate it.
One of the companies that has used Hadoop, to the point of signing a partnership
agreement with Cloudera, a distributor of Hadoop-based software and services, is
Oracle. Many of this company's customers had a problem with their data: since the early
2010s, it had been growing at a rate of 40% per year, which also meant an annual
increase of between 3% and 5% in IT budgets.
Oracle has Oracle Big Data Appliance, which is capable of reducing the volume of data
by up to ten times in order to consolidate its storage and increase its level of
sustainability. However, when the amount of data is problematic, Oracle turns to Hadoop.
• NoSQL. These are database systems that do not use SQL as their main query language.
Although they often relax the strict data integrity guarantees of relational databases,
significant benefits in scalability and performance can be obtained.
Among the most important NoSQL databases are MongoDB and Cassandra.
Cassandra is an open source database whose most relevant feature is that it combines
ideas from Amazon's Dynamo and Google's Bigtable. In this way, it offers the possibility of
solving problems associated with search performance. In fact, it
was created to scale horizontally in a highly scalable and economical
way.
Among the organizations that use it, we can mention Facebook, Netflix or Twitter.
• Spark. Among the organizations that use this data processing engine are:
– Pinterest. This social network uses Spark to learn how its users react to certain pins
in real time, so its algorithm can make personalized recommendations based on those
interactions.
– Conviva. This video streaming platform uses Spark to optimize video traffic and,
among other things, reduce video churn.
– Uber. It structures the vast amount of information it collects from its users' trips using
various programs, including Spark Streaming. However, more complex analyses of
structured data are only performed with Spark.
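To give an idea of how tools such as Hadoop process large amounts of data, the following
sketch in Python imitates the MapReduce processing model on a single machine. It is an
illustration only: the function names are invented, nothing here is part of the Hadoop
API, and a real cluster would distribute each phase across many computers.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one line of text."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all the emitted values by their key (the word)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: add up the values collected for each word."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(lines):
    """Run the three phases in sequence over a list of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return reduce_phase(shuffle(pairs))


print(word_count(["big data", "Big tools"]))
```

Because each line is mapped independently and each word is reduced independently, the
work can be split among many machines, which is what makes this model suitable for very
large data sets.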
KEY IDEAS
• There is a multitude of data generated by all those big data techniques, such as
personal data, clinical reports, etc.
• Being able to anticipate and thus meet the needs of sick individuals is one of the
objectives of big data in the use of medicine.
• When new big data technologies are used in the medical sector, their most important
purpose is to save lives, and also to minimize costs.
• Data analytics in health, sport and general wellness increases the likelihood
of success of all systems operating in the health sector.
• The trade sector, in its various aspects, continues to undergo continuous transformation
at a terrifying speed.
• Today, governments collect a large amount of information and data on a daily basis. This
means that they must manage and analyze the trends and tastes of citizens, in addition
to different variables such as geographic cases, energy resources, etc.
GLOSSARY
— Big data in business insights. How to generate business information.
— Big data in finance. Tools and computer applications focused on the study and
development of information data in the banking and financial sector.
— Big data in digital marketing. Big data makes it possible to generate huge amounts of
information and data in a very short time, and thus to know the behaviors and needs
of customers.
BIBLIOGRAPHY
Casas, J. (2019). Big data: data analysis in massive environments. Barcelona: Editorial UOC.
Martínez, M. Á. (2018). Public health concepts and preventive strategies. Editorial
VitalSource.
Marr, B. (2016). Using Big Data, analytics and SMART metrics to make better decisions and
increase performance. Editorial Teell S.L.