Big Data Framework

Abstract— We are constantly being told that we live in the Information Era – the Age of Big Data. It is clearly apparent that organizations need to employ data-driven decision making to gain competitive advantage. Processing, integrating and interacting with more data should make it better data, providing both more panoramic and more granular views to aid strategic decision making. This is made possible via Big Data exploiting affordable and usable computational and storage resources. Many offerings are based on the Map-Reduce and Hadoop paradigms, and most focus solely on the analytical side. Nonetheless, in many respects it remains unclear what Big Data actually is; current offerings appear as isolated silos that are difficult to integrate and/or make it difficult to better utilize existing data and systems. This paper addresses this lacuna by characterising the facets of Big Data and proposing a framework in which Big Data applications can be developed. The framework consists of three Stages and seven Layers that divide a Big Data application into modular blocks. The aim is to enable organizations to better manage and architect a very large Big Data application to gain competitive advantage, by allowing management to have a better handle on data processing.

Keywords: Big Data, data scientist, analytics, business intelligence, information management, strategy, Hadoop

I. INTRODUCTION

We live in the Information Era – the Age of Big Data [1][2]².

As an example, Big Data's significance and power became apparent when the results of the 2012 US Presidential Election were announced. Complex analytics processing large data not only predicted the exact election results but may also have influenced them [3][4]. Further, leading business magazines and economic newspapers run frequent articles about Big Data's successes [5][6].

However, it should be recognised that Big Data is not something new; it has long been the playground of the elite. The aim then was to maximise expensive CPU utilisation. As a result, it had a limited audience, as computation and storage were expensive and difficult to utilise, requiring detailed systems knowledge, even though capacity doubles every 18 months [7].

In the Big Data era, computation and storage are cheap per TB. Therefore, with ever-growing computational capabilities, system utilisation is no longer as critical a factor. It is now feasible to use more computational power to do the same work (hence with lower utilisation). At the same time, the amount of data that needs processing has been increasing exponentially in the past decade as a result of improvements in data generation and storage capacity [1]. Above all, programming tools and methodologies have matured with globalisation and the Internet. It is increasingly feasible to reuse code (and also share it); therefore, the focus has moved to integrating code created by different communities.

High-performance network capacity, which provides the backbone for high-end computing systems, has not increased at the same rate as processing and storage capabilities. Therefore, the constraint in computation has simply shifted from moving data to a big supercomputer, to moving the application to many smaller computers where the data resides (function shipping rather than data shipping). Programming such an approach is not new: the application is executed where the data is kept, in a loosely coupled and highly distributed architecture [8].

In contrast, Relational Database Management Systems (RDBMS) tend to provide access to data as one Big Data silo based on efficient, closely coupled systems. Structured Query Language (SQL) is the de facto method to access databases, as it provides relatively easy access to data at different levels within organisations. It is common to see low-level programmers and high-level business analysts sharing the same piece of SQL and understanding, or trying to understand, it.

This sharing model has its limitations and cannot exploit and handle the massive increase in static, non-changing data. Recently, there has been an increase in NoSQL approaches to overcome these weaknesses [9]. Despite their relatively recent emergence, there are now more than one hundred NoSQL approaches that specialize in the management of different multi-modal data types (from structured to non-structured), with the aim of solving very specific challenges. Most are powered by the Map-Reduce paradigm that came from Google, which is based on a massively distributed architecture that exploits cheap commodity hardware. As a result, the need for an efficient mechanism for storing and processing data is eliminated. It is in fact cheaper to duplicate (for reliability) and to over-compute (process duplicate data), as communication is relatively more expensive than storage and computational resources (and this gap is increasing).

¹ Firat Tekiner has an honorary research fellowship at the School of Computer Science, University of Manchester.
² We bemusedly note that "big" has to an extent replaced "very large" from a previous generation – both remain undefined in any quantitative sense and seem to mean "whatever data amount challenges the state-of-the-art".
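The function-shipping, Map-Reduce style of processing described above can be illustrated with a minimal single-process sketch. The helper names (`map_fn`, `reduce_fn`, `map_reduce`) are ours and not part of any Hadoop API; in a real deployment the map and reduce phases run in parallel on the nodes that hold the data.

```python
from itertools import groupby
from operator import itemgetter

def map_fn(document):
    """Map phase: emit (word, 1) for every word; runs where the data lives."""
    for word in document.split():
        yield (word.lower(), 1)

def reduce_fn(word, counts):
    """Reduce phase: aggregate all counts emitted for one key."""
    return (word, sum(counts))

def map_reduce(documents):
    # Map each input split independently -- these calls are parallelisable.
    intermediate = [pair for doc in documents for pair in map_fn(doc)]
    # Shuffle: sort and group the intermediate pairs by key before reducing.
    intermediate.sort(key=itemgetter(0))
    return dict(
        reduce_fn(word, (count for _, count in group))
        for word, group in groupby(intermediate, key=itemgetter(0))
    )

print(map_reduce(["big data", "big big frameworks"]))
```

The shuffle step (sort and group by key) is what lets each reduce call see every value for one key, regardless of which map task emitted it; this is also why duplicated, over-computed intermediate data is tolerable in the paradigm.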
challenges in HPC to achieve fault tolerance and availability. Therefore, it paves the way for the development of highly parallel, highly reliable and distributed applications on large datasets.

III. BIG DATA FRAMEWORK

A. Big Data Characteristics

Big Data is not only driven by the exponential growth of data but also by changing user behaviour and globalization. Much more time is being spent online and using mobile devices. Furthermore, globalization of the marketplace increases competition. As a result, organizations constantly look for opportunities to increase their competitive advantage in an increasingly competitive marketplace by using better analytical models. Hence, it is necessary to present findings in a more clear and concise form. In turn, there has been a commensurate increase in business intelligence applications that allow better reporting and visualization of the data.

To derive the framework, we firstly define the characteristics of "Big Data":

1. Data/Processing Volume and Scale
2. Variety and Heterogeneity of Data/Sources
3. Speed and Timeliness of Information Requirement
4. Targeted Services, Products, Solutions and Applications
5. Data Presentation, Usability and Interpretation
6. Data Privacy, Error Handling and Security

Data volume has been increasing exponentially: up to 2.5 exabytes of data are already generated and stored every day, and this is expected to double within 40 months, by 2015 [20]. As always, there remains the challenge of processing such volume.

The variety and heterogeneity of data sources and storage have increased, fuelled by the use of cloud, web and online computing. The challenge then becomes identification of the data that will add value, and hence increase information content and competitive advantage [21]. Clearly, currency of information is crucial, as analytics derived from new data are usually more valuable than those from old.

For example, consider an online business which analyses every click on its website. An advert or offering is made based on the user's movements and activities whilst browsing and shopping online. However, the adverts can be better targeted and the customers better segmented if customer profiles can be updated and integrated in sub-second intervals [22]. There are different patterns to the data and it is presented in different shapes. For example, management requires reporting and statistical analytics to be made available based on new data in order to be able to respond rapidly to changing requirements [21]. As these analytics provide predictive insight, the resulting decisions are both more robust and timely.

However, a shortage of skills and immature tools make it a daunting task for organizations to present and interpret this newly discovered information and capability. Current hierarchical management models introduce difficulties in dynamic development and adaptation in an ever-changing marketplace. Therefore, organizations that can respond and employ talent to understand, analyze, process and manage this information life cycle will lead the way. This is what has generated the interest, and hype, associated with Big Data. As a caveat, while integration of data provides many advantages, a significant associated risk is data privacy and ownership of the data. This is usually omitted or not understood at this stage (and sometimes later), as the rush is to gain competitive advantage.

The paper argues that a Big Data application is the orchestration of all the software and hardware systems within the enterprise that generate and process data. It means something different for each person, application or organization. For example, an input can be even a click whilst browsing a web page, or a heartbeat packet sent over the network to signal that a system is still up and running. Up to now it was infeasible to store or process information at this level. However, the information era is changing this.

B. Framework

There are formal approaches to project management that provide a methodology to manage Information Systems and drive strategy in organizations; to name a few, Open Group's TOGAF, IBM's Zachman and Gartner's methodology [23]. However, these are not designed to provide a framework with a data focus; rather, their aim is to provide methodologies to manage large information systems. In addition, [24] proposes a framework that looks into Big Data governance with the aim of managing people and policies. In contrast, having identified the characteristics of Big Data, this paper aims to define a framework that captures all the stages of a Big Data application from a strategic point of view, focusing on data. Although [25] provides a Big Data methodology with a data focus, it does not take into consideration the systems aspect of a Big Data environment. Furthermore, there is a need to bridge strategic decision making and real-life scenarios. This paper aims to fill this space.

Without a coordination and structuring framework there is likely to be much overlap amongst applications, duplication in stored information and confusion around the responsibilities of each business unit and application. The framework here seeks to document the borders of each modular block, to allow gaps to be spotted in a Big Data application and to provide solutions through closer integration. Further, it aims to highlight how Map-Reduce can be included in the different stages and layers of the Big Data application life cycle. Whilst doing this, all surrounding issues and approaches are considered.

The framework should ultimately provide a basis to develop and manage Big Data applications whilst identifying strategies based on core competencies and weaknesses. In addition to the 7 layers identified in Figure 2, the process as a whole can be summarized in 3 main stages as below:

STAGE 1: Multiple Data Sources – Choose the Right Data [26]
STAGE 2: Data Analysis and Modelling
STAGE 3: Data Organization and Interpretation
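The three stages can be sketched as a toy pipeline. The click-stream records and function names below are purely illustrative assumptions of ours, not part of the framework itself:

```python
# Hypothetical raw feed mixing user behaviour with system traffic.
raw_sources = [
    {"user": "u1", "event": "click", "page": "/offers"},
    {"user": "u1", "event": "heartbeat"},          # system packet, not behaviour
    {"user": "u2", "event": "click", "page": "/offers"},
]

def stage1_acquire(records):
    """Stage 1: choose the right data -- filter and attach provenance metadata."""
    return [dict(r, source="weblog") for r in records if r["event"] == "click"]

def stage2_model(records):
    """Stage 2: analysis and modelling -- a trivial per-page popularity count."""
    counts = {}
    for r in records:
        counts[r["page"]] = counts.get(r["page"], 0) + 1
    return counts

def stage3_interpret(counts):
    """Stage 3: organisation and interpretation -- present a concise finding."""
    page, hits = max(counts.items(), key=lambda kv: kv[1])
    return f"most visited page: {page} ({hits} clicks)"

print(stage3_interpret(stage2_model(stage1_acquire(raw_sources))))
```

Stage 1 filters out system traffic (the heartbeat) and tags provenance, Stage 2 derives a pattern that was not explicit in the raw data, and Stage 3 renders it as a concise finding for decision makers.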
Stage 1 is concerned with the acquisition and filtering of data by applying correct metadata and processes. Multiple data sources are integrated and transformed to add meaning to the data. This process is the major source of added value (to data) and allows organizations to gain competitive advantage.

Stage 2 then uses the information prepared in Stage 1 to apply analytics and predictive models to find relationships and patterns that were not initially known. The level of intelligence applied depends on the computational capabilities and skill-set available, together with the business requirements. Big Data uses internal and external datasets from a variety of sources to provide information to aid strategic decision making to gain competitive advantage. It allows focus on the current and the future rather than the traditional historical reality. Whilst doing this, it further requires cross-functional collaboration at both the business and technical levels (data sources and systems) [27].

Stage 3 then deals with modelling the source information and mapping the data to the target model whilst interpreting the meaning of the newly discovered information. The relational data model does not naturally accommodate the unstructured and heterogeneous data sources that are expected to be available in Big Data applications [28]. Therefore, there may not be an upfront model whilst organizing the source data to the target. As a result, a large number of applications have emerged that focus on providing access to these data sources via NoSQL, without using SQL. They attempt to create indexing schemes similar to RDBMS and provide quick access to data residing in the Hadoop file system [29][30].

Presentation and visualisation of data is an important task. The NoSQL option changes the dynamics in terms of accessing and presenting the data. With increasing data to be analysed and processed, the output needs to address both clarity and precision of presentation. In addition, interpretation of results is a major challenge that requires highly skilled staff.

The processing stages described map onto the 7 layers of the framework. Each application may focus on different layers and may not employ all parts of it. A Big Data application then becomes a major orchestrating effort whereby a large number of moving parts need to be composed to work seamlessly to achieve results that enable competitive advantage.
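The absence of an upfront model can be sketched as follows. The stored JSON lines and the query function are hypothetical, chosen only to show a schema applied at read time in the NoSQL style rather than enforced at load time:

```python
import json

# Heterogeneous records stored as-is, with no upfront (schema-on-write) model.
stored = [
    '{"customer": "c1", "order_total": 120.0}',
    '{"customer": "c2", "tweet": "great service"}',   # unstructured-style source
    '{"customer": "c1", "order_total": 80.0, "channel": "mobile"}',
]

def totals_by_customer(lines):
    """Apply a model only at read time: pick out the fields this query needs."""
    totals = {}
    for line in lines:
        record = json.loads(line)
        if "order_total" in record:       # skip records lacking the modelled field
            key = record["customer"]
            totals[key] = totals.get(key, 0.0) + record["order_total"]
    return totals

print(totals_by_customer(stored))
```

Records that lack the queried field are simply skipped rather than rejected at load time, which is how heterogeneous sources can be stored as-is and organized toward a target model only when a question is asked of them.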
From a detailed perspective, how the Map-Reduce tasks will be applied depends on the application. As each map and reduce process can run in parallel, both can be used to speed up processing. Furthermore, at any given time, a number of Big Data applications can run at different layers or at different stages.

An important challenge is to bring together and map the relational database model with columnar and key-value stores and unstructured data. For example, banks are experts about their customers; such information may be multiplied in value if joined together with unstructured sources [5]. In addition, Business Intelligence and reporting applications requiring aggregations on a certain field are best served by a DBMS that employs columnar storage. Given that Business Intelligence traditionally uses RDBMS accessed via tools based on SQL, a change is needed.

Whilst the framework looks at and across all dimensions of the problem, almost all current Big Data approaches are silo-based, without coherent linkage or integration. The modelling and mapping layer aims to provide this with respect to data. For example, how system and storage resources could be shared amongst the different applications that would come under the Big Data framework is not a primary aspect of the design. This resource scheduling and maintenance needs to be managed at the system layer. In terms of storage, the challenge is bigger due to recent improvements in the medium; hybrid solid state, optical and hybrid disks operate at various speeds, in addition to slower archiving systems. Separately, the data layer is expected to manage different data sources, handle data lineage and eliminate the duplication that would otherwise be inevitable in an island of applications. This gap will grow further as storage struggles to cope with data growth and the Hadoop File System (HDFS) provides a cheaper yet reliable and performant alternative to current storage systems [31]. In addition, data privacy and security [32] aspects could be more easily managed within this layer under one common enterprise-wide policy.

It has to be noted that there are components which cannot be divided, such as SQL and DBMS, or DBMS and hardware. For instance, modern DBMS make explicit use of, and manage, the underlying hardware. As a result there will also be overlaps and islands in an organisation; hence the need for the framework and for abstraction based on the different layers.

IV. CONCLUDING REMARKS

Processing larger datasets has become increasingly possible over the past few years for a much larger community, not least via the development of the Map-Reduce paradigm. Map-Reduce enables the power of parallel computing to be available to standard data analysis tasks³.

As mentioned before, the main challenge in applying many islands of Big Data applications is to identify the defining lines of each application and their inter-relationships, in order to manage and integrate this spread within an organisation. To clarify what Big Data really is: it is the enterprise data processing environment for heterogeneous data and computational sources, operating in a timely manner, to gain competitive advantage. This results in the processing of high volumes of data and the presentation of this in a concise and clear manner to aid operational and strategic decision making.

Due to the immaturity of the field, there is little or no coordination across Big Data silos (applications). Big Data not only requires in-depth systems and data expertise but also requires strategic insight, due to the nature of the applications [11]. Such applications are evolving very quickly and are designed to aid strategic decision making by responding rapidly to changes in the marketplace. Organisations that are very hierarchical and bureaucratic may initially struggle to compete with the new economy companies [33]. This is evident from the fact that the companies that successfully apply Big Data applications are the likes of Google, Amazon, Yahoo and Facebook, the leading new economy companies.

There is a lack of the multi-faceted role skills (analyst, developer, architect and management) required to orchestrate such applications [16]. The framework proposed here aims to document and structure this gap and provide a starting point for practitioners, analysts and management to develop and exploit their Big Data applications.

The framework can be seen as a cube corresponding to the levels. Each face represents an important level within the Big Data space, while the cube as a whole represents the entire Big Data space and the integrated whole we believe to be essential for effective deployment and evolution of the associated applications. To achieve this requires efficient and effective orchestration, integration and coordination of skills that address the challenges both within and across all seven levels defined in the previous section. This then needs to be further complemented by novel management and decision-making strategies [34][35]. Thus there is a need for more technical managers and decision makers, and the lack of people with analytic skills is likely to be a challenge.

Big Data intersects with numerous domains including data integration, hardware and software, databases, Business Intelligence, system integrators and consulting firms [36]. The associated skill set is vast, and this is one of the challenges and confusions surrounding Big Data applications [15]. The associated scale is daunting and well indicates the need for integration to achieve effective and efficient use of Big Data [37]. Organizations need to be singularly focused whilst providing and employing Big Data solutions, as management of all elements of the framework is challenging. Furthermore, there are many, and growing, numbers of applications that aim to use this improved "knowledge". When all these aspects are considered, the combined issue is technical management and people management. Traditional management does not understand, and cannot be expected to understand, what can be achieved and what the related challenges are. Hence, the framework is proposed to aid the decision-making process and bridge the gap between business needs and technical realities.

³ It should be noted that Map-Reduce has weaknesses in that it is not, by design, general-purpose, but rather was designed for something very specific: keyword processing and access.
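The earlier point that aggregations over a single field are best served by columnar storage can be made concrete with a small sketch; the table layout and field names here are invented for illustration, not drawn from the paper:

```python
# Row layout: each record is stored together, so an aggregate over one field
# still has to walk every whole row.
rows = [
    {"id": 1, "region": "EU", "amount": 10.0},
    {"id": 2, "region": "US", "amount": 25.0},
    {"id": 3, "region": "EU", "amount": 5.0},
]
row_total = sum(r["amount"] for r in rows)

# Columnar layout: one contiguous array per field. The same aggregate reads
# only the "amount" column -- the locality that makes column stores suit
# Business Intelligence workloads.
columns = {
    "id": [1, 2, 3],
    "region": ["EU", "US", "EU"],
    "amount": [10.0, 25.0, 5.0],
}
col_total = sum(columns["amount"])

assert row_total == col_total
print(col_total)
```

Both layouts hold identical data; the difference is purely which bytes an aggregation must touch, which is why the choice of store belongs in the modelling and mapping layer rather than with the query author.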
REFERENCES

[1] The Economist, Nov 2011, "Drowning in numbers – Digital data will flood the planet and help us understand it better", [Link]
[2] Lohr S., Feb 11, 2012, "The Age of Big Data", New York Times, [Link]
[3] Lynch M., Nov 13, 2012, "Barack Obama's Big Data won the US election", Computerworld, [Link]
[4] Fanning K. and Grant R., July/August 2013, "Big Data: Implications for Financial Managers", Wiley Journal of Corporate Accounting & Finance, 24(5):23-30.
[5] The Economist, 19 May 2012, "Big data – Crunching the numbers", [Link]
[6] Taylor P., 26 June 2013, "Big data in the spotlight as never before", Financial Times.
[7] French R. M., December 2012, "Moving beyond the Turing test", Communications of the ACM, 55(12):74-77.
[8] Tekiner F., Tsuruoka Y., Tsujii J., Ananiadou S., Keane J., "Parallel Text Mining for Large Text Processing", pp. 348-353, in Proceedings of IEEE CSNDSP 2010, 21-23 July, Newcastle, UK.
[9] Bonnet L., Laurent A., Sala M., Laurent B., Sicard N., September 2011, "Reduce, You Say: What NoSQL Can Do for Data Aggregation and BI in Large Repositories", pp. 483-488, 22nd International Workshop on Database and Expert Systems Applications (DEXA), 2011.
[10] Merrill R. D., June 2011, "Storage Economics – Four Principles for Reducing Total Cost of Ownership", Hitachi Data Systems Corporation White Paper.
[11] McGuire T., Manyika J., Chui M., July/August 2012, "Why Big Data is the New Competitive Advantage", Ivey Business Journal, [Link]
[12] Beyer M., June 27, 2011, "Gartner Says Solving 'Big Data' Challenge Involves More Than Just Managing Volumes of Data", Gartner, [Link]
[13] Johnson E. J., July/August 2012, "Big Data + Big Analytics = Big Opportunity", Journal of Financial Executive, pp. 1-4.
[14] Nichols W., March 2013, "Advertising Analytics 2.0", Harvard Business Review, 91(3):60-68.
[15] Stonebraker M. and Hong J., February 2012, "Researchers' Big Data Crisis; Understanding Design and Functionality", Communications of the ACM, 55(2):10-11.
[16] Davenport T. H. and Patil D. J., October 2012, "Data Scientist: The Sexiest Job of the 21st Century", Harvard Business Review, 90(10):70-76.
[17] White T., May 2012, "Hadoop: The Definitive Guide", Third Edition, O'Reilly, ISBN: 978-1-449-31152-0.
[18] Saecker M. and Markl V., 2013, "Big Data Analytics on Modern Hardware Architectures: A Technology Survey", Springer Lecture Notes in Business Information Processing, Volume 138, pp. 125-149.
[19] McCreadie R., Macdonald C., Ounis I., 2012, "MapReduce indexing strategies: Studying scalability and efficiency", Journal of Information Processing and Management, 48(5):873-888, ISSN 0306-4573.
[20] Manyika J., Chui M., Brown B., Bughin J., Dobbs R., Roxburgh C., Byers A. H., May 2011, "Big data: The next frontier for innovation, competition, and productivity", McKinsey Global Institute, [Link]
[21] Allen B., Bresnahan J., Childers L., Foster I., Kandaswamy G., Kettimuthu R., Kordas J., Link M., Martin S., Pickett K., Tuecke S., February 2012, "Software as a Service for Data Scientists", Communications of the ACM, 55(2):81-88.
[22] Smith S., Mar 04, 2013, "Is Data the New Media?", EContent Magazine, March 2013 Issue:14-19.
[23] Sessions R., May 2007, "A Comparison of the Top Four Enterprise Architecture Methodologies", Microsoft Developer Network Architecture Center, [Link]
[24] Malik P., May-July 2013, "Governing Big Data: Principles and practices", IBM Journal of Research and Development, 57(3/4):1-13.
[25] Miller G. H. and Mork P., Jan-Feb 2013, "From Data to Decisions: A Value Chain for Big Data", IT Professional, 15(1):57-59.
[26] Barton D. and Court D., October 2012, "Making Advanced Analytics Work for You", Harvard Business Review, 90(10):78-83.
[27] Goyal M., Hancock M. Q., Hatami H., July-August 2012, "Selling into Micromarkets", Harvard Business Review, 90(7/8):78-86.
[28] Rong C., Lu W., Wang X., Du X., Chen Y., Tung A. K. H., 02 Oct. 2012, "Efficient and Scalable Processing of String Similarity Join", pre-print, IEEE Transactions on Knowledge and Data Engineering, [Link]
[29] Chandrasekar S., Dakshinamurthy R., Seshakumar P. G., Prabavathy B., Babu C., 4-6 Jan. 2013, "A novel indexing scheme for efficient handling of small files in Hadoop Distributed File System", in Computer Communication and Informatics (ICCCI), 2013, pp. 1-8, ISBN: 978-1-4673-2906-4.
[30] Gudmundsson G. P., Amsaleg L., Jonsson B. P., 27-29 June 2012, "Distributed High-Dimensional Index Creation using Hadoop, HDFS and C++", Content-Based Multimedia Indexing (CBMI), pp. 83-88, E-ISBN: 978-1-4673-2369-7.
[31] Saran C., 12-18 February 2013, "Storage struggles to keep up with data growth explosion", Computer Weekly, pp. 17-19.
[32] Craig T. and Ludloff M. E., 2011, "Privacy and Big Data", O'Reilly, ISBN: 978-1-449-30500-0.
[33] McCallum J. S., March/April 2001, "Managing in the new economy: evolution or revolution?", Ivey Business Journal, [Link]
[34] McAfee A. and Brynjolfsson E., October 2012, "Big Data: The Management Revolution", Harvard Business Review, 90(10):60-69.
[35] Rosenbush S. and Totty M., 8 March 2013, "How Big Data Is Changing the Whole Equation for Business", The Wall Street Journal, [Link]
[36] Chen H., Chiang R. H. L., Storey V. C., December 2012, "Business Intelligence and Analytics: From Big Data to Big Impact", MIS Quarterly, 36(4):1165-1188.
[37] Courtney M., December 2012, "Big Data analytics: putting the puzzle together", Engineering and Technology Magazine, 7(12):56-60.