2022 Dec Bda 53151

Paper / Subject Code: 53151 / Big Data Analysis

[Time: 3 Hours] [Marks: 80]

NB: 1) Question 1 is compulsory.
2) Attempt any three questions from the remaining questions.
3) Assume suitable data wherever applicable.

1 (a) Explain the characteristics of big data. Analyze the tourism data and identify the characteristics which are difficult to store in an RDBMS and the need for big data techniques for storing and analyzing them. 5
(b) Explain any five components of the Hadoop ecosystem. 5
(c) Give the problems in the Flajolet-Martin (FM) algorithm for counting distinct elements in a stream. 5
(d) Explain the nearest neighbor problem. What similarity measure can be used in an application to find plagiarism in documents? 5

2 (a) Discuss Matrix-Matrix Multiplication. Perform Matrix Multiplication with 1-step Map Reduce method. 10

1 2   3 4
5 6 * 1 2
1 0
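The one-step Map Reduce approach named in question 2 (a) can be sketched as follows. This is a minimal single-process illustration, not a Hadoop job. It assumes the flattened operands are a 3 x 2 matrix [[1, 2], [5, 6], [1, 0]] multiplied by a 2 x 2 matrix [[3, 4], [1, 2]], which is the only conformable reading of the rows shown but is not confirmed by the paper. The names map_phase, shuffle, reduce_phase and multiply are illustrative.

```python
from collections import defaultdict

# Assumed reading of the operands in question 2 (a); not confirmed by the paper.
A = [[1, 2], [5, 6], [1, 0]]   # 3 x 2
B = [[3, 4], [1, 2]]           # 2 x 2


def map_phase(A, B):
    """Emit ((i, k), (tag, j, value)) for every output cell an element feeds."""
    rows_a, cols_b = len(A), len(B[0])
    for i, row in enumerate(A):
        for j, a in enumerate(row):
            for k in range(cols_b):        # A[i][j] contributes to C[i][k] for all k
                yield (i, k), ("A", j, a)
    for j, row in enumerate(B):
        for k, b in enumerate(row):
            for i in range(rows_a):        # B[j][k] contributes to C[i][k] for all i
                yield (i, k), ("B", j, b)


def shuffle(pairs):
    """Group mapper output by key, as the framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Pair A and B values on the shared index j and sum the products."""
    a_vals = {j: v for tag, j, v in values if tag == "A"}
    b_vals = {j: v for tag, j, v in values if tag == "B"}
    return key, sum(a_vals[j] * b_vals[j] for j in a_vals if j in b_vals)


def multiply(A, B):
    groups = shuffle(map_phase(A, B))
    C = [[0] * len(B[0]) for _ in range(len(A))]
    for key, values in groups.items():
        (i, k), total = reduce_phase(key, values)
        C[i][k] = total
    return C


if __name__ == "__main__":
    print(multiply(A, B))
```

Each reducer key (i, k) receives every element of row i of the first matrix and column k of the second, so a single Map Reduce pass is enough, at the cost of replicating each element once per output cell it feeds.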

(b) Explain, with an example, collaborative filtering in a recommendation system. 10

3 (a) Explain the concept of Parallel Decision Trees with the help of an example. 10
(b) Recall all NoSQL design patterns with examples. Justify the CAP theorem with a suitable example. 10

4 (a) For the graph given below, use clique percolation and find all communities. 10
[Figure: graph with node labels including B, C, E, F, G, H, I]
(b) Employ the DGIM algorithm. Shown below is a data stream with N=14 and the current bucket configuration. New elements enter the window at the right. Thus, the oldest bit of the window is the left-most bit shown. 10
10011010101010111
i) Show one way of how the above initial stream will be divided into buckets and count distinct 1’s.


ii) The following bits enter the window one at a time: 10101. What is the bucket configuration in the window after this sequence of bits has been processed by DGIM?

5 (a) Compute the page rank of each page in the following figure, assuming β = 0.8. 10
(b) Explain how Hadoop goals are covered in the Hadoop Distributed File System. 10

6 (a) Explain the working of all phases of Map Reduce with one common example. 10
(b) Explain the Park-Chen-Yu algorithm. How is memory mapping done in PCY? 10
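For question 4 (b), the DGIM bucket bookkeeping can be sketched as follows. This is a minimal sketch under stated assumptions: the window size is taken as the N=14 given in the question, and the bits are fed into an initially empty window, because the bucket configuration the question refers to is not reproduced in this copy. The class name DGIM and the helper feed are illustrative.

```python
class DGIM:
    """Approximate count of 1s in the last `window` bits of a stream."""

    def __init__(self, window):
        self.window = window
        self.t = 0            # timestamp of the most recent bit
        self.buckets = []     # newest first: (timestamp of latest 1, size)

    def add(self, bit):
        self.t += 1
        # Drop the oldest bucket once its most recent 1 has left the window.
        if self.buckets and self.buckets[-1][0] <= self.t - self.window:
            self.buckets.pop()
        if bit != 1:
            return
        self.buckets.insert(0, (self.t, 1))
        # Whenever three buckets share a size, merge the two oldest of them.
        i = 0
        while i + 2 < len(self.buckets) and (
            self.buckets[i][1] == self.buckets[i + 1][1] == self.buckets[i + 2][1]
        ):
            newer_ts, size = self.buckets[i + 1]
            self.buckets[i + 1 : i + 3] = [(newer_ts, 2 * size)]
            i += 1

    def estimate(self):
        """Sum of all bucket sizes except the oldest, plus half of the oldest."""
        if not self.buckets:
            return 0
        sizes = [size for _, size in self.buckets]
        return sum(sizes[:-1]) + sizes[-1] / 2


def feed(dgim, bits):
    """Push a string of '0'/'1' characters into the window, oldest bit first."""
    for ch in bits:
        dgim.add(int(ch))
    return dgim.buckets


if __name__ == "__main__":
    d = DGIM(window=14)                   # N = 14 as stated in the question
    print(feed(d, "10011010101010111"))   # part i): one valid bucket division
    print(feed(d, "10101"))               # part ii): buckets after the new bits
    print(d.estimate())
```

Bucket divisions are not unique, so the configuration this sketch produces is only one valid answer to part i); starting from a different initial configuration would give a different but equally valid result for part ii).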