MS4S21 - Big Data Engineering and Its Applications Assessment 2

The document outlines an assessment for students in the MS4S21 Big Data Engineering course, where they act as consultants for a video streaming start-up, BoxStream. Students are tasked with migrating services to the cloud using AWS, creating step-by-step guides for various tasks including setting up S3 buckets, DynamoDB tables, and custom Amazon Machine Images. The assessment also includes querying data and developing networking solutions, with specific documentation requirements for each task.

PUBLIC / CYHOEDDUS

MS4S21 – Big Data Engineering and its Applications

Assessment 2
Assessment Outline
In this assessment, you (the student) play the role of a consultant for a start-up
company, BoxStream. BoxStream is an online video streaming service whose users are made up
of both creators (users who create content for the platform) and followers (those who watch the
content made by creators).

The company would like to relocate many of their on-site services to the cloud and have hired
you to undertake this task.

As part of your role, you will be required to:

1. Undertake the technical challenges that the company requires you to complete,
including storage, network and compute resources, and validate the services as the
company outlines.
2. Provide documentation, in the form of step-by-step guides, that the company can use
as instruction manuals to share with their internal employees, who can then replicate
the work you have undertaken. These guides should also include relevant screenshots
that demonstrate the work has been completed. You should assume that the employees
have no prior knowledge of cloud computing but have access to the AWS management
console through their employer.

Consider the Lab Tutorials completed in class as good examples of step-by-step guides,
but yours should include relevant screenshots where suitable and have a professional look
that would suit a corporate environment. Finally, these are NOT traditional reports: they
should contain the steps needed to replicate the tasks, with outputs included and some
concise discussion where appropriate.

Your Submission
Your submission will comprise the following (Note: These will become clear as you read
the assessment brief):

1. {student number} S3 Bucket step-by-step guide (PDF)
2. {student number} DynamoDB step-by-step guide (PDF)
3. {student number} Query Validation (PDF)
4. {student number} Amazon Machine Image step-by-step guide (PDF)
5. {student number} Networking and Web Application step-by-step guide (PDF)

Each must be a separate document as specified above and uploaded as a PDF.

Where possible, do not zip the files.


Contact Information
Most importantly, if the wording of the assessment is unclear, please don’t hesitate to get in
touch via email: [email protected] . Use the subject title MS4S21 – Assignment
Query. This will enable me to find your assessment queries faster within my inbox.

You are responsible for ensuring that you do not exceed the limit of your AWS Budget.

Question 1 – Databases and Storage (30 Marks)


BoxStream, the online video streaming service, has contacted you regarding some consultancy
work they would like you to undertake as part of your contract. The company is interested in
Amazon Web Services. More specifically, they are interested in storing their data in an S3
bucket and using DynamoDB to query it, but they are unsure whether it will be able to meet
their company demands.

The company has provided you with a pseudo-sample of their user data in a file called
“user_data.csv”, which contains information on creators and followers such as view time, stream
time, and total income from the platform. Each username is unique only within the region in
which the account was created.
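
Because usernames are unique only within a region, a username on its own cannot identify an account. One way to explore what this means for a table key is to check the data locally first. The following is a minimal sketch, not part of the brief: the column names `region` and `username` and the sample rows are hypothetical, and the real header of “user_data.csv” may differ.

```python
import csv
import io
from collections import Counter

# Hypothetical column names; the real header of user_data.csv may differ.
REGION_COLUMN = "region"
USERNAME_COLUMN = "username"


def find_duplicate_keys(csv_text, key_columns):
    """Return the key tuples that occur more than once in the CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter(tuple(row[col] for col in key_columns) for row in reader)
    return sorted(key for key, count in counts.items() if count > 1)


# Made-up sample rows, for illustration only.
SAMPLE = """region,username,account_type
EU,alice,creator
US,alice,follower
EU,bob,follower
"""

if __name__ == "__main__":
    # Duplicates when the username is used as the only key.
    print(find_duplicate_keys(SAMPLE, [USERNAME_COLUMN]))
    # Duplicates when region and username are combined into one key.
    print(find_duplicate_keys(SAMPLE, [REGION_COLUMN, USERNAME_COLUMN]))
```

If the combined key has no duplicates in the real file, a composite primary key (for example, region as the partition key and username as the sort key) is one option worth considering for the DynamoDB table.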

The company would like you to undertake a number of tasks using this data to determine AWS’s
suitability.

“user_data.csv” is available on Blackboard under the “Assessment Information” tab.

The tasks are as follows:

Task A (10 Marks for Technical Work, 5 Marks for Quality of Step-by-Step
Guides)
1. The company would like to store their CSV data on the cloud, using an S3 bucket called
{student number}-box-stream-bucket. Provide a step-by-step guide on how you set up the
bucket, including any settings you selected or changed to improve the bucket.
2. Using the data stored in the S3 bucket you created, the company would like you to
generate a DynamoDB table called {student number}-box-stream-db. Provide a step-by-step
guide on how you set up the DynamoDB table, including any settings you selected or
changed.

You should create two separate documents (PDF), one for each step-by-step guide. The first
is called “{student number} S3 Bucket step-by-step guide” and the second “{student number}
DynamoDB step-by-step guide”. Include relevant screenshots and instructional steps.
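
The resource names in Task A follow a fixed pattern, and S3 rejects bucket names that break its naming rules. The sketch below builds both names from a student number and checks the bucket name against a subset of the S3 general purpose bucket rules. The student number `12345678` is a made-up placeholder, and the check is intentionally incomplete (for example, it does not reject names formatted like IP addresses).

```python
import re

# Subset of the S3 general purpose bucket naming rules: 3-63 characters,
# lowercase letters, digits, dots and hyphens, starting and ending with a
# letter or digit. Adjacent dots are rejected separately below.
_BUCKET_NAME = re.compile(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")


def bucket_name(student_number):
    """Bucket name in the pattern given in Task A."""
    return f"{student_number}-box-stream-bucket"


def table_name(student_number):
    """DynamoDB table name in the pattern given in Task A."""
    return f"{student_number}-box-stream-db"


def is_plausible_bucket_name(name):
    """Partial check only; passing it does not guarantee S3 accepts the name."""
    return _BUCKET_NAME.fullmatch(name) is not None and ".." not in name


if __name__ == "__main__":
    name = bucket_name("12345678")  # placeholder student number
    print(name, is_plausible_bucket_name(name))
    print(table_name("12345678"))
```

Bucket names are also globally unique across AWS, so a name that passes this check can still be refused if another account already holds it.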

Task B (10 Marks for Queries, 5 Marks for Query Document & Uploading of
Results to S3 Bucket)
Using the Scan/Query functionality of your DynamoDB table, the company would like to
identify whether the following queries can be undertaken. The results for each query should be
downloaded as a CSV file and uploaded to your S3 bucket, with the naming convention
{student number}query{question number}.csv

1. Return all data for accounts that are older than 5 years.
2. Return only the account names for accounts that have streamed for 1100 hours
or more.
3. Return all data for accounts which are the follower type and have watched more than
980 hours on the platform.
4. Return the account name for the most successful creator on the platform, in terms of
total income.
5. Return the total income data for creator accounts that have streamed for less than 278
hours and have streamed for less than two years.
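
The logic of the five queries can be cross-checked locally against the CSV before comparing with the DynamoDB output. The sketch below is illustrative only: every column name and sample row is hypothetical, and query 5 assumes that "streamed for less than two years" refers to an account age under two years, which the brief does not confirm.

```python
# Hypothetical column names and made-up rows; replace with the real data.
ROWS = [
    {"username": "ana", "account_type": "creator", "account_age_years": 6.0,
     "stream_hours": 1200.0, "watch_hours": 10.0, "total_income": 900.0},
    {"username": "ben", "account_type": "follower", "account_age_years": 1.5,
     "stream_hours": 0.0, "watch_hours": 990.0, "total_income": 0.0},
    {"username": "cy", "account_type": "creator", "account_age_years": 1.0,
     "stream_hours": 100.0, "watch_hours": 5.0, "total_income": 50.0},
]


def result_filename(student_number, question_number):
    """File name in the pattern {student number}query{question number}.csv."""
    return f"{student_number}query{question_number}.csv"


def query1(rows):
    """All data for accounts older than 5 years."""
    return [r for r in rows if r["account_age_years"] > 5]


def query2(rows):
    """Account names for accounts with 1100 or more streamed hours."""
    return [r["username"] for r in rows if r["stream_hours"] >= 1100]


def query3(rows):
    """All data for follower accounts with more than 980 watched hours."""
    return [r for r in rows
            if r["account_type"] == "follower" and r["watch_hours"] > 980]


def query4(rows):
    """Account name of the creator with the highest total income."""
    creators = [r for r in rows if r["account_type"] == "creator"]
    best = max(creators, key=lambda r: r["total_income"], default=None)
    return best["username"] if best else None


def query5(rows):
    """Total income for creators under 278 streamed hours and (assumed)
    an account age under two years."""
    return [r["total_income"] for r in rows
            if r["account_type"] == "creator"
            and r["stream_hours"] < 278
            and r["account_age_years"] < 2]
```

DynamoDB filter expressions select items but do not aggregate, so query 4 will likely need the returned items to be sorted or compared after the scan rather than inside it.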

To verify that these were done successfully, the company would also like you to provide a
document (PDF) called {student number} Query Validation, which should include only
screenshots of each query, including the logic to create the query and the resulting output of
the query. Ensure that your database table name is visible within the screenshots.

Question 2 – Compute & Networking (30 Marks)


BoxStream was satisfied with the work you produced regarding their data warehousing
infrastructure. As such, they have contracted you for additional work. The company would like to
shift their attention to their compute and networking solutions.

Task A (10 Marks for Technical Work, 5 Marks for Quality of Step-by-Step
Guide)
The company would like to avoid purchasing dedicated servers to scale up their streaming
platform service. Instead, they would like to trial using Amazon to create custom virtual
machines that can meet their demands. They’ve heard of Amazon Machine Images (AMIs) and
how you can set up custom machine images by using an existing virtual machine.

The company would like you to complete the following:

1. Using a standard Amazon Linux EC2 instance as your base template, with the t3.nano
instance type and 25 GiB of gp3 storage, create a custom Amazon Machine Image (AMI) that
comes with the Ruby programming language pre-installed, by connecting to your template
instance, manually installing the software and creating the AMI.

(Note: Your template instance should have the name {student number}-ec2-template-instance.
Your AMI should have the name {student number}-ruby-image.)

2. The company would like you to validate the custom Amazon Machine Image (AMI) by
creating an EC2 instance using the AMI and verifying that the software is installed by running
the following commands in the terminal (be sure to include screenshots of this within
your step-by-step guide):
a. >>nano {student number}_test.rb (Create a Ruby file)
b. >>puts "{student number}. Ruby has been installed successfully!" (Type this line into the file)
c. >>"Ctrl + S" to save. "Ctrl + X" to exit. (Save and exit the Ruby file)
d. >>ruby {student number}_test.rb (Run the file)
The company would also like you to ensure that the instance cannot be accidentally
terminated.

(Note: The instance you create to test the AMI should be called
{student number}-ec2-ruby-instance.)

As part of the above task, the company requires that you create a step-by-step guide called
“{student number} Amazon Machine Image step-by-step guide”. Your document should
include the step-by-step instructions required to create and test the Amazon Machine Image, as
well as any relevant screenshots, including the validation of the created EC2 instance.

Task B – Networking (10 Marks for Technical Work, 5 Marks for Quality of
Step-by-Step Guide)
Separate from the above work, the video streaming service BoxStream would like to explore
Amazon’s networking capabilities. They realise the importance of ensuring that their data and
services sit within a suitable network infrastructure. Thus, the client would like you to complete
the following:

1. Develop a Virtual Private Cloud (VPC) called “{student number}-box-stream-vpc”. The
company would like it to have a public subnet and a private subnet.
2. The company would also like you to demonstrate the ability to host a web application
that prints “Hey {student number}! Thanks for dropping by BoxStream, our game of the
day is Valorant. Check it out!” The company would like this web application to be hosted
on a suitable EC2 instance within the newly designed Virtual Private Cloud (VPC), and
for the application demo to be publicly accessible by anyone.
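
Before creating the VPC, it can help to plan address ranges so that the public and private subnets sit inside the VPC range and do not overlap. The sketch below uses Python's standard `ipaddress` module. The `10.0.0.0/16` range and the `/24` subnet size are illustrative assumptions, not values given in the brief.

```python
import ipaddress


def plan_subnets(vpc_cidr, new_prefix=24):
    """Take the first two equal-sized subnets of the VPC range."""
    vpc = ipaddress.ip_network(vpc_cidr)
    subnets = vpc.subnets(new_prefix=new_prefix)
    public = next(subnets)
    private = next(subnets)
    return {"vpc": vpc, "public": public, "private": private}


if __name__ == "__main__":
    plan = plan_subnets("10.0.0.0/16")  # assumed VPC range
    for name, network in plan.items():
        print(name, network)
```

The address plan alone does not make a subnet public. In AWS, a subnet is public when its route table sends internet-bound traffic to an internet gateway, and the web server instance also needs a public IP address and a security group that allows inbound HTTP.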

The following code may be helpful (ensure to include a screenshot of this code, and the
resulting output, in your step-by-step guide):
a. sudo su -
b. yum update -y
c. yum install -y httpd
d. echo "[INSERT REQUIRED MESSAGE]" > /var/www/html/index.html
e. systemctl enable httpd
f. systemctl start httpd
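
The commands above can also be supplied as an EC2 user data script, so the web server is configured when the instance first boots. This is an optional alternative, not something the brief requires. The sketch below builds such a script as a string. It assumes the instance is launched with user data, which runs as root and therefore omits the `sudo su -` step, and it uses `shlex.quote` so that punctuation in the message is passed to `echo` unchanged. The function name and the sample message are hypothetical.

```python
import shlex


def build_user_data(message):
    """Return a shell script that installs httpd and publishes the message."""
    lines = [
        "#!/bin/bash",
        "yum update -y",
        "yum install -y httpd",
        # shlex.quote wraps the message in single quotes for the shell.
        f"echo {shlex.quote(message)} > /var/www/html/index.html",
        "systemctl enable httpd",
        "systemctl start httpd",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder text; substitute the message required by the brief.
    print(build_user_data("[INSERT REQUIRED MESSAGE]"))
```

If the commands are typed interactively instead, the guide should still show a screenshot of them and of the resulting page in a browser, as the brief requests.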

As part of the above task, the company requires that you create a step-by-step guide called
“{student number} Networking and Web Application step-by-step guide”. Your document
should include the step-by-step instructions required to create the VPC, Web Application and
suitable screenshots of the code and resulting output.

END OF ASSESSMENT
