Cloud Monitoring Using Terraform
CloudWatch, SNS, Prometheus, Node Exporter, Grafana
Mohammed Amir
Project Overview
An automated cloud observability environment was created using Terraform: a secure
AWS infrastructure was provisioned with a VPC, public subnet, route table, internet
gateway, security group, and an EC2 instance. SSH access was restricted to a specific
CIDR range for enhanced security.
The project showcases two parallel monitoring approaches:
• AWS CloudWatch Monitoring & Alerting – A CloudWatch Dashboard was configured
to track EC2 performance metrics such as CPU utilization. Additionally,
a CloudWatch Alarm was integrated with Amazon SNS to send email notifications
whenever CPU utilization reached or exceeded 80%, ensuring proactive
incident response.
• Prometheus & Grafana Monitoring Stack – On the EC2 instance, Prometheus,
Node Exporter, and Grafana were automatically installed via user data. Prometheus
collected system metrics, Node Exporter exposed host-level metrics, and Grafana
provided interactive dashboards for real-time visualization and analysis.
This dual setup highlights both AWS-native monitoring and open-source observability
tools, delivering a comprehensive and resilient monitoring solution.
Key Skills
• AWS (VPC, EC2, Security Groups, CloudWatch, SNS): Designed and deployed
secure cloud infrastructure with monitoring dashboards and automated alerting.
• Terraform (IaC): Automated provisioning of infrastructure components such as
networking, EC2 instances, and monitoring services using Infrastructure-as-Code.
• Prometheus & Node Exporter: Implemented open-source monitoring stack to
collect and expose system-level metrics.
• Grafana: Built interactive dashboards for real-time visualization of system
performance and resource usage.
• Monitoring & Alerting: Integrated CloudWatch alarms with SNS for proactive
incident response.
1 Install Terraform
Terraform was installed on WSL (Linux) to enable infrastructure provisioning. For
installation steps, refer to the official documentation page.
Figure 1: Terraform Installation on WSL
To check that Terraform is installed and to verify its version:
terraform --version
Terraform was then initialized, the execution plan was reviewed, and the configuration was applied:
- terraform init: Initializes the Terraform working directory and downloads
required provider plugins.
- terraform plan: Generates an execution plan showing what actions Terraform
will perform.
- terraform apply: Applies the changes required to reach the desired state of the
configuration.
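For terraform init to resolve the AWS provider, the working directory also needs a provider requirement. The exact provider block used in the project is not shown in this report; the following is a minimal sketch, assuming the AWS provider and the Mumbai region (ap-south-1) referenced by the CloudWatch dashboard in Section 4:

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0" # assumed version constraint
    }
  }
}

provider "aws" {
  region = "ap-south-1" # Mumbai, matching the dashboard region in Section 4
}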
2 Infrastructure Setup
In this step, the fundamental AWS networking components were provisioned using
Terraform. The goal was to create a secure and scalable network environment where
the monitoring stack could be deployed.
2.1 VPC Creation
A Virtual Private Cloud (VPC) named Monitoring_VPC was created with the CIDR
range 10.0.0.0/16. This provides an isolated virtual network to host all monitoring
resources securely.
2.2 Public Subnet
Inside the VPC, a public subnet with CIDR range 10.0.1.0/24 was defined. The
subnet is configured to automatically assign public IPs to EC2 instances launched within it.
2.3 Internet Connectivity
An Internet Gateway (IGW) was attached to the VPC, enabling external connectivity.
A Route Table was created with a default route (0.0.0.0/0) pointing to the IGW. The
route table was then associated with the public subnet, ensuring resources inside the
subnet could access the internet.
2.4 Security Group (Firewall Rules)
A Security Group (SG) named Monitoring_SG was created to control inbound and
outbound traffic:
• ICMP (Ping): Allowed from anywhere for basic connectivity tests.
• SSH (22): Restricted to the user’s own IP (var.My_Ip) for secure remote access.
• HTTP (80): Opened to the world to allow web access.
• Prometheus (9090) & Grafana (3000): Allowed from anywhere to enable external
monitoring and visualization access.
• Egress: All outbound traffic allowed for updates and service communication.
2.5 Security Considerations
By restricting SSH access only to a specific CIDR (var.My_Ip), unauthorized access
is prevented. Specific monitoring ports (9090 & 3000) were intentionally opened to
support Prometheus and Grafana dashboards. Using Terraform ensures that all rules
are reproducible, version-controlled, and auditable.
# ------------------ VPC Setup ------------------
resource "aws_vpc" "My_Vpc" {
  cidr_block = "10.0.0.0/16" # Defines VPC CIDR range
  tags = {
    Name = "Monitoring_VPC"
  }
}

# Public Subnet inside VPC
resource "aws_subnet" "Public_Subnet" {
  vpc_id                  = aws_vpc.My_Vpc.id # Links subnet to VPC
  cidr_block              = "10.0.1.0/24"
  availability_zone       = var.aws_az
  map_public_ip_on_launch = true # Enables auto public IP assignment
  tags = {
    Name = "Monitoring-Public-subnet"
  }
}

# Internet Gateway for Internet Access
resource "aws_internet_gateway" "My_IGW" {
  vpc_id = aws_vpc.My_Vpc.id
  tags = {
    Name = "Monitoring-IGW"
  }
}

# Route Table for Internet Access
resource "aws_route_table" "Public_RT" {
  vpc_id = aws_vpc.My_Vpc.id
  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.My_IGW.id
  }
  tags = {
    Name = "Monitoring-Public-RT"
  }
}

# Associate Route Table to Public Subnet
resource "aws_route_table_association" "PublicRT_Association" {
  subnet_id      = aws_subnet.Public_Subnet.id
  route_table_id = aws_route_table.Public_RT.id
}

# ------------------ Security Group ------------------
resource "aws_security_group" "Monitoring_Sg" {
  name        = "Monitoring"
  description = "Allow ICMP, SSH, HTTP, custom ports"
  vpc_id      = aws_vpc.My_Vpc.id

  ingress {
    from_port   = -1
    to_port     = -1
    protocol    = "icmp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  ingress {
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = [var.My_Ip] # Restricts SSH to your IP
  }

  ingress {
    from_port   = 9090
    to_port     = 9090
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  ingress {
    from_port   = 3000
    to_port     = 3000
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = {
    Name = "Monitoring_SG"
  }
}
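The configuration above references var.My_Ip and var.aws_az (and the instance in Section 3 references var.Key_Name), whose declarations are not reproduced in this report. A minimal sketch, assuming only the names used above; the descriptions are illustrative:

# variables.tf (sketch; exact project definitions may differ)
variable "My_Ip" {
  description = "CIDR block allowed to SSH into the instance, e.g. 203.0.113.5/32"
  type        = string
}

variable "aws_az" {
  description = "Availability zone for the subnet and the EC2 instance"
  type        = string
}

variable "Key_Name" {
  description = "Name of an existing EC2 key pair used for SSH access"
  type        = string
}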
3 Instance Setup
In this step, an EC2 instance was provisioned inside the previously created VPC and
Public Subnet. The instance was configured to automatically install and start the
monitoring stack using a user data script. This ensured a fully automated setup without
requiring manual intervention.
3.1 System Preparation
The EC2 instance was updated and upgraded to include the latest security patches
before installing monitoring tools.
3.2 Prometheus Installation
• An open-source monitoring and alerting toolkit that collects and stores
time-series metrics from applications and systems.
• Prometheus binaries and configuration files were installed.
• A configuration was created to scrape metrics from Prometheus itself (localhost:9090)
and from Node Exporter (localhost:9100).
3.3 Node Exporter Installation
• A lightweight exporter for Prometheus that exposes system-level metrics from
Linux servers.
• Node Exporter was installed to expose system-level metrics such as CPU,
memory, and disk usage.
• A dedicated node_exporter user was created for enhanced security.
• A systemd service was configured to ensure Node Exporter runs automatically
on system startup.
3.4 Grafana Installation
• A visualization and analytics platform that integrates with Prometheus (and
other data sources) to create interactive dashboards and alerts.
• Grafana was installed from the official repository.
• Grafana server was enabled and started as a systemd service, ensuring
persistence across reboots.
3.5 Security Considerations
• The EC2 instance was launched inside the Monitoring_VPC and attached to the
Monitoring_SG security group.
• SSH access (port 22) is restricted to the user’s own IP range (var.My_Ip).
• Prometheus (9090) and Grafana (3000) ports were intentionally exposed for
monitoring dashboards.
• Services such as Node Exporter run under dedicated system users to minimize
privilege risks.
# ------------------ EC2 Instance ------------------
resource "aws_instance" "Aws_Vm" {
  ami                    = "ami-0861f4e788f5069dd" # Amazon Linux 2 AMI
  instance_type          = "t2.micro"              # Free tier instance
  key_name               = var.Key_Name
  subnet_id              = aws_subnet.Public_Subnet.id
  vpc_security_group_ids = [aws_security_group.Monitoring_Sg.id]
  availability_zone      = var.aws_az

  tags = {
    Name = "Ec2-Monitoring"
  }

  user_data = <<-EOF
    #!/bin/bash
    set -e # Exit on any error

    # -------------------- Update System --------------------
    sudo yum update -y
    sudo yum upgrade -y

    # -------------------- Install Prometheus --------------------
    cd /opt

    # Download and extract Prometheus
    wget https://github.com/prometheus/prometheus/releases/download/v2.1.0/prometheus-2.1.0.linux-amd64.tar.gz
    tar -xf prometheus-2.1.0.linux-amd64.tar.gz

    # Move Prometheus binaries to /usr/local/bin
    sudo mv prometheus-2.1.0.linux-amd64/prometheus /usr/local/bin/
    sudo mv prometheus-2.1.0.linux-amd64/promtool /usr/local/bin/

    # Create Prometheus config and data directories
    sudo mkdir -p /etc/prometheus /var/lib/prometheus

    # Move consoles and libraries
    sudo mv prometheus-2.1.0.linux-amd64/consoles /etc/prometheus/
    sudo mv prometheus-2.1.0.linux-amd64/console_libraries /etc/prometheus/

    # Cleanup
    rm -rf prometheus-2.1.0.linux-amd64*

    # Create Prometheus configuration
    cat << EOT | sudo tee /etc/prometheus/prometheus.yml
    global:
      scrape_interval: 10s
    scrape_configs:
      - job_name: 'prometheus_metrics'
        scrape_interval: 5s
        static_configs:
          - targets: ['localhost:9090']
      - job_name: 'node_exporter_metrics'
        scrape_interval: 5s
        static_configs:
          - targets: ['localhost:9100']
    EOT

    # Systemd service so Prometheus runs on boot and serves :9090
    # (assumed step, mirroring the Node Exporter service below)
    sudo tee /etc/systemd/system/prometheus.service > /dev/null << EOC
    [Unit]
    Description=Prometheus
    After=network.target

    [Service]
    ExecStart=/usr/local/bin/prometheus --config.file=/etc/prometheus/prometheus.yml --storage.tsdb.path=/var/lib/prometheus

    [Install]
    WantedBy=multi-user.target
    EOC

    sudo systemctl daemon-reload
    sudo systemctl enable prometheus
    sudo systemctl start prometheus

    # -------------------- Install Node Exporter --------------------
    cd /opt
    sudo curl -LO https://github.com/prometheus/node_exporter/releases/download/v1.7.0/node_exporter-1.7.0.linux-amd64.tar.gz
    sudo tar -xvzf node_exporter-1.7.0.linux-amd64.tar.gz
    sudo mv node_exporter-1.7.0.linux-amd64/node_exporter /usr/local/bin/

    # Create a dedicated user for Node Exporter
    sudo useradd --no-create-home --shell /bin/false node_exporter

    # Create systemd service for Node Exporter
    sudo tee /etc/systemd/system/node_exporter.service > /dev/null << EOC
    [Unit]
    Description=Node Exporter
    After=network.target

    [Service]
    User=node_exporter
    ExecStart=/usr/local/bin/node_exporter

    [Install]
    WantedBy=multi-user.target
    EOC

    # Enable and start Node Exporter
    sudo systemctl daemon-reload
    sudo systemctl enable node_exporter
    sudo systemctl start node_exporter

    # -------------------- Install Grafana --------------------
    sudo tee /etc/yum.repos.d/grafana.repo > /dev/null << EOC
    [grafana]
    name=grafana
    baseurl=https://packages.grafana.com/oss/rpm
    repo_gpgcheck=1
    enabled=1
    gpgcheck=1
    gpgkey=https://packages.grafana.com/gpg.key
    EOC

    sudo yum install grafana -y

    # Enable and start Grafana
    sudo systemctl enable grafana-server
    sudo systemctl start grafana-server
  EOF
}
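Once terraform apply completes, Prometheus is reachable on port 9090 and Grafana on port 3000 of the instance's public IP. The project's output definitions are not reproduced here; a minimal sketch of one that would surface that IP, as an assumed helper:

# outputs.tf (assumed; not part of the original listing)
output "instance_public_ip" {
  description = "Public IP for reaching Prometheus (:9090) and Grafana (:3000)"
  value       = aws_instance.Aws_Vm.public_ip
}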
Figure 2: Grafana Dashboard
4 CloudWatch Dashboard and Utilization Alerting with SNS
In this step, AWS CloudWatch was used to build a custom monitoring dashboard for
the EC2 instance and configure an automated alerting mechanism. The dashboard
provides real-time visibility of EC2 performance metrics, while alerts ensure proactive
notifications in case of threshold breaches.
4.1 CloudWatch Dashboard
• A dashboard named EC2-Demo-Monitoring was created.
• A title widget was added to identify the monitored region (Mumbai).
• A line chart widget was added to display the average CPU utilization over time.
• A single value widget was configured to show the latest CPU utilization
percentage at a glance.
Figure 3: CloudWatch Dashboard
Figure 4: Alert Dashboard
4.2 CloudWatch Alarm and SNS Integration
• A CloudWatch Alarm was created to monitor CPU utilization.
• The alarm threshold was set at 80%.
• When the threshold is breached, the alarm publishes to an SNS Topic, which
sends an email notification to subscribed users.
Figure 5: Email Triggered
4.3 Security Considerations
• Alerts were delivered securely via SNS.
• The monitoring setup ensures early detection of resource exhaustion and abnor-
mal usage.
• Dashboards provide visibility without exposing sensitive system information.
# ------------------ SNS Topic ------------------
resource "aws_sns_topic" "cloudwatch_alarms_topic" {
  name = "CloudWatch_Alarms_Topic"
}

# ------------------ SNS Subscription ------------------
resource "aws_sns_topic_subscription" "email_subscription" {
  topic_arn = aws_sns_topic.cloudwatch_alarms_topic.arn
  protocol  = "email"
  endpoint  = "cloudengineer1282@gmail.com" # Replace with your email
}

# ------------------ CloudWatch Alarm ------------------
resource "aws_cloudwatch_metric_alarm" "cpu_high_alarm" {
  alarm_name          = "High-CPU-Alarm"
  comparison_operator = "GreaterThanOrEqualToThreshold"
  evaluation_periods  = 1
  metric_name         = "CPUUtilization"
  namespace           = "AWS/EC2"
  period              = 60
  statistic           = "Maximum"
  threshold           = 80
  alarm_description   = "Alarm triggers if CPU utilization is >= 80% (Maximum)."
  alarm_actions       = [aws_sns_topic.cloudwatch_alarms_topic.arn]

  dimensions = {
    InstanceId = aws_instance.Aws_Vm.id
  }
}

# ------------------ CloudWatch Dashboard ------------------
resource "aws_cloudwatch_dashboard" "enhanced_dashboard" {
  dashboard_name = "EC2-Demo-Monitoring"

  # The dashboard_body uses JSON to define widgets
  dashboard_body = jsonencode({
    widgets = [

      # Dashboard Title Widget
      {
        type   = "text"
        x      = 0
        y      = 0
        width  = 24
        height = 2
        properties = {
          markdown = "# EC2 Mumbai"
        }
      },

      # CPU Utilization Chart
      {
        type   = "metric"
        x      = 0
        y      = 2
        width  = 12
        height = 6
        properties = {
          metrics = [
            ["AWS/EC2", "CPUUtilization", "InstanceId", aws_instance.Aws_Vm.id]
          ]
          period = 60
          stat   = "Average"
          title  = "CPU Utilization"
          region = "ap-south-1"
        }
      },

      # Single Value CPU Widget
      {
        type   = "metric"
        x      = 12
        y      = 8
        width  = 12
        height = 6
        properties = {
          metrics = [
            ["AWS/EC2", "CPUUtilization", "InstanceId", aws_instance.Aws_Vm.id]
          ]
          view   = "singleValue"
          stat   = "Average"
          period = 60
          title  = "Current CPU %"
          region = "ap-south-1"
        }
      },
    ]
  })
}
5 Restricted SSH Access within VPC
To enhance security, the EC2 instance was configured so that SSH (port 22) access is not
open to the public internet. Instead, the security group rules allow SSH connectivity
only from other instances within the same VPC.
This ensures that:
• Direct SSH access from external IP addresses is blocked.
• Administrative access can only be performed by first connecting to another
instance inside the VPC.
• The attack surface is reduced, as no public SSH exposure exists.
By enforcing this design, the EC2 instance is effectively shielded from unauthorized
SSH attempts; a sketch of the corresponding rule follows.
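The exact security group rule is not reproduced in this report. A minimal sketch, assuming the VPC CIDR from Section 2.1 replaces the earlier var.My_Ip source:

ingress {
  from_port   = 22
  to_port     = 22
  protocol    = "tcp"
  cidr_blocks = ["10.0.0.0/16"] # SSH reachable only from inside the VPC
}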
Figure 6: SSH Login
Conclusion
This project successfully demonstrated how to build a secure and automated cloud
monitoring environment using Terraform as the Infrastructure-as-Code (IaC) tool. The
solution integrated both AWS-native monitoring (CloudWatch) and open-source
observability tools (Prometheus & Grafana), showcasing a dual monitoring strategy
that ensures resilience, flexibility, and deeper system insights.
By provisioning the complete infrastructure with Terraform, including the VPC,
subnet, internet gateway, route tables, security groups, and EC2 instance, the project
highlighted the power of automation, reproducibility, and version-controlled
infrastructure deployment.
Key achievements:
• Automated Infrastructure: Terraform enabled a fully reproducible setup of
networking and compute resources without manual intervention.
• Dual Monitoring Setup: CloudWatch with SNS provided native alerting for
CPU thresholds, while Prometheus and Grafana delivered rich visualization and
host-level metrics.
• Security Hardening: SSH access was restricted to trusted sources and limited
within the VPC, reducing the attack surface.
• Scalability: The modular design allows easy scaling of monitoring resources or
extension to multiple instances.
• Cost-Efficiency: Resources could be safely cleaned up using terraform destroy,
ensuring no unnecessary cloud expenses.
In conclusion, this project demonstrated not only how monitoring can be automated
in AWS, but also how combining cloud-native and open-source tools provides a more
comprehensive observability solution. It emphasizes the importance of automation,
security, and proactive monitoring in modern cloud infrastructure management.