AWS Watchdog

Monitor and collect any error, in any AWS service, in any project and send it to me via email.

📐 Architecture

AWS Watchdog is a very basic way to proactively monitor for errors in AWS (any project, any service) and send them to me via email.

It collects errors in different ways.

a) CloudWatch Logs

It monitors any CloudWatch logs, of any AWS service, for the text "ERROR" or "WARN" or "CRITIC".
If an error is found, a Lambda is triggered: the Lambda formats the error log message and publishes it to the SNS topic that sends an email to me.
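
As a rough illustration of the formatting step, the sketch below decodes the payload that CloudWatch Logs delivers to a subscribed Lambda (base64-encoded, gzip-compressed JSON under `awslogs.data`) and turns it into a plain-text body. The function names and the layout of the body are assumptions made for this example, not the actual code in aws_watchdog.

```python
import base64
import gzip
import json


def decode_logs_event(event: dict) -> dict:
    """Decode the payload CloudWatch Logs sends to a subscribed Lambda."""
    compressed = base64.b64decode(event["awslogs"]["data"])
    return json.loads(gzip.decompress(compressed))


def format_error_message(payload: dict) -> str:
    """Build a plain-text email body (hypothetical layout)."""
    lines = [
        f"Log group: {payload['logGroup']}",
        f"Log stream: {payload['logStream']}",
        "",
    ]
    # Each log event carries the raw log line in its "message" field.
    lines.extend(e["message"].rstrip() for e in payload["logEvents"])
    return "\n".join(lines)
```

A handler would call `format_error_message(decode_logs_event(event))` and publish the result to the SNS topic.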

The Lambda trigger is NOT a classic trigger, like:

myfunction:
  ...
  events:
    - cloudwatchLog:
        # Mind that logGroup should match exactly a logGroup, no wildcards or similar.
        logGroup: '/aws/lambda/botte-be-prod-endpoint-message'
        filter: '?ERROR ?WARN ?CRITIC'

because otherwise we would have to write such an entry for every existing service. And, for future projects, we would have to remember to add new entries here.
A much better solution is to define a (CloudWatch) Log Account-Policy that monitors any log (existing and future) and triggers the Lambda.
Such an Account-Policy is defined with CloudFormation resources in serverless.yml: one AWS::Lambda::Permission and one AWS::Logs::AccountPolicy.
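
For reference, the same kind of policy can also be expressed through the CloudWatch Logs PutAccountPolicy API. The sketch below only builds the arguments. The policy name is hypothetical, and excluding the watchdog Lambda's own log group is an assumption made here to avoid the Lambda triggering itself when it logs an error. The actual template may be set up differently.

```python
import json


def build_account_policy(lambda_arn: str, own_log_group: str) -> dict:
    """Return keyword arguments for the CloudWatch Logs PutAccountPolicy API."""
    return {
        "policyName": "aws-watchdog-subscription",  # Hypothetical name.
        "policyType": "SUBSCRIPTION_FILTER_POLICY",
        "scope": "ALL",  # Applies to existing and future log groups.
        # Assumption: skip the watchdog's own log group to avoid a loop.
        "selectionCriteria": f'LogGroupName NOT IN ["{own_log_group}"]',
        "policyDocument": json.dumps(
            {
                "DestinationArn": lambda_arn,
                "FilterPattern": "?ERROR ?WARN ?CRITIC",
                "Distribution": "Random",
            }
        ),
    }
```

With boto3 this would be applied as `boto3.client("logs").put_account_policy(**build_account_policy(...))`. The Lambda still needs a permission that lets the `logs.amazonaws.com` principal invoke it, which is the role of the AWS::Lambda::Permission resource.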

b) SNS error Topic

The SNS error Topic aws-watchdog-errors-prod is the core of AWS Watchdog:

- it has an email subscription to my email address, so that any message published to this topic is sent by email to me;
- it's where the Lambda described in a) forwards log errors by publishing a message.
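
Publishing to the topic could look roughly like the sketch below. The helper names are hypothetical. The subject is sanitized because SNS email subjects must be single-line ASCII text of at most 100 characters. The client is injectable so the function can be exercised without AWS access.

```python
def make_subject(text: str, limit: int = 100) -> str:
    """Return a single-line ASCII subject of at most `limit` characters."""
    ascii_text = text.encode("ascii", errors="replace").decode("ascii")
    # Collapse newlines and repeated whitespace into single spaces.
    subject = " ".join(ascii_text.split())[:limit]
    return subject or "AWS Watchdog error"  # Fallback is an assumption.


def publish_error(topic_arn: str, subject: str, body: str, sns_client=None) -> dict:
    """Publish an error message to the SNS error topic."""
    if sns_client is None:
        import boto3  # Imported lazily so a fake client can be passed in tests.

        sns_client = boto3.client("sns")
    return sns_client.publish(
        TopicArn=topic_arn,
        Subject=make_subject(subject),
        Message=body,
    )
```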

Also, it should be used:

- by any Lambda, in any project, that is invoked asynchronously, as onError destination; see e.g. botte-be in botte-monorepo.
  Note: examples of how Lambda is invoked by different sources:
  - ASYNC: S3, SNS, CloudWatch Logs, EventBridge Scheduler, aws cli with --invocation-type Event, etc.
  - SYNC: API Gateway, aws cli by default, etc.
  - POLLING (event source mapping, so the onError destination does not apply): SQS, DynamoDB Streams, etc.
- by any DynamoDB Stream, in any project, as destination for discarded records; see e.g. botte-be in botte-monorepo.
- by any other AWS service, in any project, as destination for errors or DLQ, if SNS is supported.
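
As an illustration of the onError destination for asynchronous invocations, the sketch below builds the arguments for Lambda's PutFunctionEventInvokeConfig API. The projects mentioned above configure this through Serverless instead. The function name and the choice of 0 retries are assumptions for this example.

```python
def build_on_failure_config(function_name: str, topic_arn: str) -> dict:
    """Return keyword arguments for Lambda's PutFunctionEventInvokeConfig API."""
    return {
        "FunctionName": function_name,
        # Allowed values are 0 to 2; 0 (an assumption here) reports the
        # failure right after the first failed attempt.
        "MaximumRetryAttempts": 0,
        "DestinationConfig": {"OnFailure": {"Destination": topic_arn}},
    }
```

With boto3 it would be applied as `boto3.client("lambda").put_function_event_invoke_config(**build_on_failure_config(...))`. The destination only receives failures of asynchronous invocations.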

c) SQS error Queue

The SQS error Queue aws-watchdog-errors-prod has a Pipe that forwards enqueued messages to the SNS topic described in b) (which sends emails to me).

It should be used:

- by any SQS queue, in any project, as DLQ for failed messages;
  that is because SQS does not support SNS (like the SNS topic here) for failed messages, it only supports SQS as DLQ.
  We need this for every SQS queue, otherwise SQS keeps redelivering failed messages to the subscribed Lambdas (triggering them even if the retry policy on the Lambda is set to 0) until MessageRetentionPeriod expires. With a DLQ we can instead configure "maxReceiveCount: 1" so that a message is moved to the DLQ after its first failed receive (in general, after maxReceiveCount attempts). See the fritarol project in the old, and now archived, patatrack-monorepo.
  Note: I try NOT to use the pattern SQS-trigger-Lambda, as it works with polling and it incurs a (small) monthly cost.
- by (best choice) EventBridge Scheduler schedule (AWS::Scheduler::Schedule) as Target > DeadLetterConfig (only SQS supported); see reborn-automator.
- by (worst choice) EventBridge rule schedule (AWS::Events::Rule) as Targets > DeadLetterConfig (only SQS supported); no example, since AWS::Scheduler::Schedule is better.
- by any other AWS service, in any project, as destination for errors, but only if it does not support SNS (the SNS topic here) directly.
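
To make the DLQ configuration concrete, the sketch below builds the RedrivePolicy attribute that points a source queue to the error queue. The helper name is hypothetical, and the referenced projects set this in their Serverless/CloudFormation templates rather than via API calls.

```python
import json


def build_redrive_attributes(dlq_arn: str, max_receive_count: int = 1) -> dict:
    """Return the Attributes mapping for the SQS SetQueueAttributes API."""
    if not 1 <= max_receive_count <= 1000:
        raise ValueError("maxReceiveCount must be between 1 and 1000")
    redrive_policy = {
        "deadLetterTargetArn": dlq_arn,
        # After this many receives without deletion, SQS moves the message
        # to the DLQ instead of redelivering it.
        "maxReceiveCount": max_receive_count,
    }
    # SQS expects the policy as a JSON string.
    return {"RedrivePolicy": json.dumps(redrive_policy)}
```

With boto3 it would be applied as `boto3.client("sqs").set_queue_attributes(QueueUrl=..., Attributes=build_redrive_attributes(dlq_arn))`, where `dlq_arn` is the ARN of the aws-watchdog-errors-prod queue.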

🛠️ Development setup

1 - System requirements

Python 3.13
The target is Python 3.13, as it is the latest Python runtime available in AWS Lambda at the time of writing.
Install it with pyenv:

$ pyenv install -l  # List all available versions.
$ pyenv install 3.13.7

Poetry
Poetry is used to manage requirements (and virtual environments).
Read more about Poetry here.
Follow the install instructions.

Pre-commit
Pre-commit is used to format the code with black before each git commit:

$ pip install --user pre-commit
# On macOS you can also:
$ brew install pre-commit

2 - Virtual environment and requirements

Create a virtual environment and install all deps with one Make command:

$ make poetry-create-env
# Or to recreate:
$ make poetry-destroy-and-recreate-env
# Then you can activate the virtual env with:
$ eval $(poetry env activate)
# And later deactivate the virtual env with:
$ deactivate

Without using Makefile the full process is:

# Activate the Python version for the current project:
$ pyenv local 3.13  # It creates `.python-version`, to be git-ignored.
$ pyenv which python
/Users/nimiq/.pyenv/versions/3.13.7/bin/python

# Now create a venv with poetry:
$ poetry env use ~/.pyenv/versions/3.13.7/bin/python
# Now you can open a shell and/or install:
$ eval $(poetry env activate)
# And finally, install all requirements:
$ poetry install
# And later deactivate the virtual env with:
$ deactivate

To add new requirements:

$ poetry add requests

# Dev or test only.
$ poetry add -G test pytest
$ poetry add -G dev ipdb

# With extra reqs:
$ poetry add -G dev "aws-lambda-powertools[aws-sdk]"
$ poetry add "requests[security,socks]"

# From Git:
$ poetry add git+https://github.com/aladagemre/django-notification

# From a Git subdir:
$ poetry add git+https://github.com/puntonim/utils-monorepo#subdirectory=log-utils
# and with extra reqs:
$ poetry add "git+https://github.com/puntonim/utils-monorepo#subdirectory=log-utils[rich-adapter,loguru-adapter]"
# and at a specific version:
$ poetry add git+https://github.com/puntonim/utils-monorepo@00a49cb64524df19bf55ab5c7c1aaf4c09e92360#subdirectory=log-utils
# and at a specific version, with extra reqs:
$ poetry add "git+https://github.com/puntonim/utils-monorepo@00a49cb64524df19bf55ab5c7c1aaf4c09e92360#subdirectory=log-utils[rich-adapter,loguru-adapter]"

# From a local dir:
$ poetry add ../utils-monorepo/log-utils/
$ poetry add "log-utils @ file:///Users/myuser/workspace/utils-monorepo/log-utils/"
# and with extra reqs:
$ poetry add "../utils-monorepo/log-utils/[rich-adapter,loguru-adapter]"
# and I was able to choose a Git version only with pip (not poetry):
$ pip install "git+file:///Users/myuser/workspace/utils-monorepo@00a49cb64524df19bf55ab5c7c1aaf4c09e92360#subdirectory=log-utils"

3 - Pre-commit

$ pre-commit install

🔨 Test

To run unit and end-to-end tests:

$ make test

🚀 Deployment

1 - Install deployment requirements

The deployment is managed by Serverless, which requires Node.js.
Follow the install instructions for NVM (Node Version Manager).
Then:

$ nvm install --lts
$ node -v > .nvmrc

Follow the install instructions for Serverless, something like curl -o- -L https://slss.io/install | bash. We currently use version 3.12.0. If you have an older major version, you can upgrade Serverless with: sls upgrade --major.

Then to install the Serverless plugins required:

# $ sls upgrade  # Only if you are sure it will not install a major version.
$ nvm install
$ nvm use

2 - Deployment steps

Note: AWS CLI and credentials should be already installed and configured.

Deploy to PRODUCTION in AWS with:

$ sls deploy
# $ make deploy  # Alternative.

To deploy a single function (only if it was already deployed):

$ sls deploy function -f endpoint-health

Deploy to a DEV STAGE

Pick a stage name: if your name is Jane then the best format is: dev-jane.
Create the keys in AWS Parameter Store with the right stage name.

To deploy your own DEV STAGE version in AWS:

# Deploy:
$ sls deploy --stage dev-jane
# Delete completely when you are done:
$ sls remove --stage dev-jane

©️ Copyright

Copyright puntonim (https://github.com/puntonim). No License.
