Go Transcribe API

A serverless Go API that transcribes audio and video files with AWS Transcribe. It processes S3 upload events, authenticates HTTP requests with an API key, and deploys to AWS Lambda.

Features

✅ Pure Go implementation using only net/http (no external frameworks)
✅ S3 event-driven transcription processing
✅ API key authentication via X-API-Key header
✅ Health check endpoint at /health
✅ Automatic transcription on file upload to S3
✅ Multiple audio/video format support:
- Audio: MP3, WAV, FLAC, OGG, AMR
- Video: MP4, WebM
✅ Automatic output to separate S3 bucket
✅ Asynchronous job processing with polling
✅ Graceful shutdown support
✅ Request logging middleware
✅ AWS Lambda deployment ready with Serverless Framework
✅ Dual deployment: Run locally as HTTP server or deploy to AWS Lambda

Project Structure

/
├── cmd/
│   ├── api/
│   │   └── main.go           # Local server entry point
│   └── lambda/
│       └── main.go           # AWS Lambda entry point (S3 + API Gateway)
├── internal/
│   ├── handlers/
│   │   ├── health.go         # Health check handler
│   │   ├── transcribe.go     # Transcription handler (audio & video)
│   │   ├── health_test.go    # Health handler tests
│   │   └── transcribe_test.go # Transcription tests
│   ├── middleware/
│   │   ├── auth.go           # Authentication middleware
│   │   └── auth_test.go      # Middleware tests
│   └── server/
│       ├── server.go         # Server setup and configuration
│       └── server_test.go    # Server tests
├── tests/
│   ├── e2e_test.go           # End-to-end integration tests
│   └── data/                 # Test data files (audio/video samples)
├── serverless.yml            # Serverless Framework configuration
├── Taskfile.yml              # Task runner configuration
├── .air.toml                 # Hot reload configuration
├── Dockerfile                # Docker configuration
├── .gitignore                # Git ignore file
├── go.mod                    # Go module file
└── README.md                 # This file

Architecture

S3 Event-Driven Flow

1. Upload audio/video file to Input S3 Bucket
2. S3 triggers Lambda function automatically
3. Lambda starts AWS Transcribe job
4. Lambda polls for job completion
5. Transcription result saved to Output S3 Bucket
   - AWS Transcribe raw output saved to transcribe-{jobname}-{timestamp}.json
   - Processed result saved to results/{filename}.json
   - Plain text transcript saved to transcripts/{filename}.txt
6. AWS Transcribe output triggers webhook (if configured)
   - S3 event fires when transcribe-*.json file is created
   - Lambda downloads both AWS Transcribe output and processed result
   - Webhook receives complete payload with both JSONs and metadata
7. Original transcript available in AWS Transcribe output location
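
The output key layout in the flow above can be illustrated with a small helper. This is a minimal sketch, not the repository's code: the function name `OutputKeys` is hypothetical, and it assumes `{filename}` means the uploaded object's base name without its extension (as the `my-audio.mp3` examples elsewhere in this README suggest).

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// OutputKeys derives the processed-result and plain-text keys for an upload.
// Hypothetical helper: the real project may build these keys differently.
func OutputKeys(inputKey string) (resultKey, transcriptKey string) {
	// S3 keys always use "/" separators, so package path (not filepath) applies.
	name := strings.TrimSuffix(path.Base(inputKey), path.Ext(inputKey))
	return "results/" + name + ".json", "transcripts/" + name + ".txt"
}

func main() {
	result, transcript := OutputKeys("my-audio.mp3")
	fmt.Println(result)
	fmt.Println(transcript)
}
```

Dropping any key prefix (for example `uploads/`) is a choice made for this sketch; two uploads with the same base name would then map to the same output keys.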

Components

Input Bucket: go-transcribe-api-{stage}-input
Output Bucket: go-transcribe-api-{stage}-output
Lambda Function: Handles both S3 events and HTTP API requests
AWS Transcribe: Performs actual audio/video transcription
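
To make the S3 side of the Lambda function more concrete, the sketch below decodes an S3 notification payload using only the standard library. It is an illustration under assumptions: the type and function names (`s3Event`, `ObjectRef`, `UploadedObjects`) are hypothetical, and the actual project may well use a ready-made event type from an AWS SDK package instead. Only the bucket name and object key fields of the notification are modeled.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/url"
)

// s3Event keeps only the fields of an S3 notification this sketch needs.
type s3Event struct {
	Records []struct {
		S3 struct {
			Bucket struct {
				Name string `json:"name"`
			} `json:"bucket"`
			Object struct {
				Key string `json:"key"`
			} `json:"object"`
		} `json:"s3"`
	} `json:"Records"`
}

// ObjectRef identifies one uploaded object (hypothetical helper type).
type ObjectRef struct {
	Bucket string
	Key    string
}

// UploadedObjects extracts bucket/key pairs from a raw S3 event payload.
func UploadedObjects(raw []byte) ([]ObjectRef, error) {
	var ev s3Event
	if err := json.Unmarshal(raw, &ev); err != nil {
		return nil, fmt.Errorf("decode S3 event: %w", err)
	}
	refs := make([]ObjectRef, 0, len(ev.Records))
	for _, rec := range ev.Records {
		// Keys arrive URL-encoded, e.g. "my+audio.mp3" for "my audio.mp3".
		key, err := url.QueryUnescape(rec.S3.Object.Key)
		if err != nil {
			return nil, fmt.Errorf("decode key %q: %w", rec.S3.Object.Key, err)
		}
		refs = append(refs, ObjectRef{Bucket: rec.S3.Bucket.Name, Key: key})
	}
	return refs, nil
}

func main() {
	raw := []byte(`{"Records":[{"s3":{"bucket":{"name":"go-transcribe-api-dev-input"},"object":{"key":"my+audio.mp3"}}}]}`)
	refs, err := UploadedObjects(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", refs)
}
```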

Local Development Setup

Prerequisites

Go 1.21 or higher
Git
Task - Task runner (recommended)
AWS CLI configured with appropriate credentials
(Optional) Serverless Framework for deployment
(Optional) Docker for containerized deployment

Installation

Clone the repository:

git clone https://github.com/nicobistolfi/go-transcribe-api.git
cd go-transcribe-api

Install dependencies:

go mod download
# or using Task
task mod

Create a .env file from the example:

cp .env.example .env
# Edit .env and set your API_KEY and bucket names

Install Task runner (if not already installed):

# macOS
brew install go-task/tap/go-task

# Linux
sh -c "$(curl --location https://taskfile.dev/install.sh)" -- -d

# Windows (using Scoop)
scoop install task

Quick Start with Task

View all available tasks:

task --list
# or simply
task

Common operations:

# Run the server locally
task run

# Run tests
task test

# Run tests with coverage
task test-coverage

# Build the binary
task build

# Format code
task fmt

# Start development server with hot reload
task dev

Environment Variables Configuration

The application uses the following environment variables:

| Variable | Description | Default | Required |
|----------|-------------|---------|----------|
| `API_KEY` | API key for authentication | - | Yes |
| `PORT` | Port to run the server on | `8080` | No |
| `AWS_REGION` | AWS region for Transcribe service | - | Yes |
| `AWS_ACCESS_KEY_ID` | AWS access key ID | - | Yes (unless using IAM roles) |
| `AWS_SECRET_ACCESS_KEY` | AWS secret access key | - | Yes (unless using IAM roles) |
| `INPUT_BUCKET` | S3 bucket for input files | - | Yes (for Lambda) |
| `OUTPUT_BUCKET` | S3 bucket for output transcripts | - | Yes (for Lambda) |
| `WEBHOOK_URL` | Webhook URL to receive transcription results | - | No |
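
As an illustration of how these variables might be read at startup, here is a minimal sketch. The `Config` struct and `LoadConfig` function are hypothetical names, not taken from the repository, and the sketch only enforces the two variables the table marks as unconditionally required.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// Config mirrors the documented variables. Names are illustrative only.
type Config struct {
	APIKey       string
	Port         string
	Region       string
	InputBucket  string
	OutputBucket string
	WebhookURL   string
}

// LoadConfig reads settings through getenv so it can be exercised
// without touching the real process environment.
func LoadConfig(getenv func(string) string) (Config, error) {
	cfg := Config{
		APIKey:       getenv("API_KEY"),
		Port:         getenv("PORT"),
		Region:       getenv("AWS_REGION"),
		InputBucket:  getenv("INPUT_BUCKET"),
		OutputBucket: getenv("OUTPUT_BUCKET"),
		WebhookURL:   getenv("WEBHOOK_URL"),
	}
	if cfg.Port == "" {
		cfg.Port = "8080" // documented default
	}
	if cfg.APIKey == "" {
		return Config{}, errors.New("API_KEY is required")
	}
	if cfg.Region == "" {
		return Config{}, errors.New("AWS_REGION is required")
	}
	return cfg, nil
}

func main() {
	cfg, err := LoadConfig(os.Getenv)
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	fmt.Println("listening on port", cfg.Port)
}
```

Passing the lookup function in, rather than calling `os.Getenv` directly, keeps the validation easy to test with a plain map.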

Running the Server Locally

Using Task (Recommended)

# Run with default dev API key
task run

# Run with custom API key
API_KEY="your-secret-api-key" task run

# Run on custom port
PORT=3000 task run

Using Go directly

Set the required environment variables:

export API_KEY="your-secret-api-key"
export PORT="8080"  # Optional, defaults to 8080
export INPUT_BUCKET="go-transcribe-api-dev-input"
export OUTPUT_BUCKET="go-transcribe-api-dev-output"

Run the server:

go run cmd/api/main.go

Using Docker

# Build and run in Docker
API_KEY="your-secret-api-key" task docker

The server will start on http://localhost:8080 (or the port specified).

Testing the API

Health check (no authentication required):

curl http://localhost:8080/health
# or pretty-print the response with jq
curl http://localhost:8080/health | jq .

Expected response:

{"status":"ok"}

Check transcription status (with authentication):

curl -H "X-API-Key: your-secret-api-key" http://localhost:8080/transcribe/status

Expected response:

{"message":"Transcription is triggered automatically via S3 uploads to the input bucket"}

Running Tests

Using Task (Recommended)

# Run all tests
task test

# Run tests with coverage
task test-coverage

# Run end-to-end tests (requires AWS credentials)
task test:e2e

# Run linter
task lint

# Clean build artifacts
task clean

Using Go directly

Run all tests with coverage:

go test -v -cover ./...

Run tests for a specific package:

go test -v ./internal/handlers
go test -v ./internal/middleware
go test -v ./internal/server

Run end-to-end tests (requires AWS credentials):

# Set up AWS credentials first
export AWS_ACCESS_KEY_ID="your-access-key"
export AWS_SECRET_ACCESS_KEY="your-secret-key"
export AWS_REGION="us-west-1"
export INPUT_BUCKET="go-transcribe-api-dev-input"
export OUTPUT_BUCKET="go-transcribe-api-dev-output"

# Run e2e tests
go test -v ./tests

Generate coverage report:

go test -coverprofile=coverage.out ./...
go tool cover -html=coverage.out -o coverage.html

Serverless Deployment

Prerequisites

Install Serverless Framework:

npm install -g serverless

Install dependencies:

npm install

Configure AWS credentials:

aws configure

Deployment Steps

Using Task (Recommended)

Make sure you have a .env file with your API_KEY set, or pass it explicitly:

# Deploy using .env file
task deploy

# Or deploy with explicit API_KEY
API_KEY="your-api-key" task deploy

# Deploy to specific stage
STAGE=production task deploy

# View logs
task logs

Using Serverless directly

Set your API key as an environment variable:

export API_KEY="your-production-api-key"

Deploy to AWS:

serverless deploy --stage production --region us-west-1

Deploy to a specific stage:

serverless deploy --stage dev
serverless deploy --stage staging
serverless deploy --stage production

Post-Deployment

After deployment, the Serverless Framework will output:

API Gateway URL
Lambda function names
S3 bucket names

Note: The input bucket is automatically created by the S3 event configuration. You can upload files directly to the input bucket to trigger transcription.

Viewing Logs

View function logs:

serverless logs -f transcribe --tail
# or using Task
task logs

Removing the Deployment

Remove the deployed service:

serverless remove --stage production
# or with the stage supplied through the STAGE environment variable (if serverless.yml reads it)
STAGE=production serverless remove

Usage

Transcribing Audio/Video Files

Upload file to Input S3 Bucket:

aws s3 cp my-audio.mp3 s3://go-transcribe-api-dev-input/

Lambda automatically processes the file:
- Detects file format
- Starts AWS Transcribe job
- Polls for completion
- Saves result to output bucket

Retrieve transcription results:

# Download full JSON result
aws s3 cp s3://go-transcribe-api-dev-output/results/my-audio.json ./

# Download plain text transcript
aws s3 cp s3://go-transcribe-api-dev-output/transcripts/my-audio.txt ./

Webhook Notification (optional):
- If WEBHOOK_URL is configured, webhook is triggered when AWS Transcribe completes
- S3 event fires when transcribe-*.json file is created in output bucket
- Lambda downloads both AWS Transcribe raw output and processed result
- Authentication: Includes X-Api-Key header with the API_KEY value
- Timeout: 30 seconds
- Content-Type: application/json
- Payload includes:
  - Event metadata (bucket, key, timestamp, region)
  - Complete AWS Transcribe JSON output
  - Processed result JSON with extracted transcript

Webhook Request Headers:

Content-Type: application/json
User-Agent: go-transcribe-api/1.0
X-Api-Key: <your-api-key>

Webhook Payload Example:

{
  "event": {
    "bucket": "go-transcribe-api-dev-output",
    "key": "transcribe-my-audio-1234567890.json",
    "timestamp": "2024-01-15T10:30:00Z",
    "region": "us-west-1"
  },
  "transcribe_output": {
    "jobName": "transcribe-my-audio-1234567890",
    "accountId": "123456789",
    "results": {
      "transcripts": [{
        "transcript": "Full transcript text from AWS Transcribe..."
      }],
      "items": [...]
    },
    "status": "COMPLETED"
  },
  "processed_result": {
    "filename": "my-audio.mp3",
    "job_name": "transcribe-my-audio-1234567890",
    "status": "COMPLETED",
    "transcript": "Full transcript text...",
    "language_code": "en-US",
    "success": true
  }
}
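
The webhook call described above could be assembled roughly as follows. This is a sketch under assumptions: the README does not state the HTTP method, so POST is assumed here, and `NewWebhookRequest` and `webhookClient` are hypothetical names. The headers and the 30-second timeout come from the description above.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"time"
)

// webhookClient applies the documented 30-second timeout to each delivery.
var webhookClient = &http.Client{Timeout: 30 * time.Second}

// NewWebhookRequest builds the JSON request with the documented headers.
// POST is an assumption; the README does not name the method.
func NewWebhookRequest(webhookURL, apiKey string, payload any) (*http.Request, error) {
	body, err := json.Marshal(payload)
	if err != nil {
		return nil, fmt.Errorf("encode payload: %w", err)
	}
	req, err := http.NewRequest(http.MethodPost, webhookURL, bytes.NewReader(body))
	if err != nil {
		return nil, fmt.Errorf("build request: %w", err)
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("User-Agent", "go-transcribe-api/1.0")
	req.Header.Set("X-Api-Key", apiKey)
	return req, nil
}

func main() {
	payload := map[string]any{
		"event": map[string]string{"bucket": "go-transcribe-api-dev-output"},
	}
	// example.com is a placeholder endpoint; nothing is sent in this demo.
	req, err := NewWebhookRequest("https://example.com/hook", "your-api-key", payload)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(req.Method, req.URL, req.Header.Get("Content-Type"))
	// Delivery would be a single call: resp, err := webhookClient.Do(req)
}
```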

Processed result format:

{
  "filename": "my-audio.mp3",
  "job_name": "transcribe-my-audio-1234567890",
  "status": "COMPLETED",
  "transcript": "This is the full transcript of the audio...",
  "language_code": "en-US",
  "output_file_url": "https://...",
  "success": true
}
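
A consumer of the output bucket or webhook might decode the processed result like this. The struct below is a sketch inferred from the JSON fields shown above; the type name `ProcessedResult` and the function `ParseProcessedResult` are hypothetical and may not match the repository's own types.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ProcessedResult mirrors the documented processed result fields.
type ProcessedResult struct {
	Filename      string `json:"filename"`
	JobName       string `json:"job_name"`
	Status        string `json:"status"`
	Transcript    string `json:"transcript"`
	LanguageCode  string `json:"language_code"`
	OutputFileURL string `json:"output_file_url"`
	Success       bool   `json:"success"`
}

// ParseProcessedResult decodes one results/{filename}.json document.
func ParseProcessedResult(data []byte) (ProcessedResult, error) {
	var r ProcessedResult
	if err := json.Unmarshal(data, &r); err != nil {
		return ProcessedResult{}, fmt.Errorf("decode processed result: %w", err)
	}
	return r, nil
}

func main() {
	data := []byte(`{"filename":"my-audio.mp3","status":"COMPLETED","transcript":"hello","success":true}`)
	res, err := ParseProcessedResult(data)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(res.Filename, res.Status, res.Success)
}
```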

Supported File Formats

The API automatically detects file formats based on file extensions:

Audio Files:
- MP3 (.mp3)
- WAV (.wav)
- FLAC (.flac)
- OGG (.ogg)
- AMR (.amr)

Video Files:
- MP4 (.mp4)
- WebM (.webm)
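
Extension-based detection for the formats listed above can be sketched as a lookup table. The function name `DetectMediaFormat` is hypothetical, and treating extensions case-insensitively is an assumption of this sketch rather than documented behavior.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// mediaFormats maps supported extensions to AWS Transcribe media format values.
var mediaFormats = map[string]string{
	".mp3":  "mp3",
	".wav":  "wav",
	".flac": "flac",
	".ogg":  "ogg",
	".amr":  "amr",
	".mp4":  "mp4",
	".webm": "webm",
}

// DetectMediaFormat returns the media format for an object key and
// false when the extension is not in the supported list.
func DetectMediaFormat(key string) (string, bool) {
	format, ok := mediaFormats[strings.ToLower(path.Ext(key))]
	return format, ok
}

func main() {
	for _, key := range []string{"my-audio.mp3", "clips/Demo.WebM", "notes.txt"} {
		format, ok := DetectMediaFormat(key)
		fmt.Println(key, format, ok)
	}
}
```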

API Documentation

Endpoints

`GET /health`

Health check endpoint that returns the service status.

Authentication: Not required

Response:

Status: 200 OK
Body: {"status": "ok"}

`GET /transcribe/status`

Informational endpoint that explains how transcription is triggered.

Authentication: Required (X-API-Key header)

Response:

Status: 200 OK
Body: {"message": "Transcription is triggered automatically via S3 uploads to the input bucket"}

Authentication

All endpoints (except /health) require API key authentication via the X-API-Key header.

Example:

curl -H "X-API-Key: your-api-key" https://your-api-url.com/transcribe/status

Error Responses:

401 Unauthorized - Missing or invalid API key
- {"error": "Missing API key"}
- {"error": "Invalid API key"}
- {"error": "API key not configured"}
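
The authentication behavior described above can be sketched with `net/http` alone. This is not the repository's middleware: the helper `checkAPIKey` is hypothetical, the order in which the three error cases are checked is an assumption, and the constant-time comparison is a choice made for this sketch.

```go
package main

import (
	"crypto/subtle"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"os"
)

// checkAPIKey returns an empty string when the key is accepted, otherwise
// one of the three documented error messages.
func checkAPIKey(configured, provided string) string {
	switch {
	case configured == "":
		return "API key not configured"
	case provided == "":
		return "Missing API key"
	case subtle.ConstantTimeCompare([]byte(configured), []byte(provided)) != 1:
		return "Invalid API key"
	}
	return ""
}

// AuthMiddleware rejects requests with 401 unless X-API-Key matches API_KEY.
func AuthMiddleware(next http.HandlerFunc) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		if msg := checkAPIKey(os.Getenv("API_KEY"), r.Header.Get("X-API-Key")); msg != "" {
			w.Header().Set("Content-Type", "application/json")
			w.WriteHeader(http.StatusUnauthorized)
			json.NewEncoder(w).Encode(map[string]string{"error": msg})
			return
		}
		next(w, r)
	}
}

func main() {
	os.Setenv("API_KEY", "your-api-key")
	handler := AuthMiddleware(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, `{"message":"ok"}`)
	})

	// Exercise the middleware in memory, first without and then with the key.
	req := httptest.NewRequest(http.MethodGet, "/transcribe/status", nil)
	rec := httptest.NewRecorder()
	handler(rec, req)
	fmt.Println(rec.Code, rec.Body.String())

	req.Header.Set("X-API-Key", "your-api-key")
	rec = httptest.NewRecorder()
	handler(rec, req)
	fmt.Println(rec.Code, rec.Body.String())
}
```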

Development Guidelines

Development Tools

Install development dependencies:

go install github.com/air-verse/air@latest
go install github.com/golangci/golangci-lint/cmd/golangci-lint@latest

This installs:

air - Hot reload for development
golangci-lint - Linting tool

Start development server with hot reload:

task dev

Run code checks:

# Format code
task fmt

# Run linter (if installed)
task lint

# Run default task (format, test, build)
task default

Clean build artifacts:

task clean

Transcription Processing

Processing Flow

File Upload: User uploads audio/video to input S3 bucket
Event Trigger: S3 triggers Lambda with event details
Job Start: Lambda starts AWS Transcribe job with appropriate media format
Polling: Lambda polls job status every 5 seconds (max 10 minutes)
Completion: When the job completes, Lambda downloads the transcript from the AWS Transcribe output
Storage: Saves formatted result to output S3 bucket
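
The polling step in this flow can be sketched as a bounded loop. The function `PollUntilDone` is a hypothetical illustration, not the project's code: the status check is passed in as a callback standing in for a call to AWS Transcribe, and the wait between attempts is injected so the loop can run instantly in tests. `COMPLETED` and `FAILED` are the terminal job statuses AWS Transcribe reports.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// ErrPollTimeout is returned when the job is still running after maxAttempts.
var ErrPollTimeout = errors.New("transcription job did not finish in time")

// PollUntilDone calls check until it reports a terminal status, an error
// occurs, or maxAttempts checks have been made. wait runs between checks.
func PollUntilDone(check func() (string, error), wait func(), maxAttempts int) (string, error) {
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		status, err := check()
		if err != nil {
			return "", err
		}
		if status == "COMPLETED" || status == "FAILED" {
			return status, nil
		}
		if attempt < maxAttempts {
			wait()
		}
	}
	return "", ErrPollTimeout
}

func main() {
	calls := 0
	// Stand-in for a job status lookup; it finishes on the third poll.
	check := func() (string, error) {
		calls++
		if calls < 3 {
			return "IN_PROGRESS", nil
		}
		return "COMPLETED", nil
	}
	// The documented interval is 5 seconds; shortened here for the demo.
	wait := func() { time.Sleep(10 * time.Millisecond) }

	status, err := PollUntilDone(check, wait, 120)
	fmt.Println(status, err, calls)
}
```

With a 5-second wait and 120 attempts, the loop gives up after roughly 10 minutes, which matches the limit stated above.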

Features

Automatic Format Detection: Determines media format from file extension
Asynchronous Processing: Uses AWS Transcribe's async API
Polling with Timeout: Polls every 5 seconds, max 120 attempts
Error Handling: Errors from processing an individual file are logged and stored
Unique Job Names: Timestamp-based unique identifiers prevent conflicts
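
A timestamp-based job name could be built as in the sketch below. The format `transcribe-{name}-{unix seconds}` is inferred from the example job names in this README, and the function `JobName` is hypothetical. AWS Transcribe only accepts job names made of letters, digits, `.`, `_`, and `-`, so the sketch replaces other characters with `-`; whether the project sanitizes names this way is an assumption.

```go
package main

import (
	"fmt"
	"path"
	"regexp"
	"strings"
	"time"
)

// invalidJobChars matches characters AWS Transcribe rejects in job names.
var invalidJobChars = regexp.MustCompile(`[^0-9a-zA-Z._-]`)

// JobName builds a job name from the uploaded key and a Unix timestamp.
// It does not enforce the 200-character limit on job names.
func JobName(inputKey string, unixSeconds int64) string {
	name := strings.TrimSuffix(path.Base(inputKey), path.Ext(inputKey))
	name = invalidJobChars.ReplaceAllString(name, "-")
	return fmt.Sprintf("transcribe-%s-%d", name, unixSeconds)
}

func main() {
	fmt.Println(JobName("my audio.mp3", time.Now().Unix()))
}
```

With second-level timestamps, two uploads of the same file name within the same second would still produce the same job name; a finer-grained timestamp or a random suffix would avoid that.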

AWS Transcribe Configuration

The implementation uses:

Language Code: English (en-US) by default
Media Format: Auto-detected from extension
Output Location: Controlled by OUTPUT_BUCKET environment variable
Job Timeout: 10 minutes maximum polling duration

Adding New Endpoints

Create a new handler in internal/handlers/
Add authentication by wrapping with middleware.AuthMiddleware()
Register the route in internal/server/server.go
Write comprehensive tests

Example:

// In internal/server/server.go
mux.HandleFunc("/api/jobs", middleware.AuthMiddleware(handlers.JobsHandler))

Code Style

Follow standard Go conventions
Use gofmt for formatting
Keep functions small and focused
Write tests for all new functionality
Use meaningful variable and function names

Troubleshooting

Common Issues

Server fails to start
- Check if the port is already in use
- Ensure all environment variables are set correctly

Authentication failures
- Verify the API_KEY environment variable is set
- Check that the X-API-Key header matches exactly

Transcription not triggering
- Verify file is uploaded to correct S3 bucket
- Check Lambda CloudWatch logs for errors
- Ensure file extension is supported
- Verify IAM permissions for S3 and Transcribe

Deployment issues
- Ensure AWS credentials are configured
- Check Serverless Framework version compatibility
- Verify the Go version matches the Lambda runtime
- Ensure the binary is built for Linux (GOOS=linux GOARCH=amd64)

Transcription jobs failing
- Check file format is supported by AWS Transcribe
- Verify IAM role has transcribe:StartTranscriptionJob permission
- Check CloudWatch logs for detailed error messages
- Ensure output bucket has write permissions

Contributing

Fork the repository
Create a feature branch (git checkout -b feature/amazing-feature)
Commit your changes (git commit -m 'Add some amazing feature')
Push to the branch (git push origin feature/amazing-feature)
Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

Built with AWS Transcribe for high-quality speech-to-text
Uses AWS Lambda for serverless scalability
Inspired by modern serverless architectures
