URLGuard

On-device ML-powered phishing defense with privacy-first P2P threat sharing and homograph detection.

Recognition

Ranked 13th out of 925 submissions at the Amplicode Hackathon and placed 2nd in the Social Security & Privacy track. URLGuard (repository name: PrivacyGuard) was evaluated against commercial-grade browser security tools and research prototypes from participants across the world.

Overview

URLGuard is a Chrome extension that protects users from phishing attacks and malicious websites by combining on-device machine learning, heuristic analysis, and Unicode homograph detection. It is built on a privacy-first architecture: all URL analysis happens locally in the browser, and no data is transmitted to any external server.

Features

AI-Powered Detection

Custom TensorFlow.js model trained on 100k+ balanced samples, achieving 88.88% test accuracy
16 lexical URL features extracted and analyzed entirely on-device
Sub-500ms inference latency per URL

Tiered Alert System

Risk Level	Score Range	Behavior
High Risk	> 75	Full-page interstitial block
Caution	30 – 75	Non-intrusive corner notification
Safe	< 30	Silent background monitoring
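
The tiers in the table above could be mapped in code roughly as follows. This is a minimal sketch, not the extension's actual implementation: the function name `classifyRisk`, the returned field names, and the assumption that scores lie on a 0 to 100 scale are all illustrative.

```javascript
// Hypothetical helper: maps a numeric risk score to the tiers in the table.
// Boundaries follow the table: > 75 is high risk, 30 to 75 is caution, < 30 is safe.
function classifyRisk(score) {
    if (!Number.isFinite(score)) {
        throw new TypeError('score must be a finite number');
    }
    if (score > 75) {
        return { level: 'high', behavior: 'interstitial' };
    }
    if (score >= 30) {
        return { level: 'caution', behavior: 'notification' };
    }
    return { level: 'safe', behavior: 'silent' };
}
```

Scores of exactly 30 and exactly 75 both fall into the Caution tier, matching the inclusive range in the table.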

Multi-Layer Protection

Homograph detection — flags Punycode (xn--) and mixed-script Unicode spoofing
Heuristic engine — evaluates HTTPS presence, suspicious form elements, IP-based domains, URL entropy
Smart whitelisting — user-controlled trust preferences persist via local storage
P2P threat sharing — mock implementation demonstrating a community-driven, privacy-preserving threat network
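
As an illustration of the URL-level heuristics listed above (HTTPS presence, IP-based domains, URL entropy), here is a minimal sketch. The function names `shannonEntropy` and `heuristicSignals` are hypothetical, the IPv4 pattern is deliberately simple, and how the real engine weights these signals is not shown here. Form element detection needs DOM access and is left out.

```javascript
// Shannon entropy in bits per character; random-looking strings score higher.
function shannonEntropy(text) {
    const chars = Array.from(text); // iterate by code point
    if (chars.length === 0) return 0;
    const counts = new Map();
    for (const ch of chars) {
        counts.set(ch, (counts.get(ch) || 0) + 1);
    }
    let entropy = 0;
    for (const n of counts.values()) {
        const p = n / chars.length;
        entropy -= p * Math.log2(p);
    }
    return entropy;
}

// Hypothetical signal collector for a single URL string.
function heuristicSignals(rawUrl) {
    const url = new URL(rawUrl);
    const ipv4 = /^(\d{1,3}\.){3}\d{1,3}$/; // simple dotted-quad check, an assumption
    return {
        usesHttps: url.protocol === 'https:',
        ipHost: ipv4.test(url.hostname),
        entropy: shannonEntropy(url.href),
    };
}
```

A caller might combine these signals with the ML score, for example by raising the risk score when `ipHost` is true and `usesHttps` is false.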

Architecture

PrivacyGuard/
├── js/
│   ├── content.js          # Core analysis engine and alert injection
│   ├── tf.min.js           # TensorFlow.js runtime
│   └── tfjs_model/         # Converted TF.js Graph Model
├── popup/
│   ├── popup.html          # Extension popup interface
│   ├── popup.js            # UI logic and controls
│   └── popup.css           # Popup styling
├── manifest.json           # Extension configuration (Manifest V2)
└── icons/

Detection Pipeline

Browser Extension
├── Content Script        ML inference, heuristics, homograph checks, DOM alerts
├── Popup Interface       Risk score display, user controls, per-URL analytics
└── Background Service    Persistent storage, settings management, P2P simulation

Detection Engines
├── ML Engine             TF.js model, 16 lexical features, sigmoid output
├── Heuristic Analysis    HTTPS, form detection, URL structure patterns
└── Homograph Detection   Punycode decoding, mixed-script analysis, confusable mapping
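
The Punycode and mixed-script checks named above can be sketched as follows. This is an illustrative approximation under stated assumptions: the function names are hypothetical, only three scripts are tested, no confusable mapping is applied, and the mixed-script check expects the Unicode (display) form of the hostname, since `URL.hostname` returns the ASCII `xn--` form.

```javascript
// Scripts to look for; a real implementation would likely cover more.
const SCRIPTS = {
    Latin: /\p{Script=Latin}/u,
    Cyrillic: /\p{Script=Cyrillic}/u,
    Greek: /\p{Script=Greek}/u,
};

// Returns the set of script names that occur in one hostname label.
function scriptsIn(label) {
    const found = new Set();
    for (const ch of label) {
        for (const [name, pattern] of Object.entries(SCRIPTS)) {
            if (pattern.test(ch)) found.add(name);
        }
    }
    return found;
}

// Hypothetical check: flags Punycode labels and labels mixing scripts.
function checkHomograph(hostname) {
    const labels = hostname.toLowerCase().split('.');
    const hasPunycode = labels.some((label) => label.startsWith('xn--'));
    const mixedScript = labels.some((label) => scriptsIn(label).size > 1);
    return { hasPunycode, mixedScript, suspicious: hasPunycode || mixedScript };
}
```

Digits and hyphens belong to the Common script, so a label such as `xn--pypal-4ve` is flagged by the Punycode check only, while a label that mixes Latin and Cyrillic letters is flagged by the mixed-script check.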

Alert Isolation via Shadow DOM

Alerts are injected into a closed Shadow DOM so that alert styles do not bleed into the host page and host-page stylesheets do not restyle the alert:

const createAlert = (riskData) => {
    const container = document.createElement('div');
    const shadow = container.attachShadow({ mode: 'closed' });
    shadow.innerHTML = `
        <style>
            .privacy-guard-alert {
                position: fixed;
                z-index: 2147483647;
                font-family: -apple-system, BlinkMacSystemFont, sans-serif;
            }
        </style>
        ${getAlertHTML(riskData)}
    `;
    document.body.appendChild(container);
};

Model Details

Training

Property	Value
Primary dataset	ebubekirbbr/dephides (~100k samples)
Validation dataset	Tranco Top 1M
Model accuracy	88.88%
Architecture	`Input(16) → Dense(32, ReLU) → Dense(16, ReLU) → Dense(1, Sigmoid)`
Conversion pipeline	Keras → TensorFlow SavedModel → TF.js Graph Model (tensorflowjs_converter v4.22.0)
Feature normalization	MinMaxScaler parameters ported to JavaScript
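
Porting MinMaxScaler parameters to JavaScript amounts to applying the per-feature formula `(x - min) / (max - min)` with the minima and maxima recorded at training time. The sketch below assumes those values are available as two plain arrays; the function name `minMaxScale` and the parameter names are hypothetical, and the real parameter values are not reproduced here.

```javascript
// Hypothetical port of min-max scaling to the [0, 1] range.
// dataMin and dataMax hold the per-feature training minima and maxima.
function minMaxScale(features, dataMin, dataMax) {
    if (features.length !== dataMin.length || features.length !== dataMax.length) {
        throw new RangeError('features, dataMin and dataMax must have equal length');
    }
    return features.map((x, i) => {
        const range = dataMax[i] - dataMin[i];
        // A constant training feature has range 0; divide by 1 to avoid NaN.
        return (x - dataMin[i]) / (range === 0 ? 1 : range);
    });
}
```

Values outside the training range are not clipped in this sketch, so a URL longer than any training sample would yield a scaled value above 1.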

Feature Set (16 Lexical Features)

url_length, hostname_length, path_length, query_length, num_dots, num_hyphens, num_at, num_question_marks, num_equals, num_underscore, num_percent, num_slash, has_https, has_ip, num_digits, num_letters
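
A rough sketch of how these 16 features might be extracted from a URL string is shown below. The function name `extractLexicalFeatures` is hypothetical, and the exact definitions are assumptions (for example, whether counts run over the full URL, or whether `query_length` includes the leading `?`); the definitions used at inference must match those used in training.

```javascript
// Counts non-overlapping matches of a global regular expression.
const countOf = (text, pattern) => (text.match(pattern) || []).length;

// Hypothetical extractor; keys follow the feature list above, in order.
function extractLexicalFeatures(rawUrl) {
    const url = new URL(rawUrl);
    const href = url.href;
    return {
        url_length: href.length,
        hostname_length: url.hostname.length,
        path_length: url.pathname.length,
        query_length: url.search.length, // includes the leading "?"
        num_dots: countOf(href, /\./g),
        num_hyphens: countOf(href, /-/g),
        num_at: countOf(href, /@/g),
        num_question_marks: countOf(href, /\?/g),
        num_equals: countOf(href, /=/g),
        num_underscore: countOf(href, /_/g),
        num_percent: countOf(href, /%/g),
        num_slash: countOf(href, /\//g),
        has_https: url.protocol === 'https:' ? 1 : 0,
        has_ip: /^(\d{1,3}\.){3}\d{1,3}$/.test(url.hostname) ? 1 : 0,
        num_digits: countOf(href, /\d/g),
        num_letters: countOf(href, /[a-z]/gi),
    };
}
```

`Object.values(extractLexicalFeatures(url))` yields the values in the order listed, which could then be normalized and passed to the model as a 16-element input vector.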

Runtime Performance

Metric	Value
Analysis latency	~450ms per URL
Additional memory	~15MB
CPU impact	< 2% during analysis
Model load time	1.8s (cached after first load)

Installation

git clone https://github.com/AdityaP700/PrivacyGuard.git
cd PrivacyGuard

Open chrome://extensions/
Enable Developer Mode (top-right toggle)
Click Load unpacked and select the PrivacyGuard directory
Pin the extension for quick access

Test Scenarios

# Serve local test files
python -m http.server 8000

# Green  — safe browsing
https://www.google.com

# Yellow — suspicious HTTP site
http://localhost:8000/college.html

# Red    — Punycode homograph attack (fake PayPal)
http://www.xn--pypal-4ve.com/

Developer Reference

// Inspect current whitelist
chrome.storage.local.get('privacyGuardWhitelist', console.log);

// View P2P data stores
chrome.storage.local.get(
    ['privacyGuardP2PUserPhishing', 'privacyGuardP2PUserSafe'],
    console.log
);

// Trigger manual analysis
analyzeCurrentURL().then(console.log);
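
For scripted checks against the whitelist, a small Promise wrapper around `chrome.storage.local.get` can be convenient. The sketch below is an assumption-laden example: the helper name `isWhitelisted` is hypothetical, and it assumes `privacyGuardWhitelist` holds an array of lowercase hostnames, which may not match the stored shape.

```javascript
// Hypothetical helper: resolves to true if hostname is in the stored whitelist.
// The storage argument defaults to chrome.storage.local and can be replaced in tests.
function isWhitelisted(hostname, storage = chrome.storage.local) {
    return new Promise((resolve) => {
        storage.get('privacyGuardWhitelist', (items) => {
            // Assumed shape: an array of lowercase hostnames.
            const list = Array.isArray(items.privacyGuardWhitelist)
                ? items.privacyGuardWhitelist
                : [];
            resolve(list.includes(hostname.toLowerCase()));
        });
    });
}
```

In an extension context, `isWhitelisted(location.hostname).then(console.log);` would print whether the current site is trusted, under the assumptions above.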

Known Limitations

P2P threat sharing is a mock implementation backed by chrome.storage.local; no real peer network exists
Homograph detection covers Latin-script confusables; full multilingual coverage is pending
False positive rate is approximately 5.8%, addressable via per-domain whitelisting
Page content analysis is limited to form element detection; no DOM-level behavioral analysis
Optimized for Chromium-based browsers (Chrome, Edge); Firefox requires API polyfills

Roadmap

v1.1

Retrain model on a larger, more diverse multilingual dataset
Dark mode UI support
Threat statistics dashboard with trend visualization
Export and import of user settings and whitelists

Future Work

Federated learning — collaborative model improvement without raw data collection
Real P2P via WebRTC — decentralized, anonymized threat sharing with trust scoring
Visual identity checks — perceptual hashing for logo and favicon mismatch detection
Domain intelligence — Newly Registered Domain (NRD) detection and WHOIS integration
Manifest V3 migration — declarativeNetRequest for improved performance and security
Cross-browser support — Firefox (via browser.* polyfills), Safari (WebExtensions API)

Contributing

# Fork and clone
git clone https://github.com/AdityaP700/PrivacyGuard.git

# Create a feature branch
git checkout -b feature/your-feature-name

# Test changes thoroughly before submitting a pull request

Bug reports and feature requests: GitHub Issues
Discussions: GitHub Discussions

Acknowledgments

Training data: ebubekirbbr/dephides, Tranco
ML runtime: TensorFlow.js
Hackathon: Amplicode

License

MIT — see LICENSE for details.

Contact: [email protected] · @AdityaPat_ · LinkedIn
