Crawler Analyzer Troubleshooting Guide

If you’re experiencing issues with the Crawler Analyzer feature, this guide will help you identify and resolve common problems quickly.

No Crawl Data Showing

When you open the Crawler Analyzer and see no data, it’s usually one of a few straightforward issues.

The feature needs time to collect data

The most common reason is that the feature simply hasn’t had time to collect data yet. Crawler logging needs actual bot visits to your site to record data. If you’ve just enabled the feature, you’ll need to wait for search engines and other bots to visit naturally. This typically happens within 24-48 hours for active sites.

To verify if logging is working:

  1. Navigate to the Crawler Analyzer page
  2. Check the Overview tab for any activity
  3. Try expanding your date range to 30 Days or 90 Days
  4. Check the Logs tab to see if any individual visits have been recorded

Check that logging is enabled

Crawler logging can be enabled or disabled in Linkilo’s settings. Go to Linkilo → Settings and look for the Crawler Analyzer section. Make sure the logging toggle is turned on.

Verify your bot tracking settings

The Crawler Analyzer only records visits from bots you’ve chosen to track. By default, major search engines (Googlebot, Bingbot) and common AI bots are enabled.

To check your tracking settings:

  1. Go to Linkilo → Settings
  2. Find the Crawler Analyzer section
  3. Review which bots are enabled for tracking
  4. Enable Other Bots/Crawlers if you want to catch all bot traffic

Database table issues

Sometimes the database table needed for logging doesn’t get created properly. To fix this:

  1. Deactivate the Linkilo plugin
  2. Wait a few seconds
  3. Reactivate the plugin

Reactivating the plugin recreates any missing database tables.

Tip: For testing purposes, you can add ?sla_generate_test_data=1 to your Crawler Analyzer URL (while logged in as admin) to generate sample data. This helps verify the feature is working correctly.

 


Inaccurate or Missing Bot Detection

If you’re seeing bot traffic in your server logs but it’s not appearing in the Crawler Analyzer, or if bots are being misidentified, there are several things to check.

Bot identification relies on user agent strings

The plugin identifies bots by matching patterns in their user agent strings. The current version supports over 40 different bots including:

  • Search engines: Googlebot, Bingbot, YandexBot, Baiduspider, DuckDuckBot, Applebot, and more
  • AI training bots: GPTBot, ClaudeBot, Google-Extended, Meta-ExternalAgent, PerplexityBot, CCBot, Bytespider
  • AI assistant bots: ChatGPT-User, Claude-User, Perplexity-User, OAI-SearchBot
  • Social media: Facebook, Twitter, LinkedIn, Pinterest, Slack, Discord, Telegram, WhatsApp
  • SEO tools: AhrefsBot, SemrushBot, MJ12bot, DotBot, and more

 

Enable “Other Bots” for comprehensive tracking

If you want to catch bots that aren’t specifically recognized, make sure Other Bots/Crawlers is enabled in your settings. This uses generic pattern matching to detect anything that looks like a bot but doesn’t match a known pattern.
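Conceptually, this is two-layer detection: known-bot patterns are tried first, then a generic fallback catches anything bot-like when Other Bots/Crawlers is enabled. The sketch below illustrates the idea; the patterns and function names are hypothetical, not the plugin's actual code:

```python
import re

# Illustrative subset of known-bot patterns, matched against user agent strings.
KNOWN_BOTS = {
    "Googlebot": r"Googlebot",
    "Bingbot": r"bingbot",
    "GPTBot": r"GPTBot",
    "ClaudeBot": r"ClaudeBot",
    "AhrefsBot": r"AhrefsBot",
}

# Generic fallback used only when "Other Bots/Crawlers" tracking is enabled.
GENERIC_BOT = re.compile(r"bot|crawler|crawl|spider|slurp", re.IGNORECASE)

def identify_bot(user_agent, track_other_bots=True):
    """Return a bot label for a user agent string, or None for (likely) humans."""
    for name, pattern in KNOWN_BOTS.items():
        if re.search(pattern, user_agent, re.IGNORECASE):
            return name
    if track_other_bots and GENERIC_BOT.search(user_agent):
        return "Other Bot"
    return None

print(identify_bot("Mozilla/5.0 (compatible; Googlebot/2.1)"))  # Googlebot
print(identify_bot("SomeRandomCrawler/1.0"))                    # Other Bot
```

Note that the fallback is purely heuristic: anything containing "bot", "crawler", or similar tokens is counted, which is why enabling it catches more traffic but can also sweep in obscure tools.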

Understanding false positives

Some automated tools or scrapers pretend to be search engine bots by using fake user agent strings. The plugin can’t always distinguish between a real Googlebot and something pretending to be Googlebot. This is a limitation of user agent-based detection. For critical security purposes, consider additional server-level verification.

Note: Static assets (CSS, JS, images, fonts) are automatically excluded from logging to reduce database size and improve performance.

 

Health Score Questions

The Health Score is calculated from four components, each weighted at 25%:

  • Crawl Frequency: How often Googlebot visits your site. Calculated as days with crawl activity out of the last 7 days.
  • Coverage: Percentage of your published content that Googlebot has crawled in the last 30 days.
  • Error Rate: Percentage of crawl requests returning error status codes (4xx, 5xx). Lower is better.
  • Response Time: Average server response time. Under 500ms is ideal; over 1 second significantly impacts the score.
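As a rough illustration of how four equally weighted components combine into one score, here is a sketch. The per-component scoring curves are assumptions based on the table above (the plugin's exact formulas may differ):

```python
def health_score(crawl_days_last_7, coverage_pct, error_rate_pct, avg_response_ms):
    """Combine four component scores (0-100 each), weighted 25% apiece."""
    frequency = crawl_days_last_7 / 7 * 100    # days with crawl activity out of 7
    coverage = coverage_pct                    # % of content crawled in 30 days
    errors = max(0.0, 100.0 - error_rate_pct)  # lower error rate scores higher
    if avg_response_ms <= 500:                 # under 500 ms is ideal
        speed = 100.0
    elif avg_response_ms >= 1000:              # over 1 s significantly hurts the score
        speed = 25.0
    else:                                      # assumed linear falloff from 500 ms to 1 s
        speed = 100.0 - (avg_response_ms - 500) / 500 * 75.0
    return round(0.25 * (frequency + coverage + errors + speed))

# Daily crawls, 80% coverage, 4% error rate, 300 ms average response:
print(health_score(7, 80, 4, 300))  # 94
```

The takeaway is that no single component can sink the score alone, which is why the Health tab's per-component breakdown is the right place to diagnose a low number.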

 

“Not enough data” message

The Health Score needs at least 7 days of data to calculate meaningful metrics. If you’ve just enabled the feature, wait for data to accumulate before relying on the score.

Score seems too low

Check the Health tab for a detailed breakdown. Common causes of low scores:

  • Low crawl frequency: Submit your sitemap to Google Search Console and ensure it’s up to date
  • Poor coverage: Improve internal linking to orphaned pages
  • High error rate: Fix 404 errors and broken links
  • Slow response time: Optimize server performance and enable caching

 

Coverage Percentage Seems Wrong

If coverage percentages don’t match your expectations, it helps to understand how they’re calculated. The plugin compares unique URLs crawled by each bot against your total published content (posts and pages).
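A minimal sketch of that comparison, assuming crawled and published URLs are available as sets (illustrative only, not the plugin's code):

```python
def coverage_pct(crawled_urls, published_urls):
    """Percentage of published URLs that appear in the crawl log."""
    if not published_urls:
        return 0.0
    crawled = set(crawled_urls) & set(published_urls)
    return len(crawled) / len(published_urls) * 100

published = {"/", "/about", "/blog/post-1", "/blog/post-2"}
crawled = {"/", "/blog/post-1", "/feed"}   # /feed isn't counted: not published content
print(coverage_pct(crawled, published))    # 50.0
```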

Common reasons for unexpected coverage numbers

  • More content than you realize: The total includes all published posts, pages, and custom post types—even old or forgotten content
  • Content blocked from crawlers: Check your robots.txt file and ensure important content isn’t accidentally blocked
  • Sitemap issues: Verify your XML sitemap is up to date and submitted to search engines
  • Retention period limits: Coverage is calculated based on your data retention period. Recently cleared logs will show lower coverage

 

“Days Since Last Crawl” Showing Unexpected Results

This feature tracks when each page was last visited by the selected bot. If the numbers seem off, here’s what to check:

Data retention limits

The feature can only report on data within your retention period. If retention is set to 30 days, any page not crawled in those 30 days will show as “Never Crawled” even if it was crawled 31 days ago.
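The retention cutoff works roughly like this (hypothetical function, not the plugin's code): once a visit falls outside the retention window its log entry is gone, so the page is indistinguishable from one that was never crawled.

```python
from datetime import date, timedelta

def days_since_last_crawl(last_crawl, today, retention_days):
    """Days since the bot's last visit, or None ("Never Crawled") when the
    visit falls outside the retention window or never happened at all."""
    if last_crawl is None:
        return None
    age = (today - last_crawl).days
    if age > retention_days:
        return None  # log entry already purged, so it looks never-crawled
    return age

today = date(2025, 6, 30)
print(days_since_last_crawl(today - timedelta(days=10), today, 30))  # 10
print(days_since_last_crawl(today - timedelta(days=31), today, 30))  # None
```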

To get more accurate results:

  • Increase your retention period to 90-200 days (recommended for this feature)
  • Allow time for data to accumulate after changing the setting
  • Remember that historical data before you enabled the feature won’t be available

Important pages showing as “Never Crawled”

This is actually valuable information! If important pages haven’t been crawled within your retention period, they need attention:

  • Improve internal linking to these pages
  • Submit them directly through Google Search Console
  • Update the content to trigger a recrawl
  • Check if they’re excluded by robots.txt

 

Performance Issues with Large Datasets

If the Crawler Analyzer interface is loading slowly or timing out, you’re likely dealing with too much data at once.

Immediate solutions

  • Use a shorter time range (7 days instead of 30 or 90)
  • Filter by specific bots rather than viewing “All Bots”
  • Apply URL filters to focus on specific sections of your site
  • Use the Logs tab’s pagination instead of trying to load everything at once

For sites with heavy bot traffic

Consider adjusting your retention settings to keep less historical data. While longer retention (90-200 days) is recommended for the “Days Since Last Crawl” feature, you might need to balance this with performance. The plugin caches dashboard data for 5 minutes and health scores for 15 minutes to reduce database load.
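The caching described above is a standard time-to-live (TTL) pattern: serve a stored result until it expires, then query the database again. A minimal sketch of the idea, not the plugin's implementation:

```python
import time

class TTLCache:
    """Minimal time-to-live cache, similar in spirit to WordPress transients."""
    def __init__(self):
        self._store = {}

    def set(self, key, value, ttl_seconds):
        self._store[key] = (value, time.monotonic() + ttl_seconds)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # expired: the next request rebuilds the data
            return None
        return value

cache = TTLCache()
cache.set("dashboard", {"visits": 1234}, ttl_seconds=5 * 60)  # 5-minute window
cache.set("health_score", 94, ttl_seconds=15 * 60)            # 15-minute window
print(cache.get("dashboard"))  # {'visits': 1234}
```

This is why a settings change may take a few minutes to show up in the dashboard: the cached numbers are served until their window expires.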

Performance tip: If you’re experiencing timeouts, try reducing the Logs per page setting from 100 to 50, and use date filters to narrow your query range.

 

Alerts Not Working

Alerts are checked hourly and sent via email to the WordPress admin email address. If you’re not receiving alerts:

  1. Verify the alert is active: Check the Alerts tab and make sure your alert shows as enabled
  2. Check your email settings: Alerts are sent to the WordPress admin email (Settings → General → Administration Email Address)
  3. Check spam folders: Alert emails might be filtered as spam
  4. Verify WordPress can send email: Use a plugin like WP Mail SMTP to ensure your site can send emails
  5. Check cron is running: Alerts depend on WordPress cron. If your site has low traffic, cron may not run frequently

Available alert types

  • No Crawl Activity: Triggers when a specific bot (like Googlebot) hasn’t visited for X hours
  • High Error Rate: Triggers when error responses exceed your specified percentage threshold
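The two trigger conditions amount to simple threshold checks, sketched below with hypothetical names and signatures (the hourly cron check would evaluate predicates like these against recent log data):

```python
from datetime import datetime, timedelta

def no_crawl_alert(last_visit, now, threshold_hours):
    """True when the bot hasn't visited within the threshold window."""
    return last_visit is None or now - last_visit > timedelta(hours=threshold_hours)

def high_error_rate_alert(error_requests, total_requests, threshold_pct):
    """True when error responses exceed the configured percentage."""
    if total_requests == 0:
        return False
    return error_requests / total_requests * 100 > threshold_pct

now = datetime(2025, 6, 30, 12, 0)
print(no_crawl_alert(now - timedelta(hours=30), now, threshold_hours=24))  # True
print(high_error_rate_alert(12, 200, threshold_pct=5))                     # True (6% > 5%)
```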

 

Chart Display Issues

If charts aren’t displaying or showing incorrect data:

  • Clear your browser cache – Sometimes old JavaScript gets cached
  • Check browser console for errors – Press F12 in most browsers to open developer tools
  • Try a different browser – Rule out browser-specific issues
  • Disable browser extensions – Some ad blockers interfere with JavaScript
  • Ensure JavaScript is enabled – Charts require JavaScript to render

Understanding Bot Behavior Patterns

Sometimes the data is accurate but confusing. Here are common patterns and what they mean:

Sudden drops in crawl activity

  • Your site was temporarily unavailable
  • Changes to robots.txt blocking access
  • Server performance issues causing timeouts
  • Natural fluctuation in bot behavior (this is normal)

Mostly homepage crawls

  • Poor internal linking structure
  • New content not being discovered
  • Possible crawl budget issues on large sites

High number of 404 errors

  • Deleted content still being requested
  • Incorrect internal links
  • Old external links pointing to moved content

Lots of AI bot traffic

This is normal in 2025. AI crawlers now make up over 50% of bot traffic on many sites. Use the Bot Reference tab to understand each bot and decide which to allow or block via robots.txt.

When to Contact Support

Reach out to support if you experience:

  • Database errors appearing in the interface
  • Complete feature failure after following troubleshooting steps
  • Data that’s clearly incorrect (like timestamps from the future)
  • PHP errors or white screens when accessing the feature

When contacting support, please provide:

  • Your WordPress version
  • PHP version
  • Current retention settings
  • Screenshots of any error messages
  • Description of what you expected vs. what you’re seeing

 

Best Practices for Reliable Data

Start with default settings and adjust gradually. The default configuration works well for most sites.

 

  • Allow time for data collection: It takes at least a week to see meaningful patterns and a month for reliable trends
  • Focus on trends, not individual events: Bot behavior is naturally variable. Look for patterns over time
  • Review settings quarterly: Check retention settings and bot tracking preferences periodically
  • Use the Export feature: Save periodic reports for long-term trend analysis
  • Set up alerts: At minimum, create a “No Crawl” alert for Googlebot (24-48 hours)
  • Check the Bot Reference tab: Learn which bots to allow and which you can safely block

 

Remember: Crawler analysis is just one tool in your SEO toolkit. Use it alongside search rankings, traffic data, and user engagement for a complete picture.
