This Is What’s Really Hitting Your Website (Hint: Not People)

Source: Dev.to

I wanted to understand how much of our traffic was actually human, so I pulled and analyzed 48 hours of raw request logs. No filters, no analytics layer, just direct log data.

Start: 2026-03-31 10:00 UTC
End: 2026-04-02 10:00 UTC

All requests within that window were grouped by path patterns and behavior.

Requests were classified into four categories:

- WordPress probing (paths containing wp)
- XMLRPC access attempts
- PHP endpoint probing
- General scanning and enumeration

Roughly 79 percent of requests were not normal user activity.

Breakdown by category:

- WordPress probes: 34 percent
- XMLRPC attempts: 18 percent
- PHP probes: 27 percent
- Other scanning: 21 percent

Below is a subset of IPs with the highest request volume or repeated attack patterns during the window:

- 185.220.101.45: WordPress login brute force patterns
- 45.146.165.12: XMLRPC pingback attempts
- 103.248.70.33: PHP endpoint scanning
- 91.134.23.198: Multi-path probing (/admin, /login, /.env)
- 176.65.148.92: High-frequency requests consistent with botnet behavior
- 198.54.117.210: Credential stuffing attempts
- 5.188.62.76: Known scanner signature patterns
- 194.147.142.88: Repeated wp-login hits
- 212.83.150.120: PHPMyAdmin probing
- 139.59.37.12: Generic crawler with attack signatures

Many of these generated hundreds to thousands of requests over the 48-hour period.

Observed Attack Patterns

WordPress Probing

Even on non-WordPress systems, these paths were repeatedly hit:

- /wp-login.php
- /wp-content/plugins/

This is automated scanning, not targeted behavior.

For XMLRPC requests, common uses include pingback abuse and brute force via API endpoints.

Requests targeting common configuration and entry points:

- /config.php
- /db.php

These are looking for exposed configs or weak deployments.

Repeated requests to:

Often with high frequency and rotating IPs.

If you rely on standard analytics:

- Traffic volume may be inflated
- Engagement metrics may be misleading
- Infrastructure may be handling unnecessary load

More importantly, this traffic is constant. It is not tied to visibility or popularity. Any exposed service will receive it.

After seeing this across multiple systems, we started aggregating this data instead of treating each site in isolation:

- Track IPs across multiple deployments
- Classify behavior based on request patterns
- Identify repeat offenders
- Apply blocking rules based on shared observations

This evolved into a simple shared threat dataset.

Threat Network Concept

Instead of reacting per site:

- An IP flagged on one system is known to others
- Patterns such as WordPress probing or XMLRPC abuse are categorized
- Repeated behavior increases confidence in classification
- Blocking decisions become faster and more consistent

This reduces duplicate analysis and speeds up mitigation.

After applying filtering based on this data:

- Cleaner traffic metrics
- Reduced unnecessary requests
- Lower noise in logs
- Better visibility into actual users

Closing

The main takeaway from this dataset is straightforward. A large portion of inbound traffic to public web services is automated and not user-driven.

This data is from a limited 48-hour window across a small set of systems. Patterns may vary, but the presence of automated scanning is consistent.

If you are interested in testing this type of visibility or contributing additional data points, I am running a small beta around this approach.
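
For anyone who wants to try the same kind of path-based grouping on their own logs, here is a minimal sketch in Python. It is not the exact tooling behind the numbers in this post. The only rule taken directly from the article is "paths containing wp"; the other matching rules, the `SCAN_PREFIXES` list, the function names, and the sample rows (which use documentation-range IPs, not real offenders) are assumptions for illustration.

```python
from collections import Counter

# Hypothetical probe paths, based on examples mentioned in the post.
# On a site that really serves /login or /admin, these would need adjusting.
SCAN_PREFIXES = ("/admin", "/login", "/.env", "/phpmyadmin")


def classify(path: str) -> str:
    """Return a category label for one request path (illustrative rules)."""
    p = path.split("?", 1)[0].lower()  # drop the query string, normalize case
    if "xmlrpc" in p:
        return "xmlrpc"
    if "wp" in p:  # the post's rule: paths containing "wp"
        return "wordpress"
    if p.endswith(".php"):  # assumed rule for PHP endpoint probing
        return "php"
    if p.startswith(SCAN_PREFIXES):  # assumed rule for general scanning
        return "scanning"
    return "other"


def summarize(requests):
    """Tally requests per category and flagged requests per source IP."""
    by_category = Counter()
    flagged_by_ip = Counter()
    for ip, path in requests:
        category = classify(path)
        by_category[category] += 1
        if category != "other":
            flagged_by_ip[ip] += 1
    return by_category, flagged_by_ip


# Made-up sample rows: (source IP, request path).
sample = [
    ("203.0.113.10", "/wp-login.php"),
    ("203.0.113.10", "/wp-login.php"),
    ("203.0.113.11", "/xmlrpc.php"),
    ("198.51.100.5", "/config.php"),
    ("198.51.100.6", "/.env"),
    ("198.51.100.7", "/pricing?ref=home"),
]

by_category, flagged_by_ip = summarize(sample)
print(by_category)
print(flagged_by_ip.most_common(3))
```

The check order matters here: `/xmlrpc.php` and `/wp-login.php` both end in `.php`, so the more specific categories are tested first. The per-IP counter is the piece that would feed a "repeat offenders" list like the one above.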