TL;DR — I gave Claude 4.5 a Kali box and an intentionally vulnerable app. In 15 minutes it produced a report with 21 real vulnerabilities (SQLi, exposed .git, misconfigured cookies), but it missed obvious XSS and some business logic issues. In the post I walk through the setup, what worked, what didn’t, and where AI actually belongs in a security workflow — useful for early dev checks and teaching, but not a replacement for manual pen testing. Read the full write-up: https://lnkd.in/gY7MzupX #AIsecurity #PenTesting #Infosec #Claude45
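For anyone curious what those "quick win" findings look like in practice, here's a minimal sketch of two of the checks mentioned above (exposed .git and missing cookie flags). The target URL and function names are hypothetical, and this is not the methodology from the write-up, just an illustration using the requests library:

```python
# Minimal sketch of two quick checks mentioned in the post: an exposed .git
# directory and missing cookie security flags. TARGET is a hypothetical lab
# host; this is illustrative, not the tooling Claude actually used.
import requests

TARGET = "http://vulnerable.example"  # hypothetical target, replace with your own lab host


def check_exposed_git(base_url: str) -> None:
    """Probe for a publicly readable .git directory via .git/HEAD."""
    resp = requests.get(f"{base_url}/.git/HEAD", timeout=10)
    if resp.status_code == 200 and resp.text.startswith("ref:"):
        print("[!] .git directory appears to be exposed")
    else:
        print("[ok] .git/HEAD not readable")


def check_cookie_flags(base_url: str) -> None:
    """Report cookies set without Secure or HttpOnly attributes."""
    resp = requests.get(base_url, timeout=10)
    for cookie in resp.cookies:
        if not cookie.secure:
            print(f"[!] cookie {cookie.name} is missing the Secure flag")
        if not cookie.has_nonstandard_attr("HttpOnly"):
            print(f"[!] cookie {cookie.name} is missing the HttpOnly flag")


if __name__ == "__main__":
    check_exposed_git(TARGET)
    check_cookie_flags(TARGET)
```

Checks like these are exactly the repeatable, low-risk kind that an AI agent (or a plain script) can cover early, leaving the creative work to a human tester.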
I agree — AI is essential for fast, scalable detection and routine containment, but the safest and most effective security posture outside a lab is a hybrid model: automate what’s repeatable and low-risk, and keep humans in the loop for high-impact judgment, oversight, and novel threats.
Would be interesting to do the exact same test with Codex to see how it compares. But I've seen threat actors using HexStrike MCP to develop exploits for n-days, so things can definitely get wild.
Aaron Ott Neat write-up, thanks for sharing! For your custom vulnerable app, I'm curious about the languages/web framework used, total lines of code, general complexity, etc. Also, I'm curious about the true positive/false positive rates on the app, and how consistently it finds the same bugs on subsequent scans. Regardless, cool work! :)
Now try running it against a modern app, with a CAPTCHA on the login form, the whole app behind a Cloudflare WAF, and vulnerabilities inside authenticated areas. Codex or Claude will do nothing; they're only good on DVWA.
I also gave Claude 4.5 a vulnerable box. It told me it could not help me hack systems and did not answer my queries.
Gonna be honest, I first went "Ugh, another AI post," then read your blog and went "Hmm, let me give it a try." I like it and see a use for this. I especially like the fact that Claude asks for my permission before running commands. I'm allow-listing most commands for this specific directory, but it feels quite safe compared to most AI agents that decide to rewrite a whole project's code on a whim. I can at least see this being useful for standard checks during a pentest before the human gets creative.
Interesting results. I've been playing with similar ideas recently, and there's some really cool capability to unlock here. CyberAgent and CAI are examples on GitHub of how others have been implementing this idea.
The "intentionally vulnerable web app" was 100% custom made or DVWA/similar?
Is there a possibility that Claude referenced existing solutions or write-ups from the internet on DVWA?
Hey Aaron, good one! I'd be interested in your take on our open-source Cybersecurity AI (CAI) https://github.com/aliasrobotics/cai and the supporting cybersecurity LLM "alias1" https://aliasrobotics.com/alias1.php, which is an alternative to Anthropic's Claude. You'll get the same results for a fraction of the cost and without so many refusals. Let me know if you'd like to try it out; happy to facilitate it.