autonomous AI security agents that audit your codebase, prove exploitable vulnerabilities, and deliver fixes your team can ship.
winfunc.com
San Francisco
Joined December 2023
We discovered the same vulnerability too. :)
And @winfunction discovered 4 more RCE primitives in NGINX, soon to be publicly disclosed.
Anywho, we're hiring security researchers with a knack for taming LLMs.
If you're interested in novel vulnerability research and autonomous exploitation with language models, DM me and I'll send you a fun CTF to solve. :)
Introducing nginx-poolslip, a fresh RCE for the latest nginx release 1.31.0.
nginx-rift has been patched, but our security agent Vega has found a new 0-day.
We will release the full technical writeup with ASLR bypass 30 days after the patch on nebusec.ai.
We're doing an experiment with open models @winfunction to see how far we can push them to find vulns in hardened targets. So far:
- $4.5K in bounties from Chrome VRP with a few more pending, with the scans costing less than $100.
- 2 CVEs in NGINX (CVE-2026-28755 & CVE-2026-42926). And watch out for the next release!
- And 60ca500faea0fc70816bb9c53af3815e2af3e6c962b4b4ea63c33c62ebb4240d 👀
We're writing a blog on this soon.
During our YC (@ycombinator S24) batch, we had the awesome opportunity to meet @paulg and talk about what we're building: An autonomous AI hacker.
To showcase a fun demo, I remember opening my laptop in the Uber to his home and challenging our agents to find vulnerabilities in the old HackerNews codebase written in Arc.
For those unfamiliar, Arc is a programming language designed by PG and Robert Morris. And the old HN codebase is written in Arc.
We only got to talk about it with him, but we just redid the experiment with our improved harness for fun!
And we wrote a blog about it: winfunc.com/research/hacki…
Vulnerability benchmarks rot. Cases leak into training data, scores measure memorization.
We built N-Day-Bench: it tests LLMs on finding real vulnerabilities in real repos, refreshed monthly from live GitHub advisories. Blinded judging. All traces public.
Interestingly, the latest model from @Zai_org, GLM-5.1, performs really well!
Link: ndaybench.winfunc.com
Currently testing GPT-5.4, Claude Opus 4.6, Gemini 3.1 Pro, GLM-5.1, and Kimi K2.5.
Every run publishes the full audit trail — shell commands, judge rationale, curator answer key, sandbox history. If a score looks wrong, you can trace it to a specific shell session on a specific line of code.
Results: ndaybench.winfunc.com
How it works: each month the benchmark pulls fresh cases from GitHub security advisories, checks out the repo at the last commit before the patch, and drops models into a sandboxed read-only shell (h/t just-bash by @cramforce).
The model never sees the fix. It starts from sink hints and has to trace the bug through actual code.
Only repos with 10k+ stars qualify. A diversity pass prevents any single repo from dominating the set. Ambiguous advisories (merge commits, multi-repo references, unresolvable refs) are dropped.
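The selection rules above can be sketched roughly like this. This is a minimal illustration, not N-Day-Bench's actual code: the `Advisory` fields, the `MAX_PER_REPO` cap, and every function name are assumptions made up for the example. Only the 10k-star threshold, the ambiguity criteria, and the "last commit before the patch" checkout come from the posts.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Advisory:
    # Hypothetical schema; the real benchmark's data model is not shown in the thread.
    repo: str
    stars: int
    patch_commit: Optional[str]   # None when the fixing ref cannot be resolved
    is_merge_commit: bool = False
    extra_repos: int = 0          # other repos referenced by the same advisory


MIN_STARS = 10_000   # "10k+ stars" from the post
MAX_PER_REPO = 2     # assumed cap; the post only says no single repo may dominate


def is_ambiguous(adv: Advisory) -> bool:
    """Merge commits, multi-repo references, and unresolvable refs are dropped."""
    return adv.patch_commit is None or adv.is_merge_commit or adv.extra_repos > 0


def select_cases(advisories: List[Advisory]) -> List[Advisory]:
    """Apply the star filter, the ambiguity filter, and a simple diversity cap."""
    picked: List[Advisory] = []
    per_repo: Counter = Counter()
    for adv in advisories:
        if adv.stars < MIN_STARS or is_ambiguous(adv):
            continue
        if per_repo[adv.repo] >= MAX_PER_REPO:
            continue
        per_repo[adv.repo] += 1
        picked.append(adv)
    return picked


def checkout_ref(adv: Advisory) -> str:
    """Git revision for the parent of the patch commit (the pre-patch state)."""
    if adv.patch_commit is None:
        raise ValueError("advisory has no resolvable patch commit")
    return adv.patch_commit + "^"
```

A per-repo cap is only one way to do a diversity pass; the real benchmark may weight or sample repos differently.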
Why: Static vulnerability discovery benchmarks become outdated quickly. Cases leak into training data, and scores start measuring memorization. The monthly refresh keeps the test set ahead of contamination — or at least makes the contamination window honest.
New CVE in NGINX - CVE-2026-28755
NGINX stream module allows TLS handshake to succeed with revoked client certificates when ssl_ocsp on is configured.
This vulnerability was autonomously discovered by Winfunc's AI agent.
Read the write-up here: winfunc.com/findings/CVE-2…
The Recent CVEs in React and Node.js Were Found by an AI - winfunc.com/blog/recent-0-…
In December 2025 and January 2026, an AI system autonomously discovered zero-day vulnerabilities in Node.js and React, two of the most widely deployed JavaScript runtimes and frameworks in the world.
This post documents how these vulnerabilities were found, the technical details of the flaws, and what this means for the future of security research.
New blog post: The Recent 0-Days in Node.js and React Were Found by an AI
Covering the discovery of 0-days with AI, its implications, and "AI slop". Have a read.
winfunc.com/blog/recent-0-…
A new vulnerability in React Server Components (CVE-2026-23864) was disclosed today.
One of the DoS vectors was discovered by me with the help of an AI agent @winfunction.
Other vectors were also discovered by @ryotkak et al.
All users should upgrade to a patched version as soon as possible.
vercel.com/changelog/summ…
🚨 CVE-2026-21636 in Node.js (@nodejs)
Node.js permission model bypass via unchecked Unix Domain Socket connections (UDS)
This vulnerability was autonomously discovered by winfunc.com, an AI agent that can find, exploit, and patch security vulnerabilities in codebases.
Thanks to @_rafaelgss for triaging and fixing the issue.