top of page
Search

Conquering the Evil-GPT v2 Challenge on TryHackMe

  • viviangoshashy0
  • Oct 1
  • 2 min read

Artificial Intelligence is becoming more than a tool and it’s becoming an attack surface. Recently, I tackled the Evil-GPT v2 challenge on TryHackMe, and it was unlike any other cybersecurity lab I’ve done. This challenge combined elements of traditional penetration testing with adversarial AI red-teaming, forcing me to think not just like a hacker, but like a manipulator of machine intelligence.


The Setup

The premise: a malicious AI (Evil-GPT v2) has emerged, capable of manipulating systems beyond simple hacking. To interact with the machine, I had to:


  • Boot the target machine (roughly a 5–6 minute startup time).

  • Connect via a VPN/AttackBox.

  • Navigate to the machine’s IP through a web browser.


Once inside, the real challenge began understanding how the AI could be tricked, redirected, or defended against.


Key Lessons Learned

1. Prompt Injection Is Real

Just like SQL injection for databases, AI can be manipulated through malicious instructions hidden in plain text. Evil-GPT v2 simulated this perfectly, showing how attackers can override safety protocols by embedding commands in user inputs.


2. Defense Requires Layered Thinking

Blocking malicious prompts isn’t enough. You need:

  • Input validation – filter unexpected or dangerous patterns.

  • Output monitoring – verify that model responses stay within expected parameters.

  • Logging and alerts – catch suspicious activity before it spreads.


3. Red-Team Creativity Fuels Blue-Team Strength

The best way to defend against adversarial AI is to think like the attacker. Evil-GPT v2 pushed me to:

  • Reverse-engineer manipulative prompts.

  • Create detection strategies for chained manipulations.

  • Practice containment and recovery if the system was compromised.


Why This Challenge Matters

AI is rapidly being integrated into critical systems, chatbots, financial platforms, healthcare, and more. As this happens, the risks expand. Attackers won’t just exploit ports or weak passwords; they’ll exploit language. Exercises like Evil-GPT v2 prepare us for this future by blending cybersecurity fundamentals with AI-specific threats.


Final Thoughts

Conquering Evil-GPT v2 wasn’t just another box checked off on TryHackMe, it was a glimpse into the future of cybersecurity. Defenders will need to understand not just networks and systems, but also how machine learning models can be bent, broken, or abused.


To anyone in cybersecurity, AI safety, or red/blue teaming: I highly recommend this challenge. It sharpens technical skills, strengthens defensive strategies, and opens your eyes to a whole new battleground.


Have you tried Evil-GPT v2 yet? What was your biggest takeaway? Let’s start the conversation.


 
 
 

Recent Posts

See All

Comments


  • LinkedIn
  • GitHub

©2025 Vivian J. Goshashy. Proudly created with Wix.com

bottom of page