Trickshot: The Problem With Multiple Solutions

The trick of being a CTF challenge author is making a problem that is fun to solve while presenting the appropriate level of challenge…

But the next level of challenge is making a challenge that is accessible and entertaining for spectators, which is what we try to do with LiveCTF by commentating challenges in a sportscast-style livestream format and showcasing different solutions.

As a part of the DEF CON qualifier CTF this year, LiveCTF ran 6 challenges that were all tests of speed, where all 200+ teams raced to solve different kinds of problems in order to score points for their team in the main competition.

My goal was to write a challenge that would be fun for competitors, allow the organizers to talk about and compare different strategies, and also naturally lend itself to making helpful and interesting visuals for the audience!

A tall order, and I think we pulled it off… but let’s start at the beginning.

The Meta-Challenge #

This all started from the idea of having a challenge where we could visually see the efforts of each competitor, which would allow us to look across the field and compare and contrast approaches.

My initial inclination was to write a fuzzing challenge, but a different and more interesting idea emerged while working on that.

Inspired by a previous DEF CON finals challenge ( pinboooll), the basic idea was a reversing challenge where players supply an input and score points by executing different parts of code, and this is what became trickshot.

Screenshot of Pinboooll: squares are basic blocks that get executed

The difficulty of many CTF challenges comes from either figuring out the goal (“strategy”), or figuring out exactly how to accomplish that goal fast enough (“execution”).

For trickshot, I wanted there to be multiple options for both the strategy AND execution, while not being too hard since we had a hard limit of four hours for LiveCTF challenges during the DEF CON qualifiers.

Since the format is a race to see who can solve first, I wanted a big part of this challenge to be the decision competitors would make regarding which would be the fastest combination of strategy and execution.

All Roads Lead to Pwn #

trickshot is a reversing problem with two components: a binary and a small Python wrapper script, and the wrapper script makes the overall goal obvious: score points above a certain threshold (640,000) to pop a shell and get the flag.

Strings in the script and binary make it clear that players can both score points and gain score multipliers, and the Python script has a condition that if you collect a certain number of score multipliers, you get a mega-boost of points (600,000) that makes getting over the threshold easy.

The binary itself is not obfuscated, and it also prints out both points scored and score multipliers to stdout, so players can easily see how scoring works.

The basic structure of the binary is that each chunk of input is parsed, and if particular inputs are found, different code will execute and yield some combination of points and multipliers. So players have to figure out what code they want to reach in order to get their score above the threshold.

Control Flow Graph of the main function of trickshot with points & multiplier blocks highlighted

The screenshot above shows the control flow graph of the main function, with the basic blocks that indicate scoring points highlighted in blue, and blocks with multipliers in red.

Once players get to this basic understanding of the problem, my hope was that they’d start thinking about different ways to get enough points: whether they needed to get the mega-boost or if they could reach the winning score just with multipliers. Of course, I deliberately tuned the math so it would be possible to win without getting all the multipliers, and wanted to reward potential shortcuts in a speed problem ;)

Regardless of the strategy the players chose, the challenge is that they still have to pull it off faster than their competitors!

Reversing Tools & Techniques #

In order to reach the appropriate score multipliers, competitors had to craft an input that would reach the corresponding code. The two general strategies to do this quickly are: 1) manually generate an input as they reverse-engineer the logic or 2) identify where they wanted to reach and use a symbolic execution tool like angr to generate an input that reaches the specified blocks.

Due to the way the competition was set up, we didn’t actually get to watch teams write their solution scripts, but my assumption was that the structure of a team’s solution would give a strong indication of their process.

The idea here being that manual reversing would lead to an input being built as a series of steps that mirror the challenge structure, whereas solver-based solutions would look like one big string that gets sent to the target. And this is exactly what we saw!

A side-by-side comparison of a manual vs automated solution script

Since we don’t have the script the competitors used to create the input used in the screenshot on the right, we can’t say for sure how they generated the input… but we can still look at a combination of time, script, and execution parameters to compare different teams and see if a dominant strategy emerged.

P.S. At the time of writing I could only find manual reversing writeups. If you find a solution that talks about how somebody used symbolic execution to solve trickshot, please let me know!

Seeing Solutions #

While we can compare scripts for any of the LiveCTF quals challenges, one of the special features of trickshot is that we can use a block coverage tool like bncov or lighthouse to see what code was exercised by each solution and see each team’s point-scoring strategy.

Composite of three images showing different block coverage for each of the three top teams — Comparing the top three teams’ block coverage, it’s easy to see the differences by what they blocks they didn’t execute

Let’s drill down on the top three fastest teams, since they all solved the problem in under 40 minutes and each did things a little differently.

RePokemonedCollections’ Solution #

Starting with RePokemonedCollections, the team that solved the challenge first in about 32 minutes, we can run their input through the binary and observe their strategy to score enough points based on what is printed to stdout:

Setting you up for a trickshot...
BONUS MULT x2: You really connected!
BONUS MULT x2: threading the needle!
+1600: weaving through hoops
+1600: weaving through hoops
+1600: weaving through hoops
...snip...
+1600: weaving through hoops
+1600: weaving through hoops
+1600: weaving through hoops
BONUS MULT x2: through the roof!
+1600: tubular bounce
Great job! Your shot went 830 bytes long and bounced 82 times.

===== TRICKSHOT COMPLETE =====

Initial score: 128000
You found 3 score multipliers!
Total multiplier: 8x
Final score: 1_024_000

The output shows that they got 3 multipliers and racked up normal points without trying for the mega-boost, and looking at their script, it looks like they took the manual reversing approach of figuring out how to trigger just the multipliers they needed.

A screenshot of the RePokemonedCollections solution script

The multipliers they went for is also apparent in the block coverage picture, where we can see that there’s three clusters of uncovered (grey) blocks, indicating the two multipliers and one scoring block that they didn’t try to reach.

A screenshot of Binary Ninja showing block coverage from the RePokemonedCollections input

Overall, it’s really impressive they were able to reverse this problem and get to an answer in about a half hour!

RedHazzarTeam’s Solution #

Looking at the second-fastest team, RedHazzarTeam with a 38-minute solve, we see they likely used a solver to generate the input they used to trigger the code they wanted (based off the single really long string that goes off the screen to the right).

A screenshot of the RedHazzarTeam solution script

And if we look at the output and block coverage, we can see they also took the strategy of getting just three score multipliers and a large amount of points. In this aspect, they only differed from the first team on their choice of one multiplier.

Setting you up for a trickshot...
BONUS MULT x2: You really connected!
BONUS MULT x2: threading the needle!
+1600: weaving through hoops
+1400: weaving through hoops
...snip...
+1400: weaving through hoops
+1400: weaving through hoops
BONUS MULT x2: Mad skills!
+206: mathematical bounce
Great job! Your shot went 1020 bytes long and bounced 101 times.

===== TRICKSHOT COMPLETE =====

Initial score: 137606
You found 3 score multipliers!
Total multiplier: 8x
Final score: 1_100_848

A screenshot of Binary Ninja showing block coverage from RedHazzarTeam’s input

MMM’s Solution #

Rounding out the top three, Maple Mallard Magistrates finished in about 40 minutes, appearing to use a manual reversing approach and showcasing the strategy of just getting all the multipliers and not worrying about trying to score many points otherwise.

Setting you up for a trickshot...
BONUS MULT x2: You really connected!
BONUS MULT x2: threading the needle!
+1600: weaving through hoops
BONUS MULT x2: through the roof!
+1600: tubular bounce
BONUS MULT x2: Mad skills!
+301: mathematical bounce
BONUS MULT x2: Wiggle bonus!
+280: tiny bounces
Great job! Your shot went 60 bytes long and bounced 5 times.

===== TRICKSHOT COMPLETE =====

Initial score: 3781
You found 5 score multipliers!
Total multiplier: 32x
You achieved the multi-bonus bonus!!!!!!
Final score: 720_992

A screenshot of Binary Ninja showing block coverage from the MMM input

Looking at their coverage, we can see that the only blocks they didn’t bother with were one set that only yielded a modest number of points and no multiplier, and so generally wasn’t worth going after.

Other Notables #

Speaking of blocks not worth going after, if we overlay all of the 14 fastest teams’ block coverage into an inverse heat-map with bncov (red is rare, blue is common, purple is in-between), we can both see the trends like which multipliers almost every solving team used.

But we can also see that not one but two teams ended up going after that group of blocks that weren’t really worth that many points:

A screenshot of Binary Ninja showing block coverage from the aggregate of the top 14 teams

Yep, Norsecode 24 and undef1ned both getting after it and perhaps doing a little more work than needed… interesting to note that while they both maximized coverage, Norsecode appeared to do so using a solver-based approach while undef1ned’s script appeared to be manual.

A screenshot of Binary Ninja showing block coverage from Norsecode’s solution

“A” for effort, and they both ended up solving with time to spare.

But what team scored the highest number of raw points on the trickshot binary?

Setting you up for a trickshot...
BONUS MULT x2: You really connected!
BONUS MULT x2: threading the needle!
+1600: weaving through hoops
+1600: weaving through hoops
...snip...
+1600: weaving through hoops
+1600: weaving through hoops
BONUS MULT x5: Mad skills!
+310: mathematical bounce
Great job! Your shot went 1020 bytes long and bounced 101 times.

===== TRICKSHOT COMPLETE =====

Initial score: 157110
You found 3 score multipliers!
Total multiplier: 20x
Final score: 3_142_200

It’s KerKerYuan! Earning the “Moar Points, Please” trophy that I just made up by achieving a truly overkill score of 3,142,200 points using only three multipliers. This was possible because they were the only team that discovered how to crank one of the multipliers up to 5x (others only achieved 4x).

Thank you for finding that trick, and great work!

Getting an Edge a Game of Minutes #

Looking at the teams that solved the challenge in the time limit, we saw a variety of solutions in terms of strategy and execution. Exactly the result we hoped for.

So some teams went manual, some used solvers, and different teams picked different bonuses… but was there a trend?

Looking at the 6 teams that finished in under an hour:

4 of 6 went for the mega-bonus by getting all 5 multipliers, while 2 of 6 only used 3 multipliers
2 of 6 appeared to use a solver to craft an input, and 4 of 6 appeared to use manual approaches

Given how close the teams were and how few data points there are, it doesn’t look like there’s really a wrong way to go…

But the two fastest teams were the only two teams that scored enough points using only three multipliers.

It takes time to locate each multiplier and determine if it is viable, even when using tools like symbolic execution, so I believe that going for a smaller number of multipliers gave them a slight edge in terms of speed.

As the challenge author, I’m glad that it turned out this way, because it rewarded the teams that chose to think laterally and tried to find a faster way, even when under a time crunch.

Speed CTF and the Future #

Looking at the other challenges, trickshot ended up being one of the easier challenges in terms of how many teams solved it and how quickly.

But it fully accomplished my goal of writing a challenge that would be interesting: showcasing different strategies, different tools, and even using interesting visuals that could help the audience. Mission accomplished!

Missed out on the action? You can still re-watch the commentary livestream for trickshot and the rest of the challenges, or you can check out the problems on Github.

For more LiveCTF action, follow us on YouTube or Twitter to stay updated for the upcoming DEF CON finals.

Otherwise, if you enjoyed this post please let me know by way of a like, follow, or DM on Mastodon or Twitter.

Until next time, CTF fans!