With more time spent at home due to COVID-19 restrictions, I worked on an MI5 cryptography challenge called Can You Solve This Puzzle. The premise: given nothing but an image (below), can we identify the hidden message?

With image-based challenges, the solution usually revolves around finding the right steganography technique, such as least-significant-bit (LSB) embedding. To look for hints, I started with some reconnaissance on the image's metadata using exiftool.

~$ exiftool puzzle.png 2>&1 | tee puzzle.exif

ExifTool Version Number         : 10.80
File Name                       : puzzle.png
Directory                       : mi5
File Size                       : 1497 bytes
...
Pixels Per Unit Y               : 3779
Pixel Units                     : meters
Comment                         : As I read, numbers I see. 'Twould be a shame not to count this art among the great texts of our time
Image Size                      : 92x163
Megapixels                      : 0.015

This turned out to be very lucky, because the Comment field gives us a clue:

As I read, numbers I see. 'Twould be a shame not to count this art among the great texts of our time
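The same kind of comment can also be read from Python, since Pillow exposes PNG text chunks through the `info` dictionary of an opened image. The following is a minimal sketch rather than what I actually ran: it assumes the comment lives in a text chunk keyed `Comment` (which is what the exiftool label suggests), and `make_sample_png` and `read_png_text` are hypothetical helpers that work on a tiny stand-in image instead of the real puzzle.

```python
import io
from typing import Dict

from PIL import Image, PngImagePlugin


def make_sample_png(comment: str) -> bytes:
    """Build a tiny stand-in PNG in memory with a text chunk keyed 'Comment'."""
    meta = PngImagePlugin.PngInfo()
    meta.add_text("Comment", comment)
    buffer = io.BytesIO()
    Image.new("RGB", (2, 2), (255, 192, 203)).save(
        buffer, format="PNG", pnginfo=meta
    )
    return buffer.getvalue()


def read_png_text(data: bytes) -> Dict[str, str]:
    """Return the string-valued metadata entries Pillow found in the image."""
    with Image.open(io.BytesIO(data)) as img:
        return {
            key: value
            for key, value in img.info.items()
            if isinstance(value, str)
        }


sample = make_sample_png("numbers I see")
print(read_png_text(sample)["Comment"])
```

For a file on disk, `Image.open("puzzle.png").info` should give the same dictionary, assuming the metadata is stored the way this sketch expects.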

Sadly I hit a dead end for a while as I looked into its color spectrum, pixel distributions, negatives, decoder scans, etc. Having already tried the common decoding techniques to no avail, I decided to change my strategy and start observing the physical characteristics of the puzzle itself.

  • Striped patterns stretching from top to bottom, left to right. This gave me the idea that the encoding was somehow binary.

  • No (pixel) row is ever filled with a single color. This was the strangest characteristic: every row contains at least one switch between the two colors. If no row is ever completely filled with one color, does that mean there is some limit on the length of a contiguous run?

  • Only “two” color variations. The puzzle appears to use just pink and blue.

I revisited the clue, read it more carefully, and noticed the puzzle authors' strange choice of words: numbers, count, and text. Piecing this together with the earlier observations, I realized they wanted us to count the contiguous runs of same-colored pixels, and that the counts would very likely map to ASCII characters.
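To make the counting idea concrete, here is a small sketch on synthetic data. It assumes the run lengths spell out hex byte pairs separated by dashes (the 'XX-YY-ZZ' format my decoder ends up seeing); `encode` is a hypothetical inverse written only to produce test input, not something taken from the puzzle.

```python
from itertools import groupby
from typing import Iterable, List


def run_lengths(bits: Iterable[int]) -> List[int]:
    """Length of each run of identical consecutive values."""
    return [sum(1 for _ in group) for _, group in groupby(bits)]


def encode(message: str) -> List[int]:
    """Hypothetical encoder: alternating runs whose lengths spell hex text."""
    text = "-".join(f"{byte:02x}" for byte in message.encode("ascii"))
    bits: List[int] = []
    for index, char in enumerate(text):
        # Alternate between 0 and 1 so that every character is its own run.
        bits.extend([index % 2] * ord(char))
    return bits


def decode(bits: Iterable[int]) -> str:
    """Turn run lengths into characters, then hex pairs into ASCII."""
    text = "".join(chr(count) for count in run_lengths(bits))
    return "".join(
        bytes.fromhex(part).decode("ascii") for part in text.split("-")
    )


bits = encode("MI5")
print(run_lengths(bits)[:5])  # [52, 100, 45, 52, 57], i.e. "4d-49"
print(decode(bits))           # MI5
```

Each run length is the character code of one symbol in the hex text, so runs of 52 and 100 pixels read as "4d", which is the byte for "M".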

I used pandas and itertools to offload some of the implementation logic, but the idea is very simple: load the puzzle into memory and parse out its pixels, drop the alpha channel since we only care about RGB, calculate the “median color” so that the solution is more robust to gradients, group the normalized pixels into run counts, and finally decode.

from PIL import Image
from typing import (
    List,
    Text,
    Tuple,
)

import itertools
import pandas

def solve(filename: Text) -> None:
    """Solve MI5 image puzzle.

    About
    =====
    The MI5 puzzle is composed of blue and pink pixels, which can be
    treated as countable segments of pixels. Once we aggregate runs of
    the same pixel color, we should get a sequence of numbers, which
    can be decoded as ASCII.
    """
    # Convert MI5 puzzle into pixels.
    px: pandas.DataFrame = pandas.DataFrame(
        tuple(Image.open(filename).convert("RGBA").getdata()),
        columns=("r", "g", "b", "a"),
    ).drop(columns=["a"])

    # Even though there looks to be only two pixel color variations
    # (blue and pink), this solution is flexible enough to handle
    # gradients. For example, if there are 'slightly' blue and
    # 'slightly' pink, this approach is much better since we're
    # finding the median color ("the partition") and then fuzzy
    # converting the gradients.
    # `.values` yields a NumPy array of RGB rows, not a DataFrame.
    shades = px.sort_values(by=["r", "g", "b"]).values
    median: Tuple[int, ...] = tuple(shades[len(shades)//2])
    fuzzy_normalized: List[int] = [
        int(rgb < median)  # normalize truth values to integers
        for rgb in px.itertuples(index=False)
    ]

    # Group the binary list (`fuzzy_normalized`) by 0s and 1s.
    # This will return hidden message in 'XX-YY-ZZ-..' format.
    hidden_message: Text = "".join(map(chr, (
        sum(1 for _ in group)
        for _, group in itertools.groupby(fuzzy_normalized)
    )))

    # Decode each hex component of the message as ASCII.
    for hd in hidden_message.split("-"):
        try:
            print(bytes.fromhex(hd).decode("ASCII"), end="")
        except ValueError:
            pass  # skip components that are not valid hex or ASCII


if __name__ == "__main__":
    solve(filename="puzzle.png")
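The median partition used in `solve` is easier to see on a handful of made-up pixels. This is a minimal sketch in plain Python, assuming the same lexicographic comparison of RGB tuples as the code above; `binarize` and the sample colors are hypothetical and only illustrate the step.

```python
from typing import List, Tuple

RGB = Tuple[int, int, int]


def binarize(pixels: List[RGB]) -> List[int]:
    """Map each pixel to 1 if it sorts strictly below the median color, else 0."""
    ordered = sorted(pixels)  # lexicographic order on (r, g, b)
    median = ordered[len(ordered) // 2]
    return [int(rgb < median) for rgb in pixels]


# Two slightly different pinks and two slightly different blues (made up).
pixels: List[RGB] = (
    [(250, 100, 200)] * 3
    + [(10, 20, 240)] * 2
    + [(12, 22, 238)]
    + [(248, 102, 198)] * 2
)
print(binarize(pixels))  # [0, 0, 0, 1, 1, 1, 0, 0]
```

Both blues collapse to 1 and both pinks to 0, so slight gradients do not split a run. Note the strict `<`: if more than half of the pixels share the smallest color under this ordering, the median is that color and every pixel maps to 0, so the trick relies on the median landing in the other cluster.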

Once we run `solve` against the image, we get the following message:

Congratulations, you solved the puzzle! Why dont you apply to join our team? mi5.gov.uk/careers