With more time spent at home due to COVID-19 restrictions, I worked on an MI5 cryptography challenge called Can You Solve This Puzzle. The premise: given an image (below), can we identify the hidden message?

With image-based challenges, the solution usually revolves around finding the right steganography technique(s), such as LSB obfuscation. So to get a hint, I poked into the image's metadata with exiftool for some reconnaissance:

~$ exiftool puzzle.png 2>&1 | tee puzzle.exif

ExifTool Version Number         : 10.80
File Name                       : puzzle.png
Directory                       : mi5
File Size                       : 1497 bytes
...
File Type                       : PNG
File Type Extension             : png
MIME Type                       : image/png
Image Width                     : 92
Image Height                    : 163
Bit Depth                       : 8
Color Type                      : RGB with Alpha
Compression                     : Deflate/Inflate
Filter                          : Adaptive
Interlace                       : Noninterlaced
SRGB Rendering                  : Perceptual
Gamma                           : 2.2
Pixels Per Unit X               : 3779
Pixels Per Unit Y               : 3779
Pixel Units                     : meters
Comment                         : As I read, numbers I see. 'Twould be a shame not to count this art among the great texts of our time
Image Size                      : 92x163
Megapixels                      : 0.015

This turned out to be very lucky, because the Comment field gives us a clue:

As I read, numbers I see. 'Twould be a shame not to count this art among the great texts of our time

Sadly I hit a dead end for a while as I looked into its color spectrum, pixel distributions, negatives, decoder scans, etc. Having already tried the common decoding techniques to no avail, I decided to change my strategy and start observing the physical characteristics of the puzzle itself.

  • Striped patterns stretching from top to bottom, left to right. This gave me the idea that the puzzle behaves like a sequence of boolean values (True and False).

  • No (pixel) row is ever filled with a single color. This was the strangest characteristic: in every row there is a switch between two colors. If no row is ever completely filled with one consistent color, does this mean that there is some sort of limit to the length of a contiguous run?

  • Only “two” color variations. Although this PNG file is composed with Alpha channel gradients (there are actually 284 distinct colors, provided below), we can make the general assumption that every pixel is either pink or blue.
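The second and third observations can be checked programmatically. The following is a minimal sketch on a made-up grid of pixels, not the real image; `summarize`, `PINK`, `BLUE`, and `grid` are hypothetical names introduced for illustration. For the actual puzzle, the rows would have to be built from the image's pixel data first.

```python
from collections import Counter
from typing import List, Tuple

Pixel = Tuple[int, int, int]


def summarize(rows: List[List[Pixel]]) -> Tuple[int, int]:
    """Return (number of distinct colors, number of single-color rows)."""
    colors = Counter(pixel for row in rows for pixel in row)
    uniform_rows = sum(1 for row in rows if len(set(row)) == 1)
    return len(colors), uniform_rows


# Hypothetical 3x4 grid standing in for the real image data.
PINK, BLUE = (255, 105, 180), (0, 0, 255)
grid = [
    [PINK, PINK, BLUE, BLUE],
    [BLUE, BLUE, BLUE, PINK],
    [PINK, BLUE, BLUE, BLUE],
]

distinct, uniform = summarize(grid)
print(distinct, uniform)  # 2 distinct colors, 0 single-color rows
```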

I revisited the clue and read it more carefully, and I noticed that the puzzle authors had made some strange word choices: “numbers”, “count”, and “text”.

numbers, count, and text..

Ah! Piecing together the observations from earlier, I realized that they wanted us to count the contiguous runs of same-colored pixels, and that those counts would very likely map to ASCII-like characters (ASCII codes are 7-bit values, so each one fits comfortably in a single byte).
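To make the idea concrete, here is a small sketch of counting runs and reading each count as a character code. The stream below is invented for illustration and `run_lengths` is a hypothetical helper; only `itertools.groupby` and `chr` are standard.

```python
from itertools import groupby
from typing import Iterable, List


def run_lengths(values: Iterable[int]) -> List[int]:
    """Length of each run of consecutive equal values."""
    return [sum(1 for _ in group) for _, group in groupby(values)]


# Made-up stream: 72 zeros followed by 105 ones.
stream = [0] * 72 + [1] * 105
counts = run_lengths(stream)
text = "".join(chr(n) for n in counts)
print(counts, text)  # [72, 105] Hi
```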

I used pandas and itertools to offload some of the implementation logic, but the idea is very simple: load puzzle.png into memory and extract its pixels, drop the alpha channel since we only care about RGB, calculate the “median color” so that the solution is more robust against gradients, group the colors into run counts, and finally decode the counts into ASCII.

from PIL import Image
from typing import (
    List,
    Text,
    Tuple,
)

import itertools
import pandas

def solve(filename: Text) -> None:
    """Solve MI5 image puzzle.

    About
    =====
    The MI5 puzzle is composed of blue and pink pixels, which can be
    treated as countable segments of pixels. Once we aggregate runs of
    the same pixel color, we should get a sequence of numbers which
    can be decoded as ASCII.
    """
    # Convert MI5 puzzle into pixels.
    pixels: pandas.DataFrame = pandas.DataFrame(
        tuple(Image.open(filename).convert("RGBA").getdata()),
        columns=("r", "g", "b", "a"),
    ).drop(columns=["a"])

    # Even though there looks to be only two pixel color variations
    # (blue and pink), this solution is flexible enough to handle
    # gradients. For example, if there are 'slightly' blue and
    # 'slightly' pink, this approach is much better since we're
    # finding the median color ("the partition") and then fuzzy
    # converting the gradients.
    shades = pixels.sort_values(by=["r", "g", "b"]).values  # sorted RGB rows (NumPy array)
    median: Tuple[int, ...] = tuple(shades[len(shades)//2])
    fuzzy_normalized: List[int] = [
        int(rgb < median)  # normalize truth values to integers
        for rgb in pixels.itertuples(index=False)
    ]

    # Group the binary list (`fuzzy_normalized`) by 0s and 1s.
    # This will return hidden message in 'XX-YY-ZZ-..' format.
    hidden_message: Text = "".join(map(chr, (
        sum(1 for _ in group)
        for _, group in itertools.groupby(fuzzy_normalized)
    )))

    # Decode each hex component of the message as ASCII,
    # skipping any component that is not valid hex or ASCII.
    for hd in hidden_message.split("-"):
        try:
            print(bytes.fromhex(hd).decode("ASCII"), end="")
        except ValueError:
            pass


if __name__ == "__main__":
    solve(filename="puzzle.png")
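The median split used in the solution can be tried in isolation. This is a stdlib-only sketch on invented shades, with `split_by_median` and `sample` as hypothetical names; it mirrors the comparison in `solve` but is not a drop-in replacement for it.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]


def split_by_median(pixels: List[Pixel]) -> List[int]:
    """Label each pixel 1 if it sorts below the median color, else 0."""
    median = sorted(pixels)[len(pixels) // 2]
    return [int(p < median) for p in pixels]


# Made-up shades: two bluish and three pinkish pixels.
sample = [
    (250, 100, 180),
    (10, 10, 240),
    (255, 110, 185),
    (12, 8, 238),
    (252, 104, 182),
]
print(split_by_median(sample))  # [0, 1, 0, 1, 0]
```

Note that the comparison is a strict `<`, so the median pixel itself lands in the 0 group, and the split only separates the two color families cleanly when the median falls at the boundary between them.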

And once we run the decoder against the image, we get the following message.

Congratulations, you solved the puzzle! Why dont you apply to join our team? mi5.gov.uk/careers

Nice :)
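For anyone who wants to sanity-check the decoding logic without the image, a round trip on synthetic data is one option. The `encode` function below is purely my guess at how such a stream could be built (hex pairs joined by "-", one run of pixels per character, lowercase hex assumed); it is not the puzzle authors' actual method, and both function names are hypothetical.

```python
from itertools import groupby
from typing import List


def encode(message: str) -> List[int]:
    """Hypothetical encoder: hex pairs joined by '-', one run per character."""
    wire = "-".join(f"{byte:02x}" for byte in message.encode("ascii"))
    bits: List[int] = []
    for index, char in enumerate(wire):
        # Alternate 0/1 so that neighbouring runs never merge.
        bits.extend([index % 2] * ord(char))
    return bits


def decode(bits: List[int]) -> str:
    """Run lengths -> characters -> hex components -> ASCII text."""
    wire = "".join(chr(sum(1 for _ in g)) for _, g in groupby(bits))
    return "".join(bytes.fromhex(part).decode("ascii") for part in wire.split("-"))


print(decode(encode("Nice")))  # Nice
```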