chaining blocks for disappointment and failure

I recently took a new job at a pretty large enterprise in one of the security groups. I happened to join just as they kicked off an annual CTF contest so I signed up on day one and started gleefully hacking about. I was doing pretty well holding a spot in the top ten amid some fifty competitors, had fun factoring some weak RSA keys, implementing a login timing attack, and recovering files encrypted by a poorly written ransomware.

Then I came to a challenge that required implementing and mining a blockchain. Given a particular genesis block, difficulty requirements, and a hash algorithm you had to provide a chain of at least four valid blocks to retrieve the flag.

Specifically, the genesis block looked like this.

{
"identifier": "000102030405060708090A0B0C0D0E0F", 
"nonce": 3754873684,
"data": "Genesis Block for CTF contest, all block chains must start with this block. This is equivalent to Big Bang, time didn't exist before this",
"previous_hash": null
}

The hash algorithm looked like this.

def hash_block(block):  
    message = hashlib.sha256()
    message.update(str(block['identifier']).encode('utf-8'))
    message.update(str(block['nonce']).encode('utf-8'))
    message.update(str(block['data']).encode('utf-8'))
    message.update(str(block['previous_hash']).encode('utf-8'))
    return message.hexdigest()

And the difficulty for each block was pre-assigned in the following order: 8 for the genesis block, then 4, 4, 5, 6, 7, 9, 11, 13, and 16 for the remaining blocks. In this case difficulty is defined as the number of leading zeros in the resulting hash.

The content of the data field was dealers choice, so I populated with a poem I'm fond of and implemented the miner over my lunch break. The result produced about 100k hashes per second on my macbook, randomly hashing the block with a different nonce value until hitting upon one which produced a hash that satisfied the difficulty requirement. It completed the required four blocks before I had finished eating. I submitted the chain and collected my flag and went back to work.

But one thing kept bothering me. I kept thinking about the fact that even though it took only four blocks to collect the flag the instructions provided difficulty values for a ten block chain. What would happen if I submitted all ten blocks? Bonus points? A hidden challenge? My weight in dogecoins?

I stopped working on all the remaining challenges and focused on completing the chain. Since the new gig is a Go heavy shop and my Go game is weak I decided to port the miner to Go for a compiled language speed up. First step was to see if I could replicate the hashing algorithm which turned out to be pretty straightforward with the standard library crypto/sha256 package.

func (b *Block) hash() []byte {  
    h := sha256.New()
    h.Write([]byte(b.Identifier))
    h.Write([]byte(strconv.Itoa(b.Nonce)))
    h.Write([]byte(b.Data))
    h.Write([]byte(b.Previous_hash))
    return h.Sum(nil)
}

I can see why Python people like Go, it's pretty intuitive. This bought me about an 8x speed up. My macbook was churning out 800k hashes per second which ripped through the next few blocks rapidly up until it hit block 7 with a difficulty rating of 11, and there it stayed for quite a while.

Recalling back to the early days of bitcoin mining I decided that a mining pool seemed like a good approach. I have nerdy friends, they have computers, by our powers combined we can do a thing right?

So I implemented a quick and dirty web app to loosely coordinate a distributed pool of miners. I made sure the miners would generate the same identifier field for a given block where previously it had also been random, so the fleet would be working on the same problem. Then I had them poll the server every thirty seconds to see if anyone else had mined a block. Stumbling about with my new Go legs it took about a day and a half to get the server and the miner working in tandem.

On Friday the 3rd I started soliciting friends to run the miner.

I was testing the pool with my work macbook, my personal laptop, and my desktop. All told these were combining to produce about 2.4 million hashes per second. There seemed to be some upper limit at the 800k mark that didn't seem to vary much with CPU speed. As friends came on board the pool hashrate began to climb. Slowly.

The ones with beefy gaming PC's complained that their many cores remained idle while mining so I set about learning how to use Go's concurrency to make use of them. Eventually I got it working after faffing about with channels and worker pools and whatnot. The code is here and it's hideous and no doubt rife with bugs. I haven't had to deal with pointers since the 90's and my approach is roughly on par with one that a friend once half jokingly suggested, "Keep adding asterisks and ampersands until it works".

Anyway it's ugly but functional, the concurrent miner spawned a dedicated goroutine for each available core on the system and did a great job pinning the entire CPU to 100% usage.

We went from this.

To this.

Somewhere around this point on the evening of the Nov 4th I decided to start graphing the hashrate of the mining pool.

We were doing better but for the next two days the entire pool churned and churned on block 7 getting nowhere. I decided to stand up a bunch of cheap google cloud compute instances to help the effort. My friend Adam had a bunch of cores sitting around from a screw up with a cloud provider ages ago so he fired up a dozen miners in there. At its peak the pool hit nearly 50 million hashes per second. By then I'd had to fortify the web app with more workers and a proper MySQL rather than just SQLite.

Some time around 8 am on Sunday the 5th a google instance in South America with miner id 662F4F146E1504EA mined the elusive block 7 and the whole fleet rolled over to working on block 8 with difficulty level 13. At this point we had three days to go till the end of the CTF contest. People had been suggesting it for a while and I'd been hoping to avoid it but I finally decided to cave and take a serious look at implementing a GPU accelerated miner.

I spent all day Sunday and most of Monday evening tinkering with a variety of GPU based sha256 implementations, even getting a little twitter help from one author.

I managed to get a hash actually computed on my GPU but writing an algorithm to run at GPU scale parallelism is pretty far from the sort of software I normally write. I was just beginning to see how I could divide the work up and was just skirting the outlines of how I might get meaningful answers back out, but with time running short and brain cells in short supply I sputtered to a stop a day before the contest ended.

Feeling pretty burnt out I threw up my hands in defeat. What I did learn was a fair bit about Go, who my most competitive friends are, and way more than I ever needed to know about how sha256 works. For the curious, here's my tl;dr.

Remember these puzzles?

That's pretty much how sha256 works. When you fire it up the innards are arranged in a standard configuration hand crafted from the finest artisanal entropy. Then the input data is padded to ensure its size is a multiple of 512 bits and it is shoved through the system 512 bits at a time. Each block of bits turns the crank and slides the configuration around in a consistent but input dependent way ensuring that if even one input bit is altered the sliders wind up in wildly varied final states.

And that's it. See? Cryptographically secure hashes aren't that hard. Now whatever you do don't go read FIPS-180. That way lies total madness.

I hope the CTF is this much fun next year.