Embedding files into source files is dumb

I vaguely remember using a tool in my early days of programming, back when I was trying to write aimbots for CS:GO, that could turn a file into a byte array for C and C++.

It was common to use such a tool to embed fonts and textures in these hacks, and it’s likely still common today.

I don’t entirely remember which tool I used, but after some searching it turns out a tool called xxd can do the same thing.

It’s as simple as:

echo "hello world" > data_file
xxd -i data_file

Which outputs:

unsigned char data_file[] = {
  0x68, 0x65, 0x6c, 0x6c, 0x6f, 0x20, 0x77, 0x6f, 0x72, 0x6c, 0x64, 0x0a
};
unsigned int data_file_len = 12;

This is great. It does basically what I remember, but I’ve since moved away from C and C++ and gone fully over to C#.

While I could look up an implementation of this in C#, I figured why not just make my own. After all, it’s the learning process that helps the most.

To be clear, I know this is a dumb way to embed resources into projects. But it’s simple and not a terrible option.

Embedder

Embedder is a simple single-class program that outputs a .cs file containing any files you want, compressed, as get-only properties.

The data is assigned as the initial value of these properties, so the decompression only ever runs once, not on every access.
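To make the "only once" part concrete, here is a minimal sketch of how a get-only static property with an initializer behaves. OnceDemo, Load and LoadCount are made-up names for illustration, and Load stands in for the real decompression.

```csharp
using System;

public static class OnceDemo
{
	// Counts how many times the (hypothetical) expensive loader runs.
	public static int LoadCount { get; private set; }

	// The initializer runs once, during static initialization of the class.
	// After that the getter just returns the stored array.
	public static byte[] Payload { get; } = Load();

	private static byte[] Load()
	{
		LoadCount++;
		return new byte[] { 1, 2, 3 };
	}
}
```

Reading OnceDemo.Payload repeatedly returns the same array instance, and LoadCount stays at 1.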

Example Usage

# Embedder.exe <filename> <class name> [input_files...]
./Embedder.exe Data.cs Data ./TextFile.txt ./Image.png

Result:

using System;
using System.IO;
using System.IO.Compression;

public static class Data
{
	// Source: TextFile.txt
	public static byte[] TextFile_txt { get; } = Decompress(FromBase64(
		"H4sIAAAAAAAEAPNIzcnJVyjPL8pJAQBSntaLCwAAAA=="
	));

	// Source: Image.png
	public static byte[] Image_png { get; } = Decompress(FromBase64(
		"H4sIAAAAAAAEAFNgYGBQYBgFo2AUDCLgRUeMDtSA+D8dMTpQHbUfRT6HSNxLpj5C9hMLTJH0/CFBHy3s/0gl+3+RoM8eSd8zCuwX",
		"YED1vxCR+vyR9JyjwH4Q+IFklgaRegqR9Gyj0P6HSGaFEalnOZKePgrtP4Vk1kYi1DszoMZZCIX2Z6GZl4RHLQ8QH0RS+xyI+Si0",
		"HwReormhjgE1LXAAsQcQ30NTF0cFuxmg9mErLz8B8R0ccuupZDcMNOGwBxu+zQApO6gNCoiwewoQs9LAbhjQA+IWIN7LAPHnDSBe",
		"CcTVQOxJQ3tHwSgYBXQAAMxzaH8IEAAA"
	));

	private static byte[] FromBase64(params string[] parts)
	{
		var all = string.Concat(parts);
		return Convert.FromBase64String(all);
	}

	private static byte[] Decompress(byte[] data)
	{
		using var input = new MemoryStream(data);
		using var gzip = new GZipStream(input, CompressionMode.Decompress);
		using var output = new MemoryStream();
		gzip.CopyTo(output);
		return output.ToArray();
	}

	private static (int width, int height, uint[] pixels) ImageData(byte[] data)
	{
		using var br = new BinaryReader(new MemoryStream(data));
		var width = br.ReadInt32();
		var height = br.ReadInt32();
		var pixels = new uint[width * height];
		for (var i = 0; i < pixels.Length; i++)
			pixels[i] = br.ReadUInt32();
		return (width, height, pixels);
	}
}

Challenges

The main issue with a tool like this is that it’s rather stupid for incompressible formats, and in general it’s just not a great way to embed data.

I went through a couple iterations but eventually settled on what I have now.

Originally, it just output the bytes as a byte array, 0x01, 0x02, 0x03 and so on, but this inflated the output file.
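A rough sketch of what that first iteration amounts to, not Embedder’s actual code. HexLiteral and Format are hypothetical names. Each byte costs about six characters of source ("0x68, "), which is where the inflation comes from.

```csharp
using System;
using System.Linq;

public static class HexLiteral
{
	// Formats bytes the way `xxd -i` does: "0x68, 0x65, ...".
	// n bytes become 6n - 2 characters (4 per byte plus ", " between them).
	public static string Format(byte[] data) =>
		string.Join(", ", data.Select(b => "0x" + b.ToString("x2")));
}
```

For the 12 bytes of "hello world\n" from the xxd example, this produces 70 characters before any indentation or line breaks are added.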

Then I moved to compressing the data before writing out the bytes. This helped, but with larger files the extra 0x’s, commas and spaces still inflated the output file.

I eventually settled on writing the compressed bytes as Base64 strings. I felt that this was the best compromise between output file size and IDE readability, though for small text files it’s still not amazing.

A text file of 11 bytes, a simple “Hello world”, is written as 44 bytes. That’s a 300% increase over the raw bytes. So this still needs tweaking, but I’m OK with how it is.
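The generating side can be sketched like this. It is my guess at the shape, not the real Embedder source. Packer, Compress and ToBase64Lines are hypothetical names, and the 100-character line width is an assumption. The result is the inverse of the Decompress(FromBase64(...)) pair in the generated class.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;

public static class Packer
{
	// Gzip-compresses a buffer; Decompress in the generated class undoes this.
	public static byte[] Compress(byte[] data)
	{
		using var output = new MemoryStream();
		// Disposing the GZipStream flushes the final block and the gzip trailer.
		using (var gzip = new GZipStream(output, CompressionMode.Compress))
			gzip.Write(data, 0, data.Length);
		return output.ToArray();
	}

	// Base64-encodes the bytes and splits the text into fixed-width pieces,
	// one per string literal in the generated file.
	public static List<string> ToBase64Lines(byte[] data, int width = 100)
	{
		var all = Convert.ToBase64String(data);
		var lines = new List<string>();
		for (var i = 0; i < all.Length; i += width)
			lines.Add(all.Substring(i, Math.Min(width, all.Length - i)));
		return lines;
	}
}
```

Concatenating the pieces and passing them through Convert.FromBase64String and a decompressing GZipStream gives back the original bytes.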
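The arithmetic behind that number: a gzip stream with no optional header fields carries a 10-byte header and an 8-byte trailer, and Base64 turns every 3 bytes into 4 characters. The 44-character string in the example above decodes to 31 bytes, so 18 of them are fixed gzip overhead before any of the text is stored. SizeMath and its methods are hypothetical names, used only for this worked example.

```csharp
public static class SizeMath
{
	// Base64 emits 4 characters for every 3 input bytes, rounded up.
	public static int Base64Length(int byteCount) => 4 * ((byteCount + 2) / 3);

	// Percentage growth from the raw size to the encoded size.
	public static int PercentIncrease(int rawSize, int encodedSize) =>
		(encodedSize - rawSize) * 100 / rawSize;
}
```

SizeMath.Base64Length(31) is 44, and SizeMath.PercentIncrease(11, 44) is 300. The fixed overhead matters less as files get larger, which is why tiny text files come out worst.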

Embedder also handles images in a more direct format. Instead of writing out the image file bytes, leaving in all the metadata crap we don’t want, it writes out the image width and height, then every pixel as a packed uint containing R, G, B and A. This can result in a 50% reduction over the raw bytes. I did end up using a third-party library to read the pixels, SixLabors ImageSharp, but that’s another small price to pay for cutting down on output size.

To help with the custom image format, there is a handy ImageData(byte[]) helper method. This returns a tuple, (int width, int height, uint[] pixels), which is just enough information to reconstruct the image data.
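Here is a small sketch of that layout as I read it from the ImageData helper: two Int32 values followed by one UInt32 per pixel. ImagePacking, PackPixel, Write and Read are hypothetical names. The R, G, B, A byte order inside each uint is an assumption, as is how the pixels get read out of the source image.

```csharp
using System;
using System.IO;

public static class ImagePacking
{
	// Packs one pixel into a uint. The channel order (R lowest, A highest)
	// is an assumption made for this sketch.
	public static uint PackPixel(byte r, byte g, byte b, byte a) =>
		r | ((uint)g << 8) | ((uint)b << 16) | ((uint)a << 24);

	// Writes width, height, then every pixel, matching what ImageData reads.
	public static byte[] Write(int width, int height, uint[] pixels)
	{
		using var ms = new MemoryStream();
		using (var bw = new BinaryWriter(ms))
		{
			bw.Write(width);
			bw.Write(height);
			foreach (var p in pixels)
				bw.Write(p);
		}
		return ms.ToArray();
	}

	// Same logic as the generated ImageData helper.
	public static (int width, int height, uint[] pixels) Read(byte[] data)
	{
		using var br = new BinaryReader(new MemoryStream(data));
		var width = br.ReadInt32();
		var height = br.ReadInt32();
		var pixels = new uint[width * height];
		for (var i = 0; i < pixels.Length; i++)
			pixels[i] = br.ReadUInt32();
		return (width, height, pixels);
	}
}
```

The tuple deconstructs directly, as in var (width, height, pixels) = ImagePacking.Read(bytes);, and the payload is always 8 + 4 * width * height bytes before compression.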

Everything is bundled up using Costura.Fody, meaning Embedder is just a single exe, no lingering DLLs!

Source Code and Binary

Embedder.zip Embedder.exe