Behind GitHub’s Two-Factor-Auth

For those who wonder how GitHub’s Two-Factor-Auth ‘style 2’ codes work, there is an article which explains the basic mechanism. It will explain the magic from getting the QR code to getting a spinning 6-digit pin code. Like magic,… like magic.

GitHub ssh
From QR to code, but how?!

otpauth://totp/Github:rcoh?secret=onswg4tforrw6zdf&issuer=Github

Scanning the QR code will tell you the protocol, TOTP who is issuing this OTP code (Github), and most importantly the shared secret

secret = 'onswg4tforrw6zdf'

Now that’s it – End of article, let me drink my coffee and never come back. No, of course not! How is the 6-digit code now generated?

The magic behind the code

Based on this python script:

  • time_chunk is based on the current time local time
  • time_chunk and your shared secret (based on the hmac) will be combined. It’s a sha1-hmac, which means we’ll get a 20-byte digest.
  • Take the last 4-bits of the digest, these become the main-offset. Eg. if the last 4 bits are 0010, then we would start at the 4th byte. It’s required to read 4 bytes starting from the offset to get a correct value.
  • Convert the 4 byte section into a n digit integer. This is your two-factor code.

The algorithm works by finding what 30 second time slice we’re in, starting from the Unix Epoch via time_chunk = int(time.time() / 30)time_bytes = struct.pack('>q', time_chunk).

HOTP algorithm

The default hash function in Python for HMACs is MD5, I only mention it because it’s maybe not widely known.

HOTP is an algorithm for using HMAC to generate one-time-passwords based on a counter. In the case of TOTP, the counter is the time chunk we’re in. First, we need to create a hmac of our secret key and the count:

key = base64.b32decode(secret.upper())
hm = hmac.new(key, time_bytes, hashlib.sha1)
hex_digest = hm.hexdigest()

Now we need to determine the offset in the array we’ll be reading from. This offset is chosen based on the last 4-bit chunk of the array: offset = int(hex_digest[-1:], 16).

Starting at the offset, read 4 bytes. This will be the basis of our 6 digit code: relevant_bytes = int(hex_digest[offset*2:offset*2+8], 16). Drop the highest order bit (the sign). Dropping this makes it easier to implement in different programming languages which may have differing behavior on signed module operations: unsigned_bytes = relevant_bytes & 0x7FFFFFFF.

Last but not least, take the unsigned result % 10**d where d is the number of digits you want. In our case, that’s 6:

num_digits = 6
final = masked % (10 ** num_digits)
final_str = str(final).zfill(6)
return final_str

The Google implementation derives them by taking 4-byte sections of the secret and converting them into digits by the same process as the time-based system: bignum % (10 ** d). This means they’re also derivable in a predictable way from the original key so they don’t need to by stored separately by the server. Some other systems use hex-based codes instead, but presumably it’s a similar process.

Final Words

  • Make sure your comparison method is safe from timing attacks in case you want to implement 2-factor authentication.
  • Check if the libraries offering verify methods, if not stay away from them!
  • he easiest way to compare 2-factor codes in a constant time way is convert to ints first. Google does it.

One thought on “Behind GitHub’s Two-Factor-Auth

Comments are closed.

Blog at WordPress.com.

Up ↑

%d bloggers like this: