Parsing Certificate Transparency Logs

Several years ago, Ryan Sears wrote a very good blog post on parsing Certificate Transparency logs. The links are here

He also wrote a very handy tool called Axeman, which still works today with some minor Python version tweaks. Emphasis on the word WORKS. The main thing is to get compatible versions of Python and the "construct" package working together: use a slightly older version of Python and, more importantly, a version of the "construct" package that still supports the "Embedded" keyword.

The Itch

While I had Axeman running, I thought it would take only a line or two of changes to make it work with the latest versions. The itch was taking over for no reason.

What could have been solved by simply ignoring the version mismatch led me down a path of immense pain and learning.

I rewrote the parsing structs in Python.

So, for all the readers, the structs for parsing the Certificate Transparency log (CTL) Merkle tree are below. They use the standard-library struct module instead of construct, which I find more usable. The original construct definitions are kept as comments above each class.

import struct
from enum import Enum
from OpenSSL import crypto 


class LogEntryType(Enum):
    uninitialized = -1  # Not set
    X509LogEntryType = 0
    PrecertLogEntryType = 1

# MerkleTreeHeader = Struct(
#     "Version"         / Byte,
#     "MerkleLeafType"  / Byte,
#     "Timestamp"       / Int64ub,
#     "LogEntryType"    / Enum(Int16ub, X509LogEntryType=0, PrecertLogEntryType=1),
#     "Entry"           / GreedyBytes
# )
class MerkleTreeParser:
    Version = 0
    MerkleLeafType = 0
    Timestamp = 0
    LogEntryType = LogEntryType.uninitialized
    Entry = b''
    
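    def __init__(self, data):
        # Big-endian header: Version (1 byte), MerkleLeafType (1), Timestamp (8), LogEntryType (2)
        FORMAT = '>BBQH'
        header_size = struct.calcsize(FORMAT)
        unpacked = struct.unpack(FORMAT, data[:header_size])
        self.Version = unpacked[0]
        self.MerkleLeafType = unpacked[1]
        self.Timestamp = unpacked[2]
        self.LogEntryType = LogEntryType(unpacked[3])
        # Everything after the header is the entry payload (GreedyBytes)
        self.Entry = data[header_size:]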
    
    def __str__(self) -> str:
        return f"Version: {self.Version}, MerkleLeafType: {self.MerkleLeafType}, Timestamp: {self.Timestamp}, LogEntryType: {self.LogEntryType}, Entry: {self.Entry}"

# Certificate = Struct(
#     "Length" / Int24ub,
#     "CertData" / Bytes(this.Length)
# )
        
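class Certificate:
    Length = 0
    CertData = b''

    def __init__(self, data):
        if len(data) == 0:
            return
        # Int24ub has no struct format code, so pad to 4 bytes and unpack as '>I'
        FORMAT = '>I'
        unpacked = struct.unpack(FORMAT, b'\x00' + data[:3])
        self.Length = unpacked[0]
        # Take exactly Length bytes, matching Bytes(this.Length); trailing bytes are ignored
        self.CertData = data[3:3 + self.Length]

    def __str__(self) -> str:
        return f"Length: {self.Length}, CertData: {self.CertData}"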

# CertificateChain = Struct(
#     "ChainLength" / Int24ub,
#     "Chain" / GreedyRange(Certificate),
# )

class CertificateChain:
    ChainLength: int = 0

    def __init__(self, data):
        # Per-instance list; a class-level [] would be shared by every chain
        self.Chain = []
        if len(data) == 0:
            return
        FORMAT = '>I'
        unpacked = struct.unpack(FORMAT, b'\x00' + data[:3])
        self.ChainLength = unpacked[0]
        data = data[3:]

        # GreedyRange(Certificate): keep reading length-prefixed certs until the data runs out
        while len(data) > 0:
            length = struct.unpack(FORMAT, b'\x00' + data[:3])[0]
            cert_data = data[:3+length]
            cert = Certificate(cert_data)
            self.Chain.append(cert)
            data = data[3+length:]
    
    def __str__(self) -> str:
        return f"ChainLength: {self.ChainLength}, Chain: {self.Chain}"
    
# PreCertEntry = Struct(
#     "LeafCert" / Certificate,
#     Embedded(CertificateChain),
#     Terminated
# )
class PreCertEntry:
    LeafCert = Certificate(b'')
    Chain = CertificateChain(b'')
    
    def __init__(self, data):
        if len(data) == 0:
            return
        FORMAT = f'>I'
        leafcert_length = struct.unpack(FORMAT, b'\x00' + data[:3])[0]
        self.LeafCert = Certificate(data[:3+leafcert_length])
        data = data[3+leafcert_length:]
        self.Chain = CertificateChain(data)
    
    def __str__(self) -> str:
        return f"LeafCert: {self.LeafCert}, Chain: {self.Chain}"
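
The 24-bit length prefixes are the one awkward part of the classes above: struct has no 3-byte format code, which is why each class pads three bytes with a leading zero and unpacks them as a 4-byte integer. As a minimal sketch of the same idea, assuming hypothetical helper names read_uint24 and iter_length_prefixed (they are not part of Axeman or of the classes above), int.from_bytes gives the same result without the padding:

```python
import struct


def read_uint24(data: bytes) -> int:
    """Read a big-endian 24-bit unsigned integer from the first 3 bytes."""
    if len(data) < 3:
        raise ValueError("need at least 3 bytes for a 24-bit length")
    return int.from_bytes(data[:3], "big")


def iter_length_prefixed(data: bytes):
    """Yield each payload from a run of [3-byte length][payload] records."""
    offset = 0
    while offset < len(data):
        length = read_uint24(data[offset:])
        start = offset + 3
        end = start + length
        if end > len(data):
            raise ValueError("record is truncated")
        yield data[start:end]
        offset = end


# The zero-padding trick and int.from_bytes agree on the same three bytes.
raw = b"\x00\x01\x02"
padded = struct.unpack(">I", b"\x00" + raw)[0]
assert padded == read_uint24(raw)

# Two placeholder "certificates" laid out the way a chain body is.
fake_chain = b"\x00\x00\x03abc" + b"\x00\x00\x02de"
records = list(iter_length_prefixed(fake_chain))
```

Unlike the loop in CertificateChain, this version raises on a truncated record instead of silently returning a short slice, which may or may not be what you want when scanning a whole log.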

I leave it to readers to reimplement Axeman using these structs. It's not that difficult. But as I said earlier, Axeman WORKS, so this hardly makes a dent.
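
As a starting point, here is a rough sketch of how one item from a log's get-entries response could be decoded. It assumes each item is a dict with base64-encoded "leaf_input" and "extra_data" fields, as RFC 6962 describes. The function names and the synthetic entry are hypothetical, nothing here touches the network, and the "FAKE" payload is a placeholder rather than a real certificate.

```python
import base64
import struct

HEADER_FORMAT = ">BBQH"  # Version, MerkleLeafType, Timestamp (ms), LogEntryType
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 12 bytes, no padding with ">"
X509_ENTRY, PRECERT_ENTRY = 0, 1


def parse_leaf_input(leaf_input_b64: str) -> dict:
    """Decode the base64 leaf_input and split header fields from the payload."""
    leaf = base64.b64decode(leaf_input_b64)
    if len(leaf) < HEADER_SIZE:
        raise ValueError("leaf_input is shorter than the 12-byte header")
    version, leaf_type, timestamp_ms, entry_type = struct.unpack(
        HEADER_FORMAT, leaf[:HEADER_SIZE]
    )
    return {
        "version": version,
        "leaf_type": leaf_type,
        "timestamp_ms": timestamp_ms,
        "entry_type": entry_type,
        "entry": leaf[HEADER_SIZE:],
    }


def leaf_certificate_der(entry: dict) -> bytes:
    """Return the leaf certificate bytes for one get-entries item."""
    header = parse_leaf_input(entry["leaf_input"])
    if header["entry_type"] == X509_ENTRY:
        # X509 entries carry the certificate inside the leaf itself.
        source = header["entry"]
    elif header["entry_type"] == PRECERT_ENTRY:
        # Precert entries: the leaf certificate comes first in extra_data,
        # mirroring the PreCertEntry layout (LeafCert, then the chain).
        source = base64.b64decode(entry["extra_data"])
    else:
        raise ValueError(f"unknown entry type {header['entry_type']}")
    length = int.from_bytes(source[:3], "big")
    return source[3:3 + length]


# Synthetic entry built by hand (placeholder bytes, not a real certificate).
fake_leaf = (
    struct.pack(HEADER_FORMAT, 0, 0, 1_600_000_000_000, X509_ENTRY)
    + b"\x00\x00\x04FAKE"  # 3-byte length followed by the payload
    + b"\x00\x00"          # empty extensions field that follows the cert
)
fake_entry = {
    "leaf_input": base64.b64encode(fake_leaf).decode("ascii"),
    "extra_data": "",
}
header = parse_leaf_input(fake_entry["leaf_input"])
der = leaf_certificate_der(fake_entry)
```

With real log data, the returned bytes could then be handed to crypto.load_certificate(crypto.FILETYPE_ASN1, der) from the pyOpenSSL import used earlier. Note that the slice stops at the declared length, so the extension bytes that trail the certificate in the leaf are left out.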

The Itch Part II

It works in Python, and I should have left it there. There was no need for optimization.

After running Axeman for a few hours and doing some secondary research, I felt that Python, as always, was just SLOW and leaked memory. After running the program for about 15 hours, my computer was completely unusable. While writing this post I realize it could have been anything on my computer, but guess what, I blame Python.

I started a journey to learn Rust and see how much better it is than Python. I spent around a month understanding Rust and wrote my first program in it.

See the details below for how to do the same in Rust.

Thanks

I think this might be my most boring post, but it's my journey to learn something new. So that is it.

I might release my tool and describe its possible uses in the future, but that's for later.
