I’m working on a query engine, essentially a tool to scan/filter/annotate by lookups/group by/aggregate a large dataset in the tens-of-terabytes range. The compute part seems to be the bottleneck for me (I’ll be doing around 80-300 GB/s of reads, and yes, I will have hardware capable of providing that kind of throughput). My hypothesis is that by encoding the query in the form of template arguments I can make the compiler generate code optimized for a specific type of query (for example, for particular filtering or aggregation keys). But I do not know what queries users will send, so I need a way to instantiate templates at runtime.

Sounds simple: for a new type of query, invoke a compiler at runtime to build a dynamic library with the new instantiation, then dynload it and off we go. Some prior work is here, though I’m pretty sure any JIT compiler also counts here. But there are enough technical details to worry about, and at the same time this idea isn’t novel, so I wonder: are there any packaged solutions for this kind of approach?
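To make the flow above concrete, here is a minimal sketch of it, not a packaged solution. It assumes a POSIX system with a `c++` driver on the PATH. The header `query_kernels.hpp`, the template `run_query`, and every function name below are hypothetical and only stand in for whatever the real engine would define.

```cpp
#include <dlfcn.h>    // dlopen, dlsym, dlerror (POSIX)
#include <cstdlib>    // std::system
#include <fstream>
#include <stdexcept>
#include <string>

// Hypothetical: "query_kernels.hpp" is assumed to declare
//   template <int KeyColumn, typename Agg> long run_query(const char* data, long n);
// Emit a translation unit that instantiates it for one concrete query
// and exposes the result under a fixed, unmangled name.
std::string make_source(int key_column, const std::string& agg_type) {
    return "#include \"query_kernels.hpp\"\n"
           "extern \"C\" long run(const char* data, long n) {\n"
           "    return run_query<" + std::to_string(key_column) + ", " + agg_type + ">(data, n);\n"
           "}\n";
}

// Command line that builds a shared library from the generated source.
// Assumption: a GCC/Clang-compatible driver named `c++`.
std::string make_compile_command(const std::string& src, const std::string& lib) {
    return "c++ -std=c++17 -O2 -shared -fPIC -o " + lib + " " + src;
}

// Write the source to disk and run the compiler; throws if the build fails.
void build_library(const std::string& src_path, const std::string& lib_path,
                   const std::string& source) {
    {
        std::ofstream out(src_path);
        out << source;
    }  // file is closed here, before the compiler reads it
    if (std::system(make_compile_command(src_path, lib_path).c_str()) != 0)
        throw std::runtime_error("compilation failed");
}

// Load the library and look up a symbol, cast to the caller's function type.
// The handle is deliberately never closed, so the code stays mapped.
template <typename Fn>
Fn load_symbol(const std::string& lib_path, const char* name) {
    void* handle = dlopen(lib_path.c_str(), RTLD_NOW | RTLD_LOCAL);
    if (!handle) throw std::runtime_error(dlerror());
    void* sym = dlsym(handle, name);
    if (!sym) throw std::runtime_error(dlerror());
    return reinterpret_cast<Fn>(sym);
}
```

Usage would look roughly like `build_library("q.cpp", "./libq.so", make_source(3, "Sum"))` followed by `auto run = load_symbol<long (*)(const char*, long)>("./libq.so", "run")`. A real engine would presumably also cache libraries by query shape so each instantiation is compiled once.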

  • lobsticle 🦞
    19 months ago

    It sounds like what you are looking for is a form of object request broker. You provide the name of a class as a string (or, if the set of desired objects is more constrained, an integer, enum, or something similar) and the broker builds an instance based on that key. Typically all of these objects inherit from some base class like Object, so that the broker can return an Object* and the client can dynamic_cast it down to the actual type. I’ve used a pattern like this in the past that worked pretty well, using macro magic to make classes eligible to be instantiated through the broker (the macro registers the key and the class name with the broker). This was pre-C++03, so doubtless there are cleaner and more modern ways to implement such a thing these days.
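A minimal sketch of the broker pattern described in this comment, written in modern C++ rather than the pre-C++03 original. All names here (`Broker`, `REGISTER_CLASS`, `CountRows`) are made up for illustration. Note that it only constructs classes that were compiled into the binary, so on its own it does not instantiate new templates at runtime.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Common base so the broker can hand back any registered type.
struct Object {
    virtual ~Object() = default;
};

class Broker {
public:
    using Factory = std::function<std::unique_ptr<Object>()>;

    // Function-local static avoids static-initialization-order problems.
    static Broker& instance() {
        static Broker broker;
        return broker;
    }

    // Returns false if the key was already registered.
    bool add(const std::string& key, Factory factory) {
        return factories_.emplace(key, std::move(factory)).second;
    }

    // Builds a new instance for the key, or returns null if it is unknown.
    std::unique_ptr<Object> create(const std::string& key) const {
        auto it = factories_.find(key);
        if (it == factories_.end()) return nullptr;
        return it->second();
    }

private:
    std::map<std::string, Factory> factories_;
};

// The "macro magic": registers class T under its own name at static-init time.
#define REGISTER_CLASS(T)                                              \
    [[maybe_unused]] static const bool T##_registered =                \
        Broker::instance().add(#T, [] { return std::unique_ptr<Object>(new T); })

// Hypothetical example class made available through the broker.
struct CountRows : Object {
    long rows = 0;
};
REGISTER_CLASS(CountRows);
```

A client would then call `Broker::instance().create("CountRows")` and `dynamic_cast<CountRows*>` the result to get at the concrete type.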