Hybrid CPU/GPU inference

An inference setup that splits model execution between the CPU and the GPU, typically placing as many layers as possible in GPU memory and running the rest on the CPU, to balance throughput, memory capacity, and power efficiency.
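One common placement strategy is to offload layers to the GPU until its memory budget is exhausted and run the remainder on the CPU. The sketch below illustrates this greedy split; the function name `plan_split` and the per-layer sizes are hypothetical, not part of any real library's API.

```python
# Hypothetical sketch of greedy CPU/GPU layer placement under a VRAM budget.
# plan_split and its arguments are illustrative names, not a real API.

def plan_split(layer_sizes_mb, vram_budget_mb):
    """Assign each layer to "gpu" while it fits in the VRAM budget,
    then fall back to "cpu" for the remaining layers."""
    placement = []
    used_mb = 0
    for size in layer_sizes_mb:
        if used_mb + size <= vram_budget_mb:
            placement.append("gpu")
            used_mb += size
        else:
            placement.append("cpu")
    return placement

# Example: eight 500 MB layers with a 2000 MB budget puts the
# first four layers on the GPU and the rest on the CPU.
print(plan_split([500] * 8, 2000))
```

Real runtimes apply the same idea with finer granularity, for example per-tensor placement or moving the key-value cache between devices.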