Mohamed EL HARCHAOUI — AI Engineer. Intelligence for thinking, robotics for doing.

This is where I share reflections on systems reshaping our world.

Featured

Making LLMs Faster Without Retraining

Six experiments spanning SVD, activation-aware factorization, Monarch projections, and LoRA recovery, hunting for the configuration where speedup and quality finally overlap.

A personal experiment log testing multiple compression techniques across two model scales, ending with a GPT-2 Large that runs 1.6× faster and outperforms its uncompressed baseline by 11%.

· 20 min read