Mohamed EL HARCHAOUI — AI Engineer. Intelligence for thinking, robotics for doing.

This is where I share reflections on the systems reshaping our world.

Featured

Making LLMs Faster Without Retraining

Seven experiments — SVD, activation-aware factorization, Monarch projections, LoRA recovery, closed-form initialization — hunting for the configuration where speedup and quality finally overlap.

A personal experiment log testing multiple compression techniques across two model scales, ending with a GPT-2 Large that runs 1.6× faster and outperforms its uncompressed baseline by 13.7%.

25 min read