From Crash and Burn to a Smooth Landing: A Reinforcement Learning Journey
A chronicle of a reinforcement learning project, from initial struggles with exploding losses to a successful pixel-based agent for LunarLander.
Lead Machine Learning Engineer | Victoria, BC
I’ve spent the past six years working as a Machine Learning Engineer at progressively higher levels of responsibility across a handful of startups. I am currently a Lead at Kibeam, where I oversee everything from Ops to edge model training and deployment to LLMs. In my spare time, I tinker with whatever else interests me, and the results get posted here.
A very approachable jumping-off point for video captioning. If you're GPU-poor (<24 GB VRAM), this is for you.
A small contribution to the community. Adds caption-like variety samples to the SSV2 dataset.