
I’m a second-year Computational Science and Engineering (CSE) master’s student at Harvard University. My research focuses on reliable LLM post-training, especially Reinforcement Learning with Verifiable Rewards (RLVR). I’m advised by Sham Kakade and work as a Machine Learning Researcher at the Kempner Institute. I’m also a Teaching Fellow for AC215 (MLOps & LLMOps).
Before Harvard, I studied Computer Engineering at the Federal University of Paraíba (UFPB), and interned at Google (YouTube, Search) and Meta (AI for AR, Ads).
I like running, playing football, surfing, and reading.
You can reach me at itamardprf@gmail.com, find me on GitHub, or connect with me on LinkedIn. You can also download my resume (PDF).
A study of Evolution Strategies (ES) as an exploration-based alternative to gradient-based RL for fine-tuning LLMs, analyzing when ES matches RL baselines and when it avoids reward-hacking failure modes.
An AI safety experiment analyzing evaluation hacking in language-model agents, combining controlled task design, behavioral analysis under evaluation pressure, and systematic testing across models.
An AI-powered running coach delivered over WhatsApp, combining RAG, fine-tuned LLMs, and vector search, deployed on Google Kubernetes Engine (GKE).