Algorithmic Foundations of Policy Optimization in Reinforcement Learning, Multi-Agent Systems, and AI Alignment