As more solar photovoltaic (PV) systems are installed around the world, the mismatch between power consumption and solar generation profiles gives rise to a problem known as the duck curve. As PV penetration increases, the problem is exacerbated: the ramp rate grows, adding strain to the electricity grid. A further challenge is that the power profiles vary considerably by day and by season. We propose a reinforcement-learning-based system control algorithm for a battery-integrated PV converter system that operates in real time and adapts dynamically to changing conditions. Results, verified on real data sets from Taiwan and Germany, show a good balance among four objectives.
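To make the duck curve and its ramp rate concrete, the following minimal Python sketch computes the net load (consumption minus PV generation) and its steepest hourly increase. It is an illustration only, not the proposed control algorithm: the function names, the sinusoidal PV shape, the evening-peaked consumption profile, and all numbers are hypothetical assumptions rather than values taken from the Taiwan or Germany data sets.

```python
import math


def net_load(consumption, pv_generation):
    """Net load seen by the grid: consumption minus PV generation per time step."""
    if len(consumption) != len(pv_generation):
        raise ValueError("profiles must have the same length")
    return [c - g for c, g in zip(consumption, pv_generation)]


def max_ramp_rate(profile, step_hours=1.0):
    """Largest increase between consecutive samples, in power units per hour."""
    return max((b - a) / step_hours for a, b in zip(profile, profile[1:]))


def synthetic_pv(hours, peak):
    """Toy PV output (assumption): zero at night, half-sine between 06:00 and 18:00."""
    return [peak * max(0.0, math.sin(math.pi * (h - 6) / 12)) for h in hours]


hours = list(range(24))
# Toy consumption profile in kW with an evening peak around 19:00 (made-up numbers).
consumption = [3.0 + 2.0 * math.exp(-((h - 19) ** 2) / 8.0) for h in hours]

# Sweep over increasing PV capacity to mimic increasing PV penetration.
for peak in (0.0, 2.0, 4.0):
    load = net_load(consumption, synthetic_pv(hours, peak))
    print(f"PV peak {peak:.1f} kW -> max ramp {max_ramp_rate(load):.2f} kW/h")
```

In this toy setting, the late-afternoon drop in PV output coincides with rising evening demand, so the maximum ramp grows as the assumed PV peak increases. A battery controller would aim to flatten exactly this segment of the net load.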