# Derivative of Softmax loss function

I am trying to wrap my head around back-propagation in a neural network with a Softmax classifier, which uses the Softmax function:

$$p_j = \frac{e^{o_j}}{\sum_k e^{o_k}}.$$
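(Not part of the original question: the formula above can be evaluated directly; here is a minimal Python sketch, with the usual max-subtraction trick for numerical stability, which leaves the result unchanged.)

```python
import math

def softmax(o):
    # Subtracting max(o) avoids overflow in exp(); it cancels in the ratio,
    # so the probabilities p_j = exp(o_j) / sum_k exp(o_k) are unchanged.
    m = max(o)
    exps = [math.exp(x - m) for x in o]
    s = sum(exps)
    return [e / s for e in exps]

p = softmax([1.0, 2.0, 3.0])
print(p)        # three probabilities, larger logits get larger mass
print(sum(p))   # sums to 1
```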
This is used in a loss function of the form

$$L = -\sum_j y_j \log p_j,$$
where $o$ is a vector. I need the derivative of $L$ with respect to $o$. Now if my derivatives are right,

$$\frac{\partial p_j}{\partial o_i} = p_i(1 - p_i), \qquad i = j,$$

and

$$\frac{\partial p_j}{\partial o_i} = -p_i p_j, \qquad i \neq j.$$
Using this result we obtain

$$\frac{\partial L}{\partial o_i} = -\left(y_i(1 - p_i) + \sum_{k \neq i} -p_k y_k\right) = p_i y_i - y_i + \sum_{k \neq i} p_k y_k.$$
According to the slides I’m using, however, the result should be

$$\frac{\partial L}{\partial o_i} = p_i - y_i.$$
Can someone please tell me where I’m going wrong?

Your derivatives $\large \frac{\partial p_j}{\partial o_i}$ are indeed correct; however, there is an error when you differentiate the loss function $L$ with respect to $o_i$.
We have the following (where I have highlighted in $\color{red}{red}$ where you have gone wrong):

$$\begin{aligned}
\frac{\partial L}{\partial o_i} &= -\sum_k y_k \frac{\partial \log p_k}{\partial o_i} = -\sum_k y_k \frac{1}{p_k} \frac{\partial p_k}{\partial o_i} \\
&= -y_i(1 - p_i) - \sum_{k \neq i} y_k \frac{1}{p_k} \left({\color{red}{-p_k p_i}}\right) \\
&= -y_i(1 - p_i) + \sum_{k \neq i} y_k p_i \\
&= -y_i + y_i p_i + \sum_{k \neq i} y_k p_i \\
&= p_i \left(\sum_k y_k\right) - y_i = p_i - y_i,
\end{aligned}$$

given that $\sum_k y_k = 1$ from the slides (as $y$ is a vector with only one non-zero element, which is $1$).
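(A quick numerical check, not part of the original answer: with a one-hot $y$, a finite-difference gradient of $L$ matches $p_i - y_i$; function names here are my own.)

```python
import math

def softmax(o):
    m = max(o)
    exps = [math.exp(x - m) for x in o]
    s = sum(exps)
    return [e / s for e in exps]

def loss(o, y):
    # Cross-entropy: L = -sum_j y_j * log(p_j)
    p = softmax(o)
    return -sum(yj * math.log(pj) for yj, pj in zip(y, p))

def grad_analytic(o, y):
    # The derived result: dL/do_i = p_i - y_i
    p = softmax(o)
    return [pi - yi for pi, yi in zip(p, y)]

def grad_numeric(o, y, eps=1e-6):
    # Central difference approximation of dL/do_i
    g = []
    for i in range(len(o)):
        op, om = list(o), list(o)
        op[i] += eps
        om[i] -= eps
        g.append((loss(op, y) - loss(om, y)) / (2 * eps))
    return g

o = [0.3, -0.7, 1.2]
y = [0.0, 1.0, 0.0]  # one-hot label, so sum_k y_k = 1
# grad_analytic(o, y) and grad_numeric(o, y) agree to roughly 1e-9
```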