) Add an L2 regularization term (+α Σ_i θ_i²) to your surrogate loss function, and update the gradient and your code to reflect this addition. Try re-running your learner with some regularization (e.g. α = 2) and see how different the resulting parameters are. Find a value of α that gives noticeably different results & explain the

Answer:

J_j(θ) = −y^(j) log σ(x^(j) θᵀ) − (1 − y^(j)) log(1 − σ(x^(j) θᵀ))
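With the L2 penalty added, the per-example loss and its gradient become (a sketch following the question's setup; the 2αθ term comes from differentiating α Σ_i θ_i² with respect to θ):

J_j(θ) = −y^(j) log σ(x^(j) θᵀ) − (1 − y^(j)) log(1 − σ(x^(j) θᵀ)) + α Σ_i θ_i²

∇_θ J_j(θ) = (σ(x^(j) θᵀ) − y^(j)) x^(j) + 2αθ

Note that because the penalty is attached to each per-example loss J_j here, summing over all m examples effectively penalizes with weight mα; some formulations instead add the penalty once to the total loss.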

Step-by-step explanation:
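The gradient update above can be sketched in code as follows. This is a minimal illustrative implementation, not the assignment's actual learner: the function names, learning rate, and epoch count are assumptions, and plain stochastic gradient descent is used for simplicity.

```python
import numpy as np

def sigmoid(z):
    # Logistic function; clip z to avoid overflow in exp for large |z|
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))

def grad_j(theta, x_j, y_j, alpha):
    # Gradient of the regularized per-example loss:
    # (sigma(x^(j) theta) - y^(j)) * x^(j) + 2 * alpha * theta
    return (sigmoid(x_j @ theta) - y_j) * x_j + 2.0 * alpha * theta

def fit(X, y, alpha=0.0, lr=0.1, epochs=200):
    # Stochastic gradient descent over the regularized surrogate loss
    theta = np.zeros(X.shape[1])
    for _ in range(epochs):
        for x_j, y_j in zip(X, y):
            theta -= lr * grad_j(theta, x_j, y_j, alpha)
    return theta
```

Fitting the same data twice, once with alpha=0 and once with alpha=2, should show the regularized parameters pulled toward zero (smaller norm), which is the "noticeably different results" the question asks you to observe and explain.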