Some confusions about AUCM loss #67
Thank you for your interest in our library. Let me answer your questions below.
Please let us know if you have any additional questions.
Thanks for your reply. As you said, the gradient of the mini-batch loss is not an unbiased estimator of the true gradient, and converting this into an SPP helps address this issue. Is there any theoretical support for that? I mean, why is the gradient of the mini-batch loss not an unbiased estimator, and how does the SPP work here? In addition, it has been proved in the paper One-Pass AUC Optimization that the square loss converges without conversion to an SPP. And in the last term, $m+E(f(x_j))-E(f(x_i))$, don't $E(f(x_j))$ and $E(f(x_i))$ mean the scores of the positive and negative samples? Thanks for your patient reply again.
Hi Rickey,
I am sorry for the late reply.
The reason is this: the third term in the objective is $(E f(n) - E f(p) + c)^2$, where $E$ denotes the expectation, $n$ is a random negative sample, $p$ is a random positive sample, and $c$ is a constant margin. When you calculate the gradient, you get $(E f(n) - E f(p) + c) \cdot \nabla (E f(n) - E f(p) + c)$. You need a mini-batch to estimate each expectation, but if you use the same batch for both factors, you get a biased estimator. You can address this by using a different batch for each of the two factors (the one before and the one after the product), but this reduces the effective batch size.
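As a quick numerical check of this (a toy setup I am assuming purely for illustration, not LibAUC code: a scalar model f(x; w) = w*x with Gaussian positive and negative scores), the same-batch gradient estimator of the third term is systematically off, while using two independent batches is not:

```python
import numpy as np

rng = np.random.default_rng(0)
w, c = 1.0, 1.0
mu_p, mu_n = 2.0, 0.5             # mean scores of positive / negative samples
batch, trials = 8, 100_000

# True gradient of L(w) = (E[w*n] - E[w*p] + c)^2 with respect to w.
true_grad = 2 * (w * (mu_n - mu_p) + c) * (mu_n - mu_p)

same_batch, two_batches = [], []
for _ in range(trials):
    p1 = rng.normal(mu_p, 1.0, batch)     # positive mini-batch
    n1 = rng.normal(mu_n, 1.0, batch)     # negative mini-batch
    p2 = rng.normal(mu_p, 1.0, batch)     # an independent second mini-batch
    n2 = rng.normal(mu_n, 1.0, batch)

    g1 = w * (n1.mean() - p1.mean()) + c  # estimate of E f(n) - E f(p) + c
    d1 = n1.mean() - p1.mean()            # estimate of its gradient w.r.t. w
    d2 = n2.mean() - p2.mean()            # same estimate from the second batch

    same_batch.append(2 * g1 * d1)        # same batch in both factors -> biased
    two_batches.append(2 * g1 * d2)       # independent batches        -> unbiased

print("true gradient       :", true_grad)             # 1.5
print("same-batch estimate :", np.mean(same_batch))   # ~2.0, off by 2*w*(var_p + var_n)/batch
print("two-batch estimate  :", np.mean(two_batches))  # ~1.5
```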
If we reformulate it as a max, we have $\max_{\alpha} \; 2\alpha (E f(n) - E f(p) + c) - \alpha^2$. Then you no longer have this issue, because you update $\alpha$ using its unbiased gradient estimator and update $w$ in the same way.
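To make the equivalence explicit, writing $s = E f(n) - E f(p) + c$ for brevity, completing the square gives

$\max_{\alpha} \; 2\alpha s - \alpha^2 = s^2$, attained at $\alpha^\star = s$.

For a fixed $\alpha$ the inner expression is linear in $E f(n)$ and $E f(p)$, so a single mini-batch already gives unbiased stochastic gradients with respect to both the model parameters and $\alpha$.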
You may want to check this paper https://arxiv.org/abs/2202.12396.
See below for the answer to the other question.
Regards
Tianbao
On Jan 7, 2025, at 10:05 PM, Rickey wrote:
And in the last term, $m+E(f(x_j))-E(f(x_i))$, don't $E(f(x_j))$ and $E(f(x_i))$ mean the scores of the positive and negative samples?
This is correct.
Dear developers,
After learning about your work, I have the following questions:
The AUCM loss with a squared last term can be written as

    loss = self.mean((y_pred - self.a)**2*pos_mask) + self.mean((y_pred - self.b)**2*neg_mask) + \
           (self.margin + self.mean(y_pred*neg_mask) - self.mean(y_pred*pos_mask))**2

The first two terms use self.a and self.b, while the last term uses self.mean(y_pred*neg_mask) - self.mean(y_pred*pos_mask). I would like to know the reason for that. The loss implemented in the library is instead

    loss = self.mean((y_pred - self.a)**2*pos_mask) + self.mean((y_pred - self.b)**2*neg_mask) + \
           2*self.alpha*(self.margin + self.mean(y_pred*neg_mask) - self.mean(y_pred*pos_mask)) - self.alpha**2
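For reference, here is a self-contained sketch of the two variants above (my own simplification with plain batch means and explicit a, b, alpha tensors, not the library's AUCMLoss class):

```python
import torch

def aucm_squared(y_pred, y_true, a, b, margin=1.0):
    # variant with the squared last term
    pos_mask = (y_true == 1).float()
    neg_mask = (y_true == 0).float()
    mean = lambda t: t.sum() / t.numel()
    return (mean((y_pred - a) ** 2 * pos_mask)
            + mean((y_pred - b) ** 2 * neg_mask)
            + (margin + mean(y_pred * neg_mask) - mean(y_pred * pos_mask)) ** 2)

def aucm_minmax(y_pred, y_true, a, b, alpha, margin=1.0):
    # variant with the auxiliary variable alpha (the min-max / SPP form)
    pos_mask = (y_true == 1).float()
    neg_mask = (y_true == 0).float()
    mean = lambda t: t.sum() / t.numel()
    return (mean((y_pred - a) ** 2 * pos_mask)
            + mean((y_pred - b) ** 2 * neg_mask)
            + 2 * alpha * (margin + mean(y_pred * neg_mask) - mean(y_pred * pos_mask))
            - alpha ** 2)

# Quick check: when alpha equals the bracketed term, the two variants coincide.
y_pred = torch.tensor([0.9, 0.2, 0.8, 0.1])
y_true = torch.tensor([1, 0, 1, 0])
a, b = torch.tensor(0.8), torch.tensor(0.2)
s = 1.0 + (y_pred * (y_true == 0).float()).mean() - (y_pred * (y_true == 1).float()).mean()
print(aucm_squared(y_pred, y_true, a, b).item())          # 0.4275
print(aucm_minmax(y_pred, y_true, a, b, alpha=s).item())  # 0.4275
```

So the two expressions define the same objective once alpha is maximized out; the difference is in how the stochastic gradients behave, as discussed above.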
Can I regard the design of the margin loss as transforming the square loss $(m+f(x_j)-f(x_i))^2$ into $\max[0, m+f(x_j)-f(x_i)]^2$, so that $m+f(x_j)$ is allowed to be less than or equal to $f(x_i)$, whereas the square loss only drives it to be exactly equal to $f(x_i)$? When the loss value is 0, is there a potential problem that the gradient vanishes and the model cannot be updated?
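Regarding the zero-loss case, a tiny autograd check (toy scalar scores that I am assuming for illustration, not library code):

```python
import torch

m = 1.0
f_i = torch.tensor(3.0, requires_grad=True)   # score of a positive sample
f_j = torch.tensor(1.0, requires_grad=True)   # score of a negative sample

square_loss = (m + f_j - f_i) ** 2                  # wants f_i - f_j to equal m exactly
hinge_sq = torch.clamp(m + f_j - f_i, min=0) ** 2   # zero once f_i - f_j >= m

print(square_loss.item(), hinge_sq.item())    # 1.0  0.0
hinge_sq.backward()
print(f_i.grad, f_j.grad)                     # tensor(0.)  tensor(0.)
```

Pairs that already satisfy the margin contribute zero loss and zero gradient under the squared hinge, whereas the plain square loss keeps pulling $f(x_i)-f(x_j)$ back toward exactly $m$.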
In the demo you provided, the AUC test score of AUCM based on PESG on CIFAR10 can reach 0.9245, while the value quoted in your paper Large-scale Robust Deep AUC Maximization is 0.715±0.008. Is this because some content has been updated?
I would be grateful if you could reply as soon as possible. Wishing you a happy new year.