Initial and final evaluation in finetune scripts do not accumulate over devices #2116

@mseeger

Description

Bug description

I am looking at litgpt/finetune/lora.py and litgpt/finetune/full.py. In the LoRA code, the periodic evaluation (L401-418) runs validate on each device and then accumulates the partial losses with all_reduce.

However, this does not happen for the initial evaluation (L310) or the final evaluation (L260). This seems wrong to me: the reported loss would just be the value computed on the rank-0 device.

Another issue is the val_loss value printed in L395, which never seems to be updated by the code in L401-418.
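To illustrate why the missing accumulation matters, here is a toy computation with hypothetical per-rank loss values (not taken from litgpt): each rank validates on a different shard of the data, so the per-rank losses differ, and reporting only rank 0's value is not the same as the global mean.

```python
# Hypothetical per-device validation losses (illustrative values only).
rank_losses = [2.10, 2.45, 1.98, 2.31]

# Without accumulation (as in the current initial/final evaluation),
# the reported loss is whatever rank 0 happened to compute:
reported_without_reduce = rank_losses[0]

# With an all_reduce using a mean op (as in the periodic evaluation),
# every rank would report the same global average:
reported_with_reduce = sum(rank_losses) / len(rank_losses)

print(round(reported_without_reduce, 2))  # 2.1
print(round(reported_with_reduce, 2))     # 2.21
```

In the real code the reduction would be done with something like torch.distributed.all_reduce (or the Fabric equivalent) on a tensor holding the local loss, but the arithmetic discrepancy is the same.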

I'd be happy to submit a PR fixing all of this, but first I wanted to check whether I am misunderstanding something here.

Reproduced in studio

No response

What operating system are you using?

macOS

LitGPT Version

0.5.9

Metadata

Assignees

No one assigned

Labels

bug (Something isn't working)
