This repository was archived by the owner on Feb 25, 2022. It is now read-only.

GPT-3 configuration for a v3-32 TPU #183

@stefan-it

Hi,

many thanks for releasing this GPT training code 👍

I want to train a new model from scratch (with my own vocab), so I started from the following configuration file:

https://github.com/EleutherAI/gpt-neo/blob/master/configs/gpt3_small_256.json

However, I'm not 100% sure what to use for mesh_shape and layout, because I'm not using a 256-core TPU pod; I only have a v3-32.

Could you please provide some more information about how to choose the correct values for this setup?
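
To make the question concrete, here is a minimal sketch of the constraint as it is generally understood for Mesh TensorFlow: the sizes in mesh_shape must multiply to the number of TPU cores (32 on a v3-32), and every tensor dimension named in layout should be divisible by the size of the mesh axis it is split over. The helper names, the batch size of 256, and the head count of 12 are assumptions for illustration only; they are not taken from the gpt-neo code or from the linked config.

```python
def candidate_mesh_shapes(num_cores):
    """List every 2-D "x:A,y:B" mesh shape whose sizes multiply to num_cores."""
    shapes = []
    for x in range(1, num_cores + 1):
        if num_cores % x == 0:
            shapes.append(f"x:{x},y:{num_cores // x}")
    return shapes


def parse_pairs(spec):
    """Split a "name:value,name:value" string into a list of (name, value) pairs."""
    return [tuple(item.split(":")) for item in spec.split(",")]


def layout_fits(mesh_shape, layout, dim_sizes):
    """Check that each tensor dimension in layout divides evenly over its mesh axis.

    dim_sizes maps tensor dimension names to their sizes (hypothetical values).
    """
    mesh = {name: int(size) for name, size in parse_pairs(mesh_shape)}
    for tensor_dim, mesh_axis in parse_pairs(layout):
        if dim_sizes[tensor_dim] % mesh[mesh_axis] != 0:
            return False
    return True


if __name__ == "__main__":
    # Assumed model dimensions, for illustration only.
    dims = {"batch": 256, "heads": 12}
    for shape in candidate_mesh_shapes(32):
        ok = layout_fits(shape, "batch:x,heads:y", dims)
        print(shape, "fits" if ok else "does not fit")
```

This only narrows down which shapes are arithmetically valid for 32 cores; which of them trains fastest or fits in memory is a separate question that the sketch does not address.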

Many thanks in advance and best,

Stefan

Labels: documentation (Improvements or additions to documentation)