Init weights only if not loading a checkpoint #628

Draft pull request: 1 commit into base branch main


carmocca (Contributor) commented Oct 18, 2024

Initializing the weights is wasteful if a checkpoint will be loaded.

This assumes that checkpoints are complete (loadable in strict=True mode). Let me know if that's too strong of an assumption generally.

It might also be problematic with non-persistent (persistent=False) buffers, if they were supported (#316), that is, if you expect to initialize them in init_weights. One could argue for a separate init_buffers() method, though.
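
To make the proposal concrete, here is a minimal sketch of the control flow described above. It is not the code from this PR: `TinyModel` and `build_model` are hypothetical names, and the meta-device construction followed by `to_empty` is an assumption about how the model is materialized before initialization.

```python
from typing import Optional

import torch
import torch.nn as nn


class TinyModel(nn.Module):
    """Hypothetical stand-in for a real model; not code from this PR."""

    def __init__(self, dim: int = 4) -> None:
        super().__init__()
        self.linear = nn.Linear(dim, dim)

    def init_weights(self) -> None:
        nn.init.trunc_normal_(self.linear.weight, std=0.02)
        nn.init.zeros_(self.linear.bias)


def build_model(checkpoint: Optional[dict] = None) -> TinyModel:
    # Construct on the meta device so no real storage is allocated yet.
    with torch.device("meta"):
        model = TinyModel()
    # Allocate uninitialized storage on the target device.
    model.to_empty(device="cpu")
    if checkpoint is not None:
        # strict=True raises on missing or unexpected keys, so every
        # parameter is overwritten and init_weights() would be wasted work.
        model.load_state_dict(checkpoint, strict=True)
    else:
        model.init_weights()
    return model


fresh = build_model()                      # no checkpoint: weights initialized
resumed = build_model(fresh.state_dict())  # checkpoint: initialization skipped
```

The skip is only safe because of the `strict=True` assumption: with an incomplete checkpoint, any parameter missing from it would be left as uninitialized memory.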

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Meta Open Source bot. label Oct 18, 2024
awgu (Contributor) commented Oct 18, 2024

I agree with the spirit of this change :)

init_buffers for non-persistent buffers makes sense to me too.
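
A rough sketch of what a separate `init_buffers()` could look like, assuming the model is materialized from the meta device. `ModelWithBuffer`, `materialize`, and the `positions` buffer are hypothetical; only the `init_buffers` name comes from the discussion above.

```python
from typing import Optional

import torch
import torch.nn as nn


class ModelWithBuffer(nn.Module):
    """Hypothetical example model; not code from this PR."""

    def __init__(self, dim: int = 4) -> None:
        super().__init__()
        self.linear = nn.Linear(dim, dim)
        # persistent=False keeps this buffer out of state_dict(), so a
        # checkpoint can never restore it.
        self.register_buffer(
            "positions", torch.arange(dim, dtype=torch.float32), persistent=False
        )

    def init_buffers(self) -> None:
        # Recompute buffer contents in place; needed on both code paths.
        n = self.positions.numel()
        self.positions.copy_(torch.arange(n, dtype=self.positions.dtype))

    def init_weights(self) -> None:
        nn.init.trunc_normal_(self.linear.weight, std=0.02)
        nn.init.zeros_(self.linear.bias)


def materialize(checkpoint: Optional[dict] = None) -> ModelWithBuffer:
    with torch.device("meta"):
        model = ModelWithBuffer()
    model.to_empty(device="cpu")  # buffers now hold uninitialized memory
    model.init_buffers()          # always run, checkpoint or not
    if checkpoint is not None:
        model.load_state_dict(checkpoint, strict=True)
    else:
        model.init_weights()
    return model
```

Because non-persistent buffers are excluded from the state dict, `strict=True` loading does not complain about them, which is exactly why they need their own initialization step when `init_weights` is skipped.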

tianyu-l (Contributor) commented

cc: @fegin

4 participants