Training data splitting question #164

Open
sm307 opened this issue May 30, 2024 · 1 comment

Comments

@sm307

sm307 commented May 30, 2024

When a training code file is larger than the context size, it has to be split into multiple minibatches of 4096. Is this split done at the token level, the line level, or the function level?

@YutianGitHub

Personally, I think splitting by line is better.
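
For anyone comparing the options, here is a minimal sketch of what line-level splitting could look like. It is not the project's actual preprocessing, which this thread does not confirm. The names `split_by_lines` and `whitespace_token_count` are hypothetical, and the whitespace counter is only a stand-in for whatever tokenizer the model really uses.

```python
from typing import Callable, List


def split_by_lines(
    source: str,
    max_tokens: int,
    count_tokens: Callable[[str], int],
) -> List[str]:
    """Greedily pack whole lines into chunks of at most max_tokens tokens.

    Hypothetical helper for illustration only. A single line that is longer
    than max_tokens is kept intact and becomes its own oversized chunk.
    """
    chunks: List[str] = []
    current: List[str] = []
    current_tokens = 0
    for line in source.splitlines(keepends=True):
        n = count_tokens(line)
        # Close the current chunk if this line would push it over the budget.
        if current and current_tokens + n > max_tokens:
            chunks.append("".join(current))
            current, current_tokens = [], 0
        current.append(line)
        current_tokens += n
    if current:
        chunks.append("".join(current))
    return chunks


def whitespace_token_count(text: str) -> int:
    # Assumption: stand-in for a real tokenizer; counts whitespace-separated words.
    return len(text.split())


if __name__ == "__main__":
    src = "a b c\nd e\nf g h i\nj\n"
    for chunk in split_by_lines(src, 5, whitespace_token_count):
        print(repr(chunk))
```

Compared with token-level splitting, this never cuts a statement in the middle, at the cost of chunks that are usually a bit shorter than the budget. With a real subword tokenizer, per-line counts may not add up exactly to the count for the joined chunk, so some margin below 4096 would be prudent.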
