[Model] Add the Infinity-Instruct SFT code #278

Merged (17 commits) on Dec 3, 2024

Conversation

CathySama
Contributor

No description provided.

CathySama requested a review from a team as a code owner November 26, 2024 04:03
CathySama changed the title from "fyc update flagscale" to "add Infinity-Instruct SFT code" Nov 26, 2024
Contributor

Remove the train_ prefix from the file names since they are in the train directory.

Contributor

We have updated the launcher to use the unified run.py. Please remove dist_start.sh, dist_stop.sh, env.sh, and run.sh.

Contributor Author

These files (dist_start.sh, dist_stop.sh, env.sh, run.sh and the directories of tokenizers) have been removed.

  total_tokens = loss_mask.sum()
- loss = torch.cat([torch.sum(losses.view(-1) * loss_mask).view(1), total_tokens.view(1)])
+ loss = torch.cat([torch.sum(torch.masked_select(losses.view(-1), loss_mask == 1)).view(1), total_tokens.view(1)])
Contributor

This loss will also be used in pre-training. If the SFT requires a different one, we may need a better way to distinguish them.
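
For readers comparing the two `loss` expressions quoted above, here is a minimal plain-Python sketch of how the two reductions behave. It uses lists instead of tensors, and the helper names and sample values are hypothetical. The thread does not say why the change was made; the non-finite case below is only one way the two forms can differ.

```python
import math

def masked_sum_by_multiply(losses, loss_mask):
    """Mirrors torch.sum(losses.view(-1) * loss_mask): every position is multiplied by its mask value."""
    return sum(l * m for l, m in zip(losses, loss_mask))

def masked_sum_by_select(losses, loss_mask):
    """Mirrors torch.sum(torch.masked_select(losses.view(-1), loss_mask == 1)): masked-out positions are never read."""
    return sum(l for l, m in zip(losses, loss_mask) if m == 1)

# Finite per-token losses with a 0/1 mask: both reductions give the same sum.
losses = [0.5, 1.5, 2.0, 4.0]
loss_mask = [0.0, 1.0, 1.0, 0.0]
total_tokens = sum(loss_mask)
finite_multiply = masked_sum_by_multiply(losses, loss_mask)
finite_select = masked_sum_by_select(losses, loss_mask)

# A non-finite loss at a masked-out position: inf * 0.0 is NaN under IEEE 754,
# so the multiply form returns NaN while the select form skips that position.
losses_with_inf = [math.inf, 1.5, 2.0, 4.0]
poisoned_multiply = masked_sum_by_multiply(losses_with_inf, loss_mask)
clean_select = masked_sum_by_select(losses_with_inf, loss_mask)
```

With a strictly 0/1 mask and finite losses the two forms are interchangeable, which is why sharing the function with pre-training matters only if the inputs can differ in the ways sketched here.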

Contributor Author

I added a new file, train_aquila_sft.py, to keep the SFT loss separate from the pre-training loss.

aoyulong
aoyulong previously approved these changes Nov 28, 2024
Contributor

aoyulong left a comment

LGTM

Contributor

aoyulong left a comment

Please rename conf_qwen to conf since this folder is already in the parent qwen folder.

In /conf/train/qwen_2.5_1.5b.yaml, please add ckpt_format, ckpt_convert_format, and ckpt_convert_save to convert checkpoints.
@CathySama
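
As a rough illustration of the request above, the sketch below checks that a config mapping carries the three checkpoint-conversion keys. Only the key names come from the review comment. The flat layout, the helper name, and every value are placeholders, not the actual contents of qwen_2.5_1.5b.yaml.

```python
# Key names are taken from the review comment; everything else is hypothetical.
REQUIRED_CKPT_KEYS = ("ckpt_format", "ckpt_convert_format", "ckpt_convert_save")

def missing_ckpt_keys(config):
    """Return the checkpoint-conversion keys that are absent or empty in a flat config mapping."""
    return [key for key in REQUIRED_CKPT_KEYS if not config.get(key)]

example_config = {
    "ckpt_format": "torch",                          # placeholder value
    "ckpt_convert_format": "torch_dist",             # placeholder value
    "ckpt_convert_save": "/path/to/converted_ckpt",  # placeholder path
}
incomplete_config = {"ckpt_format": "torch"}

complete_missing = missing_ckpt_keys(example_config)
incomplete_missing = missing_ckpt_keys(incomplete_config)
```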
Contributor Author

I have renamed conf_qwen.

aoyulong changed the title from "add Infinity-Instruct SFT code" to "[Model] Add the Infinity-Instruct SFT code" Dec 3, 2024
Contributor

aoyulong left a comment

LGTM

aoyulong merged commit 39d1775 into FlagOpen:main Dec 3, 2024