Support for tracking commits pushed directly to branch #373

Open
2 tasks
rithviknishad opened this issue Mar 11, 2024 · 24 comments · May be fixed by #601 or #602
Assignees
Labels: good first issue (Good for newcomers)

Comments

@rithviknishad
Member

Problem

In certain cases, contributions are made to a project by pushing commits directly to a branch. The leaderboard currently does not track such activity by contributors.


Action Items

  • Update the scraper to track commits pushed to a branch by a contributor. Tag each such activity with whether or not it was made to the default branch.
  • Update the contributor page to reflect the new information being captured.
@dgparmar14
Contributor

I want to work on this issue.

@dgparmar14
Contributor

[screenshot: code snippet for fetching the default branch name]

@rithviknishad Please review the logic and the method used to fetch the default branch name in the provided code snippet.
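For reference, a minimal sketch of one way to do this with Octokit; the actual code was only shared as a screenshot, so the names here are illustrative:

```ts
import { Octokit } from "octokit";

const octokit = new Octokit({ auth: process.env.GITHUB_TOKEN });

// Look up a repository's metadata and return its default branch name,
// e.g. "main" or "develop".
async function getDefaultBranch(owner: string, repo: string): Promise<string> {
  const { data } = await octokit.rest.repos.get({ owner, repo });
  return data.default_branch;
}
```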

@rithviknishad
Member Author

rithviknishad commented Mar 22, 2024

Do events show PushEvents from forked repos?

If not, I don't think this will be a great value addition since most of the contributions are from forked repos.

@dgparmar14
Contributor

Okay, so I also need to track commits from contributors' forked repositories and show them on the profile page, right?

@rithviknishad
Member Author

Yeah, but is it possible?

@dgparmar14
Contributor

dgparmar14 commented Mar 22, 2024

Yup, it's possible.
Here are the steps:

  1. Fetch the main repository: First, fetch information about the main repository.
  2. Fetch the forks of the main repository
  3. Iterate through forks: For each fork, fetch the commits made by the contributors in their forked repositories.

[screenshots of the two proposed functions]

We can create these two functions (or merge them into one) and call them from the scraper function; a rough sketch follows.
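A minimal sketch of steps 2 and 3, assuming Octokit; the function names are illustrative, not the exact code from the screenshots:

```ts
import { Octokit } from "octokit";

const octokit = new Octokit({ auth: process.env.GITHUB_TOKEN });

// Step 2: list all forks of the main repository.
async function fetchForks(owner: string, repo: string) {
  return octokit.paginate(octokit.rest.repos.listForks, {
    owner,
    repo,
    per_page: 100,
  });
}

// Step 3: for each fork, fetch the commits authored by the fork owner.
async function fetchForkCommits(owner: string, repo: string) {
  const forks = await fetchForks(owner, repo);
  const results = [];
  for (const fork of forks) {
    if (!fork.owner) continue;
    const commits = await octokit.paginate(octokit.rest.repos.listCommits, {
      owner: fork.owner.login,
      repo: fork.name,
      author: fork.owner.login,
      per_page: 100,
    });
    results.push({ fork: fork.full_name, commits });
  }
  return results;
}
```

Note that this costs at least one listing call per fork (plus pagination), which is where the rate-limit concern below comes from.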

@dgparmar14
Contributor

@rithviknishad Even if we utilize the GraphQL API, we still encounter the same issue. We'd need to initially retrieve all repositories and their associated forks before we can access their commits. While pagination can help mitigate rate limit problems, it may not be applicable in this scenario since we're scraping data.
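For illustration, a GraphQL version would still have to nest repository → forks → commit history, so the per-fork cost doesn't go away; a rough sketch using octokit.graphql (the query shape is assumed, not taken from the repo):

```ts
import { Octokit } from "octokit";

const octokit = new Octokit({ auth: process.env.GITHUB_TOKEN });

// One page of forks, each with one page of commits from its default branch.
// Deeper pagination (more forks, more commits) multiplies the query cost.
async function fetchForkHistories(owner: string, name: string) {
  return octokit.graphql(
    `
    query ($owner: String!, $name: String!) {
      repository(owner: $owner, name: $name) {
        forks(first: 50) {
          nodes {
            nameWithOwner
            defaultBranchRef {
              target {
                ... on Commit {
                  history(first: 50) {
                    nodes { oid messageHeadline committedDate }
                  }
                }
              }
            }
          }
        }
      }
    }
    `,
    { owner, name },
  );
}
```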

@dgparmar14
Contributor

@rithviknishad
The profile page currently shows a "currently working on" feature.
My idea is that once a PR is opened, we start tracking the commits of that PR, because I think that is more meaningful. Another thing: the "currently working on" feature is showing scraped data right now, so I thought of changing it to an API request made when the user hits the route. Is it fine to fix the issue this way?
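A rough sketch of "start tracking commits once a PR is opened", assuming Octokit (the function name is illustrative):

```ts
import { Octokit } from "octokit";

const octokit = new Octokit({ auth: process.env.GITHUB_TOKEN });

// List the commits that belong to a PR, regardless of which fork or
// branch they were pushed to.
async function fetchPullRequestCommits(
  owner: string,
  repo: string,
  pullNumber: number,
) {
  return octokit.paginate(octokit.rest.pulls.listCommits, {
    owner,
    repo,
    pull_number: pullNumber,
    per_page: 100,
  });
}
```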

@rithviknishad
Member Author

Yeah, that'd be nice. That'd also bring the ability to blend viewing of scraped data with data from other sources.

@rajku-dev
Contributor

rajku-dev commented Jan 2, 2025

Currently, the feed page only shows push activities for core team members with direct access. The ask is to display all commits made by contributors (from forked repos), including their individual commits.

There are 3 cases:

  • User pushes commits to a forked repo's branch without creating a PR
  • After creating a PR, user pushes commits to the forked repo's branch (upon review)
  • Merged commit (by a team member) displayed on the feed

Is this the task to be done? @rithviknishad

@rajku-dev
Contributor

@rithviknishad I have designed an algorithm that allows contributions made from forked repos to be displayed on the feed. The final result looks like this:
[screenshot of the resulting feed]

Can I make a PR and work further on optimizing it?

@rithviknishad
Member Author

rithviknishad commented Jan 3, 2025

The issue is about tracking commits in the flat data repo, not the feed page.

@rajku-dev
Contributor

The issue is about tracking commits in the flat data repo, not the feed page.

So we need to have this algorithm in a scraper in the github-scraper folder (fetchForkedCommits.ts), plus a new ACTIVITY_TYPE "forked_commits" and a new commits: Activity[] field in ActivityData. The changes will be reflected in the users' .json files in the data repo, and we can update the profile from there on, right?
@rithviknishad

@rithviknishad
Member Author

Why do we need a new activity type called "forked_commits"?

We should be tracking all types of commits via the activity type "pushed_commits", which includes both forked and non-forked ones.

@rajku-dev
Contributor

rajku-dev commented Jan 7, 2025

Why do we need a new activity type called "forked_commits"?

We should be tracking all types of commits via the activity type "pushed_commits", which includes both forked and non-forked ones.

Exactly what I was thinking.
We can have a new pushed_commits ACTIVITY_TYPE for PushEvent-type activities.
Currently we are not parsing PushEvents in parseEvents; if we do, we'll get pushed_commits activities for core-team members, because that is what fetchEvents fetches.
We'll also write the algorithm for fetching PushEvents from forked repos, push them into the existing Activity[] as pushed_commits for each user, and sort them. Or do we really need to sort them? Maybe, since that scraper will push on top of the existing activities (Comment, Issue, ...) from the existing scrapers.
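A minimal sketch of what parsing a PushEvent into a pushed_commits activity could look like; the activity shape below is an assumption for illustration, not the repo's actual Activity type:

```ts
// Assumed shape for illustration only; the real Activity type lives in the scraper.
interface PushedCommitsActivity {
  type: "pushed_commits";
  title: string;
  time: string;
  link: string;
  text: string;
}

// `event` is a PushEvent as returned by the GitHub Events API.
function parsePushEvent(event: {
  repo: { name: string };
  created_at: string;
  payload: { ref: string; commits?: { sha: string; message: string }[] };
}): PushedCommitsActivity {
  const repoName = event.repo.name; // e.g. "ohcnetwork/leaderboard"
  const branch = event.payload.ref.replace("refs/heads/", "");
  const commits = event.payload.commits ?? [];

  return {
    type: "pushed_commits",
    title: `Pushed ${commits.length} commit(s) to ${repoName}#${branch}`,
    time: event.created_at,
    link: `https://github.com/${repoName}/commits/${branch}`,
    text: commits.map((c) => c.message.split("\n")[0]).join("; "),
  };
}
```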

@rithviknishad
Member Author

Yup, sounds good. And skip sorting; this is just a dump of data. If sorting is ever needed for any situation, it should happen during the pre-build step, when the data from the JSON is loaded for building the pages.
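If sorting is ever needed there, a tiny helper at load time would be enough (field names assumed for illustration):

```ts
type Activity = { type: string; time: string };

// Sort newest-first when the JSON dump is loaded during the pre-build step.
function sortActivities(activities: Activity[]): Activity[] {
  return [...activities].sort(
    (a, b) => new Date(b.time).getTime() - new Date(a.time).getTime(),
  );
}
```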

@rajku-dev
Contributor

Yup, sounds good. And skip sorting; this is just a dump of data. If sorting is ever needed for any situation, it should happen during the pre-build step, when the data from the JSON is loaded for building the pages.

We can definitely do that. Could you please assign this issue to me? I’d like to move forward with implementing the discussed approach.

@rajku-dev
Contributor

rajku-dev commented Jan 9, 2025

How do I test the changes made by adding a new scraper to the local data repo (the users' .json files)?
I stumbled upon this thread for some guidance and set the GIT_ACCESS_TOKEN secret to a GIT_PAT, but got an error while running the leaderboard-data workflow manually through the GitHub interface.
Are we even allowed to do that, or should we just write the code?

@rithviknishad
Member Author

How do I test the changes made by adding a new scraper to the local data repo (the users' .json files)? I stumbled upon this thread for some guidance and set the GIT_ACCESS_TOKEN secret to a GIT_PAT, but got an error while running the leaderboard-data workflow manually through the GitHub interface. Are we even allowed to do that, or should we just write the code?

Just saw this. I believe you were able to figure it out, since you've already made the PR.

@rajku-dev
Contributor

How do I test the changes made by adding a new scraper to the local data repo (the users' .json files)? I stumbled upon this thread for some guidance and set the GIT_ACCESS_TOKEN secret to a GIT_PAT, but got an error while running the leaderboard-data workflow manually through the GitHub interface. Are we even allowed to do that, or should we just write the code?

Just saw this. I believe you were able to figure it out, since you've already made the PR.

Not really. I tried deploying to Vercel following the docs, but it failed.
I just need to test the scrapers, which I have not figured out how to do.

@rithviknishad
Member Author

You can test the scraper locally.

First, you'll need to pre-load an existing data repo (you can also start from scratch if you want). Run the script at https://github.com/ohcnetwork/leaderboard/blob/main/package.json#L12

Second, cd into the scraper directory and run it with the following command (assuming you have the gh CLI; otherwise replace the token with your own personal access token. No extra permissions need to be configured; any PAT works):

GITHUB_TOKEN=$(gh auth token) pnpm dev ohcnetwork ../data-repo/data/github
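Once a run finishes, a quick way to sanity-check the output is to scan the generated user JSON files; this sketch assumes they end up under data-repo/data/github and carry an activity array, which may differ from the actual layout:

```ts
import { readdirSync, readFileSync } from "node:fs";
import { join } from "node:path";

// Count pushed_commits activities per user in the scraped data dump.
const dataDir = "../data-repo/data/github";
for (const file of readdirSync(dataDir).filter((f) => f.endsWith(".json"))) {
  const user = JSON.parse(readFileSync(join(dataDir, file), "utf8"));
  const pushed = (user.activity ?? []).filter(
    (a: { type: string }) => a.type === "pushed_commits",
  );
  if (pushed.length > 0) {
    console.log(`${file}: ${pushed.length} pushed_commits activities`);
  }
}
```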

@rajku-dev
Contributor

rajku-dev commented Jan 10, 2025

You can test the scraper locally.

First, you'll need to pre-load an existing data repo (you can also start from scratch if you want). Run the script at https://github.com/ohcnetwork/leaderboard/blob/main/package.json#L12

Second, cd into the scraper directory and run it with the following command (assuming you have the gh CLI; otherwise replace the token with your own personal access token. No extra permissions need to be configured; any PAT works):

GITHUB_TOKEN=$(gh auth token) pnpm dev ohcnetwork ../data-repo/data/github

Thank you, sir! Now I am able to test the scraper and debug the changes in the terminal 🚀
I'll update the README.

@rithviknishad
Member Author

feel free to make a PR to update the README 😄

@rajku-dev
Contributor

feel free to make a PR to update the README 😄

yeah that'd be better 🙂
