alloevil/github-discovery

GitHub Discovery

Discover trending GitHub repos before they go mainstream.
6 data sources · smart scoring · anti-spam · daily email digest

Website · Quick Start · Features · Development


What it does

GitHub Discovery collects signals from 6 data sources every day, ranks repositories with a 100-point scoring system, and delivers the most promising projects by email and on the web.

The problem it solves: GitHub Trending shows you what's popular today. GitHub Discovery shows you what's about to be popular — repos with unusual growth patterns, community picks from Hacker News and Reddit, and early-stage projects gaining traction.


Features

6 Data Sources

| Source | Signal | What it catches |
| --- | --- | --- |
| GitHub Trending | Popularity | Daily trending repositories |
| GitHub Search | New & rising | Repos created in the last 7 days with fast star growth |
| Hacker News | Community picks | GitHub repos from Show HN posts |
| Reddit | Discussion | GitHub links from /r/programming hot posts |
| Rising Detection | Early signal | Unusual Fork/Watch growth patterns |
| AI/ML Trending | AI focus | AI/ML repositories with fast growth (inspired by OSSInsight) |
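
As an illustration of the GitHub Search and community sources, the sketch below builds a search query for recently created repos and extracts `owner/repo` names from free text such as a Hacker News or Reddit post. The function names, the `min_stars` threshold, and the regular expression are assumptions made for this example, not the project's actual code. `created:` and `stars:` are standard GitHub search qualifiers.

```python
import re
from datetime import date, timedelta

# Hypothetical pattern: matches github.com/<owner>/<repo> links in free text.
GITHUB_LINK = re.compile(r"https?://github\.com/([A-Za-z0-9_.-]+)/([A-Za-z0-9_.-]+)")


def build_search_query(today: date, days: int = 7, min_stars: int = 50) -> str:
    """Build a GitHub search query for repos created in the last `days` days."""
    since = today - timedelta(days=days)
    return f"created:>={since.isoformat()} stars:>={min_stars}"


def extract_repos(text: str) -> list[str]:
    """Return unique owner/repo names found in `text`, in order of appearance."""
    seen: list[str] = []
    for owner, repo in GITHUB_LINK.findall(text):
        # Drop sentence punctuation and a trailing ".git" picked up by the pattern.
        name = f"{owner}/{repo.rstrip('.').removesuffix('.git')}"
        if name not in seen:
            seen.append(name)
    return seen
```

The query string can be passed as the `q` parameter of GitHub's repository search, sorted by stars.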

Smart Scoring (100 points)

| Dimension | Points | What it measures |
| --- | --- | --- |
| Acceleration | 40 | Star growth rate, acceleration trend |
| Quality | 30 | Age, language, license, content completeness |
| Anti-spam | 30 | Fork ratio, description quality |
| Code Quality | +20 | README, CI config, commit frequency |
| Suspicious Stars | -15 | 1000+ stars in 1 day with no description |
| User Feedback | ±10 | 👍👎 voting integrated into scoring |
| Batch Fraud | -40 | Multiple repos from same owner growing simultaneously |
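
One way to read the table is as three base dimensions that sum to 100, plus bonuses and penalties applied on top. The sketch below shows that reading. The `Signals` class and its field names are hypothetical. The table does not say whether the final score is capped at 100, so the sketch leaves it uncapped.

```python
from dataclasses import dataclass


@dataclass
class Signals:
    """Hypothetical per-repo inputs, already scaled to their point budgets."""
    acceleration: float            # 0..40
    quality: float                 # 0..30
    antispam: float                # 0..30
    code_quality: float = 0.0      # 0..20 bonus
    feedback: float = 0.0          # -10..10 from user votes
    suspicious_stars: bool = False
    batch_fraud: bool = False


def clamp(value: float, low: float, high: float) -> float:
    """Keep `value` inside the closed range [low, high]."""
    return max(low, min(high, value))


def total_score(s: Signals) -> float:
    """Sum the base dimensions, then apply the bonuses and penalties from the table."""
    score = clamp(s.acceleration, 0, 40) + clamp(s.quality, 0, 30) + clamp(s.antispam, 0, 30)
    score += clamp(s.code_quality, 0, 20)
    score += clamp(s.feedback, -10, 10)
    if s.suspicious_stars:
        score -= 15
    if s.batch_fraud:
        score -= 40
    return score
```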

Anti-spam

  • Star fraud detection: 1000+ stars in 1 day with age < 1 day → flagged
  • Batch fraud detection: Same owner with multiple repos growing at once → flagged
  • Content quality: No description or no README → penalty
  • Cross-day dedup: 7-day window, no duplicate recommendations
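
The checks above could be sketched as follows. All function names, the `min_repos` threshold, and the shape of `history` are assumptions made for this example. The project's real logic lives in scripts/fraud_detection.py and scripts/dedup.py and may differ.

```python
from datetime import date, timedelta


def is_star_fraud(stars_gained_1d: int, age_days: float) -> bool:
    """Flag repos that gained 1000+ stars in one day while less than a day old."""
    return stars_gained_1d >= 1000 and age_days < 1


def batch_fraud_owners(growing_repos: list[str], min_repos: int = 2) -> set[str]:
    """Return owners with at least `min_repos` repos growing at the same time."""
    counts: dict[str, int] = {}
    for full_name in growing_repos:
        owner = full_name.split("/", 1)[0]
        counts[owner] = counts.get(owner, 0) + 1
    return {owner for owner, n in counts.items() if n >= min_repos}


def filter_recent_duplicates(candidates: list[str], history: dict[str, date],
                             today: date, window_days: int = 7) -> list[str]:
    """Drop candidates recommended within the last `window_days` days.

    `history` maps a repo's full name to the date it was last recommended.
    A repo recommended exactly `window_days` days ago is allowed again
    (an assumption about where the window boundary falls).
    """
    cutoff = today - timedelta(days=window_days)
    return [r for r in candidates if r not in history or history[r] <= cutoff]
```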

User Feedback

  • Vote on every recommendation (👍/👎)
  • Feedback integrated into scoring algorithm
  • localStorage persistence
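
The README does not say how votes turn into the ±10 adjustment. As one possibility, the sketch below scales the net vote ratio to that range. The formula and the function name are assumptions, not the project's documented behavior.

```python
def feedback_adjustment(upvotes: int, downvotes: int, max_points: float = 10.0) -> float:
    """Map vote counts to a score adjustment in [-max_points, +max_points]."""
    total = upvotes + downvotes
    if total == 0:
        return 0.0  # no votes, no adjustment
    return max_points * (upvotes - downvotes) / total
```

A ratio-based mapping keeps the adjustment bounded no matter how many votes a repo collects, though it treats a single vote the same as a unanimous hundred.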

Email Subscription

  • Daily curated repos delivered to your inbox
  • Dark mode support (Apple Mail / iOS)
  • Powered by Resend API
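
For readers curious how a digest might be sent, here is a minimal sketch that builds a request for Resend's `POST /emails` endpoint using only the standard library. The function name and the sender and recipient addresses are placeholders assumed for this example. The project's own sending code may be structured differently.

```python
import json
import urllib.request

RESEND_ENDPOINT = "https://api.resend.com/emails"


def build_digest_request(api_key: str, sender: str, recipient: str,
                         subject: str, html: str) -> urllib.request.Request:
    """Build (but do not send) one digest email request for a single recipient."""
    payload = {"from": sender, "to": [recipient], "subject": subject, "html": html}
    return urllib.request.Request(
        RESEND_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen()` would send the email. Building one request per subscriber keeps recipient addresses from being exposed to each other.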

GitHub Pages

  • Modern, professional web interface
  • Filter by date and language
  • Real-time scoring display

Quick Start

1. Fork this repo

Click the Fork button in the top right corner.

2. Configure Secrets

Go to Settings → Secrets and variables → Actions and add:

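| Secret | Required | Description |
| --- | --- | --- |
| RESEND_API_KEY | Yes | Resend API key for sending emails |
| GITHUB_TOKEN | No | GitHub Personal Access Token (optional; the GITHUB_TOKEN that GitHub Actions provides automatically is used by default) |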

3. Enable GitHub Actions

Go to the Actions tab and click "I understand my workflows, go ahead and enable them".

4. Test manually

Go to Actions → Daily Discovery → Run workflow to trigger a test run.

5. View results

  • GitHub Pages: Visit https://<your-username>.github.io/github-discovery/
  • Email: Subscribers receive daily digests

Project Structure

github-discovery/
├── scripts/
│   ├── sources.py           # 6 data source collectors
│   ├── scorer.py            # Scoring algorithm
│   ├── quality.py           # Code quality detection
│   ├── dedup.py             # Cross-day deduplication (7-day window)
│   ├── feedback.py          # User feedback system
│   ├── fraud_detection.py   # Batch fraud detection
│   ├── verify_scoring.py    # Scoring verification / backtesting
│   ├── main.py              # Entry point
│   └── config.py            # Configuration
├── tests/                   # 117 unit tests
├── docs/index.html          # GitHub Pages
├── .github/workflows/       # Daily automation
├── subscribers.txt          # Email subscriber list
└── config.yaml              # Runtime configuration

Development

Local Run

git clone https://github.com/alloevil/github-discovery.git
cd github-discovery
python scripts/main.py

Run Tests

pip install pytest
python -m pytest tests/ -v

Add a New Data Source

  1. Add a new fetch_xxx() function in scripts/sources.py
  2. Call it in fetch_all()
  3. Add tests in tests/test_sources.py
  4. Submit a PR
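
The first two steps might look roughly like this. The record shape, the `SOURCES` list, and the error handling are assumptions made for illustration. Check scripts/sources.py for the actual signatures before copying anything.

```python
from typing import Callable

Repo = dict  # hypothetical record shape: {"full_name": ..., "source": ...}


def fetch_example() -> list[Repo]:
    """Hypothetical new collector; a real one would call an external API."""
    return [{"full_name": "octocat/hello-world", "source": "example"}]


# Registering the collector here is what "call it in fetch_all()" amounts to in this sketch.
SOURCES: list[Callable[[], list[Repo]]] = [fetch_example]


def fetch_all(sources: list[Callable[[], list[Repo]]] = SOURCES) -> list[Repo]:
    """Run every collector and merge results, keeping the first record per repo."""
    repos: dict[str, Repo] = {}
    for fetch in sources:
        try:
            results = fetch()
        except Exception as exc:
            # One failing source is skipped so it cannot abort the daily run.
            print(f"{fetch.__name__} failed: {exc}")
            continue
        for repo in results:
            repos.setdefault(repo["full_name"], repo)
    return list(repos.values())
```

A test for the new source can then call `fetch_all()` with a stubbed collector and assert on the merged records, without touching the network.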

Scoring Algorithm

The scoring logic lives in scripts/scorer.py. Weights can be adjusted in scripts/config.py:

SCORING_WEIGHTS = {
    "acceleration": 40,
    "quality": 30,
    "antispam": 30,
}
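
When changing the weights, it helps to check that they still add up to the 100-point budget. The sketch below does that and applies the weights to sub-scores normalized to the range 0 to 1. Both helper functions and the normalized sub-score convention are assumptions for illustration, not the contents of scorer.py.

```python
SCORING_WEIGHTS = {
    "acceleration": 40,
    "quality": 30,
    "antispam": 30,
}


def validate_weights(weights: dict[str, int], total: int = 100) -> None:
    """Raise ValueError if the weights do not sum to the expected budget."""
    actual = sum(weights.values())
    if actual != total:
        raise ValueError(f"weights sum to {actual}, expected {total}")


def base_score(subscores: dict[str, float],
               weights: dict[str, int] = SCORING_WEIGHTS) -> float:
    """Combine per-dimension sub-scores (each 0..1) into a weighted total."""
    return sum(weights[name] * subscores.get(name, 0.0) for name in weights)
```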

Scoring Verification

Run backtesting to verify whether high-scored repos actually took off:

python scripts/verify_scoring.py --days 30
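
The README does not define what "took off" means or how the check is computed. One simple measure is a hit rate: among repos that scored above a threshold, the fraction whose star count later grew by some factor. The sketch below uses that measure. The thresholds and the record layout are assumptions, not the behavior of verify_scoring.py.

```python
from typing import Iterable, Optional


def hit_rate(records: Iterable[tuple[float, int, int]],
             score_threshold: float = 70,
             growth_factor: float = 2.0) -> Optional[float]:
    """Fraction of high-scored repos whose stars grew by `growth_factor`.

    Each record is (score_at_pick, stars_at_pick, stars_now).
    Returns None when no repo reached the score threshold.
    """
    picks = [(then, now) for score, then, now in records if score >= score_threshold]
    if not picks:
        return None
    hits = sum(1 for then, now in picks if then > 0 and now >= growth_factor * then)
    return hits / len(picks)
```

Comparing this rate against the same rate for low-scored repos gives a rough sense of whether the score carries any signal.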

Contributing

Contributions are welcome! Please follow these steps:

  1. Fork this repo
  2. Create a feature branch: git checkout -b feature/your-feature
  3. Commit your changes: git commit -m 'feat: add your feature'
  4. Push the branch: git push origin feature/your-feature
  5. Submit a Pull Request

Contribution Ideas

  • 📡 Add new data sources
  • 🎯 Optimize scoring algorithm
  • 🐛 Fix bugs
  • 📖 Improve documentation
  • ✅ Add tests

License

This project is licensed under the MIT License.


⭐ If you find this useful, please give it a star!
