AICL-Lab
Popular repositories
- the-book-of-secret-knowledge-zh Public
📚 The Book of Secret Knowledge, Chinese edition: a curated collection of tools, manuals, cheatsheets, and resources for SysAdmins, DevOps, Pentesters and Security Researchers. Chinese translation of the-book-of-secret-knowledge.
Python 5
- brave-sync-notes Public
🔐 End-to-end encrypted note sync with real-time collaboration
JavaScript 4
- diy-flash-attention Public
Learn Triton by building FlashAttention from scratch: V2 kernels, persistent threads, mask DSL, profiling toolkit, bilingual docs
Python 4
- heterogeneous-task-scheduler Public
C++17 DAG scheduler for heterogeneous CPU/GPU workloads, production-ready with a CPU-only validation path
C++ 4
- triton-fused-ops Public
Fused Triton kernels for Transformer inference (RMSNorm+RoPE, Gated MLP, FP8 GEMM), with CPU-testable references, autotuning, and benchmarking
Python 4
Repositories
- gpu-fft Public
High-performance GPU-accelerated FFT library for JavaScript/TypeScript using WebGPU compute shaders. Zero runtime dependencies, dual GPU/CPU paths, TypeScript-first.
- chatroom Public
Teaching-oriented real-time chat app with Go, React, PostgreSQL, WebSocket, observability, and OpenSpec-driven workflow.
- cuflash-attn Public
CUDA C++ FlashAttention reference implementation - O(N) memory, FP32/FP16, forward/backward
- webgpu-sorting Public
High-performance GPU sorting library using WebGPU compute shaders (Bitonic Sort, Radix Sort) with TypeScript API, live demo, and comprehensive documentation
- n-body Public
High-performance N-body particle simulation using the Barnes-Hut algorithm, with GPU acceleration and real-time visualization
- modern-ai-kernels Public
TensorCraft-HPC: A header-only C++/CUDA kernel library for learning high-performance AI operators with progressive optimization paths
- tiny-llm Public
CUDA-native C++ Transformer inference engine with W8A16 quantization, KV cache management, and optimized CUDA kernels
- llm-speed Public
CUDA kernels for LLM inference: FlashAttention forward, Tensor Core GEMM, and PyTorch bindings
- aurora-signal Public
Lightweight WebRTC signaling server (Go): room management, role-based auth, Redis horizontal scaling, and Prometheus metrics