Nemotron-Pre-Training-Datasets Collection Large-scale pre-training datasets used in the Nemotron family of models. • 15 items • Updated 3 days ago • 157
The Ultra-Scale Playbook 🌌 The ultimate guide to training LLMs on large GPU clusters • 3.87k
An Empirical Study of Autoregressive Pre-training from Videos Paper • 2501.05453 • Published Jan 9, 2025 • 41
Scaling Properties of Diffusion Models for Perceptual Tasks Paper • 2411.08034 • Published Nov 12, 2024 • 13