Nolan Dey
Other names: Nolan S. Dey, Nolan Simran Dey
Cerebras Systems
Verified email at cerebras.net
Title
Cited by
Year
SlimPajama: A 627B token cleaned and deduplicated version of RedPajama
D Soboleva, F Al-Khateeb, R Myers, JR Steeves, J Hestness, N Dey
https://www.cerebras.net/blog/slimpajama-a-627b-token-cleaned-and …, 2023
Cited by: 163*; Year: 2023
Cerebras-GPT: Open Compute-Optimal Language Models Trained on the Cerebras Wafer-Scale Cluster
N Dey, G Gosal, ZC Chen, H Khachane, W Marshall, R Pathria, M Tom, ...
arXiv preprint arXiv:2304.03208, 2023
Cited by: 97; Year: 2023
BTLM-3B-8K: 7B Parameter Performance in a 3B Parameter Model
N Dey, D Soboleva, F Al-Khateeb, B Yang, R Pathria, H Khachane, ...
arXiv preprint arXiv:2309.11568, 2023
Cited by: 10*; Year: 2023
37,000 Human-Planned Robotic Grasps With Six Degrees of Freedom
VR Osorio, R Iyengar, X Yao, P Bhattachan, A Ragobar, N Dey, B Tripp
IEEE Robotics and Automation Letters 5 (2), 3346-3351, 2020
Cited by: 5; Year: 2020
Sparse maximal update parameterization: A holistic approach to sparse training dynamics
N Dey, S Bergsma, J Hestness
The Thirty-eighth Annual Conference on Neural Information Processing Systems, 2024
Cited by: 3; Year: 2024
Position Interpolation Improves ALiBi Extrapolation
F Al-Khateeb, N Dey, D Soboleva, J Hestness
arXiv preprint arXiv:2310.13017, 2023
Cited by: 2; Year: 2023
Studying CNN representations through activation dimensionality reduction and visualization
NS Dey
University of Waterloo, 2021
Cited by: 1; Year: 2021
Straight to Zero: Why Linearly Decaying the Learning Rate to Zero Works Best for LLMs
S Bergsma, NS Dey, G Gosal, G Gray, D Soboleva, J Hestness
The Thirteenth International Conference on Learning Representations, 2025
2025
Empirical Upper Bounds for Unstructured Sparsity in Compute-Efficient Language Modeling
E Singh, S Bergsma, NS Dey, J Hestness, G Gray
Workshop on Machine Learning and Compression, NeurIPS 2024, 2024
2024
The Practitioner’s Guide to the Maximal Update Parameterization
N Dey, Q Anthony, J Hestness
https://cerebras.ai/blog/the-practitioners-guide-to-the-maximal-update …, 2024
2024
Identifying and interpreting tuning dimensions in deep networks
NS Dey, JE Taylor, BP Tripp, A Wong, GW Taylor
NeurIPS 2020 Workshop on Shared Visual Representations in Human & Machine …, 2020
2020