Chaim Rand in Towards Data Science | On the Programmability of AWS Trainium and Inferentia | Accelerating AI/ML Model Training with Custom Operators — Part 4 | 5d ago
Chaim Rand in Towards Data Science | AI Model Optimization on AWS Inferentia and Trainium | Tips for accelerating ML with AWS Neuron SDK | Oct 20
Chaim Rand in Towards Data Science | Implementing Sequential Algorithms on TPU | Accelerating AI/ML Model Training with Custom Operators — Part 3.A | Oct 7
Chaim Rand in Towards Data Science | The Rise of Pallas: Unlocking TPU Potential with Custom Kernels | Accelerating AI/ML Model Training with Custom Operators — Part 3 | Oct 6
Chaim Rand in Towards Data Science | Training AI Models on CPU | Revisiting CPU for ML in an Era of GPU Scarcity | Sep 1
Chaim Rand in Towards Data Science | Unleashing the Power of Triton: Mastering GPU Kernel Optimization in Python | Accelerating AI/ML Model Training with Custom Operators — Part 2 | Aug 13
Chaim Rand in Towards Data Science | Accelerating AI/ML Model Training with Custom Operators | On the potential benefits of creating model-specific GPU kernels and their application to optimizing the use of dynamically shaped tensors | Aug 11
Chaim Rand in Towards Data Science | Multi-Framework AI/ML Development with Keras 3 | All hail the return of Keras | Jun 16
Chaim Rand in Towards Data Science | AI Model Training with JAX | Hit the road to super-fast AI/ML development | May 29
Chaim Rand in Towards Data Science | PyTorch Native FP8 | Accelerating PyTorch Training Workloads with FP8 — Part 2 | May 21