Published in Towards Data Science:

- Optimizing Transformer Models for Variable-Length Input Sequences: How PyTorch NestedTensors, FlashAttention2, and xFormers Can Boost Performance and Reduce AI Costs (Nov 26, 2024)
- Increasing Transformer Model Efficiency Through Attention Layer Optimization: How Paying "Better" Attention Can Drive ML Cost Savings (Nov 18, 2024)
- On the Programmability of AWS Trainium and Inferentia: Accelerating AI/ML Model Training with Custom Operators, Part 4 (Nov 1, 2024)
- AI Model Optimization on AWS Inferentia and Trainium: Tips for Accelerating ML with the AWS Neuron SDK (Oct 20, 2024)
- Implementing Sequential Algorithms on TPU: Accelerating AI/ML Model Training with Custom Operators, Part 3.A (Oct 7, 2024)
- The Rise of Pallas: Unlocking TPU Potential with Custom Kernels: Accelerating AI/ML Model Training with Custom Operators, Part 3 (Oct 6, 2024)
- Training AI Models on CPU: Revisiting CPU for ML in an Era of GPU Scarcity (Sep 1, 2024)
- Unleashing the Power of Triton: Mastering GPU Kernel Optimization in Python: Accelerating AI/ML Model Training with Custom Operators, Part 2 (Aug 13, 2024)
- Accelerating AI/ML Model Training with Custom Operators: On the Potential Benefits of Creating Model-Specific GPU Kernels and Their Application to Optimizing the Use of Dynamically Shaped Tensors (Aug 11, 2024)
- Multi-Framework AI/ML Development with Keras 3: All Hail the Return of Keras (Jun 16, 2024)