"No one is harder on a talented person than the person themselves" - Linda Wilkinson ; "Trust your guts and don't follow the herd" ; "Validate direction not destination" ;

December 22, 2022

SageMaker - Notes

Key Notes

  • Create endpoint
  • Expose as REST API
  • Real-time / Batch inference
  • Async inference for images
  • Deploy on CPU / GPU
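
A minimal sketch of calling a real-time endpoint once it is created, assuming the model accepts and returns JSON. The helper names (build_realtime_request, predict), the endpoint name and the payload shape are my own placeholders, not from the session. The client is expected to be boto3.client("sagemaker-runtime"), whose invoke_endpoint call takes EndpointName, ContentType and Body.

```python
import json


def build_realtime_request(endpoint_name, payload):
    """Build keyword arguments for the SageMaker runtime invoke_endpoint call.

    Assumes the deployed model accepts a JSON body (an assumption, not a given).
    """
    return {
        "EndpointName": endpoint_name,
        "ContentType": "application/json",
        "Body": json.dumps(payload).encode("utf-8"),
    }


def predict(client, endpoint_name, payload):
    """Send one real-time request and decode the JSON response body.

    `client` is expected to be boto3.client("sagemaker-runtime").
    """
    response = client.invoke_endpoint(
        **build_realtime_request(endpoint_name, payload)
    )
    # The response "Body" is a stream; read() returns the raw bytes.
    return json.loads(response["Body"].read())
```

Usage would look like predict(boto3.client("sagemaker-runtime"), "my-endpoint", {"inputs": [1.0, 2.0]}), where "my-endpoint" is a hypothetical endpoint name. Passing the client in keeps the helper easy to try out with a stub before touching AWS.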

Factors for Deployment

  • Model Complexity
  • Payload
  • Complex Workflow
  • Compute, Storage, Networking Cost

Large Language Models Supported

  • Parallelize the model
  • Latency in the low milliseconds

Async / Vision

  • Input is uploaded to the bucket
  • Output is written to the bucket
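
The bucket-in, bucket-out flow above can be sketched roughly as follows. This assumes the image is already uploaded to S3 and that the endpoint was deployed with an async inference configuration. The helper names, bucket, key and the "application/x-image" content type are placeholders I picked for illustration. The client is expected to be boto3.client("sagemaker-runtime"), whose invoke_endpoint_async call takes an InputLocation and returns an OutputLocation.

```python
def build_async_request(endpoint_name, bucket, key,
                        content_type="application/x-image"):
    """Build keyword arguments for invoke_endpoint_async.

    The request carries only a pointer to the input object in S3,
    not the image bytes themselves. The content type is an assumption.
    """
    return {
        "EndpointName": endpoint_name,
        "InputLocation": f"s3://{bucket}/{key}",
        "ContentType": content_type,
    }


def submit_async(client, endpoint_name, bucket, key):
    """Queue one async inference request and return where the result will land.

    `client` is expected to be boto3.client("sagemaker-runtime").
    """
    response = client.invoke_endpoint_async(
        **build_async_request(endpoint_name, bucket, key)
    )
    # The call returns immediately; the prediction is written to this
    # S3 location later, once the request has been processed.
    return response["OutputLocation"]
```

The returned location is only where the output will appear, so the caller still has to wait or poll for that object before reading the result.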



Lower cost and higher throughput


Cold Start Latency 



Ops Features





Keep Exploring!!!
