Scalable AI Inference: Performance Analysis and Optimization of AI Model Serving, by Hung Cuong Pham and Fatih Gedikli