Posts by 'Ramandeep Singh'

Why Starlette's TestClient Might Break Your pytest-asyncio Tests (And How to Fix It)

Testing FastAPI applications with async resources? If you're using Starlette's TestClient, you're mixing sync and async contexts in ways that might break your tests. Here's why httpx.AsyncClient should be your go-to choice for keeping everything properly async.

The Hidden Problem with TestClient: Sync/Async Context Mixing

TestClient looks convenient …

"Fire but don't forget": The Hidden Costs of Async fire and forget patterns

In modern web applications, we often use async fire-and-forget patterns to improve user experience, for example, when a user clicks an item on a page, or an ML model makes a real-time prediction and wants to log the event. These events are often not super critical and we …

A few caveats with online inference

Introduction

Online inference is a technique used to deploy machine learning models in production. However, there are several considerations to keep in mind when deploying models, especially when using FastAPI/gunicorn along with libraries such as NumPy and PyTorch. This article highlights a few of these caveats.

Model Loading

When …

mapPartitions vs mapInPandas

Prior to Spark 3.0, to optimize for performance and use vectorized operations, you'd generally have to repartition the dataset and invoke mapPartitions.

The major drawback was the performance cost of repartitioning the DataFrame, which triggers a shuffle.

With Spark 3.0+, if your underlying function is …

Multiple Condition Queues For Better Concurrency

I had been revisiting concurrent libraries that I had worked on earlier and wanted to highlight the importance of using separate wait sets and condition queues in your library implementations. The performance of these has been benchmarked using JMH.
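To make the idea concrete, here is a minimal Python sketch of a bounded buffer with two condition queues that share one lock. It is an illustration only, not the benchmarked library, and `BoundedBuffer` and `demo` are hypothetical names. Producers wait on `not_full` and consumers wait on `not_empty`, so each `notify()` wakes a thread that can actually make progress.

```python
import threading
from collections import deque


class BoundedBuffer:
    """Bounded FIFO buffer with separate condition queues for each wait reason."""

    def __init__(self, capacity: int) -> None:
        self._items = deque()
        self._capacity = capacity
        lock = threading.Lock()
        # Two wait sets guarded by the same lock.
        self._not_full = threading.Condition(lock)
        self._not_empty = threading.Condition(lock)

    def put(self, item) -> None:
        with self._not_full:
            while len(self._items) >= self._capacity:
                self._not_full.wait()
            self._items.append(item)
            # Wake one consumer only; other producers stay asleep.
            self._not_empty.notify()

    def take(self):
        with self._not_empty:
            while not self._items:
                self._not_empty.wait()
            item = self._items.popleft()
            # Wake one producer only; other consumers stay asleep.
            self._not_full.notify()
            return item


def demo() -> list:
    buf = BoundedBuffer(capacity=2)
    out = []
    consumer = threading.Thread(
        target=lambda: out.extend(buf.take() for _ in range(5))
    )
    consumer.start()
    for i in range(5):
        buf.put(i)
    consumer.join()
    return out
```

With a single condition queue, the same buffer would need `notify_all()` to stay correct, which wakes threads that immediately go back to waiting.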

Let me list the advantages of using separate …