MOHIT KUMAR

Technical Consultant and Trainer

"I am fascinated by computer science, mathematics, the universe, and so on. I like to understand things and the way they work under the hood. Occasionally, I like to explain the things I understand using a first-principles approach. More formally, I am a researcher, trainer, and consultant on the design of Artificial Intelligence (deep learning) based systems. Microarchitecture-based optimization is my specialization, more specifically from Intel P5 to Intel Skylake. On the vector side, I work with SIMD and NVIDIA GPUs. My microarchitecture knowledge enables me to see the complete stack. I have close to 20 years of total experience; for the last 5 years I have been optimizing TensorFlow and TensorFlow models on GPUs and CPUs. A general example of micro-optimization on the Haswell microarchitecture."

Artificial Intelligence Research (DEEP-Breathe)

I feel that the barrier to entry for Deep Learning is very steep. Consider Natural Language Processing as an example. Neural Machine Translation uses concepts like LSTM, Bidirectional LSTM, Multi-Layered LSTMs, Attention, etc. None of them is easy to understand by itself; imagine the plight of a student when these concepts are strung together for a Neural Machine Translation system or one based on Google's BERT. I have seen Neural Machine Translation based systems grossly underperform simply because most of the hyperparameters were not understood at all.
DEEP-Breathe is a complete, pure Python implementation of most of these complex models, especially but not limited to the Neural Machine Translator.

Concurrency

My main interests are techniques for designing, implementing, and reasoning about multiprocessor algorithms, in particular concurrent data structures for multicore machines and the mathematical foundations of the computation models that govern their behavior.

My research these days is directed at the use of randomness and combinatorial techniques in concurrent algorithm and data-structure design.

JVM Tunings and Optimizations

Design, tuning, and optimization of JVM-based applications. Tuning is rarely a software-only job: it requires knowing the entire stack, right down to the hardware, and the tools that can pinpoint the real issue. Besides the usual profilers, the tools I specialize in are perf, SystemTap, DTrace, Solaris Studio Analyzer, JMH, jcstress, etc.

The majority of Java CPU profilers have little idea of what is happening beyond the JVM, so using the right profiler with minimal overhead is key. PMCs (Performance Monitoring Counters) are special CPU registers that count hardware events such as cycles, cache misses, and branch mispredictions; a profiler can sample the call stack each time a counter overflows. perf and SystemTap make this extremely easy, and the collected samples can then be turned into a flame graph (figure 2) to inspect the calls that are bottlenecks.
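
As a rough illustration of the step between sampling and drawing a flame graph, the sketch below collapses sampled call stacks into the "folded" text format (`frame;frame;frame count`) that flame graph generators commonly consume. The sample data and the function names `fold_stacks` and `self_time` are hypothetical; in practice the stacks would come from a profiler such as perf rather than a hard-coded list.

```python
from collections import Counter

def fold_stacks(samples):
    """Collapse sampled call stacks into folded lines.

    Each sample is a list of frame names ordered root -> leaf.
    Identical stacks are merged and their occurrences counted.
    """
    counts = Counter(";".join(stack) for stack in samples if stack)
    # Sort so the output is deterministic.
    return [f"{stack} {n}" for stack, n in sorted(counts.items())]

def self_time(samples):
    """Count how often each function was the leaf (on-CPU) frame."""
    return Counter(stack[-1] for stack in samples if stack)

# Hypothetical samples, as if taken at a fixed sampling interval.
SAMPLES = [
    ["main", "parse", "read_line"],
    ["main", "parse", "read_line"],
    ["main", "compute", "multiply"],
    ["main", "compute", "multiply"],
    ["main", "compute", "multiply"],
    ["main", "compute"],
]

if __name__ == "__main__":
    for line in fold_stacks(SAMPLES):
        print(line)
    # The function with the most leaf samples is the likeliest hotspot.
    print(self_time(SAMPLES).most_common(1))
```

The width of each box in a flame graph is proportional to these counts, which is why a wide box near the top of the graph points at a function that was frequently on-CPU.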

Ultra Low Latency design and Architecture

With an intimate knowledge of the hardware, especially the CPU (mostly the x86 family) and the GPU, I specialize in the design and architecture of ultra-low-latency software.

Stillwaters run deep

Why Stillwaters?

What makes us different is real-world, practical experience.
Our knowledge of microarchitecture lets us look at the complete stack, from hardware to software.

