#aiml — blogs.social

Oz Akan @oz.akan.io

6d

Turbovec and the trick of not training a quantizer

How TurboQuant uses a random rotation to precompute its quantizer, and why skipping the training step changes the operational story.

Oz Akan @oz.akan.io

6d

Hadamard and the Random Rotation

Why a matrix of plus and minus ones does the work of a dense random rotation, in O(d log d) instead of O(d^2).

Oz Akan @oz.akan.io

6d

RAG vs MCP: Complementary AI Approaches

Understanding the differences between RAG and MCP, when to use each, and how they work together

Oz Akan @oz.akan.io

6d

Which Loss Function Do LLMs Use?

Exploring Cross-Entropy Loss in Large Language Models.

Oz Akan @oz.akan.io

6d

What is Matryoshka Representation Learning (MRL)?

Nesting Power and Flexibility into ML Embeddings

Oz Akan @oz.akan.io

6d

UE8M0 FP8 Number Format

Training LLMs without H100s using the UE8M0 FP8 number format.

Oz Akan @oz.akan.io

6d

Understanding ML Numerical Formats

Understanding INT4, INT8, FP16, BF16, and TF32 formats in machine learning: their precision, speed, and memory trade-offs for training and inference.

Oz Akan @oz.akan.io

6d

What do GPT-OSS and Gemma 3 really offer?

GPT-OSS and Gemma 3: two new small-but-powerful language models pushing the boundaries.

Oz Akan @oz.akan.io

6d

What are Positional Embeddings?

The mathematical technique that teaches AI models where each word sits in a sequence.

Oz Akan @oz.akan.io

6d

Words, Tokens and Embeddings

How language models convert token IDs into meaningful vector representations that capture semantic relationships.

Oz Akan @oz.akan.io

6d

Subword Tokenization Algorithms

Understanding the algorithms behind tokenization in Large Language Models.

Oz Akan @oz.akan.io

6d

What is LLM Inference?

Understanding how Large Language Models generate text through the inference process.

Oz Akan @oz.akan.io

6d

The Agentic AI Hype

The Overhyped Buzzword That’s Just AI With To-Do Lists.

Oz Akan @oz.akan.io

6d

Embedding Selection for RAG Systems

At the heart of every effective RAG implementation lies a crucial decision: which embedding model to use.

Oz Akan @oz.akan.io

6d

Diffusion Based Language Models

Instead of writing sequentially, DLMs start with something like noisy or scrambled text and gradually denoise it over several steps.

Oz Akan @oz.akan.io

6d

Understanding the K-Nearest Neighbors (k-NN) Algorithm

A simple yet effective machine learning algorithm for classification and regression.

Oz Akan @oz.akan.io

6d

Semantic vs Lexical Similarity

Semantic similarity and lexical similarity are two distinct ways of comparing text, with the key difference being meaning versus surface-level features.

Oz Akan @oz.akan.io

6d

XGBoost: The Powerhouse of Gradient Boosting

XGBoost is one of the most powerful tools for building machine learning models due to its speed, accuracy, and robustness.

Oz Akan @oz.akan.io

6d

What are Word Embeddings?

Word embeddings are a fundamental concept in Natural Language Processing (NLP), enabling machines to understand and process human language effectively.

Oz Akan @oz.akan.io

6d

SageMaker Built-in Algorithms

Amazon SageMaker offers a wide range of built-in algorithms to simplify and accelerate machine learning (ML) projects.

Oz Akan @oz.akan.io

6d

Factorization Machines (FMs) are a type of machine learning model that helps us make predictions based on data.

Oz Akan @oz.akan.io

6d

SageMaker Linear Learner Algorithm

Amazon SageMaker Linear Learner is a machine learning algorithm that helps solve two main types of problems.

Oz Akan @oz.akan.io

6d

What are Features in Machine Learning?

Choosing the right features is crucial for building an accurate and efficient model.

Oz Akan @oz.akan.io

6d

The ML Development Lifecycle and Best Practices

A comprehensive guide to the ML Development Lifecycle, with best practices.

Oz Akan @oz.akan.io

6d

The ML Development Lifecycle

A brief guide to the ML Development Lifecycle.

Oz Akan @oz.akan.io

6d

TF-IDF Simplified

The goal of TF-IDF is to emphasize words that are important in a particular document while filtering out common words that appear frequently across many documents but offer little unique information.

Oz Akan @oz.akan.io

6d

AWS AI Practitioner Certification Notes

Guide to the AWS Certified AI Practitioner exam, covering key concepts, AWS services, and real-world applications.