Dense vs Sparse Vectors Explained With Examples
A research-backed guide to dense and sparse vectors in machine learning. Learn how each representation works, when to use BM25 versus embeddings, how SPLADE bridges both worlds, and why hybrid search combining dense and sparse retrieval consistently outperforms either method alone.
Run a semantic search for "ERR_CONN_RESET_4XX retry semantics" using only dense embeddings. The retriever returns semantically adjacent documents about networking delays and connection errors. The actual answer, buried in section 3.2, gets ranked eleventh. Now run the same query through a BM25 index. The string "ERR_CONN_RESET_4XX" has an extremely high IDF score because it appears in almost no document other than section 3.2. BM25 returns that section as result number one.
That failure mode reveals the structural weakness of dense vectors. They were trained to generalize across language, and exact string matching is precisely what they sacrifice. Sparse vectors were built for exact-term retrieval, so semantic generalization is the capability they never had.
Understanding both representations, when each one works, and how to combine them is now table stakes for anyone building production retrieval systems. This article covers the mechanics of each type, the algorithms that produce them, and the hybrid search architecture that brings them together.
This is the third article in the Vector Database Fundamentals series. It builds directly on what a vector is and how embeddings are generated, and connects forward to semantic search mechanics and the vector database infrastructure that stores and indexes both types.
What Makes a Vector Dense or Sparse?
The classification comes down to one property: what fraction of the dimensions have non-zero values.
A dense vector has a value in nearly every dimension. If the vector has 768 dimensions, almost all 768 positions contain a floating-point number that is not zero. Every dimension contributes to the representation.
A sparse vector has values in only a small number of dimensions. If the vector has 50,000 dimensions corresponding to a vocabulary of 50,000 words, a typical document might activate only 200 to 500 of those dimensions. The remaining 49,500-plus values are zero.
import numpy as np
# Dense vector (768-dim embedding — most values non-zero)
dense = np.array([0.412, -0.231, 0.887, 0.051, -0.330, 0.712, ...])
# nearly all 768 dimensions contain a meaningful float
# Sparse vector (10,000-dim vocabulary — mostly zeros)
sparse = np.zeros(10000)
sparse[243] = 0.82 # "machine" appears 3 times
sparse[1047] = 1.41 # "learning" appears 5 times
sparse[8833] = 0.54 # "vector" appears 2 times
# the remaining 9,997 positions stay zero
nonzero_count = np.count_nonzero(sparse)
print(f"Non-zero dimensions: {nonzero_count} out of {len(sparse)}")
# Non-zero dimensions: 3 out of 10000

According to Weaviate's hybrid search documentation, sparse vectors have mostly zero values with only a few non-zero values, while dense vectors mostly contain non-zero values. Dense embeddings are generated from machine learning models, and sparse embeddings are generated from algorithms like BM25 and SPLADE.
Dense Vectors: Semantic Meaning as Geometry
Dense vectors are the output of trained embedding models. When you call OpenAI's embedding API or run a Sentence Transformers model, you get a dense vector for each input.
The core property of a dense vector is that its geometry encodes semantic meaning. Content with similar meanings produces vectors that point in similar directions in the vector space. The individual dimensions do not have human-readable interpretations. Meaning is distributed across all of them simultaneously.
from sentence_transformers import SentenceTransformer
import numpy as np
model = SentenceTransformer("all-MiniLM-L6-v2")
sentences = [
"How do I return a product?",
"What is your refund policy?",
"I want to cancel my order.",
"What is the distance from Earth to Mars?",
]
embeddings = model.encode(sentences)
def cosine_sim(a, b):
return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
# Compare semantically related sentences
print(cosine_sim(embeddings[0], embeddings[1])) # ~0.85 — very similar
print(cosine_sim(embeddings[0], embeddings[2])) # ~0.72 — related
print(cosine_sim(embeddings[0], embeddings[3])) # ~0.12 — unrelated

"Return a product" and "refund policy" share no keywords, yet their dense vectors are close because both belong to the semantic neighborhood of customer returns and commerce support. The embedding model learned this relationship from patterns in training data.
As Milvus' quick reference on dense and sparse embeddings explains, dense embeddings excel at capturing nuanced relationships and contextual meaning, making them ideal for tasks like semantic search or recommendation systems.
Characteristics of Dense Vectors
Dense vectors have fixed dimensionality. All documents indexed with a given model produce vectors of identical length. You cannot compare a 768-dimensional BERT embedding to a 1536-dimensional OpenAI embedding. They live in incompatible spaces.
Dense vectors are not interpretable. You cannot look at dimension 412 and say "this dimension represents sports content." Information is distributed and entangled across all dimensions by design.
Dense vectors are computationally expensive to index for exact nearest-neighbor search. At millions of documents, brute-force comparison is not viable, which is why vector databases use approximate nearest neighbor algorithms like HNSW. This is covered in the vector database fundamentals article.
Sparse Vectors: Term Presence as Weight
Sparse vectors encode which terms are present in a document and how important they are. Each dimension maps to one term in a fixed vocabulary. The value at that position reflects the weight of that term in the document.
The classic method for producing sparse vectors is TF-IDF, and its more refined descendant BM25.
TF-IDF
TF-IDF (Term Frequency Inverse Document Frequency) assigns a weight to each term in a document by multiplying two factors: how often the term appears in that document (term frequency) and how rare the term is across all documents (inverse document frequency).
from sklearn.feature_extraction.text import TfidfVectorizer
import numpy as np
corpus = [
"machine learning is a subset of artificial intelligence",
"deep learning uses neural networks",
"artificial intelligence includes machine learning and deep learning",
"neural networks are inspired by the human brain",
]
vectorizer = TfidfVectorizer()
tfidf_matrix = vectorizer.fit_transform(corpus)
# Each document is now a sparse vector
doc0_vector = tfidf_matrix[0].toarray()[0]
vocab = vectorizer.get_feature_names_out()
# Print only non-zero terms for document 0
nonzero_indices = np.nonzero(doc0_vector)[0]
for idx in nonzero_indices:
print(f" '{vocab[idx]}': {doc0_vector[idx]:.4f}")
# Output (only words present in doc 0):
# 'artificial': 0.3853
# 'intelligence': 0.3853
# 'is': 0.5087
# 'learning': 0.2810
# 'machine': 0.3853
# 'of': 0.5087
# 'subset': 0.5087

The word "machine" gets a moderate score because it appears in document 0 but also in document 2. The word "subset" gets a high score because it appears only in document 0, making it more distinctive.
BM25: The Industry Standard for Keyword Search
BM25 (Best Matching 25) improves on TF-IDF with two additions. First, it applies term frequency saturation: a word appearing 10 times in a document contributes more weight than one appearing once, but the weight does not grow linearly. After a point, additional occurrences contribute diminishing returns. Second, it applies document length normalization: a term appearing twice in a short document is more meaningful than the same term appearing twice in a very long document.
According to Weaviate's documentation, BM25 builds on TF-IDF by taking the Binary Independence Model from the IDF calculation and adding a normalization penalty that weighs a document's length relative to the average length of all documents in the database.
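Both effects are visible in the term-weight formula itself. The sketch below uses the common defaults k1 = 1.5 and b = 0.75; real implementations differ in IDF smoothing details, so treat it as an illustration rather than a reference implementation.

```python
def bm25_term_weight(tf, doc_len, avg_doc_len, idf, k1=1.5, b=0.75):
    # Document-length normalization: longer-than-average docs are penalized
    norm = k1 * (1 - b + b * doc_len / avg_doc_len)
    # Saturating term frequency: weight approaches idf * (k1 + 1) asymptotically
    return idf * (tf * (k1 + 1)) / (tf + norm)

# Each additional occurrence contributes less than the previous one
for tf in (1, 2, 5, 10, 50):
    w = bm25_term_weight(tf, doc_len=100, avg_doc_len=100, idf=2.0)
    print(f"tf={tf:2d} -> weight {w:.3f}")
# tf= 1 -> weight 2.000
# tf= 2 -> weight 2.857
# tf= 5 -> weight 3.846
# tf=10 -> weight 4.348
# tf=50 -> weight 4.854
```

With doc_len equal to avg_doc_len, the weight climbs from 2.0 at tf = 1 toward an asymptote of idf * (k1 + 1) = 5.0 but never reaches it. That is the saturation curve in code.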
from rank_bm25 import BM25Okapi
corpus = [
"how do I return a product",
"what is your refund policy",
"I want to cancel my subscription",
"how to contact customer support",
"machine learning model training tutorial",
]
# Tokenize
tokenized_corpus = [doc.split() for doc in corpus]
bm25 = BM25Okapi(tokenized_corpus)
# Query
query = "product return"
tokenized_query = query.split()
scores = bm25.get_scores(tokenized_query)
for doc, score in zip(corpus, scores):
print(f"Score {score:.4f}: {doc}")
# Output (scores depend on corpus statistics):
# Score 2.0926: how do I return a product ← highest
# Score 0.0000: what is your refund policy
# Score 0.0000: I want to cancel my subscription
# Score 0.0000: how to contact customer support
# Score 0.0000: machine learning model training tutorial

BM25 correctly ranks "return a product" highest for the query "product return." However, "refund policy" scores zero because it shares no keywords with the query, even though it is semantically related. This is the fundamental limitation of sparse retrieval.
BM25 is the default ranking algorithm in Elasticsearch, Solr, Lucene, and OpenSearch. It has been the backbone of keyword search for decades.
SPLADE: A Neural Sparse Model
SPLADE (Sparse Lexical and Expansion model) is the most important development in sparse retrieval in recent years. It uses a transformer model to produce sparse vectors that have the same structure as BM25 output (one dimension per vocabulary term, most values zero) but with two critical differences: the weights are learned rather than computed by formula, and the model performs query and document expansion.
BM25 query "car":
Activates dimensions: {car: 2.1}
SPLADE query "car":
Activates dimensions: {car: 1.8, vehicle: 1.4, automobile: 1.1, automotive: 0.7, driver: 0.4}

SPLADE learns that "car," "vehicle," "automobile," and "automotive" appear in similar contexts and therefore expands the sparse vector to include all of them. According to Chroma's sparse vector documentation, SPLADE combines the precision of keyword search with the contextual awareness of neural models.
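Under the hood, SPLADE derives these weights from a masked-language-model head over the vocabulary. Here is a minimal numpy sketch of just the pooling step, with random logits standing in for a real transformer; a trained model produces far fewer active dimensions because of its sparsity regularization, so only the shape of the computation is meaningful here.

```python
import numpy as np

# Random logits stand in for a masked-LM head over a BERT-sized vocabulary
rng = np.random.default_rng(42)
vocab_size, num_tokens = 30_522, 8
logits = rng.normal(loc=-6.0, scale=2.0, size=(num_tokens, vocab_size))

# SPLADE pooling: log(1 + ReLU(logit)), max-pooled over the token axis.
# ReLU zeroes most of the vocabulary; the log dampens dominant terms.
weights = np.log1p(np.maximum(logits, 0.0)).max(axis=0)

print(f"Non-zero dims: {np.count_nonzero(weights)} of {vocab_size}")
```

The result is a vocabulary-sized vector that is mostly zero, exactly the structure BM25 produces, but with learned rather than formula-derived weights.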
According to Elasticsearch's sparse embedding documentation, sparse vectors rely on term-based representations, making them more effective for zero-shot retrieval — where the model handles queries it has not explicitly been trained on. Unlike dense vector models that often need domain-specific training, sparse vectors generalize better to new domains out of the box.
Elastic's own implementation of this idea is ELSER (Elastic Learned Sparse EncodeR), which uses the same SPLADE principles tuned for English retrieval.
The practical tradeoff: SPLADE produces better retrieval quality than BM25 on most benchmarks, but requires running a transformer at inference time, adding roughly 100 to 300ms of latency depending on hardware. For exact-match heavy use cases involving product codes, error identifiers, or document IDs, BM25 often wins because those tokens get maximal IDF scores with zero inference overhead.
The Structural Failure Modes
Understanding where each representation breaks down is as important as understanding where it succeeds.
Where Dense Vectors Fail
Dense vectors fail on exact identifier matching. When a user queries "PROD-SKU-7842X", the embedding model maps this to a neighborhood of similar-looking product codes. It might return "PROD-SKU-7842Y" with high confidence. That is a wrong answer delivered with high confidence.
Dense vectors also fail on low-frequency proper nouns. If a new product name or person's name appears rarely or never in the training corpus, the embedding model has no good representation for it. The embedding lands in an arbitrary neighborhood that may bear no relationship to the actual meaning.
Dense vectors fail on technical jargon that differs from natural language usage. An error code like "ERR_CONN_RESET_4XX" is not in any training corpus. The dense model cannot distinguish it from similar-looking strings.
Where Sparse Vectors Fail
Sparse vectors fail on vocabulary mismatch. "How do I get a refund?" and "What is your return policy?" share no keywords, so BM25 returns zero similarity between them. A user asking in different words than the document uses will get no results.
Sparse vectors fail on synonyms by default (though SPLADE partially fixes this). "Car," "vehicle," and "automobile" are completely different tokens to a BM25 index, which captures none of the relationships between them.
Sparse vectors fail on semantic queries that rely on contextual interpretation. "What should I eat when I am feeling anxious?" requires understanding that "anxious" relates to mental states, that certain foods affect mood, and that the user is asking for a recommendation rather than a factual definition. BM25 has none of this.
The failures are structural and complementary. Dense misses exact strings. Sparse misses semantic relationships. This is the premise of hybrid search.
Hybrid Search: Combining Both Representations
Hybrid search runs a sparse retriever and a dense vector retriever in parallel, then merges their ranked results using a fusion algorithm before passing the top chunks to an LLM or presenting results to a user.
User query: "how to handle HTTP timeout errors in Python"
Sparse (BM25) pipeline:
Query → tokenize → BM25 score → ranked list A
Strong on: "HTTP", "timeout", "Python" — exact matches
Dense (embedding) pipeline:
Query → embedding model → ANN search → ranked list B
Strong on: "connection error handling", "retry logic",
"request exceptions" — semantic matches
Fusion (RRF):
Combine ranked list A + ranked list B
→ Final merged ranked list
Top K chunks → LLM context → grounded response

According to research cited in Supermemory's hybrid search guide, dense-only retrieval hits 78 percent recall at 10, sparse-only BM25 lands at 65 percent, and hybrid search reaches 91 percent recall at 10. That gap between 78 percent and 91 percent is the difference between a production-ready RAG system and one that hallucinates on edge cases.
Reciprocal Rank Fusion
The standard fusion algorithm is Reciprocal Rank Fusion (RRF). Rather than trying to normalize and combine raw scores from two different scoring systems (which creates subtle bugs when score distributions differ), RRF uses only rank positions. Each document gets a score of 1 / (k + rank) from each retriever, where rank is the document's 1-based position and k defaults to 60. Those rank scores are summed across retrievers, and the merged list is sorted by the combined score.
def reciprocal_rank_fusion(sparse_results, dense_results, k=60):
"""
Merge two ranked result lists using RRF.
Args:
sparse_results: list of doc IDs ordered by sparse score (best first)
dense_results: list of doc IDs ordered by dense score (best first)
k: constant to prevent high weighting of top-1 results
Returns:
merged: list of (doc_id, rrf_score) sorted by combined score
"""
scores = {}
for rank, doc_id in enumerate(sparse_results):
scores[doc_id] = scores.get(doc_id, 0) + 1.0 / (k + rank + 1)
for rank, doc_id in enumerate(dense_results):
scores[doc_id] = scores.get(doc_id, 0) + 1.0 / (k + rank + 1)
merged = sorted(scores.items(), key=lambda x: x[1], reverse=True)
return merged
sparse_results = ["doc_A", "doc_C", "doc_E", "doc_B"]
dense_results = ["doc_B", "doc_A", "doc_D", "doc_C"]
merged = reciprocal_rank_fusion(sparse_results, dense_results)
print("Merged ranking:")
for doc_id, score in merged:
print(f" {doc_id}: {score:.5f}")
# Merged ranking:
# doc_A: 0.03252 ← ranked 1st in sparse, 2nd in dense
# doc_B: 0.03202 ← ranked 4th in sparse, 1st in dense
# doc_C: 0.03175 ← ranked 2nd in sparse, 4th in dense
# doc_E: 0.01587 ← ranked 3rd in sparse only
# doc_D: 0.01587 ← ranked 3rd in dense only

RRF is immune to the score normalization bugs that plague linear interpolation fusion. If one BM25 document has an outlier score because a query term appears 200 times in it, that does not collapse all other BM25 scores toward zero. RRF only cares about the rank, not the magnitude of the score.
According to Prem AI's hybrid search guide, RRF at k=60 is the zero-config default. If you have 50 or more labeled query-document pairs, you can tune a weighted combination instead. Adding a cross-encoder reranker after fusion delivers the single biggest precision improvement.
A Full Hybrid Retriever in Python
This example combines BM25 and dense embeddings with RRF using Qdrant as the vector database:
from qdrant_client import QdrantClient, models
from rank_bm25 import BM25Okapi
from sentence_transformers import SentenceTransformer
import numpy as np
# Documents to index
documents = [
"How do I return a product to the online store?",
"What is your refund policy for digital purchases?",
"I need to cancel my monthly subscription.",
"ERR_CONN_RESET_4XX occurs when the server closes the connection.",
"How to handle HTTP connection timeout errors in Python.",
"Network socket errors: causes and remedies.",
]
# Embedding model
embed_model = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = embed_model.encode(documents)
# Qdrant client (in-memory for demo)
client = QdrantClient(":memory:")
client.create_collection(
collection_name="docs",
vectors_config=models.VectorParams(size=384, distance=models.Distance.COSINE),
)
# Upsert all documents
client.upsert(
collection_name="docs",
points=[
models.PointStruct(id=i, vector=emb.tolist(), payload={"text": doc})
for i, (doc, emb) in enumerate(zip(documents, embeddings))
],
)
# BM25 index (separate sparse index)
tokenized_corpus = [doc.lower().split() for doc in documents]
bm25 = BM25Okapi(tokenized_corpus)
def hybrid_search(query: str, top_k: int = 3) -> list:
# Sparse: BM25 ranking
sparse_scores = bm25.get_scores(query.lower().split())
sparse_ranked = np.argsort(sparse_scores)[::-1].tolist()
# Dense: semantic search
query_emb = embed_model.encode([query])[0]
dense_results = client.search(
collection_name="docs",
query_vector=query_emb.tolist(),
limit=len(documents),
)
dense_ranked = [hit.id for hit in dense_results]
# Fusion via RRF
k = 60
scores = {}
for rank, doc_id in enumerate(sparse_ranked):
scores[doc_id] = scores.get(doc_id, 0) + 1.0 / (k + rank + 1)
for rank, doc_id in enumerate(dense_ranked):
scores[doc_id] = scores.get(doc_id, 0) + 1.0 / (k + rank + 1)
top_ids = sorted(scores, key=scores.__getitem__, reverse=True)[:top_k]
return [documents[i] for i in top_ids]
# Test 1: Exact identifier query — sparse should save this
results = hybrid_search("ERR_CONN_RESET_4XX")
print("Query: ERR_CONN_RESET_4XX")
for r in results:
print(f" {r}")
# Test 2: Semantic query — dense should find this despite vocabulary mismatch
results = hybrid_search("how to get my money back from a purchase")
print("\nQuery: how to get my money back from a purchase")
for r in results:
    print(f" {r}")

The first query succeeds because BM25 gives "ERR_CONN_RESET_4XX" an extremely high IDF score. The second query succeeds because dense embeddings connect "money back" to "refund" and "return" even with no keyword overlap. Neither retriever alone would handle both queries well.
Comparing Dense and Sparse Vectors Side by Side
Property | Dense Vectors | Sparse Vectors
------------------+------------------------+--------------------------
Dimensionality | Low (128 to 3072) | High (vocab size, 10K-50K)
Non-zero values | Almost all | Very few
Produced by | Neural embedding model | BM25, TF-IDF, SPLADE
Captures | Semantic meaning | Lexical term presence
Strengths | Synonym matching, | Exact string matching,
| paraphrase retrieval, | rare identifiers,
| contextual meaning | new domain terms
Weaknesses | Exact identifiers, | Vocabulary mismatch,
| rare proper nouns | synonyms, paraphrases
Interpretability | Low | High
Index type | HNSW, IVF | Inverted index
Memory per vector | Higher per vector | Lower (store only non-zeros)
Recall@10 alone | ~78% | ~65%
Recall@10 hybrid | 91% (combined) | 91% (combined)

When to Use Dense, Sparse, or Hybrid
The right choice depends on your query distribution.
Use dense-only retrieval when your users write natural language queries, when vocabulary mismatch is common (your documents and users describe things differently), and when you do not have exact identifier lookups. Recommendation systems, general document Q&A, and conversational search usually fall here.
Use sparse-only retrieval when your users search for exact strings: product SKUs, error codes, legal case numbers, person names, and technical identifiers. When the query and the document are expected to use exactly the same terminology, BM25 is faster, simpler to operate, and often more precise.
Use hybrid search for most production RAG applications. Your users will send both types of queries, often in the same session. According to research cited by Infinity on hybrid retrieval, an IBM research paper compared BM25, dense vectors, BM25 plus dense, dense plus sparse, and BM25 plus dense plus sparse. The study concluded that using three-way retrieval is the optimal option for RAG.
Hybrid Search Support in Vector Databases
Every major vector database now supports hybrid search, though the implementations differ.
Weaviate exposes a single hybrid() query method accepting an alpha parameter from 0 (pure sparse) to 1 (pure dense). Internally it runs BM25 and vector search in parallel, fuses via RRF or relative score fusion, and returns a single ranked list.
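As a rough sketch of how such an alpha blend can work, the snippet below interpolates min-max normalized score lists. This is illustrative only, not Weaviate's exact internals; the function name and example scores are made up for the demo.

```python
import numpy as np

def alpha_fusion(sparse_scores, dense_scores, alpha=0.5):
    # alpha = 0 -> pure sparse ranking, alpha = 1 -> pure dense ranking
    def minmax(x):
        x = np.asarray(x, dtype=float)
        span = x.max() - x.min()
        return (x - x.min()) / span if span > 0 else np.zeros_like(x)
    return (1 - alpha) * minmax(sparse_scores) + alpha * minmax(dense_scores)

bm25_scores = [9.5, 0.2, 4.1]     # doc 0 wins on keywords
cosine_sims = [0.31, 0.88, 0.52]  # doc 1 wins semantically
print(alpha_fusion(bm25_scores, cosine_sims, alpha=0.2))  # favors doc 0
print(alpha_fusion(bm25_scores, cosine_sims, alpha=0.8))  # favors doc 1
```

Sliding alpha moves the winner from the keyword-strong document to the semantically strong one, which is exactly the knob the hybrid() query exposes.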
Qdrant supports DBSF (Distribution-Based Score Fusion) as an alternative to RRF. DBSF normalizes scores relative to their distributions before combining, which gives better results when one retriever has a much higher score variance than the other.
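A sketch of the DBSF idea follows, assuming mean plus or minus three standard deviations as each list's effective score range; this is a simplified illustration of distribution-based normalization, not Qdrant's exact implementation.

```python
import numpy as np

def dbsf(sparse_scores, dense_scores):
    # Map each score list into a comparable range derived from its own
    # distribution (mean +/- 3 std), then sum the normalized scores
    def norm(x):
        x = np.asarray(x, dtype=float)
        lo, hi = x.mean() - 3 * x.std(), x.mean() + 3 * x.std()
        return (x - lo) / (hi - lo) if hi > lo else np.zeros_like(x)
    return norm(sparse_scores) + norm(dense_scores)

# An outlier BM25 score (200.0) is absorbed into the distribution estimate
# instead of defining the entire normalization range
print(dbsf([200.0, 3.1, 2.7], [0.42, 0.39, 0.81]))
```

Because the range comes from the distribution rather than the observed min and max, a single extreme score shifts the normalization less violently than it would under plain min-max scaling.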
Elasticsearch added native RRF support in version 8.9. It supports both BM25 and ELSER (its SPLADE-inspired sparse neural retriever) alongside dense kNN search.
Milvus supports multi-vector search, allowing simultaneous retrieval across dense and sparse vector fields in a single query with configurable fusion weights.
OpenSearch supports neural sparse search through its neural sparse query type, combining sparse neural retrieval with dense semantic search via a hybrid query wrapper.
Storage and Memory Considerations
Sparse vectors are memory-efficient to store because only non-zero values need to be saved. Libraries like scipy.sparse use compressed sparse row (CSR) format to store only the index and value of each non-zero element. A 50,000-dimensional sparse vector with 500 non-zero values requires roughly 50 times less storage than a naive full-length float32 array, since each stored non-zero costs an index plus a value.
Dense vectors require storage proportional to their dimensionality regardless of content. A 1536-dimensional float32 vector takes 6,144 bytes per document. At one million documents, that is roughly 6 GB for the raw vectors alone, before any index overhead.
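A quick sketch with numpy and scipy.sparse makes both storage costs concrete (the sizes are illustrative):

```python
import numpy as np
from scipy.sparse import csr_matrix

# Sparse: 50,000-dim vector with 500 non-zeros, stored in CSR format
vec = np.zeros(50_000, dtype=np.float32)
rng = np.random.default_rng(0)
idx = rng.choice(50_000, size=500, replace=False)
vec[idx] = rng.random(500).astype(np.float32)

csr = csr_matrix(vec)
csr_bytes = csr.data.nbytes + csr.indices.nbytes + csr.indptr.nbytes
print(f"CSR: {csr_bytes} bytes vs naive dense array: {vec.nbytes} bytes")

# Dense: a 1536-dim float32 embedding costs the same for every document
print(f"1536-dim float32 embedding: {1536 * 4} bytes per document")
```

The naive array costs 200,000 bytes while CSR stores only the 500 values plus their indices, which is where the order-of-magnitude savings comes from.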
Hybrid architectures therefore maintain two separate index structures: a dense vector index (HNSW or IVF) and an inverted index (the standard data structure for sparse keyword search). As noted by the GoPenAI hybrid search article, the dual index adds approximately 1.4 times the storage footprint of dense-only retrieval, with about 6ms additional query latency. For most production applications, those are negligible costs relative to the recall improvement.
The Practical Recommendation
For teams building RAG pipelines, the evidence points clearly toward hybrid. BM25 is the default sparse retriever because it has zero inference overhead, works perfectly for exact string queries, and runs inside every major search infrastructure already. If your corpus has heavy vocabulary mismatch between how users ask questions and how documents are written, replacing BM25 with SPLADE produces measurable recall gains at the cost of added inference latency.
Use RRF as the default fusion algorithm. It is immune to score normalization edge cases and requires no tuning to produce reasonable results. If you have labeled evaluation data for your specific query distribution, train a weighted linear combination for marginal gains beyond RRF.
The semantic search article covers how the query pipeline works end to end once you have both indexes in place. The vector database comparison article digs into the tradeoffs between purpose-built vector databases and Elasticsearch for hybrid workloads.
Why Traditional Indexes Cannot Index Dense Vectors
Dense vectors cannot use the same inverted index that stores sparse vectors. An inverted index works by sorting and grouping exact values. For text, it maps every unique term to the list of documents containing it. For numbers, it supports range queries. Neither operation makes sense for a 1536-dimensional float vector.
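A minimal sketch of that exact-value grouping shows why it works for terms and not for floats; the documents here are made up for the demo.

```python
from collections import defaultdict

docs = {
    0: "machine learning tutorial",
    1: "deep learning with neural networks",
    2: "machine translation systems",
}

# Inverted index: each unique term maps to the set of documents containing it
index = defaultdict(set)
for doc_id, text in docs.items():
    for term in text.split():
        index[term].add(doc_id)

print(sorted(index["learning"]))  # [0, 1]
print(sorted(index["machine"]))   # [0, 2]
```

The lookup works because a term either is or is not in a document. A 1536-dimensional float vector has no such exact value to group on, which is why dense retrieval needs geometric index structures instead.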
Dense vectors require ANN (Approximate Nearest Neighbor) index structures specifically designed for high-dimensional geometry. HNSW organizes vectors into a layered graph where each node connects to its nearest neighbors. IVF clusters vectors into groups and searches only the most relevant clusters at query time.
The why traditional indexes fail for vector search article covers this in full, including the curse of dimensionality and why the standard B-tree cannot be adapted to work on high-dimensional float vectors.
Summary
Dense vectors and sparse vectors are complementary representations that fail in opposite directions. Dense vectors encode semantic meaning and handle vocabulary mismatch. Sparse vectors encode lexical term presence and handle exact identifier matching. Neither is universally better.
Hybrid search combines both by running BM25 or SPLADE sparse retrieval and dense ANN search in parallel, then fusing results with Reciprocal Rank Fusion. Research consistently shows hybrid retrieval outperforms either method alone by a meaningful margin.
The full data flow: text input gets embedded into a dense vector via an embedding model, gets tokenized and scored into a sparse representation via BM25, both representations get stored in a vector database that maintains parallel indexes, and at query time both indexes are searched and fused before the top chunks reach the LLM.
Sources and Further Reading
- Weaviate. Hybrid Search Explained. weaviate.io/blog/hybrid-search-explained
- Elastic. Sparse Embeddings: Dense vs. Sparse Vector and Usage With ML Models. elastic.co/search-labs/blog/sparse-vector-embedding
- Milvus. What Are Dense and Sparse Embeddings? milvus.io/ai-quick-reference/what-are-dense-and-sparse-embeddings
- Chroma. Sparse Vector Support. trychroma.com/project/sparse-vector-search
- OpenSearch. Neural Sparse Search Documentation. docs.opensearch.org/latest/vector-search/ai-search/neural-sparse-search
- Supermemory. Hybrid Search Guide: Vectors and Full Text (April 2026). blog.supermemory.ai/hybrid-search-guide
- Prem AI. Hybrid Search for RAG: BM25, SPLADE, and Vector Search Combined. blog.premai.io/hybrid-search-for-rag-bm25-splade-and-vector-search-combined
- GoPenAI. Hybrid Search in RAG: Dense plus Sparse, RRF, and When to Use Which. blog.gopenai.com/hybrid-search-in-rag
- Infiniflow. Dense Vector plus Sparse Vector plus Full Text Search plus Tensor Reranker. infiniflow.org/blog/best-hybrid-search-solution
- Zilliz. Sparse and Dense Embeddings. zilliz.com/learn/sparse-and-dense-embeddings
- Thakur et al. BEIR: A Heterogeneous Benchmark for Zero-Shot Evaluation of IR Models. arxiv.org/abs/2104.08663
- DEV Community. Dense vs Sparse Retrieval: Mastering FAISS, BM25, and Hybrid Search. dev.to/qvfagundes/dense-vs-sparse-retrieval-mastering-faiss-bm25-and-hybrid-search-4kb1
Krunal Kanojiya
Technical Content Writer
Technical Content Writer and former software developer from India. I write in-depth articles on blockchain, AI/ML, data engineering, web development, and developer careers. Currently at Lucent Innovation, previously at Cromtek Solution and freelance.