"No one is harder on a talented person than the person themselves" - Linda Wilkinson ; "Trust your guts and don't follow the herd" ; "Validate direction not destination" ;

April 11, 2023

Milvus, an open-source, cloud-native vector database - Key Architecture Notes

  • Milvus takes a ground-breaking approach, introducing a publish-subscribe (pub/sub) system for log storage and persistence
  • Milvus offers two deployment modes - standalone or cluster (a minimal connection sketch follows this list)
  • The access layer acts as the system's front door, exposing the client-facing endpoint (stateless proxies) to the outside world
  • The coordinator service is responsible for cluster topology management, load balancing, timestamp generation, data declaration, and data management
  • Worker (execution) nodes execute instructions issued by the coordinator service and the data manipulation language (DML) commands initiated by the proxy
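
A minimal pymilvus connection sketch, assuming a default standalone deployment listening locally on port 19530; the alias, host, and port are placeholders, and a cluster is reached the same way through its proxy:

  from pymilvus import connections, utility

  # Connect to a local standalone Milvus (defaults for a standalone deployment).
  connections.connect(alias="default", host="localhost", port="19530")

  # Sanity check: list collections visible through the proxy (access layer).
  print(utility.list_collections())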

Storage is the cornerstone of Milvus, responsible for data persistence. The storage layer is divided into three parts:

  • Meta store: stores snapshots of metadata such as collection schema and node status (etcd in practice)
  • Log broker: a pub/sub system that supports playback and is responsible for streaming data persistence and reliable asynchronous query execution (Pulsar or Kafka in practice)
  • Object storage: stores snapshot files of logs, scalar/vector index files, and intermediate query processing results (MinIO, S3, or compatible storage in practice)

Because logs are handled by the log broker and decoupled from the server, Milvus itself stays stateless and is better positioned to recover quickly from system failures.
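
A hedged sketch of how this surfaces in the client API: inserts stream through the log broker into growing segments, and flush() seals them so they are persisted to object storage. The collection name, schema, and dimensions below are assumptions:

  from pymilvus import Collection
  import random

  # Assumes an existing collection named "demo" with an auto-id primary key
  # and a 128-dim float vector field; names and dimensions are placeholders.
  collection = Collection("demo")

  # Inserted rows first flow through the log broker into growing segments.
  vectors = [[random.random() for _ in range(128)] for _ in range(1000)]
  collection.insert([vectors])

  # flush() seals the growing segments; sealed segments are persisted as
  # binlog files in object storage, and their metadata lands in the meta store.
  collection.flush()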

Ref - Link

Vector Search 

  • Knowhere, Milvus' vector execution engine, not only extends the functions of Faiss but also optimizes performance
  • Built on top of Faiss, Annoy, and Hnswlib (HNSW); an index-creation sketch follows below
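
A minimal index-creation sketch with pymilvus; the collection name, field name, and HNSW parameters are placeholders, and Knowhere performs the actual build behind this call:

  from pymilvus import Collection

  collection = Collection("demo")  # assumed existing collection

  # Build an HNSW index on the vector field.
  collection.create_index(
      field_name="embedding",
      index_params={
          "index_type": "HNSW",
          "metric_type": "L2",
          "params": {"M": 16, "efConstruction": 200},
      },
  )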

Ref - Link

Consistency Discussions

  • GuaranteeTs (guarantee timestamp) is configurable in the search request to achieve the consistency level you specify. A larger GuaranteeTs ensures stronger consistency at the cost of higher search latency (a hedged search sketch follows below).
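
In pymilvus this is usually expressed through the consistency_level option, which sets GuaranteeTs under the hood; a hedged sketch with placeholder collection, field, and query values:

  from pymilvus import Collection

  collection = Collection("demo")   # placeholder collection
  collection.load()

  query_vectors = [[0.1] * 128]     # placeholder query vector

  # "Strong" pushes GuaranteeTs forward so query nodes wait until they have
  # seen all writes up to the request time (higher latency); "Bounded" or
  # "Eventually" relax that for lower latency.
  results = collection.search(
      data=query_vectors,
      anns_field="embedding",
      param={"metric_type": "L2", "params": {"ef": 64}},
      limit=5,
      consistency_level="Strong",
  )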

Ref - Link 

Query Mechanism

  • Before a query is executed, the data has to be loaded into the query nodes first.
  • Two types of data are loaded into query nodes: streaming data from the log broker and historical data from object storage (also called persistent storage); a minimal load sketch follows this list.
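
A minimal sketch of the explicit load step in pymilvus (collection name is a placeholder); load() pulls sealed segments from object storage into query-node memory, while streaming inserts continue to arrive through the log broker:

  from pymilvus import Collection, utility

  collection = Collection("demo")   # placeholder collection name

  # Load historical (sealed) segments into the query nodes before searching.
  collection.load()

  # Optional: track loading progress for large collections.
  print(utility.loading_progress("demo"))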

Ref - Link 

Milvus - Applications

  • Video media: video understanding, video deduplication.
  • E-commerce and mobile applications: image understanding, reverse image search.
  • Finance/Telecommunications/Retail: AI-aided customer support, QA chatbots.
  • Internet: personalized recommender systems, personalized search.
  • Autonomous vehicles: automated data labeling and annotation, object detection.
  • Biopharmaceutical: virtual compound screening, compound retrosynthetic analysis, protein property prediction, and DNA testing.
  • Cybersecurity: malware detection and cyberattack alert.
  • Quantitative trading: data analysis and prediction.
  • Metaverse: environmental perception and interaction in the virtual world.

Ref - Link

Hnswlib - fast approximate nearest neighbor search

Distance            Parameter   Equation
Squared L2          'l2'        d = sum((Ai - Bi)^2)
Inner product       'ip'        d = 1.0 - sum(Ai * Bi)
Cosine similarity   'cosine'    d = 1.0 - sum(Ai * Bi) / sqrt(sum(Ai * Ai) * sum(Bi * Bi))

  • hnswlib builds the HNSW graph up front, so approximate nearest neighbors can be retrieved quickly at query time (a short usage sketch follows).
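
A minimal hnswlib sketch illustrating the spaces from the table above; dimensions and parameters are arbitrary:

  import hnswlib
  import numpy as np

  dim, num_elements = 128, 10000
  data = np.random.random((num_elements, dim)).astype(np.float32)

  # space can be 'l2', 'ip', or 'cosine', matching the table above.
  index = hnswlib.Index(space="l2", dim=dim)
  index.init_index(max_elements=num_elements, ef_construction=200, M=16)
  index.add_items(data, np.arange(num_elements))

  index.set_ef(50)                       # query-time accuracy/speed trade-off
  labels, distances = index.knn_query(data[:5], k=3)
  print(labels.shape, distances.shape)   # (5, 3) neighbors per query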


Vector databases are having their moment. They can answer similarity queries quickly because most of the heavy lifting - building an approximate nearest neighbor index - is done ahead of time, at write/index time rather than at query time.


  • Storing and searching across table-based (structured) data is exactly what relational databases were designed to do.
  • Vector databases are used for searching across images, video, text, audio, and other forms of unstructured data by content rather than by keywords or tags (which are often entered manually by users or curators). Combined with powerful machine learning models, vector databases can revolutionize semantic search and recommendation systems.
  • Qdrant and Milvus are the fastest engines when it comes to indexing time.
  • Qdrant achieves the highest RPS and lowest latencies in almost all scenarios, regardless of the precision threshold and the metric chosen.
  • Elasticsearch is typically much slower than all the competitors, regardless of the dataset and metric.
Towhee is an open-source machine learning pipeline framework that helps you encode unstructured data into embeddings (a hedged pipeline sketch follows).
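
A hedged Towhee sketch of encoding an image into an embedding; the operator names below are resolved from the Towhee hub and may differ between versions, so treat them as assumptions:

  from towhee import pipe, ops

  # Operator names ('image_decode.cv2_rgb', 'image_embedding.timm') are hub
  # operators and are assumptions here; model_name is a placeholder.
  img_pipe = (
      pipe.input("url")
          .map("url", "img", ops.image_decode.cv2_rgb())
          .map("img", "vec", ops.image_embedding.timm(model_name="resnet50"))
          .output("vec")
  )

  # The resulting vector can then be inserted into Milvus for similarity search.
  result = img_pipe("path/or/url/to/image.jpg")
  print(result.get())   # embedding vector for the image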



Ref - Link



Keep Exploring!!!
