
November 12, 2023

Vector database - Research Paper Read

Vector database

Key Summary

  • Support for high dimensionality and sparsity
  • Vectors describe rich data such as text, images, and video in domains such as recommender systems, similarity search, and chatbots.


  • Vector data also appears in geospatial applications
  • Two-dimensional points, such as the location of the end user and points of interest, may be represented as vectors
  • High-dimensional vectors can be used to represent more complex data such as text, image, audio and video features
  • VDBMSs typically support similarity search through indexing methods that enable rapid and accurate searching of similar vectors
  • A similarity search finds the vectors that most closely resemble a given query vector, based on a distance metric such as Euclidean distance or cosine similarity.
  • In natural language processing, words and phrases are mapped to vectors in such a way that similar words have similar vector representations.
  • Word2vec [7], FastText, and Doc2vec [8] are examples of techniques that create vector embeddings for words in natural language
  • From a developer's perspective, queries in VDBMSs are more closely related to simple document or key-value store queries than to complex queries in relational databases
  • Vectors are retrieved using one or several query vectors
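
To make the similarity-search notes above concrete, here is a minimal sketch of an exhaustive (brute-force) search using the two metrics mentioned, Euclidean distance and cosine similarity. The three-dimensional "embeddings" and the names `toy_vectors` and `top_k` are made up for illustration; real embeddings from Word2vec or similar models have hundreds of dimensions, and a real VDBMS would use an index rather than scanning every vector.

```python
import math

def euclidean_distance(a, b):
    """Straight-line distance between two vectors (smaller = more similar)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (larger = more similar)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query, vectors, k=2, metric="cosine"):
    """Score every stored vector against the query and return the k best keys."""
    if metric == "cosine":
        scored = [(cosine_similarity(query, v), key) for key, v in vectors.items()]
        scored.sort(reverse=True)  # highest similarity first
    else:
        scored = [(euclidean_distance(query, v), key) for key, v in vectors.items()]
        scored.sort()  # smallest distance first
    return [key for _, key in scored[:k]]

# Hypothetical toy embeddings: "cat" and "dog" point in a similar direction.
toy_vectors = {
    "cat": [0.9, 0.8, 0.1],
    "dog": [0.8, 0.9, 0.2],
    "car": [0.1, 0.2, 0.9],
}
query = [0.85, 0.85, 0.15]
nearest = top_k(query, toy_vectors, k=2)
print(nearest)
```

With these toy values, both metrics return "cat" and "dog" as the two nearest neighbours and leave "car" out, which is the "similar words have similar vectors" idea in miniature.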
Use-cases
  • Similarity search in general
  • Image and video similarity search
  • Voice recognition
  • Chatbots and long-term memory
Current challenges
  • Balancing between speed and accuracy
  • Growing dimensionality and sparsity
  • Information security
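
The first challenge above, balancing speed and accuracy, can be illustrated with a small experiment. The sketch below compares an exact scan with an approximate search that only looks at vectors falling in the same hash bucket as the query. The bucketing uses random hyperplanes (a simplified form of locality-sensitive hashing); that choice, the random data, and all names are my own assumptions for illustration and are not taken from the paper.

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def exact_top_k(query, data, k):
    """Scan every vector: always correct, but cost grows with len(data)."""
    order = sorted(range(len(data)), key=lambda i: squared_distance(query, data[i]))
    return order[:k]

def bucket_of(vector, hyperplanes):
    """Hash a vector to a tuple of booleans, one per random hyperplane."""
    return tuple(dot(vector, h) >= 0 for h in hyperplanes)

def approximate_top_k(query, data, buckets, hyperplanes, k):
    """Scan only the vectors that share the query's bucket."""
    candidates = buckets.get(bucket_of(query, hyperplanes), [])
    order = sorted(candidates, key=lambda i: squared_distance(query, data[i]))
    return order[:k], len(candidates)

rng = random.Random(0)  # fixed seed so the run is repeatable
dim, n, k = 16, 2000, 10
data = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n)]
hyperplanes = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(4)]

# Build the "index": group vector ids by bucket.
buckets = {}
for i, v in enumerate(data):
    buckets.setdefault(bucket_of(v, hyperplanes), []).append(i)

query = [rng.gauss(0, 1) for _ in range(dim)]
exact = exact_top_k(query, data, k)
approx, scanned = approximate_top_k(query, data, buckets, hyperplanes, k)

# recall@k: fraction of the true nearest neighbours the approximate search found.
recall = len(set(exact) & set(approx)) / k
print(f"scanned {scanned} of {n} vectors, recall@{k} = {recall:.2f}")
```

Using more hyperplanes makes the buckets smaller, so fewer vectors are scanned but more true neighbours are likely to be missed. That is the speed versus accuracy trade-off in its simplest form.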
Keep Exploring!!!
