
Vector databases (Pinecone, ChromaDB, Weaviate) in Prompt Engineering / GenAI - Cheat Sheet & Quick Revision

Recall & Review
beginner
What is a vector database?
A vector database stores and searches data as vectors, which are lists of numbers representing things like text or images. It helps find similar items quickly by comparing these vectors.
beginner
Name three popular vector databases.
Pinecone, ChromaDB, and Weaviate are three popular vector databases used to store and search vector data efficiently.
intermediate
How does a vector database find similar items?
It compares vectors using math measures like cosine similarity or Euclidean distance to find items that are close or similar in meaning.
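Both measures can be computed by hand for small vectors. The following is a minimal sketch using only the standard library; the three toy "embeddings" are made-up values for illustration, not output from a real model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors (0.0 = identical)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Toy 3-dimensional vectors (hypothetical values, for illustration only).
sunset_photo = [0.9, 0.1, 0.3]
beach_photo = [0.8, 0.2, 0.4]
invoice_scan = [0.1, 0.9, 0.0]

# Higher cosine similarity and lower Euclidean distance both mean "more alike".
print(cosine_similarity(sunset_photo, beach_photo))
print(cosine_similarity(sunset_photo, invoice_scan))
print(euclidean_distance(sunset_photo, beach_photo))
print(euclidean_distance(sunset_photo, invoice_scan))
```

With these values the sunset and beach vectors score as closer to each other than either does to the invoice vector under both measures.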
beginner
What is a real-life example of using a vector database?
Imagine searching for a photo of a sunset. A vector database can find photos that look similar by comparing their vector representations, even if the exact words aren’t used.
intermediate
What makes Pinecone, ChromaDB, and Weaviate different?
Pinecone is a managed service focusing on scalability and speed. ChromaDB is open-source, lightweight, and easy to integrate. Weaviate, also open-source, offers rich features like built-in ML model integrations and semantic search.
What does a vector in a vector database represent?
A. A type of image file
B. A text string describing data
C. A database table row
D. A list of numbers representing data features
Which similarity measure is commonly used in vector databases?
A. Exact string match
B. Cosine similarity
C. Sorting alphabetically
D. Counting words
Which of these is the open-source vector database described above as easy to integrate?
A. ChromaDB
B. Pinecone
C. Weaviate
D. MySQL
What is a key benefit of using a vector database?
A. Fast similarity search on complex data
B. Storing only text files
C. Replacing spreadsheets
D. Running SQL queries faster
Which vector database offers built-in machine learning features?
A. Pinecone
B. ChromaDB
C. Weaviate
D. SQLite
Explain what a vector database is and why it is useful in AI applications.
Think about how computers find similar things using numbers.
Compare Pinecone, ChromaDB, and Weaviate in terms of features and typical use cases.
Focus on what makes each database special.

      Practice

      (1/5)
      1. What is the main purpose of a vector database like Pinecone, ChromaDB, or Weaviate?
      easy
      A. To store plain text documents only
      B. To perform traditional SQL queries on structured data
      C. To store and search data based on similarity using number lists
      D. To create visual graphs from data

      Solution

      1. Step 1: Understand what vector databases store

        Vector databases store data as vectors, which are lists of numbers representing complex data like images or text.
      2. Step 2: Identify the main use of vector databases

        They allow fast searching by similarity, not by exact matches like traditional databases.
      3. Final Answer:

        To store and search data based on similarity using number lists -> Option C
      4. Quick Check:

        Vector databases = similarity search [OK]
      Hint: Vector DBs = search by meaning, not exact text [OK]
      Common Mistakes:
      • Thinking vector DBs only store text
      • Confusing vector DBs with SQL databases
      • Assuming vector DBs create visual graphs
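To make "search by similarity" concrete, here is a minimal in-memory sketch that ranks stored vectors against a query vector. It is an exhaustive comparison for illustration only; production vector databases typically rely on approximate nearest-neighbor indexes rather than scanning every vector. The `top_k` helper, the `store` dictionary, and its values are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, items, k=2):
    """Return the k (id, score) pairs most similar to query, best first.

    items maps an id to its vector. Every stored vector is compared to the
    query, so this scales linearly with the number of items.
    """
    scored = [(item_id, cosine_similarity(query, vec)) for item_id, vec in items.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy store: id -> 3-dimensional vector.
store = {
    "sunset": [0.9, 0.1, 0.3],
    "beach": [0.8, 0.2, 0.4],
    "invoice": [0.1, 0.9, 0.0],
}

results = top_k([0.9, 0.1, 0.3], store, k=2)
print([item_id for item_id, _ in results])  # ['sunset', 'beach']
```

Note that the query never mentions an id or a keyword: the ranking comes entirely from how close the numbers are.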
      2. Which of the following is the correct way to insert a vector into a Pinecone index using Python? (Assume index is a Pinecone index object.)
      easy
      A. index.insert(id='vec1', vector=[0.1, 0.2, 0.3])
      B. index.upsert(vectors=[('vec1', [0.1, 0.2, 0.3])])
      C. index.add_vector('vec1', [0.1, 0.2, 0.3])
      D. index.push_vector(id='vec1', vector=[0.1, 0.2, 0.3])

      Solution

      1. Step 1: Recall Pinecone's method to add vectors

        Pinecone uses the 'upsert' method, called on an index object, to insert or update vectors. It accepts a list of (id, vector) tuples.
      2. Step 2: Match the correct syntax

        index.upsert(vectors=[('vec1', [0.1, 0.2, 0.3])]) calls 'upsert' with a list of tuples, which is the correct syntax.
      3. Final Answer:

        index.upsert(vectors=[('vec1', [0.1, 0.2, 0.3])]) -> Option B
      4. Quick Check:

        Use upsert with list of (id, vector) tuples [OK]
      Hint: Pinecone uses upsert() with list of (id, vector) [OK]
      Common Mistakes:
      • Using insert() instead of upsert()
      • Passing vector without wrapping in a list
      • Using non-existent methods like add_vector or push_vector
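In the Pinecone Python client, upsert is a method of an index object, and "upsert" means insert-or-overwrite by id. The sketch below shows that behavior without needing an API key: `upsert_vectors` is a hypothetical helper, and `FakeIndex` is a hypothetical in-memory stand-in that only mimics the `upsert(vectors=...)` call shape, not the real service or its exact response object.

```python
def upsert_vectors(index, items):
    """Upsert {id: vector} pairs through any object exposing upsert(vectors=...)."""
    vectors = [(item_id, list(values)) for item_id, values in items.items()]
    return index.upsert(vectors=vectors)


class FakeIndex:
    """Hypothetical stand-in for a Pinecone index, for illustration only."""

    def __init__(self):
        self.stored = {}

    def upsert(self, vectors):
        for item_id, values in vectors:
            self.stored[item_id] = values  # insert a new id or overwrite an existing one
        return {"upserted_count": len(vectors)}  # illustrative return value


# With a recent version of the real client, the index would instead come from
# something like (assumption, requires an API key and an existing index):
#   from pinecone import Pinecone
#   index = Pinecone(api_key="...").Index("my-index")
index = FakeIndex()
upsert_vectors(index, {"vec1": [0.1, 0.2, 0.3]})
upsert_vectors(index, {"vec1": [0.4, 0.5, 0.6]})  # same id: replaced, not duplicated
print(index.stored)  # {'vec1': [0.4, 0.5, 0.6]}
```

Because the second call reuses the id `vec1`, the store still holds one entry afterwards, which is the difference between an upsert and a plain insert.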
      3. Given the following code snippet using ChromaDB, what will be the output?
      collection.add(ids=['1'], embeddings=[[0.1, 0.2, 0.3]], metadatas=[{'type': 'image'}], documents=['cat image'])
      results = collection.query(query_embeddings=[[0.1, 0.2, 0.3]], n_results=1)
      print(results['documents'])
      medium
      A. [['cat image']]
      B. ['cat image']
      C. [{'type': 'image'}]
      D. []

      Solution

      1. Step 1: Understand what add() does in ChromaDB

        The add() method stores the document with its vector and metadata in the collection.
      2. Step 2: Understand query() output format

        The query() method returns a dictionary whose 'documents' key holds a list of lists: one inner list of matched documents per query embedding. With a single query embedding and n_results=1, that is [['cat image']].
      3. Final Answer:

        [['cat image']] -> Option A
      4. Quick Check:

        Query returns list of lists of documents [OK]
      Hint: ChromaDB query returns list of lists for documents [OK]
      Common Mistakes:
      • Expecting a flat list instead of list of lists
      • Confusing documents with metadata
      • Assuming empty result when vector matches exactly
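The nesting is easier to see when the result is unpacked. The sketch below does not call ChromaDB; it uses a plain dictionary shaped like the result described above, with values copied from the quiz snippet, and a hypothetical helper `documents_for_query` to pull out the flat list for one query.

```python
# A dictionary shaped like the query() result discussed above.
# The outer list has one entry per query embedding; each inner list
# holds up to n_results matches for that query.
results = {
    "ids": [["1"]],
    "documents": [["cat image"]],
    "metadatas": [[{"type": "image"}]],
}


def documents_for_query(results, query_index=0):
    """Return the flat list of matched documents for one query embedding."""
    return results["documents"][query_index]


print(results["documents"])          # [['cat image']]
print(documents_for_query(results))  # ['cat image']
```

Indexing with `[0]` is what turns the quiz's answer A into the flat list shown in option B.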
      4. You wrote this Weaviate query to find similar items but get an error:
      client.query.get('Article', ['title']).with_near_vector({'vector': [0.1, 0.2]}).do()
      What is the likely cause of the error?
      medium
      A. The query must include a filter parameter
      B. The method with_near_vector does not exist in Weaviate client
      C. The class name 'Article' must be lowercase
      D. The vector length is too short; it should match the database dimension

      Solution

      1. Step 1: Check vector length requirement in Weaviate

        Weaviate expects the query vector's length to match the dimensionality of the vectors already stored for that class. Real embeddings usually have hundreds of dimensions or more.
      2. Step 2: Identify the error cause

        The vector [0.1, 0.2] has length 2, which almost certainly does not match the stored dimension, causing the error.
      3. Final Answer:

        The vector length is too short; it should match the database dimension -> Option D
      4. Quick Check:

        Vector length must match index dimension [OK]
      Hint: Vector length must match index dimension in Weaviate [OK]
      Common Mistakes:
      • Thinking method name is wrong
      • Assuming class names must be lowercase
      • Believing filter is always required
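A cheap guard against this error is to check the query vector's length before sending it. This is a minimal sketch; `check_dimension` and `stored_dim` are hypothetical names, and the value 3 is a placeholder for whatever dimension your embedding model actually produces.

```python
def check_dimension(vector, expected_dim):
    """Raise a clear error if the query vector does not match the stored dimension."""
    if len(vector) != expected_dim:
        raise ValueError(
            f"query vector has {len(vector)} dimensions, "
            f"but stored vectors have {expected_dim}"
        )
    return vector


stored_dim = 3  # hypothetical: dimension of the vectors already in the class

try:
    check_dimension([0.1, 0.2], stored_dim)
except ValueError as err:
    message = str(err)
    print(message)  # query vector has 2 dimensions, but stored vectors have 3
```

Failing early on the client side gives a clearer message than a server-side error and avoids a wasted request.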
      5. You want to build a search system that finds similar product descriptions using Weaviate. Which steps should you follow to prepare and query the data correctly?
      hard
      A. Create a schema with a vector index, add product descriptions as objects with vectors, then query using nearVector filter
      B. Store product descriptions as plain text only, then query with SQL-like text search
      C. Upload product images only, then query using image metadata filters
      D. Create a schema without vector index, add descriptions, then query using exact match filters

      Solution

      1. Step 1: Define schema with vector index in Weaviate

        To search by similarity, the schema must define a class for product descriptions with vector indexing enabled. Weaviate builds a vector index for each class by default.
      2. Step 2: Add product descriptions as objects with vectors

        Each product description is stored as an object with its vector embedding representing meaning.
      3. Step 3: Query using nearVector filter

        Use the nearVector filter in queries to find objects with vectors close to the query vector.
      4. Final Answer:

        Create a schema with a vector index, add product descriptions as objects with vectors, then query using nearVector filter -> Option A
      5. Quick Check:

        Schema + vectors + nearVector query = correct approach [OK]
      Hint: Schema with vectors + nearVector query = similarity search [OK]
      Common Mistakes:
      • Trying to search plain text without vectors
      • Using exact match filters for similarity search
      • Ignoring schema vector index setup
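The three steps can be sketched with the same v3-style Python client calls used in question 4 (`client.query.get(...).with_near_vector(...).do()`). This is a sketch under assumptions, not a definitive implementation: the class name `Product`, the `description` property, and the helper functions are hypothetical, the embeddings are assumed to be computed elsewhere, and newer client versions expose a different API. The functions take an already-connected `client`, so nothing here contacts a server until you call them.

```python
def product_class_schema():
    """Step 1: class definition for product descriptions (hypothetical names)."""
    return {
        "class": "Product",
        "vectorizer": "none",  # vectors are supplied by the caller, not by a module
        "properties": [{"name": "description", "dataType": ["text"]}],
    }


def index_products(client, products):
    """Step 2: create the class, then store each description with its vector.

    products is an iterable of (description, embedding) pairs.
    """
    client.schema.create_class(product_class_schema())
    for description, embedding in products:
        client.data_object.create(
            {"description": description}, "Product", vector=embedding
        )


def find_similar(client, query_vector, limit=3):
    """Step 3: return the stored descriptions nearest to query_vector."""
    return (
        client.query.get("Product", ["description"])
        .with_near_vector({"vector": query_vector})
        .with_limit(limit)
        .do()
    )
```

The query vector passed to `find_similar` must come from the same embedding model, and so have the same length, as the vectors stored in `index_products`.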