Vector store selection (Pinecone, Chroma, FAISS) in Agentic AI - Cheat Sheet & Quick Revision

Recall & Review
beginner
What is a vector store in the context of AI?
A vector store is a system that saves and organizes data as vectors (lists of numbers) so that AI can quickly find similar items by comparing these vectors.
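To make "comparing these vectors" concrete, here is a minimal sketch of similarity search in plain Python. The three-item `store`, its embedding values, and the helper names `cosine_similarity` and `most_similar` are made up for illustration; real vector stores use optimized indexes rather than a sorted scan.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy "store": item id -> embedding (values invented for this sketch).
store = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.1],
    "car": [0.0, 0.1, 0.9],
}

def most_similar(query, store, k=1):
    """Return the k ids whose vectors score highest against the query."""
    ranked = sorted(
        store,
        key=lambda item_id: cosine_similarity(query, store[item_id]),
        reverse=True,
    )
    return ranked[:k]

nearest = most_similar([0.85, 0.15, 0.05], store, k=2)
print(nearest)
```

The query vector points in roughly the same direction as "cat" and "dog", so those two ids rank ahead of "car".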
intermediate
Name one key difference between Pinecone and FAISS.
Pinecone is a cloud-based managed service that handles scaling and maintenance for you, while FAISS is an open-source library you run on your own machine or server.
beginner
Why might someone choose Chroma as their vector store?
Chroma is easy to set up and use locally, supports fast similarity search, and is good for small to medium projects without needing cloud services.
intermediate
What is a common use case for FAISS?
FAISS is often used for fast similarity search in large datasets when you want full control over your data and infrastructure.
intermediate
How does Pinecone help with scaling vector search?
Pinecone automatically manages the storage and computing resources needed to handle large amounts of vector data, so you don't have to provision, scale, or maintain servers yourself.
Which vector store is a fully managed cloud service?
A. Pinecone
B. Chroma
C. FAISS
D. None of the above
Which vector store is best if you want to run everything locally without cloud dependencies?
A. FAISS
B. Pinecone
C. Chroma
D. All of the above
FAISS is primarily known as:
A. A cloud service
B. A database for text storage
C. A visualization tool
D. An open-source library for similarity search
Which vector store automatically handles scaling and maintenance?
A. FAISS
B. Chroma
C. Pinecone
D. None
If you have a very large dataset and want full control over your infrastructure, which vector store is suitable?
A. Pinecone
B. FAISS
C. None
D. Chroma
Explain the main differences between Pinecone, Chroma, and FAISS as vector stores.
Hint: Think about where and how each vector store runs and what kind of projects they suit.
Describe a scenario where you would choose Pinecone over FAISS or Chroma.
Hint: Consider the benefits of cloud-managed services.
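The selection rules from the flashcards above can be summarized as a toy decision function. This is only a sketch: the function name `pick_vector_store`, its parameters, and the one-million cut-off for "large" are assumptions chosen for illustration, not an official guideline from any of the three projects.

```python
def pick_vector_store(managed_cloud: bool, offline: bool, n_vectors: int) -> str:
    """Toy decision rule summarizing the flashcards; thresholds are assumptions."""
    large = 1_000_000  # assumed cut-off for "large"; tune for your own workload
    if managed_cloud and not offline:
        return "Pinecone"  # managed service handles scaling and maintenance
    if n_vectors >= large:
        return "FAISS"     # self-hosted library suited to large datasets
    return "Chroma"        # easy local setup for small to medium projects

print(pick_vector_store(managed_cloud=True, offline=False, n_vectors=50_000_000))
print(pick_vector_store(managed_cloud=False, offline=True, n_vectors=10_000_000))
print(pick_vector_store(managed_cloud=False, offline=True, n_vectors=20_000))
```

The three calls print Pinecone, FAISS, and Chroma in that order, mirroring the cloud-managed, large-local, and small-local scenarios.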

      Practice

      1.

      Which vector store is best known for easy cloud-based deployment and scalability?

      easy
      A. Pinecone
      B. Chroma
      C. FAISS
      D. Local file system

      Solution

      1. Step 1: Understand cloud-based vector stores

        Pinecone is designed as a managed cloud service, making deployment and scaling easy.
      2. Step 2: Compare with other options

        Chroma and FAISS are typically used locally or self-hosted, not primarily cloud services.
      3. Final Answer:

        Pinecone -> Option A
      4. Quick Check:

        Cloud deployment = Pinecone [OK]
      Hint: Cloud + scalability? Think Pinecone first [OK]
      Common Mistakes:
      • Mistaking FAISS for a cloud service
      • Assuming Chroma is cloud-only
      • Treating the local file system as a vector store
      2.

      Which of the following is the correct way to initialize a FAISS index for 128-dimensional vectors in Python?

      import faiss
      index = faiss.IndexFlatL2(____)
      easy
      A. '128'
      B. IndexFlatL2(128)
      C. faiss.IndexFlatL2(128)
      D. 128

      Solution

      1. Step 1: Understand FAISS index initialization

        The IndexFlatL2 constructor expects an integer dimension, not a string or nested call.
      2. Step 2: Check the correct argument type

        Passing 128 as an integer is correct; quotes or extra calls cause errors.
      3. Final Answer:

        128 -> Option D
      4. Quick Check:

        Dimension as int = 128 [OK]
      Hint: Dimension must be integer, no quotes [OK]
      Common Mistakes:
      • Passing dimension as string
      • Calling constructor inside argument
      • Using undefined names without import
      3.

      Given this code snippet using the Chroma vector store, what will it print?

      from chromadb import Client
      client = Client()
      collection = client.create_collection('test')
      collection.add(ids=['1'], embeddings=[[0.1, 0.2]], metadatas=[{'name': 'item1'}], documents=['doc1'])
      results = collection.query(query_embeddings=[[0.1, 0.2]], n_results=1)
      print(results['documents'])
      medium
      A. [['doc1']]
      B. ['doc1']
      C. [{'name': 'item1'}]
      D. Error: missing parameters

      Solution

      1. Step 1: Understand Chroma query output format

        The query returns a dictionary with keys like 'documents' containing a list of lists of matched documents.
      2. Step 2: Check the printed output

        Printing results['documents'] shows a list containing a list with 'doc1', so output is [['doc1']].
      3. Final Answer:

        [['doc1']] -> Option A
      4. Quick Check:

        Chroma query docs = [['doc1']] [OK]
      Hint: Chroma query returns list of lists for documents [OK]
      Common Mistakes:
      • Expecting flat list instead of nested list
      • Confusing metadata with documents
      • Assuming query returns error without reason
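The nesting exists because a query can carry several embeddings at once: the outer list has one entry per query embedding, and each inner list holds that query's matches. The sketch below uses a hand-written dict as a stand-in for the query result, containing only the keys needed here, so it runs without Chroma installed; the variable names `first_query_docs` and `matches` are illustrative.

```python
# Hand-written stand-in for the dict returned by collection.query(...) in the
# question; it mirrors the nested shape but is not produced by Chroma here.
results = {
    "ids": [["1"]],
    "documents": [["doc1"]],
    "metadatas": [[{"name": "item1"}]],
}

# Index 0 selects the matches for the first (and only) query embedding.
first_query_docs = results["documents"][0]

# Pair each matched id with its document and metadata for that query.
matches = list(zip(results["ids"][0], first_query_docs, results["metadatas"][0]))
print(first_query_docs)
print(matches)
```

Indexing with `[0]` is what turns the nested `[['doc1']]` into the flat `['doc1']` that many readers expect.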
      4.

      What is the main error in this FAISS usage code snippet?

      import faiss
      index = faiss.IndexFlatL2(64)
      vectors = [[0.1]*64, [0.2]*64]
      index.add(vectors)
      print(index.ntotal)
      medium
      A. Vectors length must be 63, not 64
      B. Vectors must be a numpy array of type float32
      C. ntotal is not a valid attribute
      D. Index dimension should be 128, not 64

      Solution

      1. Step 1: Check vector data type for FAISS

        FAISS requires vectors as numpy arrays with dtype float32, not Python lists.
      2. Step 2: Identify the error cause

        Passing a plain Python list makes index.add fail (the exact exception depends on the FAISS version); converting the data with np.array(vectors, dtype='float32') fixes it.
      3. Final Answer:

        Vectors must be a numpy array of type float32 -> Option B
      4. Quick Check:

        FAISS vectors = numpy float32 array [OK]
      Hint: FAISS needs numpy float32 arrays, not lists [OK]
      Common Mistakes:
      • Using Python lists instead of numpy arrays
      • Wrong dimension assumption
      • Misunderstanding ntotal attribute
      5.

      You have a large dataset of 10 million vectors and want fast similarity search on your local machine without internet. Which vector store is the best choice?

      hard
      A. Pinecone
      B. Chroma
      C. FAISS
      D. SQLite database

      Solution

      1. Step 1: Consider dataset size and environment

        10 million vectors is large; local machine without internet means no cloud services.
      2. Step 2: Match vector store to requirements

        FAISS is optimized for large-scale local similarity search and does not require internet.
      3. Step 3: Exclude other options

        Pinecone is cloud-based and needs internet access; Chroma is less optimized for very large local datasets; SQLite is a relational database with no built-in vector search.
      4. Final Answer:

        FAISS -> Option C
      5. Quick Check:

        Large local dataset = FAISS [OK]
      Hint: Big local data? Choose FAISS for speed [OK]
      Common Mistakes:
      • Choosing cloud-based Pinecone for offline use
      • Assuming Chroma handles huge data best locally
      • Using SQLite as vector store
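Whether 10 million vectors fit on a local machine depends on their dimension, which the question does not state. As a rough back-of-the-envelope sketch, a flat index that keeps uncompressed float32 vectors needs about n × d × 4 bytes for the raw data. The dimension of 128 and the helper name `flat_index_bytes` below are assumptions for illustration.

```python
def flat_index_bytes(n_vectors: int, dim: int, bytes_per_float: int = 4) -> int:
    """Raw storage for uncompressed float32 vectors: n * dim * 4 bytes."""
    return n_vectors * dim * bytes_per_float

n_vectors = 10_000_000
dim = 128  # assumed; the question does not give a dimension

size_gb = flat_index_bytes(n_vectors, dim) / 1e9
print(f"{size_gb:.2f} GB")
```

Under these assumptions the raw vectors take about 5.12 GB, which is within reach of a single workstation; larger dimensions scale the figure proportionally.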