
Rate limiting and abuse prevention in Prompt Engineering / GenAI - Cheat Sheet & Quick Revision

Recall & Review
beginner
What is rate limiting in the context of AI services?
Rate limiting is a technique used to control how many requests a user or system can make to an AI service in a given time. It helps prevent overload and abuse by limiting excessive use.
beginner
Why is abuse prevention important for AI models?
Abuse prevention protects AI models from harmful or excessive use that can cause service disruption, unfair resource use, or biased outputs. It ensures fair access and maintains system reliability.
intermediate
Name two common methods used for rate limiting.
Two common methods are:
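1. Token Bucket: Tokens are added to a bucket at a fixed rate, up to a maximum capacity; each request spends one token and is rejected when the bucket is empty.
2. Fixed Window: Requests are counted in fixed time windows, and requests over the limit are rejected, like 100 requests per minute.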
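As a rough illustration of the token bucket idea, here is a minimal single-process sketch. The class name `TokenBucket`, its parameters, and the injectable `clock` are assumptions made for this example, not part of any particular library.

```python
import time


class TokenBucket:
    """Hypothetical token bucket: refills `rate` tokens per second, up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock              # injectable so the limiter can be tested
        self.tokens = float(capacity)   # start with a full bucket
        self.last_refill = clock()

    def allow(self):
        """Spend one token if available; return False when the bucket is empty."""
        now = self.clock()
        elapsed = now - self.last_refill
        # Add the tokens earned since the last call, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

Because the bucket starts full, a short burst of up to `capacity` requests is allowed, after which requests are admitted at roughly `rate` per second.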
intermediate
How can machine learning help in abuse prevention?
Machine learning can detect unusual patterns or behaviors that suggest abuse, like sudden spikes in requests or suspicious inputs, and trigger protective actions automatically.
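A trained model is beyond the scope of a cheat sheet, so the sketch below swaps in a simple statistical baseline (a z-score on per-minute request counts) to show the general shape of spike detection. The function name `is_spike` and the threshold of 3 standard deviations are assumptions chosen for illustration.

```python
from statistics import mean, stdev


def is_spike(history, current, threshold=3.0):
    """Flag `current` if it is more than `threshold` standard deviations
    above the mean of `history` (past per-minute request counts)."""
    if len(history) < 2:
        return False  # not enough data to estimate normal behavior
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # Perfectly flat history: any increase counts as unusual.
        return current > mu
    return (current - mu) / sigma > threshold


normal_traffic = [10, 12, 11, 9, 10, 11]
print(is_spike(normal_traffic, 12))  # False
print(is_spike(normal_traffic, 60))  # True
```

In a real system, a flagged spike would typically trigger a protective action such as a stricter limit or a temporary block.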
beginner
What is a real-life example of rate limiting?
A website might allow only 5 login attempts per minute to stop hackers from guessing passwords. This is rate limiting to prevent abuse.
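The login example can be sketched as a fixed window counter keyed by username. This is a minimal in-memory sketch; `FixedWindowLimiter`, its parameters, and the injectable `clock` are hypothetical names for this example, and a real site would usually keep the counters in shared storage.

```python
import time


class FixedWindowLimiter:
    """Hypothetical fixed window limiter: at most `limit` attempts per key per window."""

    def __init__(self, limit=5, window_seconds=60, clock=time.time):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock
        self.counters = {}  # key -> (window index, attempts counted in that window)

    def allow(self, key):
        window = int(self.clock() // self.window_seconds)
        stored_window, count = self.counters.get(key, (window, 0))
        if stored_window != window:
            count = 0  # a new window has started, so the count resets
        if count >= self.limit:
            self.counters[key] = (window, count)
            return False
        self.counters[key] = (window, count + 1)
        return True
```

With the defaults, the sixth login attempt for the same username inside one minute is rejected, and the count resets when the next minute begins.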
What does rate limiting primarily help prevent?
A. Too many requests in a short time
B. Slow internet connection
C. Incorrect AI predictions
D. Data storage overflow
Which method uses tokens to control request rates?
A. Random Sampling
B. Fixed Window
C. Sliding Window
D. Token Bucket
Why is abuse prevention important for AI services?
A. To reduce training time
B. To increase AI model size
C. To protect from harmful or excessive use
D. To improve color contrast
Which of these is NOT a sign of abuse detected by machine learning?
A. Sudden spike in requests
B. Consistent normal usage
C. Suspicious input patterns
D. Repeated failed attempts
A website limits login attempts to 5 per minute. This is an example of:
A. Rate limiting
B. Data augmentation
C. Model training
D. Feature extraction
Explain what rate limiting is and why it is important in AI services.
Think about how many times you can ask a question before being told to wait.
Describe how machine learning can help detect abuse in AI systems.
Consider how a security guard notices strange behavior.

      Practice

      1. What is the main purpose of rate limiting in AI services?
      easy
      A. To improve the accuracy of AI models
      B. To increase the speed of AI predictions
      C. To stop too many requests from one user in a short time
      D. To reduce the size of the AI model

      Solution

      1. Step 1: Understand rate limiting concept

        Rate limiting is designed to control how many requests a user can make in a short period.
      2. Step 2: Identify the main goal

        The goal is to prevent overload and abuse by rejecting requests that arrive too quickly.
      3. Final Answer:

        To stop too many requests from one user in a short time -> Option C
      4. Quick Check:

        Rate limiting = stop excess requests [OK]
      Hint: Rate limiting controls request frequency to prevent overload [OK]
      Common Mistakes:
      • Confusing rate limiting with improving model accuracy
      • Thinking rate limiting speeds up predictions
      • Assuming rate limiting reduces model size
      2. Which Python code snippet correctly implements a simple rate limiter that blocks further requests once 5 calls have been made (requests_count holds the number of calls made so far)?
      easy
      A. if requests_count >= 5: block_request()
      B. if requests_count == 5: allow_request()
      C. if requests_count < 5: block_request()
      D. if requests_count > 5: block_request()

      Solution

      1. Step 1: Understand the condition for blocking

        We want to block requests when the count reaches or exceeds 5, so >= 5 is correct.
      2. Step 2: Check each option

        if requests_count >= 5: block_request() uses '>= 5' to block requests, which matches the requirement.
      3. Final Answer:

        if requests_count >= 5: block_request() -> Option A
      4. Quick Check:

        Block when count is 5 or more = >= 5 [OK]
      Hint: Use '>=' to include the limit value when blocking [OK]
      Common Mistakes:
      • Using '>' misses blocking exactly at 5
      • Using '<' blocks the first five requests and lets all later ones through
      • Allowing request at count 5 instead of blocking
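To see the off-by-one difference concretely, the sketch below compares '>=' with '>'. It assumes `requests_count` is the number of calls already made before the current one; the helper names are hypothetical and stand in for the `block_request()` decision.

```python
def should_block(requests_count, limit=5):
    """Block once `limit` calls have already been made."""
    return requests_count >= limit


def should_block_off_by_one(requests_count, limit=5):
    """Buggy variant using '>': lets one extra call through."""
    return requests_count > limit


# Simulate 10 incoming calls; n is the number of calls made before each one.
allowed_correct = sum(not should_block(n) for n in range(10))
allowed_buggy = sum(not should_block_off_by_one(n) for n in range(10))
print(allowed_correct, allowed_buggy)  # 5 6
```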
      3. Given the code below, what will be printed after 7 calls to check_request()?
      requests_count = 0
      def block_request():
          print('Blocked')
      def allow_request():
          print('Allowed')
      def check_request():
          global requests_count
          requests_count += 1
          if requests_count >= 5:
              block_request()
          else:
              allow_request()
      
      for _ in range(7):
          check_request()
      medium
      A. Allowed printed 7 times
      B. Blocked printed 5 times, Allowed printed 2 times
      C. Allowed printed 5 times, Blocked printed 2 times
      D. Allowed printed 4 times, Blocked printed 3 times

      Solution

      1. Step 1: Track requests_count and output

        For calls 1 to 4, requests_count is less than 5, so 'Allowed' prints. For calls 5 to 7, requests_count is 5 or more, so 'Blocked' prints.
      2. Step 2: Count prints

        'Allowed' prints 4 times, 'Blocked' prints 3 times.
      3. Final Answer:

        Allowed printed 4 times, Blocked printed 3 times -> Option D
      4. Quick Check:

        4 Allowed + 3 Blocked = 7 calls [OK]
      Hint: Count calls before and after limit to find outputs [OK]
      Common Mistakes:
      • Counting 'Allowed' as 5 times instead of 4
      • Confusing when blocking starts
      • Ignoring global variable increment
      4. The following code is meant to allow 2 calls and block from the 3rd call onward, but it allows 3 calls and blocks only from the 4th. What is the error?
      requests_count = 0
      def check_request():
          global requests_count
          requests_count += 1
          if requests_count > 3:
              print('Blocked')
          else:
              print('Allowed')
      medium
      A. The requests_count should start at 1, not 0
      B. The condition should be '>= 3' instead of '> 3'
      C. The print statements are reversed
      D. The global keyword is missing

      Solution

      1. Step 1: Analyze the blocking condition

        The code blocks only when requests_count > 3, so blocking starts at 4th call, not 3rd.
      2. Step 2: Fix condition to block at 3 calls

        Changing condition to '>= 3' will block starting at the 3rd call as intended.
      3. Final Answer:

        The condition should be '>= 3' instead of '> 3' -> Option B
      4. Quick Check:

        Block at 3 calls means '>= 3' [OK]
      Hint: Use '>=' to include the limit call in blocking [OK]
      Common Mistakes:
      • Using '> 3' blocks one call too late
      • Starting the count at 1 instead of 0 would hide the bug but leave the counter off by one
      • Forgetting global keyword (but it's present here)
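A corrected version might look like the sketch below. The `LIMIT` constant is an assumption added for readability, and the function returns its decision instead of printing so the result is easy to inspect.

```python
requests_count = 0
LIMIT = 2  # hypothetical constant: number of calls allowed before blocking


def check_request():
    global requests_count
    requests_count += 1
    # '> LIMIT' with LIMIT = 2 is equivalent to '>= 3': block from the 3rd call.
    if requests_count > LIMIT:
        return 'Blocked'
    return 'Allowed'


results = [check_request() for _ in range(4)]
print(results)  # ['Allowed', 'Allowed', 'Blocked', 'Blocked']
```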
      5. You want to prevent abuse by limiting users to 10 requests per minute. Which approach best combines rate limiting with user tracking in Python?
      hard
      A. Use a dictionary to store user IDs with timestamps of their requests, then block if more than 10 in last 60 seconds
      B. Reset a global request count every minute without user distinction
      C. Block all requests after 10 total requests regardless of user
      D. Allow unlimited requests but slow down responses after 10 requests

      Solution

      1. Step 1: Understand per-user rate limiting

        To limit requests per user, we must track each user's request times separately.
      2. Step 2: Choose data structure and logic

        A dictionary with user IDs as keys and a list of each user's request timestamps as values lets us count requests in the last 60 seconds and block if over 10.
      3. Final Answer:

        Use a dictionary to store user IDs with timestamps of their requests, then block if more than 10 in last 60 seconds -> Option A
      4. Quick Check:

        Per-user tracking + time window = dictionary with timestamps [OK]
      Hint: Track each user's timestamps to count requests per minute [OK]
      Common Mistakes:
      • Using global count ignores individual users
      • Blocking all users after total requests causes unfair blocking
      • Slowing responses is throttling, not strict rate limiting, and still allows unlimited requests
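The per-user approach from Option A can be sketched as follows. This is a minimal in-memory version under stated assumptions: `PerUserRateLimiter`, its parameters, and the injectable `clock` are hypothetical names, and rejected requests are not recorded in the history.

```python
import time
from collections import defaultdict, deque


class PerUserRateLimiter:
    """Hypothetical limiter: at most `limit` requests per user in the last `window_seconds`."""

    def __init__(self, limit=10, window_seconds=60, clock=time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock
        self.requests = defaultdict(deque)  # user_id -> timestamps of accepted requests

    def allow(self, user_id):
        now = self.clock()
        timestamps = self.requests[user_id]
        # Drop timestamps that have fallen out of the time window.
        while timestamps and now - timestamps[0] >= self.window_seconds:
            timestamps.popleft()
        if len(timestamps) >= self.limit:
            return False  # this user already used the full allowance
        timestamps.append(now)
        return True
```

Each user has a separate history, so one user hitting the limit does not affect anyone else. A deque is used instead of a plain list so that old timestamps can be removed cheaply from the front.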