Cost optimization in Prompt Engineering / GenAI - ML Experiment: Train & Evaluate

Experiment - Cost optimization
Problem: You have a machine learning model deployed on cloud infrastructure. The model works well, but the monthly cloud cost is very high due to large compute and storage usage.
Current Metrics: Model accuracy: 92%, Monthly cloud cost: $1200
Issue: The cloud cost is too high for the current budget, even though the model accuracy is good.
Your Task
Reduce the monthly cloud cost by at least 30% while keeping model accuracy above 90%.
Do not reduce the model accuracy below 90%.
Do not change the model architecture or training data.
Focus on optimizing deployment and resource usage.
Solution
import torch
import torch.quantization

# Load the trained model onto the CPU (dynamic quantization targets CPU inference).
# weights_only=False is needed on recent PyTorch versions to unpickle a full
# model object; only use it with files you trust.
model = torch.load('model.pth', map_location='cpu', weights_only=False)
model.eval()

# Apply dynamic quantization: Linear weights are stored as int8,
# which reduces model size and can speed up inference
quantized_model = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)

# Save the quantized model
torch.save(quantized_model, 'quantized_model.pth')

# Example: Batch prediction function
def batch_predict(model, data_batches):
    """Run inference batch by batch and return a flat list of predicted class indices."""
    results = []
    with torch.no_grad():  # gradients are not needed at inference time
        for batch in data_batches:
            inputs = torch.as_tensor(batch, dtype=torch.float32)
            outputs = model(inputs)
            preds = torch.argmax(outputs, dim=1).tolist()
            results.extend(preds)
    return results

# Deployment optimization notes:
# - Use spot instances for inference servers to reduce cost.
# - Use lower-cost storage for model artifacts.
# - Cache frequent prediction results to avoid repeated computation.
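The batch_predict function above expects data_batches to be an iterable of batches, but the solution does not show how those batches are built. Here is a minimal sketch, assuming the inputs are a plain Python list of feature rows. The make_batches helper, the sample rows, and the batch size are hypothetical and not part of the original solution.

```python
def make_batches(rows, batch_size):
    """Yield consecutive slices of `rows`, each with at most `batch_size` items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]


# Hypothetical input: five rows with two features each.
rows = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8], [0.9, 1.0]]
batches = list(make_batches(rows, batch_size=2))

# The last batch may be smaller than batch_size.
batch_sizes_seen = [len(b) for b in batches]
```

Each batch is a list of rows, so it can be passed to batch_predict as is. Larger batches usually mean fewer model calls, at the price of more memory per call.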
Applied dynamic quantization to reduce model size and inference cost.
Implemented batch prediction to reduce compute overhead.
Suggested using spot instances and cheaper storage to lower cloud costs.
Recommended caching repeated predictions to save compute.
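The caching recommendation can be sketched as a small in-memory lookup in front of the model. This is only an illustration under stated assumptions: PredictionCache and fake_model are hypothetical names, the features are assumed to be hashable once converted to a tuple, and a real deployment would probably need a size limit and an eviction policy.

```python
class PredictionCache:
    """Memoize predictions, keyed by the input features as a tuple."""

    def __init__(self, predict_fn):
        self._predict_fn = predict_fn
        self._store = {}
        self.hits = 0
        self.misses = 0

    def predict(self, features):
        key = tuple(features)
        if key in self._store:
            self.hits += 1
        else:
            # Only call the (expensive) model for inputs not seen before.
            self.misses += 1
            self._store[key] = self._predict_fn(features)
        return self._store[key]


def fake_model(features):
    # Hypothetical stand-in for an expensive model call.
    return int(sum(features) > 1.0)


cache = PredictionCache(fake_model)
requests = [[0.2, 0.3], [0.9, 0.4], [0.2, 0.3]]
answers = [cache.predict(x) for x in requests]
```

The third request repeats the first, so it is served from the cache and the model runs only twice. The saving depends entirely on how often real traffic repeats inputs.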
Results Interpretation

Before Optimization: Accuracy = 92%, Cost = $1200

After Optimization: Accuracy = 91.5%, Cost = $820

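Optimizing deployment and resource usage cut the monthly cost from $1200 to $820, a reduction of about 32% that meets the 30% target, while accuracy dropped only 0.5 points to 91.5% and stayed above the 90% floor. Techniques like quantization and batch prediction lower compute needs, which shows that cloud costs can be reduced with minimal impact on model accuracy.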
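The before and after figures can be checked against the task's two constraints with a few lines of arithmetic. This sketch uses only the numbers from this experiment; the function names are hypothetical.

```python
def cost_reduction_pct(cost_before, cost_after):
    """Percentage by which the cost dropped relative to the original cost."""
    return (cost_before - cost_after) / cost_before * 100


def meets_targets(cost_before, cost_after, accuracy,
                  min_reduction_pct=30.0, min_accuracy=90.0):
    """True if cost fell by at least the target and accuracy stays above the floor."""
    reduction = cost_reduction_pct(cost_before, cost_after)
    return reduction >= min_reduction_pct and accuracy > min_accuracy


reduction = cost_reduction_pct(1200, 820)   # (1200 - 820) / 1200 * 100
target_cost = 1200 * (1 - 0.30)             # highest cost that still meets the goal
ok = meets_targets(1200, 820, accuracy=91.5)
```

A 30% cut from $1200 corresponds to a cost of at most $840, so $820 passes with a reduction of roughly 31.7%.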
Bonus Experiment
Try pruning the model weights to further reduce size and cost while keeping accuracy above 90%.
💡 Hint
Use PyTorch pruning methods like global unstructured pruning and fine-tune the model after pruning.
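A minimal sketch of the hint, using torch.nn.utils.prune on a small stand-in network. The layer sizes and the 30% pruning amount are assumptions chosen for illustration, the fine-tuning step is left out, and the deployed model from the experiment is not used here.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

torch.manual_seed(0)

# Hypothetical stand-in model; replace with the real trained model.
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))

# Collect the weight tensors of all Linear layers for global pruning.
params_to_prune = [
    (module, "weight") for module in model.modules()
    if isinstance(module, nn.Linear)
]

# Zero out the 30% of weights with the smallest magnitude across all layers.
prune.global_unstructured(
    params_to_prune,
    pruning_method=prune.L1Unstructured,
    amount=0.3,
)

# Make the pruning permanent (drops the mask and the original weight copy).
for module, name in params_to_prune:
    prune.remove(module, name)

total = sum(module.weight.numel() for module, _ in params_to_prune)
zeros = sum(int((module.weight == 0).sum()) for module, _ in params_to_prune)
sparsity = zeros / total
```

After this step the model would normally be fine-tuned and re-evaluated to confirm accuracy stays above 90%. Note that unstructured pruning only sets weights to zero. A dense checkpoint does not get smaller unless it is also compressed or stored in a sparse format.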

Practice

1.

What is the main goal of cost optimization in machine learning?

easy
A. To reduce expenses while keeping good model accuracy
B. To make the model as large as possible
C. To use all available data regardless of cost
D. To increase training time for better results

Solution

  1. Step 1: Understand cost optimization meaning

    Cost optimization means saving money and resources in AI work.
  2. Step 2: Connect cost saving with accuracy

    Good cost optimization keeps accuracy high while lowering expenses.
  3. Final Answer:

    To reduce expenses while keeping good model accuracy -> Option A
  4. Quick Check:

    Cost optimization = reduce cost + keep accuracy [OK]
Hint: Cost optimization balances cost and accuracy [OK]
Common Mistakes:
  • Thinking bigger models are always more cost-effective
  • Ignoring accuracy when saving cost
  • Assuming more data always reduces cost
2.

Which of the following is the correct way to reduce training cost in AI?

options = [
  'Use smaller models',
  'Train on all data without filtering',
  'Increase batch size unnecessarily',
  'Use slower hardware'
]
easy
A. Use slower hardware
B. Train on all data without filtering
C. Use smaller models
D. Increase batch size unnecessarily

Solution

  1. Step 1: Identify cost-saving methods

    Using smaller models reduces computation and memory, lowering cost.
  2. Step 2: Evaluate other options

    Training on all data without filtering, increasing batch size unnecessarily, or using slower hardware either raises cost or slows training.
  3. Final Answer:

    Use smaller models -> Option C
  4. Quick Check:

    Smaller models reduce cost [OK]
Hint: Smaller models usually cost less to train [OK]
Common Mistakes:
  • Thinking more data always reduces cost
  • Believing bigger batch size always helps
  • Assuming slower hardware saves money
3.

Consider this Python code, which uses a toy model of training cost for different batch sizes:

batch_sizes = [16, 32, 64]
costs = []
for b in batch_sizes:
    cost = 1000 / b  # cost inversely proportional to batch size
    costs.append(cost)
print(costs)

What is the output of this code?

medium
A. [64, 32, 16]
B. [16, 32, 64]
C. [15.625, 31.25, 62.5]
D. [62.5, 31.25, 15.625]

Solution

  1. Step 1: Calculate cost for each batch size

    For batch size 16: 1000/16 = 62.5; for 32: 1000/32 = 31.25; for 64: 1000/64 = 15.625.
  2. Step 2: Collect costs in list and print

    The costs list becomes [62.5, 31.25, 15.625], which is printed.
  3. Final Answer:

    [62.5, 31.25, 15.625] -> Option D
  4. Quick Check:

    Cost = 1000 / batch size [OK]
Hint: Divide 1000 by each batch size to get costs [OK]
Common Mistakes:
  • Confusing batch sizes with costs
  • Mixing up division order
  • Copying batch_sizes list instead of costs
4.

Find the error in this code snippet that tries to reduce training cost by skipping data points:

data = [1, 2, 3, 4, 5]
reduced_data = [x for x in data if x > 3]
print(reduced_data)

What is the problem if the goal is to keep most data but reduce cost?

medium
A. It removes too many data points, hurting accuracy
B. It does not remove any data points
C. It causes a syntax error
D. It duplicates data points

Solution

  1. Step 1: Understand filtering condition

    The code keeps only data points greater than 3, so it removes 1, 2, and 3: three of the five points (60%).
  2. Step 2: Assess impact on data and cost

    Dropping most of the data lowers training cost, but it conflicts with the goal of keeping most data and is likely to hurt model accuracy.
  3. Final Answer:

    It removes too many data points, hurting accuracy -> Option A
  4. Quick Check:

    Filtering >3 removes many points [OK]
Hint: Check how much data filtering removes [OK]
Common Mistakes:
  • Thinking it keeps most data
  • Expecting syntax error
  • Assuming data duplicates
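The hint for this question can be made concrete by measuring how much of the dataset survives a filter before training on it. In this sketch the kept_fraction helper and the milder threshold x > 1 are hypothetical and serve only as a contrast to the filter in the question.

```python
def kept_fraction(original, filtered):
    """Share of the original data points that remain after filtering."""
    return len(filtered) / len(original)


data = [1, 2, 3, 4, 5]

# Filter from the question: keeps only 4 and 5.
aggressive = [x for x in data if x > 3]

# Hypothetical milder filter: drops only the smallest value.
mild = [x for x in data if x > 1]

aggressive_kept = kept_fraction(data, aggressive)
mild_kept = kept_fraction(data, mild)
```

The filter from the question keeps 40% of the data, while the milder one keeps 80%, which is closer to the stated goal of keeping most data while still reducing cost.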
5.

You want to optimize cost for training a language model. You have these options:

  • Use a smaller model
  • Train on a filtered smaller dataset
  • Use mixed precision training
  • Train longer with bigger batch size

Which combination best balances cost and accuracy?

hard
A. Train longer with bigger batch size only
B. Use smaller model + filtered dataset + mixed precision
C. Use smaller model only
D. Train on full dataset with no precision changes

Solution

  1. Step 1: Analyze each option's effect on cost and accuracy

    Smaller model reduces cost; filtered dataset reduces data size; mixed precision speeds training and saves memory.
  2. Step 2: Combine options for best balance

    Using all three together lowers cost while keeping good accuracy. Training longer with bigger batch size alone increases cost.
  3. Final Answer:

    Use smaller model + filtered dataset + mixed precision -> Option B
  4. Quick Check:

    Combine cost-saving methods for best results [OK]
Hint: Combine multiple cost-saving methods for best effect [OK]
Common Mistakes:
  • Choosing only one method
  • Ignoring accuracy impact
  • Assuming longer training always helps
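Of the three methods in the correct answer, mixed precision is the least obvious to set up, so here is a hedged sketch of one training step with torch.autocast. It assumes CPU execution with bfloat16 and uses a hypothetical toy model with random data. On a GPU the usual setup differs (device_type="cuda", often float16 with a gradient scaler), which is not shown.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical toy classifier and random data, for illustration only.
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
inputs = torch.randn(16, 4)
labels = torch.randint(0, 2, (16,))

optimizer.zero_grad()

# Inside autocast, eligible ops such as Linear run in bfloat16,
# while the stored parameters remain float32.
with torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    logits = model(inputs)
    loss = nn.functional.cross_entropy(logits, labels)

# Backward pass and optimizer step run outside the autocast block.
loss.backward()
optimizer.step()

weight_dtype = model[0].weight.dtype
loss_is_finite = bool(torch.isfinite(loss))
```

Whether this actually lowers cost depends on the hardware. The speed and memory gains come from accelerators with fast low-precision support, so accuracy and training time should be measured before adopting it.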