
Cost optimization in Prompt Engineering / GenAI - ML Experiment: Train & Evaluate


Experiment - Cost optimization

Problem: You have a machine learning model deployed on cloud infrastructure. The model performs well, but the monthly cloud cost is very high because of heavy compute and storage usage.

Current Metrics: Model accuracy: 92%, monthly cloud cost: $1200

Issue: The cloud cost exceeds the current budget, even though the model accuracy is good.

Your Task

Reduce the monthly cloud cost by at least 30% (from $1200 to $840 or less) while keeping model accuracy above 90%.

Do not reduce the model accuracy below 90%.

Do not change the model architecture or training data.

Focus on optimizing deployment and resource usage.


Solution


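import torch
import torch.quantization

# Load the trained model and switch it to inference mode.
# weights_only=False (available in PyTorch 1.13+) is needed on recent releases
# because 'model.pth' stores the full pickled model, not just a state_dict.
# Only load files from a source you trust.
model = torch.load('model.pth', weights_only=False)
model.eval()

# Apply dynamic quantization: Linear-layer weights are stored as int8,
# which shrinks the model and can speed up CPU inference
quantized_model = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)

# Save the quantized model
torch.save(quantized_model, 'quantized_model.pth')

# Example: batch prediction function
def batch_predict(model, data_batches):
    """Return predicted class indices for an iterable of input batches."""
    results = []
    with torch.no_grad():  # gradients are not needed at inference time
        for batch in data_batches:
            inputs = torch.as_tensor(batch, dtype=torch.float32)
            outputs = model(inputs)
            preds = torch.argmax(outputs, dim=1)
            results.extend(preds.tolist())  # plain Python ints
    return results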

# Deployment optimization notes:
# - Use spot instances for inference servers to reduce cost.
# - Use lower-cost storage for model artifacts.
# - Cache frequent prediction results to avoid repeated computation.
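
One way to check how much dynamic quantization shrinks a model is to compare serialized sizes before and after. The sketch below uses a small stand-in network (`TinyClassifier`, a hypothetical model, not the one from this experiment) and a hypothetical helper `serialized_size_bytes`. The savings on a real model depend on how much of it consists of `Linear` layers.

```python
import io

import torch
import torch.nn as nn


class TinyClassifier(nn.Module):
    """Hypothetical stand-in model used only for this illustration."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(256, 512),
            nn.ReLU(),
            nn.Linear(512, 10),
        )

    def forward(self, x):
        return self.net(x)


def serialized_size_bytes(model):
    """Serialize the model's state_dict in memory and return its size."""
    buffer = io.BytesIO()
    torch.save(model.state_dict(), buffer)
    return buffer.getbuffer().nbytes


model = TinyClassifier().eval()
quantized_model = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

fp32_size = serialized_size_bytes(model)
int8_size = serialized_size_bytes(quantized_model)
print(f"float32: {fp32_size} bytes, int8: {int8_size} bytes")

# The quantized model is still callable with ordinary float inputs
output = quantized_model(torch.randn(4, 256))
```

A smaller artifact lowers storage cost directly. Whether it also lowers compute cost should be confirmed by timing inference on the target hardware.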

Applied dynamic quantization to reduce model size and inference cost.

Implemented batch prediction to reduce compute overhead.

Suggested using spot instances and cheaper storage to lower cloud costs.

Recommended caching repeated predictions to save compute.
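
Of these suggestions, caching is the simplest to show in code. A minimal sketch, assuming inputs can be converted to hashable tuples and that identical inputs recur often. `expensive_predict` is a hypothetical stand-in for a real model call, and `functools.lru_cache` is only one in-process option; a fleet of servers would need a shared cache instead.

```python
from functools import lru_cache

call_count = 0  # counts how often the "model" is actually invoked


def expensive_predict(features):
    """Hypothetical stand-in for a costly model call."""
    global call_count
    call_count += 1
    return int(sum(features) > 1.0)


@lru_cache(maxsize=1024)
def cached_predict(features):
    # features must be hashable (a tuple, not a list) to serve as a cache key
    return expensive_predict(features)


requests = [(0.2, 0.9), (0.1, 0.3), (0.2, 0.9), (0.2, 0.9)]
predictions = [cached_predict(r) for r in requests]
print(predictions, "model calls:", call_count)
```

Four requests arrive but only two distinct inputs reach the model; the repeats are served from the cache.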

Results Interpretation

Before Optimization: Accuracy = 92%, Cost = $1200

After Optimization: Accuracy = 91.5%, Cost = $820

Optimizing deployment and resource usage cut the monthly cost by $380 (about 31.7%), which meets the 30% target, while accuracy fell by only 0.5 percentage points and stayed above 90%. Techniques such as quantization and batch prediction lower compute needs, which is what makes the deployment cheaper.
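
The reported numbers can be checked against the task constraints with a few lines of arithmetic. The helper name `meets_targets` and its parameters are illustrative and not part of the original solution.

```python
def meets_targets(cost_before, cost_after, accuracy_after,
                  min_saving=0.30, min_accuracy=0.90):
    """Return (saving fraction, whether both task constraints hold)."""
    saving = (cost_before - cost_after) / cost_before
    ok = saving >= min_saving and accuracy_after > min_accuracy
    return saving, ok


saving, ok = meets_targets(cost_before=1200, cost_after=820, accuracy_after=0.915)
print(f"Cost reduction: {saving:.1%}, constraints met: {ok}")
```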

Bonus Experiment

Try pruning the model weights to further reduce size and cost while keeping accuracy above 90%.

💡 Hint

Use PyTorch pruning methods like global unstructured pruning and fine-tune the model after pruning.
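
As a starting point for the bonus experiment, here is a minimal sketch of global unstructured pruning with `torch.nn.utils.prune`. The small `nn.Sequential` model is a hypothetical stand-in for the trained model, and the 30% pruning amount is an arbitrary value chosen for illustration, not a recommendation from the source.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# Hypothetical stand-in model; substitute the trained model from the experiment
model = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 10))

# Collect the weight tensors that should compete in one global ranking
parameters_to_prune = [
    (module, "weight") for module in model if isinstance(module, nn.Linear)
]

# Zero out the 30% of weights with the smallest absolute value across all layers
prune.global_unstructured(
    parameters_to_prune,
    pruning_method=prune.L1Unstructured,
    amount=0.3,
)

# ... fine-tune here; the pruning masks keep the removed weights at zero ...

# Make the pruning permanent by removing the mask re-parametrization
for module, name in parameters_to_prune:
    prune.remove(module, name)

total = sum(m.weight.nelement() for m, _ in parameters_to_prune)
zeros = sum(int((m.weight == 0).sum()) for m, _ in parameters_to_prune)
sparsity = zeros / total
print(f"Global sparsity: {sparsity:.1%}")
```

Unstructured pruning sets weights to zero but leaves the tensors dense, so the saved file only gets smaller once it is compressed or stored in a sparse format. Re-measure accuracy after fine-tuning to confirm it is still above 90%.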

Practice


What is the main goal of cost optimization in machine learning?

easy

A. To reduce expenses while keeping good model accuracy

B. To make the model as large as possible

C. To use all available data regardless of cost

D. To increase training time for better results

Which of the following is the correct way to reduce training cost in AI?

options = [
  'Use smaller models',
  'Train on all data without filtering',
  'Increase batch size unnecessarily',
  'Use slower hardware'
]

easy

A. Use slower hardware

B. Train on all data without filtering

C. Use smaller models

D. Increase batch size unnecessarily

Consider this Python code, which estimates training cost for different batch sizes under the assumption that cost is inversely proportional to batch size:

batch_sizes = [16, 32, 64]
costs = []
for b in batch_sizes:
    cost = 1000 / b  # cost inversely proportional to batch size
    costs.append(cost)
print(costs)

What is the output of this code?

medium

A. [64, 32, 16]

B. [16, 32, 64]

C. [15.625, 31.25, 62.5]

D. [62.5, 31.25, 15.625]

Find the error in this code snippet that tries to reduce training cost by skipping data points:

data = [1, 2, 3, 4, 5]
reduced_data = [x for x in data if x > 3]
print(reduced_data)

What is the problem if the goal is to keep most data but reduce cost?

medium

A. It removes too many data points, hurting accuracy

B. It does not remove any data points

C. It causes a syntax error

D. It duplicates data points


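Solution (question 1)

Step 1: Understand cost optimization meaning

Cost optimization means spending less on compute, storage, and time without giving up the results you need.

Step 2: Connect cost saving with accuracy

Savings only count if the model stays useful, so accuracy has to remain acceptable.

Final Answer: A. To reduce expenses while keeping good model accuracy

Quick Check: Options B, C, and D all increase cost rather than reduce it.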

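Solution (question 2)

Step 1: Identify cost-saving methods

Smaller models need fewer computations and less memory, so they are cheaper to train.

Step 2: Evaluate other options

Training on unfiltered data adds compute, an unnecessarily large batch size wastes memory, and slower hardware lengthens training time.

Final Answer: C. Use smaller models

Quick Check: Only "Use smaller models" lowers the amount of work to be done.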

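Solution (question 3)

Step 1: Calculate cost for each batch size

1000 / 16 = 62.5, 1000 / 32 = 31.25, 1000 / 64 = 15.625

Step 2: Collect costs in list and print

The costs are appended in the same order as the batch sizes, so the printed list is [62.5, 31.25, 15.625].

Final Answer: D. [62.5, 31.25, 15.625]

Quick Check: A larger batch size gives a smaller cost, so the list must be decreasing.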

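Solution (question 4)

Step 1: Understand filtering condition

The condition x > 3 keeps only the values 4 and 5.

Step 2: Assess impact on data and cost

Three of the five data points (60%) are discarded, which conflicts with the goal of keeping most of the data and is likely to hurt accuracy.

Final Answer: A. It removes too many data points, hurting accuracy

Quick Check: The code runs without error and prints [4, 5], so the problem is how much data is lost, not the syntax.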
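
If the aim is to trim cost while keeping most of the data, random subsampling is one simple alternative to a hard threshold filter such as `x > 3`. A minimal sketch, where the 80% keep fraction and the fixed seed are arbitrary choices made for illustration:

```python
import random

data = [1, 2, 3, 4, 5]
keep_fraction = 0.8  # assumed value; tune against accuracy in practice

rng = random.Random(0)  # fixed seed so the subset is reproducible
sample_size = max(1, round(len(data) * keep_fraction))
reduced_data = sorted(rng.sample(data, sample_size))
print(len(reduced_data), "of", len(data), "points kept")
```

Unlike the threshold filter, this drops points without favoring any range of values, so the reduced set keeps roughly the same distribution as the original.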

Solution

Step 1: Analyze each option's effect on cost and accuracy

Step 2: Combine options for best balance

Final Answer:

Quick Check: