Prompt Engineering / GenAI · ~20 mins

OpenAI fine-tuning API in Prompt Engineering / GenAI - ML Experiment: Train & Evaluate

Experiment - OpenAI fine-tuning API
Problem: You have a base OpenAI language model that performs well on general tasks but is not specialized for your domain, leading to lower accuracy on domain-specific text generation.
Current Metrics: Training loss: 0.15, Validation loss: 0.30, Validation accuracy: 65%
Issue: The model overfits the training data, as shown by the large gap between training and validation loss, and validation accuracy is too low for practical use.
Your Task
Reduce overfitting and improve validation accuracy to at least 80% by fine-tuning the OpenAI model using the fine-tuning API.
Use the OpenAI fine-tuning API only.
Do not change the base model architecture.
Limit training epochs to a maximum of 10.
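Before touching the API, the training and validation files must be in JSONL format, one prompt/completion pair per line. A minimal sketch of building and sanity-checking such a file (the two domain examples are placeholders, not real training data):

```python
import json

# Hypothetical domain-specific examples; replace with your real data
examples = [
    {"prompt": "Define churn rate:", "completion": " The percentage of customers lost in a period."},
    {"prompt": "Define ARR:", "completion": " Annual recurring revenue."},
]

# Write one JSON object per line (JSONL)
with open("fine_tune_data.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")

# Sanity-check: every line must parse and contain exactly these two keys
with open("fine_tune_data.jsonl") as f:
    for line in f:
        record = json.loads(line)
        assert set(record) == {"prompt", "completion"}
```

The same check applies to the validation file; malformed lines are a common cause of rejected fine-tuning uploads.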
Solution
import time
from openai import OpenAI

client = OpenAI()  # requires openai>=1.0 with OPENAI_API_KEY set in the environment

# Prepare your fine-tuning datasets in JSONL format
# Training: fine_tune_data.jsonl, Validation: fine_tune_validation.jsonl
# Each line: {"prompt": "Your prompt text", "completion": "Your completion text"}
# (the prompt/completion format is used by completion models such as davinci-002)

# Upload the training dataset
train_response = client.files.create(
    file=open("fine_tune_data.jsonl", "rb"),
    purpose="fine-tune"
)
training_file_id = train_response.id

# Upload the validation dataset
val_response = client.files.create(
    file=open("fine_tune_validation.jsonl", "rb"),
    purpose="fine-tune"
)
validation_file_id = val_response.id

# Create the fine-tuning job; passing a validation file makes the API
# report validation loss during training (the API has no early-stopping flag)
job = client.fine_tuning.jobs.create(
    training_file=training_file_id,
    validation_file=validation_file_id,
    model="davinci-002",
    hyperparameters={
        "n_epochs": 5,
        "learning_rate_multiplier": 0.1,
        "batch_size": 4
    }
)

# Poll the job until it finishes
while True:
    job = client.fine_tuning.jobs.retrieve(job.id)
    print(f"Status: {job.status}")
    if job.status in ("succeeded", "failed", "cancelled"):
        break
    time.sleep(30)

# Use the fine-tuned model
response = client.completions.create(
    model=job.fine_tuned_model,
    prompt="Your domain-specific prompt",
    max_tokens=50
)
print(response.choices[0].text.strip())
Uploaded separate training and validation datasets in JSONL format.
Capped training at 5 epochs (well under the 10-epoch limit) to reduce overfitting.
Used a smaller learning rate multiplier (0.1) for more stable training.
Supplied a validation file so the job reports validation loss during training, making overfitting visible as it happens.
Polled the fine-tuning job until completion before querying the model.
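Once the job succeeds, per-step metrics can be pulled from its result file, a CSV whose exact column names may vary by API version (assumed here to include `step`, `train_loss`, and `valid_loss`). A sketch of parsing it to measure the final train/validation gap, shown on a synthetic sample rather than a live API response:

```python
import csv
import io

def loss_gap(results_csv: str) -> float:
    """Return final valid_loss - train_loss from a fine-tuning results CSV."""
    rows = list(csv.DictReader(io.StringIO(results_csv)))
    # valid_loss is only reported on some steps; keep the last row that has one
    scored = [r for r in rows if r.get("valid_loss")]
    last = scored[-1]
    return float(last["valid_loss"]) - float(last["train_loss"])

# In practice the CSV would come from the API, e.g. (assumed call shape):
#   content = client.files.content(job.result_files[0]).text
sample = "step,train_loss,valid_loss\n1,0.40,\n2,0.25,0.31\n3,0.18,0.22\n"
print(round(loss_gap(sample), 2))  # 0.04
```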
Results Interpretation

Before fine-tuning: Training loss = 0.15, Validation loss = 0.30, Validation accuracy = 65%
After fine-tuning: Training loss = 0.18, Validation loss = 0.22, Validation accuracy = 82%
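The key signal is the shrinking train/validation gap, even though training loss rose slightly:

```python
# Gap between validation and training loss, before vs. after fine-tuning
gap_before = round(0.30 - 0.15, 2)  # 0.15 -> clear overfitting
gap_after = round(0.22 - 0.18, 2)   # 0.04 -> much better generalization
print(gap_before, gap_after)
```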

Fine-tuning with a domain-specific dataset and proper hyperparameter tuning reduces overfitting and improves validation accuracy, making the model better suited for specialized tasks.
Bonus Experiment
Try fine-tuning with different learning rate multipliers and batch sizes to see how they affect overfitting and accuracy.
💡 Hint
Lower learning rates and smaller batch sizes often help reduce overfitting but may require more epochs to converge.
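One way to run that experiment is a small grid sweep. The sweep values below are placeholders to adjust to your budget, and the commented-out job launch mirrors the call shape used in the solution above (it needs a real API key and uploaded files to run):

```python
from itertools import product

# Hypothetical sweep values; each extra value multiplies the number of jobs
learning_rate_multipliers = [0.02, 0.05, 0.1, 0.2]
batch_sizes = [2, 4, 8]

# Build one hyperparameter dict per combination
grid = [
    {"learning_rate_multiplier": lr, "batch_size": bs, "n_epochs": 5}
    for lr, bs in product(learning_rate_multipliers, batch_sizes)
]

for hp in grid:
    # Each configuration would be launched as its own fine-tuning job:
    #   client.fine_tuning.jobs.create(
    #       training_file=training_file_id,
    #       validation_file=validation_file_id,
    #       model="davinci-002",
    #       hyperparameters=hp,
    #   )
    print(hp)
```

Compare the resulting validation losses across jobs to pick the configuration with the smallest train/validation gap.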