Given a table model_retraining_schedule with columns model_id, retrain_day (day of week), and last_retrained (date), write a query to find all models scheduled for retraining on the current day.
SELECT model_id FROM model_retraining_schedule WHERE retrain_day = TO_CHAR(CURRENT_DATE, 'FMDay');Use a function that returns the day name as text matching the retrain_day format.
Option D correctly uses TO_CHAR with 'Day' to get the day name as stored in retrain_day. Option D returns a number, not a name. Option D compares a day name to a date, which is invalid. Option D uses a MySQL function DAYNAME which is not standard SQL.
Given a table model_performance with columns model_id, drift_score (float), and evaluation_date, write a query to find models with a drift_score greater than 0.3 in the last 7 days.
SELECT DISTINCT model_id FROM model_performance WHERE drift_score > 0.3 AND evaluation_date >= CURRENT_DATE - INTERVAL '7 days';
Use interval syntax to subtract 7 days from current date and include all models with drift_score strictly greater than 0.3.
Option B correctly uses INTERVAL '7 days' and strict greater than. Option B uses incorrect date arithmetic and includes equals. Option B uses BETWEEN but excludes the exact boundary date. Option B uses > instead of >= which excludes models evaluated exactly 7 days ago.
You need to design a table to store retraining triggers for ML models. Each trigger can be based on schedule, data drift, or performance drop. Which schema design best supports flexible trigger types and efficient querying?
Consider flexibility to store different threshold types per trigger and efficient querying.
Option A uses a JSONB column for thresholds, allowing different trigger types to store relevant parameters flexibly. Options B and C have fixed threshold columns that may not fit all trigger types well. Option A uses model_id as primary key, limiting to one trigger per model.
Given a large model_performance table with millions of rows, which index will best optimize this query?
SELECT model_id FROM model_performance WHERE performance_metric < 0.7 AND evaluation_date > CURRENT_DATE - INTERVAL '30 days';
Consider the order of columns in a composite index matching the query filters.
Option A creates a composite index starting with evaluation_date, which is filtered with a range condition, followed by performance_metric. This order helps the database efficiently filter by date first, then by metric. Option A reverses the order, which is less efficient for this query. Options A and B index only one column, less optimal.
Consider this query to find models needing retraining due to performance drop:
SELECT model_id FROM model_performance WHERE performance_metric < 0.8 AND evaluation_date > CURRENT_DATE - 7;
What error or issue will this query cause?
Check how date arithmetic is done in SQL standard.
Option C is correct because subtracting an integer from a date without INTERVAL keyword is invalid in standard SQL and causes a syntax error. Option C is incorrect because the query won't run to return no rows. Option C is incorrect because performance_metric is compared to a number, not a string. Option C is incorrect because the query is syntactically invalid.