Given three time series data sets, which plot correctly shows all three series on the same graph with distinct colors and a legend?
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
dates = pd.date_range('2023-01-01', periods=5)
data1 = np.array([1, 3, 2, 5, 4])
data2 = np.array([2, 2, 3, 4, 5])
data3 = np.array([5, 3, 4, 2, 1])
Look for a line plot that uses dates on the x-axis and includes a legend.
Option D correctly plots all three series with dates on the x-axis, labels each series, and shows a legend. Each of the other options fails one requirement: one lacks labels and a legend, one plots against the index instead of the dates, and one uses scatter plots instead of line plots.
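As a minimal sketch of what a correct answer could look like (the original option code is not shown here, so the label names 'Series 1' to 'Series 3' and the Agg backend are assumptions for illustration), the three requirements map onto one `ax.plot` call per series with `dates` as x, a `label` for each, and a single `ax.legend()` call:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

dates = pd.date_range('2023-01-01', periods=5)
data1 = np.array([1, 3, 2, 5, 4])
data2 = np.array([2, 2, 3, 4, 5])
data3 = np.array([5, 3, 4, 2, 1])

fig, ax = plt.subplots()
# Passing dates as the first argument puts them on the x-axis;
# each call takes the next color from matplotlib's default color cycle.
ax.plot(dates, data1, label='Series 1')  # label names are hypothetical
ax.plot(dates, data2, label='Series 2')
ax.plot(dates, data3, label='Series 3')
ax.legend()          # builds the legend from the labels above
fig.autofmt_xdate()  # rotates the date tick labels for readability

# Collect what was drawn so the result can be inspected without a window.
legend_labels = [t.get_text() for t in ax.get_legend().get_texts()]
line_colors = [line.get_color() for line in ax.get_lines()]
```

Omitting `label=` or `ax.legend()` drops the legend, omitting `dates` plots against the integer index, and swapping `ax.plot` for `ax.scatter` draws unconnected markers. These are the three failure modes the explanation names.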
What will be the output type of the following code snippet?
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
dates = pd.date_range('2023-01-01', periods=3)
data1 = [1, 2, 3]
data2 = [3, 2, 1]
fig, ax = plt.subplots()
ax.plot(dates, data1, label='A')
ax.plot(dates, data2, label='B')
ax.legend()
plt.show()
Look at the ax.plot calls and the use of legend().
The code uses ax.plot which creates line plots. The legend labels the two lines. So the output is a line plot with two labeled lines over the date range.
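One way to confirm this without looking at the figure is to inspect the artists attached to the axes. The sketch below assumes a non-interactive backend and introduces the helper names `all_lines`, `is_line_plot`, and `has_no_scatter` purely for illustration:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.lines import Line2D

dates = pd.date_range('2023-01-01', periods=3)
data1 = [1, 2, 3]
data2 = [3, 2, 1]

fig, ax = plt.subplots()
ax.plot(dates, data1, label='A')
ax.plot(dates, data2, label='B')
ax.legend()

# ax.plot adds Line2D artists; ax.scatter would add collections instead.
all_lines = ax.get_lines()
is_line_plot = all(isinstance(line, Line2D) for line in all_lines)
has_no_scatter = len(ax.collections) == 0
line_labels = [line.get_label() for line in all_lines]
```

Here `all_lines` holds two `Line2D` objects labelled 'A' and 'B', which is consistent with a two-line plot rather than a scatter or bar chart.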
Given two time series DataFrames with different date ranges, what is the shape of the merged DataFrame when merged on date with an outer join?
import pandas as pd
df1 = pd.DataFrame({'date': pd.date_range('2023-01-01', periods=3), 'value1': [10, 20, 30]})
df2 = pd.DataFrame({'date': pd.date_range('2023-01-02', periods=3), 'value2': [15, 25, 35]})
merged = pd.merge(df1, df2, on='date', how='outer')
Count unique dates after combining both date ranges with outer join.
df1 dates: 2023-01-01, 2023-01-02, 2023-01-03
df2 dates: 2023-01-02, 2023-01-03, 2023-01-04
Outer join combines all unique dates: 2023-01-01 to 2023-01-04, total 4 unique dates. Columns are date, value1, value2 = 3 columns. So shape is (4,3).
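The count can be checked directly. The sketch below reruns the merge from the question and, as an added comparison that is not part of the original question, also runs an inner join to show how the shape changes when only the overlapping dates are kept:

```python
import pandas as pd

df1 = pd.DataFrame({'date': pd.date_range('2023-01-01', periods=3),
                    'value1': [10, 20, 30]})
df2 = pd.DataFrame({'date': pd.date_range('2023-01-02', periods=3),
                    'value2': [15, 25, 35]})

# Outer join keeps every date that appears in either frame.
merged = pd.merge(df1, df2, on='date', how='outer')
print(merged.shape)

# Dates present in only one frame get NaN in the other frame's column.
missing_per_column = merged.isna().sum()

# For comparison only: an inner join keeps just the two shared dates.
inner = pd.merge(df1, df2, on='date', how='inner')
print(inner.shape)
```

The outer join leaves one missing value in `value1` (for 2023-01-04) and one in `value2` (for 2023-01-01), which is worth handling before plotting or correlating the merged series.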
What error will this code raise?
import matplotlib.pyplot as plt
import pandas as pd
dates = pd.date_range('2023-01-01', periods=4)
data1 = [1, 2, 3]
data2 = [4, 5, 6, 7]
plt.plot(dates, data1, label='Series 1')
plt.plot(dates, data2, label='Series 2')
plt.legend()
plt.show()
Check if the x and y data lengths match for each plot call.
data1 has length 3 but dates has length 4, so plt.plot(dates, data1) raises ValueError because x and y lengths differ.
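The failure can be reproduced and caught as shown in the sketch below. The repair at the end (trimming `dates` to the length of `data1`) is only one possible fix and an assumption on my part: which dates the three values belong to depends on the data, and the question does not say.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import pandas as pd

dates = pd.date_range('2023-01-01', periods=4)
data1 = [1, 2, 3]       # one value short
data2 = [4, 5, 6, 7]

fig, ax = plt.subplots()

caught = None
try:
    ax.plot(dates, data1, label='Series 1')  # 4 x-values vs 3 y-values
except ValueError as exc:
    caught = exc
    print(f"ValueError: {exc}")

# One possible repair (an assumption): pair data1 with the first three dates.
ax.plot(dates[:len(data1)], data1, label='Series 1')
ax.plot(dates, data2, label='Series 2')
ax.legend()
```

A cheap guard before plotting is to compare `len(dates)` with `len(data1)`, which turns the mismatch into an explicit check.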
You have three time series of equal length. Which code correctly calculates the pairwise Pearson correlation matrix?
import pandas as pd
import numpy as np
np.random.seed(0)
data = pd.DataFrame({
    'A': np.random.randn(5),
    'B': np.random.randn(5),
    'C': np.random.randn(5)
})
Look for the method that computes Pearson correlation matrix for all columns.
Option C computes the Pearson correlation matrix between all columns. Each of the other options computes something else: one computes covariance rather than correlation, one computes Spearman rather than Pearson correlation, and one computes the correlation of each column with column 'A' only.
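The option code is not reproduced here, so the sketch below is an assumption about which pandas calls the four descriptions refer to. `DataFrame.corr()` uses Pearson by default, and the other three calls are shown for contrast:

```python
import numpy as np
import pandas as pd

np.random.seed(0)
data = pd.DataFrame({
    'A': np.random.randn(5),
    'B': np.random.randn(5),
    'C': np.random.randn(5)
})

# Pairwise Pearson correlation matrix (method='pearson' is the default).
pearson = data.corr()

# The alternatives, for contrast:
covariance = data.cov()                   # covariance, not scaled to [-1, 1]
spearman = data.corr(method='spearman')   # rank-based correlation
with_a = data.corrwith(data['A'])         # one value per column, against 'A' only

print(pearson.round(3))
```

The Pearson result is a symmetric 3x3 matrix with ones on the diagonal, whereas `corrwith` returns a Series with a single value per column.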