import numpy as np
arr = np.array(['cat', 'dog', 'bird'], dtype='S4')
print(arr)
The dtype 'S4' stores each element as a fixed-length byte string of 4 bytes. 'bird' is exactly 4 bytes, so it fits without truncation; a longer string would be cut to its first 4 bytes.
The output is [b'cat' b'dog' b'bird']: the array is printed in square brackets, and each element appears as a byte string with a b prefix.
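To make the truncation case concrete, here is a minimal sketch that adds one string longer than 4 bytes. The extra element 'parrot' is an illustrative assumption and is not part of the original example.

```python
import numpy as np

# 'S4' reserves exactly 4 bytes per element.
# 'parrot' is an added, illustrative value that does not fit.
arr = np.array(['cat', 'bird', 'parrot'], dtype='S4')

print(arr.dtype.itemsize)  # 4
print(arr)                 # [b'cat' b'bird' b'parr']
```

'bird' is stored whole, while 'parrot' loses its last two bytes without any warning.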
import numpy as np
arr = np.array(['apple', 'banana', 'cherry'], dtype='S6')
lengths = np.char.str_len(arr)
print(lengths)
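The dtype 'S6' stores each element in 6 bytes, and any string longer than 6 bytes is truncated. Here 'apple' (5 bytes), 'banana' (6), and 'cherry' (6) all fit, so nothing is cut and the output is [5 6 6].
np.char.str_len returns the length of the stored strings, which can be shorter than the original Python strings if truncation occurred.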
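The difference between stored and original lengths only shows up when something is actually truncated. The sketch below swaps in 'watermelon' (an illustrative value, not from the original example) and compares np.char.str_len against Python's len.

```python
import numpy as np

# 'watermelon' is an added, illustrative value longer than 6 bytes.
words = ['apple', 'banana', 'watermelon']
arr = np.array(words, dtype='S6')

stored = np.char.str_len(arr)                  # lengths after storage
original = np.array([len(w) for w in words])   # lengths of the Python strings

print(stored)             # [5 6 6]
print(original)           # [ 5  6 10]
print(stored < original)  # [False False  True]
```

Comparing the two arrays is one simple way to detect which elements were shortened on the way in.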
import numpy as np
arr = np.array(['cat', 'dog'], dtype='S3')
arr[0] = 'elephant'
print(arr)
NumPy string arrays with a fixed-length dtype silently truncate longer strings on assignment; no error or warning is raised.
Here, 'elephant' is longer than 3 bytes, so only the first 3 bytes, 'ele', are stored, and the output is [b'ele' b'dog'].
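Because the truncation is silent, it can help to check the length before assigning. The following is a minimal sketch, not a NumPy feature: assign_checked is a hypothetical helper, and it assumes ASCII input and a byte-string ('S') dtype, where itemsize equals the number of bytes per element.

```python
import numpy as np

def assign_checked(arr, index, value):
    """Hypothetical helper: assign only if the encoded value fits the dtype."""
    encoded = value.encode('ascii')  # assumes ASCII-only input
    if len(encoded) > arr.dtype.itemsize:
        raise ValueError(
            f"{value!r} needs {len(encoded)} bytes, "
            f"but dtype {arr.dtype} holds only {arr.dtype.itemsize}"
        )
    arr[index] = encoded

arr = np.array(['cat', 'dog'], dtype='S3')
assign_checked(arr, 1, 'fox')            # fits, so it is stored

try:
    assign_checked(arr, 0, 'elephant')   # too long, so it is rejected
except ValueError as exc:
    print(exc)

print(arr)  # [b'cat' b'fox']
```

An alternative, when longer values are expected, is to widen the array first with something like arr.astype('S8'), which returns a new array with more room per element.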
import numpy as np
import matplotlib.pyplot as plt
arr_s = np.array(['abcde']*1000, dtype='S5')
arr_u = np.array(['abcde']*1000, dtype='U5')
sizes = [arr_s.nbytes, arr_u.nbytes]
labels = ['S5', 'U5']
plt.bar(labels, sizes)
plt.ylabel('Memory size in bytes')
plt.title('Memory size: byte vs Unicode strings')
plt.show()
Unicode strings (dtype='U5') use 4 bytes per character internally, while byte strings (dtype='S5') use 1 byte per character.
Thus, for the same length and number of elements, a 'U5' array takes four times the memory of an 'S5' array: 20,000 bytes versus 5,000 bytes for the 1,000 elements here.
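The same comparison can be checked numerically without plotting. This sketch reuses the two arrays from the example and prints itemsize (bytes per element) and nbytes (bytes for the whole array).

```python
import numpy as np

arr_s = np.array(['abcde'] * 1000, dtype='S5')  # 1 byte per character
arr_u = np.array(['abcde'] * 1000, dtype='U5')  # 4 bytes per character

print(arr_s.itemsize, arr_s.nbytes)  # 5 5000
print(arr_u.itemsize, arr_u.nbytes)  # 20 20000
print(arr_u.nbytes // arr_s.nbytes)  # 4
```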
Option B is true: the byte string dtype 'S' stores fixed-length bytes and silently truncates longer strings.
The other options are false for the following reasons:
Unicode strings use 4 bytes per character.
No error is raised when a longer string is assigned; it is just truncated.
NumPy arrays have a fixed size and do not resize automatically.