NumPy's string dtypes let us store and work with text data efficiently in arrays.
String type in NumPy
numpy.array(['text1', 'text2'], dtype='S')
numpy.array(['text1', 'text2'], dtype='U')
'S' means fixed-length byte strings: each element is a bytes value, and Python str input is encoded as ASCII.
'U' means fixed-length Unicode strings: each element can hold any Unicode character.
# Byte string array
import numpy as np
arr = np.array(['cat', 'dog', 'bird'], dtype='S')
print(arr)
print(arr.dtype)

# Unicode string array
import numpy as np
arr = np.array(['cat', 'dog', 'bird'], dtype='U')
print(arr)
print(arr.dtype)

# No dtype given: NumPy infers a Unicode string type
import numpy as np
arr = np.array(['apple', 'banana', 'cherry'])
print(arr)
print(arr.dtype)
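The dtype printed by the examples above includes a width, which NumPy takes from the longest element unless we set one explicitly. The following is a minimal sketch of how that width relates to memory use; the variable names are illustrative only, and 'U10' is just an example width.

```python
import numpy as np

words = ['apple', 'banana', 'cherry']

inferred = np.array(words)               # no dtype: Unicode, width of the longest word (6)
as_bytes = np.array(words, dtype='S')    # byte strings, width also inferred as 6
explicit = np.array(words, dtype='U10')  # Unicode with an explicit width of 10

# kind is 'U' or 'S'; itemsize is the number of bytes used per element
print(inferred.dtype.kind, inferred.itemsize)   # U 24
print(as_bytes.dtype.kind, as_bytes.itemsize)   # S 6
print(explicit.dtype.kind, explicit.itemsize)   # U 40
```

A 'U' element takes 4 bytes per character, while an 'S' element takes 1 byte per character, so byte strings are more compact when the data is plain ASCII.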
This program shows how to create byte and Unicode string arrays in NumPy, print their types, and find the length of each string.
import numpy as np

# Create a byte string array
byte_arr = np.array(['red', 'green', 'blue'], dtype='S')
print('Byte string array:', byte_arr)
print('Data type:', byte_arr.dtype)

# Create a Unicode string array
unicode_arr = np.array(['red', 'green', 'blue'], dtype='U')
print('Unicode string array:', unicode_arr)
print('Data type:', unicode_arr.dtype)

# Show length of each string element
lengths = np.vectorize(len)(unicode_arr)
print('Lengths of each string:', lengths)
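As an alternative to np.vectorize(len), NumPy also provides np.char.str_len, which computes element lengths directly and accepts both 'S' and 'U' arrays. This is a short sketch using the same color names; the variable names are illustrative only.

```python
import numpy as np

colors = np.array(['red', 'green', 'blue'], dtype='U')
byte_colors = np.array(['red', 'green', 'blue'], dtype='S')

# str_len returns an integer array with the length of each element
lengths = np.char.str_len(colors)
byte_lengths = np.char.str_len(byte_colors)

print(lengths)       # [3 5 4]
print(byte_lengths)  # [3 5 4]
```

For byte strings the result counts bytes rather than characters, which is the same number only when the text is ASCII.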
Byte strings (dtype='S') store text as bytes and are limited to ASCII or byte data.
Unicode strings (dtype='U') store text as Unicode and support all characters, including emojis and accented letters.
NumPy string types have a fixed width per array, so a string assigned into an existing array is silently truncated if it is longer than that width.
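The notes above can be illustrated with a small sketch. It assumes the default behavior where the width is inferred from the longest element, and uses 'U10' as an example of a wider explicit dtype. The last part shows one possible way to store non-ASCII text as bytes by encoding it first with np.char.encode; the variable names are illustrative only.

```python
import numpy as np

# Width is inferred from the longest element ('bird', 4 characters)
animals = np.array(['cat', 'dog', 'bird'])
animals[0] = 'elephant'   # longer than 4 characters, so it is cut silently
print(animals[0])         # elep

# Reserving more room up front avoids the truncation
wide = np.array(['cat', 'dog', 'bird'], dtype='U10')
wide[0] = 'elephant'
print(wide[0])            # elephant

# Non-ASCII text fits in a 'U' array as-is
accented = np.array(['café', 'naïve'], dtype='U')

# Converting str to 'S' relies on ASCII encoding, so non-ASCII input
# is expected to fail; we record whether it did instead of assuming
try:
    np.array(['café'], dtype='S')
    ascii_conversion_failed = False
except UnicodeError:
    ascii_conversion_failed = True
print('ASCII conversion failed:', ascii_conversion_failed)

# To keep such text as bytes, encode it explicitly (UTF-8 here)
encoded = np.char.encode(accented, 'utf-8')
print(encoded.dtype.kind, encoded.itemsize)  # S 6
```

Note that the encoded array is 6 bytes wide even though 'naïve' has 5 characters, because 'ï' takes 2 bytes in UTF-8.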
NumPy supports two main string types: byte strings ('S') and Unicode strings ('U').
Use 'S' for ASCII or byte data, and 'U' for full Unicode text.
String arrays are fixed-length, so be careful with string sizes to avoid truncation.