What is String type in NumPy?

Introduction

NumPy's string types store text in arrays as fixed-width elements, so text can be handled with the same array tools as other data. They are useful in these situations:

When you want to store a list of names or words in a NumPy array.

When you need to perform fast operations on many text entries together.

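When you want to store text in one contiguous fixed-length block instead of as separate Python objects. Whether this saves memory depends on the data: 'U' uses 4 bytes per character, and every element is padded to the length of the longest string.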

When you want to combine text data with numerical data in arrays for analysis.
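The use cases above can be sketched in a few lines. This is a minimal illustration rather than a benchmark; the variable names (`names`, `people`) and the sample values are made up for the example.

```python
import numpy as np

# Hypothetical sample data for illustration.
names = np.array(['alice', 'bob', 'carol'])

# One call applies the operation to every element of the array.
upper_names = np.char.upper(names)
print(upper_names)

# A structured array keeps fixed-length text next to numeric fields.
people = np.array(
    [('alice', 30), ('bob', 25), ('carol', 41)],
    dtype=[('name', 'U10'), ('age', 'i4')],
)

# Filter on the numeric field, then read the text field.
older = people[people['age'] > 28]['name']
print(older)
```

Here `'U10'` reserves room for up to 10 characters per name, and `'i4'` is a 32-bit integer field.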

Syntax

numpy.array(['text1', 'text2'], dtype='S')
numpy.array(['text1', 'text2'], dtype='U')

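'S' means fixed-length byte strings (ASCII text or raw bytes).

'U' means fixed-length Unicode strings (supports all characters).

A number after the letter sets the length, for example 'S5' or 'U10'. If you leave it out, NumPy uses the length of the longest element.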
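To make the fixed-length idea concrete, the sketch below compares an inferred length with an explicit one and prints how many bytes each element occupies. The variable names are illustrative only.

```python
import numpy as np

# No length given: NumPy sizes each element to fit the longest item.
auto = np.array(['text1', 'text2'], dtype='U')
print(auto.dtype, auto.itemsize)

# Explicit length: room for up to 10 characters per element.
fixed = np.array(['text1', 'text2'], dtype='U10')
print(fixed.dtype, fixed.itemsize)

# Byte strings use one byte per character.
raw = np.array(['text1', 'text2'], dtype='S10')
print(raw.dtype, raw.itemsize)
```

`itemsize` is the number of bytes per element: 4 bytes per character for 'U' and 1 byte per character for 'S'.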

Examples

This creates a byte string array with 3 animals and prints the array and its type.

import numpy as np
arr = np.array(['cat', 'dog', 'bird'], dtype='S')
print(arr)
print(arr.dtype)

This creates a Unicode string array that can store any characters, not just ASCII.

import numpy as np
arr = np.array(['cat', 'dog', 'bird'], dtype='U')
print(arr)
print(arr.dtype)

If you don't specify dtype, NumPy chooses Unicode string type automatically for text.

import numpy as np
arr = np.array(['apple', 'banana', 'cherry'])
print(arr)
print(arr.dtype)

Sample Program

This program shows how to create byte and Unicode string arrays in NumPy, print their types, and find the length of each string.

import numpy as np

# Create a byte string array
byte_arr = np.array(['red', 'green', 'blue'], dtype='S')
print('Byte string array:', byte_arr)
print('Data type:', byte_arr.dtype)

# Create a Unicode string array
unicode_arr = np.array(['red', 'green', 'blue'], dtype='U')
print('Unicode string array:', unicode_arr)
print('Data type:', unicode_arr.dtype)

# Show length of each string element (np.char.str_len works element-wise)
lengths = np.char.str_len(unicode_arr)
print('Lengths of each string:', lengths)

Important Notes

Byte strings (dtype='S') store one byte per character. Building one directly from Python text works only for ASCII characters; other text has to be encoded to bytes first.

Unicode strings (dtype='U') store 4 bytes per character and support all characters, including emojis and accented letters.
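The difference between the two types shows up as soon as the text contains a non-ASCII character. The sketch below is one way to see it; the sample words are arbitrary, and UTF-8 is chosen here only as a common encoding, not as the required one.

```python
import numpy as np

# Unicode arrays accept accented characters directly.
uni = np.array(['café', 'naïve'], dtype='U')
print(uni.dtype, uni.itemsize)

# Converting non-ASCII text straight to 'S' fails, because NumPy
# encodes Python str values as ASCII for byte-string arrays.
try:
    np.array(['café'], dtype='S')
    failed = False
except UnicodeEncodeError:
    failed = True
print('ASCII encoding failed:', failed)

# Encoding explicitly (UTF-8 here) yields bytes that 'S' can hold.
encoded = np.char.encode(uni, 'utf-8')
print(encoded.dtype)
```

After encoding, each accented character takes two bytes in UTF-8, so the byte length of an element can be larger than its character count.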

NumPy string types have a fixed length per array, so a longer string assigned into an existing array is silently truncated to that length.
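Truncation is easy to miss because NumPy raises no error or warning. A small sketch of the behavior, with made-up values, plus one possible way to avoid it by reserving a larger length up front:

```python
import numpy as np

# Length is inferred from the longest element: 4 characters.
colors = np.array(['red', 'blue'])
colors[0] = 'turquoise'   # too long, silently cut to 4 characters
print(colors)

# Reserving 12 characters per element leaves room for longer values.
wide = np.array(['red', 'blue'], dtype='U12')
wide[0] = 'turquoise'
print(wide)
```

The first array ends up holding 'turq', while the second keeps the full word.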

Summary

NumPy supports two main string types: byte strings ('S') and Unicode strings ('U').

Use 'S' for ASCII or byte data, and 'U' for full Unicode text.

String arrays are fixed-length, so be careful with string sizes to avoid truncation.