How to Read CSV into NumPy Array Quickly and Easily

Use the numpy.loadtxt() or numpy.genfromtxt() function to read a CSV file into a NumPy array. Both parse the file directly and return an array ready for numerical processing.

Syntax

The main functions to read CSV files into NumPy arrays are:

  • numpy.loadtxt(fname, delimiter=',', dtype=float): Loads data assuming every value has the same type and none are missing.
  • numpy.genfromtxt(fname, delimiter=',', dtype=None, encoding='utf-8'): More flexible; handles missing values and mixed data types.

Parameters explained:

  • fname: Path to the CSV file.
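  • delimiter: Character separating values. Both functions split on whitespace by default, so pass ',' explicitly for CSV files.
  • dtype: Data type of the resulting array elements. With genfromtxt, dtype=None infers a type for each column.
  • encoding: Text encoding used to decode the file, which matters when it contains non-ASCII characters.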
python
import numpy as np

# Basic syntax examples
array1 = np.loadtxt('data.csv', delimiter=',', dtype=float)
array2 = np.genfromtxt('data.csv', delimiter=',', dtype=None, encoding='utf-8')
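
As a minimal sketch of what "mixed data types" means in practice, the snippet below reads a small in-memory CSV with one text column and two numeric columns. The sample data and column names are made up for illustration, and names=True (which takes field names from the header row) is an extra parameter not used elsewhere in this article.

```python
import numpy as np
from io import StringIO

# Hypothetical sample data: a header row, one text column, two numeric columns
csv_mixed = StringIO("name,age,score\nalice,30,1.5\nbob,25,2.0")

# dtype=None infers a type per column; names=True reads field names from the header
records = np.genfromtxt(csv_mixed, delimiter=',', dtype=None,
                        names=True, encoding='utf-8')

print(records.dtype.names)  # field names taken from the header row
print(records['age'])       # select one column by field name
```

The result is a one-dimensional structured array with one record per row, so columns are selected by field name rather than by a second index.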

Example

This example shows how to read a CSV file with numeric data into a NumPy array using loadtxt. It prints the array and its shape.

python
import numpy as np
from io import StringIO

# Simulate a CSV file content
csv_data = StringIO('''1.5,2.3,3.1
4.0,5.2,6.3
7.1,8.4,9.0''')

# Load CSV data into numpy array
array = np.loadtxt(csv_data, delimiter=',')

print(array)
print('Shape:', array.shape)
Output
[[1.5 2.3 3.1]
 [4.  5.2 6.3]
 [7.1 8.4 9. ]]
Shape: (3, 3)
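
The example above reads from an in-memory StringIO object. Reading from a path on disk works the same way; the sketch below writes a small array to a temporary file with np.savetxt and loads it back. The file name data.csv and the sample values are assumptions chosen for illustration.

```python
import os
import tempfile
import numpy as np

data = np.array([[1.5, 2.3, 3.1],
                 [4.0, 5.2, 6.3]])

with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, 'data.csv')  # hypothetical file name

    # Write the array as comma-separated text, then read it back from the path
    np.savetxt(csv_path, data, delimiter=',')
    loaded = np.loadtxt(csv_path, delimiter=',')

print(loaded.shape)
print(np.allclose(loaded, data))
```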

Common Pitfalls

Common mistakes when reading CSV into NumPy arrays include:

  • Using loadtxt on files with missing or non-numeric data, which raises a ValueError.
  • Not specifying the correct delimiter if the file uses tabs or semicolons.
  • Ignoring encoding issues that can cause reading failures.

Use genfromtxt for files with missing values or mixed data types.

python
import numpy as np
from io import StringIO

# Wrong: loadtxt fails on missing data
csv_bad = StringIO('1,2,3\n4,,6\n7,8,9')
try:
    np.loadtxt(csv_bad, delimiter=',')
except Exception as e:
    print('Error with loadtxt:', e)

# Right: genfromtxt handles missing data
csv_bad.seek(0)
array = np.genfromtxt(csv_bad, delimiter=',', filling_values=0)
print(array)
Output
Error with loadtxt: could not convert string to float: ''
[[1. 2. 3.]
 [4. 0. 6.]
 [7. 8. 9.]]

The exact wording of the loadtxt error message depends on the NumPy version.
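
The delimiter pitfall can be reproduced in the same way. This is a small sketch using made-up semicolon-separated data: with delimiter=',' each line is treated as a single field that cannot be converted to a float, while delimiter=';' parses it correctly.

```python
import numpy as np
from io import StringIO

# Hypothetical semicolon-separated sample data
semicolon_data = "1.5;2.5\n3.5;4.5"

# Wrong: with delimiter=',' each whole line is one unparseable field
wrong_delimiter_failed = False
try:
    np.loadtxt(StringIO(semicolon_data), delimiter=',')
except ValueError as e:
    wrong_delimiter_failed = True
    print('Wrong delimiter:', e)

# Right: the delimiter matches the file
array = np.loadtxt(StringIO(semicolon_data), delimiter=';')
print(array.shape)
```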

Quick Reference

Function | Use Case | Key Parameters
numpy.loadtxt | Simple numeric CSV files | fname, delimiter, dtype
numpy.genfromtxt | CSV with missing or mixed data | fname, delimiter, dtype, filling_values, encoding

Key Takeaways

  • Use numpy.loadtxt for simple numeric CSV files without missing data.
  • Use numpy.genfromtxt to handle missing values or mixed data types safely.
  • Always specify the delimiter that matches your CSV file.
  • Check the file encoding if you encounter reading errors.
  • Test with small data samples to confirm correct loading before reading large files.
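
One way to test on a small sample, sketched here under the assumption that the first few rows are representative of the file, is the max_rows parameter of loadtxt, which stops reading after the given number of rows. The data below is made up for illustration.

```python
import numpy as np
from io import StringIO

# Hypothetical stand-in for a much larger CSV file
csv_text = "1,2,3\n4,5,6\n7,8,9\n10,11,12"

# Read only the first two rows to check the delimiter, shape, and dtype
preview = np.loadtxt(StringIO(csv_text), delimiter=',', max_rows=2)

print(preview)
print(preview.shape, preview.dtype)
```

If the preview has the expected number of columns and dtype, the same call without max_rows can be used on the full file.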