Top Python Libraries for Data Analysis Explained
The main Python libraries for data analysis are pandas for handling tables, NumPy for numerical data, and Matplotlib for creating charts. These libraries work together to help you clean, analyze, and visualize data.
Syntax
Here is how you typically import and use the main data analysis libraries in Python:
import pandas as pd: Loads pandas for table-like data.
import numpy as np: Loads NumPy for numbers and arrays.
import matplotlib.pyplot as plt: Loads Matplotlib for drawing graphs.
You use these libraries by calling their functions on your data, like creating tables, arrays, or plots.
python
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
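To make "calling their functions on your data" concrete, here is a minimal sketch that builds a NumPy array and wraps it in a pandas table. The variable names (arr, doubled, df) and the column names are illustrative assumptions, not part of the original example.

```python
import numpy as np
import pandas as pd

# A NumPy array holds values of one type and supports element-wise math
arr = np.array([1, 2, 3])
doubled = arr * 2

# A pandas DataFrame is a labeled table; here each column comes from an array
df = pd.DataFrame({"value": arr, "doubled": doubled})

print(doubled.tolist())
print(df.shape)  # (rows, columns)
```

The array does the arithmetic and the DataFrame adds column labels on top, which is roughly how the two libraries divide the work.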
Example
This example shows how to create a simple table with pandas, calculate the mean with NumPy, and plot the data with Matplotlib.
python
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# Create a simple table with pandas
data = pd.DataFrame({'Scores': [88, 92, 79, 93, 85]})

# Calculate the average score with NumPy
average = np.mean(data['Scores'])

# Print the average
print(f"Average score: {average}")

# Plot the scores
plt.plot(data['Scores'], marker='o')
plt.title('Scores Plot')
plt.xlabel('Student')
plt.ylabel('Score')
plt.show()
Output
Average score: 87.4
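As a side note, pandas can compute the same average on its own, which may be convenient when the data already lives in a DataFrame. The sketch below reuses the scores from the example; the names numpy_mean, pandas_mean, and summary are assumptions introduced here for illustration.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({"Scores": [88, 92, 79, 93, 85]})

# Same average, computed two ways
numpy_mean = np.mean(data["Scores"])
pandas_mean = data["Scores"].mean()

# describe() returns count, mean, std, min, quartiles, and max in one call
summary = data["Scores"].describe()

print(pandas_mean)
print(summary["min"], summary["max"])
```

Both calls give the same mean, so the choice between them is mostly a matter of style.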
Common Pitfalls
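Beginners often forget to import a library before using it, which raises a NameError, or misspell a function name, which raises an AttributeError. Another common mistake is treating pandas objects and NumPy arrays as interchangeable: a pandas Series carries index labels and a NumPy array does not, so the same operation can behave differently on each.
Also, in a plain Python script, forgetting to call plt.show() means the plot window never opens.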
python
# Wrong: using plt without importing matplotlib first
# plt.plot([1, 2, 3])  # NameError: name 'plt' is not defined

# Right: import the library, then plot and show
import matplotlib.pyplot as plt

plt.plot([1, 2, 3])
plt.show()
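One way the pandas and NumPy mix-up can show up is in arithmetic: pandas matches values by index label, while NumPy matches them by position. This is a minimal sketch of that difference, with made-up values and hypothetical variable names (a, b, aligned, positional); it is one possible illustration of the pitfall, not the only one.

```python
import pandas as pd

# Two Series whose index labels only partly overlap
a = pd.Series([1, 2, 3], index=[0, 1, 2])
b = pd.Series([10, 20, 30], index=[1, 2, 3])

# pandas aligns on labels: labels 0 and 3 have no partner and become NaN
aligned = a + b

# Converting to NumPy arrays drops the labels, so values add by position
positional = a.to_numpy() + b.to_numpy()

print(aligned)
print(positional)
```

If position-based math is what you want, converting with to_numpy() first makes that intent explicit.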
Quick Reference
| Library | Purpose | Basic Usage |
|---|---|---|
| pandas | Work with tables and data frames | import pandas as pd; df = pd.DataFrame(data) |
| NumPy | Handle numbers and arrays | import numpy as np; arr = np.array([1, 2, 3]) |
| Matplotlib | Create charts and plots | import matplotlib.pyplot as plt; plt.plot(data); plt.show() |
Key Takeaways
Use pandas to manage and analyze table-like data easily.
NumPy helps with fast numerical calculations and array operations.
Matplotlib is great for visualizing data with charts and graphs.
Always import the libraries before using their functions.
Remember to call plt.show() to display plots.