@浙大疏錦行 Python Training Camp Day 4
Content: handling tabular data with pandas:
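- View summary statistics of the table:
  - data.mean()
  - data.mode()
  - data.median()
- Inspect the table:
  - data.info()
  - data.describe()
  - data.isnull()
  - data.head()
- Fill columns with missing values:
  - Numeric columns: fill with mean() or mode()
  - String (object) columns: fill with the constant "" or with mode(). Note that mode() returns a Series rather than a single value, so take the string itself with an index such as mode()[0] before filling (to_string() also produces a string, but by default its output includes the index label)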
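To make the point about mode() concrete, here is a minimal sketch on a made-up table. The DataFrame `toy` and its columns `age` and `job` are hypothetical and not part of the course data; they only illustrate that mode() gives back a Series and that indexing with [0] yields the value to fill with.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, invented for illustration only
toy = pd.DataFrame({
    "age": [20.0, 30.0, np.nan, 30.0],
    "job": ["clerk", None, "clerk", "teacher"],
})

# mean() and median() skip missing values by default
print(toy["age"].mean())
print(toy["age"].median())

# mode() returns a Series (there can be several modes), not a scalar
job_mode = toy["job"].mode()
print(type(job_mode))

# Index with [0] to get the first (most frequent) value as a plain string
most_common_job = job_mode[0]

# Fill the numeric column with its mean and the string column with its mode
toy["age"] = toy["age"].fillna(toy["age"].mean())
toy["job"] = toy["job"].fillna(most_common_job)
print(toy)
```

Assigning the result of fillna() back to the column, rather than relying on inplace=True, keeps the sketch independent of the pandas version in use.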
Code:
# Handling missing values
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

data = pd.read_csv('./data/credit_data.csv')
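print(f"data shape: {data.shape}")
print(f"data head:\n{data.head()}")
data.info()  # info() prints its report itself and returns None, so it is not wrapped in print()
print(f"null mask:\n{data.isnull()}")
print(data.isnull().sum())  # number of missing values per column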
data_columns = data.columns.to_list()
for column in data_columns:
    if data[column].dtype != 'object':
        # Numeric column: fill with the mean
        data[column] = data[column].fillna(data[column].mean())
    elif data[column].dtype == 'object':
        if data[column].isnull().sum() > 0:
            print(column)
            # String column: mode() returns a Series, so [0] picks the most frequent value
            data[column] = data[column].fillna(data[column].mode()[0])
print(data.isnull().sum())  # re-check the number of missing values per column
# data.head()
# data.info()
# data.describe()
# data.isnull()
#
# data.mode()
# data.mean()
# data.median()
#
# print(data.dtypes, data.columns)
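As a possible alternative to the column-by-column loop above, the same idea can be sketched without a loop by splitting the columns by dtype first. This is only a sketch under assumptions: the DataFrame `df` and its columns `income` and `home` are invented here so the example runs without credit_data.csv, and every non-numeric column is treated as a string column.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the real table, invented for illustration only
df = pd.DataFrame({
    "income": [10.0, np.nan, 30.0],
    "home": ["rent", None, "rent"],
})

# Split the column names by dtype
numeric_cols = df.select_dtypes(include="number").columns
other_cols = df.select_dtypes(exclude="number").columns

filled = df.copy()

# fillna() with a Series fills each column with the value under its own label,
# so every numeric column gets its own mean
filled[numeric_cols] = filled[numeric_cols].fillna(filled[numeric_cols].mean())

# DataFrame.mode() returns a DataFrame; its first row holds one mode per column
filled[other_cols] = filled[other_cols].fillna(filled[other_cols].mode().iloc[0])

print(filled.isnull().sum())
```

Working on a copy leaves the original table untouched, which makes it easy to compare the result before and after filling.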