DataFrame: selecting rows, selecting columns, boolean indexing, and conditional filtering

2020-07-02, python中国网

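  Both the row labels and the column labels of a DataFrame are indexes.

  I: Selecting columns

  Selecting a single column of a DataFrame returns a Series; selecting several columns returns a DataFrame.

  1) Selecting one or more columns with df[xx]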

# -*- coding: utf-8 -*-
import pandas as pd

d = {'col1': [1,2,3], 'col2': [4,5,6],'col3':[7,8,9]}
df = pd.DataFrame(data=d)
print(df)
print('-----------------')

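print(df['col1']) # a single column is a Series; with double brackets, df[['col1']], it would be a DataFrame
print(df[['col2','col1']])  # pass a list of column names, hence the double brackets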
Output:
   col1  col2  col3
0     1     4     7
1     2     5     8
2     3     6     9
-----------------
0    1
1    2
2    3
Name: col1, dtype: int64
   col2  col1
0     4     1
1     5     2
2     6     3
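
The difference between one and two pairs of brackets is easy to check directly. The following is a minimal sketch on the same data; the variable names `single` and `as_frame` are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2, 3], 'col2': [4, 5, 6], 'col3': [7, 8, 9]})

single = df['col1']       # one pair of brackets: a Series
as_frame = df[['col1']]   # a one-element list: a DataFrame with one column

print(type(single).__name__)    # Series
print(type(as_frame).__name__)  # DataFrame
print(single.shape, as_frame.shape)
```

A Series is one-dimensional, while the single-column DataFrame keeps its column axis, which matters when the result is passed on to code that expects a table.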




  2) Selecting columns with .loc (the first argument locates rows, the second locates columns)

import pandas as pd

d = {'col1': [11,22,33], 'col2': [44,55,66],'col3':[77,88,99]}
df = pd.DataFrame(data=d)

# Select column col1, the two rows labelled 0 and 2
print(df.loc[[0,2],'col1'])
print('-------------')

# Select column col1, the three rows labelled 0 through 2 (a .loc slice includes its end label)
print(df.loc[0:2,'col1'])
Output:
0    11
2    33
Name: col1, dtype: int64
-------------
0    11
1    22
2    33
Name: col1, dtype: int64
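
To select columns only, the row argument of `.loc` can be a bare colon, which means all rows. A small sketch, assuming the same data as above (the variable names are illustrative):

```python
import pandas as pd

df = pd.DataFrame({'col1': [11, 22, 33], 'col2': [44, 55, 66], 'col3': [77, 88, 99]})

one_col = df.loc[:, 'col1']               # all rows, one column: a Series
two_cols = df.loc[:, ['col1', 'col3']]    # all rows, a list of columns: a DataFrame
col_range = df.loc[:, 'col1':'col2']      # label slice over columns, end label included

print(one_col.tolist())
print(list(two_cols.columns))
print(list(col_range.columns))
```

Note that the column slice `'col1':'col2'` follows the same rule as a row slice in `.loc`: the end label is part of the result.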


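  II: Selecting rows, or rows and columns together

  1. The loc attribute: df.loc[xx:xx] is label-based and includes the end of the slice

  2. Plain slicing: df[xx:xx] (the end is excluded)

  3. The iloc attribute: indexes rows by position and excludes the end of the slice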

# -*- coding: utf-8 -*-
import pandas as pd

d = {'col1': [1,2,3], 'col2': [4,5,6],'col3':[7,8,9]}
df = pd.DataFrame(data=d)
print(df)
print('-----------------')

# With loc: select the rows labelled 1 and 2
print(df.loc[[1,2]])
print('-------------')

# With slicing: select rows 0 and 1
print(df[0:2])
print('--------------')

# With iloc: select the rows at positions 0 and 1
print(df.iloc[0:2])
Output:
   col1  col2  col3
0     1     4     7
1     2     5     8
2     3     6     9
-----------------
   col1  col2  col3
1     2     5     8
2     3     6     9
-------------
   col1  col2  col3
0     1     4     7
1     2     5     8
--------------
   col1  col2  col3
0     1     4     7
1     2     5     8
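
With the default 0, 1, 2 index, labels and positions coincide, which hides the difference between the three approaches. The sketch below uses a hypothetical string index (`'a'` to `'d'`, not part of the original example) to make the difference visible.

```python
import pandas as pd

# String labels so that labels and positions cannot be confused
df = pd.DataFrame({'col1': [1, 2, 3, 4]}, index=['a', 'b', 'c', 'd'])

by_label = df.loc['a':'c']    # label slice: 'a', 'b', 'c' (end included)
by_position = df.iloc[0:2]    # position slice: positions 0 and 1 (end excluded)
by_slice = df[0:2]            # plain integer slice: also positional, end excluded

print(list(by_label.index))
print(list(by_position.index))
print(list(by_slice.index))
```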



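Summary:
df['name1'] # select a single column
df['name2']
df[['name1','name2']] # select several columns; the column names must be in a list
df[0:]  # row 0 and every row after it, i.e. the whole DataFrame; the colon is required
df[:2]  # the rows before row 2 (row 2 excluded)
df[0:1] # row 0
df[1:3] # rows 1 and 2 (row 3 excluded)
df[-1:] # the last row
df[-3:-1] # from the third-to-last row up to, but not including, the last row (counting from the front starts at 0; counting from the back starts at -1, because there is no -0)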
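
The slices in the summary can be verified on a small frame. This is a sketch with made-up values; the column names follow the summary, and the `colon_required` flag is only an illustrative helper showing what happens when the colon is left out.

```python
import pandas as pd

df = pd.DataFrame({'name1': [10, 20, 30, 40], 'name2': [1, 2, 3, 4]})

first_two = df[:2]     # rows 0 and 1
last_row = df[-1:]     # row 3 only
middle = df[-3:-1]     # rows 1 and 2, the last row is excluded

# Without the colon, df[0] is treated as a column lookup and fails here,
# because there is no column named 0.
try:
    df[0]
except KeyError:
    colon_required = True
else:
    colon_required = False

print(list(first_two.index), list(last_row.index), list(middle.index))
print(colon_required)
```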


  4. Selecting rows and columns together with loc

# -*- coding: utf-8 -*-
import pandas as pd

df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
     index=['cobra', 'viper', 'sidewinder'],
     columns=['max_speed', 'shield'])

df.loc[['viper', 'sidewinder'], ['shield']] = 50
print(df)
Output:
            max_speed  shield
cobra               1       2
viper               4      50
sidewinder          7      50
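
Besides assignment, the same row and column arguments can be used to read values. A minimal sketch on the original, unmodified data (variable names are illustrative):

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
                  index=['cobra', 'viper', 'sidewinder'],
                  columns=['max_speed', 'shield'])

one_value = df.loc['viper', 'shield']                        # a single cell: scalar
row_slice = df.loc['cobra':'viper', 'max_speed']             # label slice of rows, one column: Series
block = df.loc[['cobra', 'sidewinder'], ['shield', 'max_speed']]  # lists on both axes: DataFrame

print(one_value)
print(row_slice.tolist())
print(block.shape)
```

The type of the result depends on the arguments: two scalars give a scalar, a scalar on one axis gives a Series, and lists or slices on both axes give a DataFrame.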




  III: Boolean indexing

  1. A condition on a single column

# -*- coding: utf-8 -*-
import pandas as pd

d = {'col1': [1,2,3], 'col2': [4,5,6],'col3':[7,8,9]}
df = pd.DataFrame(data=d)
print(df)
print('-------------------')

res = df['col2'] > 5
print(res)
print('---------------------')
print(df[res])
Output:
   col1  col2  col3
0     1     4     7
1     2     5     8
2     3     6     9
-------------------
0    False
1    False
2     True
Name: col2, dtype: bool
---------------------
   col1  col2  col3
2     3     6     9
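
A boolean Series can also be passed as the row argument of `.loc`, which allows filtering rows and picking columns in one step. A short sketch, assuming the same data and a slightly different threshold (`>= 5`) so that two rows match:

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2, 3], 'col2': [4, 5, 6], 'col3': [7, 8, 9]})

mask = df['col2'] >= 5                    # boolean Series, one value per row
subset = df.loc[mask, ['col1', 'col3']]   # rows where mask is True, two columns
match_count = int(mask.sum())             # True counts as 1, so this is the number of matching rows

print(subset)
print(match_count)
```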




  2. A condition on multiple columns

# -*- coding: utf-8 -*-
import pandas as pd

d = {'col1': [1,2,3], 'col2': [4,5,6],'col3':[7,8,9]}
df = pd.DataFrame(data=d)
print(df)
print('-------------------')

res = df > 5
print(res)
print('---------------------')
print(df[res])
Output:
   col1  col2  col3
0     1     4     7
1     2     5     8
2     3     6     9
-------------------
    col1   col2  col3
0  False  False  True
1  False  False  True
2  False   True  True
---------------------
   col1  col2  col3
0   NaN   NaN     7
1   NaN   NaN     8
2   NaN   6.0     9
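
Indexing with a boolean DataFrame keeps the original shape and puts NaN wherever the mask is False, which is the behaviour of `DataFrame.where`. The sketch below compares the two and shows one possible way to substitute a value other than NaN; the replacement value 0 is an arbitrary choice for illustration.

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2, 3], 'col2': [4, 5, 6], 'col3': [7, 8, 9]})

masked = df[df > 5]            # NaN where the condition is False
same = df.where(df > 5)        # equivalent spelling with where()
filled = df.where(df > 5, 0)   # use 0 instead of NaN for the masked cells

nan_count = int(masked.isna().sum().sum())  # number of masked cells

print(masked.equals(same))
print(filled)
print(nan_count)
```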




  3. A condition on multiple rows

# -*- coding: utf-8 -*-
import pandas as pd

d = {'col1': [1,2,3], 'col2': [4,5,6],'col3':[7,8,9]}
df = pd.DataFrame(data=d)
print(df)
print('-------------------')

res = df.loc[[1,2]] > 5
print(df[res])
Output:
   col1  col2  col3
0     1     4     7
1     2     5     8
2     3     6     9
-------------------
   col1  col2  col3
0   NaN   NaN   NaN
1   NaN   NaN   8.0
2   NaN   6.0   9.0
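
In the example above the mask only covers rows 1 and 2, so row 0 has no matching entry in the mask and comes back entirely as NaN. If only the selected rows are wanted in the result, one option is to apply the mask to the sub-frame itself. A sketch, with `sub` as an illustrative name:

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2, 3], 'col2': [4, 5, 6], 'col3': [7, 8, 9]})

sub = df.loc[[1, 2]]          # the two rows of interest
aligned = df[sub > 5]         # mask applied to the full frame: row 0 is all NaN
sub_masked = sub[sub > 5]     # mask applied to the sub-frame: only rows 1 and 2

print(aligned)
print(sub_masked)
```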




  4. Combining multiple conditions

# -*- coding: utf-8 -*-
import pandas as pd

d = {'A': [11,12,13], 'B': [14,15,16],'C':[17,18,19]}
df = pd.DataFrame(d)
res0 = (df['A'] > 11) & (df['C'] < 19)  # each comparison needs its own parentheses, because & binds tighter than < and >
res = df['A'] > 11
res2 = df['C'] < 19
print(df[res])
print('-------------')
print(df[res2])
print('-------------')
print(df[(res) & (res2)])
Output:
    A   B   C
1  12  15  18
2  13  16  19
-------------
    A   B   C
0  11  14  17
1  12  15  18
-------------
    A   B   C
1  12  15  18
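
Conditions are combined with `&` (and), `|` (or) and `~` (not) rather than the Python keywords `and`, `or` and `not`. A short sketch on the same data, with thresholds chosen only for illustration:

```python
import pandas as pd

df = pd.DataFrame({'A': [11, 12, 13], 'B': [14, 15, 16], 'C': [17, 18, 19]})

both = df[(df['A'] > 11) & (df['C'] < 19)]     # both conditions hold: row 1
either = df[(df['A'] > 12) | (df['C'] < 18)]   # at least one holds: rows 0 and 2
negated = df[~(df['A'] > 11)]                  # condition does not hold: row 0

print(list(both.index), list(either.index), list(negated.index))
```

Each comparison is wrapped in parentheses so that it is evaluated before `&`, `|` or `~` is applied.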




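  IV: Multi-step (chained) indexing

  Select rows first and then columns, optionally combined with a boolean condition. (This is indexing in several steps, not a pandas MultiIndex.)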

# -*- coding: utf-8 -*-
import pandas as pd

d = {'col1': [1,2,3], 'col2': [4,5,6],'col3':[7,8,9]}
df = pd.DataFrame(data=d)
print(df)
print('-------------------')

v = df['col1'].loc[[1,2]]
print(v)
print('--------------')

v = df.loc[1:2]['col1']
print(v)
print('--------------')
print(df[df > 5].loc[1:2]['col1'])
Output:
   col1  col2  col3
0     1     4     7
1     2     5     8
2     3     6     9
-------------------
1    2
2    3
Name: col1, dtype: int64
--------------
1    2
2    3
Name: col1, dtype: int64
--------------
1   NaN
2   NaN
Name: col1, dtype: float64
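
For reading data, the two-step form and a single `.loc` call give the same result. For assignment, a single `.loc` call is generally the safer choice, because a chained assignment may act on a temporary object instead of the original frame. The sketch below assumes the same data; `df2` is a copy introduced only so the original stays unchanged.

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2, 3], 'col2': [4, 5, 6], 'col3': [7, 8, 9]})

chained = df.loc[1:2]['col1']       # rows first, then the column
single_call = df.loc[1:2, 'col1']   # rows and column in one call

# The boolean step can be written with where() and combined with one .loc call
masked_single = df.where(df > 5).loc[1:2, 'col1']

# Assignment through a single .loc call, done on a copy
df2 = df.copy()
df2.loc[1:2, 'col1'] = 0

print(chained.equals(single_call))
print(masked_single.isna().all())
print(df2['col1'].tolist())
```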


