Source: python中国网  Date: 2019-07-28

  Transposing data

  Use .T to transpose a DataFrame; the row and column labels are swapped along with the values.

# -*- coding: utf-8 -*-
import pandas as pd

d = {'col1': [1,2,3], 'col2': [4,5,6],'col3':[7,8,9]}
df = pd.DataFrame(data=d)
print(df)
print('-----------')
print(df.T)
Output:
   col1  col2  col3
0     1     4     7
1     2     5     8
2     3     6     9
-----------
      0  1  2
col1  1  2  3
col2  4  5  6
col3  7  8  9

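  The transpose also carries custom row labels over to the columns. A minimal sketch, assuming row labels 'x', 'y', 'z' that are made up for illustration; when all columns share one dtype, transposing twice should give back the original frame:

```python
import pandas as pd

# 'x', 'y', 'z' are hypothetical row labels chosen for this example
df = pd.DataFrame({'col1': [1, 2, 3], 'col2': [4, 5, 6]}, index=['x', 'y', 'z'])

t = df.T                      # row labels become column labels and vice versa
print(t)

print(list(t.columns))        # the former row labels
print(list(t.index))          # the former column labels
print(t.T.equals(df))         # transposing twice restores the original
```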


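  Modifying data

  1. Assign through an index label to modify an entire row or column.

  2. Assigning a single value through chained indexing, such as df.loc[[1]]['col1'] = ..., does not take effect, because the assignment lands on a temporary copy. Use df.at to set a single value (the new value should preferably have the same data type as the old one).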

# -*- coding: utf-8 -*-
import pandas as pd

d = {'col1': [1,2,3], 'col2': [4,'66',6],'col3':[7,8,9]}
df = pd.DataFrame(data=d)
print(df)
print('------------------')

# assign a scalar to overwrite the whole column
df['col1'] = 'aaa'
print(df)
print('-------------------')

# df.at sets a single value by row label and column label
df.at[1,'col2'] = 'py'
print(df)
print('-----------')

# chained indexing: df.loc[[1]] returns a copy, so this assignment is lost
df.loc[[1]]['col1'] = 'bbb'
print('unchanged')
print(df)
Output:
   col1 col2  col3
0     1    4     7
1     2   66     8
2     3    6     9
------------------
  col1 col2  col3
0  aaa    4     7
1  aaa   66     8
2  aaa    6     9
-------------------
  col1 col2  col3
0  aaa    4     7
1  aaa   py     8
2  aaa    6     9
-----------
unchanged
  col1 col2  col3
0  aaa    4     7
1  aaa   py     8
2  aaa    6     9

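  Item 1 mentions whole rows as well, and a single df.loc[row, column] call is another way to set one value without chained indexing. A minimal sketch, with made-up replacement values:

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2, 3], 'col2': [4, 5, 6], 'col3': [7, 8, 9]})

# overwrite the whole row labeled 1 (one value per column)
df.loc[1] = [20, 50, 80]

# set a single value in one indexing step: row label first, then column label
df.loc[2, 'col1'] = 30

# df.at does the same for a single cell
df.at[0, 'col3'] = 70

print(df)
```

  Because each assignment goes through one indexing call on df itself, no intermediate copy is involved.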


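  Deleting data

  1. del: removes a column from the DataFrame in place

  2. drop: removes rows or columns by label

  df.drop(['a', 'd'], axis=0) drops rows; axis=0 is the default

  df.drop(['a', 'd'], axis=1) drops columns

  df.drop(['a', 'd'], axis=1, inplace=False) returns a new DataFrame and leaves the original unchanged; inplace=False is the default

  df.drop(['a', 'd'], axis=1, inplace=True) modifies the original DataFrame in place and returns None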

# -*- coding: utf-8 -*-
import pandas as pd

d = {'col1': [1,2,3], 'col2': [4,'66',6],'col3':[7,8,9]}
df = pd.DataFrame(data=d)
print(df)
print('------------------')

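# delete a column with del (modifies df in place)
del df['col1']
print(df)
print('------------')

# drop the row labeled 1; drop returns a new DataFrame by default
res = df.drop([1])
print(res)
print('-------------')

# drop column col2 in place; df itself is modified
df.drop(['col2'],axis=1,inplace=True)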
print(df)

Output:
   col1 col2  col3
0     1    4     7
1     2   66     8
2     3    6     9
------------------
  col2  col3
0    4     7
1   66     8
2    6     9
------------
  col2  col3
0    4     7
2    6     9
-------------
   col3
0     7
1     8
2     9

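  drop also accepts index= and columns= keywords, which can be easier to read than axis. A minimal sketch on the same kind of data (all-integer columns are an assumption made for simplicity):

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2, 3], 'col2': [4, 5, 6], 'col3': [7, 8, 9]})

# equivalent to df.drop([1], axis=0)
without_row = df.drop(index=[1])

# equivalent to df.drop(['col2'], axis=1)
without_col = df.drop(columns=['col2'])

# with inplace=True the call returns None and df itself changes
result = df.drop(columns=['col3'], inplace=True)

print(without_row)
print(without_col)
print(result)
print(df)
```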


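  Alignment

  Arithmetic between two DataFrames aligns on both row and column labels; a label that exists in only one operand yields NaN in the result.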

# -*- coding: utf-8 -*-
import pandas as pd

d1 = {'col1': [1, 2], 'col2': [3, 4]}
d2 = {'col1': [4, 8], 'col2': [7, 9],'col3':[1,2]}
df1 = pd.DataFrame(data=d1)
df2 = pd.DataFrame(data=d2)
df = df1 + df2
print(df)

Output:
   col1  col2  col3
0     5    10   NaN
1    10    13   NaN

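  When NaN is not wanted for labels missing from one side, the add method with fill_value is one option. A minimal sketch reusing the two frames above; fill_value=0 treats a value that is missing from only one operand as 0:

```python
import pandas as pd

df1 = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4]})
df2 = pd.DataFrame({'col1': [4, 8], 'col2': [7, 9], 'col3': [1, 2]})

# same label alignment as df1 + df2, but entries missing on one side count as 0
total = df1.add(df2, fill_value=0)
print(total)
```

  Whether 0 is a sensible substitute depends on the data; for a product, for example, a fill value of 1 would be the neutral choice.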



  Sorting

  1) Sorting by values

  DataFrame.sort_values(by, axis=0, ascending=True, inplace=False, kind='quicksort', na_position='last')

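  Parameters:

  by: a string or a list of strings. With axis=0, by takes column label(s); with axis=1, by takes row label(s).

  axis: default 0, which sorts the rows by the values in the given column(s) (vertical sorting); 1 sorts the columns by the values in the given row(s) (horizontal sorting).

  ascending: bool, True for ascending order. If by=['column1', 'column2'], this can be a list such as [True, False], meaning the first key is sorted ascending and the second descending.

  inplace: bool, whether to replace the existing DataFrame with the sorted one.

  kind: sorting algorithm, {‘quicksort’, ‘mergesort’, ‘heapsort’}, default ‘quicksort’. It rarely needs attention.

  na_position: {‘first’, ‘last’}, default ‘last’, which places missing values at the end.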
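  The axis=1 case is less common, so a small illustration may help. A minimal sketch, assuming a frame with hypothetical row labels 'r0' and 'r1'; by names a row, and the columns are reordered by the values in that row:

```python
import pandas as pd

# 'r0' and 'r1' are made-up row labels for this example
df = pd.DataFrame({'a': [3, 1], 'b': [1, 2], 'c': [2, 3]}, index=['r0', 'r1'])

# reorder the columns so that the values in row 'r0' are ascending
by_row = df.sort_values(by='r0', axis=1)
print(by_row)

# the original frame is untouched because inplace defaults to False
print(df)
```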

  Sorting by a single column and by multiple columns

# -*- coding: utf-8 -*-
import pandas as pd


df = pd.DataFrame({'b':[1,2,3,2],'a':[4,3,2,1],'c':[1,3,8,2]},index=[2,0,1,3])
print(df)
print('------------')


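# sort by a single column (ascending by default)
df1 = df.sort_values(by='b',axis=0)
print(df1)
print('--------------')

# sort by multiple columns: b descending, then a ascending
df2 = df.sort_values(by=['b','a'],axis=0,ascending=[False,True])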
print(df2)

Output:
   b  a  c
2  1  4  1
0  2  3  3
1  3  2  8
3  2  1  2
------------
   b  a  c
2  1  4  1
0  2  3  3
3  2  1  2
1  3  2  8
--------------
   b  a  c
1  3  2  8
3  2  1  2
0  2  3  3
2  1  4  1

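  The na_position parameter only shows its effect when the sort column contains missing values. A minimal sketch with a made-up column holding one NaN:

```python
import pandas as pd

# hypothetical data: one missing value in column 'b'
df = pd.DataFrame({'b': [2.0, float('nan'), 1.0]}, index=['x', 'y', 'z'])

# default: NaN goes last, for ascending and descending sorts alike
asc = df.sort_values(by='b')
desc = df.sort_values(by='b', ascending=False)

# na_position='first' moves NaN to the top
nan_first = df.sort_values(by='b', na_position='first')

print(asc)
print(desc)
print(nan_first)
```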


  2) Sorting by index

  Use the .sort_index method.

  Parameters:

  sort_index(axis=0,level=None,ascending=True,inplace=False,kind='quicksort',na_position='last',sort_remaining=True,by=None)

  axis:{0 or ‘index’, 1 or ‘columns’}, default 0

  The axis along which to sort. The value 0 identifies the rows, and 1 identifies the columns.

  level:int or level name or list of ints or list of level names

  If not None, sort on values in specified index level(s).

  ascending:bool, default True

  Sort ascending vs. descending.

  inplace:bool, default False

  If True, perform operation in-place.

  kind:{‘quicksort’, ‘mergesort’, ‘heapsort’}, default ‘quicksort’

  Choice of sorting algorithm. See also numpy.sort for more information. mergesort is the only stable algorithm. For DataFrames, this option is only applied when sorting on a single column or label.

  na_position:{‘first’, ‘last’}, default ‘last’

  Puts NaNs at the beginning if first; last puts NaNs at the end. Not implemented for MultiIndex.

  sort_remaining:bool, default True

  If True and sorting by level and index is multilevel, sort by other levels too (in order) after sorting by specified level.

# -*- coding: utf-8 -*-
import pandas as pd


df = pd.DataFrame({'b':[1,2,2,3],'a':[4,3,2,1],'c':[1,3,8,2]},index=[2,0,1,3])
print(df)
print('----------')

# sort by row labels in ascending order (the default)
df1 = df.sort_index()
print(df1)
print('-------------')

# sort by column labels in descending order
df2 = df.sort_index(axis=1,ascending=False)
print(df2)
Output:
   b  a  c
2  1  4  1
0  2  3  3
1  2  2  8
3  3  1  2
----------
   b  a  c
0  2  3  3
1  2  2  8
2  1  4  1
3  3  1  2
-------------
   c  b  a
2  1  1  4
0  3  2  3
1  8  2  2
3  2  3  1

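  The level and sort_remaining parameters apply to a MultiIndex, which the example above does not use. A minimal sketch, assuming a two-level index with hypothetical level names 'key' and 'num':

```python
import pandas as pd

# 'key' and 'num' are made-up level names for this example
idx = pd.MultiIndex.from_tuples(
    [('b', 2), ('a', 1), ('b', 1), ('a', 2)], names=['key', 'num'])
df = pd.DataFrame({'v': [10, 20, 30, 40]}, index=idx)

# default: sort by every level, outermost first
by_all = df.sort_index()

# sort by the 'num' level first; sort_remaining=True then orders ties by 'key'
by_num = df.sort_index(level='num')

print(by_all)
print(by_num)
```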