Transform DataFrame

Convert a pandas DataFrame to a nested DataFrame

transform.add_columns(df, columns, default=nan, deepcopy=False)

Add each listed column that does not already exist

Parameters:
  • df (pandas DataFrame) –
  • columns (list of column names to add) –
  • default (list or a single object; default: NaN) –
  • deepcopy (bool, whether to deep-copy df; default: False) –
Returns:

Return type:

pandas DataFrame

Examples

>>> import pandas as pd
>>> from tidyframe import add_columns
>>> df = pd.DataFrame()
>>> df['a'] = [1, 6]
>>> df['b'] = [2, 7]
>>> df['c'] = [3, 8]
>>> df['d'] = [4, 9]
>>> df['e'] = [5, 10]
>>> add_columns(df, columns=['a', 'f'], default=[30, [10, 11]])
>>> df
   a  b  c  d   e   f
0  1  2  3  4   5  10
1  6  7  8  9  10  11
transform.apply_window(df, func, partition=None, columns=None)

Apply window functions to a DataFrame

Parameters:
  • df (DataFrameGroupBy or DataFrame) –
  • func (list of functions) –
  • partition (list of partition columns) –
  • columns (list of columns to apply func to) –
Returns:

Return type:

Pandas Series

Examples

>>> import pandas as pd
>>> import numpy as np
>>> from tidyframe import apply_window
>>>
>>> df = pd.DataFrame({"range":[1,2,3,4,5,6],"target":[1,1,1,2,2,2]})
>>> apply_window(df, np.mean, partition=['target'], columns=df.columns[1])
0    1
1    1
2    1
3    2
4    2
5    2
Name: target, dtype: int64
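For comparison, the same kind of per-partition calculation can be written in plain pandas. This is a minimal sketch that assumes apply_window behaves like a group-wise transform; it is not tidyframe's implementation, and it applies the mean to the range column rather than to target.

```python
import pandas as pd

df = pd.DataFrame({"range": [1, 2, 3, 4, 5, 6], "target": [1, 1, 1, 2, 2, 2]})

# Compute the mean of "range" within each "target" partition and
# broadcast it back to every row, keeping the original index.
group_mean = df.groupby("target")["range"].transform("mean")
print(group_mean.tolist())
```

The result has one value per input row, which is what distinguishes a window function from an aggregation that returns one row per group.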
transform.nest(df, columns=[], columns_minus=[], columns_between=[], key='data', copy=False)

Nest repeated values

Parameters:
  • df (DataFrameGroupBy or DataFrame) –
  • columns (list or index, columns to nest) –
  • columns_minus (list or index, columns to exclude from nesting) – (must specify one of columns or columns_minus)
  • columns_between (list of length 2, nest the columns between these two columns) –
  • copy (bool, return a copy of the DataFrame made with copy.deepcopy; default: False) –
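The idea of nesting can be illustrated with plain pandas. The sketch below is an assumption about the intended behaviour (one row per distinct combination of the non-nested columns, with the nested columns packed into a sub-DataFrame stored under key); nest_sketch is a hypothetical helper, not tidyframe's implementation.

```python
import pandas as pd

def nest_sketch(frame, columns, key="data"):
    """Pack `columns` into one sub-DataFrame per distinct outer row."""
    outer = [c for c in frame.columns if c not in columns]
    key_rows, frames = [], []
    for values, sub in frame.groupby(outer, sort=False):
        # Older pandas versions yield a scalar key for a single grouping column.
        values = values if isinstance(values, tuple) else (values,)
        key_rows.append(values)
        frames.append(sub[columns].reset_index(drop=True))
    nested = pd.DataFrame(key_rows, columns=outer)
    # Store the sub-DataFrames as plain Python objects in one column.
    nested[key] = pd.Series(frames, index=nested.index, dtype=object)
    return nested

df = pd.DataFrame({"group": ["a", "a", "b"], "x": [1, 2, 3], "y": [4, 5, 6]})
nested = nest_sketch(df, columns=["x", "y"])
print(nested["group"].tolist())
print(nested["data"].iloc[0])
```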
transform.rolling(list_object, window_size, missing=nan)

Build rolling windows over a list of objects

Parameters:
  • list_object (list of objects) –
  • window_size (rolling window size) –
  • missing (value used to fill positions that fall before the start of the list) –
Returns:

Return type:

list of list

Examples

>>> import pandas as pd
>>> from tidyframe import rolling
>>> a = list(range(10))
>>> pd.DataFrame({'a': a, 'b': rolling(a, 3)})
   a              b
0  0  [nan, nan, 0]
1  1    [nan, 0, 1]
2  2      [0, 1, 2]
3  3      [1, 2, 3]
4  4      [2, 3, 4]
5  5      [3, 4, 5]
6  6      [4, 5, 6]
7  7      [5, 6, 7]
8  8      [6, 7, 8]
9  9      [7, 8, 9]
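The windowing shown in the output above can be reproduced in pure Python. This is a minimal sketch of a trailing window that is left-padded with the missing value; rolling_sketch is a hypothetical name and the code is not taken from tidyframe.

```python
import math

def rolling_sketch(items, window_size, missing=math.nan):
    """Return one trailing window per item, left-padded with `missing`."""
    padded = [missing] * (window_size - 1) + list(items)
    return [padded[i:i + window_size] for i in range(len(items))]

# None is used as the filler here so the windows are easy to compare.
windows = rolling_sketch(list(range(5)), 3, missing=None)
print(windows)
```

Each window ends at the current element, so the first window_size - 1 windows contain the filler value.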
transform.to_dataframe(data, index_name='index')

Convert a list of pandas Series to a pandas DataFrame

Parameters:
  • data (list of pandas Series) –
  • index_name (name of the index column in the returned DataFrame) –

Examples

>>> import pandas as pd
>>> from tidyframe import to_dataframe
>>> list_series = [
...     pd.Series([1, 2], index=['i_1', 'i_2']),
...     pd.Series([3, 4], index=['i_1', 'i_2'])
... ]
>>> to_dataframe(list_series)
   i_1  i_2 index
0    1    2  None
1    3    4  None
transform.unnest(df, drop=[], copy=False)

Inverse of nest: expand a nested DataFrame

Parameters:
  • df (DataFrame with a Series of DataFrames) –
  • drop (list of columns to exclude from the result) –
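Unnesting can be sketched in plain pandas as follows. The example assumes the nested column is named data and holds one sub-DataFrame per row; unnest_sketch is a hypothetical helper that illustrates the idea and is not tidyframe's implementation.

```python
import pandas as pd

def unnest_sketch(frame, key="data"):
    """Expand each sub-DataFrame in `key` and repeat the outer values."""
    outer = [c for c in frame.columns if c != key]
    parts = []
    for i in range(len(frame)):
        inner = frame[key].iloc[i].copy()
        for col in outer:
            # A scalar assignment repeats the outer value on every inner row.
            inner[col] = frame[col].iloc[i]
        parts.append(inner)
    return pd.concat(parts, ignore_index=True)

nested = pd.DataFrame({"group": ["a", "b"]})
nested["data"] = pd.Series(
    [pd.DataFrame({"x": [1, 2]}), pd.DataFrame({"x": [3]})],
    index=nested.index,
    dtype=object,
)
flat = unnest_sketch(nested)
print(flat)
```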