Transform DataFrame
Convert a pandas DataFrame to a nested DataFrame
transform.add_columns(df, columns, default=nan, deepcopy=False)
Add each column in columns that does not already exist in df.
Parameters:
- df (pandas DataFrame) – the DataFrame to add columns to
- columns (list) – names of the columns to add
- default (list or object, default: NaN) – default value(s) for the new columns
- deepcopy (bool, default: False) – whether to deep-copy df before adding columns
Return type: pandas DataFrame
Examples
>>> import pandas as pd
>>> from tidyframe import add_columns
>>> df = pd.DataFrame()
>>> df['a'] = [1, 6]
>>> df['b'] = [2, 7]
>>> df['c'] = [3, 8]
>>> df['d'] = [4, 9]
>>> df['e'] = [5, 10]
>>> add_columns(df, columns=['a', 'f'], default=[30, [10, 11]])
>>> df
   a  b  c  d   e   f
0  1  2  3  4   5  10
1  6  7  8  9  10  11
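The behaviour in the example above (existing column 'a' is left untouched, missing column 'f' is filled from the matching entry in default) can be approximated in plain pandas. This is only a sketch under assumptions: add_missing_columns is a hypothetical helper, not tidyframe's implementation, and the pairing of columns with default entries by position is inferred from the example output.

```python
import numpy as np
import pandas as pd


def add_missing_columns(frame, columns, default=np.nan):
    """Hypothetical helper: add each missing column in place, skip existing ones."""
    # Assumption: a list `default` pairs with `columns` by position;
    # any other value is reused for every new column.
    defaults = default if isinstance(default, list) else [default] * len(columns)
    for name, value in zip(columns, defaults):
        if name not in frame.columns:
            frame[name] = value
    return frame


df = pd.DataFrame({"a": [1, 6], "b": [2, 7]})
add_missing_columns(df, ["a", "f"], default=[30, [10, 11]])
print(df)
```

Because 'a' already exists, its default of 30 is never used; only 'f' is created.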
transform.apply_window(df, func, partition=None, columns=None)
Apply a window function to a DataFrame.
Parameters:
- df (DataFrameGroupBy or DataFrame) – the data to apply func to
- func (list of functions) – the window function(s) to apply
- partition (list) – columns to partition by
- columns (list) – columns to apply func to
Return type: pandas Series
Examples
>>> import pandas as pd
>>> import numpy as np
>>> from tidyframe import apply_window
>>>
>>> df = pd.DataFrame({"range":[1,2,3,4,5,6],"target":[1,1,1,2,2,2]})
>>> apply_window(df, np.mean, partition=['target'], columns=df.columns[1])
0    1
1    1
2    1
3    2
4    2
5    2
Name: target, dtype: int64
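For comparison, the same idea of a window function (an aggregate computed per partition and aligned back to the original rows) can be written with plain pandas. This sketch uses groupby and transform directly and does not call apply_window; whether tidyframe is implemented this way is not stated here.

```python
import pandas as pd

df = pd.DataFrame({"range": [1, 2, 3, 4, 5, 6], "target": [1, 1, 1, 2, 2, 2]})

# Mean of "range" within each "target" partition, one value per original row.
partition_mean = df.groupby("target")["range"].transform("mean")
print(partition_mean.tolist())  # [2.0, 2.0, 2.0, 5.0, 5.0, 5.0]
```

The result has the same length and index as df, which is what distinguishes a window function from an ordinary grouped aggregate.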
transform.nest(df, columns=[], columns_minus=[], columns_between=[], key='data', copy=False)
Nest repeated values.
Parameters:
- df (DataFrameGroupBy or DataFrame) – the data to nest
- columns (list or index) – columns to nest
- columns_minus (list or index) – columns to leave un-nested (one of columns and columns_minus must be given)
- columns_between (list of length 2) – nest the columns between these two columns
- copy (bool, default: False) – whether to return the DataFrame using copy.deepcopy
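No example is given for nest, so the following plain-pandas sketch illustrates the general idea of nesting: the chosen columns are collapsed into one sub-DataFrame per distinct combination of the remaining columns. nest_sketch is a hypothetical helper written for illustration; it does not call tidyframe, and the actual output layout of transform.nest may differ.

```python
import pandas as pd


def nest_sketch(frame, columns, key="data"):
    """Hypothetical helper: store `columns` as one sub-DataFrame per group."""
    group_cols = [c for c in frame.columns if c not in columns]
    keys, frames = [], []
    for group_values, sub in frame.groupby(group_cols, sort=False):
        # Older pandas versions yield a scalar key for a single grouping column.
        if not isinstance(group_values, tuple):
            group_values = (group_values,)
        keys.append(group_values)
        frames.append(sub[columns].reset_index(drop=True))
    nested = pd.DataFrame(keys, columns=group_cols)
    nested[key] = frames  # object column holding one DataFrame per row
    return nested


df = pd.DataFrame({"target": [1, 1, 2], "x": [10, 20, 30]})
nested = nest_sketch(df, ["x"])
print(nested["target"].tolist())
print(nested["data"].iloc[0])
```

Each row of nested now describes one value of target, with the matching x values held in the cell under key.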
transform.rolling(list_object, window_size, missing=nan)
Build a rolling window over a list of objects.
Parameters:
- list_object (list of objects) – the list to roll over
- window_size – size of the rolling window
- missing – value used for window positions that have no element (default: NaN)
Return type: list of lists
Examples
>>> import pandas as pd
>>> from tidyframe import rolling
>>> a = list(range(10))
>>> pd.DataFrame({'a': a, 'b': rolling(a, 3)})
   a              b
0  0  [nan, nan, 0]
1  1    [nan, 0, 1]
2  2      [0, 1, 2]
3  3      [1, 2, 3]
4  4      [2, 3, 4]
5  5      [3, 4, 5]
6  6      [4, 5, 6]
7  7      [5, 6, 7]
8  8      [6, 7, 8]
9  9      [7, 8, 9]
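The output above shows one trailing window per element, padded on the left with the missing value. A minimal pure-Python sketch of that pattern follows; rolling_sketch is a hypothetical name, and this is an illustration of the windowing logic rather than tidyframe's own code.

```python
import math


def rolling_sketch(items, window_size, missing=math.nan):
    """Hypothetical helper: one trailing window per element, left-padded."""
    items = list(items)
    # Pad the front so the first element already has a full-length window.
    padded = [missing] * (window_size - 1) + items
    return [padded[i:i + window_size] for i in range(len(items))]


windows = rolling_sketch(range(5), 3, missing=None)
print(windows)
# [[None, None, 0], [None, 0, 1], [0, 1, 2], [1, 2, 3], [2, 3, 4]]
```

None is used as the missing value here only so the printed result is easy to compare; the default mirrors the documented NaN.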
transform.to_dataframe(data, index_name='index')
Convert a list of pandas Series to a pandas DataFrame.
Parameters:
- data (list of pandas Series) – the Series to combine
- index_name – name of the index column in the returned DataFrame
Examples
>>> import pandas as pd
>>> from tidyframe import to_dataframe
>>> list_series = [
...     pd.Series([1, 2], index=['i_1', 'i_2']),
...     pd.Series([3, 4], index=['i_1', 'i_2'])
... ]
>>> to_dataframe(list_series)
   i_1  i_2 index
0    1    2  None
1    3    4  None
transform.unnest(df, drop=[], copy=False)
Inverse of nest: expand a nested DataFrame.
Parameters:
- df (DataFrame) – DataFrame containing a Series of DataFrames
- drop (list) – columns to exclude from the result
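As with nest, no example is given, so here is a plain-pandas sketch of what unnesting means: each sub-DataFrame is expanded back into rows, with the outer column values repeated alongside. unnest_sketch and its key argument are hypothetical, the treatment of drop is an interpretation of the parameter description, and tidyframe itself is not called.

```python
import pandas as pd


def unnest_sketch(nested, key="data", drop=()):
    """Hypothetical helper: expand the sub-DataFrames in column `key` into rows."""
    outer_cols = [c for c in nested.columns if c != key]
    parts = []
    for position, inner in enumerate(nested[key]):
        part = inner.copy()  # leave the nested sub-DataFrame untouched
        for col in outer_cols:
            part[col] = nested[col].iloc[position]  # repeat the outer value
        parts.append(part)
    flat = pd.concat(parts, ignore_index=True)
    inner_cols = [c for c in flat.columns if c not in outer_cols]
    # Assumption: `drop` names columns to leave out of the result.
    return flat[outer_cols + inner_cols].drop(columns=list(drop))


nested = pd.DataFrame({"target": [1, 2]})
nested["data"] = [pd.DataFrame({"x": [10, 20]}), pd.DataFrame({"x": [30]})]
flat = unnest_sketch(nested)
print(flat)
```

The result has one row per row of the inner DataFrames, so the two-row nested frame becomes three flat rows.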