Select DataFrame
Convenience functions for selecting columns from a pandas DataFrame
tools.select.get_batch_dataframe(df, batch_size=100)
Split a DataFrame into sub-DataFrames, each containing at most batch_size rows.
Parameters:
- df (pandas DataFrame) – DataFrame to split
- batch_size – number of records in each sub-DataFrame (default: 100)
Return type: generator of pandas DataFrames
Examples
>>> import pandas as pd
>>> from tidyframe import get_batch_dataframe
>>> df = pd.DataFrame()
>>> df['col_1'] = list("abcde")
>>> df['col_2'] = [1, 2, 3, 4, 5]
>>> dfs = [x for x in get_batch_dataframe(df, 2)]
>>> dfs[-1]
  col_1  col_2
4     e      5
>>> [x.shape[0] for x in dfs]
[2, 2, 1]
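The batching behaviour shown above can be approximated with plain pandas. The following is a minimal sketch, not the tidyframe implementation; the name `iter_batches` is hypothetical, and it assumes that batches are taken by row position in order.

```python
import pandas as pd


def iter_batches(df, batch_size=100):
    """Yield consecutive row slices of df, each with at most batch_size rows.

    Hypothetical helper for illustration; not part of tidyframe.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    for start in range(0, len(df), batch_size):
        # iloc slices by position, so this works for any index type
        yield df.iloc[start:start + batch_size]


df = pd.DataFrame({"col_1": list("abcde"), "col_2": [1, 2, 3, 4, 5]})
batch_sizes = [len(batch) for batch in iter_batches(df, 2)]
last_batch = list(iter_batches(df, 2))[-1]
```

Because the function is a generator, only one slice is materialised at a time, which is the usual reason to batch a large DataFrame before writing it to a database or an API.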
tools.select.reorder_columns(df, columns=None, pattern=None, last_columns=None)
Reorder the columns of a pandas DataFrame.
Parameters:
- df (pandas DataFrame) – DataFrame whose columns are reordered
- columns – list of column names to move to the front (ignored if pattern is not None)
- pattern – regular expression; columns matching it are moved to the front
- last_columns – list of column names to move to the end
Return type: pandas DataFrame
Examples
>>> import pandas as pd
>>> from tidyframe import reorder_columns
>>> df = pd.DataFrame([{'a': 1, 'b': 1, 'c': 1, 'd': 1, 'e': 2}])
>>> df_reorder = reorder_columns(df, ['b', 'c'], last_columns=['a', 'd'])
>>> df_reorder
   b  c  e  a  d
0  1  1  2  1  1
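The ordering rule in the example (head columns first, untouched columns in their original order, last columns at the end) can be sketched in plain pandas as follows. This is an illustration under stated assumptions, not the library's code: `reorder_sketch` is a hypothetical name, and the use of `re.search` for pattern matching is an assumption.

```python
import re

import pandas as pd


def reorder_sketch(df, columns=None, pattern=None, last_columns=None):
    """Return df with head columns first and last_columns at the end.

    Hypothetical helper for illustration; not part of tidyframe.
    """
    last = list(last_columns or [])
    if pattern is not None:
        # Assumption: a column is selected when the pattern matches anywhere
        head = [c for c in df.columns
                if re.search(pattern, str(c)) and c not in last]
    else:
        head = [c for c in (columns or []) if c not in last]
    # Everything not explicitly placed keeps its original relative order
    middle = [c for c in df.columns if c not in head and c not in last]
    return df[head + middle + last]


df = pd.DataFrame([{"a": 1, "b": 1, "c": 1, "d": 1, "e": 2}])
reordered = reorder_sketch(df, ["b", "c"], last_columns=["a", "d"])
by_pattern = reorder_sketch(df, pattern="^[de]$")
```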
tools.select.select(df, columns=None, columns_minus=None, columns_between=None, pattern=None, copy=False)
Select columns of a pandas DataFrame.
Parameters:
- df (pandas DataFrame) – DataFrame to select from
- columns – list of column names to keep
- columns_minus – list of column names to remove
- columns_between – list of two column names; selects the columns between them
- pattern – regular expression or list of regular expressions; matching columns are returned
- copy – whether to return a deep copy of the DataFrame
Return type: pandas DataFrame
Examples
>>> import numpy as np
>>> import pandas as pd
>>> from tidyframe import select
>>> df = pd.DataFrame(np.array(range(10)).reshape(2, 5),
...                   columns=list('abcde'),
...                   index=['row_1', 'row_2'])
>>> select(df, columns=['b', 'd'])
       b  d
row_1  1  3
row_2  6  8
>>> select(df, columns_minus=['b', 'd'])
       a  c  e
row_1  0  2  4
row_2  5  7  9
>>> select(df, pattern='[a|b]')
       a  b
row_1  0  1
row_2  5  6
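For comparison, rough plain-pandas counterparts of these selections are sketched below. They are not how tidyframe implements `select`. In particular, whether `columns_between` includes both endpoint columns is an assumption here, since the label slice used in the sketch always does. Inside a regular-expression character class, `|` is a literal character, so `[ab]` matches the same columns as `[a|b]` in this example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(10).reshape(2, 5),
                  columns=list("abcde"),
                  index=["row_1", "row_2"])

# Keep only the named columns
kept = df[["b", "d"]]

# Drop the named columns (comparable to columns_minus)
minus = df.drop(columns=["b", "d"])

# Label-based slice; pandas includes both endpoints (assumed analogue
# of columns_between=['b', 'd'])
between = df.loc[:, "b":"d"]

# Regex match on column names; DataFrame.filter uses re.search semantics
matched = df.filter(regex="^[ab]$")

# An explicit deep copy, comparable to copy=True
independent = kept.copy(deep=True)
```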
tools.select.select_index(x, i, otherwise=nan)
Select x[i], returning a fallback value if any exception is raised.
Parameters:
- x (array) – object to index
- i (index) – index to look up in x
- otherwise – value returned if an exception occurs (default: nan)
Returns: x[i] if no exception occurs; otherwise the value of otherwise
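The documented behaviour amounts to a guarded index lookup. A minimal sketch, assuming a broad `except Exception` is what "catch all exceptions" means here; `select_index_sketch` is a hypothetical name, not the tidyframe function:

```python
import math


def select_index_sketch(x, i, otherwise=float("nan")):
    """Return x[i], or `otherwise` if the lookup raises any exception.

    Hypothetical helper for illustration; not part of tidyframe.
    """
    try:
        return x[i]
    except Exception:
        # Covers IndexError, KeyError, TypeError (e.g. x is None), and so on
        return otherwise


second = select_index_sketch([10, 20, 30], 1)
missing = select_index_sketch([10, 20, 30], 5)
from_none = select_index_sketch(None, 0, otherwise="n/a")
```

A helper like this is convenient inside `Series.apply` when some cells hold short lists or missing values, although swallowing every exception can also hide genuine bugs.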