■ Shows how to aggregate data per group using the transform method of the DataFrameGroupBy class.
▶ main.py
import pandas as pd
import numpy as np

# Two frames that share a "key" column; the values are random, so the numbers differ on every run.
dataFrame1 = pd.DataFrame({"key" : ["A", "B", "C", "D"], "value" : np.random.randn(4)})
dataFrame2 = pd.DataFrame({"key" : ["B", "D", "D", "E"], "value" : np.random.randn(4)})

# The outer merge keeps every key; the clashing "value" columns become value_x and value_y.
dataFrame3 = dataFrame1.merge(dataFrame2, on = ["key"], how = "outer")

dataFrameGroupBy = dataFrame3.groupby("key")

# transform returns one result per original row, not one row per group.
series1 = dataFrameGroupBy.transform("mean" )["value_x"]
series2 = dataFrameGroupBy.transform("mean" )["value_y"]
series3 = dataFrameGroupBy.transform("sum"  )["value_x"]
series4 = dataFrameGroupBy.transform("sum"  )["value_y"]
series5 = dataFrameGroupBy.transform("count")["value_x"]
series6 = dataFrameGroupBy.transform("count")["value_y"]

# Collect the per-row group statistics into a single frame.
dataFrame3 = pd.DataFrame(
    {
        'mean_x'  : series1,
        'mean_y'  : series2,
        'sum_x'   : series3,
        'sum_y'   : series4,
        'count_x' : series5,
        'count_y' : series6
    }
)

print(dataFrame3)

"""
      mean_x     mean_y      sum_x      sum_y  count_x  count_y
0   1.054826        NaN   1.054826   0.000000        1        0
1   0.780643   0.105718   0.780643   0.105718        1        1
2  -0.780831        NaN  -0.780831   0.000000        1        0
3  -0.709413  -1.636002  -1.418825  -3.272004        2        2
4  -0.709413  -1.636002  -1.418825  -3.272004        2        2
5        NaN   1.348015   0.000000   1.348015        0        1
"""
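The sample output has two features worth spelling out: the result has one row per input row rather than one per key, and keys with no matching value show NaN for the mean but 0 for the sum and the count. The following is a minimal sketch of both behaviors, not part of the original main.py. The frame, its fixed values, and the names frame, aggregated, meanSeries, sumSeries, and countSeries are hypothetical, chosen so that the result is reproducible.

```python
import pandas as pd

# Hypothetical sample data with fixed values; key "C" has only a missing value.
frame = pd.DataFrame({"key": ["A", "B", "B", "C"], "value": [1.0, 2.0, 4.0, None]})
grouped = frame.groupby("key")["value"]

# agg collapses each group to a single row, indexed by the group key.
aggregated = grouped.agg("mean")

# transform broadcasts each group's result back onto the original rows,
# so the result keeps the index of the input frame.
meanSeries = grouped.transform("mean")
sumSeries = grouped.transform("sum")
countSeries = grouped.transform("count")

print(len(aggregated), len(meanSeries))  # 3 groups versus 4 rows
print(meanSeries.tolist())               # mean of an all-missing group is NaN
print(sumSeries.tolist())                # sum of an all-missing group is 0.0
print(countSeries.tolist())              # count ignores missing values
```

Because the transformed series share the index of the input frame, they can be assigned back as new columns (for example frame["group_mean"] = meanSeries) without a separate merge.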
▶ requirements.txt
numpy==2.1.2
pandas==2.2.3
python-dateutil==2.9.0.post0
pytz==2024.2
six==1.16.0
tzdata==2024.2
※ The packages above were installed by running the pip install pandas command.