■ Shows how to use the id_vars argument of the DataFrame class's melt method to convert WIDE-format data into LONG-format data.
▶ main.py
import pandas as pd

# Load the long-format measurements, using the timestamp column as the index.
dataFrame1 = pd.read_csv("air_quality_long.csv", index_col = "date.utc", parse_dates = True) # 5272 rows
dataFrame2 = dataFrame1[dataFrame1["parameter"] == "no2"]                                    # 3447 rows

# Build the WIDE table: one column per location, then move the index back into a column.
dataFrame3 = dataFrame2.pivot(columns = "location", values = "value")
dataFrame4 = dataFrame3.reset_index()

print(dataFrame4)
"""
location                   date.utc  BETR801  FR04014  London Westminster
0         2019-04-09 01:00:00+00:00     22.5     24.4                 NaN
1         2019-04-09 02:00:00+00:00     53.5     27.4                67.0
2         2019-04-09 03:00:00+00:00     54.5     34.2                67.0
3         2019-04-09 04:00:00+00:00     34.5     48.5                41.0
4         2019-04-09 05:00:00+00:00     46.5     59.5                41.0
...                             ...      ...      ...                 ...
1700      2019-06-20 20:00:00+00:00      NaN     21.4                 NaN
1701      2019-06-20 21:00:00+00:00      NaN     24.9                 NaN
1702      2019-06-20 22:00:00+00:00      NaN     26.5                 NaN
1703      2019-06-20 23:00:00+00:00      NaN     21.8                 NaN
1704      2019-06-21 00:00:00+00:00      NaN     20.0                 NaN

[1705 rows x 4 columns]
"""

print()

# Back to LONG: date.utc is kept as the identifier; every other column is unpivoted.
# The variable column is named "location" because that is the name of the columns index.
dataFrame5 = dataFrame4.melt(id_vars = "date.utc")

print(dataFrame5)
"""
                       date.utc            location  value
0     2019-04-09 01:00:00+00:00             BETR801   22.5
1     2019-04-09 02:00:00+00:00             BETR801   53.5
2     2019-04-09 03:00:00+00:00             BETR801   54.5
3     2019-04-09 04:00:00+00:00             BETR801   34.5
4     2019-04-09 05:00:00+00:00             BETR801   46.5
...                         ...                 ...    ...
5110  2019-06-20 20:00:00+00:00  London Westminster    NaN
5111  2019-06-20 21:00:00+00:00  London Westminster    NaN
5112  2019-06-20 22:00:00+00:00  London Westminster    NaN
5113  2019-06-20 23:00:00+00:00  London Westminster    NaN
5114  2019-06-21 00:00:00+00:00  London Westminster    NaN

[5115 rows x 3 columns]
"""
▶ requirements.txt
numpy==2.1.2
pandas==2.2.3
python-dateutil==2.9.0.post0
pytz==2024.2
six==1.16.0
tzdata==2024.2
※ The pip install pandas command was run.
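※ The wide-to-long step can also be tried without the CSV file. The sketch below is a minimal illustration, not part of main.py: it builds a two-row wide table in memory (the values are copied from the first two rows of the output above) and names the output columns explicitly with var_name and value_name. The names wide_df and long_df are assumptions introduced only for this example.

```python
import pandas as pd

# Two-row excerpt of the wide table from main.py, built by hand (no CSV needed).
wide_df = pd.DataFrame(
    {
        "date.utc": pd.to_datetime(
            ["2019-04-09 01:00:00", "2019-04-09 02:00:00"], utc=True
        ),
        "BETR801": [22.5, 53.5],
        "FR04014": [24.4, 27.4],
        "London Westminster": [None, 67.0],
    }
)

# id_vars is kept as an identifier column; all remaining columns are unpivoted.
# Here the columns index has no name, so var_name is set explicitly;
# without it the variable column would be called "variable".
long_df = wide_df.melt(id_vars="date.utc", var_name="location", value_name="value")

print(long_df.shape)                          # (6, 3): 2 rows x 3 location columns
print(long_df.dropna(subset=["value"]).shape) # (5, 3): the missing reading is dropped
```

Melting produces one row per combination of identifier row and unpivoted column, which is why the 1705-row wide table in main.py becomes 1705 x 3 = 5115 rows, including the NaN entries.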