Welcome toVigges Developer Community-Open, Learning,Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
259 views
in Technique[技术] by (71.8m points)

python - pandas groupby and aggregator

I have the following dataframe:

dates,rr,ax,be
2018-01-01 00:00:00,45.73,47.63,45.83
2018-01-01 01:00:00,44.16,44.42,44.59
2018-01-01 02:00:00,42.24,42.34,40.22
2018-01-01 03:00:00,39.29,38.36,37.31
2018-01-01 04:00:00,36.0,36.87,32.88
2018-01-01 05:00:00,41.99,39.79,39.27
2018-01-01 06:00:00,42.25,42.08,43.62
2018-01-01 07:00:00,44.97,51.19,49.69
2018-01-01 08:00:00,45.0,59.69,49.98
2018-01-01 09:00:00,44.94,56.67,48.04
2018-01-01 10:00:00,45.04,53.54,46.85
2018-01-01 11:00:00,46.67,52.6,47.95
2018-01-01 12:00:00,46.99,50.77,46.6
2018-01-01 13:00:00,44.16,50.27,43.02
2018-01-01 14:00:00,45.26,50.64,44.2
2018-01-01 15:00:00,47.84,54.79,47.1
2018-01-01 16:00:00,50.1,60.17,50.83
2018-01-01 17:00:00,54.3,59.47,58.31
2018-01-01 18:00:00,51.91,60.16,63.5
2018-01-01 19:00:00,51.38,70.81,61.9
2018-01-01 20:00:00,49.2,62.65,59.62
2018-01-01 21:00:00,45.73,59.71,52.84
2018-01-01 22:00:00,44.84,50.96,51.43
2018-01-01 23:00:00,38.11,46.52,45.35
2018-01-02 00:00:00,19.19,49.62,41.61
2018-01-02 01:00:00,14.99,45.05,40.78
2018-01-02 02:00:00,11.0,45.18,39.59

then I group according to the hour in the index as follows:

df1 = df.assign(hour=lambda x: x.index.hour).groupby('hour')

at this point, I would like to compute the mean of each group. I do as follow:

aa = df1['hour'].mean()

This seems to not working. What am I missing? Thanks, Diego


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

Solution looks good, should by simplify:

aa = df.groupby(df.index.hour.rename('hour')).mean()
print (aa)
          rr      ax      be
hour                        
0     32.460  48.625  43.720
1     29.575  44.735  42.685
2     26.620  43.760  39.905
3     39.290  38.360  37.310
4     36.000  36.870  32.880
5     41.990  39.790  39.270
6     42.250  42.080  43.620
7     44.970  51.190  49.690
8     45.000  59.690  49.980
9     44.940  56.670  48.040
10    45.040  53.540  46.850
11    46.670  52.600  47.950
12    46.990  50.770  46.600
13    44.160  50.270  43.020
14    45.260  50.640  44.200
15    47.840  54.790  47.100
16    50.100  60.170  50.830
17    54.300  59.470  58.310
18    51.910  60.160  63.500
19    51.380  70.810  61.900
20    49.200  62.650  59.620
21    45.730  59.710  52.840
22    44.840  50.960  51.430
23    38.110  46.520  45.350
    

In your solution:

df1 = df.assign(hour=lambda x: x.index.hour).groupby('hour')
#removed ['hour'] for get means of all numeric columns
aa = df1.mean()

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to Vigges Developer Community for programmer and developer-Open, Learning and Share
...