Starting point
I have a DataFrame df which has a three-level MultiIndex. The innermost level is a datetime.
                                   value    data_1  data_2  data_3  data_4
id_1     id_2  effective_date
ADH10685 CA1P0 2018-07-31       0.000048  17901701    3mra  Actual  198.00
               2018-08-31       0.000048  17901701    3mra  Actual  198.00
         CB0N0 2018-07-31       4.010784  17901701    3mra  Actual    0.01
               2018-08-31       2.044298  17901701    3mra  Actual    0.01
               2018-10-31      11.493831  17901701    3mra  Actual    0.01
               2018-11-30      13.929844  17901701    3mra  Actual    0.01
               2018-12-31      21.500490  17901701    3mra  Actual    0.01
         CB0P0 2018-07-31      22.389493  17901701    3mra  Actual    0.03
               2018-08-31      23.600726  17901701    3mra  Actual    0.03
               2018-09-30      45.105458  17901701    3mra  Actual    0.03
               2018-10-31      32.249056  17901701    3mra  Actual    0.03
               2018-11-30      60.790889  17901701    3mra  Actual    0.03
               2018-12-31      46.832914  17901701    3mra  Actual    0.03
You can recreate this DataFrame with the following code:
import pandas as pd

df = pd.DataFrame({
    'id_1': ['ADH10685','ADH10685','ADH10685','ADH10685','ADH10685','ADH10685','ADH10685','ADH10685','ADH10685','ADH10685','ADH10685','ADH10685','ADH10685'],
    'id_2': ['CA1P0','CA1P0','CB0N0','CB0N0','CB0N0','CB0N0','CB0N0','CB0P0','CB0P0','CB0P0','CB0P0','CB0P0','CB0P0'],
    'effective_date': ['2018-07-31', '2018-08-31', '2018-07-31', '2018-08-31', '2018-10-31', '2018-11-30', '2018-12-31', '2018-07-31', '2018-08-31', '2018-09-30', '2018-10-31', '2018-11-30', '2018-12-31'],
    'value': [0.000048, 0.000048, 4.010784, 2.044298, 11.493831, 13.929844, 21.500490, 22.389493, 23.600726, 45.105458, 32.249056, 60.790889, 46.832914],
    'data_1': [17901701,17901701,17901701,17901701,17901701,17901701,17901701,17901701,17901701,17901701,17901701,17901701,17901701],
    'data_2': ['3mra','3mra','3mra','3mra','3mra','3mra','3mra','3mra','3mra','3mra','3mra','3mra','3mra'],
    'data_3': ['Actual','Actual','Actual','Actual','Actual','Actual','Actual','Actual','Actual','Actual','Actual','Actual','Actual'],
    'data_4': [198.00, 198.00, 0.01, 0.01, 0.01, 0.01, 0.01, 0.03, 0.03, 0.03, 0.03, 0.03, 0.03],
})
# Parse the date strings so the innermost index level is a datetime
df['effective_date'] = pd.to_datetime(df['effective_date'])
# Build the three-level MultiIndex (each key combination is unique, so first() keeps every row)
df = df.groupby(['id_1', 'id_2', 'effective_date']).first()
Desired outcome
The date range I am interested in is 2018-07-31 to 2018-12-31. For each combination of id_1 and id_2, I want to resample on value.
For ('ADH10685', 'CA1P0'), I want to get 0 values from September to December. For CB0N0, I want to set September to 0, and for CB0P0, I want to change nothing.
                                   value    data_1  data_2  data_3  data_4
id_1     id_2  effective_date
ADH10685 CA1P0 2018-07-31       0.000048  17901701    3mra  Actual  198.00
               2018-08-31       0.000048  17901701    3mra  Actual  198.00
               2018-09-30       0.000000  17901701    3mra  Actual  198.00
               2018-10-31       0.000000  17901701    3mra  Actual  198.00
               2018-11-30       0.000000  17901701    3mra  Actual  198.00
               2018-12-31       0.000000  17901701    3mra  Actual  198.00
         CB0N0 2018-07-31       4.010784  17901701    3mra  Actual    0.01
               2018-08-31       2.044298  17901701    3mra  Actual    0.01
               2018-09-30       0.000000  17901701    3mra  Actual    0.01
               2018-10-31      11.493831  17901701    3mra  Actual    0.01
               2018-11-30      13.929844  17901701    3mra  Actual    0.01
               2018-12-31      21.500490  17901701    3mra  Actual    0.01
         CB0P0 2018-07-31      22.389493  17901701    3mra  Actual    0.03
               2018-08-31      23.600726  17901701    3mra  Actual    0.03
               2018-09-30      45.105458  17901701    3mra  Actual    0.03
               2018-10-31      32.249056  17901701    3mra  Actual    0.03
               2018-11-30      60.790889  17901701    3mra  Actual    0.03
               2018-12-31      46.832914  17901701    3mra  Actual    0.03
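For reference, the six dates every (id_1, id_2) pair should end up with can be listed explicitly. This is only a small sketch; the name full_range is mine, and it assumes all effective dates fall on month ends, as they do in the sample.

```python
import pandas as pd

# Month-end dates covering the range of interest (both endpoints are month ends).
# An offset object is used instead of a string alias because the month-end
# alias changed from 'M' to 'ME' in pandas 2.2.
full_range = pd.date_range('2018-07-31', '2018-12-31', freq=pd.offsets.MonthEnd())

# One entry per month from July to December 2018
print(list(full_range.strftime('%Y-%m-%d')))
```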
What I've tried
I've asked a couple of questions [1] [2] related to this subject, so I have a sense of how to set the upper and lower limits for the dates and how to resample while keeping the non-value Series intact.
I have developed the following code, which works if I hardcode slicing each level.
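import numpy as np
import pandas as pd

min_date = pd.Timestamp('2018-07-31')
max_date = pd.Timestamp('2018-12-31')
# Slice to a specific combination of id_1 and id_2; copy so df itself is not modified
s = df.loc[('ADH10685', 'CA1P0')].copy()
# Add placeholder rows at the boundaries so that resample covers the whole range
if min_date not in s.index:
    s.loc[min_date] = np.nan
if max_date not in s.index:
    s.loc[max_date] = np.nan
# 'M' is the month-end alias ('ME' from pandas 2.2 onward); missing months get
# value 0, and the remaining columns are filled from neighbouring rows
s.sort_index().resample('M').first().fillna({'value': 0}).ffill().bfill()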
I am looking for guidance on how best to go through a large DataFrame and apply this logic to each (id_1, id_2) pair. I would also like to clean up the sample code above and make it more efficient.
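To make the goal concrete, here is a rough sketch of the kind of per-pair logic I am after. I have not validated it on the full data. The helper name fill_missing_months is hypothetical, the small df is a stand-in built from a few rows of the sample above, and the sketch assumes that every date is a month end and that rows outside the date range can be dropped.

```python
import pandas as pd

# Stand-in for df: same index layout, a few rows and columns from the sample.
df = pd.DataFrame({
    'id_1': ['ADH10685'] * 5,
    'id_2': ['CA1P0', 'CA1P0', 'CB0N0', 'CB0N0', 'CB0N0'],
    'effective_date': pd.to_datetime(
        ['2018-07-31', '2018-08-31', '2018-07-31', '2018-08-31', '2018-10-31']),
    'value': [0.000048, 0.000048, 4.010784, 2.044298, 11.493831],
    'data_4': [198.00, 198.00, 0.01, 0.01, 0.01],
}).set_index(['id_1', 'id_2', 'effective_date'])


def fill_missing_months(frame, min_date, max_date):
    """Hypothetical helper: one row per month end for every (id_1, id_2) pair."""
    dates = pd.date_range(min_date, max_date, freq=pd.offsets.MonthEnd())
    # Unique (id_1, id_2) pairs present in the data
    pairs = frame.index.droplevel('effective_date').unique()
    # Every pair combined with every month end in the range
    full_index = pd.MultiIndex.from_tuples(
        [(id_1, id_2, d) for id_1, id_2 in pairs for d in dates],
        names=frame.index.names,
    )
    out = frame.reindex(full_index)        # missing months become NaN rows
    out['value'] = out['value'].fillna(0)  # missing values are set to 0
    keys = ['id_1', 'id_2']
    # Fill the remaining columns within each pair, forwards then backwards
    return out.groupby(level=keys).ffill().groupby(level=keys).bfill()


result = fill_missing_months(df, '2018-07-31', '2018-12-31')
print(result)
```

Reindexing once against a full index avoids slicing each pair by hand, but integer columns such as data_1 would come back as floats because of the intermediate NaN rows.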