```python

import pandas as pd

# load inputs
actives = pd.read_pickle("actives_final.pkl")
decoys = pd.read_pickle("decoys_final.pkl")

# stack tables
df = pd.concat([actives, decoys])

# remove duplicate index labels left by concat
# (reset_index() keeps the old index as a regular column)
df = df.reset_index()

# sort by tc and group by category and molId, selecting the last entry
# (i.e. the entry with the highest tc in each group)
ordered = (
    df.sort_values(by='tc')
    .groupby(['category', 'molId'])
    .last()
    .reset_index()
)

# shuffle and group by category and molId, selecting the last entry
# (an arbitrary entry per group, reproducible via the fixed random_state)
shuffled = (
    df.sample(frac=1, random_state=123456)
    .groupby(['category', 'molId'])
    .last()
    .reset_index()
)