This is a test of whether HRM (Hierarchical Reasoning Model) generalizes better than other architectures on vision tasks. A ViT-style pipeline is used: the image is split into patches, each patch is flattened, and the resulting sequence is passed to the HRM.
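The patchify-and-flatten step can be sketched roughly as follows. This is a minimal NumPy illustration, not the code used in this project: the function name `patchify`, the `(B, C, H, W)` layout, and the patch size of 7 are all assumptions chosen to fit 28x28 MNIST-sized images.

```python
import numpy as np

def patchify(images, patch_size):
    """Split (B, C, H, W) images into flattened, non-overlapping patches.

    Returns an array of shape (B, num_patches, C * patch_size * patch_size).
    """
    b, c, h, w = images.shape
    p = patch_size
    if h % p or w % p:
        raise ValueError("image height and width must be divisible by patch_size")
    gh, gw = h // p, w // p                 # patch grid size
    x = images.reshape(b, c, gh, p, gw, p)  # split H and W into (grid, patch)
    x = x.transpose(0, 2, 4, 1, 3, 5)       # (B, gh, gw, C, p, p)
    return x.reshape(b, gh * gw, c * p * p) # flatten each patch

# Hypothetical input: two single-channel 28x28 images.
images = np.arange(2 * 1 * 28 * 28, dtype=np.float32).reshape(2, 1, 28, 28)
tokens = patchify(images, patch_size=7)
print(tokens.shape)  # (2, 16, 49)
```

Each row of `tokens[i]` is one flattened patch, so the result can be fed to a sequence model as 16 tokens of dimension 49.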
-
I built my own HRM layer based on the paper, using a recurrent mechanism to model the high-level and low-level cycles.
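To illustrate the nested recurrence, here is a forward-only NumPy sketch of the two-timescale idea: a low-level state updated at every step, and a high-level state updated once per cycle. Everything in it is an assumption made for illustration, including the class name `TinyHRM`, the plain tanh cells (stand-ins, not the modules used in the paper or in this project), the cycle counts, and the mean pooling at the end. Tokens are also processed independently here, which a real layer would likely not do.

```python
import numpy as np

class TinyHRM:
    """Hypothetical two-timescale recurrent layer (forward pass only)."""

    def __init__(self, input_dim, hidden_dim, n_cycles=2, t_steps=3, seed=0):
        rng = np.random.default_rng(seed)
        scale = 1.0 / np.sqrt(hidden_dim)
        def init(rows, cols):
            return rng.normal(0.0, scale, (rows, cols))
        self.w_in = init(input_dim, hidden_dim)   # input -> low-level
        self.w_ll = init(hidden_dim, hidden_dim)  # low -> low
        self.w_hl = init(hidden_dim, hidden_dim)  # high -> low
        self.w_lh = init(hidden_dim, hidden_dim)  # low -> high
        self.w_hh = init(hidden_dim, hidden_dim)  # high -> high
        self.hidden_dim = hidden_dim
        self.n_cycles = n_cycles
        self.t_steps = t_steps

    def forward(self, x):
        # x: (B, N, input_dim), e.g. a sequence of flattened patches
        z_l = np.zeros(x.shape[:-1] + (self.hidden_dim,))
        z_h = np.zeros_like(z_l)
        x_proj = x @ self.w_in
        for _ in range(self.n_cycles):
            # Low-level state: updated every step, conditioned on z_h and input.
            for _ in range(self.t_steps):
                z_l = np.tanh(z_l @ self.w_ll + z_h @ self.w_hl + x_proj)
            # High-level state: updated once per cycle from the final z_l.
            z_h = np.tanh(z_h @ self.w_hh + z_l @ self.w_lh)
        return z_h.mean(axis=1)  # pool over tokens -> (B, hidden_dim)

model = TinyHRM(input_dim=49, hidden_dim=32)
features = model.forward(np.ones((2, 16, 49)))
print(features.shape)  # (2, 32)
```

A classifier head on top of `features` and a training loop would still be needed; they are left out to keep the recurrence itself visible.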
-
The initial results are promising: on a balanced subset of 1,000 MNIST samples, the smaller HRM model beats the larger ResNet model. When Arjun and I trained for more epochs, HRM did not appear to overfit.
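A balanced subset like the one described above could be drawn along these lines. This is a standard-library sketch under the assumption that "balanced" means an equal number of samples per class (100 for each of the 10 digits); the helper name `balanced_subset_indices` and the synthetic label list are hypothetical and stand in for real MNIST labels.

```python
import random
from collections import defaultdict

def balanced_subset_indices(labels, per_class, seed=0):
    """Return shuffled indices containing exactly `per_class` samples per label."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    chosen = []
    for label in sorted(by_class):
        pool = by_class[label]
        if len(pool) < per_class:
            raise ValueError(f"class {label} has only {len(pool)} samples")
        chosen.extend(rng.sample(pool, per_class))  # sample without replacement
    rng.shuffle(chosen)  # avoid class-ordered batches
    return chosen

# Synthetic stand-in for the MNIST label vector (10 classes).
labels = [i % 10 for i in range(5000)]
subset = balanced_subset_indices(labels, per_class=100)
print(len(subset))  # 1000
```

Fixing the seed keeps the subset identical across models, so the HRM and ResNet runs are compared on the same 1,000 samples.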
-
Our more recent findings suggest that HRM is good not only at generalization but also at robustness.
Could this be a new type of architecture that changes the world of deep learning?