The rapid advance in communication technology brings a request for cryptography systems of higher performance.We systematically implement and compare several variants of partly parallel systolic architecture for Montgomery multiplier with different bit length as well as with different micro architectural approaches.The optimal options are chosen to take advantage of the underlying technology.The result analysis shows that the fully serial systolic architecture
in which one cell processes one bit
achieves the best performance.When the resource overhead is represented as area-time product
it is one of the most cost-efficient designs as well.