Parallel and systolic structures for matrix algebra algorithms have been studied for a long time, and implementations exist for many numerical techniques. With the advent of reconfigurable logic, especially FPGAs, a need has arisen to revisit these architectures and produce resource-efficient versions of these algorithms. We present resource-efficient parallel architectures for LU decomposition and triangular matrix inversion, designed with the computational data rate requirements of real-time control systems in view. These architectures considerably reduce memory and logic resource usage while maintaining excellent clock periods. They can also be mapped onto one another, which further reduces resource usage and provides matrix multiplication as an additional capability.