Process variations are of great concern in modern technologies. Early prediction of their effects on the circuit performance and parametric yield is extremely useful. In today's microprocessors, custom designed transistor level macros and memory array macros, like caches, occupy a significant fraction of the total core area. While block-based statistical static timing analysis (SSTA) techniques are fast and can be used for analyzing cell based designs, they cannot be used for transistor level macros. Currently, such macros are either abstracted with statistical timing models which are less accurate or are analyzed using statistical Monte-Carlo circuit simulations which are time consuming. In this paper, we develop a fast and accurate flow that can be used to perform SSTA on large transistor and memory array macros. The delay distributions of paths obtained using our flow for a large, industrial, 45nm, transistor level macro have error of less than 6% compared to those obtained after rigorous Monte-Carlo SPICE simulations. The resulting flow enables full-chip SSTA, provides visibility into the macro even at the chip level, and eliminates the need to abstract the macros with statistical timing models.