Workload Characterization for Memory Management in Emerging Embedded Platforms
Abstract
Memory has become a primary performance and energy bottleneck for emerging embedded platforms that integrate heterogeneous compute units. Applications require a balance between performance and energy efficiency, and finding the optimal operating point on embedded platforms is challenging. Many opportunities exist to manage the memory subsystem efficiently at runtime, saving energy without compromising quality in the face of dynamic workloads. Previous works have used memory bandwidth utilization to determine memory requirements and to develop runtime policies that configure system knobs (e.g., memory controller frequency) accordingly. However, bandwidth utilization as a single metric is not always sufficient: policies for a range of workload scenarios require insight into an application’s memory access pattern and working set size. Memory profilers, by contrast, provide fine-grained information such as the memory access pattern for the entire virtual address space and the load/store density of different memory regions, but parsing this detailed information frequently at runtime induces excessive overhead. In this work, we propose a profiling mechanism that considers both (1) the working set size of running workloads and (2) memory bandwidth utilization to compute the Working Set Size-Bandwidth Product (WBP). WBP can be estimated with low overhead, and the combined metric provides insights that runtime policies can use to determine desirable configurations for specific workload scenarios. Our early results show that a static configuration devised with this metric yields an optimal memory controller frequency for 8 out of 10 PARSEC workloads, demonstrating the promise of this approach.
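To make the combined metric concrete, the following is a minimal sketch, assuming WBP is the plain product of working set size and bandwidth utilization, as its name suggests. The units (MiB, utilization as a fraction), the `Sample` type, the `pick_frequency_mhz` helper, and the thresholds and frequencies in `LEVELS` are all hypothetical and chosen only for illustration; they are not taken from our measurements.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class Sample:
    """One profiling interval (field names and units are assumptions)."""
    working_set_mib: float   # estimated working set size, in MiB
    bandwidth_util: float    # memory bandwidth utilization, 0.0 to 1.0


def wbp(sample: Sample) -> float:
    """Working Set Size-Bandwidth Product, taken here as a plain product."""
    if sample.working_set_mib < 0:
        raise ValueError("working_set_mib must be non-negative")
    if not 0.0 <= sample.bandwidth_util <= 1.0:
        raise ValueError("bandwidth_util must be in [0, 1]")
    return sample.working_set_mib * sample.bandwidth_util


def pick_frequency_mhz(value: float, levels: Sequence[Tuple[float, int]]) -> int:
    """Return the frequency of the first level whose upper WBP bound covers value.

    levels must be sorted by ascending bound; the last entry is the fallback.
    """
    for bound, freq_mhz in levels:
        if value <= bound:
            return freq_mhz
    return levels[-1][1]


# Hypothetical (WBP upper bound, memory controller frequency in MHz) pairs.
LEVELS = [(16.0, 400), (128.0, 800), (float("inf"), 1600)]

# Two illustrative workloads: a small, bandwidth-hungry one and a large one.
small_hot = Sample(working_set_mib=8.0, bandwidth_util=0.9)
large_hot = Sample(working_set_mib=512.0, bandwidth_util=0.5)

for name, s in (("small_hot", small_hot), ("large_hot", large_hot)):
    print(name, wbp(s), pick_frequency_mhz(wbp(s), LEVELS))
```

In this sketch, bandwidth utilization alone would rank `small_hot` above `large_hot`, whereas the product ranks them the other way around; this is the kind of distinction the combined metric is meant to expose to a runtime policy.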