ISCA 2014: CACM Research Highlights Suggestions

* 1. Your name (this will only be visible to the program chair and is used only to ensure that an individual cannot unfairly vote multiple times):

* 2. ISCA 2014 will forward a few papers for consideration for CACM Research Highlights (RH). Final nominations are made by a SIGARCH subcommittee using their specific procedures. RH nominations are forward-looking, thought-provoking, and of interest to a broad audience. They may be imperfect papers, but create significant excitement and spirited discussions at the conference. We seek input from the ISCA'14 audience to identify such papers.

Please select 1 to 4 papers below that you feel satisfy the above RH criteria. Please vote at the end of the conference and vote only once. You are free to vote for a paper of which you are an author or with which you have a conflict of interest.

Unifying on-chip and inter-node switching within the Anton 2 network
A Reconfigurable Fabric for Accelerating Large-Scale Datacenter Services
SCORPIO: A 36-Core Research Chip Demonstrating Snoopy Coherence on a Scalable Mesh NoC with In-Network Ordering
Avoiding Core's DUE & SDC via Acoustic Wave Detectors and Tailored Error Containment and Recovery
MemGuard: A Low Cost and Energy Efficient Design to Support and Enhance Memory System Reliability
GangES: Gang Error Simulation for Hardware Resiliency Evaluation
Real-World Design and Evaluation of Compiler-Managed GPU Redundant Multithreading
ArchRanker: A Ranking Approach to Design Space Exploration
Aladdin: A Pre-RTL, Power-Performance Accelerator Simulator Enabling Large Design Space Exploration of Customized Architectures
SynFull: Synthetic Traffic Models Capturing Cache Coherent Behaviour
Harnessing ISA Diversity: Design of a Heterogeneous-ISA Chip Multiprocessor
The Direct-to-Data (D2D) Cache: Navigating the Cache Hierarchy with a Single Lookup
SC^2: A Statistical Compression Cache Scheme
The Dirty-Block Index
Going Vertical in Memory Management: Handling Multiplicity by Multi-policy
Fine-grain Task Aggregation and Coordination on GPUs
Enabling Preemptive Multiprogramming on GPUs
Single-Graph Multiple Flows: Energy Efficient Design Alternative for GPGPUs
HELIX-RC: An Architecture-Compiler Co-Design for Automatic Parallelization of Irregular Programs
Efficient Digital Neurons for Large Scale Cortical Architectures
An Examination of the Architecture and System-level Tradeoffs of Employing Steep Slope Devices in 3D CMPs
STAG: Spintronic-Tape Architecture for GPGPU Cache Hierarchies
Memory Persistency
Reducing Access Latency of MLC PCMs through Line Striping
HIOS: A Host Interface I/O Scheduler for Solid State Disks
Towards Energy Proportionality for Large-Scale Latency-Critical Workloads
SleepScale: Runtime Joint Speed Scaling and Sleep States Management for Power Efficient Data Centers
Optimizing Virtual Machine Consolidation Performance on NUMA Server Architecture for Cloud Workloads
Row-Buffer Decoupling: A Case for Low-Latency DRAM Microarchitecture
Half-DRAM: a High-bandwidth and Low-power DRAM Architecture from the Rethinking of Fine-grained Activation
Flipping Bits in Memory Without Accessing Them: An Experimental Study of DRAM Disturbance Errors
Architecture Implications of Pads as a Scarce Resource
Increasing Off-Chip Bandwidth in Multi-Core Processors with Switchable Pins
A Low Power and Reliable Charge Pump Design for Phase Change Memories
Fractal++: Closing the Performance Gap between Fractal and Conventional Coherence
OmniOrder: Directory-Based Conflict Serialization of Transactions
Pacifier: Record and Replay for Relaxed-Consistency Multiprocessors with Distributed Directory Protocol
Replay Debugging: Leveraging Record and Replay for Program Debugging
The CHERI capability model: Revisiting RISC in an age of risk
CODOMs: Protecting Software with Code-centric Memory Domains
EOLE: Paving the Way for an Effective Implementation of Value Prediction
Improving the Energy Efficiency of Big Cores
General-Purpose Code Acceleration with Limited-Precision Analog Computation
Race Logic: A Hardware Acceleration for Dynamic Programming Algorithms
Eliminating Redundant Fragment Shader Executions on a Mobile GPU via Hardware Memoization
WebCore: Architectural Support for Mobile Web Browsing