

Poster

Pruner-Zero: Evolving Symbolic Pruning Metric From Scratch for Large Language Models

Peijie Dong · Lujun Li · Zhenheng Tang · Xiang Liu · Xinglin Pan · Qiang Wang · Xiaowen Chu


Abstract:

Despite their remarkable capabilities, Large Language Models (LLMs) face deployment challenges due to their extensive size and parameter redundancy. Pruning methods drop a subset of weights to accelerate inference, but many require retraining, which is prohibitively expensive and computationally demanding. Recently, several post-training pruning approaches have proposed various pruning metrics that allow LLMs to be pruned without retraining. However, designing these pruning metrics requires human experts and tedious trial and error. To efficiently discover superior pruning metrics, we develop a framework that automatically searches for symbolic pruning metrics via genetic programming. In particular, we deconstruct previous pruning metrics into a set of symbols and then devise an elaborate search space encompassing the existing pruning metrics to discover potential new symbolic pruning metrics. We propose an opposing-operation simplification strategy to increase the diversity of the population. In this way, our Pruner-Zero framework auto-generates symbolic pruning metrics without expert knowledge. Based on the search results, we explore the correlation between pruning metrics and post-pruning performance, and summarize several principles. Extensive experiments on LLaMA and LLaMA-2 across language modeling and zero-shot tasks demonstrate that Pruner-Zero outperforms SOTA post-training pruning methods, including SparseGPT and Wanda.
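The abstract describes evolving symbolic pruning metrics from a set of primitive symbols via genetic programming. The following is a minimal toy sketch of that idea, not the paper's actual method: the symbol set (weight magnitude, activation norm), the reconstruction-error fitness proxy, and all hyperparameters are illustrative assumptions, and a real system would evaluate candidate metrics on LLM perplexity rather than on random matrices.

```python
# Hedged sketch: toy genetic-programming search over symbolic pruning
# metrics. Symbols, fitness, and hyperparameters are assumptions for
# illustration only, not Pruner-Zero's actual search space.
import random
import numpy as np

random.seed(0)
np.random.seed(0)

# Terminal symbols: |W| (weight magnitude) and per-column input-activation
# norm, broadcast to the weight matrix's shape (in the spirit of Wanda).
W = np.random.randn(64, 64)
X = np.random.randn(128, 64)
TERMINALS = {
    "absW": np.abs(W),
    "actNorm": np.broadcast_to(np.linalg.norm(X, axis=0), W.shape),
}
BINOPS = {"add": np.add, "mul": np.multiply}

def random_tree(depth=2):
    """Sample a random expression tree over the symbol set."""
    if depth == 0 or random.random() < 0.3:
        return random.choice(list(TERMINALS))
    op = random.choice(list(BINOPS))
    return (op, random_tree(depth - 1), random_tree(depth - 1))

def evaluate(tree):
    """Evaluate an expression tree into a per-weight score matrix."""
    if isinstance(tree, str):
        return TERMINALS[tree]
    op, left, right = tree
    return BINOPS[op](evaluate(left), evaluate(right))

def fitness(tree, sparsity=0.5):
    """Toy fitness proxy: prune the lowest-scoring weights and measure
    how well the layer's output X @ W.T is preserved (negated error)."""
    score = evaluate(tree)
    k = int(sparsity * score.size)
    mask = np.ones_like(W)
    idx = np.unravel_index(np.argsort(score, axis=None)[:k], W.shape)
    mask[idx] = 0.0
    err = np.linalg.norm(X @ (W * mask).T - X @ W.T)
    return -float(err)

def mutate(tree):
    """Crude mutation: occasionally replace the tree with a fresh one."""
    return random_tree(depth=2) if random.random() < 0.5 else tree

# Evolve a small population for a few generations (elitism + mutation).
population = [random_tree() for _ in range(20)]
for _ in range(10):
    population.sort(key=fitness, reverse=True)
    population = population[:10] + [mutate(t) for t in population[:10]]
best = max(population, key=fitness)
```

A full implementation would also need crossover between parent trees and the simplification step the abstract mentions (canceling opposing operations such as `log`/`exp`) to keep the population diverse and the expressions compact.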
