Adaptive Pruning for Large Language Models with Structural Importance
Awareness
Recent advances in large language models (LLMs) have significantly improved language understanding and generation capabilities. However, deploying LLMs on resource-constrained edge devices is difficult because of their high computational and storage demands. To address this issue, we propose a novel LLM pruning method, namely structurally-aware …