
Toxic Subword Pruning for Dialogue Response Generation on Large Language Models

How to defend large language models (LLMs) against generating toxic content is an important research area. Yet most research has focused on model training techniques that remediate LLMs by updating their weights. A representative line of work is safety alignment, which, however, is often costly and tedious and can expose …
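The excerpt above only names the general idea of pruning toxic subwords rather than retraining the model. As a rough, hypothetical sketch of that idea (not the paper's actual method; the vocabulary, toxic list, and matching rule here are illustrative assumptions), one could mask the logits of toxic subword tokens at decoding time so they can never be emitted:

```python
import math

def build_toxic_token_ids(vocab, toxic_subwords):
    """Collect IDs of vocabulary entries matching a toxic subword list.

    `vocab` maps token string -> token id. Exact string matching is a
    simplification; a real system would handle BPE merge variants too.
    """
    toxic = set(toxic_subwords)
    return {tid for tok, tid in vocab.items() if tok in toxic}

def prune_logits(logits, toxic_ids):
    """Return a copy of `logits` with toxic token ids set to -inf,
    so neither sampling nor argmax can ever select them."""
    return [(-math.inf if i in toxic_ids else score)
            for i, score in enumerate(logits)]

# Toy vocabulary and logit values, purely for illustration.
vocab = {"hello": 0, "idiot": 1, "world": 2}
toxic_ids = build_toxic_token_ids(vocab, ["idiot"])
pruned = prune_logits([1.2, 3.5, 0.7], toxic_ids)
best = max(range(len(pruned)), key=pruned.__getitem__)
```

Here the highest-scoring token ("idiot") is masked out, so greedy decoding falls back to the next-best non-toxic token; no weight update is involved, which is the contrast the abstract draws with alignment-based training.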