

Poster

Modeling Language Tokens as Functionals of Semantic Fields

Zhengqi Pei · Anran Zhang · Shuhui Wang · Qingming Huang


Abstract: Recent advances in natural language processing have relied heavily on Transformer-based language models. However, Transformers often require large parameter counts and considerable model depth. Existing Transformer-free approaches using state-space models demonstrate advantages over Transformers, yet they still lack a neuro-biological connection to the human brain. This paper proposes ${\it LasF}$, representing ${\bf L}$anguage tokens as ${\bf F}$unctionals of semantic fields, to simulate neuronal behaviors for better language modeling. The resulting ${\it LasF}$ module is equivalent to a nonlinear approximator tailored for sequential data. By replacing the final neural layer of pre-trained language models with the ${\it LasF}$ module, we obtain ${\it LasF}$-based models. Experiments on standard reading comprehension and question-answering tasks demonstrate that the ${\it LasF}$-based models consistently improve accuracy with fewer parameters. In addition, we use CommonsenseQA's blind test set to evaluate a full-parameter-tuned ${\it LasF}$-based model, which outperforms the prior best ensemble and single models by $0.4\%$ and $3.1\%$, respectively. Furthermore, our ${\it LasF}$-only language model trained from scratch outperforms existing parameter-efficient methods on language modeling on standard datasets such as WikiText-103 and Penn Treebank.
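The abstract describes swapping the final neural layer of a pre-trained language model for the ${\it LasF}$ module. The sketch below only illustrates that swap-in mechanism: `LasFStyleHead` is a hypothetical placeholder head (a simple nonlinear projection over token states), not the paper's actual functional-of-semantic-fields construction, and the backbone choice (`bert-base-uncased`) and label count are assumptions for illustration.

```python
# Minimal sketch, assuming PyTorch + Hugging Face transformers.
# LasFStyleHead is a hypothetical stand-in, NOT the paper's LasF module;
# it only shows where such a module replaces the final neural layer.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer


class LasFStyleHead(nn.Module):
    """Placeholder nonlinear head over per-token hidden states."""

    def __init__(self, hidden_size: int, num_labels: int):
        super().__init__()
        self.proj = nn.Linear(hidden_size, hidden_size)
        self.out = nn.Linear(hidden_size, num_labels)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # hidden_states: (batch, seq_len, hidden_size)
        x = torch.tanh(self.proj(hidden_states)).mean(dim=1)  # pool over tokens
        return self.out(x)                                    # (batch, num_labels)


tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
backbone = AutoModel.from_pretrained("bert-base-uncased")      # pre-trained encoder
head = LasFStyleHead(backbone.config.hidden_size, num_labels=5)  # replaces the stock final layer

inputs = tokenizer("Which answer best fits the question?", return_tensors="pt")
with torch.no_grad():
    hidden = backbone(**inputs).last_hidden_state              # (1, seq_len, 768)
    logits = head(hidden)                                       # (1, 5)
```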
