Researchers from Microsoft and UC Santa Barbara Propose LONGMEM: An AI Framework that Enables LLMs to Memorize Long History

www.marktechpost.com

@megaman1970 to

Singularity • 2 years ago

In this paper, authors from UCSB and Microsoft Research propose the LONGMEM framework, which lets a language model cache long-form prior context or knowledge in a non-differentiable memory bank and draw on it through a decoupled memory module, avoiding the memory staleness problem. To decouple the memory, they introduce a novel residual side network (SideNet). A frozen backbone LLM extracts paired attention keys and values from the previous context and writes them to the memory bank. In the SideNet’s memory-augmented layer, the attention query for the current input is then used to retrieve the cached keys and values of earlier contexts. The retrieved memory augmentations are fused into the hidden states via a joint attention process.
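To make the cache-then-retrieve idea concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the names `MemoryBank` and `joint_attention`, the first-in-first-out eviction, the dot-product top-k retrieval, and the single softmax over local plus retrieved keys are all simplifying assumptions made for illustration. The paper's actual retrieval and fusion details may differ.

```python
import math
from collections import deque


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class MemoryBank:
    """Hypothetical fixed-size cache of (key, value) pairs.

    Assumption: the oldest entries are evicted first once capacity is reached.
    Nothing here is trained, which mirrors the "non-differentiable" aspect.
    """

    def __init__(self, capacity):
        self.entries = deque(maxlen=capacity)

    def write(self, keys, values):
        # In the described setup, a frozen backbone would produce these pairs.
        self.entries.extend(zip(keys, values))

    def retrieve(self, query, top_k):
        # Assumption: rank cached keys by dot-product similarity to the query.
        ranked = sorted(self.entries, key=lambda kv: dot(query, kv[0]), reverse=True)
        return ranked[:top_k]


def joint_attention(query, local_keys, local_values, bank, top_k=2):
    """Attend over the current context and the retrieved memory together."""
    retrieved = bank.retrieve(query, top_k)
    keys = list(local_keys) + [k for k, _ in retrieved]
    values = list(local_values) + [v for _, v in retrieved]
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, k) / scale for k in keys])
    dim = len(values[0])
    # Weighted sum of value vectors, one output coordinate at a time.
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]


# Usage: cache two earlier key/value pairs, then attend with a new query.
bank = MemoryBank(capacity=4)
bank.write(keys=[[1.0, 0.0], [0.0, 1.0]], values=[[10.0, 0.0], [0.0, 10.0]])
out = joint_attention([1.0, 0.0], [[0.5, 0.5]], [[1.0, 1.0]], bank, top_k=1)
```

In this toy run the query matches the first cached key, so its value is retrieved and pulls the first coordinate of `out` well above what the local context alone would give.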

Paper:

Augmenting Language Models with Long-Term Memory

