Community-driven open-source LLM

@lily33 · edit-2 2 years ago

Community-driven open-source LLM

@xylogx · 2 years ago

Have a look at this paper from MS research -> https://www.microsoft.com/en-us/research/publication/orca-progressive-learning-from-complex-explanation-traces-of-gpt-4/

“Recent research has focused on enhancing the capability of smaller models through imitation learning, drawing on the outputs generated by large foundation models (LFMs). A number of issues impact the quality of these models, ranging from limited imitation signals from shallow LFM outputs; small scale homogeneous training data; and most notably a lack of rigorous evaluation resulting in overestimating the small model’s capability as they tend to learn to imitate the style, but not the reasoning process of LFMs. To address these challenges, we develop Orca, a 13-billion parameter model that learns to imitate the reasoning process of LFMs. Orca learns from rich signals from GPT-4 including explanation traces; step-by-step thought processes; and other complex instructions, guided by teacher assistance from ChatGPT. To promote this progressive learning, we tap into large-scale and diverse imitation data with judicious sampling and selection. Orca surpasses conventional state-of-the-art instruction-tuned models such as Vicuna-13B by more than 100% in complex zero-shot reasoning benchmarks like Big-Bench Hard (BBH) and 42% on AGIEval. Moreover, Orca reaches parity with ChatGPT on the BBH benchmark and shows competitive performance (4 pts gap with optimized system message) in professional and academic examinations like the SAT, LSAT, GRE, and GMAT, both in zero-shot settings without CoT; while trailing behind GPT-4. Our research indicates that learning from step-by-step explanations, whether these are generated by humans or more advanced AI models, is a promising direction to improve model capabilities and skills.”

@bbsm3678 · 2 years ago

To add to this, here is another model that seems to aim to be a poor man’s ChatGPT: https://huggingface.co/togethercomputer/GPT-NeoXT-Chat-Base-20B

@lily33 · 2 years ago

HuggingFace looks to me like it’s a corporation. Like, when I click on “about > join us”, I’m sent to their job offer page.

@[email protected] · 2 years ago

https://open-assistant.io/

@[email protected] · 2 years ago

The LMSYS group does some interesting benchmarks of a variety of LLMs: https://lmsys.org/blog/

@[email protected] · 2 years ago

https://ai.facebook.com/blog/large-language-model-llama-meta-ai/

@tehnomad · 2 years ago

There are a lot of people working on large language models. The fastest-performing ones are based on Llama, which is a leaked model from Facebook. There are many Llama-based models on huggingface.co. The best software to run them is oobabooga textgen UI or koboldcpp. The smaller models run pretty fast on recent Nvidia GPUs. Unfortunately, no LLM matches the performance of the ChatGPT models yet.

The best resources I’ve found are r/localllama on Reddit, Discord (the KoboldAI, TheBloke, and Oobabooga servers), and 4chan /lmg/.

@[email protected] · 1 year ago

Something under a copyleft (reciprocal) license would be good. Does anybody know if one exists?

circuitfarmer · 1 year ago

At this point, I’d like to see better regulation about usage of user data for training before this gets approached by the FOSS community. Ideally we should see a regulatory bloodbath where AI training data is concerned (using other people’s data or creation without explicit consent, and ultimately regurgitating that data, as LLMs do).

*I don’t think we’ll ever see sufficient regulation at all, but we should. Use of data in the way and quantities needed clearly calls for it, in my view.

@[email protected] · 2 years ago

I don’t know if this is exactly what you’re looking for, but Falcon LLM looks promising. I’ve never used it, but it may work.