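IT Home, April 21: The new version of the MOSS model, developed by the Natural Language Processing Lab at Fudan University, officially launched today, becoming the first plugin-augmented open-source conversational language model in China.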

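The MOSS model is now online and open source. The code, data, and model parameters are available on GitHub, Hugging Face, and other platforms for researchers to download.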

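According to the project, MOSS is an open-source conversational language model that supports both Chinese and English as well as multiple plugins. The moss-moon series of models has 16 billion parameters. At FP16 precision it runs on a single A100 / A800 or on two 3090 GPUs; at INT4/8 precision it runs on a single 3090 GPU. The MOSS base language model was pretrained on about 700 billion words of Chinese, English, and code. It then acquired multi-turn dialogue ability and the ability to use multiple plugins through dialogue instruction fine-tuning, plugin-augmented learning, and human preference training.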
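The hardware figures above are roughly what a back-of-the-envelope estimate of weight memory gives. The following is a minimal sketch, not the project's own sizing method: it counts only the bytes needed to store 16 billion parameters and ignores activations, the attention cache, and framework overhead. The per-card memory sizes (80 GB for the A100, 24 GB for the RTX 3090) are assumptions that do not appear in the article.

```python
# Rough weight-only memory estimate. Ignores activations, the attention
# cache, and framework overhead, so real usage will be higher.
PARAMS = 16e9  # 16 billion parameters, as stated in the article

# Storage cost per parameter at each precision.
BYTES_PER_PARAM = {"FP16": 2.0, "INT8": 1.0, "INT4": 0.5}

# Assumed total GPU memory in GB for each setup (not given in the article).
GPU_MEMORY_GB = {"1x A100 (80 GB)": 80, "2x RTX 3090": 48, "1x RTX 3090": 24}


def weight_memory_gb(params: float, bytes_per_param: float) -> float:
    """Return the memory needed to hold the weights, in GB (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


def setups_that_fit(need_gb: float) -> list[str]:
    """Return the assumed GPU setups whose total memory covers need_gb."""
    return [name for name, cap in GPU_MEMORY_GB.items() if cap >= need_gb]


for precision, nbytes in BYTES_PER_PARAM.items():
    need = weight_memory_gb(PARAMS, nbytes)
    fits = ", ".join(setups_that_fit(need))
    print(f"{precision}: ~{need:.0f} GB of weights; fits on: {fits}")
```

Under these assumptions the FP16 weights alone take about 32 GB, which exceeds a single 24 GB card but fits on two, while INT8 and INT4 weights take about 16 GB and 8 GB.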

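MOSS comes from Professor Qiu Xipeng's team at the Fudan University Natural Language Processing Lab. It shares its name with the AI in the film The Wandering Earth and has been released on a public platform (https://moss.fastnlp.top/), where the public is invited to take part in the closed beta.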

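IT Home checked the MOSS GitHub page and found that the project's code is licensed under Apache 2.0, its data under CC BY-NC 4.0, and its model weights under GNU AGPL 3.0. To use the project's models commercially or to deploy them publicly, you must sign a document and send it to robot@fudan.edu.cn to obtain authorization. Commercial use is recorded only, and no fee is charged.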

MOSS use cases:
▲ Solving equations
▲ Generating images
▲ Harmlessness test

Models

moss-moon-003-base: the MOSS-003 base model, obtained by self-supervised pretraining on high-quality Chinese and English corpora. The pretraining corpus contains about 700 billion words, and training took about 6.67×10^22 floating-point operations.
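The reported compute can be sanity-checked against the corpus size and the 16 billion parameter count given earlier in the article. The sketch below uses the common rule of thumb that training a dense transformer costs about 6 floating-point operations per parameter per token. That rule is an assumption brought in for illustration, not something the article or the project states, and it treats the "700 billion words" as the token count.

```python
# Sanity check of the reported pretraining compute using the rule of thumb
# C ≈ 6 * N * D. The factor of 6 is an assumed approximation, not a figure
# taken from the article.
N_PARAMS = 16e9           # model parameters (from the article)
N_TOKENS = 700e9          # pretraining corpus size (from the article)
REPORTED_FLOPS = 6.67e22  # compute reported for the base model


def estimate_training_flops(n_params: float, n_tokens: float) -> float:
    """Approximate training compute as 6 FLOPs per parameter per token."""
    return 6.0 * n_params * n_tokens


estimate = estimate_training_flops(N_PARAMS, N_TOKENS)
ratio = REPORTED_FLOPS / estimate

print(f"estimated: {estimate:.3e} FLOPs")
print(f"reported:  {REPORTED_FLOPS:.3e} FLOPs")
print(f"reported / estimated = {ratio:.3f}")
```

The estimate comes to about 6.72×10^22 operations, within one percent of the reported figure, so the three numbers in the article are mutually consistent under this approximation.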

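moss-moon-003-sft: obtained by fine-tuning the base model on about 1.1 million multi-turn dialogues. It can follow instructions, hold multi-turn conversations, and refuse harmful requests.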

moss-moon-003-sft-plugin: obtained by fine-tuning the base model on about 1.1 million multi-turn dialogues and about 300 thousand plugin-augmented multi-turn dialogues. On top of the abilities of moss-moon-003-sft, it can use four plugins: a search engine, text-to-image generation, a calculator, and an equation solver.

moss-moon-003-pm: a preference model trained on preference feedback data collected from responses generated by moss-moon-003-sft. It will be released soon.

moss-moon-003: the final MOSS-003 model, trained on top of moss-moon-003-sft. It has better factuality and safety and more stable response quality. It will be released soon.

moss-moon-003-plugin: the final MOSS-003 plugin model, trained on top of moss-moon-003-sft-plugin with preference modeling. It has stronger intent understanding and plugin-use abilities. It will be released soon.

Data

moss-002-sft-data: a multi-turn dialogue dataset covering three aspects of conversation quality: helpfulness, honesty, and harmlessness. It contains about 570,000 English dialogues and about 590,000 Chinese dialogues generated by text-davinci-003.

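moss-003-sft-data: the multi-turn dialogue data used by moss-moon-003-sft, built from about 100 thousand user inputs collected during the MOSS-002 closed beta and from gpt-3.5-turbo. Compared with moss-002-sft-data, moss-003-sft-data matches the distribution of real user intent more closely, and it includes finer-grained helpfulness category labels, broader harmlessness data, and longer dialogues. It contains about 1.1 million dialogues. Only a small sample is open source for now; the full dataset will be released soon.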

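moss-003-sft-plugin-data: the plugin-augmented multi-turn dialogue data used by moss-moon-003-sft-plugin. It contains about 300 thousand multi-turn dialogues covering four plugins: a search engine, text-to-image generation, a calculator, and an equation solver. Only a small sample is open source for now; the full dataset will be released soon.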

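moss-003-pm-data: the preference data used by moss-moon-003-pm. It contains preference comparison data built from about 180 thousand additional dialogue contexts and the responses that moss-moon-003-sft generated for them. It will be released soon.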

MOSS GitHub page: click here to view
