Mengzi GPT

Introduction to Technology

Mengzi GPT is a large language model built on Langboat's in-house technology and trained in three stages: pre-training, supervised fine-tuning (SFT), and alignment. It handles multilingual and multimodal data and supports a range of text comprehension and generation tasks across different domains and application scenarios. The Mengzi models are based on the Transformer architecture and come in sizes from 1B and 10B up to 100B parameters. They were trained on tens of trillions of tokens of high-quality corpora drawn from web pages, online communities, news, books, e-commerce sites, finance sites, and other sources. Mengzi is a well-known model brand in China and has achieved excellent results on Chinese LLM benchmarks such as C-Eval and SuperCLUE. The Mengzi model was registered with the Cyberspace Administration of China as a generative AI service by the end of 2023 and is now officially open to the public.

In addition to GPT, we have also developed language models based on the BERT and T5 architectures, which are widely used in our information extraction and machine translation products.

Technical Solutions

Support Multiple Model Architectures

  • Autoregressive models, such as GPT
  • Autoencoding models, such as BERT
  • Encoder-decoder models, such as T5

Lightweight Model Performance Enhancement

  • Fusion of multiple pre-training tasks
  • SMART adversarial training
  • Knowledge distillation
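
As an illustration of the knowledge distillation item above, here is a minimal sketch of the widely used soft-label formulation, in which a small student model is trained to match the temperature-softened output distribution of a larger teacher. The page does not say which distillation variant Langboat uses, so the function names, the temperature value, and the toy logits are all assumptions made for this example.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities, softened by a temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(p * math.log(p / q) for p, q in zip(p_teacher, p_student))
    return kl * temperature ** 2


# Toy logits (hypothetical): one student agrees with the teacher, one does not.
teacher = [4.0, 1.0, 0.5]
close_student = [3.5, 1.2, 0.4]
far_student = [0.5, 1.0, 4.0]

print("close student loss:", distillation_loss(close_student, teacher))
print("far student loss:  ", distillation_loss(far_student, teacher))
```

A student whose predictions track the teacher's gets a loss near zero; in practice this term is usually combined with the ordinary task loss on labeled data.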

Knowledge Graph Based Enhancement

  • Enhancements with entity extraction
  • Knowledge graph enhancement (is-a relationships)
  • Knowledge graph to text conversion

Linguistic Knowledge Based Enhancement

  • Mask mechanism enhanced by syntactic information
  • Semantic role embedding enhancement
  • Dependency-constrained pruning of attention weights

Few-Shot/Zero-Shot Learning

  • Prompt template construction
  • Multi-task learning techniques
  • Out-of-the-box support for common information extraction scenarios
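
To make the prompt template item above concrete, the sketch below assembles a zero-shot or few-shot prompt from an instruction, optional labeled examples, and a query. The template layout ("Text:" / "Label:") and the function name `build_prompt` are assumptions for illustration only; they are not taken from the Mengzi API.

```python
def build_prompt(instruction, examples, query):
    """Assemble a few-shot prompt; an empty examples list gives a zero-shot prompt."""
    parts = [instruction.strip()]
    for text, label in examples:
        # Each labeled example demonstrates the expected input/output format.
        parts.append(f"Text: {text}\nLabel: {label}")
    # The query is left with an open label for the model to complete.
    parts.append(f"Text: {query}\nLabel:")
    return "\n\n".join(parts)


instruction = "Classify the sentiment of the text as positive, negative, or neutral."
examples = [
    ("The delivery was fast and the product works well.", "positive"),
    ("The screen cracked after two days.", "negative"),
]

print(build_prompt(instruction, [], "Battery life is average."))        # zero-shot
print("-" * 40)
print(build_prompt(instruction, examples, "Battery life is average."))  # few-shot
```

Keeping the template in one function makes it easy to switch between zero-shot and few-shot use without changing the calling code.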

Retrieval Based Enhancement

  • Knowledge decoupling
  • Strong interpretability
  • External knowledge components are updated in real time
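
The three properties listed above can be pictured with a toy retrieval setup: knowledge lives in a store outside the model (decoupling), the retrieved passages are returned alongside the prompt (interpretability), and new passages are searchable as soon as they are added (real-time updates). This is only a sketch under simple assumptions: the `KnowledgeStore` class, the word-overlap scoring, and the prompt layout are hypothetical and do not describe Langboat's actual retrieval component.

```python
class KnowledgeStore:
    """A toy external knowledge base, kept separate from model parameters."""

    def __init__(self):
        self.passages = []

    def add(self, passage):
        # New knowledge is available to the next search immediately.
        self.passages.append(passage)

    def search(self, query, top_k=1):
        """Rank passages by the number of lowercase words shared with the query."""
        query_tokens = set(query.lower().split())
        scored = []
        for passage in self.passages:
            overlap = len(query_tokens & set(passage.lower().split()))
            if overlap:
                scored.append((overlap, passage))
        scored.sort(key=lambda item: item[0], reverse=True)
        return [passage for _, passage in scored[:top_k]]


def build_grounded_prompt(store, question):
    """Return a prompt plus the evidence used, so the answer can be traced."""
    evidence = store.search(question, top_k=2)
    context = "\n".join(f"- {p}" for p in evidence) or "- (no evidence found)"
    prompt = f"Evidence:\n{context}\n\nQuestion: {question}\nAnswer:"
    return prompt, evidence


store = KnowledgeStore()
store.add("Mengzi is developed by Langboat")
prompt, evidence = build_grounded_prompt(store, "who developed mengzi")
print(prompt)
print("evidence used:", evidence)
```

A production system would replace the word-overlap scoring with a proper sparse or dense retriever, but the separation between store and model stays the same.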

Technical Advantages

It outperforms conventional models on multiple tasks.

It supports BERT, GPT, T5, and other architectures, covering different scenarios.

It accepts both image and text input, which helps with tasks that combine the two.

It supports rapid optimization for vertical domains and offers models ranging from 10M to 1B parameters.

C-Eval Leaderboard

*Ranking as of August, 2023

#  Model          Creator                      Submission Date  Avg   Avg(Hard)  STEM  Social Science  Humanities  Others
0  Mengzi         Langboat                     2023/8/25        71.5  48.8       62.3  87.2            76.8        68.6
1  ChatGLM2       Tsinghua & Zhipu.AI          2023/6/25        71.1  50         64.4  81.6            73.7        71.3
2  InternLM-123B  Shanghai AI Lab & SenseTime  2023/8/22        68.8  50         63.5  81.4            72.7        63
3  GPT-4*         OpenAI                       2023/5/15        68.7  54.9       67.1  77.6            64.5        67.8
4  AiLMe-100B v2  APUS                         2023/7/25        67.7  55.3       65.4  72.3            71.2        64

CLUE Leaderboards

*Ranking as of July 30, 2021

Ranking  Model        Scale  Total Score  AFQMC  TNEWS  IFLYTEK  OCNLI  WSC2020  CSL    CMRC2018  CHID   C3
1        Mengzi       1B     82.90        79.82  64.68  65.08    81.87  96.55    89.87  82.25     96.00  89.98
2        Motian       1B     82.15        78.30  57.42  65.46    84.97  94.83    90.17  85.30     94.43  88.49
3        BETRTSG      10B    81.80        79.85  57.42  64.54    85.93  95.17    89.00  83.80     93.06  87.44
-        Human Level  -      86.68        81.00  71.00  80.30    90.30  98.00    84.00  92.40     87.10  96.00
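
The Total Score column in the CLUE table is consistent with a plain unweighted mean of the nine task scores. The short check below recomputes that mean from the figures listed on this page; the averaging rule is inferred from the numbers, not stated by the source, and `task_mean` is a helper name introduced here.

```python
# Scores copied from the CLUE table: (reported total, nine task scores in
# the order AFQMC, TNEWS, IFLYTEK, OCNLI, WSC2020, CSL, CMRC2018, CHID, C3).
scores = {
    "Mengzi": (82.90, [79.82, 64.68, 65.08, 81.87, 96.55, 89.87, 82.25, 96.00, 89.98]),
    "Motian": (82.15, [78.30, 57.42, 65.46, 84.97, 94.83, 90.17, 85.30, 94.43, 88.49]),
    "BETRTSG": (81.80, [79.85, 57.42, 64.54, 85.93, 95.17, 89.00, 83.80, 93.06, 87.44]),
    "Human Level": (86.68, [81.00, 71.00, 80.30, 90.30, 98.00, 84.00, 92.40, 87.10, 96.00]),
}


def task_mean(task_scores):
    """Unweighted mean of the per-task scores."""
    return sum(task_scores) / len(task_scores)


for model, (reported_total, tasks) in scores.items():
    print(f"{model}: mean of tasks = {task_mean(tasks):.2f}, reported = {reported_total:.2f}")
```

For all four rows the recomputed mean agrees with the reported total to two decimal places.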

Application Scenarios

Bulletin Extraction

The model can extract announcement details from large volumes of text, so important information can be found quickly.

Fiction Generation

The model can automatically generate fiction from information provided by the user.

Sentiment Classification

The model can analyze the sentiment of a text and label it as positive, negative, or neutral.

Research Report Classification

The model can classify research reports by topic.

News Digest

The model can automatically summarize news and surface the key information quickly.

Knowledge Graph Construction

The model can build a knowledge graph from existing knowledge, making that knowledge easy to query.

Q&A System

The model can answer questions through semantic analysis.

Image-Text Matching

The model can measure the relevance between a text and images.
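
One common way to score text-image relevance is to embed both into a shared vector space and compare them with cosine similarity. The sketch below assumes such embeddings already exist; the three-dimensional vectors and file names are made up for illustration, and the page does not state how Mengzi computes relevance internally.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_images(text_vec, image_vecs):
    """Return (score, image name) pairs, most relevant first."""
    scored = [(cosine_similarity(text_vec, vec), name) for name, vec in image_vecs.items()]
    return sorted(scored, reverse=True)


# Hypothetical embeddings; a real system would obtain these from
# text and image encoders trained into the same space.
text_embedding = [0.9, 0.1, 0.0]
image_embeddings = {
    "cat.jpg": [0.8, 0.2, 0.1],
    "chart.png": [0.0, 0.1, 0.9],
}

for score, name in rank_images(text_embedding, image_embeddings):
    print(f"{name}: {score:.3f}")
```

The same scoring works in both directions, ranking images for a text query or texts for an image query.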

Customer Stories

Hithink RoyalFlush Information Network Co., Ltd.

Langboat Technology works with RoyalFlush on cognitive intelligence, jointly developing NLP technology and upgrading financial technology products and services to give customers a better user experience.

Business Cooperation Email

bd@langboat.com

Address

Floor 16, Fangzheng International Building, No. 52 Beisihuan West Road, Haidian District, Beijing, China.


© 2023, Langboat Co., Limited. All rights reserved.


Large Model Registration Code: Beijing-MengZiGPT-20231205