Guide to DeepSeekMoE on Hugging Face

1. Introduction to DeepSeekMoE
– For a detailed introduction to DeepSeekMoE, see the [Introduction](https://github.com/deepseek-ai/DeepSeek-MoE) on our GitHub repository.

2. How to Use
Chat Completion
The following Python code runs chat completion with the model:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, GenerationConfig

model_name = "deepseek-ai/deepseek-moe-16b-chat"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16, device_map="auto")
model.generation_config = GenerationConfig.from_pretrained(model_name)
model.generation_config.pad_token_id = model.generation_config.eos_token_id

messages = [
    {"role": "user", "content": "Who are you?"}
]
input_tensor = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
outputs = model.generate(input_tensor.to(model.device), max_new_tokens=100)

result = tokenizer.decode(outputs[0][input_tensor.shape[1]:], skip_special_tokens=True)
print(result)
```
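The final decode step recovers only the model's reply: `model.generate` returns the prompt tokens followed by the newly generated tokens, so `outputs[0][input_tensor.shape[1]:]` slices the prompt off before decoding. A toy sketch of the same idea with made-up token ids:

```python
# Illustration of the slicing in the snippet above; the token ids here
# are invented placeholders, not real vocabulary ids.
prompt_ids = [101, 7, 42]             # stands in for input_tensor[0]
generated = prompt_ids + [9, 5, 102]  # stands in for outputs[0]

# Equivalent of outputs[0][input_tensor.shape[1]:]
reply_ids = generated[len(prompt_ids):]
print(reply_ids)  # [9, 5, 102] — only the newly generated tokens
```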
Note: Replace `messages` with your own input in the same format. By default (`add_special_tokens=True`), the tokenizer automatically adds a `bos_token` (`<|begin▁of▁sentence|>`) before the input text. Since the system prompt is not compatible with this version of the models, we do not recommend including a system prompt in your input.
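The model card also documents a manual chat format for use without `apply_chat_template`: user turns as `User: {content}` followed by a blank line, assistant turns as `Assistant: {content}` terminated by `<|end▁of▁sentence|>`, and a trailing `Assistant:` to prompt the reply. The small helper below is our own sketch of rendering a `messages` list into that format (the function `render_chat` is not part of the transformers API):

```python
# Sketch: render a messages list into the manual DeepSeekMoE chat format
# documented on the model card, without tokenizer.apply_chat_template.
EOS = "<|end▁of▁sentence|>"

def render_chat(messages):
    parts = []
    for m in messages:
        if m["role"] == "user":
            # User turn, followed by a blank line
            parts.append(f"User: {m['content']}\n\n")
        elif m["role"] == "assistant":
            # Assistant turn, terminated by the end-of-sentence token
            parts.append(f"Assistant: {m['content']}{EOS}")
    # Trailing "Assistant:" prompts the model to generate its reply
    parts.append("Assistant:")
    return "".join(parts)

prompt = render_chat([{"role": "user", "content": "Who are you?"}])
# prompt == "User: Who are you?\n\nAssistant:"
```

You would then tokenize the rendered string yourself and call `model.generate` as before.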

3. License
This code repository is licensed under the MIT License. The use of DeepSeekMoE models is subject to the Model License; DeepSeekMoE supports commercial use. For more details, see [LICENSE-MODEL](https://github.com/deepseek-ai/DeepSeek-MoE/blob/main/LICENSE-MODEL) in our GitHub repository.

4. Contact
If you have any questions or need assistance, please raise an issue on the GitHub repository or contact us at service@deepseek.com.

For more information, visit our homepage, chat with DeepSeek LLM, join our Discord community, or connect with us on WeChat. You can also read the DeepSeekMoE paper.





2024-01-12T13:05:59+01:00