# Guide to HuggingFace Stable LM 2 Zephyr 1.6B

## Model Description
`Stable LM 2 Zephyr 1.6B` is a 1.6 billion parameter instruction-tuned language model inspired by the training pipeline of [`HuggingFaceH4’s Zephyr 7B`](https://huggingface.co/HuggingFaceH4/zephyr-7b-beta). The model is trained on a mix of publicly available datasets and synthetic datasets, utilizing [Direct Preference Optimization (DPO)](https://arxiv.org/abs/2305.18290).
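For reference, DPO trains directly on preference pairs instead of fitting a separate reward model. Its objective, in the standard notation of the linked paper (the exact training configuration used for this model is not specified here), is:

$$
\mathcal{L}_{\mathrm{DPO}}(\pi_\theta;\pi_{\mathrm{ref}}) = -\,\mathbb{E}_{(x,\,y_w,\,y_l)\sim\mathcal{D}}\left[\log \sigma\!\left(\beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)} - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}\right)\right]
$$

where $y_w$ and $y_l$ are the preferred and rejected completions for a prompt $x$, $\pi_{\mathrm{ref}}$ is the frozen reference (SFT) model, and $\beta$ controls how far the tuned policy $\pi_\theta$ may drift from it.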

## Usage
`Stable LM 2 Zephyr 1.6B` uses the following instruction format:
```
<|user|>
Which famous math number begins with 1.6 ...?<|endoftext|>
<|assistant|>
The number you are referring to is 1.618033988749895. This is the famous value known as the golden ratio<|endoftext|>
```

This format is also available through the tokenizer's `apply_chat_template` method:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('stabilityai/stablelm-2-zephyr-1_6b', trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    'stabilityai/stablelm-2-zephyr-1_6b',
    trust_remote_code=True,
    device_map="auto"
)

# Build the chat-formatted prompt and append the <|assistant|> generation prompt.
prompt = [{'role': 'user', 'content': 'Which famous math number begins with 1.6 ...?'}]
inputs = tokenizer.apply_chat_template(
    prompt,
    add_generation_prompt=True,
    return_tensors='pt'
)

# Sample a completion of up to 1024 new tokens.
tokens = model.generate(
    inputs.to(model.device),
    max_new_tokens=1024,
    temperature=0.5,
    do_sample=True
)

print(tokenizer.decode(tokens[0], skip_special_tokens=False))
```
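To recover just the assistant's reply, slice off the prompt tokens before decoding. A small convenience sketch building on the variables above (not part of the original card):

```python
# tokens[0] contains the prompt followed by the completion; drop the prompt
# and the special tokens (<|user|>, <|assistant|>, <|endoftext|>) when decoding.
completion = tokenizer.decode(tokens[0][inputs.shape[-1]:], skip_special_tokens=True)
print(completion)
```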



## Model Details

### Training Dataset

The dataset comprises a mixture of open, large-scale datasets available on the HuggingFace Hub:

1. SFT Datasets
   - HuggingFaceH4/ultrachat_200k
   - meta-math/MetaMathQA
   - WizardLM/WizardLM_evol_instruct_V2_196k
   - Open-Orca/SlimOrca
   - openchat/openchat_sharegpt4_dataset
   - LDJnr/Capybara
   - hkust-nlp/deita-10k-v0
2. Preference Datasets
   - allenai/ultrafeedback_binarized_cleaned
   - Intel/orca_dpo_pairs
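Any of these can be pulled locally for inspection with the `datasets` library. A minimal sketch (the `train_sft` split and `messages` column are specific to `ultrachat_200k`; the other datasets define their own splits and columns, so check each dataset card):

```python
from datasets import load_dataset

# Download one of the SFT mixtures listed above from the HuggingFace Hub.
ultrachat = load_dataset("HuggingFaceH4/ultrachat_200k", split="train_sft")
print(ultrachat[0]["messages"][0])  # each row holds a list of {role, content} chat turns
```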



## Performance

### MT-Bench

*(Figure: MT-Bench score comparison plot; scores tabulated below.)*

| Model | Size | MT-Bench |
|---|---|---|
| Mistral-7B-Instruct-v0.2 | 7B | 7.61 |
| Llama2-Chat | 70B | 6.86 |
| stablelm-zephyr-3b | 3B | 6.64 |
| MPT-30B-Chat | 30B | 6.39 |
| stablelm-2-zephyr-1.6b | 1.6B | 5.42 |
| Falcon-40B-Instruct | 40B | 5.17 |
| Qwen-1.8B-Chat | 1.8B | 4.95 |
| dolphin-2.6-phi-2 | 2.7B | 4.93 |
| phi-2 | 2.7B | 4.29 |
| TinyLlama-1.1B-Chat-v1.0 | 1.1B | 3.46 |



### OpenLLM Leaderboard

| Model | Size | Average | ARC Challenge (acc_norm) | HellaSwag (acc_norm) | MMLU (acc_norm) | TruthfulQA (mc2) | Winogrande (acc) | Gsm8k (acc) |
|---|---|---|---|---|---|---|---|---|
| microsoft/phi-2 | 2.7B | 61.32% | 61.09% | 75.11% | 58.11% | 44.47% | 74.35% | 54.81% |
| stabilityai/stablelm-2-zephyr-1_6b | 1.6B | 49.89% | 43.69% | 69.34% | 41.85% | 45.21% | 64.09% | 35.18% |
| microsoft/phi-1_5 | 1.3B | 47.69% | 52.90% | 63.79% | 43.89% | 40.89% | 72.22% | 12.43% |
| stabilityai/stablelm-2-1_6b | 1.6B | 45.54% | 43.43% | 70.49% | 38.93% | 36.65% | 65.90% | 17.82% |
| mosaicml/mpt-7b | 7B | 44.28% | 47.70% | 77.57% | 30.80% | 33.40% | 72.14% | 4.02% |
| KnutJaegersberg/Qwen-1_8B-Llamaified* | 1.8B | 44.75% | 37.71% | 58.87% | 46.37% | 39.41% | 61.72% | 24.41% |
| openlm-research/open_llama_3b_v2 | 3B | 40.28% | 40.27% | 71.60% | 27.12% | 34.78% | 67.01% | 0.91% |
| tiiuae/falcon-rw-1b | 1B | 37.07% | 35.07% | 63.56% | 25.28% | 35.96% | 62.04% | 0.53% |
| TinyLlama/TinyLlama-1.1B-3T | 1.1B | 36.40% | 33.79% | 60.31% | 26.04% | 37.32% | 59.51% | 1.44% |



## Training Infrastructure

- **Hardware:** StableLM 2 Zephyr 1.6B was trained on the Stability AI cluster across 8 nodes, each with 8 A100 80GB GPUs.
- **Code Base:** We use an internal script for the SFT steps and the HuggingFace [Alignment Handbook](https://github.com/huggingface/alignment-handbook) script for DPO training.
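The Alignment Handbook's DPO recipe is built on `trl`'s `DPOTrainer`; the sketch below shows the rough shape of that step. Everything here (checkpoint, hyperparameters, dataset choice, and the `trl` API version assumed) is an illustration, not the card's actual recipe:

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import DPOTrainer

# Hypothetical setup: align an SFT checkpoint on one of the preference sets above.
model = AutoModelForCausalLM.from_pretrained("stabilityai/stablelm-2-1_6b", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("stabilityai/stablelm-2-1_6b", trust_remote_code=True)

# DPOTrainer expects "prompt", "chosen", and "rejected" text columns.
dataset = load_dataset("Intel/orca_dpo_pairs", split="train")
dataset = dataset.rename_column("question", "prompt")

trainer = DPOTrainer(
    model,
    ref_model=None,  # trl snapshots a frozen copy of `model` as the reference when None
    args=TrainingArguments(output_dir="stablelm-2-dpo", per_device_train_batch_size=2),
    beta=0.1,        # strength of the implicit KL penalty toward the reference model
    train_dataset=dataset,
    tokenizer=tokenizer,
)
trainer.train()
```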



## Use and Limitations

### Intended Use

The model is intended to be used in chat-like applications. Developers must evaluate the safety and performance of the model for their specific use case. Read more about safety and limitations below.



### Limitations and Bias

This model is not trained against adversarial inputs. We strongly recommend pairing this model with an input and output classifier to prevent harmful responses.

Through our internal red teaming, we discovered that while the model will not output harmful information if not prompted to do so, it will hallucinate many facts. It is also willing to output potentially harmful responses or misinformation when the user requests it. Using this model will therefore require guardrails around your inputs and outputs to ensure that any outputs returned are not misinformation or harmful.

Additionally, as each use case is unique, we recommend running your own suite of tests to ensure proper performance of this model. Finally, do not use the model if it is unsuitable for your application, or for any application that may cause deliberate or unintentional harm to others.
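As an illustration of the kind of input/output guardrail recommended above, the sketch below wraps generation with a text-classification check on both the prompt and the completion. The classifier checkpoint name is a placeholder and the `"safe"` label convention is an assumption; substitute whichever moderation model fits your deployment:

```python
from transformers import pipeline

# Placeholder checkpoint: substitute a real safety classifier for your deployment.
moderator = pipeline("text-classification", model="your-org/your-safety-classifier")

def is_flagged(text: str, threshold: float = 0.8) -> bool:
    # Label names depend on the classifier; here anything not "safe" counts as a hit.
    result = moderator(text, truncation=True)[0]
    return result["label"] != "safe" and result["score"] >= threshold

def guarded_generate(user_message: str, generate_fn) -> str:
    # generate_fn is any prompt -> completion callable, e.g. built from the Usage code above.
    if is_flagged(user_message):
        return "Sorry, I can't help with that request."
    reply = generate_fn(user_message)
    if is_flagged(reply):
        return "Sorry, I can't share that response."
    return reply
```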



## How to Cite

```bibtex
@misc{StableLM-2-1.6B,
      url={https://huggingface.co/stabilityai/stablelm-2-1.6b},
      title={Stable LM 2 1.6B},
      author={Stability AI Language Team}
}
```

