# Huggingface Guide

## Model Description
`stable-code-3b` is a 2.7 billion parameter decoder-only language model pre-trained on 1.3 trillion tokens of diverse textual and code datasets. It is trained on 18 programming languages (selected based on the 2023 StackOverflow Developer Survey) and demonstrates state-of-the-art performance, compared to models of similar size, on the MultiPL-E metrics across multiple programming languages tested using BigCode's Evaluation Harness.

![spiderchart](https://huggingface.co/stabilityai/stable-code-3b/resolve/main/stable_code_3b_spiderchart.svg)

### Model Comparison
| Model            | Size | Python | C++   | JavaScript | Java  | PHP   | Rust  |
|------------------|------|--------|-------|------------|-------|-------|-------|
| Stable Code      | 3B   | 32.4%  | 30.9% | 32.1%      | 32.1% | 24.2% | 23.0% |
| CodeLlama        | 7B   | 30.0%  | 28.2% | 32.5%      | 31.1% | 25.7% | 26.3% |
| Deepseek Coder   | 1.3B | 28.6%  | 29.2% | 28.7%      | 29.0% | 23.6% | 18.5% |
| Wizard Coder     | 3B   | 31.6%  | 25.6% | 26.2%      | 25.8% | 25.3% | 20.4% |
| StarCoder        | 3B   | 21.6%  | 19.8% | 21.5%      | 20.5% | 19.0% | 16.9% |
| Replit Code V1.5 | 3B   | 23.0%  | 25.9% | 26.2%      | 23.6% | 23.2% | 21.5% |
| Deci Coder       | 1B   | 19.1%  | 6.8%  | 18.4%      | 16.7% | 2.1%  | 1.7%  |

### Key Features
- Fill in Middle (FIM) capability
- Supports long context, trained with sequences of up to 16,384 tokens (see the snippet below)
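
As a quick check of the long-context claim, the released configuration can be inspected before downloading any weights. This is a sketch, assuming the Hub config exposes the standard `max_position_embeddings` field:

```python
# Read the released config and confirm the advertised 16,384-token
# context window; assumes the standard max_position_embeddings field.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("stabilityai/stable-code-3b", trust_remote_code=True)
print(config.max_position_embeddings)  # expected: 16384
```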

## Usage
Get started generating text with `stable-code-3b` by using the following code snippet:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("stabilityai/stable-code-3b", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("stabilityai/stable-code-3b", trust_remote_code=True, torch_dtype="auto")
model.cuda()

inputs = tokenizer("import torch\nimport torch.nn as nn", return_tensors="pt").to(model.device)
tokens = model.generate(**inputs, max_new_tokens=48, temperature=0.2, do_sample=True)

print(tokenizer.decode(tokens[0], skip_special_tokens=True))
```

## Run with Fill in Middle (FIM) ⚡️

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("stabilityai/stable-code-3b", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("stabilityai/stable-code-3b", trust_remote_code=True, torch_dtype="auto", attn_implementation="flash_attention_2")
model.cuda()

# FIM prompts wrap the known prefix and suffix in special tokens;
# the model generates the missing middle after <fim_middle>.
inputs = tokenizer("<fim_prefix>def fib(n):<fim_suffix>    else:\n        return fib(n - 2) + fib(n - 1)<fim_middle>", return_tensors="pt").to(model.device)
tokens = model.generate(**inputs, max_new_tokens=48, temperature=0.2, do_sample=True)

print(tokenizer.decode(tokens[0], skip_special_tokens=True))
```
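
For readability, the FIM prompt can also be assembled with a small helper. This is a sketch rather than part of the model card; `build_fim_prompt` is a hypothetical convenience function, and the token names mirror those in the example above:

```python
def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Wrap the known prefix and suffix in FIM special tokens; the model
    then generates the missing middle after <fim_middle>."""
    return f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"

prompt = build_fim_prompt(
    prefix="def fib(n):",
    suffix="    else:\n        return fib(n - 2) + fib(n - 1)",
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
tokens = model.generate(**inputs, max_new_tokens=48, temperature=0.2, do_sample=True)

# Decode only the newly generated tokens to recover the infilled middle.
middle = tokenizer.decode(tokens[0, inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(middle)
```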

## Run with Flash Attention 2 ⚡️

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("stabilityai/stable-code-3b", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("stabilityai/stable-code-3b", trust_remote_code=True, torch_dtype="auto", attn_implementation="flash_attention_2")
model.cuda()

inputs = tokenizer("import torch\nimport torch.nn as nn", return_tensors="pt").to(model.device)
tokens = model.generate(**inputs, max_new_tokens=48, temperature=0.2, do_sample=True)

print(tokenizer.decode(tokens[0], skip_special_tokens=True))
```
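
Note that `model.cuda()` and `flash_attention_2` both require a CUDA device (and the `flash-attn` package). A hedged alternative for machines without a GPU is to let `transformers` place the model and fall back to the default attention implementation; `device_map="auto"` additionally requires the `accelerate` package:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("stabilityai/stable-code-3b", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    "stabilityai/stable-code-3b",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16 if torch.cuda.is_available() else torch.float32,
    # flash_attention_2 needs a CUDA device; "eager" is the portable fallback.
    attn_implementation="flash_attention_2" if torch.cuda.is_available() else "eager",
    device_map="auto",  # requires the accelerate package
)
```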

## Model Details
- **Developed by**: [Stability AI](https://stability.ai)
- **Model type**: `stable-code-3b` models are auto-regressive language models based on the transformer decoder architecture.
- **Language(s)**: English, Code
- **Library**: [GPT-NeoX](https://github.com/EleutherAI/gpt-neox)
- **License**: Other
- **Contact**: For questions and comments about the model, please email `lm@stability.ai`

### Model Architecture
The model is a decoder-only transformer similar to the LLaMA (Touvron et al., 2023) architecture, with the following modifications:
| Parameters    | Hidden Size | Layers | Heads | Sequence Length |
|---------------|-------------|--------|-------|-----------------|
| 2,796,431,360 | 2560        | 32     | 32    | 16384           |

- **Position Embeddings**: Rotary Position Embeddings (Su et al., 2021) applied to the first 25% of head embedding dimensions for improved throughput, following Black et al. (2022).
- **Tokenizer**: A modified version of the GPTNeoX tokenizer, with special tokens added to train Fill in the Middle (FIM) capabilities, such as `<fim_prefix>` and `<fim_suffix>`, along with other special tokens.
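
A back-of-the-envelope check of the attention geometry implied by the table: a 2560-wide hidden state split across 32 heads gives 80-dimensional heads, of which the first 25% receive rotary embeddings. A sketch:

```python
# Derived from the table above, not from an official config dump.
hidden_size, num_heads = 2560, 32
head_dim = hidden_size // num_heads   # 80 dimensions per attention head
rotary_dims = head_dim // 4           # 25% of 80 = 20 rotary dimensions
print(head_dim, rotary_dims)          # 80 20
```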

## Training
### Training Dataset
The dataset comprises a filtered mixture of open-source large-scale datasets available on the HuggingFace Hub: the Falcon RefinedWeb extract (Penedo et al., 2023), along with CommitPackFT and GitHub Issues (BigCode, 2023), and StarCoder (Li et al., 2023). We further supplement our training with data from mathematical domains (Azerbayev et al., 2023; Yu et al., 2023).

Top 18 programming languages trained on:
- C
- CPP
- Java
- JavaScript
- CSS
- Go
- HTML
- Ruby
- Rust
- Markdown
- Shell
- Php
- Sql
- R
- Typescript
- Python
- Jupyter-Clean
- RestructuredText
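
Several of the datasets named above are public on the HuggingFace Hub and can be previewed without a full download. Below is a sketch using the `datasets` library; the Hub IDs (`tiiuae/falcon-refinedweb`, `bigcode/commitpackft`) are assumptions inferred from the dataset names, not confirmed by the card:

```python
from datasets import load_dataset

# Stream rather than download: both corpora are large.
refinedweb = load_dataset("tiiuae/falcon-refinedweb", split="train", streaming=True)
commitpack = load_dataset("bigcode/commitpackft", "python", split="train", streaming=True)

for sample in refinedweb.take(1):
    print(sorted(sample.keys()))  # inspect the schema before relying on field names
```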

### Training Procedure
The model is pre-trained on the aforementioned datasets in `bfloat16` precision, optimized with AdamW.
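
The same precision and optimizer choices carry over naturally to the fine-tuning the card recommends. Below is a minimal single-step sketch; the learning rate and weight decay are illustrative assumptions, since the card does not disclose the pre-training hyperparameters:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("stabilityai/stable-code-3b", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    "stabilityai/stable-code-3b", trust_remote_code=True, torch_dtype=torch.bfloat16
).cuda()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5, weight_decay=0.1)

# One illustrative optimization step on a toy example.
batch = tokenizer("def add(a, b):\n    return a + b", return_tensors="pt").to(model.device)
loss = model(**batch, labels=batch["input_ids"]).loss  # standard causal-LM loss
loss.backward()
optimizer.step()
optimizer.zero_grad()
```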

### Training Infrastructure
- **Hardware**: `stable-code-3b` was trained on the Stability AI cluster across 256 NVIDIA A100 40GB GPUs (AWS P4d instances).
- **Software**: We use a fork of `gpt-neox` (EleutherAI, 2021), train under 2D parallelism (Data and Tensor Parallel) with ZeRO-1 (Rajbhandari et al., 2019), and rely on flash-attention as well as SwiGLU and Rotary Embedding kernels from FlashAttention-2 (Dao et al., 2023).
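
For readers reproducing a similar setup, here is a hedged sketch of a DeepSpeed ZeRO stage 1 configuration in the spirit of the bullet above; the card does not publish its training configuration, so all values are illustrative assumptions:

```python
# Illustrative DeepSpeed ZeRO-1 config; values are assumptions, not the
# settings actually used to train stable-code-3b.
ds_config = {
    "train_micro_batch_size_per_gpu": 4,
    "bf16": {"enabled": True},          # matches the bfloat16 pre-training precision
    "zero_optimization": {"stage": 1},  # ZeRO-1: shard optimizer states only
    "optimizer": {"type": "AdamW", "params": {"lr": 1e-5}},
}
```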

## Use and Limitations
### Intended Use
The model is intended to be used as a foundational base model for application-specific fine-tuning. Developers must evaluate and fine-tune the model for safe performance in downstream applications.

### Limitations and Bias
As a base model, `stable-code-3b` may exhibit unreliable, unsafe, or other undesirable behaviors that must be corrected through evaluation and fine-tuning prior to deployment. The pre-training dataset may have contained offensive or inappropriate content, even after applying data cleansing filters, which can be reflected in model-generated text. We recommend that users exercise caution when using these models in production systems. Do not use the models if they are unsuitable for your application, or for any applications that may cause deliberate or unintentional harm to others.

## How to Cite
```bibtex
@misc{stable-code-3b,
  url={https://huggingface.co/stabilityai/stable-code-3b},
  title={Stable Code 3B},
  author={Pinnaparaju, Nikhil and Adithyan, Reshinth and Phung, Duy and Tow, Jonathan and Baicoianu, James and Cooper, Nathan}
}
```
