# Guide to Using Hugging Face M2-BERT-80M-32K-Retrieval

### Introduction
The M2-BERT-80M-32K-Retrieval model is an 80M-parameter checkpoint of M2-BERT, pretrained with a sequence length of 32768 and fine-tuned for long-context retrieval. It was trained by Jon Saad-Falcon, Dan Fu, and Simran Arora. This guide will walk you through how to use this model for generating embeddings for retrieval.

### Getting Started
Before using the model, make sure you have the Hugging Face `transformers` library installed in your environment. If not, you can install it using the following command:
```sh
pip install transformers
```
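The snippets below also assume PyTorch is available, since the tokenizer returns PyTorch tensors (`return_tensors="pt"`). If you don't already have it, install it as well:
```sh
pip install torch
```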

### Loading the Model
You can load the M2-BERT-80M-32K-Retrieval model using Hugging Face's `AutoModelForSequenceClassification` with remote code enabled. Follow the code snippet below to load the model:
```python
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "togethercomputer/m2-bert-80M-32k-retrieval",
    trust_remote_code=True
)
```

You should expect to see a large error message about unused parameters for FlashFFTConv when you load the model. If you would like to load the model with FlashFFTConv, refer to the [GitHub repository](https://github.com/HazyResearch/m2/tree/main) for instructions.
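As a quick sanity check, you can confirm that the loaded checkpoint has roughly 80M parameters, as the name suggests. This is just an illustrative snippet; the exact count may differ slightly:
```python
# Rough sanity check: the checkpoint name suggests ~80M parameters.
num_params = sum(p.numel() for p in model.parameters())
print(f"{num_params / 1e6:.1f}M parameters")
```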

### Generating Embeddings
This model generates embeddings for retrieval with a dimensionality of 768. You can use it to generate embeddings for a given input text as shown below:
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

max_seq_length = 32768
testing_string = "Every morning, I make a cup of coffee to start my day."

model = AutoModelForSequenceClassification.from_pretrained(
    "togethercomputer/m2-bert-80M-32k-retrieval",
    trust_remote_code=True
)

tokenizer = AutoTokenizer.from_pretrained(
    "bert-base-uncased",
    model_max_length=max_seq_length
)
input_ids = tokenizer(
    [testing_string],
    return_tensors="pt",
    padding="max_length",
    return_token_type_ids=False,
    truncation=True,
    max_length=max_seq_length
)

outputs = model(**input_ids)
embeddings = outputs['sentence_embedding']
```
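Since these embeddings are meant for retrieval, a natural next step is ranking candidate documents by cosine similarity to a query embedding. Below is a minimal sketch assuming `model`, `tokenizer`, and `max_seq_length` are defined as above; the `embed` helper is introduced here only for illustration and is not part of the model's API:
```python
import torch
import torch.nn.functional as F

# Hypothetical helper: tokenize and embed a list of strings.
def embed(texts):
    inputs = tokenizer(
        texts,
        return_tensors="pt",
        padding="max_length",
        return_token_type_ids=False,
        truncation=True,
        max_length=max_seq_length
    )
    with torch.no_grad():
        return model(**inputs)['sentence_embedding']

query_emb = embed(["How do I start my day?"])  # shape: [1, 768]
doc_embs = embed([
    "Every morning, I make a cup of coffee to start my day.",
    "The capital of France is Paris."
])  # shape: [2, 768]

# Rank documents by cosine similarity to the query; higher is more relevant.
scores = F.cosine_similarity(query_emb, doc_embs)  # shape: [2]
print(scores)
```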

### Using Together API
You can also get embeddings from this model using the Together API by following the code snippet below:
```python
import os
import requests

def generate_together_embeddings(text: str, model_api_string: str, api_key: str):
    url = "https://api.together.xyz/api/v1/embeddings"
    headers = {
        "accept": "application/json",
        "content-type": "application/json",
        "Authorization": f"Bearer {api_key}"
    }
    session = requests.Session()
    response = session.post(
        url,
        headers=headers,
        json={
            "input": text,
            "model": model_api_string
        }
    )
    if response.status_code != 200:
        raise ValueError(f"Request failed with status code {response.status_code}: {response.text}")
    return response.json()['data'][0]['embedding']

print(generate_together_embeddings(
    'Hello world',
    'togethercomputer/m2-bert-80M-32k-retrieval',
    os.environ['TOGETHER_API_KEY'])[:10]
)
```
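The snippet above reads your key from the `TOGETHER_API_KEY` environment variable, so export it in your shell before running. The value below is a placeholder for your actual key from your Together account:
```sh
export TOGETHER_API_KEY="your-api-key-here"
```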

### Acknowledgments
Alycia Lee helped with AutoModel support for the M2-BERT-80M-32K-Retrieval model.

### Citation
If you use this model or find the work valuable, you can cite the authors as follows:
```bibtex
@inproceedings{fu2023monarch,
  title={Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture},
  author={Fu, Daniel Y and Arora, Simran and Grogan, Jessica and Johnson, Isys and Eyuboglu, Sabri and Thomas, Armin W and Spector, Benjamin and Poli, Michael and Rudra, Atri and R{\'e}, Christopher},
  booktitle={Advances in Neural Information Processing Systems},
  year={2023}
}
```

### Conclusion
This guide provides the necessary information to use the M2-BERT-80M-32K-Retrieval model for generating embeddings for retrieval. For more on how the model was trained for long sequences, see the paper "Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture". For further details and instructions, you can also check out the [GitHub repository](https://github.com/HazyResearch/m2/tree/main) for this model.
