Llama2 Installation Guide for Mac (M1 Chip)
Guide for setting up and running Llama2 on Mac systems with Apple silicon. This repo provides instructions for installing prerequisites like Python and Git, cloning the necessary repositories, downloading and converting the Llama models, and finally running the model with example prompts.
Prerequisites
Before starting, ensure your system meets the following requirements:
Python 3.8+ (Python 3.11 recommended): Check your Python version:
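From a terminal, this check might look like:

```shell
# Print the active Python version; it should report 3.8 or newer
python3 --version
```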
Install Python 3.11 (if needed):
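One common route on Apple silicon is Homebrew (this assumes you already have Homebrew set up; any other Python 3.11 installer works equally well):

```shell
# Install Python 3.11 via Homebrew (assumes Homebrew is installed)
brew install python@3.11
```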
Install Miniconda.
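A sketch of a Miniconda install for Apple silicon, using the installer name published in the conda documentation (verify the current filename on conda.io before running):

```shell
# Download and run the Apple-silicon Miniconda installer
curl -LO https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-arm64.sh
bash Miniconda3-latest-MacOSX-arm64.sh
```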
Cloning the Llama2 Repository
Clone the llama C++ port repository
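The cloning and build steps can be sketched as follows, assuming the commonly used upstream repositories (ggerganov/llama.cpp for the C++ port and facebookresearch/llama for Meta's download script):

```shell
mkdir -p llama2 && cd llama2
# C++ port of Llama
git clone https://github.com/ggerganov/llama.cpp
# Meta's reference repo, which includes the model download script
git clone https://github.com/facebookresearch/llama
# Build llama.cpp
cd llama.cpp && make
```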
Now, both repositories should be in your llama2 directory. Inside the llama.cpp directory, build it:

Requesting Access to Llama Models

Request access to the Llama models through Meta; once approved, you will receive an email containing a custom download URL.
Downloading the Models
In your terminal, navigate to the llama directory, then run the download script:
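Assuming the layout from the cloning step (llama and llama.cpp side by side, with your shell currently in llama.cpp), this might look like:

```shell
cd ../llama     # Meta's llama repo, a sibling of llama.cpp
./download.sh   # the download script shipped with the repo
```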
When prompted, enter the custom URL from the email.
Converting the Downloaded Models
Navigate back to the llama.cpp repository:

Create a conda environment named llama2:

Activate the environment:
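These three steps might look like the following (Python 3.11 in the environment is an assumption matching the prerequisites above):

```shell
cd ../llama.cpp                      # back into the llama.cpp checkout
conda create -n llama2 python=3.11   # new environment named llama2
conda activate llama2
```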
Install Python dependencies:
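llama.cpp ships a requirements file for its conversion scripts, so this step is typically:

```shell
# Install the Python packages needed by the conversion script
pip install -r requirements.txt
```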
Convert the model to f16 format:

Note: If you encounter an error about a vocab size mismatch (the model has -1, but tokenizer.model has 32000), update params.json in ../llama2/llama-2-7b-chat from -1 to 32000.
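With an older llama.cpp checkout the conversion looked roughly like this; the script name, flags, and output filename vary between llama.cpp versions, so treat this as a sketch and check your checkout's README:

```shell
# Convert Meta's checkpoint to a 16-bit model file
python3 convert.py ../llama/llama-2-7b-chat --outtype f16
```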
Quantize the model to reduce its size:
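A quantization sketch, assuming the converted file landed in the model directory under the default name (binary and file names differ across llama.cpp versions):

```shell
# 4-bit quantization substantially reduces the file size
./quantize ../llama/llama-2-7b-chat/ggml-model-f16.gguf \
           ../llama/llama-2-7b-chat/ggml-model-q4_0.gguf q4_0
```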
Running the Model
Execute the following command:
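For example (the model path matches the quantization sketch above and is an assumption; prompts/chat-with-bob.txt is a sample prompt file shipped with llama.cpp):

```shell
./main -m ../llama/llama-2-7b-chat/ggml-model-q4_0.gguf \
  -n 256 --color -i -r "User:" -f prompts/chat-with-bob.txt
```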
-m: Model file
-n: Number of tokens
--color: Colored text output
-i: Interactive mode
-r "User:": User input marker
-f: Path to prompt file

Now you’re ready to use Llama2!