How to train your own AI mini model?

2025-11-04

If you want to build your own small AI model, there are two steps:

Step 1: Choose an open-source AI model

Step 2: Fine-tune the model

When it comes to open-source models, we have to mention Hugging Face.

 

It is an open-source community dedicated to artificial intelligence models, providing a large number of pre-trained models and datasets. The site also offers features that call large models directly, such as chatting and image generation.
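As a tiny illustration of how a pre-trained model from the Hub can be called directly, here is a minimal sketch using the transformers pipeline API; the gpt2 model name is only an example and is not mentioned in this article:

from transformers import pipeline

# Pull a pre-trained text-generation model from the Hugging Face Hub and run it locally.
# "gpt2" is only an example; any text-generation model on the Hub works the same way.
generator = pipeline("text-generation", model="gpt2")

result = generator("Hello, my name is", max_new_tokens=20)
print(result[0]["generated_text"])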

The following chart shows its Google search trend, which has been steadily rising.

 

Currently, 1,133,267 open-source models are hosted on Hugging Face.


Part 1: Training approaches


1.1 Full model training

  • Training a model from scratch: all model parameters are initialized and then updated from the training data. The initial approach here was to train the model directly with the Transformers library, using the environment below:
  • Ollama+Llama3.2
  • Python 3.8+
  • PyTorch
  • Hugging Face Transformers
  • Datasets
  • CUDA

from transformers import AutoModelForCausalLM, AutoTokenizer, Trainer, TrainingArguments
from datasets import load_dataset

# Load the model and tokenizer
model_name = "Llama-3.2-1B"  # replace with your model name
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name, legacy=False)

# Inspect which tokenizer class was loaded
print(type(tokenizer))

# Make sure the tokenizer has a pad_token
if tokenizer.pad_token is None:
    tokenizer.add_special_tokens({'pad_token': '[PAD]'})
    model.resize_token_embeddings(len(tokenizer))

# Load the dataset
dataset = load_dataset("json", data_files="training_data.json")

……..

The trained model will be very large: for example, a model that was originally about 2 GB can exceed 4 GB after full training, and that is already after removing the checkpoints. Techniques such as parameter quantization or model compression can be used to shrink the model.
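For example, here is a minimal sketch of loading a model with 4-bit quantization through the bitsandbytes integration in transformers, assuming the bitsandbytes package is installed and a CUDA GPU is available; note that this reduces the memory footprint at load time rather than the file size on disk, and the model name is the same placeholder used above:

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Quantize the weights to 4-bit as they are loaded to cut memory usage substantially
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,  # compute in fp16 while storing weights in 4-bit
)

model = AutoModelForCausalLM.from_pretrained(
    "Llama-3.2-1B",  # replace with your model name or path
    quantization_config=quant_config,
)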

1.2 Fine-tuning a model

Starting from a pre-trained model, continue training it on data from a specific task so that it adapts to the new task.

LoRA is generally used for fine-tuning. Besides LoRA, there are similar parameter-efficient fine-tuning techniques such as Adapter Layers, Freeze, Prefix Tuning, Prompt Tuning, BitFit, and UniPELT.
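To make the idea concrete, here is a rough sketch of how LoRA fine-tuning is set up in code with the peft library; the model name and the target attention projections below are illustrative assumptions, not something prescribed by this article:

from transformers import AutoModelForCausalLM
from peft import LoraConfig, TaskType, get_peft_model

# Load the base model (the name is only an example placeholder)
base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2-0.5B")

# Wrap the base model with LoRA adapters: only the small low-rank adapter
# matrices are trained, while the original weights stay frozen.
lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=16,                                 # adapter rank
    lora_alpha=32,                        # scaling factor
    lora_dropout=0.0,
    target_modules=["q_proj", "v_proj"],  # assumed attention projection names
)
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # prints how few parameters are actually trainable

The rank and alpha values here happen to match the lora_rank and lora_alpha settings used in the LLaMA Factory configuration later in this article.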

Part 2: Train the model

My local environment is Windows 11, and the machine has an NVIDIA graphics card, so I trained the model directly on Windows.

 


I chose LLaMA Factory (about 34.6k stars on GitHub) to fine-tune the relatively small Qwen2-0.5B model. That covers roughly all the knowledge and process involved in fine-tuning your own AI model.

From the explanation above, we can see that using open-source models and tools for training and fine-tuning can help us create our own AI applications. If you also want to master these technologies and apply them to real projects, Zhihu Zhixuetang's AI application live course will be your best choice. The setup package in the course has already been partially packaged and built, which makes it relatively easy to get started quickly.

 

 

The following is a purely self-built approach without any such packaging; it is relatively complex, harder to get started with, and assumes a certain technical foundation.

Here are the detailed steps:

1. Environment Preparation

1.1 Install Python

Install Python 3.8 or higher. You can download it from the official Python website.

1.2 Install CUDA

Download and install a CUDA version suited to your graphics card from the official NVIDIA website. First check which CUDA version your driver supports:

nvidia-smi

The output here shows that CUDA versions up to 12.6 are supported.


 

You can find version 12.6.0 in the CUDA Toolkit download archive.

 


nvcc -V

If this prints the CUDA compiler version, the installation was successful.
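As an extra sanity check that is not part of the original steps, you can also confirm from Python that PyTorch sees the GPU:

import torch

# True means PyTorch can reach the NVIDIA GPU through CUDA
print(torch.cuda.is_available())

# Print the GPU name if a CUDA device is present
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))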

 


 

2. Install LLaMA Factory

2.1 Clone the LLaMA Factory repository

Open a command-line tool (such as PowerShell or CMD) and run the following command to clone the LLaMA Factory repository:

git clone https://github.com/hiyouga/LLaMA-Factory.git

cd LLaMA-Factory

2.2 Create a virtual environment (optional)

Although installing directly into your system environment works, the initial installation may break that environment. To keep things isolated, it is recommended to create a virtual environment; if it gets broken, you can simply create a new one:

python -m venv llama_factory_env

.\llama_factory_env\Scripts\activate

2.3 Install dependencies

Install the required Python dependencies inside the virtual environment:

pip install -r requirements.txt

Generally speaking, the main problem during installation is failing to download dependencies.

Another issue worth noting is that llamafactory-cli webui often reports the following error, and the web UI cannot be opened:

RuntimeError: Failed to import trl.trainer.ppo_config because of the following error (look up to see its traceback):

No module named 'tyro'

No matter how many times I reinstalled, it did not work. I later found a solution from someone in the GitHub issues:

pip install tyro==0.8.14

Finally, launch the graphical interface:

llamafactory-cli webui

3. Download Qwen2-0.5B model

3.1 Download the model

You can download the Qwen2-0.5B model from Hugging Face or another model repository. The steps below assume you have downloaded the model files and placed them in the models directory.
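For example, here is a minimal sketch of downloading the model with the huggingface_hub library; the repository id Qwen/Qwen2-0.5B and the local models directory are assumptions that match this setup:

from huggingface_hub import snapshot_download

# Download every file of the Qwen2-0.5B repository into a local models folder
snapshot_download(repo_id="Qwen/Qwen2-0.5B", local_dir="models/Qwen2-0.5B")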

 

 

 

 

4. Configure LLaMA Factory

4.1 Configuration file

You can configure the parameters directly on the page; the defaults are usually fine.

 

 

 

 

One thing to note: if a model trained with the default configuration on the page performs poorly, try adjusting some of the parameters.

 

 

 

 

You can also save a set of training parameters that works well for you, as shown in the figure below.

 

 

 

The following is the content of the saved YAML file:

top.booster: auto
top.checkpoint_path:
  - train_2024-11-20-16-03-49
top.finetuning_type: lora
top.model_name: Qwen2-0.5B
top.quantization_bit: none
top.quantization_method: bitsandbytes
top.rope_scaling: none
top.template: default
train.additional_target: ''
train.badam_mode: layer
train.badam_switch_interval: 50
train.badam_switch_mode: ascending
train.badam_update_ratio: 0.05
train.batch_size: 2
train.compute_type: fp16
train.create_new_adapter: false
train.cutoff_len: 1024
train.dataset:
  - identity
train.dataset_dir: data
train.ds_offload: false
train.ds_stage: none
train.extra_args: '{"optim": "adamw_torch"}'
train.freeze_extra_modules: ''
train.freeze_trainable_layers: 2
train.freeze_trainable_modules: all
train.galore_rank: 16
train.galore_scale: 0.25
train.galore_target: all
train.galore_update_interval: 200
train.gradient_accumulation_steps: 16
train.learning_rate: 1e-4
train.logging_steps: 5
train.lora_alpha: 32
train.lora_dropout: 0
train.lora_rank: 16
train.lora_target: ''
train.loraplus_lr_ratio: 0
train.lr_scheduler_type: cosine
train.mask_history: false
train.max_grad_norm: '1.0'
train.max_samples: '100000'
train.neat_packing: false
train.neftune_alpha: 0
train.num_train_epochs: '100.0'
train.packing: false
train.ppo_score_norm: false
train.ppo_whiten_rewards: false
train.pref_beta: 0.1
train.pref_ftx: 0
train.pref_loss: sigmoid
train.report_to: false
train.resize_vocab: false
train.reward_model: null
train.save_steps: 100
train.shift_attn: false
train.train_on_prompt: false
train.training_stage: Supervised Fine-Tuning
train.use_badam: false
train.use_dora: false
train.use_galore: false
train.use_llama_pro: false
train.use_pissa: false
train.use_rslora: false
train.val_size: 0
train.warmup_steps: 0

 

4.2 data preparation

Place your training data in the data directory, or directly use the example training data that LLaMA Factory provides in the data directory; if you use the examples, replace the variables in the example data with your own content.
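For reference, the example data ships in an alpaca-style JSON layout with instruction/input/output fields and {{name}} / {{author}} placeholders. The small sketch below writes one such record; the file name and field values are illustrative, and custom dataset files generally also need to be registered in data/dataset_info.json before the web UI can see them:

import json

# One training record in the alpaca-style format used by LLaMA Factory's example data.
# The values are illustrative; replace the placeholders with your own identity.
record = {
    "instruction": "Who are you?",
    "input": "",
    "output": "I am {{name}}, an AI assistant developed by {{author}}.",
}

# Write a tiny dataset file into the data directory (file name is an example)
with open("data/my_identity.json", "w", encoding="utf-8") as f:
    json.dump([record], f, ensure_ascii=False, indent=2)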

 

 

 

5. Start fine-tuning

5.1 Start fine-tuning

In the graphical interface, click the “Start Training” button, and LLaMA Factory will start fine-tuning the Qwen2-0.5B model using the LoRA method.

 

 

5.2 Monitor the training process

You can monitor the training process in real-time in the graphical interface and view metrics such as loss and learning rate.

 

 

GPU usage climbs while training is running.

 

 

 

6. Use the fine-tuned model

6.1 Save Model

After training completes, the fine-tuned model is saved in the output directory. To use it, you can select it directly from the checkpoint path; the folder name is based on the training start time.
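Outside the web UI, the saved LoRA adapter can also be loaded in plain Python with the peft library. This is a rough sketch under the assumption that the adapter ended up in a path such as saves/Qwen2-0.5B/lora/train_2024-11-20-16-03-49; the exact output path depends on your LLaMA Factory setup:

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the original base model and its tokenizer (names are example placeholders)
base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2-0.5B")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2-0.5B")

# Attach the fine-tuned LoRA adapter saved by the training run (assumed path)
adapter_path = "saves/Qwen2-0.5B/lora/train_2024-11-20-16-03-49"
model = PeftModel.from_pretrained(base_model, adapter_path)

# Optionally fold the adapter into the base weights for standalone use
model = model.merge_and_unload()

# Quick sanity check of the fine-tuned behavior
inputs = tokenizer("Who are you?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))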

 

 

6.2 Use the fine-tuned model

Use the "Chat" feature in the LLaMA Factory graphical interface directly to verify the effect of the fine-tuning.

 

