OpenAI Fine-tuning Usage
OpenAI provides fine-tuning functionality that lets you customize its models with your own data.
Prerequisites
Register an OpenAI account and obtain an API key.
Install the toolkit, using Node.js as an example:
npm i openai
Note that the openai command-line tool used in the steps below ships with the Python package. If the command is not found, install it with pip install openai.
Generate Dataset
OpenAI officially recommends 50-100 examples for a fine-tuning dataset, with a minimum requirement of 10 examples.
Organize the data in the following format, with one JSON object per line:
{"messages": [{"role": "system", "content": "Given a sports headline, provide the following fields in a JSON dict, where applicable: \"player\" (full name), \"team\", \"sport\", and \"gender\"."}, {"role": "user", "content": "Sources: Colts grant RB Taylor OK to seek trade"}, {"role": "assistant", "content": "{\"player\": \"Jonathan Taylor\", \"team\": \"Colts\", \"sport\": \"football\", \"gender\": \"male\" }"}]}
For the reasoning behind this data format requirement, see the official explanation: https://platform.openai.com/docs/guides/fine-tuning/data-formatting
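Writing such lines by hand is error-prone because the assistant reply is a JSON string nested inside another JSON object. The following is a minimal sketch of generating the file content programmatically. The helper names make_example and to_jsonl are hypothetical and not part of any OpenAI library. The system prompt is copied from the sample line above.

```python
import json

# System prompt copied from the sample training line in this article.
SYSTEM_PROMPT = (
    "Given a sports headline, provide the following fields in a JSON dict, "
    'where applicable: "player" (full name), "team", "sport", and "gender".'
)


def make_example(headline, fields):
    """Build one training example in the chat `messages` format."""
    return {
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": headline},
            # The expected answer is serialized to a JSON string, so the
            # inner quotes are escaped automatically when the line is written.
            {"role": "assistant", "content": json.dumps(fields)},
        ]
    }


def to_jsonl(examples):
    """Serialize examples as JSONL text: one JSON object per line."""
    return "".join(json.dumps(e, ensure_ascii=False) + "\n" for e in examples)


example = make_example(
    "Sources: Colts grant RB Taylor OK to seek trade",
    {"player": "Jonathan Taylor", "team": "Colts", "sport": "football", "gender": "male"},
)
jsonl_text = to_jsonl([example])
```

The resulting jsonl_text can then be written to a file such as ./test/trainai.jsonl with an ordinary open(..., "w") call.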
Validate Data
openai tools fine_tunes.prepare_data -f ./test/trainai.jsonl
When performing data validation, you may encounter errors such as a missing pandas dependency:
OpenAI error: missing pandas
The error occurs because libraries like numpy and pandas, due to their large size, are not installed by default. They are required for certain features of the library but are typically not needed for API communication. If you encounter a MissingDependencyError, install them using:
pip install pandas
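As a lightweight complement to the CLI check, the basic structure of the file can also be verified with the Python standard library alone, which avoids the pandas dependency. This is only a sketch of a sanity check, not a reimplementation of the official tool. The function name validate_jsonl, the set of accepted roles, and the specific rules are assumptions chosen for illustration. The minimum of 10 examples is the figure quoted earlier in this article.

```python
import json

VALID_ROLES = {"system", "user", "assistant"}  # simplified; an assumption
MIN_EXAMPLES = 10  # minimum mentioned in this article


def validate_jsonl(text):
    """Return a list of human-readable problems found in JSONL training data."""
    problems = []
    count = 0
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append(f"line {lineno}: invalid JSON ({exc.msg})")
            continue
        count += 1
        messages = record.get("messages") if isinstance(record, dict) else None
        if not isinstance(messages, list) or not messages:
            problems.append(f"line {lineno}: missing or empty 'messages' list")
            continue
        for msg in messages:
            if not isinstance(msg, dict) or msg.get("role") not in VALID_ROLES:
                problems.append(f"line {lineno}: message with unknown role")
            elif not isinstance(msg.get("content"), str):
                problems.append(f"line {lineno}: message content is not a string")
        if not any(isinstance(m, dict) and m.get("role") == "assistant" for m in messages):
            problems.append(f"line {lineno}: no assistant message to learn from")
    if count < MIN_EXAMPLES:
        problems.append(f"only {count} examples; at least {MIN_EXAMPLES} are required")
    return problems
```

An empty list means the file passed these checks. Calling validate_jsonl(open("./test/trainai.jsonl", encoding="utf-8").read()) would run it against the file used in this article.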
Create Model
openai api fine_tunes.create -t test/trainai.jsonl -m davinci
After the job is created successfully, the command line displays the model name and the training cost.
Use Model
Once the model training is successful, it can be used. In addition to direct API calls, for simple testing you can use the official playground. The model dropdown will show the successfully trained models, and you can test them directly.
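For the direct API route, a call might look like the sketch below. It assumes the openai Python package (version 1 or later), an OPENAI_API_KEY environment variable, and a fine-tuned chat model trained on the messages format shown earlier. The helper names and the model name passed in are placeholders, not values from this article.

```python
import json

# Use the same system prompt as in the training data so the model sees
# the instructions it was fine-tuned with.
SYSTEM_PROMPT = (
    "Given a sports headline, provide the following fields in a JSON dict, "
    'where applicable: "player" (full name), "team", "sport", and "gender".'
)


def build_messages(headline):
    """Build the chat messages for one headline."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": headline},
    ]


def parse_reply(content):
    """Parse the model reply, which is expected to be a JSON dict."""
    try:
        return json.loads(content)
    except json.JSONDecodeError:
        return None  # the model did not return valid JSON


def extract_fields(headline, model):
    """Send one headline to the fine-tuned model (requires network access)."""
    # Imported lazily so the helpers above work without the package installed.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.chat.completions.create(
        model=model,  # placeholder: the name reported when training finished
        messages=build_messages(headline),
    )
    return parse_reply(response.choices[0].message.content)
```

A call such as extract_fields("Sources: Colts grant RB Taylor OK to seek trade", "your-fine-tuned-model-name") would then return a dict, or None if the reply is not valid JSON.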
Update Model?
OpenAI currently supports continuing to fine-tune a model that has already been fine-tuned.
Pricing
Model training costs money. For specific pricing, see: https://openai.com/pricing#language-models
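To get a rough feel for the bill before starting a job, the cost can be approximated as the number of tokens in the training file, multiplied by the number of epochs, multiplied by the per-token training price. The sketch below assumes that billing model. The function name is hypothetical, and the price used in the usage line is a made-up placeholder, so check the pricing page for real figures.

```python
def estimate_training_cost(training_tokens, n_epochs, price_per_1k_tokens):
    """Approximate cost: tokens in the file x epochs x price per 1K tokens."""
    billed_tokens = training_tokens * n_epochs
    return billed_tokens / 1000 * price_per_1k_tokens


# Placeholder numbers for illustration only: 100,000 tokens, 3 epochs,
# and a hypothetical price of 0.008 per 1K training tokens.
estimated = estimate_training_cost(100_000, 3, 0.008)
print(f"Estimated training cost: {estimated:.2f}")
```

The actual amount charged is the one reported by OpenAI after training, so this estimate is only useful for budgeting.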
Final Thoughts
With fine-tuning, building applications such as official documentation assistants or customer service bots becomes much simpler.