At the Data Center World 2023 event in Austin, Texas, the high cost of training large language models was a central topic. Constantine Goltsev of theMind, an AI/ML solutions business, stressed the expense involved in training models like ChatGPT.
Goltsev noted that training a model at the scale of ChatGPT's 175 billion parameters is extremely expensive: each input requires on the order of 175 billion computations and consumes a large amount of electricity. OpenAI's GPT-4 model reportedly ran compute operations for months on 12,000 to 15,000 Nvidia A100 GPUs, at roughly $10,000 apiece. These figures highlight the need for a more pragmatic approach to deploying artificial intelligence.
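To make that scale concrete, here is a rough back-of-envelope calculation. The per-input operation count, the GPU count, and the per-GPU price are the figures cited above; the fleet cost and the throughput illustration are simple arithmetic added here, not figures from the talk:

```python
# Back-of-envelope estimates based on the figures cited in the talk.
# The arithmetic itself is illustrative, not from the source.

params = 175e9            # ChatGPT-class parameter count
ops_per_input = params    # ~1 computation per parameter per input, per the talk

gpus_low, gpus_high = 12_000, 15_000   # A100 count cited for GPT-4 training
price_per_gpu = 10_000                 # USD per A100, as cited

fleet_low = gpus_low * price_per_gpu
fleet_high = gpus_high * price_per_gpu
print(f"Hardware alone: ${fleet_low / 1e6:.0f}M - ${fleet_high / 1e6:.0f}M")
# -> Hardware alone: $120M - $150M

# Throughput illustration (an assumption, not from the talk): an A100
# peaks on the order of 3e14 dense FP16 FLOP/s, so at ~175e9 ops per
# token one GPU could in principle process roughly 1,700 tokens/s.
a100_flops = 3e14
print(f"~{a100_flops / ops_per_input:,.0f} tokens/s per A100 (ideal)")
```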
Goltsev advocated using smaller, open-source models that can match or even outperform ChatGPT on specific tasks. By taking an academic or open-source model with a few billion parameters (e.g., 3 billion or 6 billion) and applying the same fine-tuning techniques used to produce ChatGPT, organizations can achieve impressive results. This approach puts the technology within reach at a reasonable cost.
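The talk did not specify tooling, but as a sketch of what such fine-tuning might look like in practice: the example below assumes the Hugging Face transformers, peft, and datasets libraries, uses GPT-J-6B as one example of a ~6-billion-parameter open-source model, and trains on a hypothetical instructions.jsonl file. It uses LoRA adapters, a common way to make fine-tuning at this scale affordable on a single node:

```python
# Hypothetical fine-tuning sketch for a ~6B open-source model using LoRA.
# Model choice, dataset file, and hyperparameters are illustrative
# assumptions, not details from the talk.
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)
from peft import LoraConfig, get_peft_model
from datasets import load_dataset

model_name = "EleutherAI/gpt-j-6B"  # example ~6B open-source model
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # GPT-J has no pad token
model = AutoModelForCausalLM.from_pretrained(model_name)

# LoRA trains small adapter matrices instead of all 6B weights,
# which is what keeps the compute bill modest.
model = get_peft_model(model, LoraConfig(r=8, lora_alpha=16,
                                         lora_dropout=0.05,
                                         task_type="CAUSAL_LM"))

dataset = load_dataset("json", data_files="instructions.jsonl")  # hypothetical

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=1,
                           per_device_train_batch_size=4, fp16=True),
    train_dataset=dataset["train"].map(tokenize, batched=True),
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```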
Goltsev used Amazon Web Services (AWS) as an example, showing how a major law firm could build a semantic search engine there. By renting a few large instances with around eight A100 cards each, plus ample memory and storage, the firm could meet its goals for roughly $30 to $40 per hour.
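The talk described only the hardware, not the software stack, but a minimal sketch of one common pattern for such a search engine, assuming the sentence-transformers and faiss libraries (both illustrative choices) and toy legal snippets as documents, looks like this:

```python
# Minimal semantic-search sketch: embed documents, index them, and query
# by meaning rather than keywords. Library and model choices are
# illustrative assumptions; the talk specified only the AWS hardware.
import faiss
from sentence_transformers import SentenceTransformer

documents = [
    "The indemnification clause survives termination of the agreement.",
    "Either party may terminate with thirty days' written notice.",
    "Confidential information excludes publicly available material.",
]

# Any sentence-embedding model works here; this small one is a stand-in.
model = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = model.encode(documents, normalize_embeddings=True)

# Inner product on normalized vectors equals cosine similarity.
index = faiss.IndexFlatIP(embeddings.shape[1])
index.add(embeddings)

query = model.encode(["Can we end the contract early?"],
                     normalize_embeddings=True)
scores, ids = index.search(query, k=2)
for score, i in zip(scores[0], ids[0]):
    print(f"{score:.3f}  {documents[i]}")
```

At production scale the flat index would typically be swapped for an approximate one, with the GPU instances handling embedding throughput.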
The sources for this piece include an article in DataCenterKnowledge.