According to a new study, 57% of the content on the internet is either generated by artificial intelligence or translated by AI algorithms. The finding draws attention to the fact that information online is increasingly produced by AI tools, and that this content is in turn used to train AI models. The research emphasizes that feeding AI-generated content from the internet to tools such as ChatGPT and Copilot can lead to limited and incorrect answers.
The Increase of Artificial Intelligence Content and Model Collapse
A study conducted by researchers at Amazon Web Services (AWS) states that training AI models on a growing share of AI-generated content causes a situation called “model collapse.” Dr. Ilia Shumailov from Oxford University says that a collapsed model represents fewer of the kinds of data it was trained on, and the diversity of its responses decreases.
What Is Model Collapse?
Model collapse occurs when artificial intelligence systems are repeatedly trained on the same type of AI-generated content. This reduces the complexity and diversity of what the model can produce, which in turn lowers the accuracy of its answers.
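The loss of diversity described above can be illustrated with a toy simulation. This is only a minimal sketch and not the method used in the study: the function name `simulate_collapse`, the 50 "topics", the long-tailed starting frequencies, and the corpus size are all assumptions chosen for illustration. Each generation, a "model" publishes a finite corpus, and the next model is trained only on that corpus.

```python
import random
from collections import Counter


def simulate_collapse(num_topics=50, sample_size=100, generations=30, seed=0):
    """Toy model: count distinct topics left after each round of self-training.

    All names and parameters here are hypothetical, chosen for illustration.
    """
    rng = random.Random(seed)
    topics = list(range(num_topics))

    # Generation 0: "human" data with long-tailed topic frequencies (assumed).
    weights = [1.0 / (rank + 1) for rank in range(num_topics)]
    distinct_per_generation = [num_topics]

    for _ in range(generations):
        # The current model "publishes" a finite corpus.
        corpus = rng.choices(topics, weights=weights, k=sample_size)
        counts = Counter(corpus)

        # The next model is "trained" only on that corpus. A topic that was
        # never sampled gets weight 0 and can never reappear.
        weights = [counts[t] for t in topics]
        distinct_per_generation.append(len(counts))

    return distinct_per_generation


if __name__ == "__main__":
    history = simulate_collapse()
    print("distinct topics per generation:", history)
```

In this sketch the number of surviving topics can only stay the same or shrink from one generation to the next, because rare topics that miss a single sample are gone for good. Real language models are far more complex, but the same mechanism, rare data disappearing first, is what the term "model collapse" refers to.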
Artificial Intelligence Is Being Trained on Insufficient Information
The research shows that unless AI models are fed accurate and diverse content, they are more likely to produce false and misleading information. AI content published online without verification causes models trained on it to make more mistakes, and those mistakes are then published and collected in turn. This feedback loop could further degrade the performance of AI tools in the future.
Possible Consequences of Misinformation Production
The proliferation of AI-generated content on the internet increases the risk of users encountering incorrect information. This can have serious consequences, especially in areas where informed decisions are critical, such as health, finance, and law. To keep AI models reliable, content verification processes need to be developed and models need to be trained on diverse data sources.
Researchers' Recommendations
Researchers emphasize the importance of using more diverse and validated data in training AI models. This is considered a critical step to improve the models’ responses and prevent potential errors. It is also emphasized that users should question the accuracy of the information they obtain when interacting with AI tools.
As artificial intelligence technologies develop rapidly, these tools must be fed the right data sources to remain accurate and reliable. The growth of AI-generated content on the internet can lower the quality of available information, which in turn threatens the future performance of models. More robust and diverse data sources are therefore needed for training artificial intelligence models.