Published at Jan 30, 2025 08:01 | Last updated at Jan 30, 2025 08:01 by Rochman Maarif
As technology develops, artificial intelligence (AI) is becoming part of everyday human activities, including information gathering. Many people now look for information using AI-based tools such as ChatGPT, developed by OpenAI.
According to a survey, 45% of entrepreneurs and workers in Indonesia have utilized AI applications. Also, 52% of those workers admitted to using ChatGPT.
However, it is worth asking whether ChatGPT already respects the rules of written Indonesian, including the use of standard words.
This matters because users may rely on the model's answers for professional or academic purposes.
To answer this concern, we conducted an experiment to see how ChatGPT produces results from prompts in Indonesian.
In the research process, we collected three kinds of data: responses to question prompts with standard words, responses to question prompts with non-standard words, and responses to prompts asking for a translation from English to Indonesian.
Before discussing the research results, let’s briefly review ChatGPT. ChatGPT, short for Chat Generative Pre-trained Transformer, is an AI-based chatbot that uses Natural Language Processing (NLP) to hold human-like conversations.
This language model can respond to questions and produce writing such as essays, articles, and more. It relies on deep learning to answer a wide variety of prompts.
ChatGPT is a generative AI model that was refined with reinforcement learning from human feedback (RLHF), a training technique in which human ratings teach the model to favor better responses.
Moreover, ChatGPT was trained on millions of documents from the internet, including articles, conversations, and books, which helps it provide comprehensive answers in context.
When reading about ChatGPT, you will often come across the terms deep learning and natural language processing (NLP). Deep learning is an AI method that trains computers to process data in a way inspired by the human brain. That way, computers can recognize patterns in images, sounds, text, and other data to produce useful results.
One form of deep learning usage is natural language processing (NLP). NLP combines computational linguistics with machine learning and deep learning algorithms.
Computational linguistics uses data science to analyze language and speech, which also includes syntactic and semantic analysis.
NLP helps AI to summarize text automatically, analyze texts, and examine the sentiment on social media. That way, the AI model can understand the context of the writings well.
In addition, reinforcement learning from human feedback (RLHF) helps machine learning models learn more efficiently by using human judgments as a training signal.
RLHF incorporates human feedback into the reward function so that machine learning models can perform tasks that better align with human goals, desires, and needs.
RLHF is widely regarded as a standard industry technique for steering large language models (LLMs) toward factual, harmless, and useful content.
Aside from that, human values and preferences also affect the output of LLMs. Each model is trained slightly differently and uses different human respondents so the results can differ even between LLMs.
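To make the idea of feeding human feedback into a reward function more concrete, here is a minimal sketch of a pairwise preference loss, a formulation commonly used when training reward models for RLHF. It is a generic illustration, not a description of how ChatGPT itself was trained. The function name `preference_loss` and the example scores are hypothetical.

```python
import math


def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Return -log(sigmoid(reward_chosen - reward_rejected)).

    The loss is small when the response preferred by the human rater
    receives the higher reward, and large when the ranking is reversed.
    """
    margin = reward_chosen - reward_rejected
    # Numerically stable form of log(1 + exp(-margin)).
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Hypothetical reward scores for two candidate answers to the same prompt.
agrees_with_rater = preference_loss(reward_chosen=2.0, reward_rejected=0.0)
disagrees_with_rater = preference_loss(reward_chosen=0.0, reward_rejected=2.0)
print(agrees_with_rater < disagrees_with_rater)  # True
```

Because the raters differ from one model to another, the rewards learned this way differ too, which is one reason outputs can vary between LLMs.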
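ChatGPT is an AI model that many users rely on to look for information. As explained above, deep learning, NLP, and RLHF allow a model like ChatGPT to learn from feedback and provide comprehensive results.
However, does this carry over to the rules of writing in Indonesian, especially the use of standard and non-standard words? To find out, we analyzed two points, namely 1) whether ChatGPT answers according to correct Indonesian writing rules or simply follows the wording of the prompt, and 2) whether ChatGPT can produce proper Indonesian translations of English text.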
To answer both questions, we have collected 16 standard and non-standard word pairs from prompts entered into ChatGPT without an account. We also included 8 English prompts to answer the second point.
Note: We used ChatGPT without an account (no email registration), so the results are most likely not influenced by usage history.
ChatGPT has become one of the applications that help people with their needs, especially drafting text and looking for information. It is used by many kinds of users, from workers to academics.
That is why AI needs to adhere to existing writing rules. Moreover, since ChatGPT is available in many languages, it should follow the writing rules of each language in every response.
Research published in Computer Methods and Programs in Biomedicine Update in 2024 reports that AI has helped academics and students across 6 dimensions.
In light of this, AI is expected to deliver good results in writing as well. The same study also found that, out of 24 studies using AI, several had limitations in writing development and language correction, as well as challenges in developing writing in languages other than English.
This means that AI models, one of which is ChatGPT, are expected to provide results that are not only comprehensive but can also answer various prompts with correct writing rules.
This is why we wanted to see whether ChatGPT follows the writing rules of the KBBI (Kamus Besar Bahasa Indonesia) when it comes to standard words.
To answer this question, we ran a simple experiment on ChatGPT that focused on the use of standard and non-standard words in every response the model provided.
During data collection, we used the free version of ChatGPT without an account, so the model most likely did not store any search history or prompts and did not learn user preferences. We did this to avoid results biased by previously learned prompt patterns. The model used in this research was GPT-4o.
We limited the research to 40 samples across three types of prompts, namely 1) prompts that use standard words, 2) prompts that use non-standard words, and 3) prompts that instruct the AI model to translate a paragraph from English to Indonesian.
These three types of prompts were chosen to see how ChatGPT provides answers regardless of the spelling mistakes made by the user and what kind of language dataset the AI model has.
The first results come from prompts with standard and non-standard words in Indonesian. We entered 16 pairs of prompts that were identical except for one word, written in its standard form in one prompt and in its non-standard form in the other.
For example, we entered the prompt “apa itu asas legalitas?” with asas as the standard word and “apa itu azas legalitas?” with azas as the non-standard word to see whether ChatGPT gave the result based on the standard writing rules or following the prompt entered.
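The pairing can be sketched in a few lines of Python. This is only an illustration of the design described above, not the tooling we used: the `PromptPair` class, the `make_pair` helper, and the `{word}` template placeholder are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PromptPair:
    standard: str       # prompt written with the KBBI form
    non_standard: str   # same prompt with the non-standard spelling


def make_pair(template: str, standard_word: str, non_standard_word: str) -> PromptPair:
    """Fill one template twice so the two prompts differ in a single word."""
    return PromptPair(
        standard=template.format(word=standard_word),
        non_standard=template.format(word=non_standard_word),
    )


pair = make_pair("apa itu {word} legalitas?", "asas", "azas")
print(pair.standard)      # apa itu asas legalitas?
print(pair.non_standard)  # apa itu azas legalitas?

# Sample size described in the article: 16 word pairs plus 8 translation prompts.
WORD_PAIRS = 16
TRANSLATION_PROMPTS = 8
total_samples = WORD_PAIRS * 2 + TRANSLATION_PROMPTS
print(total_samples)  # 40
```

Keeping everything but one word identical means any difference in spelling between the two answers can be attributed to that word in the prompt.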
We selected standard words as listed in the Kamus Besar Bahasa Indonesia (KBBI). The results are shown in the table below:
Table 1: ChatGPT results for prompts with standard and non-standard words
No. | Prompt | KBBI | ChatGPT Results | Compliance with KBBI |
1. | Aktivitas apa yang bisa dilakukan ibu hamil agar tetap sehat? (Standard) | Aktivitas | Aktivitas | True |
2. | Aktifitas apa yang bisa dilakukan ibu hamil agar tetap sehat? (Non-Standard) | Aktivitas | Aktivitas | True |
3. | Apakah kemampuan analisis masih dibutuhkan di era digitalisasi? (Standard) | Analisis | Analisis | True |
4. | Apakah kemampuan analisa masih dibutuhkan di era digitalisasi? (Non-Standard) | Analisis | Analisis | True |
5. | Bagaimana cara menjadi pegawai yang andal meskipun tidak punya banyak pengalaman? (Standard) | Andal | Andal | True |
6. | Bagaimana cara menjadi pegawai yang handal meskipun tidak punya banyak pengalaman? (Non-Standard) | Andal | Handal | False |
7. | Apa itu asas legalitas? (Standard) | Asas | Asas | True |
8. | Apa itu azas legalitas? (Non-Standard) | Asas | Azas | False |
9. | Bagaimana cara membuat karya tulis yang autentik? (Standard) | Autentik | Autentik | True |
10. | Bagaimana cara membuat karya tulis yang otentik? (Non-Standard) | Autentik | Otentik | False |
11. | Kemampuan berpikir seperti apa yang diperlukan untuk bisa menjadi seorang diplomat? (Standard) | Pikir | Pikir | True |
12. | Kemampuan berfikir seperti apa yang diperlukan untuk bisa menjadi seorang diplomat? (Non-Standard) | Pikir | Pikir | False |
13. | Manfaat makan cabai? (Standard) | Cabai | Cabai | True |
14. | Manfaat makan cabe? (Non-Standard) | Cabai | Cabai | True |
15. | Penanganan cedera lutut (Standard) | Cedera | Cedera | True |
16. | Penanganan cidera lutut (Non-Standard) | Cedera | Cedera | True |
17. | Masakan yang menggunakan cengkih (Standard) | Cengkih | Cengkih | True |
18. | Masakan yang menggunakan cengkeh (Non-Standard) | Cengkih | Cengkeh | False |
19. | Ras manusia apa saja yang memiliki mata cokelat? (Standard) | Cokelat | Cokelat | True |
20. | Ras manusia apa saja yang memiliki mata coklat? (Non-Standard) | Cokelat | Coklat | False |
21. | Cara mengobati alergi detergen (Standard) | Detergen | Detergen | True |
22. | Cara mengobati alergi deterjen (Non-Standard) | Detergen | Deterjen | False |
23. | Apakah seorang perawat boleh melakukan diagnosis pada pasien? (Standard) | Diagnosis | Diagnosis | True |
24. | Apakah seorang perawat boleh melakukan diagnosa pada pasien? (Non-Standard) | Diagnosis | Diagnosis and Diagnosa | Some Are True |
25. | Jika terjebak di area dengan suhu ekstrem, apa yang seharusnya dilakukan? (Standard) | Ekstrem | Ekstrem | True |
26. | Jika terjebak di area dengan suhu ekstrim, apa yang seharusnya dilakukan? (Non-Standard) | Ekstrem | Ekstrem and Ekstrim | Some Are True |
27. | Apakah aman memakan makanan kedaluwarsa? (Standard) | Kedaluwarsa | Kedaluwarsa | True |
28. | Apakah aman memakan makanan kadaluarsa? (Non-Standard) | Kedaluwarsa | Kadaluwarsa and Kadaluarsa | False |
29. | Seberapa penting orisinalitas konten untuk situs web? (Standard) | Orisinal | Orisinal | True |
30. | Seberapa penting orisinilitas konten untuk situs web? (Non-Standard) | Orisinal | Orisinal | True |
31. | Seberapa besar risiko kematian orang dengan komplikasi diabetes dan kolesterol? (Standard) | Risiko | Risiko | True |
32. | Seberapa besar resiko kematian orang dengan komplikasi diabetes dan kolesterol? (Non-Standard) | Risiko | Risiko | True |
Of the 32 prompts above, 24 results used the correct standard words regardless of the word used in the prompt, while 7 others used non-standard words when the prompt itself contained a non-standard word.
From these results, our hypothesis is that ChatGPT may have learned enough standard words to answer according to the writing rules. However, the model may also tend to follow the given prompt, as seen in the 7 incorrect results that mirrored the non-standard word in the prompt, although they make up only 21.9% of the total (7 of 32).
Interestingly, there were 2 unique cases in which ChatGPT mixed the standard and the non-standard form in its answer to a non-standard prompt. The first was the prompt “apakah seorang perawat boleh melakukan diagnosa pada pasien?”, which uses the non-standard word diagnosa.
ChatGPT answered using both diagnosa (non-standard word) and diagnosis (standard word), as depicted below:
Prompt A was a prompt with non-standard words, while prompt B used standard words. The green label indicated that the targeted answer used standard words, while the red label indicated non-standard words.
In prompt A, it could be seen that ChatGPT used the words diagnosa and diagnosis, while in prompt B, ChatGPT successfully displayed the answer with standard words.
This unique case also occurred with the prompt “jika terjebak di area dengan suhu ekstrim, apa yang seharusnya dilakukan?”. With the same labeling as the prompt above, below is the comparison of the results:
The answer to prompt A showed that ChatGPT used both ekstrem and ekstrim, while the answer to prompt B used only the standard word ekstrem.
Our hypothesis from these cases is that ChatGPT may have learned that ekstrem and diagnosis are the standard words that comply with writing rules, but that the model can still be influenced by the given prompt.
This can be seen from the comparison above, where the model answered with standard words for prompt B but switched to non-standard words when prompt A was given.
Several studies have shown that prompts may influence the answers given by ChatGPT. This AI model can adjust user requests through the prompts given.
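The three outcomes used in Table 1 (True, False, and Some Are True) can be expressed as a small check. The sketch below is a hypothetical helper, not the procedure we followed, since we labeled the answers by reading them. It assumes exact word forms, so affixed forms such as berpikir would need extra handling.

```python
import re


def label_response(response: str, standard: str, non_standard: list[str]) -> str:
    """Label a response the way Table 1 does, based on which spellings appear."""
    text = response.lower()

    def appears(word: str) -> bool:
        # Whole-word match so that "andal" is not found inside "handal".
        return re.search(rf"\b{re.escape(word.lower())}\b", text) is not None

    has_standard = appears(standard)
    has_non_standard = any(appears(word) for word in non_standard)

    if has_standard and has_non_standard:
        return "Some Are True"
    if has_non_standard:
        return "False"
    if has_standard:
        return "True"
    return "Not found"


# Made-up sentences for illustration only, not actual ChatGPT output.
print(label_response("Diagnosis medis dilakukan oleh dokter.", "diagnosis", ["diagnosa"]))
print(label_response("Pegawai yang handal terus belajar.", "andal", ["handal"]))
print(label_response("Diagnosa awal berbeda dari diagnosis akhir.", "diagnosis", ["diagnosa"]))
```

The three calls print True, False, and Some Are True respectively. The last one mirrors the mixed cases described above.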
Additionally, there is another unique case among the 7 incorrect results, where the prompt “apakah aman memakan makanan kadaluarsa?” produced two different incorrect forms, as illustrated below:
In prompt A, we used the word kadaluarsa, which is a non-standard form of “kedaluwarsa”. As a result, ChatGPT provided an answer containing the words kadaluarsa (red label) and kadaluwarsa (yellow label), both of which are non-standard.
However, in prompt B, the model answered with the standard word kedaluwarsa when we entered a prompt containing the standard word “kedaluwarsa”.
We assume that this result is related to the fact that ChatGPT learns from documents on the Internet. In search engines, the words “kadaluarsa” and “kadaluwarsa” are still widely used by users and in everyday content, although many also use the word “kedaluwarsa”.
This means that ChatGPT may have learned all three forms. The answer to prompt A may also have been influenced by the non-standard word in the prompt, considering that ChatGPT answered with the standard word when prompt B contained the standard word.
To test the writing rules in Indonesian, we also tested 8 prompts that instructed ChatGPT to translate English paragraphs and sentences into Indonesian.
Of the 8 prompts, 4 answers contained at least one non-standard word, and 4 answers used standard words throughout. The samples below illustrate one of each outcome.
In this image, ChatGPT translated turnover as omset. “Omset” is a non-standard form of omzet according to the Kamus Besar Bahasa Indonesia (KBBI).
The image above is one sample from the 4 answers that contained only standard words. ChatGPT translated standardization into standardisasi and standard into standar, both of which follow the Kamus Besar Bahasa Indonesia.
Because the results were evenly split, we infer that ChatGPT has probably learned the standard words well, although it has likely also picked up non-standard words from the wide range of documents it absorbed.
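For readers who want to screen a translation or any other AI-generated Indonesian text, a simple lookup can flag the non-standard spellings discussed in this article. This is a minimal sketch: the mapping covers only words mentioned here, the name `suggest_standard_forms` is hypothetical, and a real check would need a full KBBI word list.

```python
import re

# Non-standard -> standard pairs taken from the words discussed in this article.
NON_STANDARD_TO_STANDARD = {
    "aktifitas": "aktivitas",
    "analisa": "analisis",
    "handal": "andal",
    "azas": "asas",
    "otentik": "autentik",
    "cabe": "cabai",
    "cidera": "cedera",
    "cengkeh": "cengkih",
    "coklat": "cokelat",
    "deterjen": "detergen",
    "diagnosa": "diagnosis",
    "ekstrim": "ekstrem",
    "kadaluarsa": "kedaluwarsa",
    "kadaluwarsa": "kedaluwarsa",
    "resiko": "risiko",
    "omset": "omzet",
}


def suggest_standard_forms(text: str) -> dict[str, str]:
    """Return each non-standard word found in the text with its standard form."""
    words = re.findall(r"[a-z]+", text.lower())
    return {w: NON_STANDARD_TO_STANDARD[w] for w in words if w in NON_STANDARD_TO_STANDARD}


# Made-up sentence for illustration only.
print(suggest_standard_forms("Omset perusahaan naik meski ada resiko."))
# {'omset': 'omzet', 'resiko': 'risiko'}
```

An empty result only means that none of the listed spellings appeared, not that the whole text follows KBBI.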
Conclusion
This experiment was conducted to answer our two main questions. First, can ChatGPT answer according to good and correct Indonesian writing rules, or does it just follow the prompt? Second, can this AI model provide proper Indonesian translations of English paragraphs?
For the first question, ChatGPT answered 24 prompts with standard words, 7 answers used non-standard words, and 2 answers were special cases because they combined standard and non-standard words in one answer.
From this, we assume that ChatGPT has learned the standard words well but may still follow the given prompt, so that its answer contains non-standard words. Another possibility is that ChatGPT picked up non-standard words from the many documents it absorbed from the internet.
For the second question, the translations were evenly split between those with only standard words and those containing a non-standard word. Thus, we assume that ChatGPT has probably learned both standard and non-standard forms.
Please note that the results of this experiment may differ from other experiments or devices. The results may also vary from one account to another. Furthermore, the history of using ChatGPT with an account may also affect the results of the experiment.
This experiment was also conducted only once for each prompt, so we did not observe how the results vary across repeated runs. Therefore, it can serve as a starting point for future research on writing rules in other AI models.
That is our summary of the simple research we conducted on how well ChatGPT's answers follow standard Indonesian. The use of AI models should be considered carefully, especially for commercial needs.
Since its ability to follow the correct writing rules is still inconsistent, it is wise to choose the right writing partner, one of which is the Writing Services by cmlabs.
The Writing Services by cmlabs is executed by reliable writers with experience in various niches, from health to construction.
Our writing results also continue to contribute engagement to clients and are trusted by big names, such as AQUA, Siloam Hospitals, OCBC bank, and many more.
So, what are you waiting for? Schedule a meeting with our marketing team and tell us your article needs now!