The proposed method has been tested on Vietnam’s road traffic legal documents. The approach is flexible: it can be applied to other knowledge domains and may also work for languages other than Vietnamese. However, moving to a new knowledge domain requires specialized knowledge of that field to carry out information extraction. Moving to another language additionally requires replacing PhoBERT and VnCoreNLP with other natural language processing tools, since both are specific to Vietnamese. The remaining processes, such as constructing the knowledge graph and matching graphs to produce results, can be kept unchanged.
The current method still faces several challenges. Although the system uses PhoBERT and VnCoreNLP to support the construction of the knowledge base, transforming information from legal documents into the knowledge base still requires individuals with in-depth knowledge of the field to make the final decisions, and this process takes considerable time. Furthermore, as the system expands to serve more legal domains, this process becomes more complex and existing knowledge becomes harder to reuse, because each field has its own specialized language and knowledge. Moreover, the current graph matching method is limited to processing individual questions and cannot yet address complex legal queries in which legal regulations overlap.
In this study, the research team tested the collected queries against state-of-the-art methods, including natural language processing methods. These methods analyse the syntax and semantics of words and sentences in legal documents in order to provide meaningful answers to the user’s question. The study also tested widely used LLMs, namely ChatGPT and Gemini. Comparing the experimental results shows the advantages of the proposed method over these well-known methods.
5.1 Testing on Vietnam Traffic Law
In this section, the proposed method is tested on Vietnamese traffic law documents. The study used a dataset gathered from several sources, comprising 36 articles, 306 clauses, and 762 points on traffic offenses and their corresponding punishments as set out in traffic laws (Vietnam National Assembly, 2008; Vietnam Government, 2019; Vietnam National Assembly, 2021; Vietnam Ministry of Transport, 2019). From this data, the researchers used a knowledge representation framework to establish 90 distinct concepts and 49 relations between concepts in the road traffic legal and regulatory framework. This process produced 616 nodes and 1239 relations (edges) in the road traffic legal domain graph, which form the underlying structure of the knowledge graph simulated in Fig. 8. This knowledge graph captures the traffic violations defined in road traffic laws.

Fig. 8
Simulation of the knowledge base in the road traffic field based on the Legal-Onto model.
The system receives a user query, analyses it, and converts the relevant knowledge into a query knowledge graph. To identify the relevant Points, Clauses, and Laws, this query-specific graph is then matched against the knowledge graph of the knowledge base. Finally, the system shows the user the list of Points, Clauses, and Regulations that answer the question.
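The matching step can be illustrated with a minimal sketch. It assumes, purely for illustration, that the query and each legal provision are reduced to sets of (subject, relation, object) triples, and that relevance is the share of query triples found in a provision. The triple vocabulary, the provision IDs, the `match_provisions` function, and the scoring rule are all hypothetical; this is not the matching algorithm of the designed system.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)

# Hypothetical toy knowledge graph: provision ID -> triples describing it.
KNOWLEDGE_GRAPH: Dict[str, Set[Triple]] = {
    "Art6-Cl3-Pn": {
        ("rider", "operates", "motorbike"),
        ("rider", "not_wearing", "helmet"),
    },
    "Art6-Cl4-Ph": {
        ("rider", "operates", "motorbike"),
        ("rider", "uses", "mobile_phone"),
    },
}


def match_provisions(query: Set[Triple],
                     graph: Dict[str, Set[Triple]],
                     threshold: float = 1.0) -> List[Tuple[str, float]]:
    """Rank provisions by the share of query triples they contain."""
    if not query:
        return []
    scored = []
    for provision_id, triples in graph.items():
        score = len(query & triples) / len(query)
        if score >= threshold:
            scored.append((provision_id, score))
    # Highest score first; ties are broken by provision ID for determinism.
    return sorted(scored, key=lambda item: (-item[1], item[0]))


# Hypothetical triple form of a helmet query.
query_q1 = {
    ("rider", "operates", "motorbike"),
    ("rider", "not_wearing", "helmet"),
}
print(match_provisions(query_q1, KNOWLEDGE_GRAPH))
```

Lowering `threshold` returns partially matching provisions as well, which is one simple way to trade precision for recall in such a sketch.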
Example 2.
Consider the query ${q_{1}}$ = “How is a person riding a motorbike fined for not wearing a helmet?”. The designed system provides an accurate answer, as shown in Fig. 9:

Fig. 9
The answer of our system for query ${q_{1}}$.
The meaning of Fig. 9 is as follows:
According to Article 6, Clause 3, Point n (Decree 123/2021/ND-CP):
∙ Article 6: Penalizing individuals riding motorbikes, including electric ones, and similar motorized vehicles violating road traffic regulations.
∙ Clause 3: Imposing fines ranging from $400\hspace{0.1667em}000$ VND to $600\hspace{0.1667em}000$ VND for specific violations:
∙ Point n: Not wearing a motorcycle helmet or wearing it without proper strapping when participating in road traffic.
This response accurately captures the legal information pertinent to query ${q_{1}}$. Moreover, the designed system can extract information from multiple documents and return the articles that suit an input query.
Example 3.
Consider the query ${q_{2}}$ = “What is the fine for using a mobile phone while riding a motorcycle?”. The designed system provides an accurate answer, as shown in Fig. 10.

Fig. 10
The answer of our system for query ${q_{2}}$.
For query ${q_{2}}$, the designed system combines the required information from several clauses and points in articles of multiple law documents:
• First, Article 6, Clause 4, Point h in Vietnam Government (2019), as amended and supplemented in Vietnam National Assembly (2021), imposes a fine ranging from 600 000 VND to 1 000 000 VND on individuals operating motorcycles, including electric motorcycles, and similar motor vehicles who violate regulations on the use of mobile phones or audio devices, excluding hearing aids.
• In addition to the monetary penalty, violators may face supplementary penalties, which were retrieved from two legal documents (Vietnam Government, 2019; Vietnam National Assembly, 2021), such as:
– Revocation of the driving license for 1 to 3 months (Article 6, Clause 10, Point b in Vietnam Government, 2019).
– Revocation of the driving license for 2 to 4 months if the violation results in a traffic accident (Article 6, Clause 10, Point c in Vietnam National Assembly, 2021).
This response captures the relevant legal information for query ${q_{2}}$ completely.
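How provisions retrieved from different documents might be assembled into one answer can be sketched as follows. The penalties, locations, and citation labels are those quoted in Example 3; the `Provision` record, the `"main"`/`"supplementary"` labels, and the `assemble_answer` function are hypothetical names introduced only for illustration and do not describe the system's internal data model.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Provision:
    source: str    # citation label as it appears in the text
    location: str  # Article / Clause / Point
    kind: str      # "main" or "supplementary" (hypothetical labels)
    penalty: str


# The three provisions quoted in Example 3 for query q2.
RETRIEVED: List[Provision] = [
    Provision("Vietnam National Assembly, 2021",
              "Article 6, Clause 10, Point c", "supplementary",
              "Driving license revoked for 2 to 4 months if an accident results"),
    Provision("Vietnam Government, 2019",
              "Article 6, Clause 4, Point h", "main",
              "Fine of 600 000 to 1 000 000 VND"),
    Provision("Vietnam Government, 2019",
              "Article 6, Clause 10, Point b", "supplementary",
              "Driving license revoked for 1 to 3 months"),
]


def assemble_answer(provisions: List[Provision]) -> List[str]:
    """List the main penalty first, then the supplementary penalties."""
    order = {"main": 0, "supplementary": 1}
    ranked = sorted(provisions, key=lambda p: (order[p.kind], p.location))
    return [f"[{p.kind}] {p.penalty} ({p.location}; {p.source})" for p in ranked]


for line in assemble_answer(RETRIEVED):
    print(line)
```

Keeping the source label on every record is what lets the final answer cite each legal document separately.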
5.2 Evaluation of the Accuracy of the Designed System
The research has yielded several noteworthy results. Its central focus was to explore and evaluate the use of knowledge graphs and artificial intelligence solutions in answering legal queries in the field of road traffic law. The experiments focus on querying the meaning of terminology and the handling of violations in Vietnamese road traffic law. The test queries are divided into 7 kinds according to their knowledge content:
• Kind 1: Queries about the definitions of concepts.
• Kind 2: Queries about violation of traffic signs and signals.
• Kind 3: Queries about personal safety violation.
• Kind 4: Queries about violations related to parking and reversing.
• Kind 5: Queries about violation of vehicle documents.
• Kind 6: Queries about violation of prohibited substances.
• Kind 7: Queries about obstruction of traffic violations.
The results of the system were checked by experts in Vietnamese road traffic law (traffic police and lawyers specializing in road traffic). Table 3 and Fig. 11 show the testing results for each kind.
Table 3
Testing results on Road Traffic Law with the proposed method.
| Kind | Meaning | Quantity | Correct | Accuracy |
| --- | --- | --- | --- | --- |
| 1 | Query about definitions of concepts | 30 | 24 | 80.0% |
| 2 | Violation of traffic signs and signals | 47 | 37 | 78.7% |
| 3 | Personal safety violation | 48 | 40 | 83.3% |
| 4 | Violations related to parking and reversing | 22 | 17 | 77.3% |
| 5 | Vehicle document violations | 24 | 21 | 87.5% |
| 6 | Violation of substances and prohibited substances | 14 | 13 | 92.8% |
| 7 | Obstruction of traffic violations | 16 | 14 | 87.5% |
| Total | | 201 | 166 | 82.6% |

Fig. 11
Accuracy chart of the query system.
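The accuracies in Table 3 follow directly from the Quantity and Correct columns, as the short check below shows. The counts are copied from the table; the variable and function names are illustrative. Note that 13/14 is approximately 92.86%, so the 92.8% reported for Kind 6 appears to be truncated rather than rounded.

```python
# (kind, number of queries, number answered correctly), copied from Table 3.
TABLE_3 = [
    (1, 30, 24),
    (2, 47, 37),
    (3, 48, 40),
    (4, 22, 17),
    (5, 24, 21),
    (6, 14, 13),
    (7, 16, 14),
]


def accuracy(correct: int, total: int) -> float:
    """Accuracy as a percentage."""
    return 100.0 * correct / total


per_kind = {kind: round(accuracy(correct, total), 1)
            for kind, total, correct in TABLE_3}
total_queries = sum(total for _, total, _ in TABLE_3)
total_correct = sum(correct for _, _, correct in TABLE_3)
overall = round(accuracy(total_correct, total_queries), 1)

print(per_kind)
print(total_queries, total_correct, overall)
```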
In practice, the most common queries are Kinds 1, 2, and 3. Among these, the system performs best on queries about personal safety violations (Kind 3), which means it can help users identify actions that put their safety at risk. For Kind 2, traffic signs come in many types and forms, and the system has difficulty precisely extracting the violations associated with those signs. Among the other kinds, the proposed method achieves its best results on queries about violations involving substances and prohibited substances (Kind 6), especially those related to alcohol. Prohibiting the use of alcohol while driving is essential in Vietnam, so the system is designed to suit the requirements of Vietnamese traffic.
5.3 Comparison with NLP Methods
In this section, the proposed method is compared with other NLP methods, namely TF-IDF, BM25, TIWS, and TPS. The comparison is carried out on Vietnamese traffic law documents using the TopK@acc metric.
Metrics: The effectiveness of the methods is evaluated using the TopK@acc metric, where accuracy is defined as the percentage of questions whose correct labels are all found in the Top K documents. Here ${L_{K}}$ denotes the set of K labels (document IDs) that the system predicts to be most relevant to the query, while ${l_{q}}$ denotes the actual set of labels associated with the query.
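A minimal sketch of this metric is given below. It assumes one plausible reading of the definition, namely that a query counts as correct when every label in ${l_{q}}$ appears in ${L_{K}}$, i.e. the score is the fraction of queries with ${l_{q}}\subseteq {L_{K}}$. The function name and the toy labels are hypothetical.

```python
from typing import Sequence, Set


def top_k_accuracy(ranked: Sequence[Sequence[str]],
                   gold: Sequence[Set[str]],
                   k: int) -> float:
    """Fraction of queries whose gold labels all appear in the top-k predictions.

    ranked[i] is the ranked list of predicted labels for query i (best first);
    gold[i] is the set of true labels for query i.
    """
    if len(ranked) != len(gold):
        raise ValueError("ranked and gold must have the same length")
    if not gold:
        return 0.0
    hits = sum(1 for preds, labels in zip(ranked, gold)
               if labels <= set(preds[:k]))
    return hits / len(gold)


# Toy example with made-up document IDs.
ranked = [["a", "b", "c"], ["d", "e", "f"]]
gold = [{"b"}, {"f", "x"}]
print(top_k_accuracy(ranked, gold, k=2))
```

In the toy data, the second query can never be counted as correct because label "x" is never predicted, which shows how strict the all-labels requirement is.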
TF-IDF is a method for evaluating the importance of a word in a document within a dataset (Mishra and Vishwakarma, 2015). The TF-IDF weight is determined by two factors. The first is the normalized term frequency (TF), the number of times a word appears in a document divided by the total number of words in that document. The second is the inverse document frequency (IDF), which gives more weight to words that occur in fewer documents of the dataset.
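The weighting can be sketched in a few lines. The TF part follows the definition above; the text does not give an IDF formula, so the sketch assumes the common variant $\log (N/\mathrm{df})$, where $N$ is the number of documents and $\mathrm{df}$ the number of documents containing the term. The function name and the toy documents are illustrative.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf(docs: List[List[str]]) -> List[Dict[str, float]]:
    """TF-IDF weights per document, for already tokenized documents."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            # Normalized TF times IDF (assumed variant: log(N / df)).
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


docs = [["helmet", "fine", "motorbike"], ["phone", "fine", "motorbike"]]
weights = tf_idf(docs)
print(weights[0])
```

Terms that occur in every document, such as "fine" here, receive weight zero under this IDF variant, so only the distinguishing terms contribute to retrieval.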
TIWS is a method that combines TF-IDF with Word Segmentation (Le et al., 2023). It extracts the k most relevant articles to answer a given query. Each sentence is encoded using the Word Segmentation extracted from the Undersea library.
In information retrieval, BM25 is a scoring function for documents. It helps search engines assess how relevant a document is to a search query (Robertson and Zaragoza, 2009). The method considers two factors: how often the query words appear in a document and how rare those words are in the entire collection. A parameter k controls how strongly repeated occurrences of a word contribute to the score, while another parameter b adjusts the impact of document length on the final score.
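A compact sketch of the scoring function is shown below. It uses the widely used Okapi BM25 form with a non-negative IDF; the defaults `k1 = 1.5` and `b = 0.75` are common choices and are assumptions here, since the text does not state the values used in the experiments. The function name and toy documents are illustrative.

```python
import math
from collections import Counter
from typing import List


def bm25_scores(query: List[str], docs: List[List[str]],
                k1: float = 1.5, b: float = 0.75) -> List[float]:
    """Okapi BM25 score of each tokenized document for a tokenized query."""
    if not docs:
        return []
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    if avg_len == 0:
        return [0.0] * n_docs
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        counts = Counter(d)
        # b controls how strongly long documents are penalized.
        length_norm = 1 - b + b * len(d) / avg_len
        score = 0.0
        for term in query:
            f = counts[term]
            if f == 0:
                continue
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1)
            # k1 controls how quickly repeated occurrences saturate.
            score += idf * f * (k1 + 1) / (f + k1 * length_norm)
        scores.append(score)
    return scores


docs = [["helmet", "fine"], ["phone", "fine"], ["parking", "sign"]]
scores = bm25_scores(["helmet"], docs)
print(scores)
```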
TPS is a method that combines TF-IDF with a PhoBERT stage (Le et al., 2023). In this method, negative samples are generated using TF-IDF/BM25 and then embedded with PhoBERT before being fed into the training model. Training yields a fine-tuned PhoBERT with improved representations. The fine-tuned TPS model is then used to embed the documents and queries of the test dataset, and cosine similarity is used to predict the documents most relevant to a given question.
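The final ranking step, cosine similarity between query and document embeddings, can be sketched as follows. The two-dimensional vectors are made-up stand-ins for PhoBERT embeddings, and `rank_documents` is a hypothetical helper; the sketch covers only the similarity ranking, not the fine-tuning stage.

```python
import math
from typing import Dict, List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_documents(query_vec: Sequence[float],
                   doc_vecs: Dict[str, Sequence[float]],
                   k: int) -> List[str]:
    """Return the IDs of the k documents most similar to the query."""
    ranked = sorted(doc_vecs,
                    key=lambda doc_id: cosine(query_vec, doc_vecs[doc_id]),
                    reverse=True)
    return ranked[:k]


# Made-up embeddings standing in for model output.
doc_vecs = {"A": [1.0, 0.0], "B": [0.0, 1.0], "C": [1.0, 1.0]}
print(rank_documents([1.0, 0.1], doc_vecs, k=2))
```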
Table 4 compares the results of the TF-IDF, BM25, TIWS, and TPS methods and the proposed solution according to the TopK@acc metric with $K=5,10,20,50$ (Fig. 12).
Table 4
Results of the proposed method and NLP methods.
| Methods | Top5@acc | Top10@acc | Top20@acc | Top50@acc |
| --- | --- | --- | --- | --- |
| TF-IDF | 0.042 | 0.084 | 0.153 | 0.393 |
| BM25 | 0.037 | 0.079 | 0.169 | 0.321 |
| TIWS | 0.058 | 0.079 | 0.147 | 0.367 |
| TPS | 0.441 | 0.500 | 0.563 | 0.688 |
| Proposed method | 0.561 | 0.603 | 0.645 | 0.671 |

Fig. 12
The search results of TF-IDF, BM25, TIWS, and TPS on Vietnamese Traffic Law.
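The values in Table 4 can be read programmatically to see which method leads at each K. The numbers below are copied from the table; the dictionary layout and the `best_method` helper are illustrative. On these figures, the proposed method leads at K = 5, 10, and 20, while TPS is slightly higher at K = 50 (0.688 against 0.671).

```python
# TopK@acc values copied from Table 4: method -> {K: accuracy}.
TABLE_4 = {
    "TF-IDF": {5: 0.042, 10: 0.084, 20: 0.153, 50: 0.393},
    "BM25": {5: 0.037, 10: 0.079, 20: 0.169, 50: 0.321},
    "TIWS": {5: 0.058, 10: 0.079, 20: 0.147, 50: 0.367},
    "TPS": {5: 0.441, 10: 0.500, 20: 0.563, 50: 0.688},
    "Proposed method": {5: 0.561, 10: 0.603, 20: 0.645, 50: 0.671},
}


def best_method(k: int) -> str:
    """Name of the method with the highest TopK@acc for the given K."""
    return max(TABLE_4, key=lambda method: TABLE_4[method][k])


for k in (5, 10, 20, 50):
    print(k, best_method(k))
```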
5.4 Comparison with Other LLMs
ChatGPT and Gemini are popular question-answering systems built on LLMs. This section compares the designed system with ChatGPT 4o and Gemini 2.5 in the legal domain. The comparison uses 171 queries related to traffic violations in the Road Traffic Law, including query ${q_{1}}$ in Example 2 and query ${q_{2}}$ in Example 3.
For query ${q_{1}}$ in Example 2, Fig. 13 and Fig. 14 show the answers of ChatGPT and Gemini, respectively:

Fig. 13
The answer of ChatGPT for query ${q_{1}}$.

Fig. 14
The answer of Gemini for query ${q_{1}}$.
In Fig. 13, ChatGPT responds: “In Vietnam, not wearing a helmet while riding a motorbike is considered a traffic violation and may result in a fine. Typically, the fine for this offense ranges from $100\hspace{0.1667em}000$ to $200\hspace{0.1667em}000$ VND. However, these fines can change over time and vary with regulatory updates”. Although this describes the violation accurately, the fine range of “100 000 to 200 000 VND” is entirely inaccurate.
In Fig. 14, Gemini’s response is succinct: “According to Decree 123/2021/ND-CP, the act of not wearing a helmet while participating in traffic will result in a fine ranging from $400\hspace{0.1667em}000$ to $600\hspace{0.1667em}000$ VND. Specifically, Point c, Clause 1, Article 21 of Decree 123/2021/ND-CP stipulates: Individuals riding motorbikes, including electric ones and similar motorized vehicles, violating regulations on wearing helmets, except in cases of transporting emergency patients, children under 06 years old, or escorting individuals violating the law, will be fined from $400\hspace{0.1667em}000$ to $600\hspace{0.1667em}000$ VND. Thus, the fine for not wearing a helmet while participating in traffic is $400\hspace{0.1667em}000$ VND. However, if the violator has mitigating or aggravating circumstances, the fine will be applied according to the provisions of Decree 123/2021/ND-CP. Additionally, motorbike riders not wearing a helmet may also face the supplementary penalty of vehicle detention for up to 7 days, starting from the date of the administrative violation record.”
The fine “from 400 000 to 600 000 VND” that Gemini mentioned is accurate; however, this fine is not stipulated at Point c, Clause 1, Article 21 of Decree 123/2021/ND-CP. Furthermore, this offense incurs only a monetary penalty and is not subject to an “additional penalty of vehicle detention for up to 7 days.” Therefore, Gemini’s presentation of the supplementary penalty is inaccurate.
For query ${q_{2}}$ in Example 3, Fig. 15 and Fig. 16 show the answers of ChatGPT and Gemini, respectively:

Fig. 15
The answer of ChatGPT for query ${q_{2}}$.

Fig. 16
The answer of Gemini for query ${q_{2}}$.
In Fig. 15, ChatGPT responded: “In Vietnam, using a mobile phone while riding a motorcycle is considered a traffic violation and may result in a fine. The specific penalty will vary depending on the regulations of the competent authority and the actual situation. Typically, the fine for using a mobile phone while riding a motorcycle ranges from $100\hspace{0.1667em}000$ VND to $200\hspace{0.1667em}000$ VND or higher for this violation. However, please note that fines may change over time and according to the specific regulations of the traffic management authority. Using a phone while riding a motorcycle can reduce concentration and increase the risk of traffic accidents. Therefore, compliance with regulations is essential to ensure safety for oneself and other road users.”
While it partially describes the violation, the response inaccurately states the fine as “100 000 to 200 000 VND,” which differs significantly from the actual penalty.
In Fig. 16, Gemini’s answer is concise and understandable: “According to Point c, Clause 1, Article 21 of Decree 123/2021/ND-CP, motorbike riders violating regulations on using mobile phones, sound devices, excluding hearing aids, will be fined administratively. The penalty for this violation is specified as follows:
A fine ranging from $800\hspace{0.1667em}000$ VND to $1\hspace{0.1667em}000\hspace{0.1667em}000$ VND.
Vehicle impoundment for up to 07 days, starting from the date of issuing the administrative violation report.
Furthermore, if using a mobile phone while riding a motorbike causes a traffic accident, the violator will be penalized according to the regulations in Article 260 of the Penal Code.”
Although the response is succinct and accurate in specifying the fine amount, it does not correctly reference the provision: the cited “Point c, Clause 1, Article 21 of Decree 123/2021/ND-CP” is not where this penalty is stipulated. Additionally, the supplementary penalty of “Vehicle impoundment for up to 07 days” is inaccurate, and the response omits the penalty of “revocation of the driving license”.
Along with these queries, testing with Google consistently produced answers by extracting knowledge from articles. However, approximately 25.8% of its answers to violation-related questions were not up to date with the latest legal texts, leading to discrepancies in specific penalty levels. For conceptual questions, Google Gemini provided responses with an accuracy rate of up to 93.3%.
Based on the criteria for designing intelligent systems that use a knowledge base (Nguyen et al., 2020; De Cruz, 2024), ChatGPT, Gemini, and our system are compared on the following criteria: accuracy, suitability of content, and usability.
• Accuracy: In the context of legal knowledge retrieval, this criterion refers to the correctness, precision, and reliability of the information provided. It evaluates whether the system can deliver factual, up-to-date, and legally sound answers, including specific figures, relevant legal texts, and correct interpretations.
• Suitability of content: It assesses how well the generated responses align with the user’s query, are relevant to the legal context, and are presented in a comprehensive and organized manner. It goes beyond mere accuracy to evaluate the usefulness and applicability of the information for the user’s specific needs.
• Usability: It focuses on the ease of interaction with the system, the clarity of its outputs, and its practical utility for users, particularly in the legal domain. It evaluates how intuitive, efficient, and user-friendly the system is for accessing and utilizing legal knowledge.
Each criterion is evaluated based on the corresponding factors listed in Table 5, and Table 6 compares ChatGPT 4o, Google Gemini 2.5, and our system on these criteria.
Table 5
The factors for evaluating an intelligent search engine in legal domains.
| Criteria | Evaluating factors |
| --- | --- |
| Accuracy | ∙ Factual Correctness. |
| | ∙ Precision of Figures and Details. |
| | ∙ Up-to-dateness/Currency of Information. |
| | ∙ Domain-Specific Legal Knowledge. |
| | ∙ Consistency and Reliability. |
| Suitability of content | ∙ Relevance to Query. |
| | ∙ Specificity of Information. |
| | ∙ Comprehensiveness and Completeness. |
| | ∙ Contextual Understanding. |
| | ∙ Integration of Knowledge Base. |
| Usability | ∙ Natural Language Understanding. |
| | ∙ Clarity of Responses. |
| | ∙ Referencing and Sourcing. |
| | ∙ User Experience. |
| | ∙ Efficiency of Retrieval. |
Table 6
Comparison of the designed system with ChatGPT and Gemini in Legal Domain.
| Criteria | ChatGPT 4o | Gemini 2.5 | Our system |
| --- | --- | --- | --- |
| Accuracy | ChatGPT was unable to provide precise figures in its answers and could only offer advice to the inquirer. Its knowledge is not always up-to-date, as it lacks a mechanism for real-time updates and continuous monitoring of changes in legal regulations and interpretations. Moreover, it lacks domain-specific legal knowledge. The absence of expertise in interpreting and applying legal concepts can result in misunderstandings and inaccuracies. | Gemini can access and collect huge amounts of information on the Internet, offering an overview of legal answers available online. It can extract information from various sources and provides specific values directly from legal texts, which helps in grasping specific information and legal regulations. However, it relies on online sources and may not guarantee the accuracy, currency, or completeness of legal answers. | By integrating knowledge fields into the legal domain, information retrieval accuracy improved significantly, reaching an accuracy rate of 82.6%. This enhancement gives non-experts easier access to legal information, enhancing transparency and enabling informed decision-making. The ontology contributes to maintaining the consistency and reliability of legal data through regular verification and updating. It is also combined with knowledge graphs to identify inconsistencies and errors, ensuring that the legal information remains accurate and dependable. |
| Suitability of content | The content of the results is suitable: ChatGPT consistently analyses questions and provides answers relevant to their content. However, it does not provide specific information from legal texts. Users need to cross-check and verify information against reliable sources to ensure the accuracy and reliability of the answers. | Gemini analyses questions and provides coherent responses in alignment with their content. However, the information has not yet been organized as a knowledge base. Thus, the accuracy of Gemini for queries that need information from multiple legal documents is not good; it only gives a simple answer. | The knowledge base integrates knowledge from diverse legal sources, organized according to the source law documents. It ensures centralized and easily accessible legal information. This integration reduces data fragmentation and promotes consistency across legal knowledge. Thus, besides giving an answer that suits the query’s meaning, the relations of the KG help extract the necessary information from many documents. |
| Usability | ChatGPT accurately understands natural language and responds to a wide range of legal queries, including complex cases. However, it is unable to provide specific, accurate figures or details about the extracted legal documents. Users are advised to verify information against official legal sources. | Gemini’s answers are updated based on the latest legal documents, and its responses are accurate regarding the substantive content of fines. However, Gemini’s citations are inaccurately positioned within the legal text, and the supplementary penalties presented are also incorrect. | The system supports reference to the original documents, giving confidence in the accuracy and reliability of the responses. The system is being developed into a practical application that supports people in searching for legal knowledge and violations in Vietnamese road traffic law. |
The integration of the Legal-Onto ontology and knowledge graphs has proven highly valuable in organizing and presenting legal information from various legal texts in an understandable and consistent format. This significantly supports efficient analysis and understanding of complex legal content, thereby benefiting legal practitioners and researchers. The research results indicate that knowledge graphs have substantial potential to contribute to the legal field. They provide more accessible and comprehensible legal information, support legal decision making, and increase the reliability and efficiency of legal knowledge. These findings suggest a promising path for further research and development in integrating artificial intelligence and the legal field. In addition, the structure of the knowledge graph helps the designed system respond to specific legal queries, with the quality of the responses depending on the depth and accuracy of the underlying data.