An ontology is a knowledge representation structure that has been used in Virtual Learning Environments (VLEs) to describe educational courses by capturing their concepts and the relationships between them. Several ontology-based question generators have used ontologies to auto-generate questions aimed at assessing students at different levels of Bloom’s taxonomy. However, evaluation of these questions has been confined to measuring the qualitative satisfaction of domain experts and students. None of the question generators administered the questions to students and analysed their quality by examining question difficulty and a question’s ability to discriminate between high-ability and low-ability students. This lack of quantitative analysis meant there was no evidence of the quality of the questions, or of how quality is affected by the ontology-based generation strategies and by the question’s level in Bloom’s taxonomy (determined by its stem template). This paper presents an experiment carried out to address these drawbacks by achieving two objectives. First, it assesses the difficulty, discrimination, and reliability of the auto-generated questions using two statistical methods: Classical Test Theory (CTT) and Item Response Theory (IRT). Second, it studies the effect of the ontology-based generation strategies and of the questions’ level in Bloom’s taxonomy on question quality. The results provide guidance for developers and researchers working on ontology-based question generators, and can help in building a prediction model using machine learning techniques.
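To make the two CTT item statistics named above concrete, the following is a minimal sketch of how item difficulty (proportion of correct responses) and item discrimination (point-biserial correlation between an item score and the total test score) are conventionally computed. The response matrix is invented illustrative data, not results from this paper.

```python
def item_difficulty(item_scores):
    """Proportion of students answering the item correctly (0..1)."""
    return sum(item_scores) / len(item_scores)

def point_biserial(item_scores, total_scores):
    """Pearson correlation between a dichotomous item and total scores."""
    n = len(item_scores)
    mean_i = sum(item_scores) / n
    mean_t = sum(total_scores) / n
    cov = sum((i - mean_i) * (t - mean_t)
              for i, t in zip(item_scores, total_scores)) / n
    var_i = sum((i - mean_i) ** 2 for i in item_scores) / n
    var_t = sum((t - mean_t) ** 2 for t in total_scores) / n
    return cov / (var_i ** 0.5 * var_t ** 0.5)

# Rows = students, columns = items (1 = correct, 0 = incorrect).
responses = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 1],
    [0, 1, 0, 0],
    [1, 0, 0, 0],
]

totals = [sum(row) for row in responses]
for j in range(len(responses[0])):
    item = [row[j] for row in responses]
    print(f"item {j}: difficulty={item_difficulty(item):.2f}, "
          f"discrimination={point_biserial(item, totals):.2f}")
```

Under CTT, a difficulty near 0 or 1 marks an item that is too hard or too easy to be informative, and a low or negative discrimination marks an item that fails to separate high-ability from low-ability students; IRT instead estimates such parameters from a latent-trait model.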