Information Services and Use, 18(1-2), 1998, pp53-63 (ISSN 0167-5265)
A computer based intelligent assessment system for numeric disciplines
Ashok Patel
CAL Research & Software Engineering Centre, De Montfort University, Leicester, UK
Tel./Fax: +44 116 257 7193; E-mail: apatel@dmu.ac.uk
Kinshuk
GMD FIT - National Research Centre for Information Technology, Sankt Augustin, GERMANY
Tel.: +49 2241 14 2144; Fax: +49 2241 14 2065; E-mail: kinshuk@gmd.de
David Russell
School of Business, De Montfort University, Leicester, UK
Tel./Fax: +44 116 257 7247; E-mail: drussell@dmu.ac.uk
Abstract.
The paper describes an intelligent assessment system for numeric disciplines. The system works in conjunction with the intelligent tutoring tools developed by TLTP Byzantium, a consortium of six UK universities. The benefits of the intelligent assessment system discussed in this paper include the saving of teacher time and effort previously spent on marking and compiling results. The faster turnaround of assessment-related work results in a much shorter testing, assessment and feedback cycle and so enables more frequent testing. Since the tutoring tools are knowledge based, they can generate a practically unlimited number of test problems by randomly selecting the independent variables and assigning them random values, and they can also provide the solutions to the generated problems. For developing data interpretation skills, a teacher can hand out a problem expressed in narrative form and provide a model answer to the assessment system. A unique feature of the Byzantium assessment system is its ability to discriminate between an incorrect interpretation of the given data and an incorrect method of solution, allowing a teacher to set a fractional score for a variable that is calculated using a correct method but based on an incorrect interpretation of the data.
1. Introduction
Assessment is an essential part of learning as it provides a measure of what has already been learnt. This paper describes an intelligent assessment system developed by the Byzantium project, funded under the Teaching and Learning Technology Programme (TLTP) of the Higher Education Funding Councils of the United Kingdom. The system assesses the progress of learning and offers delayed, static feedback, whereas the learning itself takes place with the help of the immediate, dynamic feedback provided by the various Byzantium Intelligent Tutoring Tools (ITTs). The ITTs are based on the cognitive apprenticeship framework [4] and the students engage in learning by doing (for further details on the ITTs, see [15] and [16]). Stuart [20] pointed out that students often do not experience the challenges of problem identification and problem definition, which are normal challenges in a real work environment. In addition to their random generator facility, the ITTs therefore enable a teacher to set problems in narrative form, either as discrete problems or within a case study. The assessment system is able to assess such problems and, with the help of a two-fold expert model, it can discriminate between an error of incorrect data interpretation and an error of employing a wrong method to derive a solution. The assessment system is data driven and can accommodate different numbers of variables to suit a particular domain.
Before looking at the details of the Byzantium intelligent assessment system in section 4, it would be useful to briefly review assessment in general (section 2) as well as the history of developments in automatic assessment (section 3). Section 5 concludes the paper by describing the experiences gained and plans for future enhancements to the system.
2. Assessment - General view
Bork et al. [3] provided a comprehensive overview of the assessment practices popular in the traditional academic environment. In their opinion, much of the assessment is not done properly and therefore does not contribute to effective learning. Waters and McCracken [23] emphasised that assessment should not be solely a grade-assignment or ranking tool. The main goal of assessment should be to enhance the learning experience. The traditional assessment tools generally focus on isolated facts and techniques and ignore a student’s understanding of the larger integrated picture, allowing success based on rote memorisation rather than true understanding and in some cases even encouraging superficial approaches (see [10] and [23]). Assessment should be used to learn about the gaps in knowledge and mistaken knowledge [3] and it should focus on problem solving, thinking and reasoning skills [23]. The US National Science Education Standards [14] proposed that assessment should be done for authentic tasks which are similar to the tasks performed in "real life". The Mathematical Sciences Education Board, USA [11] asserted three guiding principles for assessment: (i) content - assessment should reflect what is most important for the students to learn; (ii) learning - assessment should enhance learning and should support instructional practice; and (iii) equity - assessment should support every student’s opportunity to learn.
There are various forms of assessment used in practice. Some poor forms include multiple choice questions, true-false examinations [23] and assessment of memorisation [3]. Valencia [22] stressed that the assessment should be authentic and this requires one to be concerned with the type and nature of the content assessed. Many of the performance assessments - timed, with contrived problem scenarios - can hardly be called authentic [3]. Authentic content can facilitate construction of knowledge representations "that help students to connect ideas" [22].
3. History
There is fast growing interest in computerised assessment, although instances of automatic assessment have appeared in the research literature for more than two decades. Early attempts were limited to simple grading of assignments (mostly multiple-choice questions), but recent developments have started to focus on adaptive assessment. Forsythe and Wirth’s [7] approach for numerical analysis courses was one of the earliest efforts at automatic grading of assignments. Taylor and Deever [21] described an assessment system for physics and mathematics courses which generated an individual profile for each student. Rottmann and Hudson [19] described a system for multiple choice assignments in physics courses, which used a mark-sense device to input the data. Myers [13] reported a grading system for chemistry experiments using special mark-sense cards. The literature has many other instances of multiple choice assessment systems (for example, [2], [9], [17] and [18]). Mastascusa and Hoyt [12] presented a grading system to assess the quizzes in introductory electrical engineering courses.
A more advanced assessment system for electrical engineering technology courses is described by Barker [1]. The system, named CHARLIE, accepts numeric values from the students and provides immediate feedback. If an answer differs only slightly from the correct answer, the system asks the student to provide a more precise answer. Deductions are made from the grades according to the number of attempts and to take account of the submission deadlines.
The recent developments in assessment systems use adaptive approaches to support intelligent assessment and individualised learning. Various approaches have been used for this purpose and the field is still quite new. Collins et al. [5] used granularity hierarchies and Bayesian nets to provide assessment of multiple traits in a single test. Conati and VanLehn [6] presented a system, OLAE, which, with the help of the student modeling framework POLA, performs probabilistic assessment of students’ performance while they solve introductory physics problems. Huang [8] described an adaptive testing algorithm, CBAT-2, which generates content-balanced questions based on the portion of the course curriculum that meets the goals of a test. A simple machine learning procedure is used to determine the item parameter values.
The Byzantium intelligent assessment system adopts a different approach by benefiting from a two-fold expert model in conjunction with a student model.
4. Byzantium Intelligent Assessment System
The Byzantium intelligent assessment system works in conjunction with various Byzantium intelligent tutoring tools (ITTs). The ITTs provide learning of the domain concepts and skills. They also allow submission of test assignments for computer based assessment and viewing of the marked assignments. Current ITT implementations include Capital investment appraisal, Absorption costing, Marginal costing and Standard costing. The following sub-sections describe each component of the assessment set up and marking process.
4.1 The assessment workbench
A diagram of the overall assessment workbench follows in figure 1. The students submit their solutions for problems that are either randomly generated by the ITTs, loaded from a test question bank created by a teacher, or handed out in narrative form. In the case of generated or loaded problems, the independent variables are already filled in and the students have to calculate the dependent variables, following the proper sequence where there is a hierarchy of dependency. For narrative form questions, the students have to interpret the given text to identify the given variables and their values, entering these as well as calculating the dependent variables. The ITT’s expert model (the local expert model in fig. 1) is based on the student’s interpretation of the given data. Once a student submits an assignment for marking, the ITT passes an evaluation of the student’s performance as compared to the expert model, along with other relevant data such as the Course Id, Student Id and Example number, to the intelligent assessment system. The intelligent assessment system also obtains the data structure information from the ITTs.

Fig. 1
For randomly generated and example bank problems, the intelligent assessment system simply takes the ITT’s evaluation of performance and applies further adjustments, for example, a penalty for late submission, matching of the student’s tasks to the required tasks and assignment of the required weights to the performance in different parts of a problem. For narrative form problems, the system performs a further comparison with the remote expert model, i.e. the solution based on the correct interpretation of the data as supplied by the teacher. The comparison of a student’s values with the local expert model yields an assessment of the correctness of the method employed (based on the student’s understanding of the interrelationships of the concepts), while the comparison with the remote expert model assesses data interpretation.
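The two comparisons described above can be sketched in a few lines of Python. This is a minimal illustration rather than the Byzantium implementation: the function name `classify_attempts`, the label strings, the tolerance and the small costing example (contribution = sales - variable cost, profit = contribution - fixed cost) are all assumptions made for the example. The local expert values are the solution recomputed from the student's own reading of the data, and the remote expert values are the teacher's model answer.

```python
from math import isclose


def classify_attempts(student, local_expert, remote_expert, tol=0.005):
    """Label each calculated variable by comparing it with both expert models.

    All three arguments map variable names to numeric values (a hypothetical
    layout). The tolerance is an assumed rounding allowance.
    """
    labels = {}
    for name, value in student.items():
        if isclose(value, remote_expert[name], abs_tol=tol):
            # Agrees with the teacher's model answer.
            labels[name] = "correct"
        elif isclose(value, local_expert[name], abs_tol=tol):
            # Right method applied to wrongly interpreted data.
            labels[name] = "correct method, wrong interpretation"
        else:
            labels[name] = "incorrect"
    return labels


# Teacher's data (hypothetical): sales 1000, variable cost 600, fixed cost 250.
remote = {"contribution": 400.0, "profit": 150.0}
# The student misread the variable cost as 650, so the local expert,
# working from the student's data, expects these values instead.
local = {"contribution": 350.0, "profit": 100.0}
student = {"contribution": 350.0, "profit": 120.0}

labels = classify_attempts(student, local, remote)
```

In this example the contribution is flagged as a correct method applied to wrongly interpreted data, which is the case for which a teacher may award a fractional score, while the profit matches neither expert model and is simply incorrect.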
On completion of the assessment procedure, the intelligent assessment system provides individual feedback to the student with the help of the interface provided within ITTs. It also maintains a database of the student’s scores and provides various individual and cumulative reports in a printed form or as an export to common file formats (e.g. Excel), to enable further statistical analysis.
4.2 The problem space in ITTs for assignment submission
The problem space in an ITT is made up of a consistent network of interdependent conceptual objects. The interface of the problem screen uses a "fill in the blanks" metaphor by providing an instance container for each of the domain concepts. Students fill values into these containers and, for a given problem, the set of instances must be mutually consistent. An instance of a concept that is not constrained by existing instances of any other concepts is regarded as an independent variable within the problem space. As the students are extensively exposed to the "fill in the blanks" metaphor during their schooling, the empty containers are perceived as a challenge and the students are well motivated to solve the problem. An example of a typical problem space is shown in figure 2.

Fig. 2
For the randomly generated or example bank problems, independent variables are already provided by the system. In case of the narrative form problems, students get a textual description of a problem. They need to interpret the data, identify independent variables and enter their values before calculating the remaining dependent variables. On completion of each assignment, the students save it either on a network server or on a floppy disk (for non-networked installations) in a custom format that is inaccessible till it has been assessed by the intelligent assessment system.
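To illustrate how a knowledge-based tool can generate problems together with their solutions, the following sketch picks the independent variables at random, assigns them random values and derives the remaining variable. It assumes a hypothetical one-relation problem space (contribution = sales - variable cost); the names `RULES` and `generate_problem` and the value range are invented for the example and are not taken from the ITTs.

```python
import random

# Hypothetical problem space with a single relation:
#   contribution = sales - variable_cost
# Each rule derives one variable from the other two.
RULES = {
    "contribution": lambda v: v["sales"] - v["variable_cost"],
    "sales": lambda v: v["contribution"] + v["variable_cost"],
    "variable_cost": lambda v: v["sales"] - v["contribution"],
}


def generate_problem(rng):
    """Return (given, solution) for one randomly generated problem."""
    names = sorted(RULES)
    # Choose which variable the student must calculate; the other two
    # become the independent variables of this problem.
    dependent = rng.choice(names)
    given = {n: rng.randrange(100, 1000, 10) for n in names if n != dependent}
    # The same rules that define the problem space also yield the solution.
    solution = dict(given)
    solution[dependent] = RULES[dependent](given)
    return given, solution


given, solution = generate_problem(random.Random(1))
```

A real tool would also constrain the random values so that the derived figures remain sensible (for instance, non-negative), which this sketch does not attempt.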
4.3. Functionality of the intelligent assessment system
The intelligent assessment system assesses the student assignments in a batch process, using the different marking schemes created by the teachers for different courses. The system provides a high degree of customisation through intuitive and user-friendly interfaces. The main functions of the system are described in the following sub-sections.
4.3.1. Data transfer for marking schemes, model answers and examples
The system allows transfer of marking schemes, model answers and any additions to the bank of practice and test examples between computers. This facility enables the teachers to create such data on their own computers and then transfer it to networked drives. It also facilitates transfer of such data at year end to a new database. The data transfer interface is shown in figure 3.

Fig. 3
4.3.2. Creation of different marking schemes
A teacher creates a marking scheme for each course to set the number of assignments to be assessed, the scoring method (explained later) to be used and a link to the model answer for a narrative form assignment. The teachers can create several marking schemes identifying them with scheme numbers and set the assessment system to mark one scheme at a time or more than one scheme simultaneously within a batch run.
The teachers can either choose an already created Course or create a new one. The next step is to select the required ITT from all the installed ITTs listed by the system. Once the ITT is selected, the teacher selects an already existing scheme or creates a new scheme. If a new scheme is created, the interface also allows brief notes or comments to be entered for subsequent reference. The next step in creating a scheme is to select the example numbers that will be assessed, specifying which of these will be set as narrative form problems and will therefore have an associated model answer. An appropriate model answer is selected from the list of model answers shown by the system.
The system also allows selection or creation of a scoring method. It is called a method to distinguish it from the main marking scheme. It enables different weights to be assigned to different parts of a problem’s solution, in recognition of the need to change the relative importance of a set of tasks at different stages of the learning. The use of a scoring method also enables setting of partial problems to test a student’s understanding up to an intermediate stage or to focus on a sub-technique.
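The effect of such a scoring method can be illustrated with a short sketch. It assumes, purely for illustration, that a scoring method is a mapping from variable names to weights and that each variable has already been given a credit between 0 and 1; the function name `weighted_score` and the variable names are hypothetical. Leaving a variable out of the weights is one way of setting a partial problem.

```python
def weighted_score(credits, weights):
    """Return a percentage score from per-variable credits and weights.

    credits: variable name -> credit in the range 0..1 (missing means 0).
    weights: variable name -> relative weight set by the teacher.
    Variables absent from `weights` are not assessed at all.
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("a scoring method needs at least one positive weight")
    earned = sum(w * credits.get(name, 0.0) for name, w in weights.items())
    return 100.0 * earned / total_weight


credits = {"contribution": 1.0, "profit": 0.0}

# Later in a course the final figure may matter more than the
# intermediate step, so it carries three times the weight here.
full_problem = weighted_score(credits, {"contribution": 1, "profit": 3})

# A partial problem that tests understanding only up to the
# intermediate stage.
partial_problem = weighted_score(credits, {"contribution": 1})
```

With these assumed weights the same set of answers scores a quarter of the marks on the full problem and full marks on the partial problem.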
The scheme and scoring method creation interfaces are shown in figure 4 and figure 5 respectively.

Fig. 4

Fig. 5
The marking schemes need to be created only once for each course, while the scoring methods can be used across different courses. Though the creation of the various schemes and scoring methods requires some extra effort initially, they provide the necessary structure for the intelligent assessment system to rapidly assess different courses, different subject matter and different student cohorts using a single batch run set up. The batch marking process is described next.
4.3.3. Batch marking set up
One of the main advantages of an intelligent assessment system is that it frees up the time and effort spent by a human teacher on the relatively mechanical and repetitive task of marking numerical assignments, while providing feedback of a standard similar to that of one-to-one tutoring. The use of such systems is beneficial as they shorten the test, assessment and feedback cycle, provide individual attention to each student and make more teacher time available for creative activities, thus improving both teacher and student morale. This sub-section describes the batch marking process, while the feedback is described later.
The intelligent assessment system can assess assignments from different courses and subject matter using different marking schemes in a single batch set up. Figure 6 shows a batch set up for marking the Capital investment appraisal and Standard costing assignments.

Fig. 6
While marking, the system checks its existing records and rejects any re-submissions; however, it can be set to accept re-submissions and overwrite previous records. Penalty reductions can also be set, so that a second batch run, say a week after the submission deadline, automatically deducts a certain percentage for late submission. This is set by entering a value for the Reductions: All marks parameter. Batch marking also handles narrative form problems, comparing a student’s answer to the specified model answer. The system can be set to assign a partial score for ‘incorrect interpretation but correct method’ by entering an appropriate value for the Reductions: Wrong value parameter.
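The handling of re-submissions and late penalties in a batch run can be sketched as follows. The record layout, the function name `mark_batch` and its arguments are assumptions made for illustration; in particular, `all_marks_reduction` is a hypothetical counterpart of the Reductions: All marks parameter and is assumed here to be a percentage deducted from every score in the run.

```python
def mark_batch(submissions, records, all_marks_reduction=0.0,
               accept_resubmissions=False):
    """Record the scores of one batch run and return the rejected keys.

    submissions: list of (student_id, example_no, raw_score) tuples.
    records: dict keyed by (student_id, example_no), updated in place.
    all_marks_reduction: percentage deducted from every score in this run,
        for example for a run held after the submission deadline.
    """
    rejected = []
    for student_id, example_no, raw_score in submissions:
        key = (student_id, example_no)
        if key in records and not accept_resubmissions:
            # Already marked in an earlier run: keep the existing record.
            rejected.append(key)
            continue
        records[key] = raw_score * (1 - all_marks_reduction / 100.0)
    return rejected


records = {}
# First run, held at the deadline: no reduction.
mark_batch([(131, 1, 90.0)], records)
# Second run a week later: a ten percent reduction, and the repeat
# submission from student 131 is rejected.
late_rejected = mark_batch([(131, 1, 95.0), (139, 1, 80.0)], records,
                           all_marks_reduction=10.0)
```

After the second run the record for student 131 still holds the original score, while the late submission from student 139 is stored with the reduction applied.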
Quite often, some students submit assignments with a wrong Course Id. or incorrect example number. Such assignments are rejected in the batch marking run. However, the assessment system provides a Submission browser facility to view and edit such critical data. On completion of the assessment process, various forms of output are made available by the system.
4.3.5. The intelligent assessment system outputs
The system provides output for both teachers and students. It updates the student assessment database and provides various reports for a teacher to view on the screen, to print or to export. It also updates a student’s assignment so it can be viewed with the View marked work facility of the ITTs.
4.3.5.1. Teacher related output
The system provides various reports, and creating new reports is relatively easy as the system uses report definition files created by Crystal Reports™, a widely available report creation tool. The report selection interface is data driven and lists all the report definition files in the database directory. Table 1 and table 2, appended to the paper, show sample reports created by the system. A quick way to look up a student’s results for one or more ITTs is provided by the Score browser option. Both the Score browser and the Report generator allow filtering of data, as shown in figure 7.

Fig. 7
4.3.5.2. Student related output
The View marked work interface in the ITTs is used by the assessment system to provide feedback to the students. The system provides a comparison of a student’s solution with the expert solution in an interface similar to the problem space shown in figure 2. The incorrect attempts are shown on a red background, while the corresponding correct answers are shown on a green background to distinguish them from the correct attempts. The distinction between a completely incorrect attempt and one where a correct solution method is applied to an incorrect data interpretation is under implementation and will be achieved by introducing a dark magenta background. The overall score obtained by the student is also shown on the screen.
5. Experiences and future plans
The design and development of the intelligent assessment system has gone through a long cycle of formative evaluation and subsequent improvements. Summative evaluation of the system, as implemented at various institutions during the prototyping phase, has also ended. Currently the intelligent assessment system is being implemented in various universities in the United Kingdom along with the Byzantium intelligent tutoring tools. The empirical experience of teachers and students has been positive.
An additional consideration that makes the intelligent assessment system especially suitable in a real work environment is its extensibility. The system is designed to ensure that its internal structure remains independent of any existing ITT. The data driven design of the system allows it to be used for any ITTs developed in the future, as long as the ITT can provide its data structure information to the system using the standard protocol.
The plans for future development include making the system even more beneficial and adaptive to student needs. One such plan is to provide the students with a detailed analysis of their problem solving approach, including comments on any sub-optimalities in their solutions. This analysis can be made by matching the prioritised conceptual relationships in the knowledge base with those used by the students. The ITTs provide a calculator interface and allow picking and dropping of values by left-click and right-click respectively on the variable containers within the problem space. By constraining the interface so that a student cannot enter a value directly and always has to use the calculator interface, the system can keep track of a student’s problem solving process. It can thus monitor the relationships actually used rather than inferring the student’s understanding from an outcomes based overlay model. There are some outstanding issues yet to be resolved, mainly in connection with interface considerations.
It is also a longer term vision to convert to a component based software engineering approach, provide an authoring shell and share the development environment using the World Wide Web. This would facilitate rapid and incremental development through shared resources. The work involved in achieving this vision should not be underestimated, but it is worthwhile as there is a huge number of pupils and students learning numeric disciplines. Also, people in general have more problems with numeric disciplines. It should be emphasised, however, that even in their current state, the combination of the intelligent tutoring tools and the intelligent assessment system is proving to be beneficial in a real academic environment.
Acknowledgements
The authors wish to acknowledge Mr Jamie Hunter for his software engineering contributions to the development of the intelligent tutoring system.
References
[1] D. S. Barker, CHARLIE: A computer-managed homework, assignment and response, learning and instruction environment, Paper presented at the Frontiers in Education Conference, Nov. 5-8, Pittsburgh, USA, 1997.
[2] S. Barton, LEARN and SIFLEARN, University of Agriculture and Forestry, Brno, Czech Republic, 1991.
[3] A. M. Bork, D. R. Britton & S. Gunnarsdottir, Combining learning and assessment, in: Interactive Multimedia in University Education: Designing for change in Teaching and Learning (A-59), K. Beattie, C. McNaught & S. Wills, eds., Elsevier Science B. V. (North-Holland), 1994, pp. 113-130.
[4] A. Collins, J. S. Brown & S. E. Newman, Cognitive Apprenticeship: Teaching the crafts of reading, writing and mathematics, in: Knowing, Learning and Instruction, Lauren B. Resnick, ed., Lawrence Erlbaum Associates, Hillsdale, N. J., 1989, pp. 453-494.
[5] J. A. Collins, J. E. Greer & S. X. Huang, Adaptive assessment using granularity hierarchies and bayesian nets, Lecture Notes in Computer Science, 1086 (1996), pp. 569-577.
[6] C. Conati & K. VanLehn, POLA: a student modeling framework for probabilistic on-line assessment of problem solving performance, Proceedings of the Fifth International Conference on User Modeling, Kailua-Kona, HI, 1996, pp. 75-82.
[7] G. E. Forsythe & N. Wirth, Automatic grading programs, Communications of the ACM, 8 (1965), pp. 275-278.
[8] S. X. Huang, A content-balanced adaptive testing algorithm for computer-based training systems, Lecture Notes in Computer Science, 1086 (1996), pp. 306-314.
[9] P. Lira, M. Bronfman & J. Eyzaguirre, MULTITEST II: A program for the generation, correction, and analysis of multiple choice tests, IEEE Transactions on Education, 33 (1990), pp. 320-325.
[10] D. Maskell, Problem-based engineering design and assessment in a digital systems program, Paper presented at the Frontiers in Education Conference, Nov. 5-8, Pittsburgh, USA, 1997.
[11] Mathematical Sciences Education Board, Measuring what counts, A conceptual Guide for Mathematics Assessment, National Academy Press, USA, 1993.
[12] E. J. Mastascusa & B. Hoyt, Incorporating "computer-graded" components into electronics lessons, Paper presented at the Frontiers in Education Conference, Nov. 5-8, Pittsburgh, USA, 1997.
[13] R. Myers, Computerized grading of freshman chemistry laboratory experiments, Journal of Chemical Education, 63 (1986), pp. 507-509.
[14] National Science Education Standards, Assessments in Science Education, National Academy Press, USA, 1994.
[15] A. Patel & Kinshuk, Applied Artificial Intelligence for Teaching Numeric Topics in Engineering Disciplines, Lecture Notes in Computer Science, 1108 (1996), pp. 132-140.
[16] A. Patel & Kinshuk, Intelligent Tutoring Tools in a Computer Integrated Learning Environment for introductory numeric disciplines, Innovations in Education and Training International Journal, 34(3) (1997), pp. 200-207.
[17] J. Piotrowski, The small computer assisted lecturing system, SIGCSE Bulletin, 20 (1988), pp. 8-12.
[18] R. Posteraro, D. Blackwell & A. Huddleston, Techscore: A program for tabulating the results of multiple choice questions and correcting multiple choice examinations, Computers in Biology and Medicine, 16 (1986), pp. 259-265.
[19] R. M. Rottmann & H. T. Hudson, Computer grading as an instructional tool, Journal of College Science Teaching, 12 (1983), pp. 152-156.
[20] J. A. Stuart, A method for teaching problem assessment, Paper presented at the Frontiers in Education Conference, Nov. 5-8, Pittsburgh, USA, 1997.
[21] J. Taylor & D. Deever, Constructed-response, computer-graded homework, American Journal of Physics, 44 (1976), pp. 598-599.
[22] S. W. Valencia, You can’t have authentic assessment without authentic content, The Reading Teacher, 44(8) (1991), pp. 590-591.
[23] R. Waters & M. McCracken, Assessment and evaluation in problem-based learning, Paper presented at the Frontiers in Education Conference, Nov. 5-8, Pittsburgh, USA, 1997.
Table 1. Scores listed by Student, Package and Question
Report date | Course: 70 | Course description

| Stud. Id | Name | Package | Scheme | Question | % Score |
|---|---|---|---|---|---|
| 131 | Surname_1, First_Name_1 | ABS | 1.00 | 1.00 | 93.13 |
| | | Sub-total for ABS | | | 93.13 |
| | | MCC | 1.00 | 1.00 | 100.00 |
| | | | | 2.00 | 100.00 |
| | | Sub-total for MCC | | | 200.00 |
| | Surname_1, First_Name_1 | Total for all pkgs. | | | 293.13 |
| 316 | Surname_n, First_Name_n | ABS | | 1.00 | 91.67 |
| | | | | 2.00 | 91.67 |
| | | Sub-total for ABS | | | 183.33 |
| | | MCC | | 1.00 | 90.00 |
| | | | | 2.00 | 60.00 |
| | | | | 3.00 | 50.00 |
| | | | | 4.00 | 90.00 |
| | | Sub-total for MCC | | | 290.00 |
| | Surname_n, First_Name_n | Total for all pkgs. | | | 473.33 |
Table 2. Student scores and summary for a Course and Package
Student Performance Listing by Course
Report date | Course: 70 | Course description: Absorption Costing

| Stud. Id | Student Name | Que 1 | Que 2 | Que 3 | Que 4 | Total |
|---|---|---|---|---|---|---|
| 131 | Surname_1, First_Name_1 | 93.13 | 0.00 | | | 93.13 |
| 139 | Surname_2, First_Name_2 | 64.86 | 66.39 | | | 131.25 |
| 316 | Surname_n, First_Name_n | 91.67 | 91.67 | | | 183.34 |
| n = 3 | Scheme: 1, Avg % | 83.22 | 52.69 | | | 135.91 |
| n = 3 | Pkg: ABS, Avg % | 83.22 | 52.69 | | | 135.91 |

Student Performance Listing by Course
Report date | Course: 70 | Course description: Marginal Costing

| Stud. Id | Student Name | Que 1 | Que 2 | Que 3 | Que 4 | Total |
|---|---|---|---|---|---|---|
| 131 | Surname_1, First_Name_1 | 100.00 | 100.00 | 0.00 | 0.00 | 200.00 |
| 139 | Surname_2, First_Name_2 | 100.00 | 100.00 | 80.00 | 100.00 | 380.00 |
| 316 | Surname_n, First_Name_n | 90.00 | 60.00 | 50.00 | 90.00 | 290.00 |
| n = 3 | Scheme: 1, Avg % | 96.67 | 86.67 | 43.33 | 63.33 | 290.00 |
| n = 3 | Pkg: MCC, Avg % | 96.67 | 86.67 | 43.33 | 63.33 | 290.00 |