Friday, 21 January 2011

TESTS and THEIR DEFINITIONS


Language Testing
A test is a measuring device which we use when we want to compare an individual with other individuals who belong to the same group… Tests invite candidates to display their knowledge or skills in a concentrated fashion, so that the results can be graded and inferences made from the standard of performance in the test about the general standard of performance that can be expected from the candidate, either at the time of the test or at some future time.
1. DEFINITIONS

1.1 EVALUATION

a. According to Norman E. Gronlund:
Evaluation is a systematic and continuous process for determining the efficiency of teaching and learning activities and the effectiveness with which the stated instructional objectives have been achieved.
b. According to Edwin Wandt and Gerald W. Brown:
Educational evaluation is the process of determining the value of everything that relates to education.
c. According to Wiersma and Jurs:
Evaluation is a process that includes measurement and possibly also testing, and that also involves making decisions about value.
d. According to Suharsimi Arikunto:
Evaluation is the activity of measuring and judging.
Measurement:
Measurement is the process of assigning numbers, or of obtaining a numerical description of the degree to which a learner has attained a particular characteristic.
ASSESSMENT
  • Assessment answers the question of how good a learner's learning outcomes or achievement are.
  • The result of an assessment may be a qualitative value (a narrative statement in words) or a quantitative value (a number).
  • Measurement is concerned with the process of finding or determining that quantitative value.
TEST
  • A test is a series of questions or exercises used to measure the skills, knowledge, intelligence, abilities, or aptitudes of an individual or a group.
Characteristics of a Good Test

a. Valid
Validity here covers both face validity and content validity. Face validity is the way laymen (the learners, their parents, etc.) appraise the test, i.e. whether the test appears to test what it is supposed to test. Content validity is the question of whether the test reflects the content of the syllabus and whether it really measures what it is supposed to measure, and nothing else. For example, summarizing a text heard from tape checks not only writing but also listening comprehension and the ability to select, extract, and condense the most essential information; general-knowledge, intelligence-testing, and culturally loaded questions test not linguistic competence but extralinguistic knowledge or the analytical skills of the testee.

We can also distinguish between concurrent validity, related to testing the learners’ current command of the language, and predictive validity, i.e. assessing how well the learner will perform in future tasks, based on his/her current level of linguistic attainment. Validity also implies that the tasks should be as realistic as possible and closely related to the situations in which the examinees will perform in real life.

b. Reliable
Reliability is the consistency and credibility of measurement, that is, ensuring that the results of the test are not incidental.

A differentiation is usually made between:
  1. Test reliability, i.e. whether the test measures language consistently (short tests are considered less reliable than ones covering a representative sample of the material taught and using a variety of testing formats), and
  2. Marker (examiner/rater/scorer) reliability, which is closely connected with objectivity.

Marker reliability is low for testing speaking and writing. In speaking, the individual subjective tastes, norms and criteria of assessment of two examiners may differ considerably (the inter-marker reliability problem); moreover, the examiner’s assessment may be affected by the speaker’s physical appearance or by other personal preferences. In writing, the norms assumed by different examiners may again be incongruous, and the final score may also be affected by the place the test occupies in the pile: an average composition marked directly after a good one will probably be given a lower mark than if it came after a poor one. In addition, the examiner's growing fatigue may result in increasing irritation or, quite the contrary, in growing leniency (the intra-marker reliability problem). Marker reliability may be enhanced by increasing the number of examiners on the panel (in the case of oral interviews), by developing a set of specified analytic criteria and standards instead of holistic (impressionistic) assessment, or by grouping written works according to proficiency level prior to giving marks.
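
One way to put a number on the inter-marker problem described above is to correlate the marks two examiners give to the same pieces of work. The sketch below is only an illustration: the Pearson correlation is one possible agreement measure among several, and the function name and the marks are hypothetical, not taken from any real test.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of marks."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long lists with at least 2 marks")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("undefined when one marker gives identical marks")
    return cov / sqrt(var_x * var_y)

# Hypothetical marks given by two examiners to the same six compositions.
examiner_a = [12, 15, 9, 18, 14, 11]
examiner_b = [11, 16, 10, 17, 12, 12]

agreement = pearson_r(examiner_a, examiner_b)
print(round(agreement, 2))
```

A value close to 1 suggests the two examiners rank the compositions in much the same way; a low value suggests their norms differ. Note that a high correlation does not show that they give the same marks, only that their marks rise and fall together.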

c. Practical
This criterion becomes a concern in mass testing, when the progress of many learners has to be evaluated. Examining the practicality of a planned test involves, among other things, looking at:

  1. The preparation necessary to design the test (basically how long it is expected to take)
  2. The administering of the test proper (arrangement of seating, distribution of the test among the learners, supervision, necessary equipment, timing, etc.; for instance, oral interviewing of several students will be time-consuming; moreover, the students taking the exam later may be better prepared knowing the questions)
  3. Scoring (marking e.g. essays, translation or dictation pieces may be a time-consuming and therefore inefficient process)


Aims of testing
  • Research
  • Progress
  • Guide to teaching and the curriculum
  • Representing terminal behavior
Requirements of a good test
Validity: the degree to which a test measures what it is meant to measure, or can be used successfully for the purposes for which it is intended.
  • Face validity
  • Content validity
  • Construct validity
  • Empirical validity
Reliability: stability or consistency of test scores
Factors that may affect reliability include
  • The extent of the sample of material selected for testing
  • The administration of the test
  • Scoring the test
  • Test instructions
  • Personal factors
Discrimination: the degree to which a test or an item in a test distinguishes among better and weaker students who take the test
Practicality: the usability of a test, or practical considerations such as ease of administration, scoring and interpretation as well as financial limitations and time constraints
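
The discrimination requirement above can be illustrated for a single item with an upper-group versus lower-group comparison. This is a sketch of one common approach, not a method named in these notes: the function name, the default group fraction of 27% (a frequently used convention) and the student data are all assumptions made for the example.

```python
def discrimination_index(results, group_fraction=0.27):
    """Upper-lower discrimination index for one test item.

    results: list of (total_score, item_correct) pairs, one per student.
    group_fraction: share of students placed in each of the upper and
    lower groups (0.27 is an assumed convention, not from the notes).
    """
    if not results:
        raise ValueError("no results given")
    # Rank students from highest to lowest total score.
    ranked = sorted(results, key=lambda r: r[0], reverse=True)
    n = max(1, round(len(ranked) * group_fraction))
    upper = ranked[:n]
    lower = ranked[-n:]
    upper_correct = sum(1 for _, correct in upper if correct)
    lower_correct = sum(1 for _, correct in lower if correct)
    # +1: only the upper group got the item right; 0: no discrimination;
    # negative: weaker students did better on the item than stronger ones.
    return (upper_correct - lower_correct) / n

# Hypothetical class of ten students: (total score, answered item correctly).
students = [
    (95, True), (90, True), (85, True), (80, False), (75, True),
    (60, True), (55, False), (50, False), (45, True), (40, False),
]

d = discrimination_index(students)
print(round(d, 2))
```

With ten students and a fraction of 0.27, each group holds three students, so the index here compares the three strongest with the three weakest candidates.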
Types of language test
Tests distinguished by use
  • Achievement/attainment tests
  • Proficiency tests
  • Aptitude tests
  • Diagnostic tests
  • Placement tests
Tests distinguished by the standard for measuring
  • Criterion-referenced tests (e.g. achievement tests)
  • Norm-referenced tests (e.g. proficiency tests)
(A tabulated summary of CRT and NRT)
Interpreting test results
Measures of central tendency
  • The median
  • The arithmetic mean (or simply the mean)
  • The mode
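
The three measures of central tendency listed above can be computed with Python's standard `statistics` module. The scores below are hypothetical and serve only to show how the three values may differ for the same set of results.

```python
import statistics

# Hypothetical test scores for a class of nine students.
scores = [55, 60, 60, 65, 70, 70, 70, 80, 90]

mean_score = statistics.mean(scores)      # sum of scores divided by their number
median_score = statistics.median(scores)  # middle score once the scores are sorted
mode_score = statistics.mode(scores)      # most frequently occurring score

print(round(mean_score, 2), median_score, mode_score)
```

The mean is pulled towards extreme scores, whereas the median and the mode are not, which is why the three are usually reported together.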
Measures of dispersion
  • Range
  • Standard deviation (SD)
  • Variance
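
The measures of dispersion listed above can be sketched in the same way. The scores are hypothetical, and the example assumes the class is treated as the whole population (`pvariance`, `pstdev`); if the class is regarded as a sample of a larger group, `statistics.variance` and `statistics.stdev` would be the usual choice instead.

```python
import statistics

# Hypothetical test scores for a class of five students.
scores = [50, 60, 70, 80, 90]

score_range = max(scores) - min(scores)   # highest score minus lowest score
variance = statistics.pvariance(scores)   # mean squared deviation from the mean
std_dev = statistics.pstdev(scores)       # square root of the variance

# Sample variance divides by (n - 1) instead of n, so it is slightly larger.
sample_variance = statistics.variance(scores)

print(score_range, variance, round(std_dev, 2), sample_variance)
```

The standard deviation is expressed in the same units as the scores themselves, which makes it easier to interpret than the variance.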
Percentile ranks (or scales)
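
A percentile rank states what percentage of the group a given score outperforms. The sketch below assumes one common convention, in which scores equal to the given score count as half below it; other conventions exist, and the function name and scores are hypothetical.

```python
def percentile_rank(scores, score):
    """Percentage of scores below `score`, counting ties as half.

    This tie-handling rule is an assumed convention; some definitions
    count only the scores strictly below the given score.
    """
    if not scores:
        raise ValueError("no scores given")
    below = sum(1 for s in scores if s < score)
    equal = sum(1 for s in scores if s == score)
    return (below + 0.5 * equal) / len(scores) * 100

# Hypothetical test scores for a class of eight students.
scores = [40, 50, 60, 60, 70, 80, 90, 100]

print(percentile_rank(scores, 70))
print(percentile_rank(scores, 60))
```

Unlike a raw score, a percentile rank only has meaning relative to the group that took the test, so the same raw score can have different percentile ranks in different classes.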
