Introduction
The need for physical
testing for workers in manual materials handling jobs has been
recognized
by risk managers, personnel specialists, physiologists, occupational
physicians
and ergonomists increasingly over the past few years. Each of these
groups
has independently come to recognize the benefits to both the
individuals
being tested and the organizations themselves. Risk managers have an
interest
in job safety and reducing workers' compensation costs. Personnel
specialists
seek to hire the most qualified individual available, reduce
absenteeism
and sick leave, and yet comply with state and federal EEO mandates.
Physiologists
and physicians seek to reduce unnecessary injuries and find ways to
better
predict in advance those most likely to become injured. Ergonomists
seek
to study individual jobs and find ways through either job redesign or
job
selection systems that can better match the worker to the work.
Cognizant of these
issues, MED-TOX Health Services has developed an approach to assist
employers
in the validation of physical ability tests for new hires. Since
overexertion
injuries account for a large number of all work-related back injuries,
it makes sense to reduce the potential for overexertion. Hiring workers
with adequate strength to perform the job is one way of reducing
these
injuries. A valid strength test, therefore, can reduce injuries in jobs
for which high levels of strength are required. The MED-TOX approach
has
two goals:
- Provide employers
with a valid and legally defensible job analysis of the critical and
essential,
frequently performed, and physically demanding tasks associated with
the
occupation.
- Provide employers
with a physical ability test that is job-related, valid, and reliable
that
can confidently be used in the selection of individuals for physically
demanding jobs.
The MED-TOX approach
is chosen on the basis of safety, reliability, and validity.
Validity
and reliability are discussed below. Ability tests are safer than work
sample
tests because they determine how much weight an applicant can lift
rather than asking the applicant to lift a fixed heavy weight. An
applicant who lacks the strength to lift that weight can
become injured during the test. Using an ability test allows the
employer
to determine whether the applicant can lift only 20 lbs. or as much as
200 lbs. in a safe,
efficient and standardized manner.
The MED-TOX approach
presented here is a criterion-related validation methodology. The
approach
is designed to ensure that the ability test (selection device) is
empirically
demonstrated to be related to the job. The empirical linkage between a
given job and a given test is important. Often testing
(physical
or otherwise) is based on "face validity." Face validity is not
recognized by the courts and has been described by experts as a claim
made
in the absence of meaningful data.
Because physical
ability tests can be subjected to a high standard of legal and
administrative
review, empirical evidence is usually necessary to show
job-relatedness.
A high standard of evidence is also necessary since all tests of
strength
show adverse impact against females. Since a showing of adverse impact
requires the employer to demonstrate job-relatedness, documentation
confirming
such a relationship becomes crucial.
Job Analysis
Once an employer's
job has been selected for study, it is necessary to conduct structured
group interviews with workers from that job. The job analysis inquiry
is
directed at collecting tasks which require static strength. Static
strength
involves the continuous exertion of maximum muscle force for a brief
period
of time. Tasks that involve the lifting, pulling, pushing, or carrying of
objects and materials require static strength.
Following the
structured
group interview, workers and the MED-TOX representative go to
warehouses,
storerooms, and other work areas to directly examine tools, equipment
and
materials that had been described by workers during the meeting. An
industrial
scale and/or a force gauge is used to directly weigh as many of the
relevant
objects as possible. If additional materials or tools are found that
are
also lifted, these objects are weighed, the weights recorded, and the
lifting
tasks added to the task listing. Multiple job analysis meetings may be
necessary if there are several different geographical locations or
significant
differences in facilities or number of employees at each location. Once
these
meetings have been conducted, a task inventory is produced. A task
inventory
is a listing of all the tasks collected. Task inventories are generally
subject to several review phases prior to worker surveys.
In order to measure
a job, one needs a measuring tool. Rating scales are the most useful
measuring
tools when performing job analysis activities with task inventories.
MED-TOX
rating scales can have a number of customized features depending on the
job and specific organizational needs. To validate a strength test,
however,
at a minimum it is important to elicit from workers:
- Whether or not the task is
performed;
- How far the object is carried;
- How often the task is performed;
- How important the task is to the
job; and
- Whether persons who efficiently
perform the task are more capable than workers
who have a difficult time performing the task.
A random sample of
workers completes the task inventory. The responses are entered into
a statistical software program for data analysis. The first step in
data analysis is the computation of the percentage of workers that
perform each
individual task. Tasks performed by fewer than 50% of the workers are
eliminated from further consideration. Means are next computed for each
dimension
(How Far, Frequency, Importance and Proficiency). Next, a criticality
index is calculated. This index is the product of the mean Importance
rating and
the mean Frequency rating. This product is then multiplied by the
percentage of workers who indicate they performed the task and divided
by 100. Thus,
greater weight is given to tasks performed by all workers and less
weight is assigned to tasks performed by fewer workers. Tasks performed
by all
workers are more likely to be critical and essential core job tasks
than those performed by only some workers. Tasks with high criticality
ratings
are identified as essential job functions.
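The calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record keys (performs, importance, frequency), the function name, and the idea that ratings are averaged over the workers who report performing the task are all hypothetical, since the text does not specify the rating scale or the survey layout.

```python
from statistics import mean


def criticality_index(responses, min_percent=50.0):
    """Return the criticality index for one task, or None if the task is dropped.

    `responses` holds one dict per surveyed worker with the hypothetical keys
    'performs' (bool), 'importance' and 'frequency' (numeric ratings).
    """
    performers = [r for r in responses if r["performs"]]
    percent = 100.0 * len(performers) / len(responses)
    if percent < min_percent:
        return None  # performed by too few workers; eliminated from consideration
    # Assumption: means are taken over workers who actually perform the task.
    mean_importance = mean(r["importance"] for r in performers)
    mean_frequency = mean(r["frequency"] for r in performers)
    # Product of the two means, weighted by the percentage performing the task.
    return mean_importance * mean_frequency * percent / 100.0


# Hypothetical survey data for a single task (four workers, three perform it).
survey = [
    {"performs": True, "importance": 4, "frequency": 3},
    {"performs": True, "importance": 5, "frequency": 3},
    {"performs": True, "importance": 3, "frequency": 3},
    {"performs": False, "importance": 0, "frequency": 0},
]
print(criticality_index(survey))  # 4 * 3 * 75 / 100 = 9.0
```

With the same mean ratings, a task performed by every worker would score 12.0 rather than 9.0, which is the weighting effect described above.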
Work Sample
Development
Having determined
which static strength tasks are critical for the job, it is next
necessary
to determine which tasks are suitable for utilization as work samples.
Ideally, the tasks selected should be among the most demanding tasks
workers
are expected to perform. Additionally, other criteria should be
considered
including:
- Safety to
incumbents.
Tasks selected should be safe to perform in a testing situation. Some
tasks
might not be dangerous to experienced workers, but could be to a novice.
- Reasonable time
to administer. The tasks selected for work sample development
should
be those which can be completed in a reasonable amount of time.
- Unambiguous
scoring and clarity of results. Tasks selected should be amenable to
an unambiguous scoring or rating system. There should be no
disagreement
as to what constitutes various levels of performance. Subjective
ratings
on "style of lifting" or "ease of lifting" are less
suitable when objective measures are possible.
- Regional
availability
and low cost of equipment. The materials necessary for task
performance
should be readily available and inexpensive.
- Simplicity.
The tasks selected should be as simple as possible from both the point
of view of instruction to incumbents and administration of the work
sample.
- Commonality.
The tasks selected should be commonly performed by as many workers as
possible.
Critical tasks
that meet the criteria can be categorized in a variety of ways. For
example,
all tasks involving the use of a wheelbarrow might form a group or task
category. Alternatively, all tasks that involve work at a particular work
site, or all tasks performed while unloading a box car or repairing
heavy
equipment could form other groups. The nature of the job and tasks
performed
typically lend themselves to the selection of appropriate task
categories.
Task categories are important because they help the analyst organize
the
work and ensure that a variety of lifting tasks can be used to
construct
work samples. An example of a category might be:
Five
Gallon Container (Paint, Joint Compound, Floor Sealer) Tasks
- Lift/carry a five gallon can of
floor sealer (approx. weight 46.3 lbs.).
- Carry a five gallon bucket of
paint
(approx. weight 55.4 lbs.).
- Lift/handle a five gallon bucket
of
joint compound (approx. weight 51 lbs.).
- Lift a five gallon bucket of paint
into the back of a vehicle (approx. weight 55.4 lbs.).
- Lift a five gallon bucket of paint
up onto a stack of other five gallon paint buckets (approx.
weight 55.4 lbs.).
Work samples may then be developed
from these categories of common critical tasks. For example,
a work sample built from these tasks might be described as:
Five Gallon
Bucket Stack
Approach a row of
four five-gallon buckets of material. Stack three of the buckets on
top of one of the buckets of paint. Take the top bucket of paint off
the
stack and carry it to the truck bed. Set it down and release grip.
Regrip
paint can and return to the stack of three. Place the can beside the
stack
and replace the two remaining cans on the ground in a row, as they were
initially.
Selecting Appropriate
Static Strength Tests
MED-TOX has used
the Jackson Strength Evaluation System (JSES) in several projects and
has
found it to be a valid and reliable predictor of the ability of
individuals
to perform lifting, pushing, pulling, and carrying tasks.
The JSES was
developed
by Dr. Andrew S. Jackson of the University of Houston. It features an
electronic
load cell to ensure accurate and reproducible readings of isometric
strength.
Large readouts allow determination of both peak and average strength in
pounds. The system includes the control and load cell, a hand
dynamometer
fixture for the measurement of grip strength, and a heavy duty lifting
platform, bar and chain. The manufacturer reports that the JSES is
widely
used to measure static strength using the National Institute for
Occupational
Safety and Health (NIOSH) protocol.
The JSES has three
qualities that make it ideal for employment testing. It has been shown
to be safe, reliable (r = .90), and practical. Results should be
obtainable within 15 minutes. The JSES is widely recognized as a
reliable
and valid indicator of the amount of static strength possessed by
individuals.
At the present time many industrial medical clinics and employers are
obtaining
the JSES. The test is relatively inexpensive (it can be obtained for
less
than $4,500), practical, safe and portable. Normative data for
the JSES are available.
In 1995, the EEOC
issued guidelines which attempted to clarify the difference between a
medical
test and a physical ability test. According to the EEOC a medical test
was more likely to measure an individual's "physiological response"
to performing a task whereas a physical ability test measured task
performance
directly. MED-TOX questioned the EEOC as to whether the JSES would be
considered
a medical or physical test since it measured an individual's strength
(a
physiologic response) and was not a direct measure of a particular task
but could predict performance on a variety of lifting tasks. In a
December
7, 1995 letter to MED-TOX, the EEOC stated:
The answer to your
inquiry concerning the Jackson Strength Evaluation System (JSES)
depends
on the context in which this test is given. For example, if the JSES is
used simply to determine whether a person is capable of lifting a
thirty
pound box and carrying it twenty feet, the test would not be considered
"medical" and could be administered pre-offer. In this context,
it would not be dispositive that the test is used or interpreted by a
health
professional. Similarly, the score one achieves on the test -- provided
the only thing being measured is the amount of weight the person can
lift
-- does not render the test a medical one even though, as you put it,
"strength"
is being measured. As the Enforcement Guidance illustrates at pages 14
and 15, an assessment of whether a person can lift a fifty pound box is
a physical ability test as distinguished from a medical exam. If you
were
to measure the person's heart rate after the act of lifting, you would
then be engaging in a medical examination. While the distinction may
appear
subtle, it is legally significant and constitutes the difference
between
what can and cannot be done under the ADA before a conditional offer of
employment [EEOC, personal communication to MED-TOX, December 7, 1995].
In other words,
the EEOC would permit the use of the JSES (in the pre-employment
context)
to the extent that test performance was related to the ability of an
individual
to perform a specific task or group of tasks. In order to establish a
relationship
between the JSES and work performance, it is necessary to conduct field
testing with workers (participants).
Field Testing
and Data Analysis
A stratified random
sample of experienced workers is typically chosen for testing. The
sample
should include individuals of various ages and racial groups and of both
genders. Of course, many organizations will not have a significant
number
of females for testing nor will they have individuals employed who
cannot
perform the job. Without representatives from these groups, it is more
difficult to set a defensible cut-off score. Therefore, we suggest that
administrative and clerical workers participate in field testing as
well.
Field testing
consists
of a brief medical screening, informed consent, an explanation of the
testing,
and height and weight measurements. Next the participants are
administered
the JSES. Participants exert a constant force for three seconds on the
four tests that use the lifting bar and for three seconds using the
Jamar hand
dynamometer in accordance with the manufacturer's instructions.
The electronic
monitor
connected to each load cell records the amount of force exerted in
pounds.
Peak and average force are recorded.
For the Grip
Strength
test, participants squeeze on the hand dynamometer first with the
dominant hand and then with the nondominant hand. For this test, peak
grip strength is recorded.

During the Arm Lift, participants
stand erect with palms up, their elbows at the side,
and forearms at a 90 degree angle to pull up on the lifting bar.

The Shoulder Lift
also requires the participants to stand erect but with their palms
down. The participants
then pull up on the bar as if lifting a jackhammer.

The Torso
Pull requires the participants to sit on the ground with their legs
extended and their feet flush against the lifting platform which is
placed against a wall. Participants pull back with their arms and legs
extended.

The Leg Lift test
requires the participant to squat with the arms extended downward. The
lifting motion is entirely in the legs as they are straightened.
Three trials are
conducted for each participant on each of the five tests, with the
average
of the last two trials used as the score. Scores are recorded for each
trial.
Next, the
simulations
are performed by the participants. The simulations consist of actual
work
samples of the job. Several events such as the Five Gallon Bucket Stack
described above will have been constructed. Participants are given
ample
time to rest between events and to decline testing at any time. Two
timers
use stop watches to record the time it takes for each participant to
complete
each work sample. Times are averaged for both stop watches and recorded
as the score.
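The two scoring rules described so far (the average of the last two of three JSES trials, and the average of the two stop watch times for a work sample) can be sketched as follows. The function names are hypothetical and the sample values are invented for illustration only.

```python
def jses_score(trials):
    """Score one JSES test: the average of the last two of three trials (pounds)."""
    if len(trials) != 3:
        raise ValueError("expected exactly three trials")
    # The first trial is recorded but does not count toward the score.
    return (trials[1] + trials[2]) / 2


def work_sample_time(timer_a_seconds, timer_b_seconds):
    """Score one work sample: the mean of the two timers' stop watch readings."""
    return (timer_a_seconds + timer_b_seconds) / 2


# Invented readings for one participant.
print(jses_score([100, 110, 120]))   # 115.0
print(work_sample_time(62.0, 63.0))  # 62.5
```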
Participants are
instructed not to run or to perform the work at an unnatural
pace.
Participants are asked to envision a day on which they have many
different
tasks to perform. When one task is completed, other important tasks
are
to follow. Participants are instructed to work at what might be
considered
a heavier than average pace, but not one that is unrealistic or
unrepresentative
of the pace at which they might work on a busy day.
Following testing,
participants estimate their personal fitness level, the minimum level
of
performance that they would consider acceptable for each work sample,
how
realistic each work sample is, and additional questions that are
utilized
to assist in setting the cut-off score.
Statistical
Analysis
Reliability of
the JSES is assessed by comparing the scores of the two recorded trials
on each test. Reliability typically varies from a low of .94 to a high
of .97.
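One common way to compare two trials is a test-retest correlation. The text does not name the statistic used, so the sketch below assumes a Pearson correlation between participants' scores on the two scored trials. The function name and the sample scores are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Invented arm lift scores (pounds) for five participants on two trials.
trial_2 = [82.0, 95.0, 60.0, 110.0, 74.0]
trial_3 = [85.0, 93.0, 63.0, 108.0, 78.0]
print(round(pearson_r(trial_2, trial_3), 3))
```

A value near 1.0 means participants kept nearly the same rank order and spacing from one trial to the next.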
Correlation
coefficients
are computed for all tests to determine their interrelationships and
lack
thereof. Multiple regression analysis is used to derive equations to
predict
the performance of individuals on the work sample test who have only
taken
the JSES.
Validity is
assessed
by statistical analysis as to how well each regression equation is
predictive
of work sample performance. A perfectly predictive equation would have
an R-squared of 1.0, and an R-squared of 0.0 would indicate that the
equation
had no ability to predict job performance at all.
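The regression and R-squared steps can be sketched in plain Python. This is a minimal illustration, not the procedure used in any actual MED-TOX study: it assumes ordinary least squares with an intercept, solves the normal equations directly, and uses invented data in which two JSES scores predict a work sample time. All function names are hypothetical.

```python
def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns [b0, b1, ..., bk]."""
    rows = [[1.0] + list(r) for r in X]  # prepend the intercept column
    k = len(rows[0])
    # Normal equations (A^T A) b = A^T y, stored as an augmented matrix.
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]


def predict(coeffs, row):
    """Predicted work sample score for one participant's JSES scores."""
    return coeffs[0] + sum(c * v for c, v in zip(coeffs[1:], row))


def r_squared(observed, predicted):
    """1.0 for perfect prediction; 0.0 when no better than the mean."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Invented data: [arm lift, leg lift] in pounds -> work sample time in seconds.
jses = [[100, 50], [150, 60], [200, 90], [120, 80], [180, 70]]
times = [76.0, 63.0, 52.0, 68.5, 56.0]
coeffs = fit_linear(jses, times)
fitted = [predict(coeffs, row) for row in jses]
print(round(r_squared(times, fitted), 3))
```

In practice a statistical package would report the same quantities along with significance tests. The hand-rolled solver is used here only to keep the example self-contained.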
Passing Levels
(Cut-off scores)
Setting cut-off
scores is a particularly complex area of test construction. MED-TOX
utilizes
multiple forms of evidence to arrive at a cut-off level that is consistent
with business necessity. The cut-off scores permit the selection of
qualified
workers, are based on the results of the task analysis, and on the
performance
of currently employed workers and their judgments as to what
constitutes
acceptable performance. As each test validation situation is unique, no
perfect formula can be offered in advance here.
Conclusion
MED-TOX offers
services in the criterion-related validation of physical ability tests.
The tests are based on a comprehensive job analysis and field testing
of
workers performing work samples and their scores on the JSES. The tests
permit employers to select the individuals most likely to be able to
perform
the tasks without undue risk of injury to themselves and to screen out
persons who do not possess sufficient physical ability to adequately
perform
the job.