Introduction
The need for physical
testing for workers in manual materials handling jobs has been recognized
by risk managers, personnel specialists, physiologists, occupational physicians
and ergonomists increasingly over the past few years. Each of these groups
has independently come to recognize the benefits to both the individuals
being tested and the organizations themselves. Risk managers have an interest
in job safety and reducing workers' compensation costs. Personnel specialists
seek to hire the most qualified individual available, reduce absenteeism
and sick leave, and yet comply with state and federal EEO mandates. Physiologists
and physicians seek to reduce unnecessary injuries and find ways to better
predict in advance those most likely to become injured. Ergonomists seek
to study individual jobs and find ways through either job redesign or job
selection systems that can better match the worker to the work.
Cognizant of these
issues, MED-TOX Health Services has developed an approach to assist employers
in the validation of physical ability tests for new hires. Since overexertion
injuries account for a large number of all work-related back injuries,
it makes sense to reduce the potential for overexertion. Hiring workers
with adequate strength to perform the job is one way of reducing these
injuries. A valid strength test, therefore, can reduce injuries in jobs
for which high levels of strength are required. The MED-TOX approach has
two goals:
Provide employers
with a valid and legally defensible job analysis of the critical and essential,
frequently performed, and physically demanding tasks associated with the
occupation.
Provide employers
with a physical ability test that is job-related, valid, and reliable and that
can confidently be used in the selection of individuals for physically
demanding jobs.
The MED-TOX approach
is chosen on the basis of safety, reliability, and validity. Validity
and reliability are discussed below. Ability tests are safer than work sample
tests because they measure how much weight an applicant can lift rather
than requiring the applicant to lift a fixed heavy weight. An applicant who
lacks the strength to lift that weight can be injured during the test. An
ability test allows the employer to determine whether the applicant can lift
only 20 lbs. or as much as 200 lbs. in a safe, efficient and standardized
manner.
The MED-TOX approach
presented here is a criterion-related validation methodology. The approach
is designed to ensure that the ability test (selection device) is empirically
demonstrated to be related to the job. The empirical linkage between a
given job and a given test is not insignificant. Often testing (physical
or otherwise) is based on "face validity." Face validity is not
recognized by the courts and has been described by experts as a claim made
in the absence of meaningful data.
Because physical
ability tests can be subjected to a high standard of legal and administrative
review, empirical evidence is usually necessary to show job-relatedness.
A high standard of evidence is also necessary since all tests of strength
show adverse impact against females. Since a showing of adverse impact
requires the employer to demonstrate job-relatedness, documentation confirming
such a relationship becomes crucial.
Job Analysis
Once an employer's
job has been selected for study, it is necessary to conduct structured
group interviews with workers from that job. The job analysis inquiry is
directed at collecting tasks which require static strength. Static strength
involves the continuous exertion of maximum muscle force for a brief period
of time. Tasks that involve the lifting, pulling, pushing, or carrying of
objects and materials require static strength.
Following the structured
group interview, workers and the MED-TOX representative go to warehouses,
storerooms, and other work areas to directly examine tools, equipment and
materials that had been described by workers during the meeting. An industrial
scale and/or a force gauge is used to directly weigh as many of the relevant
objects as possible. If additional materials or tools are found that are
also lifted, these objects are weighed, the weights recorded, and the lifting
tasks added to the task listing. Multiple job analysis meetings may be
necessary if there are several different geographical locations or significant
differences in facilities or number of employees in each location. Once these
meetings have been conducted, a task inventory is produced. A task inventory
is a listing of all the tasks collected. Task inventories are generally
subject to several review phases prior to worker surveys.
In order to measure
a job, one needs a measuring tool. Rating scales are the most useful measuring
tools when performing job analysis activities with task inventories. MED-TOX
rating scales can have a number of customized features depending on the
job and specific organizational needs. To validate a strength test, however,
at a minimum it is important to elicit from workers:
Whether or not
the task is performed;
How far the object
is carried;
How often the task
is performed;
How important the
task is to the job; and
Whether persons
who have an easy time performing the task are more capable than workers
who have a difficult time performing the task.
A random sample
of workers completes the task inventory. The responses are entered into
a statistical software program for data analysis. The first step in data
analysis is the computation of the percentage of workers that perform each
individual task. Tasks performed by fewer than 50% of the workers are eliminated
from further consideration. Means are next computed for each dimension
(How Far, Frequency, Importance and Ease). Next, a criticality index is
calculated. This index is the product of the mean Importance rating and
the mean Frequency rating. This product is then multiplied by the percentage
of workers who indicate they performed the task and divided by 100. Thus,
greater weight is given to tasks performed by all workers and less weight
is assigned to tasks performed by fewer workers. Tasks performed by all
workers are more likely to be critical and essential core job tasks than
those performed by only some workers. Tasks with high criticality ratings
are identified as essential job functions.
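
The index calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the dictionary keys, and the numeric rating values are hypothetical, and the sketch assumes the mean ratings are taken over the workers who report performing the task, which the description above does not specify.

```python
from statistics import mean

def criticality_index(responses, min_percent=50.0):
    """Return the criticality index for one task, or None if the task is dropped.

    responses: one dict per surveyed worker (hypothetical layout) with a
    'performs' flag and, for workers who perform the task, numeric
    'importance' and 'frequency' ratings.
    """
    if not responses:
        raise ValueError("at least one survey response is required")
    performers = [r for r in responses if r["performs"]]
    percent = 100.0 * len(performers) / len(responses)
    # Tasks performed by fewer than 50% of workers are eliminated.
    if percent < min_percent:
        return None
    # Assumption: means are computed over the workers who perform the task.
    mean_importance = mean(r["importance"] for r in performers)
    mean_frequency = mean(r["frequency"] for r in performers)
    # Importance x Frequency, weighted by the percentage performing the task.
    return mean_importance * mean_frequency * percent / 100.0

# Made-up survey answers for a single task (four workers, three perform it).
sample = [
    {"performs": True, "importance": 4, "frequency": 2},
    {"performs": True, "importance": 5, "frequency": 4},
    {"performs": True, "importance": 3, "frequency": 3},
    {"performs": False},
]
print(criticality_index(sample))  # 9.0
```

In this made-up example the mean Importance is 4, the mean Frequency is 3, and 75% of workers perform the task, so the index is 4 x 3 x 75 / 100 = 9.0.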

Work Sample
Development
Having determined
which static strength tasks are critical for the job, it is next necessary
to determine which tasks are suitable for utilization as work samples.
Ideally, the tasks selected should be among the most demanding tasks workers
are expected to perform. Additionally, other criteria should be considered
including:
Safety to incumbents.
Tasks selected should be safe to perform in a testing situation. Some tasks
might not be dangerous to experienced workers, but could be to a novice.
Reasonable time
to administer. The tasks selected for work sample development should
be those which can be completed in a reasonable amount of time.
Unambiguous
scoring and clarity of results. Tasks selected should be amenable to
an unambiguous scoring or rating system. There should be no disagreement
as to what constitutes various levels of performance. Subjective ratings
on "style of lifting" or "ease of lifting" are less
suitable when objective measures are possible.
Regional availability
and low cost of equipment. The materials necessary for task performance
should be readily available and inexpensive.
Simplicity.
The tasks selected should be as simple as possible from both the point
of view of instruction to incumbents and administration of the work sample.
Commonality.
The tasks selected should be commonly performed by as many workers as possible.
Critical tasks
that meet the criteria can be categorized in a variety of ways. For example,
all tasks involving the use of a wheelbarrow might form a group or task
category. Alternatively, all tasks that involve work at a particular work
site, or all tasks performed while unloading a boxcar or repairing heavy
equipment could form other groups. The nature of the job and tasks performed
typically lend themselves to the selection of appropriate task categories.
Task categories are important because they help the analyst organize the
work and ensure that a variety of lifting tasks can be used to construct
work samples. An example of a category might be:
Five
Gallon Container (Paint, Joint Compound, Floor Sealer) Tasks
Lift/carry a five
gallon can of floor sealer (approx. weight 46.3 lbs.).
Carry a five gallon
bucket of paint (approx. weight 55.4 lbs.).
Lift/handle a five
gallon bucket of joint compound (approx. weight 51 lbs.).
Lift a five gallon
bucket of paint into the back of a vehicle (approx. weight 55.4 lbs.).
Lift a five gallon
bucket of paint up onto a stack of other five gallon paint buckets (approx.
weight 55.4 lbs.).
Work samples may
then be developed from these categories of common critical tasks. For example,
a work sample built from these tasks might be:
Five Gallon
Bucket Stack
Approach a row
of four five-gallon buckets of material. Stack three of the buckets on
top of one of the buckets of paint. Take the top bucket of paint off the
stack and carry it to the truck bed. Set it down and release grip. Regrip
paint can and return to the stack of three. Place the can beside the stack
and replace the two remaining cans on the ground in a row, as they were
initially.
Selecting Appropriate
Static Strength Tests
MED-TOX has used
the Jackson Strength Evaluation System (JSES) in several projects and has
found it to be a valid and reliable predictor of the ability of individuals
to perform lifting, pushing, pulling, and carrying tasks.
The JSES was developed
by Dr. Andrew S. Jackson of the University of Houston. It features an electronic
load cell to ensure accurate and reproducible readings of isometric strength.
Large readouts allow determination of both peak and average strength in
pounds. The system includes the control and load cell, a hand dynamometer
fixture for the measurement of grip strength, and a heavy duty lifting
platform, bar and chain. The manufacturer reports that the JSES is widely
used to measure static strength using the National Institute for Occupational
Safety and Health (NIOSH) protocol.
The JSES has three
qualities that make it ideal for employment testing. It has been shown
to be safe, reliable (r = .90), and practical. Results should be
obtainable within 15 minutes. The JSES is widely recognized as a reliable
and valid indicator of the amount of static strength possessed by individuals.
At the present time many industrial medical clinics and employers are obtaining
the JSES. The test is relatively inexpensive (it can be obtained for less
than $4,500), practical, safe and portable. Normative data for the JSES are available.
In 1995, the EEOC
issued guidelines which attempted to clarify the difference between a medical
test and a physical ability test. According to the EEOC a medical test
was more likely to measure an individual's "physiological response"
to performing a task whereas a physical ability test measured task performance
directly. MED-TOX questioned the EEOC as to whether the JSES would be considered
a medical or physical test since it measured an individual's strength (a
physiologic response) and was not a direct measure of a particular task
but could predict performance on a variety of lifting tasks. In a December
7, 1995 letter to MED-TOX, the EEOC stated:
The answer to your
inquiry concerning the Jackson Strength Evaluation System (JSES) depends
on the context in which this test is given. For example, if the JSES is
used simply to determine whether a person is capable of lifting a thirty
pound box and carrying it twenty feet, the test would not be considered
"medical" and could be administered pre-offer. In this context,
it would not be dispositive that the test is used or interpreted by a health
professional. Similarly, the score one achieves on the test -- provided
the only thing being measured is the amount of weight the person can lift
-- does not render the test a medical one even though, as you put it, "strength"
is being measured. As the Enforcement Guidance illustrates at pages 14
and 15, an assessment of whether a person can lift a fifty pound box is
a physical ability test as distinguished from a medical exam. If you were
to measure the person's heart rate after the act of lifting, you would
then be engaging in a medical examination. While the distinction may appear
subtle, it is legally significant and constitutes the difference between
what can and cannot be done under the ADA before a conditional offer of
employment [EEOC, personal communication to MED-TOX, December 7, 1995].
In other words,
the EEOC would permit the use of the JSES (in the pre-employment context)
to the extent that test performance was related to the ability of an individual
to perform a specific task or group of tasks. In order to establish a relationship
between the JSES and work performance, it is necessary to conduct field
testing with workers (participants).
Field Testing
and Data Analysis
A stratified random
sample of experienced workers is typically chosen for testing. The sample
should consist of individuals from various ages, racial groups and both
genders. Of course, many organizations will not have a significant number
of females for testing nor will they have individuals employed who cannot
perform the job. Without representatives from these groups, it is more
difficult to set a defensible cut-off score. Therefore, we suggest that
administrative and clerical workers participate in field testing as well.
Field testing consists
of a brief medical screening, informed consent, an explanation of the testing,
and height and weight measurements. Next the participants are administered
the JSES. Participants exert a constant force for three seconds on the
four tests which use the lifting bar and for three seconds using the Jamar hand
dynamometer in accordance with the manufacturer's instructions.

The electronic monitor
connected to each load cell records the amount of force exerted in pounds
of force. Peak and average force are recorded.
For the Grip Strength
test, participants squeeze on the hand dynamometer first with the dominant hand and then with the
nondominant hand. For this test, peak grip strength is recorded.

During the Arm Lift, participants stand erect with palms up, their elbows at the side,
and forearms at a 90 degree angle to pull up on the lifting bar.

The Shoulder Lift
also requires the participants to stand erect but with their palms down. The participants
then pull up on the bar as if lifting a jackhammer.

The Torso
Pull requires the participants to sit on the ground with their legs extended and their feet flush against the lifting platform which is placed against a wall. Participants pull back with their arms and legs extended.

The Leg Lift test
requires the participant to squat with the arms extended downward. The lifting motion is entirely in the legs as they are straightened.
Three trials are
conducted for each participant on each of the five tests, with the average
of the last two trials used as the score. Scores are recorded for each
trial.
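
As a small illustration of this scoring rule, the sketch below averages the last two of three trials. The function name and the sample readings are hypothetical; only the rule itself comes from the description above.

```python
def score_trials(trials):
    """Score one JSES test from three trial readings (pounds of force).

    The first trial is recorded but not scored; the score is the mean of
    the second and third trials.
    """
    if len(trials) != 3:
        raise ValueError("exactly three trials are expected")
    return (trials[1] + trials[2]) / 2

# Made-up readings for one participant on one test.
print(score_trials([88.0, 92.0, 96.0]))  # 94.0
```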
Next, the simulations
are performed by the participants. The simulations consist of actual work
samples of the job. Several events such as the Five Gallon Bucket Stack
described above will have been constructed. Participants are given ample
time to rest between events and may decline testing at any time. Two timers
use stopwatches to record the time it takes for each participant to complete
each work sample. The two stopwatch times are averaged and recorded
as the score.
Participants are
instructed not to run or to perform the work at an unnatural pace.
Participants are asked to envision a day on which they have a lot of different
tasks to perform. When one task is completed, other important tasks are
to follow. Participants are instructed to work at what might be considered
a heavier than average pace, but not one that is unrealistic or unrepresentative
of the pace at which they might work on a busy day.
Following testing,
participants estimate their personal fitness level, the minimum level of
performance that they would consider acceptable for each work sample, how
realistic each work sample is, and additional questions that are utilized
to assist in setting the cut-off score.
Statistical
Analysis
Reliability of
the JSES is assessed by comparing the scores of the two recorded trials
on each test. Reliability typically varies from a low of .94 to a high
of .97.
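
One common way to compare two trials is a Pearson correlation across participants. The text above does not name the coefficient used, so the sketch below is an assumption rather than the documented MED-TOX procedure, and the function name and data are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Made-up second- and third-trial readings (pounds) for five participants.
trial_2 = [82.0, 95.0, 110.0, 67.0, 124.0]
trial_3 = [85.0, 93.0, 113.0, 70.0, 121.0]
print(round(pearson_r(trial_2, trial_3), 3))
```

A value close to 1.0 means participants who score high on one trial also score high on the other, which is what a reliable test should show.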
Correlation coefficients
are computed for all tests to determine how strongly, if at all, they are
interrelated. Multiple regression analysis is used to derive equations that
predict work sample performance for individuals who have taken only
the JSES.
Validity is assessed
by examining how well each regression equation predicts work sample
performance. A perfectly predictive equation would have
an R-squared of 1.0, and an R-squared of 0.0 would indicate that the equation
had no ability to predict job performance at all.
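
The regression and R-squared steps can be illustrated with a short, dependency-free Python sketch. It assumes ordinary least squares with an intercept, solved through the normal equations. The function names and any data passed in are hypothetical, and a real analysis would normally use a statistical package.

```python
def fit_linear(X, y):
    """Ordinary least squares with an intercept.

    X: list of rows of predictor scores (e.g. JSES test scores).
    y: list of outcomes (e.g. work sample times).
    Returns [intercept, b1, ..., bk].
    """
    rows = [[1.0] + [float(v) for v in r] for r in X]
    k = len(rows[0])
    # Augmented normal equations: (A^T A | A^T y).
    aug = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
           + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]

def predict(coefs, row):
    """Predicted outcome for one row of predictor scores."""
    return coefs[0] + sum(c * x for c, x in zip(coefs[1:], row))

def r_squared(coefs, X, y):
    """Share of outcome variance explained by the equation."""
    mean_y = sum(y) / len(y)
    ss_res = sum((t - predict(coefs, r)) ** 2 for r, t in zip(X, y))
    ss_tot = sum((t - mean_y) ** 2 for t in y)
    return 1.0 - ss_res / ss_tot
```

Given fitted coefficients, `predict` estimates a work sample result from JSES scores alone, and `r_squared` returns a value near 1.0 when the equation predicts the observed results closely and 0.0 when it does no better than always guessing the mean.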
Passing Levels
(Cut-off scores)
Setting cut-off
scores is a particularly complex area of test construction. MED-TOX utilizes
multiple forms of evidence to arrive at a cut-off level that is consistent
with business necessity. The cut-off scores permit the selection of qualified
workers and are based on the results of the task analysis, the performance
of currently employed workers, and their judgments as to what constitutes
acceptable performance. As each test validation situation is unique, no
perfect formula can be offered in advance here.
Conclusion
MED-TOX offers
services in the criterion-related validation of physical ability tests.
The tests are based on a comprehensive job analysis and field testing of
workers performing work samples and their scores on the JSES. The tests
will permit the inclusion of individuals most likely to be able to perform
the tasks without undue risk of injury to themselves and the screening out of
persons who do not possess sufficient physical ability to adequately perform
the job.
© 2007 MED-TOX HEALTH SERVICES