Understanding the Bayley Scales of Infant Development: A Comprehensive Assessment Tool

Sam Tuffun, PT, DPT
May 8, 2026


When it comes to evaluating the developmental progress of infants and toddlers, healthcare professionals rely on standardized, evidence-based tools to ensure accurate assessments. Among these, the Bayley Scales of Infant and Toddler Development stands as the gold standard, offering clinicians and researchers a comprehensive framework for identifying developmental delays and planning appropriate interventions.

Developmental disabilities affect approximately one in six children in the United States, making early identification crucial for optimal outcomes. The Bayley Scales provide professionals with a reliable method to assess young children during their most critical developmental period—from 16 days to 42 months of age.

What Are the Bayley Scales?

The Bayley Scales of Infant and Toddler Development (BSID) represent an extensive formal developmental assessment tool specifically designed to diagnose developmental delays in early childhood. Originally developed by psychologist Nancy Bayley and first published in 1969, these scales have evolved through multiple editions to reflect current research and clinical best practices.

The assessment consists of developmental play tasks that typically take 30 to 70 minutes to administer, depending on the child's age and cooperation level. Unlike intelligence tests that produce an IQ score, the Bayley Scales generate a developmental quotient (DQ) that reflects how a child's performance compares to typically developing peers of the same age.

Billing for Bayley-4 Assessments: CPT Codes Clinicians Need to Know

A practical gap in most Bayley-4 resources, including clinical training materials, is the absence of billing guidance. For occupational therapists, physical therapists, and speech-language pathologists administering this assessment in outpatient or early intervention settings, knowing the correct CPT codes is as important as knowing the domains themselves.

The two codes that apply to standardized developmental testing like the Bayley-4 are CPT 96112 and CPT 96113.

CPT 96112 covers the first hour of developmental test administration, including interpretation and report generation. This is the primary code used when a qualified clinician conducts the Cognitive, Language, and Motor scales in a single session. CPT 96113 is an add-on code billed for each additional 30 minutes of testing beyond the first hour — relevant for longer Bayley-4 sessions with younger or more complex patients where administration extends past the 60-minute mark.
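The time arithmetic described above can be sketched in a few lines of Python. This is an illustration only: the function name and the min_partial_minutes parameter are hypothetical, and the sketch assumes that any session of up to one hour bills a single unit of 96112. Payers differ on how much of a 30-minute block must be completed before an add-on unit counts, so that threshold is left as a parameter rather than fixed.

```python
def developmental_testing_units(total_minutes: int, min_partial_minutes: int = 30) -> dict:
    """Estimate unit counts for CPT 96112 (first hour) and 96113 (each additional 30 minutes).

    min_partial_minutes is an assumed, payer-specific threshold: a leftover
    partial block counts as one 96113 unit only if it reaches this many minutes.
    The default of 30 counts completed 30-minute blocks only.
    """
    if total_minutes <= 0:
        return {"96112": 0, "96113": 0}

    # Minutes beyond the first hour are the only ones eligible for the add-on code.
    extra_minutes = max(0, total_minutes - 60)
    full_blocks, remainder = divmod(extra_minutes, 30)

    # Count a partial block only when it meets the assumed threshold.
    partial_unit = 1 if remainder >= min_partial_minutes else 0
    return {"96112": 1, "96113": full_blocks + partial_unit}


if __name__ == "__main__":
    print(developmental_testing_units(60))
    print(developmental_testing_units(90))
    print(developmental_testing_units(80, min_partial_minutes=16))
```

Treat the result as a starting point for documentation, not as a billing determination; confirm the applicable time rule with each payer.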

A few critical billing notes for 2025–2026 practice: these codes require the administering clinician to hold appropriate credentials (master's or doctoral level with standardized assessment training, per Pearson Level B qualification requirements). Documentation must include the assessment date, total administration time, domains tested, scores obtained, and clinical interpretation. Payer policies vary — some commercial insurers require prior authorization for developmental testing, and Medicaid coverage differs by state. Always verify coverage before scheduling a full Bayley-4 evaluation, particularly for children who have already received a recent developmental assessment from another provider.

For early intervention settings operating under Part C of IDEA, the Bayley-4 is commonly used for eligibility determination, and billing may flow through the state's early intervention system rather than standard insurance claims. Familiarize yourself with your state's specific evaluation reimbursement pathway.

Bayley-4 and Telehealth: What You Can and Cannot Do Remotely

The expansion of telehealth in pediatric therapy has created significant confusion around which components of the Bayley-4 can be administered remotely. The answer matters both for clinical accuracy and compliance.

Scales that can be completed remotely: The Social-Emotional Scale and the Adaptive Behavior Scale are both caregiver-report questionnaires. Pearson supports remote administration of these two scales — they can be emailed to caregivers as a secure digital link through Q-global, Pearson's digital delivery platform, or completed via a telehealth portal. This makes hybrid workflows practical: a caregiver completes the questionnaires remotely before or after an in-person visit, streamlining total session time.

Scales that must be completed in person: The Cognitive, Language, and Motor scales require standardized in-person administration. There are no validated remote protocols for these three scales. Administering them over video would violate the standardized conditions that make the normative comparisons valid — meaning scores obtained remotely could not be used for diagnostic or eligibility purposes.

This distinction is increasingly important. A 2025 study published in the International Journal of Telerehabilitation found that over half of pediatric occupational therapists continue using telehealth regularly post-pandemic, with outpatient and early intervention settings showing the highest rates of continued use. Clinicians running hybrid models should document clearly which components were completed in person versus remotely, and ensure caregiver questionnaires are attributed correctly in the record.

Evolution Through the Editions

First Edition (1969)

Nancy Bayley's original scale assessed motor and mental domains in children aged 3 to 28 months, establishing the foundation for standardized infant assessment.

Second Edition (BSID-II, 1993)

The second edition, published shortly before Bayley's death in 1994, added a behavior rating scale and expanded the age range to 1 to 42 months. It reported two main scores: the Mental Development Index (MDI) and the Psychomotor Development Index (PDI).

Third Edition (Bayley-III, 2006)

This significant revision expanded from three to five domains, providing more comprehensive developmental assessment:

  • Cognitive Scale
  • Language Scale (Receptive and Expressive)
  • Motor Scale (Fine and Gross Motor)
  • Social-Emotional Scale
  • Adaptive Behavior Scale

The Bayley-III used dichotomous scoring (1 for success, 0 for failure) and became widely recognized as the most frequently used test in infant developmental assessment.

Fourth Edition (Bayley-4, 2019)

The current edition, published in 2019, offers several improvements:

  • 30% faster administration compared to Bayley-III
  • Polytomous scoring (2 = Mastery, 1 = Emerging, 0 = Not Present) for more nuanced assessment
  • Enhanced clinical sensitivity and accuracy
  • Updated Adaptive Behavior scale derived from the Vineland Adaptive Behavior Scales, Third Edition
  • Remote administration options for Social-Emotional and Adaptive Behavior questionnaires via telehealth
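To make the difference between dichotomous and polytomous scoring concrete, the sketch below totals the same set of item ratings both ways. It is a simplified illustration: the names are hypothetical, the item ratings are made up, and treating only "Mastery" as a Bayley-III success is an assumption for comparison. Actual raw scores depend on the administration rules in the test manual, which this sketch ignores.

```python
from enum import IntEnum


class ItemScore(IntEnum):
    """Polytomous item ratings as listed above (2 = Mastery, 1 = Emerging, 0 = Not Present)."""
    NOT_PRESENT = 0
    EMERGING = 1
    MASTERY = 2


def polytomous_total(scores):
    """Sum the 0/1/2 ratings, so emerging skills earn partial credit."""
    return sum(int(score) for score in scores)


def dichotomous_total(scores):
    """Count only mastered items as 1 (assumed stand-in for 1 = success, 0 = failure)."""
    return sum(1 for score in scores if score == ItemScore.MASTERY)


if __name__ == "__main__":
    # Hypothetical ratings for four items.
    ratings = [ItemScore.MASTERY, ItemScore.EMERGING, ItemScore.NOT_PRESENT, ItemScore.EMERGING]
    print(polytomous_total(ratings), dichotomous_total(ratings))
```

In this example the two emerging skills contribute to the polytomous total but are invisible in the dichotomous one, which is the nuance the newer scoring is meant to capture.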

The Five Developmental Domains

1. Cognitive Scale

Assesses mental abilities including:

  • Visual preference and attention
  • Memory and learning
  • Sensorimotor development
  • Exploration and manipulation
  • Object concept formation
  • Pretend play
  • Problem-solving abilities

2. Language Scale

Evaluates communication skills through two subscales:

Receptive Language:

  • Recognition of objects and people
  • Following directions
  • Understanding of vocabulary
  • Comprehension of sentences

Expressive Language:

  • Naming objects and pictures
  • Vocabulary development
  • Sentence formation
  • Communication attempts

3. Motor Scale

Examines physical development in two areas:

Gross Motor:

  • Head and trunk control
  • Sitting and standing
  • Walking and running
  • Climbing stairs
  • Balance and coordination

Fine Motor:

  • Grasping and manipulation
  • Hand-eye coordination
  • Stacking blocks
  • Drawing and writing precursors
  • Tool use

4. Social-Emotional Scale

Based on caregiver report, this scale assesses:

  • Ease of calming
  • Social responsiveness
  • Emotional regulation
  • Imitation play
  • Social engagement
  • Attention to caregivers

5. Adaptive Behavior Scale

Evaluates daily living skills including:

  • Communication in natural contexts
  • Self-control
  • Following rules
  • Getting along with others
  • Daily life adaptations
  • Self-care abilities

How the Assessment Works

Administration

The Bayley Scales must be administered by qualified professionals with appropriate training, including:

  • Psychologists and neuropsychologists
  • Developmental pediatricians
  • Occupational therapists and physical therapists
  • Speech-language pathologists
  • Pediatric nurse practitioners

The assessment uses a play-based format for the Cognitive, Language, and Motor scales, while the Social-Emotional and Adaptive Behavior scales are completed through caregiver questionnaires.

Scoring and Interpretation

Raw scores from completed tasks are converted to:

  • Scale scores for individual subtests
  • Composite scores for major domains
  • Percentile ranks showing where a child falls compared to peers
  • Confidence intervals indicating the precision of scores
  • Developmental age equivalents

Composite scores are standardized with:

  • Mean: 100
  • Standard Deviation: 15
  • Range: 40-160

Interpretation Guidelines:

  • 100 (50th percentile): Mid-average functioning
  • 85-115: Within normal limits
  • Below 85 (16th percentile): Mild impairment or "at risk" of developmental delay
  • Below 70: Significant delay requiring intervention
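The relationship between a composite score, its distance from the mean, and its percentile rank can be sketched as follows. This assumes composite scores follow a normal distribution with the mean of 100 and standard deviation of 15 given above; published percentile ranks come from the test's norm tables, so the values here are approximations. The function names and the label for scores above 115 are hypothetical.

```python
import math

MEAN = 100.0
SD = 15.0


def z_score(composite: float) -> float:
    """Number of standard deviations a composite score lies from the mean."""
    return (composite - MEAN) / SD


def approximate_percentile(composite: float) -> float:
    """Percentile rank under an assumed normal distribution (not a norm-table lookup)."""
    z = z_score(composite)
    return 50.0 * (1.0 + math.erf(z / math.sqrt(2.0)))


def classify(composite: float) -> str:
    """Apply the interpretation cutoffs listed above."""
    if composite < 70:
        return "significant delay"
    if composite < 85:
        return "at risk"
    if composite <= 115:
        return "within normal limits"
    return "above 115"  # not given a label in the guidelines above


if __name__ == "__main__":
    for score in (100, 85, 70):
        print(score, round(z_score(score), 2), round(approximate_percentile(score), 1), classify(score))
```

A score of 85 sits exactly one standard deviation below the mean, which is why it corresponds to roughly the 16th percentile.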

Clinical Applications

Early Identification

The Bayley Scales excel at detecting developmental delays early when intervention is most effective. Early identification allows for:

  • Timely referral to specialists
  • Implementation of targeted interventions
  • Family education and support
  • Monitoring of progress over time

High-Risk Populations

The assessment is particularly valuable for children at increased risk of developmental delays, including:

  • Premature infants (with age adjustment up to 24-36 months)
  • Children with neonatal complications
  • Those with genetic conditions (e.g., Down syndrome)
  • Infants with prenatal exposure to substances
  • Children with diagnosed conditions affecting development

Intervention Planning

Assessment results guide individualized intervention plans by:

  • Identifying specific areas of strength and weakness
  • Establishing baseline functioning levels
  • Setting appropriate developmental goals
  • Determining service eligibility
  • Documenting progress for accountability

Research Applications

The Bayley Scales serve as a common endpoint measure in:

  • Neonatal trials
  • Developmental research studies
  • Treatment outcome evaluations
  • Population health studies
  • Cross-cultural developmental comparisons

Special Considerations

Age Adjustment for Prematurity

For children born prematurely, age correction is essential for accurate assessment:

  • Cognitive composite: Correction recommended through 24 months
  • Language and Motor composites: Correction through 36 months regardless of degree of prematurity
  • Extreme prematurity: May require correction up to 3 years when scores fall 0.33 to 0.47 SD below baseline
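Age correction itself is simple subtraction: the number of days a child was born before term is removed from the chronological age. The sketch below assumes the common 40-week term convention and works in days; the function name and parameters are hypothetical, and the test manual's own age-calculation procedure takes precedence. Whether to apply the correction for a given composite follows the guidance above and is not encoded here.

```python
from datetime import date

TERM_WEEKS = 40  # assumed full-term gestation used for correction


def corrected_age_days(birth: date, test: date, gestational_weeks: int, gestational_days: int = 0) -> int:
    """Chronological age in days minus the days born before an assumed 40-week term."""
    chronological_days = (test - birth).days
    gestation_days = gestational_weeks * 7 + gestational_days
    # Children born at or after term receive no correction.
    days_premature = max(0, TERM_WEEKS * 7 - gestation_days)
    return chronological_days - days_premature


if __name__ == "__main__":
    # Hypothetical child born at 28 weeks, tested one calendar year later.
    print(corrected_age_days(date(2024, 1, 1), date(2025, 1, 1), gestational_weeks=28))
```

In this hypothetical case the child is 366 days old chronologically but 282 days old after subtracting the 84 days (12 weeks) of prematurity.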

Cultural and Environmental Factors

Development is influenced by multiple factors including:

  • Cultural practices and values
  • Environmental stimulation
  • Socioeconomic circumstances
  • Parental education levels
  • Geographic location

The Bayley-4 normative sample was stratified according to 2017 U.S. census data by age, sex, race/ethnicity, and parent education level to ensure representativeness.

Testing Conditions

Assessment accuracy depends on optimal testing conditions:

  • Comfortable, distraction-free environment
  • Child's optimal state (well-rested, fed, not ill)
  • Rapport between examiner and child
  • Appropriate timing and pacing
  • Examiner expertise and training

Strengths and Limitations

Strengths

  • Gold standard status with rigorous psychometric properties
  • Comprehensive assessment across multiple developmental domains
  • Excellent reliability (coefficients ranging from 0.81 to 0.91)
  • Strong predictive validity for later developmental outcomes
  • Flexible administration accommodating diverse needs
  • Well-standardized normative data
  • Widely researched with extensive evidence base

Limitations

  • Time-intensive, requiring 30-70 minutes of focused administration
  • Requires extensive training for proper administration and interpretation
  • Provides a snapshot in time rather than continuous monitoring
  • Cultural considerations, as norms primarily reflect the U.S. population
  • Expensive, with significant equipment and training costs
  • Floor effects for the youngest or most delayed children
  • May underestimate delays in certain populations compared to earlier editions

The Bayley Screening Test

For settings requiring quicker assessment, the Bayley Screening Test offers a streamlined option that:

  • Takes approximately 15-25 minutes to administer
  • Screens for cognitive, language, and motor delays
  • Identifies children needing comprehensive evaluation
  • Uses a subset of items from the full assessment
  • Provides "competent," "emerging," or "at risk" classifications

While efficient, the screening test is not diagnostic and cannot replace comprehensive assessment when developmental concerns exist.

Practical Tips for Families

If your child is scheduled for a Bayley assessment:

  1. Schedule wisely: Choose a time when your child is typically alert and cooperative
  2. Meet basic needs: Ensure your child is well-rested, fed, and comfortable
  3. Bring comfort items: Familiar toys or blankets can help your child feel secure
  4. Stay calm: Your child may sense your anxiety, so maintain a relaxed demeanor
  5. Ask questions: Don't hesitate to discuss results and recommendations with the examiner
  6. Understand limitations: One assessment provides valuable information but doesn't define your child's potential

Conclusion

The Bayley Scales of Infant and Toddler Development are an essential tool for the early identification of developmental delays and for planning intervention. With comprehensive assessment across five developmental domains, strong psychometric properties, and widespread clinical and research applications, they remain the gold standard for developmental assessment in children from 16 days to 42 months of age.

Whether used for screening high-risk populations, planning individualized interventions, monitoring progress, or advancing research, the Bayley Scales provide critical insights into early childhood development. For healthcare professionals, understanding and properly utilizing this assessment tool is fundamental to supporting children during their most crucial developmental period.

Note: The Bayley Scales of Infant and Toddler Development should only be administered by qualified professionals with appropriate training. This article is for informational purposes and does not constitute medical advice.

Frequently Asked Questions About the Bayley Scales

What score on the Bayley-4 indicates a developmental delay?

Composite scores below 85 (more than one standard deviation below the mean of 100) are considered "at risk" for developmental delay. Scores below 70 indicate significant delay across one or more domains and typically warrant further evaluation and referral for early intervention services.

What is the difference between the Bayley-III and Bayley-4?

The Bayley-4 introduced polytomous scoring (0 = not present, 1 = emerging, 2 = mastery), reduced administration time by approximately 30%, updated the Adaptive Behavior scale using Vineland-3 data, and improved clinical sensitivity for detecting mild delays. The normative sample was also updated. Importantly, Bayley-III and Bayley-4 scores are not directly interchangeable, so switching between editions in longitudinal monitoring requires clinical judgment.

Can a physical therapist administer the Bayley-4?

Yes. Physical therapists with a master's or doctoral degree and appropriate standardized assessment training meet the Pearson Level B qualification requirements. PTs most commonly administer the Motor scale, though they can administer the full battery with proper training.

How often can the Bayley-4 be re-administered?

CMS and Pearson both recommend a minimum re-administration interval to avoid practice effects. Generally, a 3-month minimum between administrations is recommended for monitoring purposes, though many neonatal follow-up clinics schedule reassessments at fixed corrected ages (12, 24, and 36 months corrected) regardless of interval.

Does the Bayley-4 work for children with physical disabilities?

The Bayley-4 includes accommodations for children with motor or sensory impairments: examiners can modify positioning, allow caregiver assistance on specific items, and note accommodations in the record. However, the tool has documented ceiling and floor effects for children with significant motor impairment, and results should always be interpreted alongside clinical observation rather than in isolation.
