Drilled Hole Roughness (Ra) as a Metric for PWB Reliability

Ed Hare, PhD – SEM Lab, Inc.

Drilled hole quality is often overlooked as a contributing factor to long-term reliability in printed wiring boards (PWBs). In particular, surface roughness of the drilled hole wall, commonly reported as Ra (arithmetic average roughness) or Rq (root-mean-square roughness), can have a significant impact on performance and failure risk.

Why Rough Drilled Holes Matter

Fabrication Indicator
Poorly drilled holes are usually a fabrication quality issue, commonly related to drill bit wear, improper feed/speed, or inadequate desmear. Excessively rough hole walls indicate poor process control during board fabrication.

CAF Failures
Rough hole walls can create micro-crevices that promote conductive anodic filament (CAF) formation under applied bias and elevated humidity. These sites serve as initiation points for copper migration, especially in high-density multilayer designs.

Electrical Noise
Non-uniform copper topography inside the via barrel may also contribute to electrical performance variation in high-speed or RF applications. These effects are often masked during initial testing but become problematic under long-term use.

Stress Risers
Jagged or mechanically damaged areas along the drilled wall act as stress risers, which can limit fatigue life under thermal or mechanical cycling. This often results in copper barrel cracking or interconnect failure over time.

Measurement and Analysis

At SEM Lab, we use high-resolution SEM imaging and software-based analysis to calculate Ra and Rq from drilled hole interiors. Measurements are obtained from multiple regions of interest within the via structure, and plotted against PWB lot, drill column, or via position.
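For readers unfamiliar with the two metrics, the arithmetic behind them can be sketched in a few lines. This is a minimal illustration, not SEM Lab's analysis software: it assumes the hole-wall profile has already been traced from a cross-section image and reduced to a list of height samples in µm. The function name `roughness` and the sample values are hypothetical.

```python
import math

def roughness(profile_um):
    """Return (Ra, Rq) in µm for a list of wall-profile heights in µm.

    Ra is the mean absolute deviation from the mean line;
    Rq is the root-mean-square deviation from the mean line.
    """
    n = len(profile_um)
    if n == 0:
        raise ValueError("profile must contain at least one height sample")
    mean_line = sum(profile_um) / n
    deviations = [z - mean_line for z in profile_um]
    ra = sum(abs(d) for d in deviations) / n
    rq = math.sqrt(sum(d * d for d in deviations) / n)
    return ra, rq

# Hypothetical profile: a mostly flat wall with one 4 µm gouge.
profile = [0.0, 0.0, 0.0, 4.0]
ra, rq = roughness(profile)
print(f"Ra = {ra:.2f} µm, Rq = {rq:.2f} µm")
```

Because Rq squares each deviation before averaging, it is never smaller than Ra and reacts more strongly to isolated gouges, which is one reason to report both.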

This approach allows for identification of outliers caused by tool wear or board stack variation. Regions with elevated Ra values may correlate with plated copper pull-away or localized delamination when subjected to thermal cycling.
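One way such outliers might be flagged is with a robust statistic that a single bad hole cannot distort. The sketch below uses the modified z-score (median and median absolute deviation) with the conventional 3.5 cutoff. This is an assumed technique chosen for illustration, not necessarily the method used at SEM Lab; the position labels and Ra values are hypothetical.

```python
from statistics import median

def flag_outliers(ra_by_position, threshold=3.5):
    """Return the positions whose Ra is an outlier by modified z-score.

    ra_by_position maps a label (drill column, via position, lot) to Ra in µm.
    """
    values = list(ra_by_position.values())
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # All values (or most of them) are identical; nothing to scale against.
        return []
    return [
        position
        for position, value in ra_by_position.items()
        if 0.6745 * abs(value - med) / mad > threshold
    ]

# Hypothetical Ra values (µm) by drill column.
ra_by_column = {"col-1": 1.0, "col-2": 1.1, "col-3": 0.9, "col-4": 1.0, "col-5": 3.0}
print(flag_outliers(ra_by_column))
```

The median-based score is used here instead of mean and standard deviation because a worn drill can produce a value extreme enough to inflate the standard deviation and hide itself.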

🔧 Recommendations
• Evaluate Ra and Rq from representative drilled holes on each new PWB lot.
• Use SEM or profilometry tools—optical inspection is insufficient for identifying roughness-based failure risk.
• Set internal benchmarks (e.g., Ra < 2.0 µm) for critical layers and high-reliability applications.
• Include drilled hole roughness as a QA parameter when auditing new board suppliers.
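
A benchmark of this kind is straightforward to apply per lot once Ra values are tabulated. The following is a minimal sketch, assuming measurements arrive as (lot, Ra) pairs and using the Ra < 2.0 µm example figure from the list above as the limit. The lot names, values, and the `screen_lots` helper are hypothetical.

```python
from collections import defaultdict
from statistics import mean

RA_LIMIT_UM = 2.0  # example internal benchmark; set per application

def screen_lots(measurements, limit_um=RA_LIMIT_UM):
    """Group (lot, ra_um) pairs by lot.

    Returns {lot: (mean_ra, worst_ra, passed)}, where passed means
    every measured hole in the lot is below the limit.
    """
    by_lot = defaultdict(list)
    for lot, ra_um in measurements:
        by_lot[lot].append(ra_um)
    report = {}
    for lot, values in by_lot.items():
        worst = max(values)
        report[lot] = (mean(values), worst, worst < limit_um)
    return report

# Hypothetical measurements (µm).
measurements = [("LOT-A", 1.0), ("LOT-A", 1.5), ("LOT-B", 1.0), ("LOT-B", 3.0)]
report = screen_lots(measurements)
for lot, (avg, worst, passed) in sorted(report.items()):
    status = "PASS" if passed else "REVIEW"
    print(f"{lot}: mean Ra {avg:.2f} µm, worst {worst:.2f} µm, {status}")
```

Judging the lot on its worst hole, not its mean, is a deliberate choice in this sketch: a single rough barrel is enough to start a failure, and averaging would hide it.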

Support Available

SEM Lab Inc. now offers remote consulting services to support engineers involved in resolving product quality issues. If you’re encountering via-related failures or want to strengthen your board qualification process, I can assist by reviewing failure data and providing recommendations based on decades of laboratory experience.

📧 ehare@semlab.com
🌐 www.semlab.com

PWB Internal Short – Example C

This slide presents a failure analysis case study of a printed wiring board (PWB) that developed an internal short during burn-in testing. The cross-sectional SEM images highlight localized laminate damage, which was likely caused when a warped board was forcibly mounted flat, inducing internal mechanical stress. Comparison with an undamaged region confirms the localized nature of the failure. Additionally, the use of an unusual blind via construction appears to have exacerbated the issue, contributing to the internal short.

 

 

Investigating Internal Shorts in PWBs: Example B

In multilayer printed wiring boards (PWBs), internal shorts can arise from subtle but critical manufacturing defects. This case study highlights a failure mode linked to drill or inner layer misregistration, which compromised the electrical clearance between a plated through-hole (PTH) via and a 24V internal plane. The resulting drill-induced laminate damage, combined with radial cracking, facilitated an unintended electrical bridge. Through microsectional and optical analysis, we demonstrate how mechanical misalignments propagate into latent electrical reliability risks.

 


 

 

Power and ground plane shorts in multilayer PCBs can be difficult to trace, especially when the root cause lies deep within the laminate structure. This example highlights an internal short caused by resin fracture and delamination, which enabled copper electromigration across a thin dielectric layer. Understanding failure mechanisms like this is essential for improving PCB reliability in demanding applications.

MLCC Flux Entrapment

MLCC short circuit failures on PCBAs are often caused by bending fractures, where internal electromigration shorts develop between opposing electrodes along the fracture. Short circuits are also sometimes caused by capacitor manufacturing defects such as knit line failure, firing cracks, or dielectric porosity. In recent years, however, we have more frequently seen cases where the short is external to the capacitor, growing from terminal to terminal by electrochemical migration through flux entrapped under the MLCC.

Fortunately, our primary method of examination of MLCCs is a microsection of the capacitor as mounted on the PCBA (Fig. A).  This orientation is ideal for evaluation of bending fractures.  It also affords us a look under the capacitor.

 

Fig. A – MLCC microsection as mounted on the PCBA.

An EDS map for Sn (Fig. B) shows a large concentration of tin below the capacitor and above the solder mask.  This indicates that an electrochemical cell existed under the capacitor allowing tin to migrate from the anode to the cathode.  The PCBA was operating in a moist environment, which is a contributing factor.

 

Fig. B – EDS map for tin corresponding with Fig. A dashed box.

A second way we can look at the problem is to mechanically excavate the solder joints and remove the capacitor from the PCBA so that residue and corrosion damage can be observed on the board surface under the capacitor (Fig. C) and on the bottom surface of the capacitor (Fig. D).

 

Fig. C – Corrosion damage on the board surface under the capacitor.

Fig. D – Corrosion damage on the bottom surface of the capacitor.

Electrochemical corrosion and electromigration are not limited to tin. We also see corrosion of silver, copper, and nickel, which originate from the plating on metal end caps and PWB mounting pads or from SAC305 solder.

Contamination by chlorine and bromine flux activators is common and accelerates the rate of corrosion.

This is how it works …

  1. Residual solder flux absorbs moisture from the environment, creating an ionic solution under the MLCC.
  2. Under applied voltage, metals such as silver (Ag), tin (Sn), nickel (Ni) or copper (Cu) from electrodes or solder pads oxidize at the anode and dissolve as ions.
  3. Metal cations migrate toward the cathode.
  4. At the cathode, metal ions are reduced and deposited as metallic dendrites.
  5. Dendrites grow until they bridge the gap between electrodes, creating a short circuit.

Even if the corrosion process doesn’t create a hard metallic short, the generation of ionic species under the capacitor can cause large enough leakage currents to generate circuit faults.

Time to failure is a function of many parameters, including the quantity and type of solder flux residues, operating temperature, moisture level, and electric field magnitude. The electric field becomes more significant as MLCC size decreases: the gap between electrodes (the spacing under the component) shrinks, so the field is higher for the same applied voltage.
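
The size effect can be made concrete with a rough calculation. The sketch below treats the field under the component as uniform, E = V / d, which is a simplification that ignores pad geometry and residue distribution. The bias voltage and the two gap values are hypothetical and chosen only to show the scaling; they are not measurements from the cases described here.

```python
def field_v_per_mm(bias_v, gap_mm):
    """Approximate average electric field (V/mm) across a terminal-to-terminal gap."""
    if gap_mm <= 0:
        raise ValueError("gap must be positive")
    return bias_v / gap_mm

# Hypothetical values: same bias across a larger and a smaller chip capacitor.
bias_v = 12.0
gaps_mm = {"larger chip": 2.0, "smaller chip": 0.5}

fields = {name: field_v_per_mm(bias_v, gap) for name, gap in gaps_mm.items()}
for name, field in fields.items():
    print(f"{name}: {field:.1f} V/mm across a {gaps_mm[name]} mm gap")
```

With these assumed numbers, shrinking the gap by a factor of four raises the average field by the same factor at unchanged voltage, while also shortening the distance a dendrite must grow to bridge the terminals.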

So my question to circuit designers is …

How do you incorporate these considerations into your design rules?

Solder Joint Fractures

 

These images show the main types of fracture that affect solder joints in electronic assemblies: mechanical overload failures, thermal fatigue due to coefficient of thermal expansion (CTE) mismatch, gold embrittlement, creep rupture failures, and vibration fatigue fractures. Understanding these failure modes is crucial for improving the reliability and performance of electronic assemblies.

 

SJF_1 – Alloy 42 lead, single sided through-hole solder joint, mechanical overload failure.

SJF_2 – 1W SMT resistor, thermal fatigue failure due to CTE mismatch strain.

SJF_3 – J-lead SMD, alumina substrate, CTE mismatch, gold embrittlement, thermal fatigue.

SJF_4 – cyclic thermal stress, CTE mismatch stress, thermal fatigue.

 

SJF_5 – SMT connector, creep rupture failure.

SJF_6 – QFN device, creep rupture failure, thermal warpage.

SJF_7 – SMT connector, creep rupture failure.

SJF_8 – vibration fatigue fracture at the lead/solder interface, QFP device.

 

BGA solder joint failed due to high-strain-rate bending of the PCBA

The relationship between PCBA bending and failures is critical, as bending stresses introduced during manufacturing, assembly, or operation can lead to a wide range of failures, including cracks, fractures, and deformations. These failures often occur in solder joints, component leads, and critical connection points, compromising the reliability and durability of the assemblies. Key mitigation strategies include robust design practices such as incorporating stiffeners, careful material selection to withstand mechanical stresses, controlled manufacturing processes to minimize stress introduction, and careful handling procedures. Understanding and addressing these factors through targeted improvements can significantly reduce bending-related failures, ensuring more reliable electronic assemblies.

see …

MLCC Bending Fracture

Mechanical Overstress of Resistor Solder Joints

Damage in BGA Device

Why is failure analysis not testing?

“Failure analysis” and “testing” are related but distinct processes in the context of engineering, quality assurance, and various scientific fields. Here’s a breakdown of how they differ:

  1. Purpose:
    • Testing: This is generally conducted to ensure that a product, component, or system meets specified requirements before it is put into service. Tests are performed under controlled conditions and aim to evaluate performance, durability, and safety.
    • Failure Analysis: This comes into play after a product or system has already failed. The primary goal is to determine why the failure occurred, which can involve identifying the root causes of failure and the conditions that led to it.
  2. Process:
    • Testing: Involves applying pre-determined stresses or operational loads to a product to simulate actual or accelerated operating conditions. This could include stress tests, performance tests, and endurance tests.
    • Failure Analysis: Involves a detailed examination and analysis of the failed component. This might include visual inspections, microscopic examination, chemical analysis, and mechanical testing to understand the failure mechanism.
  3. Outcome:
    • Testing: Aims to confirm that the design and manufacturing meet the expected standards. It’s predictive and preventive, helping to catch potential failures before products reach the market or are used in critical applications.
    • Failure Analysis: Aims to learn from a failure after it has occurred. The insights gained can lead to design improvements, changes in manufacturing processes, or updates in maintenance procedures to prevent future failures.

Thus, while testing is about prevention and verification before failure, failure analysis is about diagnosis and correction after a failure has occurred. Both are crucial but serve different phases of a product’s life cycle and quality management.