What are the types of failure?

Failures can happen in many different forms and for a variety of reasons. Understanding the different types of failures and their causes is important for identifying issues and preventing future failures. This article will examine some of the main categories and examples of failures.

Design Failures

Design failures occur when there is a flaw in the design of a product, structure, or process itself. They are hard to predict or avoid unless the design is properly tested and analyzed during the design phase. Some examples of design failures include:

Structural failures – Buildings or bridges collapsing due to design flaws that impact stability and strength.
System failures – Computer systems or networks that crash due to bugs in the initial software or architecture.
Process failures – Manufacturing or business processes that are inefficient or prone to errors due to poor design.

Design failures can be minimized by extensive prototype testing, engineering analysis, and reviews of requirements before finalizing designs. However, sometimes flaws are not detected until after implementation.

Material and Manufacturing Failures

Failures related to materials and manufacturing occur when there are issues with the quality, properties, or dimensions of the materials being used, or problems with the actual manufacturing process. Examples include:

Material defects – Cracks, impurities, or variation in material properties leading to premature failure.
Dimensional errors – Incorrect machining, cutting, or shaping causing parts to be out of specification.
Assembly errors – Mistakes in assembling or connecting components during manufacturing.
Overloading – Exceeding the load capacity or stress limits of a material.

Thorough quality control and process monitoring during sourcing, production, and assembly are key to reducing material and manufacturing-related failures. However, some defects may still escape detection until after products are finished.
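As an illustration of process monitoring for dimensional errors, the sketch below derives simple three-sigma control limits from a baseline of in-spec measurements and flags new parts that fall outside them. The function names and the sample values are hypothetical, and the sigma estimate is a simplification: production control charts often estimate variation from subgroup or moving ranges instead.

```python
import statistics


def control_limits(baseline):
    """Return (lower, upper) limits at mean +/- 3 sample standard deviations."""
    mean = statistics.mean(baseline)
    sigma = statistics.stdev(baseline)  # sample standard deviation
    return mean - 3 * sigma, mean + 3 * sigma


def out_of_control(baseline, new_measurements):
    """Return the new measurements that fall outside the baseline limits."""
    lower, upper = control_limits(baseline)
    return [x for x in new_measurements if not lower <= x <= upper]


# Hypothetical shaft diameters in millimetres.
baseline_mm = [10.0, 10.1, 9.9, 10.0, 10.1, 9.9]
flagged = out_of_control(baseline_mm, [10.05, 10.5, 9.6])
print("flagged parts:", flagged)
```

Parts that are flagged this way are candidates for inspection, not proof of a defect; the limits only describe how the process behaved while the baseline was collected.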

Operational Failures

Operational failures occur during the normal use or operation of a product, system, or process. They are typically caused by improper maintenance, unexpected operating conditions, misuse, or accidents. Examples include:

Overheating – Insufficient cooling leads to overheating and component damage in machinery.
Overloading – Exceeding rated load capacities during operations.
Misalignment – Shaft or component misalignment leading to excessive wear and breakdown.
Fatigue – Accumulated damage over time from vibrations, cyclic loading, etc.
Corrosion – Gradual environmental degradation during operations.
Abuse – Deliberate misuse or mistreatment by end users.
Software failures – Crashes, bugs, or errors that occur while computer software is running.

Adhering to operating limits, preventive maintenance, operator training, and safe usage guidelines can help avoid operational failures.
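The idea of fatigue as accumulated damage can be made concrete with the Palmgren-Miner linear damage rule, which is not named in this article but is a common first approximation. It sums the fraction of life consumed at each stress level, and failure is predicted when the sum reaches about 1. The function name and the cycle counts below are hypothetical.

```python
def miner_damage(cycles_applied, cycles_to_failure):
    """Palmgren-Miner damage sum D = sum(n_i / N_i).

    cycles_applied[i]    -- cycles actually experienced at stress level i
    cycles_to_failure[i] -- cycles the part could survive at that level alone
    """
    if len(cycles_applied) != len(cycles_to_failure):
        raise ValueError("both sequences must have the same length")
    return sum(n / N for n, N in zip(cycles_applied, cycles_to_failure))


# Hypothetical duty cycle: 1,000 cycles at a level rated for 10,000,
# plus 500 cycles at a harsher level rated for only 1,000.
damage = miner_damage([1_000, 500], [10_000, 1_000])
print(f"fraction of fatigue life consumed: {damage:.2f}")
```

The rule ignores the order in which loads are applied, so it is best treated as a rough screening estimate.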

External Factor Failures

Sometimes failures occur due to unforeseen external factors outside of the design, materials, manufacturing, or operation of a product or system. These include:

Environmental exposure – Damage over time from weather, chemicals, radiation, etc.
Interactions – Failures arising from unanticipated interactions with other systems.
Random events – Freak accidents, natural disasters, human errors.

While it is difficult to predict failures from external factors, steps can be taken to make systems more robust and fault tolerant. Also, conducting thorough hazard analysis and failure mode analysis helps identify potential external risks.

Complex System Failures

For complex systems with many interconnected components, failures can arise in complex ways as small initial faults cascade through the system. Examples include:

Cascade failures – An initial failure shifts loads to other components causing overloading and sequential failures.
Feedback loop failures – Disruptions to one part reverberate back and amplify.
Software/sensor failures – Faulty data or signals lead to improper automated control actions.
Human operator error – Incorrect responses by operators monitoring complex systems.

Redundancy, fail-safes, simulation modeling, and operator training on complex systems are important to manage these potential failure modes.
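A cascade failure of the load-shifting kind described above can be explored with a toy simulation. The sketch below assumes a deliberately simple model that is not taken from any particular system: when a component fails, its load is split equally among the survivors, and any survivor pushed above its capacity fails in the next round. All names and numbers are hypothetical.

```python
def simulate_cascade(loads, capacities, first_failure):
    """Return component indices in the order they fail.

    Simplified model: the load of each failed component is shared
    equally among the components that are still working.
    """
    loads = list(loads)  # copy so the caller's data is not modified
    alive = set(range(len(loads)))
    failed_order = []
    to_fail = [first_failure]

    while to_fail:
        shed = 0.0
        for i in to_fail:
            alive.discard(i)
            failed_order.append(i)
            shed += loads[i]
            loads[i] = 0.0
        if not alive:
            break  # nothing left to carry the load
        share = shed / len(alive)
        for i in alive:
            loads[i] += share
        # Survivors now above capacity fail in the next round.
        to_fail = sorted(i for i in alive if loads[i] > capacities[i])

    return failed_order


# Four components, each carrying 80 units; component 0 fails first.
print(simulate_cascade([80, 80, 80, 80], [100, 100, 100, 100], 0))
print(simulate_cascade([80, 80, 80, 80], [120, 120, 120, 120], 0))
```

Comparing the two calls shows how extra capacity margin can stop a single fault from spreading, which is the motivation for the redundancy and fail-safes mentioned above.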

Premature Failures

Premature failures occur when a product or component fails earlier than expected, before reaching its design life. Common causes include:

Defective raw materials
Flaws introduced during manufacturing
Incorrect installation or handling
Overstressing components
Inadequate maintenance
Exposure to extreme conditions

Robust quality control, careful handling, controlled operating conditions, and preventive maintenance help avoid premature failures.

Wear-Out Failures

Wear-out failures occur when a product or system fails after extensive time in use. The causes include:

Component fatigue from cyclic stresses over time.
Erosion and corrosion wearing down materials.
Accumulation of small amounts of damage.
Depletion of lubricants or fluids.
Wear from friction.
Deterioration of seals, gaskets, insulation.

As components age, the likelihood of wear-out failures increases. Regular inspections, replacements, and overhaul refurbishments are needed to avoid wear-out failures in older systems.
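One common way to model how failure likelihood changes with age is the Weibull hazard function. It is introduced here as an assumption, since the article does not name a model. Its shape parameter distinguishes a falling failure rate (shape below 1, typical of premature failures), a constant rate (shape equal to 1), and a rising rate (shape above 1, typical of wear-out). The parameter values below are hypothetical.

```python
def weibull_hazard(t, shape, scale):
    """Weibull hazard h(t) = (shape / scale) * (t / scale) ** (shape - 1)."""
    if t <= 0 or shape <= 0 or scale <= 0:
        raise ValueError("t, shape and scale must all be positive")
    return (shape / scale) * (t / scale) ** (shape - 1.0)


# Compare the hazard early and late in life for three shape values.
for shape in (0.5, 1.0, 3.0):
    early = weibull_hazard(100.0, shape, scale=1000.0)
    late = weibull_hazard(2000.0, shape, scale=1000.0)
    trend = "rising" if late > early else "falling" if late < early else "constant"
    print(f"shape={shape}: hazard is {trend} with age")
```

Fitting the shape parameter to field data is one way to check whether a population of parts has entered its wear-out phase and needs the inspections and replacements described above.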

Creep Failures

Creep failures refer to slow, progressive deformation that occurs over time under constant stress levels. Susceptible materials like plastics and metals at high temperatures gradually deform until they crack or rupture. Maintaining stresses well below limits, using creep-resistant materials, and inspections for deformation are means of preventing creep failures.

Brittle Fracture Failures

Brittle fracture is sudden, catastrophic cracking that occurs with little or no prior deformation. Normally ductile metals can become brittle when subjected to very cold temperatures or corrosion damage, while ceramics and glasses are inherently brittle. Preventive measures include avoiding the high stresses, corrosive environments, and cold temperatures that make materials susceptible.

Corrosion Failures

Corrosion, such as rust or other material degradation, can lead to both sudden and gradual failures as component strength is compromised. Using corrosion-resistant alloys, coatings, lubricants, and sealants helps control corrosion. Regular inspection and cleaning are also essential to identify corrosion and stop it before failures occur.

Common Causes of Failure

While the type and specifics of different failures vary, most failures originate from a few common root causes:

Defective raw materials
Mistakes in manufacturing and assembly
Wear from usage over time
Improper maintenance
Unforeseen operating conditions
Poor initial design
Random chance and bad luck

Understanding these common causal factors helps focus prevention efforts on quality control, maintenance, operating procedures, and conservative designs using safety factors and redundancy.
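A safety factor is usually expressed as the ratio of what a component can withstand to what it is expected to see. The minimal sketch below shows that arithmetic; the function names, the stress values, and the required factor of 2.0 are illustrative assumptions, not recommendations for any particular design code.

```python
def safety_factor(strength, applied_stress):
    """Ratio of material strength to expected applied stress (same units)."""
    if applied_stress <= 0:
        raise ValueError("applied_stress must be positive")
    return strength / applied_stress


def meets_margin(strength, applied_stress, required_factor=2.0):
    """True if the design has at least the required safety factor."""
    return safety_factor(strength, applied_stress) >= required_factor


# Hypothetical part: 250 MPa yield strength under 100 MPa working stress.
print(f"safety factor: {safety_factor(250.0, 100.0):.2f}")
print("meets margin:", meets_margin(250.0, 100.0))
```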

Failure Analysis

When failures do occur, performing detailed failure analysis is key to identifying the root causes and preventing future occurrences. Some common failure analysis techniques include:

Visual inspection – Looks for obvious damage and clues
Non-destructive testing – X-ray, ultrasound, dye penetrant tests
Metallurgical analysis – Examines material microstructures
Chemical analysis – Identifies corrosion products or contaminants
Fractography – Studies fracture surface patterns
Stress/strain analysis – Determines stresses and loads
Dimensioning – Checks for incorrect tolerances
Disassembly – Inspects individual components

The evidence collected guides conclusions on the failure mechanism and root cause. This information then drives corrective and preventive actions.

Failure Rate Data

Analyzing failure rate data provides useful information for reliability engineering and prevention. Some key failure rate metrics include:

Failure intensity – Failures per unit of time
Failure density – Failures per unit of distance or area
Failure probability – Likelihood of failure in a certain time
Mean Time Between Failures (MTBF) – Average time between failures
Bathtub curve – Failure rate vs time relationship

Failure rate data allows quantification of reliability and guides maintenance scheduling, warranty terms, inventory stocking, and assessments of preventive measures.
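Two of the metrics above, failure intensity and MTBF, follow directly from a count of failures and the total operating time. The sketch below shows the arithmetic, assuming a repairable system observed over a known number of operating hours; the function names and the fleet figures are hypothetical.

```python
def failure_intensity(num_failures, operating_hours):
    """Failures per unit of operating time (here, per hour)."""
    if operating_hours <= 0:
        raise ValueError("operating_hours must be positive")
    return num_failures / operating_hours


def mtbf(operating_hours, num_failures):
    """Mean Time Between Failures: operating time divided by failure count."""
    if num_failures <= 0:
        raise ValueError("MTBF is undefined without at least one failure")
    return operating_hours / num_failures


# Hypothetical record: 5 failures observed over 10,000 operating hours.
hours, failures = 10_000.0, 5
print(f"failure intensity: {failure_intensity(failures, hours):.4f} per hour")
print(f"MTBF: {mtbf(hours, failures):.0f} hours")
```

Under this simple definition the two quantities are reciprocals of each other, so either one can feed maintenance scheduling or spare-parts estimates.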

Reliability Metrics

Reliability metrics provide insights into the probability of failure-free performance. Common reliability metrics include:

Reliability function – Probability of surviving beyond a time t
Failure function – Probability of failing before time t
Instantaneous failure rate – Failures per unit time at a given age
Availability – Proportion of time a system is operational
Mission success probability – Chance of completing a mission or task

These metrics help set reliability goals and benchmarks for products and systems.
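As a worked illustration, the sketch below evaluates the reliability function, the failure function, and availability under two assumptions that the article does not state: a constant failure rate (the exponential model) and a known mean time to repair (MTTR). The function names and numeric values are hypothetical.

```python
import math


def reliability(t, failure_rate):
    """R(t) = exp(-failure_rate * t): probability of surviving beyond time t."""
    return math.exp(-failure_rate * t)


def failure_probability(t, failure_rate):
    """F(t) = 1 - R(t): probability of failing before time t."""
    return 1.0 - reliability(t, failure_rate)


def availability(mtbf_hours, mttr_hours):
    """Steady-state availability: uptime / (uptime + repair time)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical unit: 0.001 failures per hour, 10 hours to repair.
rate = 0.001
print(f"R(500 h) = {reliability(500.0, rate):.3f}")
print(f"F(500 h) = {failure_probability(500.0, rate):.3f}")
print(f"availability = {availability(1.0 / rate, 10.0):.3f}")
```

With a constant failure rate the instantaneous failure rate equals that rate at every age, so this model does not capture premature or wear-out behavior.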

Preventing Failures

Many techniques are available for predicting, detecting, and preventing failures:

Design analysis – FEA, fatigue analysis, fluid dynamics, thermal analysis
Prototype testing – Strength, life cycle, environmental testing
Modeling – Physics-based, empirical models to simulate performance
Inspections – Visual, radiographic, pressure tests, leak checks
Condition monitoring – Vibration, temperature, lubricant analysis
Redundancy – Backup components for critical functions
Process controls – Statistical process control, Six Sigma
Maintenance – Preventive, predictive, scheduled overhauls

A combination of design margins, testing, monitoring, redundancy, and maintenance is optimal for preventing failures. Training personnel to operate and maintain systems properly also helps avoid failures.
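The benefit of redundancy can be estimated with a short calculation. The sketch below assumes identical backup units that fail independently of one another, with the function working as long as at least one unit survives. Real systems often have common-cause failures that make the true figure lower. The function name and the values are hypothetical.

```python
def parallel_reliability(unit_reliability, n_units):
    """Reliability of n independent, identical units in parallel.

    The function succeeds unless every unit fails:
    R_system = 1 - (1 - R_unit) ** n
    """
    if not 0.0 <= unit_reliability <= 1.0:
        raise ValueError("unit_reliability must be between 0 and 1")
    if n_units < 1:
        raise ValueError("n_units must be at least 1")
    return 1.0 - (1.0 - unit_reliability) ** n_units


# Effect of adding backups to a unit that is 90% reliable on its own.
for n in (1, 2, 3):
    print(f"{n} unit(s): {parallel_reliability(0.9, n):.4f}")
```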

Remediating Failures

Once failures occur, prompt action is required to minimize impacts. Typical remedial measures include:

Isolating failures – Disconnecting or deactivating failed components
Redundancy – Switching to backup components
Workarounds – Interim procedures to restore function
Repairs – Fixing or replacing components
Redesigns – Long-term solutions to correct root causes
Process changes – Altering operating parameters to avoid failures

Having contingency plans for likely failure scenarios speeds response and recovery. The goal is to restore functionality quickly while also implementing lasting solutions.
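For software systems, the first two measures above (isolating the failed component and switching to a backup) are often combined in a failover routine. The following is a minimal sketch of that pattern, not a production design; the function name and the stand-in handlers are hypothetical.

```python
def run_with_failover(components, request):
    """Try each (name, handler) pair in order until one succeeds.

    Returns (result, isolated), where isolated lists the names of the
    components that failed and were skipped.
    """
    isolated = []
    for name, handler in components:
        try:
            return handler(request), isolated
        except Exception:
            # Isolate the failed component and fall through to the next backup.
            isolated.append(name)
    raise RuntimeError(f"all components failed: {isolated}")


def primary(request):
    """Stand-in for a component that is currently failing."""
    raise ConnectionError("primary unavailable")


def backup(request):
    """Stand-in for a healthy backup component."""
    return f"handled {request}"


result, isolated = run_with_failover([("primary", primary), ("backup", backup)], "job-1")
print(result, "| isolated:", isolated)
```

The list of isolated components gives the repair and root-cause work a starting point once service has been restored.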

Learning from Failures

While failures are undesirable, they provide valuable opportunities for improvement. Some key lessons learned from failures include:

Flaws in existing designs, processes, and procedures
Gaps in training for operators and maintainers
Insufficient quality control and preventive measures
Areas where models and metrics need enhancement
Real-world validation of analysis and testing methods
Weaknesses in maintenance practices and intervals
Unanticipated failure modes and mechanisms

Analyzing failures leads to identification of corrective actions to implement. Shared lessons learned also prevent recurrence of similar failures on other systems.

Conclusion

Failures result from diverse mechanisms under various conditions. Understanding the sources and modes of failure is crucial to preventing and responding to failures. With diligent design, testing, monitoring, maintenance, and continuous improvement, the risk of failures can be minimized, though rarely eliminated entirely. The most effective strategy for enhancing reliability is a proactive approach to failure prevention, balanced with preparation for responding quickly to the failures that do happen.