Many companies associate Reliability, Availability, Maintainability and Safety (RAMS) activities only with a certification milestone for manufacturers of aircraft and other safety-critical technology. Nonetheless, RAMS engineering is an inseparable companion to design, validation, operations and maintenance activities for many actors in the supply chain, from Tier 1 OEMs to integrators, operators and maintainers. As the V-diagram for the development of a technological product shows, RAMS assessments are carried out throughout the whole lifecycle of a technical product.

RAMS engineering in the Design phase

Let’s picture an aircraft integrator that intends to develop a new aircraft type and bring it to market. At the beginning of the design phase, when the product requirements are specified, the functions the product is expected to perform must be defined at both integration and system level. A function could be, for instance, “To provide a correct fuel to air ratio”, which would be allocated to the propulsion system.

A Functional Hazard Assessment (FHA) is the next step after the system functions have been defined. In the FHA, all possible scenarios in which a hazard might affect the safety of the aircraft are defined. These scenarios are known as Failure Conditions (FCs), and one of the jobs of the RAMS engineer is to identify them and classify them according to their severity. Each severity level implies a specific safety target for the failure condition.
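
As a rough illustration of how a severity class translates into a safety target, the sketch below assumes the quantitative objectives commonly applied to large transport aeroplanes, expressed as an average probability per flight hour. Targets for other aircraft categories differ, and the two failure conditions and their classifications are hypothetical examples.

```python
# Illustrative severity-to-target table, ASSUMING the objectives commonly
# applied to large transport aeroplanes (average probability per flight hour).
SAFETY_TARGETS = {
    "Catastrophic": 1e-9,
    "Hazardous": 1e-7,
    "Major": 1e-5,
    "Minor": 1e-3,
    "No Safety Effect": None,  # no quantitative objective
}


def safety_target(severity):
    """Return the maximum allowed probability per flight hour, or None."""
    return SAFETY_TARGETS[severity]


# Hypothetical failure conditions for the fuel-to-air ratio function.
fha = [
    ("Total loss of fuel-to-air ratio control on all engines", "Catastrophic"),
    ("Degraded fuel-to-air ratio control on one engine", "Major"),
]

for description, severity in fha:
    print(f"{description}: {severity}, target {safety_target(severity)}")
```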

While the preliminary architecture of our aircraft keeps developing and, possibly, a prototype is already in the making, several RAMS analyses are performed to evaluate whether our design complies with the safety targets derived from the FHA.

From the Maintainability point of view, once the preliminary architecture of the aircraft has been defined, the parts that the system engineer forecasts will need periodic maintenance are included in the Aircraft Maintenance Manual. From there, the RAMS engineer launches a Reliability Centered Maintenance analysis such as MSG-3, in which maintenance tasks and intervals are defined for each item. This is only a preliminary list of maintenance actions; it is updated once the safety assessment is finished and throughout the life of the aircraft, as new maintenance procedures (e.g., procedures that increase availability) may arise.

From the Reliability side, one of the first analyses to be carried out is the Reliability Prediction Analysis (RPA), in which the expected reliability of each component, assembly and system is calculated with the aid of reliability prediction standards (e.g. FIDES 2009, MIL-HDBK-217 or NSWC-11), historical reliability databases (e.g. EPRD, NPRD) or reliability data extracted directly from the field. The latter, although the most desirable option, is usually unfeasible, either because OEMs do not give access to their data or simply because the items are being used for the first time (imagine a new battery for an eVTOL). By conducting an RPA, RAMS engineers can estimate the overall reliability of the system, make a first estimate of its compliance with the safety targets, and see which items should be improved or, if necessary, replaced by more reliable components. The assumptions, calculations, results and conclusions of the Reliability Prediction Analysis are documented in the Reliability Prediction Report (RPR).
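
To make the roll-up from component to system level concrete, here is a minimal sketch. It assumes constant failure rates and a series system (any component failure fails the system), which is a simplification of what a prediction standard would give. The component names and failure rates are hypothetical.

```python
import math

# Hypothetical component failure rates in failures per flight hour.
failure_rates = {
    "fuel_pump": 2.0e-5,
    "fuel_valve": 5.0e-6,
    "controller": 1.0e-5,
}


def series_failure_rate(rates):
    """Failure rate of a series system: the component rates add up."""
    return sum(rates)


def reliability(rate, hours):
    """Probability of surviving `hours` under a constant failure rate."""
    return math.exp(-rate * hours)


system_rate = series_failure_rate(failure_rates.values())
mtbf = 1.0 / system_rate  # valid only under the constant-rate assumption

print(f"system failure rate: {system_rate:.2e} per hour")
print(f"MTBF: {mtbf:.0f} hours")
print(f"reliability over a 2 h flight: {reliability(system_rate, 2.0):.6f}")
```

Sorting the components by failure rate is then a quick way to see which items contribute most and are candidates for improvement or replacement.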

One of the most renowned RAMS methodologies is the Failure Modes, Effects and Criticality Analysis (FMECA), performed for each system, in which the failure modes of each component belonging to that system are identified. Each item may have multiple failure modes, each with its own causes and effects. The FMECA, an analysis combining Reliability and Safety, is used by RAMS engineers to link component failures to system failure conditions and to estimate, with higher accuracy, whether each failure condition will meet the safety targets defined in the FHA. The link between the Reliability Prediction Analysis and the FMECA is the failure rate of each component: while the RPA estimates the failure rate of an item, the FMECA distributes this failure rate across the different failure modes of the item. If the distribution is unknown, a well-accepted assumption is to distribute the predicted failure rate equally among the failure modes.
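
The split of an item failure rate across its failure modes can be sketched as follows. The function name, the valve and its failure modes are hypothetical; the equal split is the fallback described above for an unknown distribution.

```python
def distribute_failure_rate(item_rate, modes, ratios=None):
    """Split an item's failure rate across its failure modes.

    `ratios` maps each mode to its share of the item failure rate and must
    sum to 1. When it is None, the rate is split equally between the modes.
    """
    if ratios is None:
        ratios = {mode: 1.0 / len(modes) for mode in modes}
    if abs(sum(ratios.values()) - 1.0) > 1e-9:
        raise ValueError("failure mode ratios must sum to 1")
    return {mode: item_rate * ratios[mode] for mode in modes}


# Hypothetical valve with a predicted failure rate of 6e-6 per flight hour.
valve_modes = ["fails open", "fails closed", "external leakage"]

equal_split = distribute_failure_rate(6.0e-6, valve_modes)
known_split = distribute_failure_rate(
    6.0e-6,
    valve_modes,
    ratios={"fails open": 0.5, "fails closed": 0.3, "external leakage": 0.2},
)

print(equal_split)
print(known_split)
```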

FMEA table

Next, a Fault Tree Analysis (FTA) or a similar analysis (e.g., Dependence Diagram, Markov Analysis) allows our RAMS engineers to determine whether each failure condition complies with its safety target. These analyses are usually performed only on the failure conditions that must be quantitatively assessed (e.g. Catastrophic FCs, Hazardous FCs and, possibly, Major FCs) and are fed by the output of the FMECA: the failure modes identified for each item are used as basic events contributing to our top event (our Failure Condition). In the Fault Tree Analysis, the Unavailability of each FC is computed, a value which depends strongly on the average flight time. The longer the flight, the higher the unavailability, and thus the harder it is for the system to comply with the safety target. Furthermore, one of the key advantages of the Fault Tree Analysis is that it identifies the single points of failure (SPoF) of the design, i.e., those failure modes which, on their own, cause the entire system to fail.
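
The dependence on flight time can be seen in a very small fault tree. The sketch below assumes independent basic events with constant failure rates, and the tree itself (two redundant pumps and one shared valve) is a hypothetical example, not a real design.

```python
import math


def event_probability(rate, hours):
    """Probability that a basic event occurs within the exposure time."""
    return 1.0 - math.exp(-rate * hours)


def or_gate(probabilities):
    """Output occurs if any of the independent inputs occurs."""
    survive = 1.0
    for p in probabilities:
        survive *= 1.0 - p
    return 1.0 - survive


def and_gate(probabilities):
    """Output occurs only if all of the independent inputs occur."""
    result = 1.0
    for p in probabilities:
        result *= p
    return result


def top_event(flight_hours):
    # Hypothetical tree: (pump A AND pump B) OR shared valve.
    pump_a = event_probability(2.0e-5, flight_hours)
    pump_b = event_probability(2.0e-5, flight_hours)
    valve = event_probability(1.0e-7, flight_hours)
    return or_gate([and_gate([pump_a, pump_b]), valve])


for hours in (1.0, 5.0, 10.0):
    print(f"{hours:>4} h flight: P(top event) = {top_event(hours):.3e}")
```

In this example the shared valve feeds the top OR gate directly, so it is a single point of failure, while each pump only contributes through the AND gate.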

It is important during the FTA to carefully distinguish between Evident failures, Dormant or Latent failures, and Hidden failures. Dormant failures are not detected when they occur and remain undetected until an inspection or overhaul is performed. In this context, the Maintainability aspect resurfaces. If the planned interval between two periodic inspections is too long, the probability that a dormant failure combines with another failure and leads to a system failure increases, and the system might therefore not comply with its safety targets. In that case, the RAMS engineer looks for solutions that make the system compliant, such as a more appropriate maintenance interval. These time intervals, as well as the maintenance actions (defined in a second iteration of the MSG-3 analysis), are added as Certification Maintenance Requirements (CMR) in ATA chapter 4 of the Aircraft Maintenance Manual.
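
The effect of the inspection interval on a dormant failure can be sketched with a simple worst-case model: the dormant item is assumed to be exposed for the whole inspection interval, the active item only for the flight, and both have constant, independent failure rates. The function names, rates, candidate intervals and target are all hypothetical.

```python
import math


def double_failure_probability_per_hour(active_rate, latent_rate,
                                        flight_hours, inspection_interval):
    """Worst-case probability per flight hour of an active failure combined
    with a latent failure that may have been dormant since the last check."""
    p_active = 1.0 - math.exp(-active_rate * flight_hours)
    p_latent = 1.0 - math.exp(-latent_rate * inspection_interval)
    return p_active * p_latent / flight_hours


def longest_compliant_interval(candidates, target, **kwargs):
    """Return the longest candidate interval meeting the target, or None."""
    compliant = [
        interval for interval in candidates
        if double_failure_probability_per_hour(
            inspection_interval=interval, **kwargs) <= target
    ]
    return max(compliant) if compliant else None


# Hypothetical inputs: both items at 1e-5 per hour, 2 h average flight,
# a target of 1e-7 per flight hour and four candidate intervals in hours.
interval = longest_compliant_interval(
    [250, 500, 1000, 2000],
    target=1.0e-7,
    active_rate=1.0e-5,
    latent_rate=1.0e-5,
    flight_hours=2.0,
)
print(f"longest compliant inspection interval: {interval} h")
```

A real assessment would usually take the exposure times from the fault tree itself; the point here is only that widening the interval raises the probability of the combined failure.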

Fault Tree Analysis in Robin RAMS

Once the FTA is completed, and before manufacture of our first aircraft prototype gets under way, a Zonal Safety Analysis (ZSA), a Particular Risk Analysis (PRA) and a Common Mode Analysis (CMA) are developed. Common Cause Analysis (CCA) is the term encompassing these three analyses, which are conducted to identify individual failure modes or external events that can lead to Catastrophic or Hazardous failure conditions.

In the Common Mode Analysis, our RAMS engineers verify that the inputs to AND gates (the redundancies in our design) do not share generic faults which could cause simultaneous failures. Some common mode sources are, for instance, “Being produced by the same manufacturer” or “Sharing a common external source”.
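
Part of this check can be expressed as a comparison of item attributes under each AND gate. The sketch below is only a screening aid under assumed data: the items, attributes and gate are hypothetical, and a real CMA relies on engineering judgement and checklists rather than attribute matching alone.

```python
# Hypothetical attributes of the redundant items in the design.
items = {
    "pump_a": {"manufacturer": "Supplier X", "power_source": "Bus 1"},
    "pump_b": {"manufacturer": "Supplier X", "power_source": "Bus 2"},
}

# Hypothetical AND gates from the fault tree and their inputs.
and_gates = {"loss_of_both_pumps": ["pump_a", "pump_b"]}


def common_mode_sources(gate_inputs, items):
    """Return the attributes whose value is shared by every input of a gate."""
    first, *rest = (items[name] for name in gate_inputs)
    return {
        attribute: value
        for attribute, value in first.items()
        if all(other.get(attribute) == value for other in rest)
    }


for gate, inputs in and_gates.items():
    shared = common_mode_sources(inputs, items)
    print(gate, "->", shared or "no common mode source found")
```

Here the two pumps are powered from different buses but come from the same manufacturer, so the gate is flagged for further review.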

In the Particular Risk Analysis (PRA), the RAMS engineer identifies which events outside the system under analysis (e.g., lightning strikes, bird strikes, fire, fluids leaking from other systems…) can cause a system failure and affect the airworthiness of our aircraft. From this analysis, a list of safety requirements is derived.

Finally, in the Zonal Safety Analysis, we verify that equipment installations within each zone of the aircraft meet an adequate safety standard. Interference between systems is evaluated, and design and installation guidelines are developed. Once the aircraft is manufactured, the ZSA is closed with an inspection of each zone, in which the implementation of corrective actions is verified.

Iterate until perfection

It is usual that, across the different stages of development of our aircraft (e.g., Preliminary Design Review, Critical Design Review), more than one issue of each analysis is produced. The first issues serve to identify the safety requirements and development assurance levels (DAL) for each assembly, information which is included in the Preliminary System Safety Assessment (PSSA). Later issues serve to implement design changes and verify that each system meets the safety requirements, information which is integrated in the System Safety Assessment (SSA) and, at a higher level, in the Aircraft Safety Assessment (ASA). During these safety iterations, the design and manufacturing teams have already started to build, install and prepare the testing environment for the first production units.
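
As a simplified illustration of how a DAL follows from the failure condition classification, the sketch below assumes the commonly used top-level mapping from severity to levels A to E, driven by the most severe failure condition of a function. It ignores the way the system architecture can change the level assigned to lower-level items, and the function name is hypothetical.

```python
# ASSUMED top-level mapping from failure condition severity to DAL.
DAL_BY_SEVERITY = {
    "Catastrophic": "A",
    "Hazardous": "B",
    "Major": "C",
    "Minor": "D",
    "No Safety Effect": "E",
}


def function_dal(severities):
    """DAL driven by the most severe failure condition of a function.

    Level "A" is the most stringent, so the alphabetically smallest
    letter among the failure conditions wins.
    """
    return min(DAL_BY_SEVERITY[severity] for severity in severities)


# Hypothetical function with one Major and one Hazardous failure condition.
print(function_dal(["Major", "Hazardous"]))
```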

Safety Assessment Cyclic Approach

Ready to be approved

By the time all the SSAs and the ASA are completed, we’ve reached the Validation phase of the lifecycle and we are ready to hand the RAMS documentation to the certification entity for approval.

In the meantime, a battery of tests has been defined and planned to provide additional acceptable means of compliance for the certification entity. Testing is a very important part of the validation phase, whether it is aimed at certification or at customer approval of initial requirements for specific fleets.

If the safety, reliability and maintainability studies have been thoroughly completed and testing proceeds on schedule, our aircraft will receive its Type Certificate, and service and maintenance activities will begin in an operational environment.

What is the role of the RAMS engineer after certification?

Firstly, it is paramount that reliability is tracked during the operational life of a technical system, as the theoretical values calculated during design must be verified. Keeping track of reliability parameters allows manufacturing organizations to control supplier quality thoroughly and to offer their customers harmonized integrated logistics support for spare parts and maintenance supplies. The tool for tracking the reliability of every part is a Failure Reporting, Analysis, and Corrective Action System (FRACAS) which, by collecting failure data from global operations, can derive reliability parameters such as Mean Time Between Failures (MTBF) and provide them to engineering teams. The RAMS engineer verifies these parameters against guaranteed values or expected outcomes. A solid FRACAS provides invaluable insights into the reliability performance of fleets, product types and individual subsystems and parts, while allowing a systematic approach to implementing corrective actions aimed at improving the reliability of the aircraft.
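
A minimal sketch of the comparison between field data and guaranteed values might look like this. It uses the simplest point estimate of the Mean Time Between Failures (cumulative operating hours divided by the number of failures); the records, part numbers, hours and guaranteed values are hypothetical, and parts with no recorded failure are simply left out.

```python
from collections import Counter

# Hypothetical failure records: one entry per confirmed failure.
failure_records = [
    {"part_number": "PUMP-01", "tail": "AC-001"},
    {"part_number": "PUMP-01", "tail": "AC-002"},
    {"part_number": "VALVE-07", "tail": "AC-001"},
]

# Hypothetical cumulative operating hours per part number across the fleet.
operating_hours = {"PUMP-01": 40_000.0, "VALVE-07": 90_000.0}

# Hypothetical guaranteed values from the supplier, in hours.
guaranteed_mtbf = {"PUMP-01": 25_000.0, "VALVE-07": 50_000.0}


def observed_mtbf(records, hours):
    """Point estimate: cumulative operating hours divided by failure count."""
    failures = Counter(record["part_number"] for record in records)
    return {part: hours[part] / count for part, count in failures.items()}


def below_guarantee(observed, guaranteed):
    """Part numbers whose observed value falls short of the guarantee."""
    return sorted(p for p, mtbf in observed.items() if mtbf < guaranteed[p])


observed = observed_mtbf(failure_records, operating_hours)
print(observed)
print("below guarantee:", below_guarantee(observed, guaranteed_mtbf))
```

With only a handful of failures such a point estimate is very uncertain, so in practice it would be read together with the number of failures behind it.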

Secondly, in the maintainability field, the RAMS engineer can bring their expertise in building maintenance plans to operator organizations, such as fleet owners, airlines or charter flight companies. For instance, reorganizing all the maintenance activities required for each aircraft within a structured Master Maintenance Plan (MMP) can provide a comprehensive solution for fleets. The MMP includes the development of the process and the tools to organize and maintain the individual maintenance plans in a coherent way, as well as a structured process to introduce changes in the Scheduled Maintenance section of the Aircraft Maintenance Manual. The effect of reorganizing the maintenance plan is to reduce costs and to optimize resources and manpower.

Finally, safety engineers are needed at all levels of the operational life of tech products in order to keep hazards at bay. A comprehensive Hazard Log is a tool required of all operators of aircraft fleets or equivalent technical systems, such as industrial machinery, pharmacological laboratories, etc. The Hazard Log is a collection of operational hazards and their corresponding Risk Assessments. The safety engineer needs to study the risks of every operation, whether in service or in maintenance, that has the potential to cause harm to personnel or damage to equipment. Mitigations have to be applied to bring the residual risk down to an acceptable level, as defined in the initial operational requirements.
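
One possible shape for a Hazard Log entry is sketched below. The 1 to 5 scales, the severity times likelihood score and the acceptance threshold are assumptions made for illustration; a real project takes its risk matrix and acceptance criteria from its own operational requirements. The hazard itself is a hypothetical example.

```python
from dataclasses import dataclass
from typing import Optional

# ASSUMED acceptance threshold on a severity x likelihood score.
ACCEPTABLE_RISK = 6


@dataclass
class Hazard:
    description: str
    severity: int    # assumed scale: 1 (negligible) to 5 (catastrophic)
    likelihood: int  # assumed scale: 1 (improbable) to 5 (frequent)
    mitigation: str = ""
    residual_likelihood: Optional[int] = None  # likelihood after mitigation

    def initial_risk(self):
        return self.severity * self.likelihood

    def residual_risk(self):
        # Without a mitigation, the residual risk equals the initial risk.
        if self.residual_likelihood is None:
            return self.initial_risk()
        return self.severity * self.residual_likelihood

    def is_acceptable(self):
        return self.residual_risk() <= ACCEPTABLE_RISK


# Hypothetical maintenance hazard and mitigation.
hazard = Hazard(
    description="Fuel spill during refuelling",
    severity=4,
    likelihood=3,
    mitigation="Drip trays and bonding check before refuelling",
    residual_likelihood=1,
)
print(hazard.initial_risk(), hazard.residual_risk(), hazard.is_acceptable())
```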