THE MAGAZINE FOR COMPUTER APPLICATIONS
Circuit Cellar Online offers articles illustrating creative solutions
and unique applications through complete projects, practical
tutorials, and useful design techniques.
A Guide to online information about:
As part of my former Day Job, in August of 1999, I attended the workshop "Programmable Electronic Mining Systems: An Introduction to Safety". Since the software I write for a living controls large (tens of tons) mobile equipment, I wanted to know more about how to implement my systems in a safer manner for all concerned. Sometimes relying on a simple Watchdog Timer is not enough. After all, I don't want to be on the witness stand explaining to lawyers why my system went off and killed and maimed people. How would you feel about being in such a position?
So, this month's Resource Page looks at "System Safety". As was pointed out in the Workshop, a common misconception is to think of the problem as Software Safety. Software does not exist in isolation from the rest of the system. The system as a whole must be designed to be safe.
Mr. Sammarco recommended the following Books:
"Safer C" by Les Hatton.
Something that all of us in this field know in the back of our minds: forty to eighty percent of all system failures, depending on which study was being quoted in the Workshop, were caused by management!
While we'd all love to place 100% of the blame on the Boss Person Hierarchy, it actually means the management of the project, not to say bosses don't contribute to the problem in many forms.
Most causes of system faults are created before the first line of code is ever written or first schematic is ever drawn. The errors are caused by not understanding the requirements of the system.
One simple way of understanding the requirements is to ask yourself how you would test this requirement. If you can't specify a test that can clearly show that the requirement has been met, then the requirement or the understanding needs refinement.
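As a minimal sketch of pairing a requirement with a test that can show it has been met, consider the C fragment below. The requirement "REQ-1", the function names, and the percent range are all assumptions invented for illustration; they do not come from the Workshop material.

```c
#include <assert.h>   /* for the caller's checks shown in the note below */
#include <stdbool.h>

/* Hypothetical requirement REQ-1 (an assumption for illustration):
 * "While the emergency-stop input is active, the commanded drive
 * output shall be 0 percent, whatever the operator requests." */
int drive_command(int requested_percent, bool estop_active)
{
    if (estop_active) {
        return 0;               /* REQ-1: E-stop overrides the request */
    }
    return requested_percent;   /* normal case: pass the request through */
}

/* Test for REQ-1. Returns the number of failed checks, so 0 means the
 * requirement held for every request in the swept range. */
int test_req1_estop_forces_zero(void)
{
    int failures = 0;
    for (int req = -100; req <= 100; ++req) {
        if (drive_command(req, true) != 0) {
            ++failures;
        }
    }
    return failures;
}
```

A caller would simply check `assert(test_req1_estop_forces_zero() == 0);`. If a requirement cannot be reduced to a check of this kind, that is a hint that the requirement, or the understanding of it, needs refinement.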
Some Safety Myths:
The earlier in the design cycle that the requirements are clearly understood the lower the cost of any needed changes.
Within the specification, is there a clear and concise statement of:
(i) each safety-related function to be implemented?
(ii) the information to be given to the operator at any time?
(iii) the required action on each operator command including illegal or unexpected commands?
(iv) the communications requirements between the embedded system and other equipment?
(v) the initial states for all internal variables and external interfaces?
[A big help here is to use that nit-pickiest of all programs, Lint, on the variable issue.]
(vi) the required action on power down and recovery? (e.g. important data saved in non-volatile memory.)
(vii) each mode and the initiators of mode transition? (e.g. start-up, normal operation, shutdown)
[If you go from the Auto Mode to the Maintenance Mode, then back to the Auto Mode, starting over where you left off in the Auto Mode could be deadly in some cases.]
(viii) the anticipated ranges of input variables and the required action on out-of-range variables?
[What happens when you have channel-to-channel leakage in your A/D because the input that you are not reading is driving the entire mux into saturation?]
(ix) the required performance in terms of speed, accuracy, and precision?
(x) the constraints put on the software by the hardware? (e.g. speed, memory size, word length)
(xi) internal self-checks to be carried out and the action on detection of a failure?
Does the software contain adequate error detection facilities allied to error containment, recovery, or safe shutdown procedures?
Are safety critical areas of the software identified?
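Several of the checklist items above, namely defined initial states (v), explicit mode transitions (vii), rejection of illegal commands (iii), and a defined action on out-of-range inputs (viii), can be sketched in a few lines of C. Everything named here (the modes, the ADC limits, the function names) is a hypothetical example rather than part of any specification cited on this page. The sketch also assumes that restarting the automatic sequence from its first step is the safe choice when re-entering Auto Mode; a real system has to decide that case by case.

```c
#include <assert.h>   /* for the caller's checks */
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical limits for a 12-bit A/D reading (assumed values). */
#define ADC_MIN_VALID  200u   /* below this: treat as open or shorted sensor */
#define ADC_MAX_VALID 3900u   /* above this: treat as a saturated input */

typedef enum {
    MODE_STARTUP,
    MODE_AUTO,
    MODE_MAINTENANCE,
    MODE_SAFE_SHUTDOWN
} SysMode;

typedef struct {
    SysMode  mode;
    uint16_t auto_step;   /* position within the automatic sequence */
} SystemState;

/* Item (v): every internal variable gets a defined initial state. */
void system_init(SystemState *s)
{
    s->mode = MODE_STARTUP;
    s->auto_step = 0u;
}

/* Item (viii): the anticipated input range is stated explicitly. */
bool adc_reading_valid(uint16_t raw)
{
    return (raw >= ADC_MIN_VALID) && (raw <= ADC_MAX_VALID);
}

/* Item (viii): the required action on an out-of-range reading is a
 * latched safe shutdown (an assumed policy for this sketch). */
void process_reading(SystemState *s, uint16_t raw)
{
    if (!adc_reading_valid(raw)) {
        s->mode = MODE_SAFE_SHUTDOWN;
    }
}

/* Items (iii) and (vii): mode changes go through one function.
 * Returns false for an illegal request and leaves the state unchanged. */
bool request_mode(SystemState *s, SysMode requested)
{
    if (s->mode == MODE_SAFE_SHUTDOWN) {
        return false;   /* latched: only system_init() leaves this mode */
    }
    switch (requested) {
    case MODE_AUTO:
        s->auto_step = 0u;   /* never resume where Auto Mode left off */
        s->mode = MODE_AUTO;
        return true;
    case MODE_MAINTENANCE:
        s->mode = MODE_MAINTENANCE;
        return true;
    case MODE_SAFE_SHUTDOWN:
        s->mode = MODE_SAFE_SHUTDOWN;
        return true;
    default:
        return false;   /* MODE_STARTUP or unknown values are rejected */
    }
}
```

Routing every transition through `request_mode()` keeps the legal transitions in one reviewable place, which also helps with identifying the safety-critical areas of the software.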
"Design for Assessment"
An easy place
to start designing a safe system actually has little to
do with any style of coding or design. It is a
method of documentation/source code archiving, which is
called by several names. The two most common are
Version Control and Configuration Management. The
hardest part about this is establishing the discipline
to use it. Once you start using it, you'll find
you become dependent on it and wonder how you got along
without it. I know using Version Control has
saved me from hassles many times.
Susan Dart once wrote: "The goals of using CM (Configuration Management) are to ensure the integrity of a product and to make its evolution more manageable. Although there is overhead involved in using CM, it is generally agreed that the consequences of not using CM can lead to many problems and inefficiencies. The overhead of using CM relates to time, resources, and the effects on other aspects of the software lifecycle."
Several book and report summaries on Configuration Management can be found here.
Concurrent Versions System (CVS) provides network-transparent
source control for groups of developers. CVS, the
most popular version control system in the Open Source
community, is available on many platforms.
• Maintains a history of all changes made to each directory tree it manages.
Version Management with CVS by Per Cederqvist, et al.
Considered the main manual for CVS, Version Management with CVS (170 pages) explains underlying concepts, describes how to use all documented features of CVS, and gives examples of command use.
CVS is normally a command line program, but there are versions available for Windows and Macs.
"CVS is a version control system. Using it, you can record the history of your source files."
Revision Control System (RCS)
RCS is a version control system. It offers a basic level of functionality (for example, it operates on one file at a time). We generally recommend a more powerful system, such as CVS, even for beginners.
RCS has been widely ported and reimplemented. The free version of RCS is often called GNU RCS to distinguish it from the non-free implementations.
"RCS is [analogous to using] assembly language, while CVS is [like using] Pascal."
Practical Software Configuration Management by Tim Mikkelsen and Suzanne Pherigo, 1997, gets individuals and small teams started with configuration management. The book covers basic RCS usage and also discusses the larger issues that are illustrated by the RCS examples.
There are several commercial companies offering Configuration Management tools:
I have used
TLIB and always been happy with it.
Do you think they'll put links to Burton Systems Software on their Web sites?:
[I've used this product as well, and personally, I like TLIB better. TLIB never left me wondering, "Why did it do that?"]
Conversion Tools are now available.
Still asking what version control is and why you should use it? One nice introduction is the book Practical Software Configuration Management, which discusses storing your software in version control and handling basic situations, like coordinating edits by several people. It uses RCS in the examples, but many of the concepts would apply to CVS or other version control systems as well.
Read a description of the complete SourceForge package available free to open-source developers.
Who are we?
What are we doing? Why are we doing it? There is
too much information about this project to fit in this
introductory page. You should really take the time to
visit our Frequently
From: "Jones, Paul L." [of the FDA].
"Nancy Leveson Professor of Aeronautics and Astronautics started a new area of research, software safety, which is concerned with the problems of building software for real-time systems where failures can result in loss of life or property. One advantage of this topic is that nobody questions its goals, except for a few misanthropes (who don't matter anyway). She and her students produced a formal requirements specification for TCAS II, a real collision-avoidance system required on all commercial aircraft in U.S. airspace. One of the lessons she learned from this project is never to do anything like it again. The FAA was pleased with it though, and adopted it as their official specification. She claims you should not read anything into the fact she has been taking the train a lot lately."
The System and Software Safety Research Project is also working on modeling and analysis of various aerospace and transportation systems. Subtopics in this research area include modeling and analysis of safety, system and software requirements specification, safe software design, software fault tolerance, and verification and validation of safety.
Nancy Leveson and some students (both former and present) started a
company to commercialize their ideas. Safeware Engineering
Corporation began as a partnership in 1991 to
perform applied research and development for real-time,
safety-critical systems. The goal is to act as a
technology transfer conduit from university research to
industrial research and practice.
Software has become the driving force behind most new technologies. But, the engineering of software is becoming increasingly complicated. A software engineer must balance a variety of competing factors, including functionality, quality, performance, safety, usability, time to market, and cost. Moreover, the size of software systems being built is rapidly growing.
The Software Engineering Research Laboratory (SERL) in the Department of Computer Science at the University of Colorado at Boulder is pursuing the discovery of principles and the development of technologies to support the engineering of large, complex software systems. The challenging targets for this work are organizations and software systems operating in the wide-area, heterogeneous, distributed, and decentralized context of wide-area networks such as the Internet.
Broadly speaking, there are six themes that underlie their research:
The purpose of
Technology Validation Assurance (TVA) is to provide a
short-cycle, low-cost approach to building in quality
through design optimization, and quality and
reliability analysis for critical electronic hardware.
Concurrent engineering, theoretical analysis, and
empirical data are used to produce information about a
technology's mechanical, thermal, and electrical
limitations and its most likely failure
mechanisms. If you want reliability, you've come
to the right place!
The Handbook now has more than 650 pages and contains a compilation of 101 analysis techniques and methodologies, plus other related information for the seasoned safety veteran as well as the new practitioner. New material includes 11 added techniques, an updated software system safety section, and a new section on the application of fuzzy and hybrid mathematics to safety analysis. A glossary is now included. The handbook is in an 8 1/2 x 11 loose-leaf binder so that additions, changes, and your own personal notes can be easily accommodated.
Since I played a small part in the production of "MicroC/OS-II The Real-Time Kernel; A complete portable, ROMable scalable preemptive RTOS", ISBN: 0-87930-543-6, Copyright 1999 (see page XIX of the book), I took particular interest in the news that µC/OS-II is being readied for Safety Critical Systems.
Work is under way to generate a complete validation suite for µC/OS-II. The validation suite provides all of the documentation necessary to prove that µC/OS-II is suitable for use in safety critical systems common to aviation (FAA standard DO-178B level B) and medical products (FDA/CDRH "General Principles of Software Validation, Draft Guidance, Version 1.1"). The validation suite was produced by Validated Software Corp http://www.validatedsoftware.com. When the suite is formally released, µC/OS-II will have been certified for use in an avionics device to FAA standard DO-178B level B. A follow-on package in the 1st quarter of 2000 will bring µC/OS-II and the suite up to full DO-178B level A certifiability.
The Real-Time Software Engineering Branch develops ground data systems for integration and test and on-orbit operations of Earth and space science missions. Branch personnel participate in teams with flight projects, principal investigators, AETD centers, and other organizations to develop integrated hardware and software systems for real-time mission support. The system functionality includes spacecraft, instrument, ground system monitoring and control, launch and tracking services, and data display and analysis.
Software Safety standard
Ultra Long Duration Balloon Flight Software
Recent advances in composite super-pressure balloon materials greatly enhanced the prospects for long duration balloon flights on Earth, as well as possible use for planetary exploration. NASA is embarking on the development of technologies to support extended balloon missions lasting up to 100 days (~5 circumnavigations of the globe) above 99.9% of Earth's atmosphere.
The purpose of
the ULDB flight software effort is to process, monitor,
and control data received and collected on the airborne
instrumentation package. The flight software will
facilitate all communications with the instruments on
board and the ground through continuous line of sight
and over the horizon communications.
The goal of this project is to develop formal-method based system design tools with an emphasis on reliability, safety, and security. The tools will utilize models, generic in construct but domain specific for each application. This project is a joint effort with Sandia National Laboratories.
Objectives: The primary objectives of this project involve increasing the reliability, safety, and security of systems by performing analysis at design time. Analyzing these aspects of a system can greatly reduce the time required to produce a safe, secure, and reliable system. The analyses will rely heavily on formal verification and validation methods.
Technical Overview:
Additional information can be found by downloading the following papers:
"Multi-Domain Surety Modeling and Analysis for High Assurance Systems," Proceedings of the Engineering of Computer Based Systems Conference, Nashville, TN, March 1999.
"Formal-Methods" might deserve their own Resource
Page, but in the meantime, the work of Associate
Professor Steven D. Johnson
"Formal Methods: <mathematics, specification> Mathematically based techniques for the specification, development, and verification of software and hardware systems." (from the Free On-Line Dictionary of Computing)
Phyllis Frankl is an Associate Professor in the Department of Computer and Information Science at Polytechnic University, in Brooklyn, NY. Her main research interests are in the area of software testing, including development of software testing tools and comparing the effectiveness of software testing techniques. Her page has many links to areas such as effectiveness of testing techniques and testing object-oriented programs.
"Adaptive Testing of Non-deterministic Communications Protocols," 6th International Workshop on Protocol Test Systems, Sept. 1993, (with M. Ghriga).
"Ada Reports & Papers (Part 8)" has many abstracts covering Software Safety and Engineering, mostly done by or for the Military.
Q. Does DoD offer guidance for programming language selection?
A. Yes. The DoD Joint Technical Architecture V2.0 references the Information Technology Standards Guidance V3.1, which includes the following guidance:
Language Issues Concerning "C":
"While C is a popular programming language, it is not the best for safety-critical systems.
In particular, care needs to be taken in the software design
phase with regard to:"
If you're unsure of what Ada is about, then check out this free compiler:
GNAT Ada 95 compiler for MS-DOS
ezl1998a.zip GNAT Ada 95 compiler
readme files, install...
EZ2LOAD 1998 edition is a distribution of the
famous GNAT Ada 95
Delorie Software provides DJGPP, which is a complete 32-bit C/C++ development system for Intel 80386 (and higher) PCs running DOS. It includes ports of many GNU development utilities. The development tools require an 80386 or newer computer to run, as do the programs they produce. In most cases, the programs it produces can be sold commercially without license or royalties. CVS has been ported to DJGPP as well.
Empirical studies of software engineering (ESSE) refers to the application of the experimental method of science to the study of software engineering. This method involves generating hypotheses to test theories and then collecting and analyzing data to refine the theories. The method is stepwise in that theories are successively refined until an accurate model of the studied phenomenon is achieved. For software engineering this means using the process and products of software engineering as the objects of study. In practice, the empirical study of software engineering is so new that few theories exist to test and/or expand. Thus, much of the work in ESSE is exploratory in nature. They are interested in first mapping out what it is that software engineers really do, and from that knowledge extracting patterns of behavior that are general across different work environments and types of software engineering (e.g., safety-critical systems, telecommunications, etc.).
[Update Jan/2007: The reliability analysis center is now the Alion System Reliability Center (SRC). They operated the Reliability Analysis Center (RAC) under contract to the Department of Defense for 37 years.]
The Reliability Analysis Center (RAC) is one of 13 DoD Information Analysis Centers chartered to collect, analyze, and disseminate information on reliability, maintainability, and related assurance technologies such as quality, availability, and now supportability. RAC maintains a technical library, a staff of technical specialists, and computerized databases. It performs special studies under approved technical area tasks (TATs), publishes a current awareness bulletin (The RAC Journal), and has produced a variety of technical products for sale. RAC also conducts standard and customized training courses in assurance related technologies.
The Reliability Analysis Center (RAC) is one of the DoD Information Analysis Centers (IACs). The IACs are chartered by the DoD to collect, analyze, and disseminate data and information in a designated technical area of specialization. Information is distributed to DoD and industry via databases, methodology handbooks, state-of-the-art technology reviews, training courses, and consulting services.
RAC's scope is the reliability, maintainability, quality, and supportability of microcircuits, semiconductors, electromechanical and mechanical parts, and equipment/systems employing these parts.
Web Sites" page presents a representative selection
of World Wide Web sites that provide an excellent
starting point for reliability related information
available on the Web.
"Applying Software Reliability Engineering (SRE) to
Build Reliable Software (Start
This START attempts to clarify the common goal, to identify the essential components of the software reliability engineering discipline which, if implemented, can guide an organization to developing more reliable software.
PRISM is the new Reliability Analysis Center (RAC) software tool that ties together several tools into a comprehensive system reliability prediction methodology. The PRISM concept accounts for the myriad of factors that can influence system reliability, combining all those factors into an integrated system reliability assessment resource.
PRISM was developed to overcome inherent limitations in MIL-HDBK-217, which is no longer being actively maintained or updated by the Department of Defense (DoD).
Quarter-1999 issue of "The Journal of the
RAC Reliability Analysis Center" had a great
deal of information on PRISM.
contains pointers to information on Safety-Critical
Systems, where human lives may be at risk,
especially involving software and computers, available
around the world on the World Wide Web (WWW).
Pointers to relevant newsgroups, books, journals,
repositories, and mailing lists can be found here as
first developed in 1994, contains a listing of over
2,000 health and safety Internet based
critical systems are defined briefly as those where
software is used in a way which might affect
safety. The Institution has undertaken a number
of tasks relating to it, details of which can be found
Software Engineering Institute (SEI) is a federally
funded research and development center established in
1984 by the U.S. Department of Defense with a broad
charter to address the transition of software
engineering technology. The SEI is an integral
part of Carnegie Mellon University and is sponsored by the
Office of the Under Secretary of Defense for
Acquisition, Technology, and Logistics
[OUSD (AT&L)].
The first International Conference on the Safety of Industrial Automated Systems welcomed over 200 participants. The quality of the presentations, the range of subjects, and the presence of 15 exhibitors made this event a truly valuable survey of the field for the researchers, equipment designers, industry and prevention specialists in attendance.
Proceedings, containing the full text of 44
presentations, are now available at a price of CAN $50.00
Maybe not exactly in line with the ideas of this "System Safety" page, but you never know when you might need some type of potting compound or heat-sink compound for your latest project. Make sure you have a "Material Safety Data Sheet" for any chemical, even a small sample, that comes in your door. Preferably, get the MSDS before you get the sample in your door. Proper disposal of some items costs far more than a small "free sample". If your supplier can't supply you with the MSDS then you don't need their product.
Toxic Substance Control Act
Material Safety Data
I'll leave you with one last parting thought from Shakespeare -
2 KING HENRY VI Act 4, Scene 2; Blackheath.
DICK, the butcher:
"The first thing we do, let's kill all the lawyers."
you want to be on the witness stand after
something goes wrong,
The fact that an item is listed here does not mean we promote its use for your application. No endorsement of the vendor or product is made or implied.
If you would like to add any information on this topic or request a
specific topic to be covered, contact
provides up to date information for engineers, www.circuitcellar.com
for more information and additional