Many commentators and debunkers of the year 2000 computer problem fail to appreciate its true nature and scope. As a result, they come up with calming statements that do not address the full issue. It is worrying when it is programmers and others in the computer field who fail to understand the cause and implications of this very real threat to us all. Because they are closest to the problem, their opinion carries a lot of weight with management and the general population. Their view is especially welcome when it is good news that we all would like to believe.
Programmers have in the main been responsible for creating the year 2000 problem because of a lack of discipline in the computer industry, which allowed them to decide individually how to handle dates. Ironically, the result is that they are now able to cash in on fixing the problem and charging increasingly higher rates to do so. Contrary to some people’s opinion, I wouldn’t give programmers or Bill Gates the credit for having the foresight to have deliberately created these problems.
The problem is not limited to the practice of holding only the last two digits of the year in data files. Most systems written over the past decade record the full four-digit year in one form or another, yet year 2000 issues still occur in the software. Some commentators state that developers knew they were creating a time bomb but thought the software would be obsolete or re-written before the turn of the century. However, in the case of exceptions I have found, compliant code would have been just as easy, if not easier, to produce. This confirms to me that most programmers, like most of the population, haven’t given this matter any thought at all.
An exception I’ve found several times during Y2K software audits is in calculating with two-digit years to find next year’s anniversary date, last month-end date, next financial year, etc. Instead of using a readily available four-digit year value to add or subtract from, I found several cases where the programmer has arbitrarily decided to use just the last two digits of the year. In one case the two-digit year value was prefixed with a hard-coded 19 at the end of the calculation. Ninety-eight plus one works okay but 99 plus one gives 100. Prefixed with 19, what should be 2000 appears as 19100. Invalid years such as this may fail in date conversion routines, give faulty elapsed time values, halt a program, corrupt data or have some other unplanned-for outcome. Each produces its own unique consequences.
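The arithmetic described above can be reproduced in a few lines. This is only a sketch of the pattern, written in Python for convenience; the function names are hypothetical and are not taken from any audited system.

```python
def next_year_buggy(year: int) -> str:
    """Mimic the faulty pattern: calculate on the last two digits,
    then prefix a hard-coded '19' at the end."""
    two_digit = year % 100               # century arbitrarily discarded
    return "19" + str(two_digit + 1)     # hard-coded century prefix


def next_year_correct(year: int) -> str:
    """Add directly to the four-digit year that was available all along."""
    return str(year + 1)


print(next_year_buggy(1998))    # 1999  (98 + 1 works)
print(next_year_buggy(1999))    # 19100 (99 + 1 gives 100, prefixed with 19)
print(next_year_correct(1999))  # 2000
```

Note that the compliant version is the shorter of the two, which is the point made above: doing it properly would have been no more work.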
Many computer people would say: “But no good programmer would do that”. Well, good, bad or indifferent they have and will continue to do so until it is brought to their attention. The date-handling issue has appeared straightforward, so we programmers coded it up whatever way we liked, possibly doing it differently each time. When we produce code there is no one way of doing it. We make it up as we go along and pride ourselves on our creativity. Often programming languages include standard routines to manipulate dates but there is no guarantee that a programmer will use them. Whether the language is Cobol, Access, Dbase, Visual Basic, C, Foxpro, Paradox or Revelation and whether the operating system is Windows, DOS, Unix, Pick or Apple Macintosh, exceptions can and do occur.
In my many years’ experience as a programmer I have very rarely had new or changed code checked by anyone else. This is typical for most programmers and therefore bad coding habits are never corrected. If the code appears to be working it is assumed to be okay. Many programmers leave it to the software users to find the problems for them. Are these users expected to test for year 2000 exceptions?
An exception I found recently was the validation on entry of the year value. The programmer arbitrarily specified 1988 to 2000 as being the valid range for an acceptable year value. Back in 1988, the programmer thought 2000 would never happen. If you asked the programmer then if they thought the software would still be around in 12 years, the answer would probably have been yes. This particular problem, and many others, would be unlikely to be picked up by a Y2K audit involving software testing only. A full code check is necessary.
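A minimal sketch of that kind of entry validation is shown here in Python. The first function reflects the hard-coded range described above. The second is one possible alternative, a window relative to the current date; its name, parameters and the 12-year window are assumptions for illustration only, not the fix actually applied.

```python
from datetime import date
from typing import Optional


def is_valid_year_hardcoded(year: int) -> bool:
    """The pattern found in the audit: a fixed range chosen in 1988."""
    return 1988 <= year <= 2000


def is_valid_year_relative(year: int,
                           today: Optional[date] = None,
                           years_back: int = 12,
                           years_ahead: int = 12) -> bool:
    """Hypothetical alternative: accept a window around the current year,
    so the range moves forward as time passes."""
    current = (today or date.today()).year
    return current - years_back <= year <= current + years_ahead


# Entering next year's date during 1999 passes, so testing with
# 1999 and 2000 values shows nothing wrong with the hard-coded check.
print(is_valid_year_hardcoded(2000))                               # True
print(is_valid_year_hardcoded(2001))                               # False
print(is_valid_year_relative(2001, today=date(1999, 6, 1)))        # True
```

The hard-coded check only fails once someone enters 2001, which is why it tends to be found by reading the code rather than by running it.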
Other examples include using the two-digit year format for DOS file names (affecting sort, archive, selection functions and application interfacing), file indexes (particularly secondary indexing and index-only files), specifying the accounting month in YYMM format, incorrect calculation of a leap year (1900 isn’t a leap year, 2000 is a leap year; some programmers only coded half of the century rule into their leap-year calculation), hard-coded 19, incorrect handling of interfaces with other systems and so on. The consequences of Y2K exceptions vary from the very minor to the very serious. It is the compounding effect of all these problems happening at the same time that is the real threat.
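The leap-year case is easy to show. The sketch below, in Python with hypothetical function names, compares a calculation that codes only half of the century rule with one that codes all of it. The standard library's calendar.isleap is used as a cross-check.

```python
import calendar


def is_leap_half_rule(year: int) -> bool:
    """Half of the century rule: century years are never leap years.
    Correct for 1900, wrong for 2000."""
    return year % 4 == 0 and year % 100 != 0


def is_leap_full_rule(year: int) -> bool:
    """Full rule: century years are leap years only if divisible by 400."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


for year in (1900, 1996, 2000):
    print(year, is_leap_half_rule(year), is_leap_full_rule(year),
          calendar.isleap(year))
# 1900 False False False
# 1996 True True True
# 2000 False True True
```

A program using the half rule would run correctly for decades and then skip February 29, 2000.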
It is ironic that many programmers who are currently telling their management, customers, fellow workers, families and friends that this year 2000 issue is just hype are probably the ones who have incorporated this oversight into their code. If they haven’t the logic to research and work through the year 2000 issue and then draw the only possible logical conclusion, ie we have a very big problem here, then they are probably the ones who overlooked the fact that 2000 was closing in. They are also likely to think that an audit of their past code would be unnecessary.
My recent year 2000 work in the banking industry and my 20 years working as an independent computer consultant with more than 100 companies, using many types of hardware and software and coding in more than 10 computer languages, have led me to believe that this is not a problem we should be downplaying. I have seen how easily these exceptions slip into code and how tedious the work of finding them is. Fixing the problems is relatively easy and usually straightforward. The difficult part is (a) management awareness and support, (b) understanding the nature of the problem, (c) knowing what needs to be checked and systematically checking it, (d) adequate testing of fixes and (e) having the skilled resources and time available to do the job properly.
Of particular concern are problems compounded by:
• Loss of program source code.
• Object code that does not match any existing source code.
• Missing compilers.
• Complex systems that require considerable information flow between various parties to resolve, and/or extensive integration testing.
In the mid-1990s I contracted at a site employing 200 programmers and analysts. As quality inspector, I coordinated and facilitated analyst and programmer meetings so we could work through all new maintenance and development specifications to detect any problems and oversights. In the six months I held this position I worked with 90% of all programming staff, reviewed 90% of all work to be done and regularly attended meetings with management on quality issues. In that time the year 2000 issue was never raised — not by myself, not by management and not by any of the programmers or analysts.
This delay in taking action is typical of most organisations. If decision-makers had been made aware of and understood the problem a few years back, then organisations could have saved millions by integrating the changes required with normal maintenance and prevented new problems by including a check in their quality procedures. They also could have gained a business advantage over their competitors by being first to announce their compliance. Although there were compelling reasons for companies to take action, they didn’t. They failed to see 2000 on the horizon and had no idea of the scope of the problem it posed. Many organisations are still not taking any action and it is quite possible they will be unable to avert disaster.
Ask the infrastructure companies you depend on (for example, electricity, gas, power, telephone, local government) the following questions.
• When did you become aware of the problem?
• Excluding meetings and writing reports, when did your organisation take action?
• Have you ensured that your suppliers will make it? (In particular, if they are based in Asia or Europe.)
• When do you plan for all work to be completed?
You may be very surprised and alarmed by the answers you receive. It is a myth that most of these companies have known about the problem for some time now. Have they left themselves enough time? Have their good people already left, leaving the not-so-competent to fix the problem? How often do even good programmers get it wrong? How often do computer departments fail to meet deadlines? How often do computer projects fail completely?
The more serious threat to infrastructure is its reliance on embedded chips, also known as microchips, embedded systems and PLCs. This aspect of the problem is far more prevalent than the problems posed by relatively accessible computer software. Embedded chips contain code, just like the software written for “normal” computers, but it is buried on a tiny chip which is difficult to check for compliance. In most cases, compliance can only be determined through extensive testing. Any stupid mistake that we are capable of making as programmers of “normal” software is just as likely to happen, and has happened, in embedded software. Testing to determine if there is a problem, rather than checking the code, gives no guarantee of finding all possible problems. This is an area known to relatively few people, and not something ordinary computer people are able to deal with. It is this aspect of the problem that will cause the greatest disruption to the lives of everyone on this planet. For example, many thousands of embedded microchips are used in typical power generation systems and the failure rate for these has been estimated at 3% to 10%. Billions of microprocessors are sold every year and are used in oil refineries, telephone exchanges, sewerage treatment plants, traffic light control systems, emergency vehicles, PABX systems and some water reticulation systems.
The issue is not how simple, stupid or boring this problem is, not whether we programmers were incompetent or how great we may be individually as programmers, not whether lawyers and computer people will amass a fortune from it, not whether the millennium starts with 2001 and not 2000 and it has little to do with technical explanations of how computers work. The very real fact is that the problems are there and each problem is unique, so there will be no simple answers, just a lot of very laborious and mostly boring work.
The effect of each exception in most cases is not the problem, just a computer programmer’s mistake which, in normal circumstances, is reported and fixed. En masse, however, we have a very large headache that will start niggling us as we lead up to the end of 1998, begin to give us some real pain in January 1999 and then hit with real force on January 1, 2000, and thereafter. With a lot of luck and much more effort than we are currently seeing, we could avert the worst effects.
Amon is a computer consultant based in Nelson.