Business Objects

Delivering Cooperative Objects for Client/Server

 

 

 

By

Oliver Sims

10th December 1993

Re-published 12th January 2004

oliver.sims@simsassociates.co.uk

 

 


 

Edition Notice:

This edition is based on the manuscript sent to the then publisher (McGraw-Hill) in December 1993.  There may be minor wording variations from the text published in 1994, although nothing of any substance has been changed.  Both the layout and the punctuation style have been retained unchanged.

Copyright © 2001-2004 Sims Associates. All rights reserved. No part of this document may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior permission of Sims Associates, with the exception of quoted text of not more than half a page.

 

 


Contents

Foreword
Preface (2004)
Preface (1993)
Trademarks
Introduction and Management Summary

Part One: The user—exploiting the PC

1  Usability—the new system bottleneck
1.1  The usability challenge
1.2  A vital business need
1.3  The application problem
1.4  Usability
1.5  New user interfaces

2  Making computers familiar: object-based user interfaces
2.1  The character-based terminal approach
2.2  A PC-based approach
2.3  Presenting objects on the screen
2.4  The ‘workbench problem’
2.5  The object-based user interface
2.6  Summary

3  Cooperative business objects
3.1  The ‘integration’ problem
3.2  Objects and Messages
3.3  A re-statement
3.4  Summary

Part Two: The programmer—application structures

4  Structural overview
4.1  Base requirements
4.2  Application code structure
4.3  The user vs. the data
4.4  Summary

5  The user interface domain
5.1  User logic
5.2  ‘Models’ and ‘views’
5.3  The view ‘layout’
5.4  Connecting to the server
5.5  Summary

6  The shared resource domain
6.1  The nature of the SRD
6.2  SRD design points
6.3  Overview of SRD structure
6.4  SRD components
6.5  SRD CBOs
6.6  Summary

7  End-to-end summary
7.1  Domain interaction
7.2  Components of the end-to-end model
7.3  Component Behaviour
7.4  Model subsets

8  The CBO infrastructure
8.1  Why an Infrastructure?
8.2  Common Components
8.3  CBO aspects
8.4  Binding
8.5  Code page and data conversion
8.6  ‘Alien’ objects
8.7  Operating system implications

Part Three: Design issues

9  Data placement
9.1  Scope of data
9.2  Kinds of Data
9.3  Local data and availability

10  Data integrity
10.1  Data on the PC
10.2  Units of work
10.3  To lock or not to lock
10.4  Commit scope start and end
10.5  Update control (multiple SRDs)

11  The ‘megadata’ problem
11.1  The problem
11.2  Approaches to solutions

12  Business processes
12.1  Objects and business rules
12.2  Processes and units of work
12.3  Workflow management
12.4  Aspects of objects
12.5  Location of business logic

13  Common design concerns
13.1  Introduction
13.2  UID vs. SRD
13.3  Design in the SRD
13.4  Design in the UID
13.5  Methodologies and techniques

Part Four: Management implications

14  Technical implications
14.1  Client/server infrastructure
14.2  Network considerations
14.3  Systems management

15  People Implications
15.1  Myths, dreams and panaceas
15.2  The ‘end-to-end view’
15.3  Object orientation skills
15.4  Impact on organizational structure

16  Getting started
16.1  The ‘kickoff’ process
16.2  The project
16.3  The client/server spectrum

Part Five: The Future of CBOs

17  CBOs today
17.1  Customization
17.2  Portability
17.3  IT advantages

18  CBOs tomorrow
18.1  IT as the facilitator of change
18.2  User-written function
18.3  ‘Out-of-the-box’ integration
18.4  A market in objects
18.5  Looking forward

Appendix 1  Object orientation
A1.1  Introduction
A1.2  What is an object?
A1.3  Classes
A1.4  Polymorphism
A1.5  Inheritance
A1.6  An example
A1.7  Objects, and data on disk
A1.8  Summary

Appendix 2  A technical description of CBOs
A2.1  Introduction
A2.2  The definition of ‘CBO’

Appendix 3  Messaging—‘send’ vs. ‘send/post’
A3.1  Single API
A3.2  Two APIs

Appendix 4  Local model vs. model
A4.1  A Single model object?
A4.2  The Need for a local model
A4.3  Conclusion
A4.4  Advantages

Appendix 5  A UID prototyping technique
A5.1  Some initial considerations
A5.2  Overview
A5.3  The design process
A5.4  Build the prototype

Appendix 6  Sample project proposals
A6.1  The ‘pathfinder’ project
A6.2  The ‘vision prototype’ project

References and bibliography

Glossary

Index

 



 

 

 

 

 


 

 

 


Foreword

 

The IBM McGraw-Hill Series

IBM UK and McGraw-Hill Europe have worked together to publish this series of books about information technology and its use in business, industry and the public sector.

The series provides an up-to-date and authoritative insight into the wide range of products and services available, and offers strategic business advice. Some of the books have a technical bias, others are written from a broader business perspective. What they have in common is that their authors—some from IBM, some independent consultants—are experts in their field.

 Apart from assisting where possible with the accuracy of the writing, IBM UK has not sought to inhibit the editorial freedom of the series, and therefore the views expressed in the books are those of the authors, and not necessarily those of IBM.

Where IBM has lent its expertise is in assisting McGraw-Hill to identify potential titles whose publication would help advance knowledge and increase awareness of computing topics. Hopefully these titles will also serve to widen the debate about the important technology issues of today and of the future—such as open systems, networking, and the use of technology to give companies a competitive edge in their market.

IBM UK is pleased to be associated with McGraw-Hill in this series.

                                                                                    Sir Anthony Cleaver

                                                                                    Chairman

                                                                                    IBM United Kingdom Limited

 

 


 

Preface (2004)

Business Objects is now out of print, and the publishers returned the copyright to me some time ago. I thought it might be interesting to make the book available (possibly only to niche IT historians!) in PDF format. It is presented here as submitted to the publisher in 1993, before copy-editing. Hence there are some trivial differences in wording between this edition and the published 1994 edition.

While the book is in some respects dated, I believe much of it applies today; only the language and terminology have changed somewhat. The main change has been the term ‘component’. In 1993, ‘component’ had not yet come to mean software artifacts such as COM, EJB, and CORBA Components; it was used indiscriminately for any piece of software or indeed hardware.

However, what I was indeed writing about—quite consciously—was software components, autonomously developed, pluggable (into a component container), and composable. Lacking a generally acceptable term, I coined my own—cooperative business object or CBO. Hence when you read ‘CBO’ or ‘object’, please think today’s ‘component’; and when you read ‘component’ please read it in its older meaning.

For those who may have read Business Component Factory by Peter Herzum and myself (Wiley 2000), the CBO concept maps exactly to a “Distributed Component”.


Preface (1993)

Complex problems have simple solutions, which are wrong.

Client/server systems promise major benefits for businesses, organizations and people.  Promises include exceptional ease-of-use, application integration and taking IT off the critical path of business change.  The question is, how do we realize these benefits?

This book describes an approach to developing application systems that can play a significant part in turning these promises into achievable goals.  It is a book about realizing the dramatic advantages to be gained from the application of object orientation to client/server systems.  It describes a new approach to structuring application software.  By applying object orientation to the end product of the application development process (rather than to components used within the process) we provide both ease of programming for distributed systems and a foundation for application integration and rapid development.

Based on experience with industrial-strength systems, this book describes how delivering business-sized objects instead of applications best meets the application design and coding problems otherwise found in distributed and client/server systems.  Such business-sized objects are independently developed using either procedural or object-oriented programming languages.  They cooperate with each other in providing business function, and hence we call them ‘cooperative business objects’, or CBOs.  The term ‘CBO’ is used to distinguish between a business-sized object as an end product, and an object used by a developer as a component of an object-oriented application.  Such an application is not itself an object. A CBO is.
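As a rough illustration of what ‘cooperating’ might look like, the following is a minimal sketch only, not the infrastructure described later in the book. It assumes that cooperation happens through named messages routed between independently written objects; the class names, the registry, and the send/receive methods are all hypothetical and introduced purely for this example.

```python
class CooperativeObject:
    """Hypothetical base for a business-sized object that is delivered as an
    end product and reached only through messages (an assumption of this sketch)."""

    def __init__(self, name, registry):
        self.name = name
        self.registry = registry
        registry[name] = self  # make this object findable by name

    def send(self, target, message, **data):
        # Route a message to another object by name, not by direct call.
        return self.registry[target].receive(message, **data)

    def receive(self, message, **data):
        # Dispatch an incoming message to a handler method, if one exists.
        handler = getattr(self, "on_" + message, None)
        if handler is None:
            raise ValueError(f"{self.name} does not understand '{message}'")
        return handler(**data)


class Customer(CooperativeObject):
    def __init__(self, registry, credit_limit):
        super().__init__("customer", registry)
        self.credit_limit = credit_limit

    def on_check_credit(self, amount):
        return amount <= self.credit_limit


class Order(CooperativeObject):
    def __init__(self, registry):
        super().__init__("order", registry)

    def on_place(self, amount):
        # The order cooperates with the customer object to supply business function.
        approved = self.send("customer", "check_credit", amount=amount)
        return "accepted" if approved else "rejected"


registry = {}
Customer(registry, credit_limit=1000)
order = Order(registry)
result = order.receive("place", amount=500)
```

Neither object here knows how the other is implemented; each knows only the other's name and the messages it understands, which is the property the sketch is meant to show.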

Just as transaction programs rather than batch suites were the best ‘shape’ for the on-line systems of the 70s, so CBOs are the best shape for the systems of the 90s.  Providing this new shape of deliverable is, perhaps, the real business of objects.

The book is also about ease of programming.  CBOs make client/server systems with advanced object-based graphical user interfaces easy for the average application programmer; and, indeed, for the casual programmer.  This ease of programming is achieved because of the new application structures introduced.

This book is written for IT managers, AD managers and technical professionals interested in implementing effective client/server systems.  Although the concepts developed are widely applicable, the context for the book is medium-to-large IS systems, typically using at least one mainframe (or mini) computer, running core business systems.  In particular, it is assumed that ‘client/server systems’ means systems where powerful desk-top PCs are connected to one or more larger systems which manage a company's shared computer-based resources (typically data). 

This is not a book about application design methods; nor does it address in any detail the various aspects of connectivity (line protocols, connection APIs, etc).  It is not a book about systems management, or the selection of hardware and operating system components; neither does it review existing client/server software, or OOPLs.  It is a book which addresses these three questions:     

        What does it mean to ‘exploit fully’ the client/server system structure—from the point of view of the users of such a system?

        What ‘shape’ of application-level code should we be aiming at, and what design approaches should we be adopting if we are to meet our ease-of-programming objectives?

        What additional system-level code (or “middleware”[1]) is required to enable application programmers easily to produce code which meets user requirements of these new systems?

Lack of good answers to these three questions has been a prime cause of the difficulties found over the past several years in exploiting fully the new system structures available to us.

This book describes the development of thinking, and the conclusions reached to date, in addressing all of these problems as a whole.  From several directions, the concept of the CBO has evolved as a solution.  This concept has been proved through the development of production-strength software.  The detail of this software is not within the scope of this book; the concepts which that software implements are. 

Interest in ‘business objects’ as a topic is growing in the industry.  Recently, the Object Management Group started a special interest group in business objects.  Just as this book was completed, I received a draft of Rob Prins’ book, ‘You Can Bank on Objects’ (as yet unpublished).[2]  This presented a particularly exciting application of the business object concept for very large-scale distributed systems (and for smaller-scale systems!).  Rob works for IBM in the Netherlands, and is seconded as principal technical consultant to a wholly-owned IBM Netherlands subsidiary, Cyclade Consultants, in Utrecht.

History

At the end of 1986, I took on the technical support role for IBM's Systems Application Architecture (SAA) (announced on 17 March 1987) within IBM United Kingdom.  Towards the end of the following year, IBM introduced the concept of ‘cooperative processing’ into SAA. 

In essence, this concept pointed towards a vision of systems which blended the best of the PC with the best of the Mainframe.  The PC brought user interface capabilities that far surpassed the non-programmable video display terminals typically attached to mainframes; the mainframe brought shared resource management capabilities (large shared databases, for example) that could not be matched on the PC.

Early in 1988, I and a small group of colleagues started to build a demonstration application which would illustrate the benefits of cooperative processing.  Using a pre-release version of the IBM OS/2 operating system, we set to work.  However, it soon became apparent that most of what we knew about structuring applications (learned through experience with batch, interactive and transaction processing applications) had to be thrown away.

We started saying to each other, “Hey, this is a new world...”  And although some of what we had learned with early IBM distributed systems (e.g. the IBM 3790 and 8100 systems) remained applicable, to a large extent we found ourselves in an unexplored and uncharted land.  Early in 1989, I wrote an internal IBM discussion paper entitled ‘The New World’.  As a result of this and other related papers, the phrase ‘New World’ came to have some small currency among IBM people in the UK and in a few other countries. 

One thing about this new world which is very apparent is that we cannot treat the PC merely as a kind of souped-up terminal.  The PC is a complex and powerful computer system in its own right.  What we have is not a simple hierarchical network with intelligence in the centre controlling a large number of simple devices; we have a peer-to-peer network of large numbers of powerful computers. For the last five years (among other things), I have worked towards understanding how best to exploit this new system structure, focusing on the following goals:

        Ease of application programming (for both the professional and the casual programmer)

        Full exploitation of the PC graphical user interface

        Cooperative application code (cross-system)

        Application integration

Today, we can say that we have had a considerable measure of success in achieving these goals.  At the time of writing, achieving all of the goals is within sight.  There is a great sense that although the new technology of CBOs is in its early form, it is fundamentally moving in the right direction. 

This book is a summary of the concepts proved and lessons learned in addressing these goals through a series of industrial-strength software developments.  Throughout, a central theme has been to build an understanding of the appropriate ‘shape’ of application software (CBOs), and to construct a software infrastructure that enables that shape.  The infrastructure developed has been used successfully by a small number of UK companies in some major projects. 

In August 1993, IBM and Softwright (a UK-based software house) launched a joint venture company called Integrated Object Systems Limited. In May 1994, Integrated Objects announced and shipped a product called Newi (New World Infrastructure).  Newi is a CBO-enabling software infrastructure which implements, among other things, the concepts described in this book.

Acknowledgements

Writing this book has been largely a spare time effort.  I could not have done it without the unfailing support of my wife, Heather, who gave me the encouragement I needed to develop some of the initial concepts, and pursue their development.

Concepts, however, are only proven through excellent software design and industrial-strength code.  Here I owe a great debt to Martin Anderson, Chairman of Softwright Systems Limited (and now Managing Director of Integrated Objects), and Alan Boother, chief software designer at Softwright.  Both Martin and Alan shared in the vision.  Martin caused it to be realized; Alan, through his technical expertise, transformed it into reality.

Late in 1991, Charles Brett of Spectrum Reports encouraged me to put some of the concepts into writing.  His support, and editorial assistance, helped me hone my ideas to a considerable extent.  Several sections in this book include extracts from articles written for the February 1992, May 1992 and August 1992 editions of SAA and Open Software Spectrum, and for the November 1992 edition of Open Transaction Management Reports. These extracts are reproduced by kind permission of the Publisher.

In May 1992, IBM in Sweden announced a product called ‘Object-Oriented Infrastructure/2’.  This was developed by Softwright under a contract for which I was the technical architect.  This development helped immeasurably in moving the concepts of CBOs and the required CBO infrastructure forward.  It could not have been done without the enthusiastic support and dedication of Johan Emilsson, Lars Magnusson and Thomas Jonsson, all of IBM Sweden.

I must also thank Martin Anderson (Integrated Objects), Rob Prins, David Hutton-Squire, Dave Schofield, Ray Warburton and Chris Winter (IBM), who read drafts of the book, and whose comments were always relevant and helpful. In particular I would like to thank Hugh Varilly of IBM, whose detailed reading and checking of the final draft was especially beneficial in both technical and non-technical areas. Section 1.4 derives largely from an internal IBM paper written by Ray Warburton, and sections 9.2 and 10.3 benefited greatly from discussions with Dave Schofield and from an internal IBM paper written by him. Any errors or omissions, however, I claim as my own.

Finally, I am grateful to the many professional colleagues in IBM, both in the UK and in other countries, who have provided encouragement, constructive comments and support.  My thanks to them all.

Structure of the book

The Introduction gives an overview of the major arguments presented in the book.

Part One discusses how the user should perceive the system—how to get the best for the business out of these expensive GUIs.  Exploiting the GUI leads to object-based user interfaces, and from there to the concept of cooperative business objects as the thing the programmer should build.

In Part Two we examine what the programmer should see in dealing with distributed systems.  Here we define the general ‘shape’ of the application code that we'd like application programmers to write.  We discover that cooperative business objects are the right shape for distributed application code as well as for object-based GUIs.  Given this shape, we then discuss the kind of enabling ‘middleware’—or infrastructure—needed to implement it.

However, no matter how easy we make it to build application code, there is always the question of how you design application systems for the distributed processing and data inherent in client/server systems.  Part Three presents some solutions to the design issues and concerns commonly met by, it seems, everyone who is starting out with client/server.

All of this has implications for IS management, and in Part Four we look briefly at two areas—technical implications and people implications.  How to handle these, and how to make a start, are different questions of course, and in the last part of this section we discuss getting started.

In Part Five, we peer into the future a little, to outline some of the areas into which initial experience suggests the new technology of CBOs may take us.

Finally, there are a number of appendices, which delve into greater technical detail about some of the areas covered in the body of the book.  These are:

        A high-level introduction to object orientation (Appendix 1)

        A technical description of ‘cooperative business objects’ (Appendix 2)

        A short discussion on approaches to handling both synchronous and asynchronous messages (Appendix 3)

        A rationale for having a ‘Model’ object held locally in the PC as well as in the server (Appendix 4)

        A process I've found useful for rapid development of GUI prototypes (Appendix 5)

        Two sample project proposals for getting started (Appendix 6)


Trademarks

The following trademark declaration was made in 1994. Some of the trademarks shown may now belong to other companies.

C++ is a trademark of AT&T 

CICS is a trademark of IBM Corporation

CUA is a trademark of IBM Corporation

ENFIN/3 is a trademark of Easel Corporation

Motif is a trademark of Open Software Foundation Inc.

OS/2 is a trademark of IBM Corporation

REXX is a trademark of IBM Corporation

SAA is a trademark of IBM Corporation

Smalltalk/V PM is a trademark of Digitalk Inc.

Windows is a trademark of Microsoft Corporation Inc.

Some of the artwork in this book is from Lotus Freelance Graphics for Windows, © 1993 Lotus Development Corporation. Lotus and Freelance Graphics are registered trademarks of Lotus Development Corporation.

 

 


 

Introduction and Management Summary

 

This book is about a new development in application structuring.  It describes a new kind of deliverable from the application development process: the ‘cooperative business object’ or CBO. (A brief introduction to object orientation is provided in Appendix 1.)

It seems that whenever we in information technology (IT) get comfortable with a technology, and consider it mastered in terms of operation, management, methods, software and hardware, then along comes a whole parcel of new things to be handled.  The kind of new things that are most difficult are not those that are bigger, or smaller, or faster; it's those that are different.  Distributed and client/server systems are different, and have proved difficult.  CBOs, however, bring ease of programming to the new environment.  A major theme that we develop in the book is that CBOs, rather than more traditional structures, best fit the application programming needs of the 90s. 

But what is driving this change?  Well, IT change is often driven by changes in business needs, and the move to client/server is no different.

 

New business needs

Perhaps the most significant of current business needs is the imperative many companies face to enable people to become more effective.

This need is expressed in many forms, and comes not only from competitive pressures (e.g. a need to provide much greater customer service), but also from cost pressures that force a reduction in organizational hierarchy depth.  This can lead to a significant enhancement of people's jobs—to empowerment.  But empowerment means people addressing more aspects of the business, or addressing the same aspects in greater depth, or both.

Being more effective also means being more flexible as businesses increase the rate at which they change.  People need to handle new organizations, new products, new policies, new processes, new customers, new suppliers and new business environments.  At the same time, there is a need to reduce training times for new staff.

All of this means that people need much better computer support.  In particular, they need to be able to handle many more application areas than hitherto.


Very often, the business turns to its IT organization for the support people need to deliver the greater effectiveness being sought.  One of the major IT responses to these demands is to find new ways for users to exploit the investment in data and function present in their computer systems.

Specifically, the search is for approaches that will provide great ease-of-use, reduced training times, application integration for the user, and the ability to react to new requirements fast—ideally faster than other areas of the organization, so that IT can become a facilitator of change.  Client/server systems are seen by many to be part of the solution.

 

New challenges for application development

The whole point of building client/server systems is to combine the usability of graphical user interfaces (GUIs) with the traditional strengths of IT systems—access to and management of shared resources such as data.  The promise of such systems is great, boasting exceptional ease of use, application integration and taking IT off the critical path of business change.  However, if we are really to benefit from these new systems, we must not only understand how best to exploit the GUI, but also how to do so in a distributed systems environment.  One without the other is not particularly useful.

In addition, we need to effect such exploitation using the skills of the average application programmer; and eventually we must harness the capabilities of those many end users who have programming aptitude and are willing and able to apply it.  A vital requirement for this new environment is, therefore, ease of programming.

However, the acquisition of many more applications by itself is not enough.  These new applications must integrate successfully with others.  One of the main promises of the GUI is to allow users to have applications interact effectively with each other on the screen; but there's a problem.  Even on a GUI, traditional applications do not deliver the desired levels of integration, no matter how many icons there might be.  Object-based GUIs, together with new application programming constructs, offer a way ahead.

In the 1990s, then, we can see four major challenges facing application development:

        Enabling advanced object-based graphical user interfaces

        Building cooperatively processing application function across client/server and distributed systems

        Providing for application functions to be easily integrated with others

        Enabling ease-of-programming for all of the above.


Perhaps the most important thing to understand is that we are facing a radically new technology curve which demands new ways of thinking, and new approaches to design.

 

A new technology curve

In the 50s and 60s, most systems were batch.  We can say that such systems formed a specific technology curve as shown in Figure I1.1.  The seventies and eighties saw the evolution of a second technology curve, based on video display terminals such as the IBM 3270 family.  This evolution generated a huge change in the way people used systems, in the way we built applications, and in the structure of systems.

Such change does not come easily.  Initially, we tend to treat the new technology curve as if it's an extension to what we already know.  In the early 70s, I remember working with an IT department who were installing their first video terminals.  These were intended to replace a number of cardpunches.  Data would be entered directly into the system through terminals, so saving a lot of money on punch cards.  The existing batch applications would be unchanged. 

The first application placed card images on the screen.  Later, a purpose-built data entry system was introduced, which made the screen formats much friendlier; but the core business systems were still batch; there was no on-line access or update.

 

Figure I1.1.    A new technology curve


Looking back, we can see that front-ending the existing batch systems with data-entry screens was not the best way to exploit the new technology (video terminals, teleprocessing and database).  To a casual onlooker observing the screens, there would appear to be little difference between data entry and full on-line update with data integrity.  However, the user, and the application developer, would certainly know the difference.

Only when on-line access and update via effective transaction processing systems were introduced did the technology start to be fully exploited.  The idea of transaction processing introduced a new ‘shape’ of application code, the transaction program; and to enable programmers to build transaction programs easily, a new form of middleware was developed, the transaction processor.  Until then, one might say that using the technology for data entry only was little more than just putting punch-card images on screens.

Such under-utilization of a new technology is typical of how a new computer technology curve is often approached.  We apply tried and tested methods to the new technology, even though they may not fit.  If we really want to exploit the technology fully, we have to understand the software structures best suited to it.  Then we can build the middleware and tools required to make it easy.  By definition, a new technology does not come equipped with these things.

Today, we are in the same position.  Many of the tools being used to address the new client/server environment do little more than the equivalent of putting punch cards on the screen.  Some products make GUI-based applications easy to develop; others make it easy to connect a PC to a server.  Few products today enable you to build advanced object-based user interfaces, connect application code across systems, address directly the challenge of application integration, and do all of this easily, in the language of your choice.  Yet these are the major areas we must conquer if the full promise of client/server technology is to be realized.[3]

Just as twenty years ago, the casual observer saw little difference between data entry and full on-line transaction processing, so today the casual observer often sees icons and windows on a Graphical User Interface as being the full extent of the new technology.  However, seeing icons on a screen does not by itself mean that the client/server system behind it is being fully exploited.

So, we are currently at the beginning of the next evolutionary step—moving user interface and application function out from mainframe systems to the PC, and linking system components in cooperative client/server structures across LANs and WANs.


Driven by the business pressures for user flexibility and change, the full exploitation of these systems is taking us into new user interface capabilities, new approaches to application connectivity across distributed systems, new ways to structure application software (the cooperative business objects shown in Figure I1.1), and new software design techniques.

 

New user interface technology

By providing an increasing range of system resources (function and data) to users, IT delivers an increasing potential benefit.  But this by itself is not enough.  A user interface is needed which enables users to turn this potential benefit into delivered benefit.

The PC gives us a new level of user interface function, an incredibly high level compared to character-based terminals.  Furthermore, this level of function is standard on most if not all PCs today.  This point is worth emphasizing.  Although it's true that this level of function has been around for some time (starting perhaps with the Xerox Star in the late 70s), only recently has it become generally available as a base function on all PCs.

However, as users are required to handle an ever-widening range of applications, the very fact of handling more applications makes the additional applications increasingly difficult to use.  From a system point of view, this can be seen as a bandwidth constraint—except that the bandwidth in question is a person's ability to handle an increasing number of things.  Since we can't change the human brain, the only way to increase the effective bandwidth is to change the way in which computer systems are perceived by their users.

With the technology of character-based terminals, we presented the user with lists of functions that could be used, and constrained business processes to those defined not by the business, but by the application designer's idea of what the business required of a user.

With PC technology on the other hand, we can provide the impression that the computer handles ‘things’, such as invoices, insurance policies, bank accounts, notepads, etc.—thus matching the user's perception of the world outside IT systems.  Instead of the computer appearing to the user as a separate and different universe, we can present computer systems as an extension of the user's world.  Indeed, in this extension, we can improve on real life (for example, we can have an order form that will tell the user if a mistake has been made—or a shredder which can de-shred).  We are talking here of a ‘thing-based’ user interface rather than a function-based one.  It's what is called an ‘object-based’ user interface.

This approach can free the user from previous technology-based constraints, hide completely the underlying system complexities, and allow for much greater application integration—all of this while ensuring that the required business rules are enforced.


Object-based user interfaces are a new universe of user interface design.  But when you exploit the new PC technology in this way, you find that you need an application software structure that is significantly different from the familiar menu/panel hierarchy structures we use to drive character-based terminals.

It turns out that the easiest way (by far!) to implement the new object-based user interface is to build and deliver CBOs (independent software objects) instead of ‘applications’.  Each CBO maps to one of the objects that the user deals with on the screen.

However, new user interface technology by itself is not enough.  It is the use of this technology in distributed client/server systems that really makes the difference.

 

New application connectivity

Since corporate data is generally held on a mainframe or mini—a ‘server’—the PC application code must be able to cooperate with server application code in order to handle corporate shared data.

The PC handles the graphical user interface and user interaction, and since much of this is user-driven, we find that the best way to access shared resources (such as corporate shared data) is to build the PC as a client, making requests of separate servers.  By ‘servers’ we mean application code which handles independent requests from clients anywhere.

This is a significant change from the more traditional approach; it means that mainframe applications stop driving the business (by driving the user) and become servers, which provide both controlled access to shared resources, and secure enforcement of the business rules.
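The client/server relationship described above can be sketched in a few lines.  This is only an illustration under stated assumptions: the class name AccountServer, the request fields, and the credit-limit rule are all hypothetical and do not come from any particular product.  The point is that the server keeps no dialogue with the user; it answers independent requests and enforces the business rule itself.

```python
class AccountServer:
    """Hypothetical server: guards shared data and enforces the rules."""

    CREDIT_LIMIT = 1000.0  # assumed business rule, for illustration only

    def __init__(self):
        self._balances = {"C001": 250.0}  # stands in for shared corporate data

    def handle(self, request):
        # Each request is self-contained; no dialogue state is kept between calls.
        customer = request["customer"]
        if customer not in self._balances:
            return {"ok": False, "reason": "unknown customer"}
        if request["action"] == "get_balance":
            return {"ok": True, "balance": self._balances[customer]}
        if request["action"] == "add_order":
            new_balance = self._balances[customer] + request["amount"]
            if new_balance > self.CREDIT_LIMIT:
                # The rule is enforced at the server, whatever the client does.
                return {"ok": False, "reason": "credit limit exceeded"}
            self._balances[customer] = new_balance
            return {"ok": True, "balance": new_balance}
        return {"ok": False, "reason": "unknown action"}


server = AccountServer()
# The client (the PC) decides what to ask and when; the server only responds.
first = server.handle({"customer": "C001", "action": "add_order", "amount": 500.0})
second = server.handle({"customer": "C001", "action": "add_order", "amount": 500.0})
third = server.handle({"customer": "C001", "action": "get_balance"})
print(first, second, third)
```

In this sketch the second order is refused by the server, and the balance reported afterwards reflects only the first order.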

But, more than just connecting a PC to a server, we are looking at a world where computers of all sizes, whether clients or servers, are being linked together in peer-to-peer networks.

However, connecting one piece of application code to another is not easy—especially when (as in the case of the PC and a server) asynchronous connection is required.  The solution to this problem is to use event-loop and message-driven application structures, where some middleware handles the event loop, the messaging and the initiation of the application code. 

This is the shape of CBOs, which were so useful for the user interface.  So we find that the same new software structure is the right shape for client/server and peer-to-peer systems as well as for the user interface.
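As a rough illustration of an event-loop, message-driven structure, the following sketch shows a toy 'middleware' that owns the message queue and the loop, and two objects that only ever react to messages.  All names here (CBO, Middleware, the on_ method convention, the message names) are assumptions made for the example; they are not the interface of any real CBO infrastructure.

```python
from collections import deque


class CBO:
    """Hypothetical base class: an object that reacts to named messages."""

    def __init__(self, name):
        self.name = name

    def handle(self, message, payload, middleware):
        # Look up a method named after the message, e.g. "on_query".
        handler = getattr(self, "on_" + message, None)
        if handler is not None:
            handler(payload, middleware)


class Middleware:
    """Hypothetical enabling layer: owns the queue and the event loop."""

    def __init__(self):
        self.objects = {}
        self.queue = deque()
        self.log = []

    def register(self, cbo):
        self.objects[cbo.name] = cbo

    def send(self, target, message, payload=None):
        # Asynchronous send: the message is queued, not executed immediately.
        self.queue.append((target, message, payload))

    def run(self):
        # The event loop: deliver queued messages until none remain.
        while self.queue:
            target, message, payload = self.queue.popleft()
            self.log.append((target, message))
            self.objects[target].handle(message, payload, self)


class CustomerList(CBO):
    def __init__(self, name, customers):
        super().__init__(name)
        self.customers = customers

    def on_query(self, payload, middleware):
        balance = self.customers.get(payload["customer"])
        middleware.send(payload["reply_to"], "result", balance)


class OrderForm(CBO):
    def __init__(self, name):
        super().__init__(name)
        self.balance = None

    def on_check_customer(self, payload, middleware):
        middleware.send("customers", "query",
                        {"customer": payload, "reply_to": self.name})

    def on_result(self, payload, middleware):
        self.balance = payload


mw = Middleware()
mw.register(CustomerList("customers", {"Acme Gears": 120.5}))
form = OrderForm("order-1")
mw.register(form)
mw.send("order-1", "check_customer", "Acme Gears")
mw.run()
print(form.balance)
```

Neither object calls the other directly, and neither contains a loop of its own.  In this toy version both objects live in one process; the same shape could in principle be used when the sender and receiver are on different machines, which is the case the text is concerned with.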

 

New software structures—CBOs

Most OO (object orientation) tools today enable a developer to use objects in the process of building some software deliverable that will be handed to a user to run.  But that deliverable is not itself an object.  The benefits of OO are seen by the developer at build time, but not by the user at run time.  A cooperative business object (CBO) is a deliverable, which is handed to users to run, but which is itself an object.  Thus it provides all of the benefits of object orientation to the runtime environment.  Since a CBO is a deliverable, it is ‘language neutral’; that is, you can use any one of a number of languages to build it.  You could use COBOL—or an OO language.

Cooperative business objects add a new answer to the question posed of application designers: ‘In implementing this business process, what is the thing that you're building?  What is the end product?’ Previously, the answer was typically one of:

        A batch program (or suite of batch programs)

        A transaction program (or set of transaction programs)

        An interactive application (such as a PC application)

Now we add the answer ‘a CBO’.  The discovery of this as a desired shape of software—as a desired end product of a development process—is a major new element in client/server systems, and we examine this new phenomenon in depth in this book.

With CBOs, we can see new horizons of potential software re-use—even a market in such software objects (we might read in a few years adverts such as ‘Buy XYZ's Invoice CBO—first in Pan-European VAT support...’).

 

New middleware

To run CBOs, a new form of middleware is required.  Why?  Well, here's an analogy:

At the end of the 60s and the beginning of the 70s, we found that to exploit fully the then new system potential of terminals connected to computers via telephone wires, several essential system infrastructure components were required:

        Teleprocessing software

        Teleprocessing hardware at the mainframe

        Database management software

However, even with this infrastructure in place, it was soon discovered that the ‘shape’ of the application software—transaction programs—required a further underlying piece of software infrastructure—a transaction processor—if programming was to be made easy. 

Systems of the 90s comprise PCs with powerful GUI capabilities providing secure access to extensive, distributed and high-value corporate resources.  These systems also need a new ‘shape’ of application software—the CBO, and a new layer of software ‘middleware’—a CBO-enabling layer—upon which to run.[4]

Know-how

New system and application structures raise many questions.  For example, how do we manage the multiple update problem?  Do we need to design and build this ‘middleware’ code to support this new client/server structure?  How are business rules enforced on an object-based user interface?  What design methods will deliver the new software structures required?

Although today the understanding of how to answer these and other related questions is not yet widespread, the answers do exist, and the know-how is available.

Clearly many new skills will be required.  There is an old saw which says, ‘Hear and forget; see and remember; do and understand.’ In this area, as in others, experience shows that these new skills come only from doing—from actually building systems.

Does this all mean a revolution in application development?  Do we have to re-train everyone in some ‘big bang’ conversion to the new technology curve?  The answer to this is no.  Techniques have been developed which provide an evolutionary path to the new system structures.  The necessary know-how and skills can be developed by a small core of people, and then spread more widely in a controlled way.  New client/server systems can be grown alongside more traditional systems in such a way that they co-exist.

 

Summary

Today we face a major challenge—understanding how to climb the new technology curve, building systems that exploit:

        Cooperative processing in a client/server environment

        Object-based graphical user interfaces on the PC

        Application integration

and that are:

        Easy for the average application programmer to build.

A major theme of this book is that the CBO is the right ‘shape’ of application-level code to meet these challenges.  The CBO structure, together with its enabling middleware, provides us with a way forward into this new world.

 

 

 


Part One
The user—exploiting the PC

 

The driving force behind the adoption of GUIs is usability: enabling users to address a much wider range of function than hitherto.  This is driven in turn by changes in the business environment, such as an organization's move away from deep hierarchies of task-oriented departments towards flatter organizations, employee empowerment, and emphasis on customer service.

Thus an important question for IT departments is not merely how to implement a GUI (although this by itself may be useful and may add benefit), but how to exploit fully the capabilities of the GUI, and how to bring ease of application programming to distributed client/server systems which place advanced GUIs on users' desks.

It seems trite to say that before we design and build code, we should know what we're trying to produce.  But it's worth repeating.  In the 70s and 80s, many people used video terminals to do data entry into a batch transaction file.  But this was merely a step on the way to the goal (on-line update against industrial-strength databases), for which they would need to build not batch suites, but transaction programs.  What we see in the 90s is many people implementing graphical user interfaces (GUIs) in a way that is really little more than the 1990s equivalent of batch data entry.  This is a step on the way to the 1990s goal of object-based user interfaces.

In this section we present and justify a goal for the GUI: object-based user interfaces (OBUIs). 

We also develop the rationale for a new kind of software structure that is ideal for implementing object-based user interfaces.  Just as there is a world of difference in the software implementation between batch data entry and on-line update of shared data, so there are major differences in software structure—and in inherent benefit—between application-oriented and object-based GUIs. 

Our use of the word ‘object’ should not lead you to assume that we're talking about object-oriented programming versus procedural application code; we're not.  Rather we introduce the concept of a software object as a deliverable, rather than as something used by a developer to build a deliverable that is not itself an object.

These objects are a better alternative to functionally-oriented applications.  They have a structure that solves many of the technical problems of object-based user interfaces.  We give them the name ‘cooperative business objects’ in order to distinguish them from the objects used to create object-oriented applications that are not themselves objects.


In this section, then, we do three things:

        In Chapter 1 we discuss why there is a problem with application-oriented user interfaces, and show how object-based user interfaces solve the problem

        Then in Chapter 2 we work through an example, starting with a traditional menu/panel application, and ending by showing what we mean by an object-based user interface

In the process, we point out some ‘under the surface’ problems which are user interface problems, but have little to do with look-and-feel

        Chapter 3 introduces cooperative business objects—a new kind of structure for application-level code which solves the ‘under the surface’ problems inherent in trying to build object-based GUIs using more traditional application-oriented approaches.

 


1              Usability—the new system bottleneck

As people are required to handle an ever-widening range of applications, so systems are becoming increasingly difficult to use.  From a system point of view, this can be seen as a bandwidth constraint—except that the constraint in question is an inherent human limitation.  Since we can't change the physiology of the human brain, the only way to increase the effective bandwidth is to change the way in which computer systems are perceived by their users. This chapter discusses the nature of that change.

1.1 The usability challenge

Many companies have invested huge amounts over the past twenty years in data, systems infrastructure and systems management—managed and controlled on behalf of the business by the IT organization.  As pressures on businesses to re-organize, re-structure, diversify, consolidate and slim down increase, they look to their IS functions to provide flexible and changing access to that investment.

Thus there is increasing pressure on IS departments to provide computing services which will exploit existing investments for both:

        A wider range of users and processes

        New processes in new business areas

Often the response by the IS department is to embark on large projects aimed at integrating applications, often in conjunction with a change to a more distributed (client/server) system structure.

Yet there is little point in spending huge sums on integrating applications, and data, if this investment fails to exploit systems because they prove too difficult to use. As someone said (in the context of on-line systems):

Computer systems themselves deliver no business benefit. All they deliver is potential benefit to the screen. It is users who realize that potential, and so deliver benefit to the organization.

This seems self-evident; but it does mean that the benefit that can be realized from computer systems is dependent to a large extent on the user interface.

But is there a problem?  The answer is yes—because the demands on users—and on user interfaces—are changing and increasing.  To explore this problem, and to identify the solution, we must firstly understand the business need that is behind those demands.

 


1.2 A vital business need

Among the several imperatives many companies face today is the need to make much better use of the inherent abilities of their people.  This derives from various pressures, such as slimming down staffing and bureaucracies, re-structuring towards flatter organizations and winning competitive edge through significantly improved customer service.

Typical comments heard from executives across a variety of enterprises in recent years have one common thread—the need to obtain greater staff effectiveness—and such comments include:

        ‘We need whole job people…’

        ‘…move them out to the front office.’

        ‘Competitive edge through better service…’

        ‘If we could integrate our applications, our people could…’

        ‘People need to be (made) more effective…’

        ‘We need to cut the learning time…’

        ‘We must enable people to do more…’

Effectiveness does not mean only productivity.  It means people addressing a wider range of activities than previously:

        Bank tellers moving from behind the counter to become customer service and sales people

        Insurance claims clerks handling agents' commissions and customer queries as well as processing claims

        Order entry clerks becoming account administrators, so enabling business transformation by providing full single-call customer service, with an eye to additional marketing opportunities as well

It also means providing for depth—extending jobs so that a person can do the next step in a process.  For example, an insurance clerk may be enabled to handle some of the more complex queries, instead of having to pass them all along to their technical support.

IT implications

In IT terms, this demand for increased effectiveness also has a common thread: the need to enable people to access a much wider range of business processes than they did before.

Clearly one way to address this need is through the use of IT systems.  Other options—such as increased training time or more staff—are today no longer viable.

If supported by appropriate systems, the normal human capabilities of most people should be adequate for this extended role.  The 64,000-dollar question is, can we build these ‘appropriate systems’?

Today, PCs and client/server systems can give the user multiple applications concurrently.  This means that the technical constraints imposed by character-based terminals on dialogues (including limited or no dialogue concurrency) can be overcome. Will this provide an answer?  Initial experience suggests that the longer-term answer is no. Why not?

 

1.3 The application problem

Consider an order processing system user. The column towards the left hand side of Figure 1.1 shows how the AD department might traditionally deliver this function. It would:

        Analyze the process (the leftmost of the three ‘clouds’ at the top of the figure).

        Identify that this process involved key corporate entities—such as the customer

        Design and build an application, to appear as one of several windows on the user's PC.

Figure 1.1.   Application orientation.


Such an application encapsulates the function required to perform the process.  Conventionally that's what is meant by the word ‘application’. Probably, within the application, the code accesses the common corporate customer database.

Suppose, however, that the users, in addition to taking and entering customer orders, are now required to handle:

        General customer enquiries

        Product returns from customers, and

        Journal entries for customer account adjustments

In principle, this is no problem: two additional applications, which may even already exist, are provided (see Figure 1.1). But consider the consequences.

We find that, although all three applications may use the common customer database to access customer details, each application presents a list of customers (the same customers) in a different way.  So what?  Well, assume that in responding to a customer on the phone, the user needs to look at the customer's balance outstanding, and at the date of the last order.  This data may well be in two different customer lists. The user therefore has to choose the correct two customer lists (by accessing the appropriate two applications) to get at the needed information.

The practical consequence is that the application developers have unwittingly given the user a new problem—by expecting him to be an expert in:

        Understanding situations where one list rather than another must be used

        Knowing which application provides the relevant list

        Knowing how to start (get into) that application

        Understanding how to navigate through the application in order to produce the required list

        Understanding that in some situations two lists will be required concurrently on the screen, and knowing how to produce that effect

Indeed, where a specific ‘customer query’ application (yet another application!) has not been provided, the user may well have to create an Order, do a search for the customer, then delete the Order—merely to look at a list of customers (the result of the search)!

 

Exporting problems to users

Now the IS department did not plan to give the user a problem.  It's just that the very design technology we've used so successfully over the past 30 years—choosing ‘applications’ as the vehicle for delivering business function—is now becoming a serious constraint—on IS's ability to deliver the level of integration and ease-of-use required, on users' ability to get the most out of the IT investment, and hence on the ability of the company to realize the return on investment expected.


In fact the problem may be worse than this.  Many companies today are planning staff empowerment that envisages users accessing (not all at once!) up to thirty applications. Yet one such company already estimates that after around twelve applications there is a diminishing return, at an increasing rate, for every application added. The increasing user load makes each additional application effectively less usable than the last. This applies despite improved ease of use of any given application—including those with a GUI.

Thus we face a major problem: Although our technology now allows the user to access many applications concurrently, their very nature makes them difficult to use together. But use them together is precisely what the user expects when he sees them on the screen concurrently!

Furthermore, the problem exists even when a set of consistent ‘look and feel’ user interface standards is applied across applications.

This problem derives from the very nature of today's typical application, which is:

        A software encapsulation of a single business process.

        Essentially stand-alone with respect to other applications

Consider: the one thing that a typical application is seldom if ever designed to do is to talk as a matter of course with other applications.

Encapsulation on conventional process boundaries makes it infeasible for the user to use something from one application in another application (other than trivially through a cut-and-paste function), even though he may be viewing them side by side, each with an attractive graphical user interface, on the same screen.  Clearly, since our systems can deliver the required business function to the screen of a user's PC, then there must be something wrong with the way it's delivered at the screen.

So how do we overcome this?  To answer this, we firstly consider briefly the nature of the general problem of usability.

 

1.4 Usability

The conventional way to address usability is through intensive training.  But the time required is prohibitive.  Already reducing training time is a key business need...

There is another way: Instead of trying to teach the user about how the computer system works, we can exploit knowledge already in the user's head.

But how do we do that?  Let's start by considering how people deal with technology in general, and then apply it to computers.

In discussing how people use technology (such as a video player, a swing door, a cooker, or a computer), Donald Norman [Norman 1990] identifies three conceptual, or mental, models that are important in the operation of such devices.  These are:


        The user's model (the user's concept of how the device works)

        The designer's model (how the device actually works)

        The system image (what the user sees).

In practical terms, the system image is the user's only access to the designer's model of the system.  Thus, for a person to formulate a good user's model, the designer must make his (the designer's) model obvious through the system image.

The difficulty is understanding how the user builds a conceptual model.  Norman describes the psychology of the user's conceptual model as being built from everyday experience.  The user will attempt to transfer existing knowledge to new systems:

        If the new system fails to match his conceptual model he will find it difficult to learn (leading potentially to frustration and rejection)

        If it does match, he will rapidly gain confidence in its relevance and applicability

 

The ‘usability iceberg’

Mapping this to computer user interfaces indicates that users' conceptual models are of crucial importance to computer usability.  This is illustrated by the ‘iceberg’ concept,[5] which shows (see Figure 1.2) system usability as being dependent on three factors:

        First, usability is dependent on the presentation of the system to the user.  This includes factors such as layout, color, and the aesthetic appearance of the system.  However, these factors account for only 10% of the usability of the system.  For example, in a car the layout of the instruments does not significantly affect driving the car.

        Second, we derive usability from the effectiveness of user interaction with the system—how he makes it do things.  This includes editing techniques, scrolling, mouse use, pop-up panels, keyboard use, etc.  Consistency of interaction is a very important factor in achieving usability; but the contribution of these factors is only 30% of the total.

        Third, the remaining 60% of the look and feel iceberg—the bit under the water—comes from how well the system maps to the user's view of the world. In other words, how easily can the user build a mental picture (a conceptual model) of the way the system works, so that he can accurately predict how the system will behave?

 

 

Figure 1.2.   The usability iceberg (source: D.F. Liddle, Metaphor Computer Systems).

This last point might seem to suggest that IS should put onto the user the responsibility for building the correct conceptual model.  However, a better way must surely be for IS to provide a system image which naturally maps to the user's model of his world.

What is the user's world?

But what is the user's world?  Looking at a typical office, it is a world of things, of tools (used to perform tasks), such as:

        Forms

        Ledgers

        Files

        Notepads

        Manuals

        Phones

In fact, on looking round an office, you will never see an ‘application’. Of course, there are procedure manuals, and instructions on forms which represent the actions required, etc.  But what you see are manuals, forms, etc; not the ‘procedures’ themselves.

Considered in this way, it becomes apparent that the notion of an ‘application’ is an artifice of the IS world. Until recently, the only way to deliver business function through IT systems was by building code around this artifice.

1.5 New user interfaces

To explore how these considerations can help us find a solution, let's return to our example of the order processing system user. His ‘extended job’ would be made much easier if the customer list could be presented:

        As a single thing (a list—of customers)

        As being usable in whatever business process it is required.

This would not only mean that experienced users could do their job more efficiently, but also that new users would benefit.  A manual order entry clerk would be familiar with a ‘customer list’ to which he could refer whenever necessary.  Thus the system would work in the way he would expect.

This, of course, is the prime consideration.  Users will find a system much easier to use if, as we indicated above, it maps well to their own view of the world.  Remember that 60 per cent of the usability of a system depends on how easily a user can build a conceptual model of how the system works; and he can do this most easily if IS provides a system which naturally maps to that user's world of everyday things.

With these concepts understood it should be possible to build a user interface which keys into the well-evolved world of the office or work place.  What is needed is provision of a ‘thing-based’ rather than a ‘function-based’ user interface.

This is quite possible, but only via the local power of the PC.  The PC can present on the screen sets of ‘things’ or ‘objects’—reference material, ledgers, forms, files, notepads, etc.  Instead of the computer presenting to the user some list of functions that can be done, it should present the set of things the user needs to do his job. 

For example, referring back to the multiple application problem, a user will probably find it difficult to build a mental model of the behavior of a ‘customer list’ when it:

        Appears in several different forms,

        With different underlying behavior, and

        Accessible in different ways

But if the customer list is presented as a single thing, usable wherever needed (following our previous discussion), then it should be much easier to use. (For data analysts, such commonality of data is not something new. But for application developers, and for end users, common presentation of such data entities across business processes—or ‘applications’—is new.)
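The idea of one customer list, usable wherever it is needed, can be sketched as follows.  The class names, the find method, and the sample data are hypothetical and chosen only for illustration.  Two separate business processes hold a reference to the same list object rather than each building its own version of it.

```python
class CustomerList:
    """Hypothetical single 'customer list' object, shared by every process."""

    def __init__(self, customers):
        self._customers = customers

    def find(self, name_fragment):
        # One search behavior, wherever the list is used.
        fragment = name_fragment.lower()
        return [c for c in self._customers if fragment in c["name"].lower()]


class OrderEntry:
    def __init__(self, customer_list):
        # Uses the shared list; does not keep a private copy.
        self.customer_list = customer_list

    def last_order_date(self, name):
        return self.customer_list.find(name)[0]["last_order"]


class AccountEnquiry:
    def __init__(self, customer_list):
        self.customer_list = customer_list

    def balance(self, name):
        return self.customer_list.find(name)[0]["balance"]


customers = CustomerList([
    {"name": "Acme Gears", "balance": 120.5, "last_order": "1993-11-30"},
    {"name": "Sprocket Supplies", "balance": 0.0, "last_order": "1993-10-12"},
])
orders = OrderEntry(customers)
enquiries = AccountEnquiry(customers)
print(orders.last_order_date("acme"), enquiries.balance("acme"))
```

Because both processes go through the same object, the user meets one list with one behavior, and both the outstanding balance and the date of the last order are reachable from it.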

With this approach, separate education for each application should not have to be given. ‘Thing-based’ (or ‘object-based’) user interfaces make use of knowledge which already exists in the user's head—knowledge about ‘how the world is’. Providing user interfaces that exploit this knowledge is the key.

Suppose we adopt such an approach—what would it look like to the user?  To examine this question, the next chapter shows the train of design thinking, starting from a typical character-based terminal application, and ending with an ‘object-based’ user interface of the kind described.


2              Making computers familiar:
object-based user interfaces

Starting with an old-style menu-panel character-based terminal application, this chapter traces the development towards an object-based user interface.  In particular, two underlying principles of object-based user interfaces are introduced:

        The user should be able to use a given object in all processes (‘applications’) where that object is required

        The user should be able to re-group objects in folders or containers (themselves other ‘container’ objects) to suit their way of working

It is in the implementation of these two principles that we will see the necessity for new software structures.
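The second principle, re-grouping objects in containers that are themselves objects, might be sketched like this.  The names BusinessObject, Container, and move are assumptions for the example only; a real implementation would also have to deal with persistence and with the on-screen view of each object.

```python
class BusinessObject:
    """Hypothetical object the user sees on the screen."""

    def __init__(self, name):
        self.name = name


class Container(BusinessObject):
    """A container is itself an object, so containers can be nested."""

    def __init__(self, name):
        super().__init__(name)
        self.contents = []

    def add(self, obj):
        self.contents.append(obj)

    def remove(self, obj):
        self.contents.remove(obj)

    def names(self):
        return [obj.name for obj in self.contents]


def move(obj, source, target):
    # Re-grouping is done by the user, not fixed by an application boundary.
    source.remove(obj)
    target.add(obj)


order_entry = Container("Order Entry")
customer_list = BusinessObject("Customer List")
order_pad = BusinessObject("Order Forms")
order_entry.add(customer_list)
order_entry.add(order_pad)

my_work = Container("My Work")
my_work.add(order_entry)  # a container placed inside another container
move(customer_list, order_entry, my_work)
print(my_work.names(), order_entry.names())
```

The Customer List keeps its identity when it is moved; only its grouping changes.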

To illustrate the solution, and why the underlying technical implications are non-trivial, we use a sample application—order entry.

Firstly, we look at how this application might be done with character-based terminals, and then examine some of the ways we can exploit the PC technology.  Then we assess this PC solution in the light of the key usability factors discussed in the last section.  Finally we show how our example application develops beyond application boundaries into a true object-based user interface.

 

2.1 The character-based terminal approach

Figure 2.1 shows a sample initial panel of an order entry application, running on a non-programmable character-based terminal such as an IBM 3192 or 3278.

The implications of this technology are as follows (and we take this list from what we observe about real applications):

        The user selects what to do from a list (a ‘menu’) which has been defined by IS.

        The user can do only one thing at a time.

        The interface is typically ‘modal’—that is, once the user has chosen one function, then in order to go to another function he has first to complete the current function.  A good example of this is where the user has to handle an enquiry in the middle of an entry transaction (such as entering a customer order). Before answering the enquiry, he first either has to cancel out of the order, or complete it.  In other words, the user is bound by the ‘mode’ he's in (in our example, he's in ‘order entry mode’).


Figure 2.1.    Character-based terminal—menu/panel user interface

        Two or more functions typically cannot be handled concurrently. For example, we are not able to allow the user to process both a customer enquiry and an order panel concurrently.[6]

In summary, we might say that the current character-based terminal interface is single-function modal. By ‘modal’ we mean that the dialogue with the user is placed in a certain mode, and while in that mode, the user cannot do other things.

However, the menu-panel hierarchy design approach for character-based terminals has two great advantages:

        The design technique maps well to the technology (where each box in the function hierarchy typically maps to a menu or panel)

        It enforces business rules by encapsulating them in one or more ‘transactions’, and so guides the user through the required business process

With PC technology, we can throw away the first of these; but we must retain enforcement of required business rules.

2.2 A PC-based approach

Here we see how we might start to exploit the PC technology.

 

Figure 2.2.    PC—graphical user interface.

Figure 2.2 shows a PC (bottom left) with four application windows. (The PC screens illustrated in this document are stylized for presentation purposes—they do not accurately show all the detail.) The order entry window is shown enlarged (top left), and this shows a PC alternative to the initial menu panel on a character-based terminal. Instead of a list, the user sees four icons:

        The Customer List (bottom right)

        The Product Catalogue (bottom left—this company specializes in gear wheels and sprockets)

        A pad of Order Forms (top left)

        The Order History File (top right)

To enter an order, the user would ‘tear off’ a new order form by dragging one (with a mouse) from the Order Forms icon.  The result might look like the window on the right of Figure 2.2, where the new order form is shown.

Note that the order history file is now hidden (under the order form).  This is not a problem, as the user (not the application) controls the positioning of windows, and so can easily slide the Order Form to one side to reveal the Order History File icon if that's what's wanted. (The application can, if desired, take over control of window positioning; however, except in very special circumstances, experience has shown this to be user-hostile behavior for this style of user interface.)

Figure 2.3.    Application concurrency.

GUI principles

With this application, we can now go on to illustrate four important principles of a GUI.  Without all of these principles, the user productivity potential is severely constrained. 

Note also that while these principles are relatively easy to implement on a PC, they are difficult or impossible with the typical character-based terminal.

1.      Multiplicity     What if the user wanted to work on two orders concurrently?  With a PC, this is very easy—you can have as many order forms up as you like.  This ability to display multiple instances of the same class of thing we call multiplicity.

2.      Concurrency     How does the user fill in the order? Well, the customer list icon is clicked to bring up a list of customers (see Figure 2.3).  The user would search through this list until the customer record is found.[7] Note that the user now has two things up on the screen. Showing several different things on the screen concurrently we call concurrency.

3.      Amodality     Further, the user should be able to switch between the two things at will.  If he or she had to complete work with the customer list before continuing with completing the order, then that would be a ‘modal’ way of operating. In general, modal operation is undesirable, and a non-modal or amodal interface is almost always highly preferable.

4.      Direct Manipulation     To fill in the details on the Order Form, the user just clicks on the customer record, holds the mouse button down, moves the mouse over to the Order Form, and releases the button. The system then fills in customer number, name and address details.  Note that this is more than cut and paste. The name for this type of operation is direct manipulation.
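As a minimal sketch of what direct manipulation implies for the software behind the form, the following Python fragment shows an order form that accepts a dropped customer record and fills in its own fields. The class names, field names and the accept_drop method are assumptions made purely for illustration; they are not taken from any particular GUI toolkit.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Customer:
    """A customer record as shown in the customer list (hypothetical fields)."""
    number: str
    name: str
    address: str


@dataclass
class OrderForm:
    """An order form window; any number may be open at once (multiplicity)."""
    customer_number: Optional[str] = None
    customer_name: Optional[str] = None
    customer_address: Optional[str] = None

    def accept_drop(self, dropped: object) -> bool:
        """Handle an object dropped on the form; return True if it was accepted."""
        if isinstance(dropped, Customer):
            # More than cut and paste: the form itself decides which
            # details of the dropped object it needs.
            self.customer_number = dropped.number
            self.customer_name = dropped.name
            self.customer_address = dropped.address
            return True
        return False  # anything else is refused and the form is unchanged


# Two order forms open together, each holding its own state.
customer = Customer("C1001", "Acme Gear Wheels", "1 Mill Lane")
first_order = OrderForm()
second_order = OrderForm()
accepted = first_order.accept_drop(customer)
rejected = second_order.accept_drop("not a customer")
```

The point of the sketch is that the receiving object, not the user, works out what a drop means; the user simply moves one thing onto another.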

In this example we've seen how the user is ‘in control’—even in a structured clerical process like an order entry application.  We've also seen how the four GUI principles greatly help with achieving the usability aims set out in the last chapter—of enabling the user to deal with ‘things’, just as in the real world, rather than with functions.

However, we are still dealing with an application—the order entry application in our example—and we have not yet dealt with the ‘application problem’ (discussed in Section 1.3).  To recap, the application problem is the inability of a user to use information from one application effectively in another.  This is fundamental, and we now go on to address it.

2.3 Presenting objects on the screen

Our problem really lies in our overall mind-set.  For all we have done is to look at how the new technology could be used for a standard application such as Order Entry.  What we have effectively done so far has been little more than transferring our unconscious knowledge of what an application is, to the new technology.

In the early days of the character-based terminal, some IS departments saw it as a kind of superior cardpunch, which would save money on cardboard (punch cards).  They actually put card images on the screen.  What we have done in our example is the 1990s equivalent of putting card images on a screen.

What we have not done has been really to ask, ‘How can we use this new technology to further our business objectives?’ We didn't ask this because we know that the answer has always been ‘through building applications…’, and we assumed unconsciously that this is still the right answer.  And of course, in the sense that we still need to address business processes, it still is. However, in answering the question, we have also dragged along all our notions about what an application is.  This is a fundamental design error.

2.3.1             A ‘workbench’

Now let's come back to our question (which was, ‘How can we use this new technology to further our business objectives?’), and at the same time look at an alternative to our traditional notions of application structure and presentation.  We build on the concepts introduced in Section 1.5. (These concepts are reflected in such modern user interface guidelines as the IBM CUA-91 Manual (IBM 1991a,b).)  Thus we present an interface where the user manipulates and uses every-day real things (or objects) directly on the screen.  Let's develop this idea.

Figure 2.4 shows what one might call a ‘workbench’—a window containing the things or objects required by the user to do sales support tasks. Objects are shown by the small pictures—or ‘icons’. The user manipulates these with a pointing device. For example, an object can be looked at by ‘opening’ it (through a double click with mouse button 1).


Figure 2.4.    Object-based user interface.

Again, an object such as Product X567 may be printed by dragging it to the printer icon and dropping it (mouse pointer over Product X567 icon, press and hold Button 2, drag pointer over to the Printer icon, release Button 2).

Some things remain the same as our previous design—we still see the pad of order forms, the customer list, etc. However, we've added some of the other things a user might use from day to day, such as an in and out tray, a bin, a notebook, etc.  Two questions arise:

        Figure 2.4 shows only a fraction of the objects a real end user would normally deal with, and the screen still looks messy.  We need a way to let the user tidy it up.

        Much more importantly, where have the business processes gone? For example, how does the user enter a customer order?  Or query the status of a part?  Or update a customer record?  We need to ensure that such business processes are retained in our move to an object-based user interface.

Tidying the messy desk

Firstly, let's look at the problem of keeping things tidy. The major solution to this is to put things in ‘containers’. (The IBM CUA-91 Manuals describe this approach in detail, and identify it as a key user model concept. They also introduce other useful techniques for keeping things tidy.) In Figure 2.5 we see two container objects called ‘Store Room’ (the icon is meant to represent a cupboard), and ‘Stationery’. The user has tidied most of the objects into one of these two. Figure 2.5 shows how the user has double-clicked on the Store Room icon (bottom left), and has ‘opened’ it, resulting in a window showing the contents of the Store Room.

Figure 2.5.    ‘Container’ objects.

The user has pulled out (by direct manipulation) several objects (Customer list, Product Catalogue, the Stationery container, and the Department Printer) from the Store-Room.  Note that the ‘Stationery’ object is also a container object, and will contain the various pads of forms required.

The business process

The second question—how to tell the user about the clerical or administrative procedures the Company requires him to follow—is not as difficult as it might first appear.  Two approaches to this are:

        First, we might make use of the ‘procedures’ manual that exists in most companies, even though it sometimes gathers dust in a forgotten cupboard.  This manual defines clerical and administrative procedures. The idea is to put this manual on-line as another object on the screen.  This is shown in Figure 2.6, which shows the workbench as it might look halfway through the day, while the user is in the midst of taking a customer order.[8]

The procedures manual is open at the ‘Take Order’ page (top right in Figure 2.6).  The individual ‘Take Order’ procedure is itself an object, and the procedures manual (not shown in Figure 2.6) is a ‘container’ object containing many different administrative or clerical procedures.

Figure 2.6.    Retaining the business process.

As the user completes the order, appropriate boxes in the procedure are automatically checked off by the system.  If a job is incomplete (say at end of day) then it can be filed away to be opened and worked on the next day.

Notice also that in the example shown in Figure 2.6, there were initially four steps in the procedure.  In the course of filling the order, an exception condition was triggered (above credit limit).  The system can automatically insert an additional step into the procedure (in this case, ‘Contact Manager’).  Indeed, a major application of expert systems to business processes will probably be in the area of guiding people through complex administrative and clerical procedures.

        Secondly, we could use the Help function, along with an ‘intelligent’ form.  The user would first look up Order Entry (for example) in Help.  This would tell him to open an Order Form (and guide him on how to do it).  The Order Form in turn would have behind it the business logic to guide the user through the Form, warn of errors and insist on business-determined sequences and completion criteria.

In general, the second approach is to be preferred.  Although at first sight the first approach might look better, experience has shown that for single-form processes, the user can easily see the state of the process merely by looking at the form.  Showing a ‘procedures’ page merely uses up the screen unnecessarily.

Where a process uses more than one form, or where it goes past more than one user (or both), then we can encapsulate that process in a ‘folder’.  This folder will hold all the objects (things) required for the various users to do their part of the process.  An essential attribute of this folder would be a list of the steps to be taken.  This list would be ‘intelligent’, and would behave rather like the ‘Procedures Object’ in the example above.  It might be implemented as a separate object within the folder, or as data ‘belonging’ to the folder.
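The ‘intelligent’ list of steps described above can be sketched in a few lines of Python. This is an illustration only: the text says merely that there were initially four steps and that ‘Contact Manager’ is inserted when the credit limit is exceeded, so the four step names, the class names and the credit_checked function are all assumptions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Step:
    name: str
    done: bool = False


@dataclass
class Procedure:
    """A procedure object: an ordered list of steps that the system checks off."""
    name: str
    steps: List[Step] = field(default_factory=list)

    def _index_of(self, step_name: str) -> int:
        for index, step in enumerate(self.steps):
            if step.name == step_name:
                return index
        raise KeyError(step_name)

    def check_off(self, step_name: str) -> None:
        self.steps[self._index_of(step_name)].done = True

    def insert_after(self, existing: str, new_step: str) -> None:
        self.steps.insert(self._index_of(existing) + 1, Step(new_step))

    def is_complete(self) -> bool:
        return all(step.done for step in self.steps)


def credit_checked(procedure: Procedure, order_total: float, credit_limit: float) -> None:
    """Check off the credit step; on an exception condition, add an extra step."""
    procedure.check_off("Check credit")
    if order_total > credit_limit:
        procedure.insert_after("Check credit", "Contact Manager")


# Hypothetical step names; only 'Contact Manager' comes from the text.
take_order = Procedure(
    "Take Order",
    [Step(n) for n in ("Identify customer", "Enter order lines",
                       "Check credit", "Confirm order")],
)
take_order.check_off("Identify customer")
take_order.check_off("Enter order lines")
credit_checked(take_order, order_total=1500.0, credit_limit=1000.0)
step_names = [step.name for step in take_order.steps]
```

Because the procedure carries its own state, an incomplete job can be filed away and re-opened the next day with the outstanding steps still unchecked.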

‘Tool’ Objects

This approach can provide an extremely effective environment for ‘tools’ which, although individually trivial, can add substantially to user effectiveness. Figure 2.6 shows two such tools:

        A notebook (bottom left, next to the out tray).  The idea here is that the user can make free-form notes on anything relevant (such as the information that a customer buys a product from another company) and then mail that note through the electronic mail system simply by picking up the note with the mouse and dropping it on the out-tray.

        A ‘customer contacts’ tool (shown by a little organization chart).  The user can keep here specific contact details by customer. For example, he might record that for Company X, his normal contact is Joe, Joe's manager is Mary, and Fred stands in for Joe when Joe is away.  If our user belongs to a company that sends Christmas presents to customers, then Joe and Mary's gift preferences might be kept. Then, when our user takes a vacation, this object can be mailed to his stand-in, so that small personal contact details are not lost.

Other Objects

Finally, at the top left are two windows that the user might have been using before starting the order entry procedure.  The first is some simple analysis (prompted by another administrative procedure) of a customer's order history; the second shows a high-level index to the customer file, illustrating one possible approach to handling large amounts of data at the user interface.

Advantages

This sort of approach—the building of what we might call a ‘sales support environment’ for clerical staff—is clearly achievable with PC and cooperative processing technology.  It is not particularly obvious that we could do it with character-based terminal technology.

So, it appears that the advantages, and design implications, with PC technology are:

        The user has a choice of several things he can do.

        Once the user has chosen to do one thing, then he can at any time switch to another—as easily as taking out a new file from his desk drawer and laying it on top of the old one—in fact, probably even more easily than that.  Things that are suspended are not in a different state than when they're active; it’s just that mouse and keyboard input do not go to the ‘suspended’ things, they go to the ‘active’ thing.

        The user can have multiple objects concurrently on the screen, such as an administrative procedure, an order form, a customer list, a mailbox or out tray, etc.  Users quickly adjust to their own comfort level of concurrency and multiplicity.  Indeed, this is a major advantage of the PC technology: if properly exploited it can (as it were) automatically adjust to an individual’s needs.

        We do not really have ‘applications’; rather a business process is built from a set of ‘application objects’, where a single given object may be used by more than one business process.  For example, a ‘customer object’ could be used in the following processes:

         Customer Enquiry

         Order Entry

         Billing

         Customer Locate

         Customer Record Maintenance

Now compare Figure 2.6 with where we started—Figure 2.2. To the extent that business benefit is delivered at the glass (at the user interface), it is arguable that the sort of capability illustrated through Figure 2.6 can potentially deliver a far greater and more effective range of business function and hence benefit than that shown by Figure 2.2.

2.3.2             Summary

To summarize, we can say that—in addition to graphics and image—a PC can exploit the following functions which are typically not available on character-based terminals:

        Multiplicity

        Concurrency

        Amodality

        Direct manipulation

        Presentation of objects

        Presentation of container objects (which hold other objects)

These functions allow us to build a workbench within which application objects reside.  With such capability, we can design imaginative and innovative application solutions that can contribute substantially to the effectiveness of the enterprise, and hence to the business case for high-function PCs.

However, the design we have built up so far has one important flaw. We have presented everything in a single window. This approach, when implemented, allows one all too easily to slip back into the old application-oriented way of thinking. If this happens, then we can be in danger of throwing away much of what we've built.


2.4 The ‘workbench problem’

The approach required in designing an object-based user interface is to ask not what function does the user need, but rather, ‘What are the things the user needs to perform the required tasks?’  Given this, the trap one can fall into is to deliver this set of objects wrapped up in an application—which appears to the user as a ‘workbench’.

What can happen is that different development groups, each addressing a different business process, each build their own ‘workbench’.  The problem arises when the designer's model of this workbench is that of an application that contains all the things required for the given set of business processes.

This is the flip-side—the developer's side—of the ‘application problem’, and in this section we further explore the problem.

As a vehicle for discussion, we'll look at what actually happened in one company that took this approach (although the details have been changed to fit our example—and to protect the innocent!).

The company decided that new developments would have an object-based user interface on PCs (and that data would be held on a mainframe).  Shortly after this decision was made, the AD department was asked to implement two different business processes—order entry and contracting. Thus they initiated two projects, each with their own development team. 

The first project started work on an ‘order processing workbench’ for a set of people in the marketing division. The project team’s initial analysis suggested that this type of end user needed seven objects:

        Customer list

        Customer

        Order history file

        Pad of order forms

        Order form

        Parts catalogue (a list of products)

        Part (a product object)

Figure 2.7 shows the Order Processing workbench as the top left window, with an Order Form opened to the right (the icon for this is not shown).

Figure 2.7.    The workbench problem.

A few weeks later, the second project began development of a ‘contracting workbench’ (all the things needed to handle contracts with customers) for people in both the manufacturing and marketing divisions. Analysis suggested that users needed seven objects with which to handle this business area:

        Customer list

        Customer

        Product list (a catalogue of parts)

        Product (a part object—not shown in Figure 2.7)

        Contracts folder

        Pad of contract forms

        Contract

This workbench is shown at the bottom of Figure 2.7.

As it happened, many of the users in Marketing needed to handle both customer orders and contracts.  When (luckily at a prototype stage) these two workbenches were put together on a single user's PC, it immediately became apparent that the system—as a whole—would be very difficult to use, because:

        There were two customer lists (each with the same name—suggesting they were the same, but each with a different icon—suggesting they were different). 

        There were two different ‘customer’ objects, each containing much the same information.  But worse than that, the users found that they could use the customer object from the Order Processing workbench only with the Order Form, not with the Contract Form. In addition, the Contracting customer object could only be used with the Contract Form object.  These limitations are shown by the ‘X’s in Figure 2.7.

        There were two different product lists—one of them being called a ‘parts catalogue’.  At least both the names and icons were different.  The only problem was that they both contained the same information. 

        There were two different ‘product’ objects, with the same difficulties as the product/parts lists.

Instead of an easy-to-use natural real-world user interface, the user found confusion and inconsistency.  In fact, by presenting objects within application boundaries, the AD department had inadvertently given the user the worst of both worlds:     

        An interface which suggested the user was using things rather than functions, without showing the user the boundaries of where those things could be used

        Application boundaries which relied on the user recognizing ‘modes’ of use—without giving the user any indication at all about where those boundaries were, or how to switch into one of the modes

In essence, the AD department was producing what I call ‘iconic applications’.  In the two workbenches, there are 14 objects (12 shown in Figure 2.7, two not shown)— which, by all the conventions of object-based user interfaces, states quite categorically to the user that there are 14 objects (things) needed for order processing and contracting. However, the user knows quite well that only ten are needed (customer, customer list, product, product list, order history file, pad of order forms, order form, pad of contract forms, contract form, contracts folder).

To summarize, the essence of the workbench problem is this:

        In building code, we duplicate items across our applications.  Over the past twenty years, we have learned not to do this when it comes to data; we try to design databases so that data is not duplicated.  Yet we still duplicate the application function that accesses that data.

If we adopt object-oriented programming techniques, we may avoid actually re-writing the various duplicated items.  But that is typically at the source code level, and when the applications are delivered, we still have the situation where the executable code consists of much duplicated function—which leads to a greater maintenance load.

        Great difficulty in ‘integrating’ applications.  This does not mean running them concurrently; on today’s PCs that is trivial.  What it means is enabling both the developers and the users to re-use parts of one application in another business process.

        If each application presents an object-based user interface, then the user may see several of the same object—each of which behaves differently, and none of which can be used outside their own ‘application’ environment.  Experience has shown this to have very poor usability.

A good principle for object-based user interface design is the principle of ‘least astonishment’, which goes something like this, and where ‘correct’ means correct in human factors terms: 

For any given user action, the result which least astonishes the user is most likely to be the correct result.


In our example, four business objects were represented to the user as eight objects. Each of these looked different, behaved differently, and interacted with other things in different ways than their duplicates. That astonished the users.

The result was an unusable interface.  The company in question had to go back and do a fundamental re-think, and re-design.

The lesson here is simple: an iconic user interface which does not present objects as the user understands them is not an object-based user interface.

Thus in spite of consciously setting out to do so, the AD department failed to provide an object-based user interface for the users; what they provided was an application-oriented interface which used icons to represent application objects.  Instead of an object-based interface, AD actually produced two iconic applications. If the objective is to produce an object-oriented user interface, then building iconic applications is a fundamental error. The objects on an object-based user interface are user objects, not application objects.

Let's now complete the picture of the solution to the business need by deriving some important lessons from the ‘workbench’ problem.

 

2.5 The object-based user interface

Given that there is a consistent standard for presentation and interaction (the ‘usability iceberg’ above the water), then two principles of an object-based user interface from ‘below the line’ are of crucial importance—both for usability and for implementation. These two principles are:

        Object re-use:     Any given object must be re-usable across business processes where appropriate

        User-defined grouping:     A user can put any object in any container he chooses (subject to business rules).  A container is an object.

 

Object re-use

The principle here is that a single given object can be used across multiple business processes. Figure 2.8 shows a simple example of this, where the customer object is used in both order processing (the user can ‘drop’ it on to the order form to enter the customer details), and also in the contracting process (where the user can similarly use the same customer object to complete a contract form).

Again, this same customer object can, of course, be used for customer enquiries, and to update the customer record.  To look at (and perhaps update) the customer details, the user merely has to get a ‘view’ of the object. (The CUA practice for this is to do a double-click on the Customer icon with button 1 of the mouse.)
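One way to picture object re-use in code is a single customer object that any form can accept, provided the form offers an agreed operation. The following Python sketch assumes a hypothetical accept_customer operation and hypothetical class names; it illustrates the principle and is not the structure developed later in the book.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass
class Customer:
    """The one customer object the user sees, whatever the business process."""
    number: str
    name: str


class AcceptsCustomer(Protocol):
    """Anything on which a customer can be dropped (an assumed interface)."""
    def accept_customer(self, customer: Customer) -> None: ...


@dataclass
class OrderForm:
    customer: Optional[Customer] = None

    def accept_customer(self, customer: Customer) -> None:
        self.customer = customer


@dataclass
class ContractForm:
    customer: Optional[Customer] = None

    def accept_customer(self, customer: Customer) -> None:
        self.customer = customer


def drop(customer: Customer, target: AcceptsCustomer) -> None:
    """The customer knows nothing about the process in which it is used."""
    target.accept_customer(customer)


customer = Customer("A123", "Example Sprockets Ltd")
order = OrderForm()
contract = ContractForm()
drop(customer, order)      # order processing
drop(customer, contract)   # contracting, using the same object
```

Contrast this with the two workbenches of Section 2.4, where each development team built its own customer object and neither could be dropped on the other team's form.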

 

 

Figure 2.8.    Object re-use by the user.

User-defined object ‘grouping’

The second principle is that a user can ‘group’ or locate objects to suit their own needs. Suppose, as shown in Figure 2.9, that AD delivered to the user an ‘order processing’ object (a container or ‘work area’ containing the things needed to process customer orders). 

Figure 2.9.    Business object independence.

What the ‘user grouping’ principle means is that the user can, if he wishes (and assuming the business rules allow it), re-group the contained objects in other container objects. Figure 2.9 shows an example of this, where the user:

        Creates two new containers—‘Customer A123 Work’ (perhaps because there is some significant piece of work to be done with this customer) and ‘My Daily Work’ (to hold the things needed for the user's normal daily tasks).

        Moves objects from the ‘Order Processing’ container to the two newly-created containers.

        Then deletes the original ‘Order Processing’ container, since he does not intend to use it. (Deleting this container does not necessarily mean removing it from the system—it generally means removing it from the user's PC only.)

        Finally gets the ‘Customer A123’ object out of the Customer List object—which is also a container—and moves it to the ‘Customer A123 Work’ container.

The re-grouping would be permanent as far as that particular user is concerned—positions of objects on the screen are normally only changed by the user, and are maintained by the system over power-off.
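The re-grouping steps just described can be replayed in a short Python sketch in which a container is itself an object and every object sits in at most one container. The class names, the ‘Desktop’ root and the particular contents of the ‘Order Processing’ container are assumptions for illustration; business rules and persistence over power-off are left out.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class BusinessObject:
    name: str
    parent: Optional[Container] = field(default=None, repr=False)


@dataclass(eq=False)
class Container(BusinessObject):
    """A container is an object too, so containers can hold containers."""
    contents: List[BusinessObject] = field(default_factory=list)

    def add(self, obj: BusinessObject) -> None:
        """Move obj into this container, taking it out of its old one first."""
        if obj is self:
            raise ValueError("a container cannot contain itself")
        if obj.parent is not None:
            obj.parent.contents.remove(obj)
        obj.parent = self
        self.contents.append(obj)

    def remove(self, obj: BusinessObject) -> None:
        self.contents.remove(obj)
        obj.parent = None

    def names(self) -> List[str]:
        return [obj.name for obj in self.contents]


# What AD delivered (the contents are hypothetical).
desktop = Container("Desktop")
order_processing = Container("Order Processing")
desktop.add(order_processing)
customer_list = Container("Customer List")
customer_a123 = BusinessObject("Customer A123")
customer_list.add(customer_a123)
order_forms = BusinessObject("Pad of Order Forms")
catalogue = BusinessObject("Product Catalogue")
for obj in (customer_list, order_forms, catalogue):
    order_processing.add(obj)

# What the user then does.
work = Container("Customer A123 Work")
daily = Container("My Daily Work")
desktop.add(work)                    # create two new containers
desktop.add(daily)
daily.add(customer_list)             # move objects out of 'Order Processing'
daily.add(order_forms)
work.add(catalogue)
desktop.remove(order_processing)     # delete the now-empty container
work.add(customer_a123)              # take the customer out of the Customer List
```

Nothing in the sketch refers to an ‘order processing application’: the objects survive the deletion of the container in which they were delivered.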

Now let's consider these two principles from the point of view of the developer.  The key question here is: if this behavior is available to the user, then what was it that development developed? If ‘applications’ were developed, then how did the user re-group them?  What sort of application is it when the user can interactively—and extremely easily—tear it into pieces and re-assemble the pieces to suit himself or herself?

We answer this question in the next chapter, in which we find that implementing this behavior has enormous implications for the software structures required.

2.6 Summary[9]

We have argued that, in a cooperative processing system, the best approach to exploiting the investment in system resources is by providing the users with an object-based user interface.  But in doing that, there are several major implementation issues, including:

        Enabling the object-based user interface

        Design of a cooperative processing infrastructure which will make it easy for application programmers to write cooperative business systems

        Ensuring a cohesive system-wide design which embraces a potentially large number of heterogeneous ‘server’ systems as well as large numbers of PCs

        Handling resource sharing and resource distribution

We now go on to look at the implementation of an object-based user interface, leaving the other issues for later chapters.

 


3              Cooperative business objects

 

In the previous chapters, we have seen how an object-based user interface can let people use IT resources without stumbling across artificial ‘application’ barriers.  Building this style of user interface, however, is typically not easy.  In particular, implementing the two important principles of object-based GUIs—object re-use and object re-grouping—has proven elusive when attacked with commonly-available programming tools. 

This problem we call the ‘integration problem’, as it revolves around how to build independent and (possibly) separately-developed units of software which map to the objects on the user interface in such a way that integrating them—at run-time—is easy to program.  It is the flip side of the workbench problem (see Section 2.4), which was how to avoid building ‘workbenches’ which are really applications, with application boundaries.  From now on, we refer to this problem under the single name, ‘integration problem’.

The problem can be overcome by adopting a new kind of software structure that has proved to have great ease-of-programming attributes for object-based GUIs.  We call this new structure a ‘cooperative business object’ (CBO) – because:

        What we deliver as the end product of the development process is an object, in the true object-orientation sense of the word

        The size of the object maps to ‘business’ things (such as a customer, an invoice, a claims form, etc.)—so it's a business object

        It cooperates with other business objects to perform some desired task—hence it's a cooperative business object—or ‘CBO’.

This chapter introduces the concept of CBOs as applied to the GUI. Later on, we will see how the same structure also brings ease of programming to cross-system communication.

3.1 The ‘integration’ problem

The traditional approach to structuring business function for delivery through computer systems is to build ‘applications’. Figure 3.1 shows how each overall business process (the ‘clouds’ at the top of the figure) is encapsulated into a software structure called an ‘application’.  On the left of the figure, we show two old character-based terminal applications being front-ended on the PC to form a single GUI application. (While this approach can bring major benefits in specific circumstances, it is not generally seen as a viable long-term structure for the majority of mission-critical systems.) We also show several departmental processes which people build and run for themselves in order to help them fulfill their corporate mission.  Our traditional application approach does little or nothing for these.

Figure 3.1.    Application orientation.

 

Now, the problem with application-oriented software structures is that they cannot avoid the integration problem.  Although they may enable us to deliver iconic applications, they cannot deliver object-based user interfaces.  Let's look at two ways that have been tried in the past to address the problem using application-oriented software structures:

        Build one large application, within which the various components can communicate.  All objects appearing on the user interface will be controlled by this one application

        Build a number of small applications where each application can talk to each other application.

The problem with the first approach is that it takes us right back to building large monolithic lumps of code which become a nightmare to maintain—especially if different parts are produced by different development teams.  While acceptable for trivial developments, this approach just does not scale up.

The second approach runs straight into the ease-of-programming problem.  It is certainly the case that expert programmers, using languages such as C or C++, can write system-level code to access the various low-level system mechanisms for inter-application connection (such low-level mechanisms include, for example, DDE, OLE, Pipes, and Shared Memory).  However, it is not obvious how the average application programmer using (say) COBOL—or the ‘ambitious end user’ using (say) Visual Basic or REXX—can do so.  Even the expert programmers must first agree the mechanism, then the data formats to be used, not to mention agreeing on the general architecture of their necessarily single solution.


As Figure 3.2 shows, the developer of an application can use either procedural languages or object-oriented languages to build the thing that is delivered.  That is not the point.  The point is that what is delivered is an application; and the heart of the integration problem is how to make those applications interact—as interact they must do.  This problem is shown in Figure 3.3.

Not only must they interact with each other, they must also interact with the GUI, and with other applications across the network. They must also interact with newly introduced applications if application integration is to be achieved (more on this later). The question marks in Figure 3.3 indicate that, in creating an application that will interact effectively with other applications (both local and remote), the average programmer (or someone in his organization) has to contend with much more than just the application code. He or she has a major integration problem at the system level.  In general, one or more (sometimes all) of the following problems need to be resolved:

1.      GUI APIs     The need to understand and code the system-level APIs needed for modern GUIs.  Examples are OS/2 Presentation Manager, Windows, X-Windows.  All of these require a high skill level on the part of the programmer.  It can take up to six months for a good C programmer to become proficient.  This explains the popularity of GUI tools, which allow one to ‘paint’ the window, and which hide the underlying complexities.

2.      Multi-Tasking     While not required for stand-alone applications, multi-tasking (that is, multi-threading) is often required when the PC communicates with other systems.  Otherwise the window(s) can ‘freeze’ while interaction with a remote system is going on.  This is sometimes called the ‘hourglass problem’ (due to the appearance on the screen of an hourglass symbol, telling the user he has to wait).  Handling this assumes, of course, that the programming language supports multi-tasking, which some high-level languages do not do.  But there is more to multi-tasking than just starting the task.  The programmer also has to handle task synchronization, inter-task communication, ending the task tidily, memory allocation for cross-task data, etc.

Figure 3.2.    ‘Application’ isolation.

Figure 3.3.    The ‘integration’ problem.

3.      Communications APIs     In connecting to another system, the application programmer often has to deal with system-level API calls to the communication subsystem.  The programmer's real problem here is not the complexity of the APIs, but understanding what's going on.

4.      Application Structuring     The application programmer has to determine how the application should be structured so that it copes with all the various external influences on the program. 

5.      Asynchronous Processing     If multiple threads are used to handle long-running events, the programmer has to understand how to handle asynchronous events within his or her own code.

6.      System APIs     If one program has to talk with another, there are mechanisms provided by the operating system that the programmer can use.  However, the system APIs to these are often extremely complex.  In addition, someone has to ensure that the writers of the other programs use the same mechanism.

7.      Cross-Language     If a program talks to another program, then not only does some inter-application communication mechanism have to be built from generally low-level system functions, but if the languages used to write the applications are different, then some form of software ‘Esperanto’ must be built, since many languages do not talk easily to others.  One form of cross-language and cross-program communication is an intermediate file. This will often not perform nearly well enough—aside from the problem of managing the file, synchronizing writes with reads, etc.

8.      Other     This includes things like how code is initiated, and how data is understood when exported from one program to another.

Although various software tools address some number of these, there are very few (if any) which address them all.  This makes any effective solution to the ‘workbench problem’ (discussed in the previous chapter) very difficult for the average programmer building applications.

3.2 Objects and Messages

It appears, then, that applications, whether large or small, will not provide us with the interaction we need without involving the programmer in layers of complexity. So we need to look at the question from another angle.

Now, real life is a continuous spectrum of complexity.  In capturing some aspect of that complexity in software, the designer must focus on one aspect rather than another.  He must ‘encapsulate’ software based on that aspect.  Such encapsulation is essential in any software design process.  ‘Applications’ are the result of designers choosing a business process or procedure—some set of closely-related functions—as the focal point. 

Intuitively, since the objects on the user interface must be independent, then it seems reasonable to base our encapsulation strategy on those objects.  Experience has shown that this is an excellent design strategy.  Thus instead of focusing on processes and functions, we should focus on the things needed by the user.  This means encapsulating on the basis of things rather than functions.

And this brings us to object orientation. (A brief introduction to object orientation is provided in Appendix 1.) A fundamental principle of OO is encapsulation on the basis of data—or ‘things’.  Consider a ‘customer’.  A customer has data (or attributes)—a number, a name, an address, a balance, etc. A customer also has behavior associated with it. It can be deleted, changed, created; it can be printed or displayed. It can be queried (e.g. ‘What's your balance?’).

OO encapsulates customer by wrapping into one software unit firstly a definition of the data for any given instance of the class ‘customer’, and secondly the functions required to give the customer some behavior. These functions operate on the data (the attributes of customer). 

Now, object-oriented programming languages apply object orientation to the components being used in the build process. They seldom (if ever) focus directly on the thing being built—and which will be delivered to the end user.

We, on the other hand, need to deliver objects as independent executables.  Thus we must apply OO to the unit of delivery.  We call such a deliverable a ‘cooperative business object’, or CBO. (To be absolutely precise, I mean, of course, deliver classes not objects.[10]  See Appendix 1 for an explanation of the difference.)  Whether we use procedural or OO languages to build a CBO is beside the point, as either can be used, as illustrated in Figure 3.4.

Now these objects need to talk to one another. For this, they use messages. Since the objects are produced as independent executables, the messaging mechanisms must be external to the objects, and this in turn points to the need for some form of run-time layer of middleware infrastructure software.  This message externalization is a major difference between these objects and the sort found in OOPLs.

Of course, the infrastructure is the 64,000 dollar question.  But if there were such an infrastructure, then the difficulties referred to above could be handled by that infrastructure.  The result would be as shown in Figure 3.5, where CBOs as deliverables can communicate happily with each other at run-time. Figure 3.5 shows a collection of CBOs, some written using an OOPL, others using procedural languages, and where one of the CBOs is remote. The figure indicates that the developer does not have to worry about system level complexities. Multi-tasking (multi-threading), communications APIs, GUI APIs, cross-language considerations, etc., are all handled for him by the same middleware layer that handles messages.

This same middleware also converts all external events into the same messaging API, as well as providing all of the system-level GUI function required.  But what does CBO application code look like?  Figure 3.6 shows (at a high level of abstraction) what a CBO programmer might write.[11]

Figure 3.4.    CBO cooperation.

Figure 3.5.    CBO integration.

First, a CBO is event-driven.  It is invoked by the infrastructure when a message needs to be handled by it.  Secondly, the CBO handles the message (shown by testing for the message within a ‘case’ statement).  If the message is not handled, then it is passed to the CBO's superclass.[12]  Note that a CBO may send a message to another CBO in order to complete the handling of the incoming message.  The CBOs can each be written in different languages.
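The shape just described can be sketched in a few lines.  What follows is a minimal illustration only, written in Python for brevity; it is not the code of Figure 3.6, and every name in it (Infrastructure, BaseCBO, Account, Customer, and the message names) is a hypothetical chosen for the example.  The if/elif chain plays the part of the ‘case’ statement, and the call to the superclass plays the part of passing on an unhandled message.

```python
# Hypothetical sketch: none of these names come from a real CBO infrastructure.

class Infrastructure:
    """Stand-in for the middleware layer: routes messages to CBOs by name."""

    def __init__(self):
        self._objects = {}

    def register(self, name, cbo):
        self._objects[name] = cbo
        cbo.infra = self

    def send(self, target, message, data=None):
        # The infrastructure, not the sender, invokes the receiving CBO.
        return self._objects[target].handle(message, data or {})


class BaseCBO:
    """Root superclass: the last stop for messages that nobody handled."""

    infra = None

    def handle(self, message, data):
        return {"status": "not understood", "message": message}


class Account(BaseCBO):
    def __init__(self, balance):
        self.balance = balance

    def handle(self, message, data):
        # Test for each message this CBO understands (the 'case' statement).
        if message == "query_balance":
            return {"status": "ok", "balance": self.balance}
        elif message == "debit":
            self.balance -= data["amount"]
            return {"status": "ok", "balance": self.balance}
        # Not handled here: pass the message to the superclass.
        return super().handle(message, data)


class Customer(BaseCBO):
    def __init__(self, name, account_name):
        self.name = name
        self.account_name = account_name

    def handle(self, message, data):
        if message == "query_total":
            # A CBO may message another CBO to complete its own handling.
            reply = self.infra.send(self.account_name, "query_balance")
            return {"status": "ok", "name": self.name, "total": reply["balance"]}
        return super().handle(message, data)
```

In this sketch the two CBOs know each other only by name and by message; neither calls the other's code directly, which is what would let them be written in different languages given a suitable infrastructure.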

A CBO is not a kind of mini application, running in its own address space.  While that may on occasion be required, in general, several CBOs should be able to run in the same address space, perhaps in the same thread (or task).  From the operating system point of view, this makes CBOs quite fine-grained units of execution.  The underlying CBO Infrastructure must enable this level of granularity as well as larger levels.

Finally, we need to consider the data carried on messages between CBOs.  Here we need some form of type-independent data stream, which carries information about the data itself.  It must carry data labels, in other words.  Performance is a prime concern here, so building and parsing this data stream must be exceedingly efficient.  This aspect of required CBO infrastructure is discussed in greater detail in Section 8.4.2.
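To make the idea of labelled data concrete, here is a minimal sketch of one possible encoding, written in Python.  The format (a length-prefixed label followed by a length-prefixed value, all carried as text) and the function names build and parse are assumptions made for this illustration; they are not the format discussed in Section 8.4.2.

```python
# Hypothetical labelled data stream: "<len>:<label><len>:<value>" repeated.

def build(fields):
    """Encode {label: value} as one flat string of length-prefixed items."""
    parts = []
    for label, value in fields.items():
        text = str(value)  # every value travels as text, whatever its type
        parts.append(f"{len(label)}:{label}{len(text)}:{text}")
    return "".join(parts)


def _read_item(stream, pos):
    """Read one length-prefixed item; return it and the position after it."""
    colon = stream.index(":", pos)
    length = int(stream[pos:colon])
    start = colon + 1
    return stream[start:start + length], start + length


def parse(stream):
    """Decode a stream produced by build() back into {label: text}."""
    fields = {}
    pos = 0
    while pos < len(stream):
        label, pos = _read_item(stream, pos)
        value, pos = _read_item(stream, pos)
        fields[label] = value
    return fields
```

Because each item carries its own label, a receiving CBO can pick out the fields it understands and ignore the rest, without sharing a record layout with the sender.  The length prefixes mean the parser never has to scan the values themselves.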

 

 


Figure 3.6.    A cooperative business object.

3.3 A re-statement

That a CBO is the end-product of a development process, rather than a component used within it, is so important that in this section, we re-state what we've said already, but in a different form.  If what we've said so far needs no re-statement, please skip this section.  If not, please read on.

Let's take the example of a financial institution, where a user is required to handle account transfer, customer enquiries, and account maintenance (or update).

The question we now want to address is, how should we structure the application software in order to present this function—money transfers, customer enquiries and account update—to the user?

We might have done an entity analysis of this and related business areas, and we might have identified three base entities (using the term fairly loosely)—account, transfer (a relationship between accounts) and customer.  Further, we might have done some process analysis which showed, among other things, that we needed to display four things to the user: customer, a list of customers, an account and a transfer slip.  Treating the base entities as servers, and the others as clients (which might reside on the PC), we arrive at the conclusion that our application software must handle the things shown in Figure 3.7.

Traditionally, when we build code to deliver business function through computer systems, what we do is to encapsulate a given process into what we call ‘an application’. 

 

Figure 3.7.    Example—items to be handled by application software.

This is shown in Figure 3.8, where we have wrapped the seven items we identified above into our three applications.  Tidying up this picture gives us the situation shown in Figure 3.9.

Here we see the major aspects of the application problem.  How can we address this problem?  The answer is not to encapsulate business function and processes in applications.  Rather, it is to build CBOs—individual software executables which are compiled and link-edited separately.

The key here is to encapsulate on the basis of the data, not on the basis of the process or function.  This, of course, is the idea behind object orientation.  What's new here is the idea of developing and producing independently-executable objects rather than producing applications whose source code is organized on the basis of objects.

 

Figure 3.8.    Example—applications by functional encapsulation.

Figure 3.9.    Example—duplication in applications.

In building an independent software object of this sort, the developer knows that it is of limited use all by itself, and that its real use is to be used with other independently-developed objects at execution time, connecting to each other through messages.  This is quite unlike applications, which are generally built without any idea of communicating with other applications.

Figure 3.10 illustrates a set of such independent objects.  Note that account maintenance required us to implement a customer object.  Because the customer object was an independently-developed object, then at run-time we also gain (free and gratis!) three more capabilities—without development having built any applications:

        Customer service (elements of)

        Customer locate

        Customer enquiries

 

Figure 3.10.    Example—independent objects.

The design point for independently-developed objects is at the level of a business entity—such as a ‘customer’, an ‘invoice’, an ‘account’, an ‘order form’, a ‘transfer slip’, etc.  This is why we call them business objects.  To reflect the key point of their being independently-developed, and hence their need (generally) to cooperate with each other to perform many business-related processes, we call them ‘cooperative business objects’—or CBOs for short.

So far, we have seen how borrowing the OO idea of encapsulation on the basis of data is very useful.  Is there anything more we can usefully borrow from object orientation?  Well, consider again our transfer slip object.  Its behavior is as follows:

        It knows about two accounts

        It knows about an amount

        It understands that the amount is to be transferred between two accounts.

        It has data (and probably an identifying number)

        It can send a message to a Register Log object to effect the transfer

Now consider that we want to add another function for the user—a ‘sweep’, where an amount is transferred from account A to account B on a periodic basis (say once per month)—if the balance in A is greater than some pre-set value.  Would we have to write a new money transfer object?

Here we can draw on the OO technique of ‘inheritance’ (discussed in Appendix 1). What we could do is to build a subclass of our transfer slip CBO (remember that a subclass, in OO terms, typically provides a superset of behavior: its own behavior plus the behavior of its superclass, the object from which it is subclassed).  Thus we might build a subclass which had this behavior:

        It knows about how often the sweep is to be done

The rest of its behavior would be inherited from the Transfer CBO.  Thus just by adding a subclass of Transfer, we get an object that:[13]

        Knows about two accounts—one to be debited, the other credited

        Knows about an amount (below which the transfer should not occur)

        Understands that the amount is to be transferred between two accounts.

        Has data (and probably an identifying number)

        Can send a message to a register log object to effect the transfer

        Knows about how often the sweep is to be done.

And since the subclass may be in a different language than the superclass, we see a level of software re-use which has long been wished for, but never—until now—delivered.

The key to enabling this is to enable inheritance mechanisms across independently-developed CBOs—which also means across languages.  Thus we see another requirement on our infrastructure.
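One way to picture inheritance across independently-developed objects is as message delegation performed by the infrastructure rather than by a compiler.  The sketch below, in Python, is an illustration under stated assumptions: the class table, the NOT_HANDLED marker, the handler functions and the message names are all hypothetical.  Each handler is an ordinary function that could, in principle, be supplied by separately built code.

```python
# Hypothetical sketch of infrastructure-level inheritance by delegation.

NOT_HANDLED = object()  # marker a handler returns to decline a message


def transfer_handler(state, message, data):
    """Superclass behavior: two accounts and an amount."""
    if message == "set_transfer":
        state.update(debit=data["debit"], credit=data["credit"],
                     amount=data["amount"])
        return "ok"
    if message == "describe":
        return f"{state['amount']} from {state['debit']} to {state['credit']}"
    return NOT_HANDLED


def sweep_handler(state, message, data):
    """Subclass behavior: only what is new, namely how often to sweep."""
    if message == "set_frequency":
        state["frequency"] = data["frequency"]
        return "ok"
    if message == "get_frequency":
        return state["frequency"]
    return NOT_HANDLED


# The infrastructure's class table: class name -> (handler, superclass name).
CLASSES = {
    "Transfer": (transfer_handler, None),
    "Sweep": (sweep_handler, "Transfer"),
}


def send(class_name, state, message, data=None):
    """Offer the message to each class in the chain until one accepts it."""
    while class_name is not None:
        handler, superclass = CLASSES[class_name]
        result = handler(state, message, data or {})
        if result is not NOT_HANDLED:
            return result
        class_name = superclass
    raise LookupError(f"message not understood: {message}")
```

Here the Sweep handler contains nothing about accounts or amounts, yet a Sweep object answers ‘describe’ because the message falls through to the Transfer handler.  Since the link between the two is the class table and not a language feature, nothing in this arrangement requires both handlers to share a language.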

 

3.4 Summary

The cause of the integration problem is the very structure of code we have been building for the last 30 years or so.  An ‘application’ is an island of function, devoted to a specific task.  It hides the things in it—things which are needed to perform the function.  The solution is to build CBOs instead of applications.

If we build CBOs, then the picture would look like Figure 3.11, where IT would firstly have done an object analysis of a given business process, and would then have built the CBOs needed by the user to perform that process.

Note, however, that if a CBO has already been built for another process, it is not re-built.  It is not even included in other code (as would be the case with many object-oriented programming languages).  The user would just re-use it (on the object-based user interface).  This illustrates the principle of object re-use—by the user, and also by other CBOs.

Again, the use of CBOs rather than applications makes it feasible to do something about the departmental processes which are seldom addressed by AD departments.  For instead of building trivial applications, the AD function could develop re-usable ‘tools’—such as a pad of Post-It notes, or a specialized Container (some form of folder, perhaps).  These would then be able to work with other CBOs.

Figure 3.11.    Business objects.

Cooperation between CBOs is all-important to addressing the user needs, and is achieved by CBOs sending messages to each other.  Thus systems are built of independent CBOs, acting on messages sent to them, and in turn sending messages to other CBOs.

For these objects to live in today's operating environments, some form of ‘middleware’, or software infrastructure, needs to be built.  A batch spooling system is the wrong software infrastructure for building transactions; in the same way, current infrastructures are wrong for CBOs.  A new infrastructure is needed (as shown in Figure 3.12).

We will discuss some of the characteristics of such an infrastructure later (Chapter 8), but first let us concentrate on the other great impediment to ease-of-programming—the client/server connection.

 

Figure 3.12.    Business process integration.

 

 

 

 


Part Two:
The programmer—application structures

 

In this part of the book, we develop a general application model of cooperative processing in client/server and distributed systems.  The term ‘application’ has come to denote two things.  Used as an adjective, it denotes code which implements business function rather than system function.  Used as a noun, it is often interpreted as an executable which encapsulates a business process.  In the last chapter, we showed how building ‘applications’ (the second of these two meanings) was part of the problem at the user interface, and that building CBOs was preferable.  From this point in the book, we will only use the term ‘application’ in its first meaning—as an adjective denoting business as opposed to system function.

The model is developed in terms of the desired shape of application code, and a cross-system (program-to-program) structure.  The shape and structure together are aimed at delivering ease-of-programming. 

We show how a shape combining messaging and ‘event-loop’ code best meets the aim.  Further, this shape turns out to be a true subset of the structure of CBOs.  Thus the shape of application code which best handles object-based user interfaces also turns out to be excellently-suited to handle cooperative processing.

Our general model for cooperative processing in client/server systems identifies two major structural elements or ‘domains’—the user interface domain, and the shared resource domain.  Within these two domains, we identify a number of different types of CBO.  These types, within the structure identified, prove to be particularly useful in addressing a number of the common design problems of client/server systems.

 

Cooperative systems are different

Real cooperative client/server applications have gained a reputation for being difficult to design and implement.  There are broadly speaking two reasons for this:

1.      Programming difficulty   Building application code which cooperates across PC and mainframe has proved in practice to require very high skill levels in program design and implementation. 

2.      Systems management   Viable systems are manageable systems.  It is only recently that we have seen the introduction of systems management products which start properly to support distributed and heterogeneous systems.

In this book, we focus on the first of these problem areas.

Consider client/server systems in general; we can perceive these general characteristics:

        The essence of GUI application code on the PC is that it is event driven.  Events are generated by the user; the application code reacts to those events, does what is necessary, then quiesces, waiting for the next event.

        Shared resources such as data are normally located on systems such as mainframes, LAN servers or minis which are accessed by many PCs concurrently.  Corporate systems (even those of very small companies) seldom live entirely in a single PC.

        There will usually need to be application code on the Server end as well as at the PC.

Ah, you may say, what about distributed database?  Then you can have all the application code on the PC, can't you?  Well, on the face of it, that's true.  But, as we'll see, it's not as true as it appears.  In any case, and perhaps more importantly from our viewpoint, it turns out to make little difference to the application developer.

The inevitability of event-driven code at the PC means that the shared resource systems (the servers) cannot ‘drive’ the user interface.  Rather the system must be organised such that it is the Servers which are ‘driven’ by the client PCs—but in such a way as to enable business rules to be enforced, and to protect the integrity of corporate shared resources.

So the first step to client/server wisdom is to recognise that the traditional roles of mainframe (or mini) and character-based terminal are, and must be, stood on their head.  Instead of the mainframe/mini containing code which drives a dialogue with the user through the terminal, with client/server the PC contains the equivalent of the ‘dialogue’ code, and drives an essentially passive ‘server’: the mainframe.

Since there often needs to be application-level code on the server, this means that code on the client needs to be able to communicate with code on the server.  We give the name ‘cooperative processing’ to the situation where two pieces of application-level code in different systems (or at least in different address spaces) need to communicate with each other to do some single piece of work.

 

Cooperative systems are difficult

The developer who first sets out to build cooperative business code meets a whole range of difficulties, from handling an advanced GUI on the one hand to understanding how to ensure data integrity on the other.  The response to this is often to build some form of middleware so that the application programmer will find his job easier.

In practice, it is often the case that each cooperative application developer has had to design and implement his own software infrastructure.  This is so in spite of products aimed at this area; some make programming simple, but do not allow full exploitation (or sometimes even partial exploitation) of the object-based user interface; some provide useful APIs for connection, but do not provide for other complex areas (such as multi-tasking); others provide a closed environment, often for a single language only.

The reason for this is that there has been no general model of what—in software structure terms—a cooperative processing system is.  There has been no generally-accepted notion of what it is that we'd like application programmers to build—and hence no well-understood appreciation of the services and facilities that should be made available to those programmers.

We need such a model because building a cooperative application is different.  If we treat it like previous application structures, then we will find things difficult.  If we accept it's different, but don't have a model that explains the differences, then it will still be difficult.

One of the major challenges to ease-of-programming in client/server systems arises in what the average application programmer is confronted with, in terms of understanding and of APIs, when connecting code on one system to code on another.  A programmer finds things easier when there is a generally-agreed structural model which describes the shape of the software to be written.  Thus while it may take a week or so for someone new to transaction processing to understand the nature of the code that must be written, once that shape is understood, things become much easier.

 

Objectives of the model

The objective therefore is to construct a programmer's model which exhibits the following characteristics:     

        A simple conceptual structure for the programmer

        Language independence

        Enables application integration

        Ease of programming

        Wide applicability

In this section, we develop a model which has been proven to meet these objectives.  The context for discussion is a client/server system where the PCs have an object-based GUI, and the servers have application code.  This is probably the more complex case.  Our model, however, will be equally applicable to other system shapes—for example, server-to-server, and general peer-to-peer application code relationships.

It is important here to focus on the area of discussion.  We are not talking about basic communication protocols, or of system mechanisms such as RPC (remote procedure call), or of systems management across a network, or of underlying safe store and forwarding of cross-system messages.  What we're talking about is making it easy for the average application programmer to communicate with other (remote) pieces of application code in a way which hides system-level facilities such as the low-level communications APIs. 

A major surprise is that we find that the structures which best allow us to implement the object-based GUI—CBOs—are equally useful in solving the ease-of-programming problem for connecting two (possibly separately-developed) pieces of application code across a network.

And just as ‘transactions’ (as a desired application structure) need an underlying software infrastructure to support them—a ‘transaction processor’—so CBOs need an infrastructure. 

We develop the discussion as follows:

        Chapter 4 develops the model as the application developer would see it, and provides an overview of the structure.  In particular, this chapter introduces a most important design concept—the separation of application code to do with serving a single user from application code to do with handling shared resources.

        Chapter 5 looks in some detail at the first of these two areas—application code that serves a single user—and develops the concept of two kinds of CBO within this domain.

        Chapter 6 discusses the second area—application code that handles shared resources—and introduces three different kinds of CBO within this domain.

        On the face of it, it seems that having five different types of CBO does not lead to ease-of-programming.  Chapter 7 shows how these five types represent a superset, which can be both subsetted and coalesced in many situations.  In essence, the five types of CBO represent a way of thinking about design, rather than a rigid recipe.

        Chapter 8 summarises the functions and facilities required by the infrastructure, and briefly discusses the importance of self-defining data.


4              Structural overview

 

In this chapter, starting from the viewpoint of traditionally-structured application code, we develop the argument in favour of CBOs as the foundation for a general model.  We do this by addressing three main questions:

        What are the base requirements for application structuring in a client/server system?

        What shape of application code should be written if ease-of-programming is to be achieved?

        What is the end-to-end structure of a client/server system?

Making programming easy is the crux of the matter.  But unless there is a consensus about the general ‘structure’ of the application code, and about what should be distributed where, we cannot design a general-purpose application enabling layer which will make it easy.

Given that CBOs are the right shape for application code, we further develop a structure based on the separation of ‘user logic’ from ‘business logic’.  We develop not only technical reasons for this separation, but also sound design reasons.

 

4.1 Base requirements

To begin the discussion, we propose the following four maxims:

1.      There will be application code on the PC which handles the GUI, and thus serves the user.  For the moment, we will call this code ‘user logic’.

2.      In general, the user logic must be able to respond to keyboard or mouse ‘events’ within a tenth of a second or less (otherwise the GUI becomes well nigh unusable).

For example, suppose the user has requested some data, and then changes his mind and presses the ‘cancel’ button.  The user logic code should (must?) respond to this straight away (if only to say ‘please wait’).  Again, for direct manipulation (drag/drop by the user), it is sometimes necessary to refer to the user logic code to see if one object is allowed to be dropped on another.  This requires GUI response times within the application code (i.e. the user logic).  (Clearly this is not always the case.  My point is that we must allow in our thinking and in our design for such response times.)

Although not always required, such fast response time within the application code must be provided for by our model—otherwise it would not be general-purpose.

3.      There will be application code which accesses the shared resources needed by the PC user.  We could call this code the ‘shared resource logic’; but since the shared resource is mostly data, for the time being, and for ease of reference, we'll just talk about data.  Hence we'll call this application code the data access logic.

4.      The delay in accessing shared resources is generally (much) more than a tenth of a second.  In addition, such shared resources will not typically be located on the PC.

From these four givens, we can see a key design consideration: there is a significant difference in response time between the application code which handles the GUI—the user logic—and the code that handles the shared resource—the data access logic.

The user logic code must return to the underlying system-level presentation layer within around 1/10th of a second.  The data access logic, on the other hand, cannot be expected to respond reliably within this time.  Indeed, we may be talking of tens of seconds (two orders of magnitude greater) rather than milliseconds!

The only effective way to handle this difference is to separate the two pieces of code, such that they run at least in different threads of control (tasks), or in different processes (address spaces), or maybe on different machines. 

Running the two pieces of code in different threads of control means that the connection between the two must be asynchronous.  That is, code initiating a request from the user logic (managing the GUI) to the data access logic must not be held up waiting for the response.  Here we have a key distinguishing characteristic of cooperative processing in a client/server environment:

Cooperative processing applications require asynchronous requests and responses between user logic and data access logic.

The top part of Figure 4.1 shows this—the separation between the user logic and the data access logic by an asynchronous connection.  This in turn means that we need a simple way for the programmer to handle asynchronous requests.

Notice that this model is insensitive to the placement of the data access logic.  As shown at the bottom of Figure 4.1, whether or not we have distributed database makes no difference to the model.  Placing structured query language (SQL) calls on the PC does not remove the disparity in response times between the user logic and the data access logic.  It merely means that we have to do asynchronous connection within the PC.  However, for ease of exposition, we will assume that the data access code will not be on the PC.  This assumption makes no difference to the overall model—it merely serves as a useful vehicle for discussion.

The key point here is that user logic must not be blocked while the data access logic is executing; otherwise the user interface will block.
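A minimal sketch may help to make this concrete.  It assumes Python threads and queues as stand-ins for the thread boundary; the names (`requests`, `responses`, `release`, `data_access_worker`, `user_logic`) are all hypothetical and chosen only for illustration.  The user logic issues its request and carries on handling a GUI event while the slow data access logic is still at work.

```python
import queue
import threading

requests = queue.Queue()     # user logic -> data access logic
responses = queue.Queue()    # data access logic -> user logic
release = threading.Event()  # stands in for a slow database finishing its work


def data_access_worker():
    """Data access logic: runs in its own thread and may take a long time."""
    customer_id = requests.get()
    release.wait()  # simulate waiting on a slow shared resource
    responses.put({"customer_id": customer_id, "name": "A. Customer"})


def user_logic():
    """User logic: issues the request, then carries on serving the GUI."""
    log = []
    requests.put(42)  # returns at once; the user logic is not held up
    log.append("request sent")
    log.append("handled GUI event: scroll")  # the interface stays responsive
    release.set()  # the slow resource finally completes
    # For brevity the response is collected here; in an event-driven
    # structure it would simply arrive later as another event.
    reply = responses.get(timeout=5)
    log.append("response received for customer %d" % reply["customer_id"])
    return log


worker = threading.Thread(target=data_access_worker)
worker.start()
event_log = user_logic()
worker.join()
print(event_log)
```

The point of the sketch is only the ordering: the GUI event is handled between the request and its response, which is what a synchronous call would prevent.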

 

 

Figure 4.1.    Client/server separation.

Our model, then, must recognise that:

        There will be application code (user logic) which handles the GUI and which is separate from:

        Different application code which handles the shared resource (data access logic); and that

        User logic and data access logic must run in different threads of control, but in such a way that

        There is a simple mechanism provided for the application programmer to connect these two pieces of code.

These considerations give us the model shown in part 1 of Figure 4.2(a) (for clarity, the operating system components and their APIs are not shown).  The thread boundary is shown as a vertical dotted line.

So far, we have talked only of the user logic (GUI) and data access logic.  But what about the business logic?  And what do we mean by ‘business logic’?

Business logic 

What people normally think of as ‘business logic’ is that application code that enforces business rules.  For example, code which ensures that a customer order is either completely recorded on the database, or completely rejected, is business logic.  ‘Complete recording’ of an order will often entail the update of several different database tables (e.g. order header, order detail, product, and maybe customer).  In other words, ‘business logic’ is often associated intimately with the commit scope processing.

Figure 4.2.    Cooperative processing model (1).

We can distinguish this sort of logic from that required to access a single table or file (data access logic), or to do cross-field validation on data entry (user logic), and we call it ‘business logic’.

Thus we introduce business logic as a separate thing from data access logic.  But since it is normally dedicated to ensuring the integrity of data (and shared resources generally), and will typically not return with a ‘yes’ or ‘no’ until the data has been properly accessed, it is placed with data access on the other side of the thread boundary from user logic.

An example which serves to show the difference is as follows.  Code that checks whether a customer is over credit limit is business logic.  On the other hand, code that displays a warning to the user that the customer is over credit limit is user logic.  Again, code that reads a record from a file is really data access logic; while code that checks that if an order header record is recorded, then at least one order detail line must also be written, is business logic.
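The distinction can be sketched in a few lines of code.  This is an illustration only: the in-memory `CUSTOMER_FILE`, the field names, and the function names are assumptions made for the example, not part of any real system.

```python
# Hypothetical in-memory stand-in for a customer file.
CUSTOMER_FILE = {
    "C001": {"name": "Acme Ltd", "credit_limit": 1000, "balance": 1250},
    "C002": {"name": "Bright & Co", "credit_limit": 5000, "balance": 300},
}


def read_customer(customer_id):
    """Data access logic: read one record from one file."""
    return CUSTOMER_FILE[customer_id]


def is_over_credit_limit(customer):
    """Business logic: check the credit limit rule."""
    return customer["balance"] > customer["credit_limit"]


def order_is_complete(order):
    """Business logic: an order header needs at least one detail line."""
    return len(order["detail_lines"]) >= 1


def credit_warning(customer):
    """User logic: decide what warning the user should see."""
    return "Warning: %s is over credit limit" % customer["name"]


customer = read_customer("C001")
if is_over_credit_limit(customer):
    print(credit_warning(customer))
```

Note that only `credit_warning` has anything to do with the user; the other three functions would sit on the far side of the thread boundary.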

Consider an order entry process.  Figure 4.3 illustrates the main actions within such a process (for simplicity we exclude back-ordering, stock allocation and error conditions). Notice how many of the things which need to be done are primarily to do with accessing data, or supporting the user.  However, the kernel of the process, the part which must be done (even if it's a batch process, where the ‘user’ parts would have been done via data entry and edit runs), is business logic.

But where should this logic be?

Figure 4.3.    User logic, data logic, and business logic.

We have said that business logic should be on the same side of the thread boundary as the data access logic. In general, the thread boundary defines the boundary between ‘client’ and ‘server’. Often this boundary will be much more than just a thread boundary—it will be a communication link. Business logic is frequently associated with commit scope processing (that is, code effecting business rules which must be followed between receipt of a request by the server and its response to the client). In that case, the ‘server’ will normally be a shared resource system which is physically separate from the PC. On the other hand, where the data access logic is handled by a distributed database facility on the PC, then the business logic and the data access logic may be on the PC—but still on the other side of the thread boundary from the user logic.  User logic (which will often include simple data validation) should be in the PC unless it is obvious from the nature of the logic that it should be done elsewhere.  In building real systems, this is almost always a trivial and obvious decision.

In this categorisation, we are not attempting to define rigorously the differences between the three types of application logic, nor to build a model with precise theoretical boundaries.  What we've done here is to focus on the major differences between the necessary bits of code in the client/server system.  Later, we shall give each part more precise responsibilities.  For the time being, the important thing is the separation of business and data logic on the one hand as the preserve of the server, and user (presentation and interaction) logic as the preserve of the client. 

Thus we can enrich our model by adding business logic, as shown in Figure 4.2(b).  (For clarity, in Figure 4.2(b) we omit the PC screen and the actual data (the disk storage). From now on, we will not show these.)

GUI code

But why have we also added another block called ‘GUI Code’ in this figure?  And why is the task boundary shown as being between user logic and business logic, rather than between GUI Code and user logic?

Consider the GUI API provided by the system.  Handling this API is generally complex and requires a very high level of C programming skills.  We need to isolate the application programmer from this complexity if we are to achieve our ease-of-programming objectives.  This is often done by providing a window layout tool, which generates the required low-level GUI code.  We re-draw the model to show this in Figure 4.2(c), where we label the GUI code as just ‘GUI’, being produced by some appropriate window layout tool.

But what about the thread boundary?  Well, and this is an important point, often not fully appreciated, the thread boundary is as shown because, in the general case:

The user logic on the PC must sometimes respond at GUI speeds to user events.

This does not mean, of course, that there cannot be a thread boundary between the GUI code and the user logic; indeed, such a thread—or process—boundary may be very useful depending on specific PC operating system implementations.  What is being said here is that there does have to be such a boundary between the user logic and the business logic.

Thus there is a technical reason in addition to a design reason for separating user logic from business and data logic.

Before examining this separation in more detail, we now take a necessary detour to answer the question, how are the separate pieces of cooperative code connected?  How does the programmer access one from the other?  This is perhaps the key question in enabling us to design easy-to-build code, and hence define the underlying ‘enabling’ infrastructure required.

4.2 Application code structure

In this section, we address three areas of difficulty:

        What kind of connection—between two separate pieces of application code—should we provide for the programmer?

        What API model best implements the kind of connection chosen?

        Is there a specific structure for application code which delivers ease-of-programming more than any other?

 

4.2.1             Kinds of connection

Our focus here is a single interaction between two pieces of application software across a communications link.  There are, in general, three different design approaches for such an interaction:

1.      Master-slave     This is where one end (the master) owns and drives the other (the slave), and usually implies a great deal of knowledge at the master end about the state of the slave end.  The master controls and directs the whole process. An example of this is where code on one system drives a non-programmable terminal through the code on a screen controller. In this case, the ‘single’ interaction is probably very long-running—from login to logout.

2.      Peer-to-Peer    This is where each end of the link is of equal importance.  Either end can initiate an interaction.  Usually, peer-to-peer requires each end to know detail about the state of the other, and protocols are defined to communicate these states.

An example of this might be a distributed database system, where to ensure coordinated recovery, each database must discuss recovery, sync point and rollback matters with the other.  Any of the databases can initiate this with any of the others.  The single interaction here may be short or long duration, and is characterised by each end both sending and receiving within that interaction.

3.      Client/server     Here, a ‘requester’ piece of code (the client) initiates a request to a ‘responder’ piece of code (the server).  Note that here we are using the phrase ‘client/server’ in its ‘connection design’ sense; see the Glossary for the other two meanings.  Essentially the server is there to provide a service of some sort for any client that requires it.  The server knows nothing about the state of the client.  The server is (typically) passive. 

An example of this might be a credit check agency, which provides a ‘server’ interface to other companies' computers requesting a credit check.  This design is based on the following principles:

(a)    The client makes a request of a server; the server responds back to the client.

(b)   In each completed interaction between client and server, the client sends one and only one request.  This request starts the interaction.  The server responds with one response (which may be in several parts).  This response ends the interaction.

(c)    No state information is retained in the client or the server about the state of the interaction.  That is, each interaction between client and server is independent of other interactions.  Of course, both the client and server may retain information about their own state across multiple interactions.  What we're saying here is that for any given interaction, there is no dependence on either previous or future interactions for that interaction to complete successfully.

This design is sometimes called a ‘one-shot transaction’, to imply that there is no continued conversation or dialogue between the client and the server.  Note that servers can also be clients.  The processing logic within a server could make requests of other servers.  In that case, the requesting server would be a client to those other servers.
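The principles above can be sketched as follows.  The two functions and their request and response fields are hypothetical; each request is answered by exactly one response, nothing is remembered about the interaction afterwards, and the order server is itself a client of the credit check server.

```python
def credit_check_server(request):
    """One request in, one response out; nothing is kept about the interaction."""
    limits = {"C001": 1000, "C002": 5000}  # the server's own data, not interaction state
    limit = limits.get(request["customer_id"])
    if limit is None:
        return {"ok": False, "reason": "unknown customer"}
    return {"ok": request["amount"] <= limit}


def order_server(request):
    """A server that acts as a client of the credit check server."""
    credit = credit_check_server(
        {"customer_id": request["customer_id"], "amount": request["amount"]}
    )
    status = "accepted" if credit["ok"] else "rejected"
    return {"order_status": status}


print(order_server({"customer_id": "C001", "amount": 500}))
print(order_server({"customer_id": "C001", "amount": 1500}))
```

Because no interaction depends on any other, the two requests could be issued in either order, or by different clients, with the same results.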

From these three, we choose the client/server design as the basis for connection between the separate pieces of cooperative code.  The reasons a client/server design is preferable are:


        It is simpler than the others, due to the lack of state information that needs to be held about the interaction

        You get much looser ‘binding’ between each end of the interaction than otherwise.  The looser the binding between two pieces of code, then the greater the potential for re-use.  Thus several clients may use the same server.  Servers can be re-used across application boundaries.  Several servers may be accessed by a single client.

While this form of connection may not suit all occasions (for example, if the thing you're building is a distributed relational database, then it may be insufficient), it seems to suit most, if not all, of the situations met in commercial data processing—the ‘core’ or ‘mission-critical’ business systems.  It also seems to suit many other situations.

 

4.2.2             Programming models

But what does a programmer actually have to write to effect the connection between two separate pieces of application code?  Well, when a programmer writes code, he is actually doing two things:

        Implementing a concept—a model—of what he wants to do

        Obeying some syntactical rules about how to implement that model

The second of these is of much lesser importance.  If the syntax is particularly tortuous then it merely makes the thing the programmer wants to do more laborious. The first is much more important.  If a programmer cannot easily conceptualise the effect that each line of code has, then his task is made extremely difficult.

In Figure 4.4 we show on the left a number of different tasks that a programmer might want to perform.  In each case, the programmer is easily able to visualise the effect of the code he needs to write.  This code is shown in the centre of the figure.  Depending on the language, the programmer will either do a call to make something happen (as with the C language for example), or there will be some built-in language syntax (as with COBOL), or there will be some statement which is pre-compiled into a call (as is often the case with EXEC SQL …).  Figure 4.4 shows examples of all three.  Finally, down the right of the figure are the underlying system mechanisms which implement the code he writes.  Note that the implementers of such mechanisms are usually also the designers of the APIs which access them.

In the first example, the programmer is reading a record from a file.  He is protected from all the underlying complexities of the file system by the design of the API to that file system.  This allows him to write the simple statement ‘READ FILE INTO RCD_A’ to implement his image of the process.  Thus the designer of this API ensured that the API corresponded to the programmer's model of what he (the programmer) was actually doing. 


Figure 4.4.    Models, APIs, and implementations.

Thus:

Implementers of APIs for cooperative processing must first design the application programmer's ‘model’.

Designers either do this explicitly or they do it unconsciously (because they cannot avoid it!).  Note also that the examples of code in the centre of Figure 4.4 include a ‘call’ form (such as might be written in the C language).  Self-evidently, although the syntax of each of these calls is similar, the programmer is doing quite different things, and has a different model of what's happening in each case.  We will come back to this later.

In providing an API model for the programmer, we must above all else keep it simple.  It must be easy-to-use.  This means that:

        ‘System-level’ complexities must be invisible to the programmer.  Such complexities include:

         Getting across a task/process boundary to handle the asynchronous nature of the connection

         Synchronising application code with code running in the other task/process

         The physical location of the other piece of business logic

         Driving the underlying communications API

         Initiating or loading the other piece of business logic

         Routing the response back to the requestor

        The API itself must be simple (a few lines of code at the most)

It is for these reasons that many currently-available APIs are not suitable (e.g. pipes, sockets, re-directed file I/O).  For example, none of these will get the programmer transparently over the task boundary.

There are probably four different choices for an API model:

        Remote Procedure Call

        Conversation

        Make the interaction look identical to some existing external facility (for example, like file Input/Output)

        Messaging

As we shall see, none of these is entirely satisfactory by itself.  We will need to add a further factor—the shape of the code which deals with the API.  Before discussing that, however, let's look at each of the four models.

Remote procedure call 

The idea behind RPC is that the programmer merely calls a procedure (a subroutine) as usual—but the procedure is actually outside the program—in another process or address space, either on the same or a different system.

Figure 4.5 shows this.  The programmer calls a procedure.  The ‘system’ intercepts the call, routes it to the remote procedure, which then executes.  Meanwhile, the calling program waits—just as it would do if the procedure was local.  When the remote procedure has executed, it returns control (together with any data and maybe a return code) to the ‘system’, which then returns control (with the data and return code) to the next sequential instruction in the calling program.  If any code translation needs to be done (e.g. ASCII to EBCDIC, or one form of floating point to another), then the ‘system’ will do it automatically.  A different form of RPC from that shown in Figure 4.5 is where the programmer passes a handle that identifies the routine to be invoked by the response.  In this case, the programmer has to work out what to do if, before his program can continue, the response is needed.  Perhaps wait (if that’s an option)—synchronously.

Figure 4.5.    Client/server models—RPC.

Figure 4.6.    Client/server models—conversation.

Conversation

The principle behind a conversation is that each end does explicit sends and receives.  This implies that each end must be aware of the state of the other (at least whether it's expecting to send or receive).  Figure 4.6 shows such an interface.

Note that if both ends send at the same time, there must be a way to handle and recover from the resulting collision.  Protocols such as CPI-C provide for such events.

Using an existing model

This means creating an API based on some existing and well-known mechanism.  For example, one could construct a cooperative processing API to look like file I/O, thus building on concepts already familiar to the programmer.

Messaging

Here, the programmer thinks in terms of sending a message to something (another piece of software), and of receiving messages from somewhere (other pieces of software—either his own, or someone else's).  In procedural code, a messaging scheme might look like the one shown in Figure 4.7.  Notice that the programmer must be aware that a message will be returned.

Sending a message is inherently an asynchronous operation.  Control is returned to the calling program before the message is delivered.  Notice, however, that the receive is synchronous; the program issues a ‘receive message’, and waits until there is a message to be received.  Thus messaging is synchronous at the receive.
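A small sketch shows the asymmetry.  It assumes Python queues as mailboxes; `send_message`, `receive_message`, and the message fields are hypothetical names invented for the example.  The send returns even though no server is yet running, while the receive waits until a reply exists.

```python
import queue
import threading

to_server = queue.Queue()  # mailbox for messages addressed to the server
to_client = queue.Queue()  # mailbox for messages addressed to the client


def send_message(mailbox, message):
    """Asynchronous: returns as soon as the message has been queued."""
    mailbox.put(message)


def receive_message(mailbox, timeout=5):
    """Synchronous: waits until a message is available."""
    return mailbox.get(timeout=timeout)


def server():
    request = receive_message(to_server)
    send_message(to_client, {"reply_to": request["id"], "balance": 1250})


# The send completes although the server has not even been started yet.
send_message(to_server, {"id": 1, "action": "query balance"})
queued_before_server_started = to_server.qsize()

server_thread = threading.Thread(target=server)
server_thread.start()

reply = receive_message(to_client)  # the client waits here for the response
server_thread.join()
print(reply)
```

The wait inside `receive_message` is the synchronous element that the programmer is left to deal with.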

 

Figure 4.7.    Client/server models—messaging.

4.2.3             Choosing a programming model

Before choosing between these four, remember that there are three different aspects of an API:

        What the programmer thinks he or she is doing.

        The language statements (and their syntax) which the programmer needs to write to make it happen.

        The underlying mechanism that implements those statements.

It is the second of these three that people normally mean when they talk about an ‘API’.  But it is the first of these three which is the most important to the programmer.

Let us consider the four types of connection in relation to our objectives—asynchronous messaging and ease of programming.

Remote procedure call

The great advantage to the programmer is that RPC appears just like an ordinary procedure call ... or does it?

The problem with RPC as the basis for cooperative software across unreliable networks is that in reality, it does not map to a normal local procedure call.

For example, after issuing a normal procedure call, the programmer is concerned only with:

        The data returned by the procedure.

        The return code from the procedure (which might indicate that the procedure found some error condition).

The programmer using RPC must be concerned with:

        The return code from the system (as opposed to that from the remote procedure)—telling the programmer (depending on the specific RPC implementation) such things as:

         Whether the call failed to connect

         Whether the called procedure exists

         Whether all the data could be returned

        Dealing with the wait if a call takes a long time (say more than 100 milliseconds).

        Dealing with the situation that not all the data was returned.

        The data returned by the procedure.

        The return code from the procedure (which might indicate that the procedure found some error condition).

        In addition, depending on the RPC implementation, someone must handle some or all of the following ‘setup’ requirements:

         Define the RPC bindings (for automatic data conversion at run-time)

         Pre-compile the code ‘stubs’—one for the Client, one for the Server

         Make sure those stubs do not have name clashes with other stubs

         Ensure that the stubs are available in the appropriate development libraries

Even assuming that someone else does all the setup, the programmer still has a problem with long-running calls, and with more than one chunk of data being returned.
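The extra burden can be illustrated with a sketch.  The `rpc_call` function below is not any real RPC product's API; it is a hypothetical layer, with invented status codes, that simply puts a system return code in front of the procedure's own return code and data.

```python
# Hypothetical system-level return codes, invented for this sketch.
SYSTEM_OK, CONNECT_FAILED, NO_SUCH_PROCEDURE = range(3)


def get_balance(customer_id):
    """An ordinary procedure: returns its own return code and the data."""
    balances = {"C001": 1250}
    if customer_id not in balances:
        return 1, None  # procedure return code 1: customer not found
    return 0, balances[customer_id]


def rpc_call(procedure_name, argument, link_up=True):
    """Hypothetical RPC layer: adds a system return code to the result."""
    procedures = {"get_balance": get_balance}
    if not link_up:
        return CONNECT_FAILED, None, None
    if procedure_name not in procedures:
        return NO_SUCH_PROCEDURE, None, None
    return_code, data = procedures[procedure_name](argument)
    return SYSTEM_OK, return_code, data


def describe(system_rc, return_code, data):
    """What the calling programmer must check, and in what order."""
    if system_rc == CONNECT_FAILED:
        return "system: call failed to connect"
    if system_rc == NO_SUCH_PROCEDURE:
        return "system: no such procedure"
    if return_code != 0:
        return "procedure: error %d" % return_code
    return "balance %d" % data


print(describe(*rpc_call("get_balance", "C001")))
print(describe(*rpc_call("get_balance", "C001", link_up=False)))
```

A caller of the local `get_balance` needs only the last two checks in `describe`; the first two exist purely because the call has become remote.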

And, of course, RPC does not handle the necessary asynchronous nature of cooperative processing (which we specified as a key principle)—it does not handle the ‘long-running call’.

The long-running call

A long-running call occurs when the transmission of the call to the server takes a long time, or when the server itself takes a long time to process the request (for example, accessing a database)—or both.  Remember the RPC model—the calling code waits until the call returns.  This is called ‘blocking’, as the wait ‘blocks’ other threads of control from using the code.

Now for a normal procedure call (or maybe for a very fast call across a high-speed LAN to a powerful server) the wait is very small, and the blocking effect is trivial.  By ‘small’ we mean less than (say) 100 milliseconds.  This is the maximum time a user should be locked out of doing anything with a graphical user interface—and hence (in the general case) the maximum time the business logic on the PC should have to wait.

The call may even block a whole address space.  For example, if this call were done in a program which also handles the system-level GUI API, then unless the programmer handles his own multi-tasking, the whole application—and maybe even other applications—will be blocked.  And the user will not be happy as his screen and keyboard lock—for no apparent reason!

Now the programmer's model for cooperative processing must include the idea of asynchronous connection.  The RPC model excludes this.  Therefore, RPC is not a good choice of model.

Telling the programmer to do his own multi-tasking to handle the long wait problem does not smack of ease-of-use!

Conversation

Handling a conversational API model—even a simple one—is intrinsically complex.  You might restrict the complexity by specifying a subset of capability (such as the ‘CPI-C Starter Set’ does).  However, such a restriction will probably include only synchronous receives, which do not match our requirement for asynchronous operation.  If asynchronous receives are offered, then this does not map to our requirement for shielding the programmer from handling task/process boundaries.

Use an existing model

The problem with using existing models is that they typically do not map easily to the essence of cooperative processing—a client making an asynchronous request of a server.  For example, each file I/O call is synchronous—it waits until it completes.  Also, with a file I/O model, you would have an open and close operation.  But there is no obvious necessary analogy to this with cooperative processing.

Messaging

The standard Messaging model looks as if it is by nature asynchronous.  This, however, is misleading, since we need to handle incoming messages as well as outgoing ones.  And this is important, as handling incoming messages imposes a synchronous element which is unwelcome for our current purposes. 

Often I'm told this is no problem.  ‘Just spin off a thread’ I'm told.  The problem is that the people saying this are usually either highly experienced programmers for whom writing operating system extensions is a weekend hobby, or people who have little idea of the realities of programming of any sort, and who imagine that ‘spinning off threads’ is something learned on day two of the standard COBOL application programming class.

So, we seem to have concluded that none of the models are suitable for our needs!  None of them appear to make asynchronous connection easy to program, as they all have synchronous elements, so leaving the application programmer to deal with the difficult bits!

To repeat: What we're after here is ease-of-programming.

Happily, there is a solution.  It lies not in a connection model alone, but in the combination of a connection model and a particular kind of code structure—‘event loop programming’.  And since events can most easily be thought of as messages, we will choose messaging as our connection model.

4.2.4             Event loop programming

Event loop programming is not new.  We find it today in both the transaction processing environment and in the low-level GUI code on a PC. 

Transaction processing

Most existing transaction processing systems are event-driven.  For example, consider what happens when the user on a character-based terminal connected to a CICS system (say) causes an ‘event’—perhaps by pressing the ‘enter’ key:

1.      An ‘event’ is signalled to the terminal controller.

2.      The terminal controller transmits that event (with any associated data) to CICS.

3.      CICS wakes up a piece of application logic (a ‘transaction program’, often casually referred to as just a ‘transaction’) and passes the event (and its data) to the transaction.

4.      The transaction processes the event and replies by signalling a second event, destined for the terminal, together with any associated data, back to CICS (e.g. ‘Write this data to the screen’).

5.      CICS passes that event and data to the terminal controller.

6.      The microcode in the terminal controller processes the event and takes appropriate action (e.g. updates the screen buffer on the terminal).

The above would be immediately familiar if we replaced the word ‘event’ with the word ‘message’.  There are two such messages—one in, one out.  Hence what we have just described is the processing of a ‘message pair’.

Figure 4.8 shows the similarity between transaction programming and event-loop programming.  On the left are some character-based terminal-style transaction programs.  On the right is the same code, but built in an ‘event-driven’ style.  Indeed, the right and left parts of this diagram are in a real sense the same; the difference is in the packaging.  For high-performance transaction processing systems like CICS and IMS, packaging is as shown on the left of the figure because of performance constraints. 

PC GUI Code

It is the nature of GUI code (unlike code driving non-programmable terminals) that it must respond to a huge variety of user keyboard and mouse actions.  It must also manage a vast array of GUI constructs such as push-buttons, scroll-bars, windows, icons, etc.  This makes such code far more complex than character-based terminal code.

The way that much of this complexity is managed is to structure GUI code such that instead of trying to drive the user interface, it merely sits back and responds to events.  Thus GUI code is built as ‘event-driven’ code.


Figure 4.8.    ‘Event-loop’ programming.

In contrast to traditional transaction programs, however, instead of having one piece of code for each event/message, GUI code is structured as larger lumps of code, each of which handles several different event/message types.  Such code will loop around a ‘get next event’ statement—or will be woken up only when an event arrives—that is, something outside of the application code provides the ‘get-next-event’ loop and invokes the appropriate piece of application code.  This is where the phrase ‘event-loop programming’ comes from.

In the PC, different performance constraints lead to packaging as shown on the right of Figure 4.8.  This is because the ‘events’ are much more likely to include GUI events (for example, ‘button “Apply Changes” pressed’) which have to be processed at very high speed (remember the tenth of a second response time requirement), and so cannot afford the overhead of code loading implied by transaction programming packaging.

Thus in event-loop programming on the PC, the programmer expects to receive a message at the beginning of his code, handle the message, then return (to wherever he was called from).  So the programmer's model is not that he writes a complete program (although in practice he has to do so!); what he thinks of are independent modules—each relating to a given window on the GUI—which fit into a system-provided context.[14]  That context is one of freely-flowing messages.  At a high level, this is broadly similar to Transaction Programming—although quite different at the detail level!

 

Conclusion

We can achieve ease-of-programming for asynchronous connection as follows:     

        Adopt an event-loop program structure.

        Use messaging as the API model.

        Create an infrastructure to support the event loop (both of which are invisible to the programmer).  When a message is received by the infrastructure, it will invoke the relevant piece of event-loop code.  This will handle the message and return to what it sees as the system, that is, to the infrastructure.

It may seem odd, but today's operating systems do not support this kind of software structure for the average application programmer.  The nearest to it are the various transaction processing systems (or the low-level GUI APIs such as those provided by Windows and PM).  So we need some supporting software infrastructure which will do such things as:

        Handle the messages and the event loop.

        Dispatch the appropriate ‘module’ of application code by passing the message data to it and ‘starting’ it.

        Take over when the application module returns control after processing the message.

        Manage the thread or process boundary to provide asynchronous messaging.

        Route messages to the right place, and return responses to the requestor.

With this structure, the application code might look as shown in Figure 4.9, where:

        The code (high-level pseudocode) on the left of the figure (A) is invoked by some incoming message X (perhaps an ‘Apply button pressed’ message from the GUI).

        In processing that message, A sends message Y to B.

        B processes message Y, and sends the response back to A in the form of message Z.

        A then processes the response to its initial message Y in the code which ‘catches’ message Z.
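The message flow just described can be sketched in code.  This is a simplified illustration under stated assumptions: the `Infrastructure` class and its `register`, `send`, and `run` methods are hypothetical, and the sketch runs in a single thread, so the thread or process boundary that a real infrastructure would manage is left out.

```python
from collections import deque


class Infrastructure:
    """Hypothetical infrastructure: owns the message queue and the event loop."""

    def __init__(self):
        self.modules = {}
        self.pending = deque()
        self.trace = []

    def register(self, name, handler):
        self.modules[name] = handler

    def send(self, target, message_type, data=None):
        """Queue a message and return at once; delivery happens later."""
        self.pending.append((target, message_type, data))

    def run(self):
        """The event loop: dispatch each message to its module until none remain."""
        while self.pending:
            target, message_type, data = self.pending.popleft()
            self.trace.append("%s got %s" % (target, message_type))
            self.modules[target](self, message_type, data)


def module_a(infra, message_type, data):
    if message_type == "X":  # e.g. 'Apply button pressed' from the GUI
        infra.send("B", "Y", data)  # the request; A returns without waiting
    elif message_type == "Z":  # the response to message Y
        infra.trace.append("A shows %s" % data)


def module_b(infra, message_type, data):
    if message_type == "Y":
        infra.send("A", "Z", "customer %s updated" % data)


infra = Infrastructure()
infra.register("A", module_a)
infra.register("B", module_b)
infra.send("A", "X", "C001")
infra.run()
print(infra.trace)
```

Neither module contains a loop or a wait: each simply handles the message it is given and returns to the infrastructure.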

Well, that may be fine, but there's still one problem.  Looking at the code in Figure 4.9, it is hard to understand how an application (or part of an application) can be written in that style.  If we want to produce a module or program of that shape, then what is the basis of modularisation?  It's difficult to see how it can be functionally-based.

 

Figure 4.9.    ‘Event loop’ with messaging.

 

The answer is to modularise on the basis of data—of things—of objects.  But how might this work?  Consider Figure 4.10.  On the left, we see two traditional transactions.  We assume that the user enters one of two such transactions—‘Update customer’, or ‘Enquire on Order’.  Each of these will invoke the appropriate function.  The ‘order enquiry’ function will read the customer and orders (and, not shown, parts) databases, format the reply, and return it (perhaps display it directly on a character-based terminal).  Similarly, the ‘update customer’ function will write to the customer database.

Figure 4.10.    Functional vs. object modularization.

Suppose we were to modularise on the basis of data (objects), and build a piece of event-loop code which dealt with all the functions done on customer data, and another which dealt with all the functions done against an Order.  This might look something like the right side of Figure 4.10, and would work as follows.

Consider the two user transactions—‘update customer’ and ‘enquire on order’.  Let's assume that there is some infrastructure (middleware) code which will transform our user requests into messages.  The ‘update customer’ action would result in a ‘write’ message being sent to the customer object, which would write to the customer database. 

Now consider the order enquiry.  A ‘query’ message (or, perhaps better, a ‘display yourself’ message) would be routed to the order object.  Now, an order object will typically not have, for example, the customer name and address in its data, although it will certainly hold the key (customer number) of the customer who placed the order.  So how does the order object display the customer name and address as part of itself?

The answer is that the order object sends a message to the customer object, asking for its name and address. (Object-oriented readers may notice that this breaches encapsulation rather badly.  If you'd rather, let's say that the customer object returns an address object, which the order object then uses appropriately; or that the order object sends a ‘display your address object here’ message to the customer object.)  The order is then displayed—maybe by sending a message across the network to a PC.
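As a rough sketch of modularising by object rather than by function, consider the Python below.  The in-memory dictionaries stand in for the customer and orders databases, and the message names and sample data are invented for illustration.

```python
# Stand-ins for the customer and orders databases (invented sample data).
CUSTOMER_DB = {"C1": {"name": "Jones", "address": "1 High Street"}}
ORDER_DB = {"O7": {"customer": "C1", "items": ["widget"]}}

def customer_object(key, message, data=None):
    """All functions performed on customer data live here."""
    if message == "write":
        CUSTOMER_DB[key].update(data)
        return "written"
    if message == "get name and address":
        record = CUSTOMER_DB[key]
        return record["name"], record["address"]
    raise ValueError(f"customer does not understand {message!r}")

def order_object(key, message, data=None):
    """All functions performed on order data live here."""
    if message == "display yourself":
        order = ORDER_DB[key]
        # The order holds only the customer number, so it asks the customer object.
        name, address = customer_object(order["customer"], "get name and address")
        return f"Order {key} for {name}, {address}: {', '.join(order['items'])}"
    raise ValueError(f"order does not understand {message!r}")

# 'Update customer' becomes a 'write' message to the customer object.
customer_object("C1", "write", {"address": "2 Low Road"})
# 'Enquire on order' becomes a 'display yourself' message to the order object.
display = order_object("O7", "display yourself")
```

Note that the order object never touches the customer database; only the customer object does.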

 

4.2.5             CBOs Again

But haven't we seen something like this before?  Yes, in Chapter 3 we developed the idea of cooperative business objects (CBOs) as the answer to the problem of object-based user interfaces.  A CBO looks very similar to the event loop structure we've developed here.  Indeed, in all its main aspects, it's identical.  So we find that the idea of CBOs is much more generally applicable than just in the GUI area—it also provides the structural base for a solution to the ease-of-programming problem for cooperative client/server systems.

What luck! 

CBOs solve not only the Object-Based GUI problem, but also the Cooperative Processing problem!

In subsequent chapters, we will examine CBOs in greater depth. 

4.2.6             Synchronous Connections

We have focused on the need for asynchronous messages, and have concluded that this can be achieved—with ease-of-programming—through a messaging API plus an event-loop style of program structure.  However, there are many situations where a synchronous connection is required.

Why is this?  Well, consider the event-loop code shown on the left of Figure 4.9.  Notice how, for each message sent, we provide a block of code to handle the response (the programmer will know that the response to his message Y will come back to him in the form of message Z).

 

Now, there are many occasions when, in processing some single message, the programmer will want to send several messages whose responses are required in order to complete processing of the first incoming message.  If asynchronous messages are all that's available to the programmer, he will need to provide a piece of ‘catcher’ code for each different incoming message.  Not only that, he will have to manage the ‘state’ of things—the fact that processing of one thing cannot continue until the result of another thing has been received.

Now while managing a limited number of states (two or three) is not too onerous, managing ten (say) is extremely complex.  This complexity can be avoided by allowing synchronous as well as asynchronous messages.

As described in Appendix 3, it is useful to provide the programmer with the ability—at the API level—to specify whether a message will be synchronous or asynchronous.  Thus we might have a ‘send’ API for synchronous messages, and a ‘post’ (say) for asynchronous.

We can then have the underlying infrastructure ensure that, regardless of the location of the target code, the behavior of the messaging system will always be the same.  This leads to much easier programming.  Consider, for example, the pseudocode shown in Figure 4.11, which assumes that only asynchronous messaging is available.

Now consider the exact functional equivalent in Figure 4.12, where both synchronous and asynchronous messages are available.  This is clearly much simpler code than in Figure 4.11. (The pseudocode shown in Figure 4.11 and Figure 4.12 is focused on messaging, and hence omits other important things such as the handling of message data, scratch-pad areas, etc.)
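The contrast can also be sketched in Python.  This is not a transcription of the two figures; the `post` and `send` functions, the two small servers and the sample data are all assumptions made for illustration.  The asynchronous client must keep track of which replies have arrived, while the synchronous one simply reads top to bottom.

```python
from collections import deque

queue = deque()      # pending (sender, target, message, data, reply_as) entries
handlers = {}        # program name -> function(message, data)
results = []

def post(sender, target, message, data, reply_as=None):
    """Asynchronous: the reply, if wanted, arrives later as message `reply_as`."""
    queue.append((sender, target, message, data, reply_as))

def send(target, message, data):
    """Synchronous: the reply is available on the next sequential instruction."""
    return handlers[target](message, data)

def run():
    while queue:
        sender, target, message, data, reply_as = queue.popleft()
        result = handlers[target](message, data)
        if reply_as is not None:
            post(target, sender, reply_as, result)

PRICES = {"widget": 250}
STOCK = {"widget": 12}
handlers["price"] = lambda message, data: PRICES[data]
handlers["stock"] = lambda message, data: STOCK[data]

state = {}

def client_async(message, data):
    if message == "start":
        state.clear()
        post("client_async", "price", "get", data, reply_as="price reply")
        post("client_async", "stock", "get", data, reply_as="stock reply")
    elif message in ("price reply", "stock reply"):
        state[message] = data          # one 'catcher' per reply, plus state
        if len(state) == 2:            # only now can 'start' be completed
            results.append(("async", state["price reply"], state["stock reply"]))

def client_sync(message, data):
    if message == "start":
        price = send("price", "get", data)
        stock = send("stock", "get", data)
        results.append(("sync", price, stock))

handlers["client_async"] = client_async
handlers["client_sync"] = client_sync

post(None, "client_async", "start", "widget")
run()
send("client_sync", "start", "widget")
```

With only two outstanding replies the state handling is small; the point of the text is that it grows quickly as the number of outstanding requests rises.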

While the availability of both synchronous and asynchronous messaging will not guarantee it, clearly ease of programming is enabled by the presence of both options.  As always, bad program design can make anything messy; but good design is made infinitely easier if facilities which support that design are available.

What about the server end—the recipient of messages?  How do we provide ease-of-programming if a given Server might be ‘sent’ a message on one occasion, and ‘posted’ a message on another?

One approach might be as shown in Figure 4.13.  However, since we need a messaging infrastructure anyway, why not have that infrastructure hide whether the sender sends or posts the message?  This might work as shown in Figure 4.14. In this figure, the complexity is hidden in the piece the programmer doesn't see—the messaging infrastructure.  The server programmer writes the same thing (program C) regardless of whether a message has been sent or posted.
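A minimal sketch of this transparency, assuming a hypothetical `Infrastructure` class (the class, its method names and the toy 'double' message are invented for illustration), might be:

```python
from collections import deque

class Infrastructure:
    """Hypothetical messaging layer; hides from the server how a message arrived."""

    def __init__(self):
        self.programs = {}
        self.queue = deque()

    def register(self, name, program):
        self.programs[name] = program

    def send(self, target, message, data):
        # Synchronous: invoke the target now and hand the reply straight back.
        return self.programs[target](message, data)

    def post(self, sender, target, message, data, reply_as):
        # Asynchronous: remember the request; the reply is delivered later.
        self.queue.append((sender, target, message, data, reply_as))

    def dispatch(self):
        while self.queue:
            sender, target, message, data, reply_as = self.queue.popleft()
            reply = self.programs[target](message, data)
            self.programs[sender](reply_as, reply)

def program_c(message, data):
    """The server: identical code whether the message was sent or posted."""
    if message == "double":
        return data * 2

received = []

def program_a(message, data):
    received.append((message, data))

infra = Infrastructure()
infra.register("A", program_a)
infra.register("C", program_c)

sync_reply = infra.send("C", "double", 21)            # reply returned directly
infra.post("A", "C", "double", 4, reply_as="doubled")  # reply arrives as a message
infra.dispatch()
```

Program C simply returns its result; it is the infrastructure that decides whether that result goes back as a return value or as a new message.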

 

 

 

 

Figure 4.11.    Messaging with asynchronous messages only.

Also hidden in the messaging infrastructure are things not shown in the figure, such as how the message queuing works, and how task and/or process boundaries are managed.  Finally, it's worth noting that the messaging infrastructure invokes the application programs directly.  Now it may be the case—especially in the PC—that several such programs are loaded concurrently in a single thread (task).  This means that the Infrastructure must handle both intra- and inter-task messaging.


Figure 4.12.    Messaging with asynchronous and synchronous messages.

 

Of course, a synchronous message as shown in Figure 4.14 is really the same as an RPC.  However—and here's the difference—we can build very similar APIs for both ‘send’ and ‘post’, thereby presenting the programmer with a consistent messaging model.  Notice that the programmer never sees queues.  He or she merely sends a message, and knows that the returned data will be available on the next sequential instruction, or posts a message, knowing that a subsequent incoming message (‘event’) will carry the response (the programmer should always specify in the post statement the response message that he or she would like to receive).

Note that the asynchronous message is quite different from the form of RPC where the handle of the routine to be invoked by the response is passed with the request. With the asynchronous message, the programmer has nothing further to do. With RPC-with-handle, the programmer has to additionally handle such things as:

        What to do if the response is needed before the response routine is invoked

Figure 4.13.    Server visibility of message type.


Figure 4.14.    Server transparency of message type.

        How to re-synch if the response routine is invoked when in the middle of something else

        Checking if the response routine has been called.

Recursive Messaging

Consider Figure 4.15.  Here, a message X is received by A (1).  Processing of message X requires message Y to be sent to B (2).  But B's processing of message Y includes sending message Z (back) to A (3).

In this case, the messaging infrastructure feeds message Z to A just as if it were any other message.  However, remember that A started off by processing message X.  But before it completes this processing, it has to handle message Z.  Note that when it receives message Z, it has not yet received the response from B to message Y, which it sent while processing the first message X.

This means that the same code (A) is entered twice.  Although no problem for modern stack-based compilers, this requirement for recursiveness on the part of the language being used should not be overlooked.

Figure 4.15.    Synchronous messages and recursive code.

The knowledge that his code may be called recursively (or, to be precise, re-entered by a second thread of control before the first thread has completed and returned) cannot be hidden from the programmer.  Thus the Infrastructure must ensure serial re-usability between clearly-defined points.  In this case, the programmer should have no more difficulty than he does with similar situations in transaction processing.  He or she does not have to deal with re-entrant code, merely with serially-reusable code.
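The re-entry can be made visible with a small Python sketch.  Here synchronous messages are modelled as plain function calls, and the `depth` counter and `trace` list are invented purely to show that A is entered a second time before its first activation has finished.

```python
trace = []
depth = 0   # how many activations of A are currently in progress

def program_a(message, data=None):
    global depth
    depth += 1
    trace.append((message, depth))
    try:
        if message == "X":
            # (2) processing X requires a synchronous message Y to B
            return program_b("Y", data)
        if message == "Z":
            # (3) arrives while the activation handling X is still waiting
            return "A's answer to Z"
    finally:
        depth -= 1

def program_b(message, data=None):
    if message == "Y":
        reply = program_a("Z")      # B needs something from A before it can reply
        return f"B's answer to Y (using {reply})"

result = program_a("X")             # (1) message X arrives at A
```

The trace records message Z being handled at depth 2, that is, inside the activation that is still processing X.  Any data that A keeps outside its local variables must therefore be in a consistent state at the moment it sends Y.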

GUI Events

Turning to the PC specifically for a moment, we notice that system-provided GUI APIs are event-driven.  So if we can have the messaging infrastructure present incoming GUI messages to us in the same way as other ‘events’, then we will add even more ease-of-programming, since the programmer starts to see all events in the same way—as incoming messages, all conforming to the same programming model, of the same format, and having the same API.

Conclusion

Of the various connection models, we choose messaging to implement the client/server connection, for these reasons:

 

        It maps well to the necessary model of cooperative processing.

        It handles asynchronous requests in a natural way for event-loop programming (which is natural on the PC).

        The server can be of several different styles, and event-loop programming at the server is not forced.

        It provides a natural way to get over the task/process boundary transparently to the programmer.

        It does not always force asynchronous operation, and so can be used for synchronous messaging as well; however, the same API model can be used for both.

        It can handle a server responding to a request with multiple responses (multiple messages).

        It can be implemented (under the covers) with any of the normal mechanisms:  RPC, message queuing, conversation, inter-process communication (IPC) mechanisms, pipes, etc.

Overall, then, a messaging API (both at the model and at the code level) provides an excellent balance between programming simplicity on one hand, and wide applicability on the other.  This will be of vital importance in the further development of our cooperative processing model.

Meanwhile, we must pick up again the notion of separation of the user logic from the data logic referred to at the beginning of this chapter.

4.3 The user vs. the data

Earlier in this chapter, we briefly discussed the separation of ‘user logic’ (application code on the PC that serves the user) from ‘data logic’ (application code that accesses shared resources).  We then added ‘business logic’ (application code that implements business processes and rules) on the side of data logic.

This separation is a fundamental concept for client/server systems.  For, although there are some similarities in required functions at a server and at a PC, the two systems are differentiated by their prime purpose.  Thus:

        The PC is there to serve a user; to provide the user with a human-computer interface that will make it possible to use—to exploit—available IT resources, such that the user's business objectives can be achieved.

        The server is there to make valuable IT shared resources available to multiple concurrent requesters, while at the same time protecting the integrity of those resources, and ensuring that business rules are enforced.

These two quite different aims make the client and the server different in the facilities they have to provide.  For example, data commit and roll-back would not generally be expected of a PC, while providing the user with a choice of fonts on a graphical user interface would not be expected of a server.

In today's client/server systems, both of these are equally important if full value is to be gained.  Often the user aspects are said to be less important than data integrity and business rule enforcement.  From the point of view of protection of invaluable corporate resources, that's true.  From the point of view of using those resources to best effect, it's false.  Client/server systems must not only manage shared resources; they must also provide the best possible access to them.

We formalise these two aspects into a general model defined in terms of a ‘user interface domain’ and a ‘shared resource domain’, as follows:

4.3.1             The user interface domain

The user interface domain (or UID) is the application code (CBOs) responsible for serving a single user.  There will be as many UIDs in a given system as there are active users.

The key characteristic of the UID is that it has a user interface.  Thus in the UID, CBOs will be focused on presenting data and function (from whatever source) to a single user, and allowing that user effectively to interact with it. 

The UID has no responsibility for data integrity, nor for the enforcement of business rules.  All operations on data (and other shared resources) which originate in the UID will be requested of the shared resource domain.

4.3.2             The shared resource domain

The shared resource domain (or SRD) is the application code (CBOs) responsible for ensuring the security and integrity of corporate shared resources such as data.   

CBOs in the SRD will be concerned with things like concurrent update protection, accessing shared data (and other resources) and commit scopes.

The SRD will normally provide a number of specific ‘one-shot’ processes (such as ‘update customer’), and will ensure resource integrity by encapsulating the commit/roll-back processing.

The boundaries of an SRD are defined by the boundaries of the underlying resource manager (such as a transaction processor and/or a database manager).
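As an illustration of a ‘one-shot’ process that encapsulates its own commit scope, here is a minimal Python sketch.  It uses the standard `sqlite3` module as a stand-in for the resource manager; the table layout, the request format and the credit-limit rule are assumptions made for the example.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE customer (id TEXT PRIMARY KEY, name TEXT, credit_limit INTEGER)")
db.execute("INSERT INTO customer VALUES ('C1', 'Jones', 1000)")
db.commit()

def update_customer(request):
    """One-shot SRD process: the commit scope starts and ends inside this call."""
    try:
        if request["credit_limit"] < 0:
            raise ValueError("credit limit cannot be negative")   # a business rule
        cursor = db.execute(
            "UPDATE customer SET name = ?, credit_limit = ? WHERE id = ?",
            (request["name"], request["credit_limit"], request["id"]),
        )
        if cursor.rowcount != 1:
            raise LookupError("no such customer")
        db.commit()
        return {"status": "ok"}
    except (ValueError, LookupError) as error:
        db.rollback()
        return {"status": "rejected", "reason": str(error)}

ok = update_customer({"id": "C1", "name": "Jones Ltd", "credit_limit": 2000})
bad = update_customer({"id": "C1", "name": "Nobody", "credit_limit": -5})
missing = update_customer({"id": "C9", "name": "Smith", "credit_limit": 500})
row = db.execute("SELECT name, credit_limit FROM customer WHERE id = 'C1'").fetchone()
```

The requester (normally the UID) sees only a request and a reply; it never holds a transaction open across messages.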

4.3.3             UID and SRD interaction

A simple example of UID and SRD is shown in Figure 4.16, where a PC is connected (perhaps across a LAN, through a Gateway, and across a WAN) to a mainframe.  Requests can flow both ways, and are always in the form of messages.  One domain issues a message to the other; the other responds by sending one or more messages back to the requestor.  Normally, the requestor will be the UID.

Our domain model admits of many other cases than this, however.  For example, Figure 4.17(a) shows the case of a distributed database management system (DBMS), where some or all accesses to the database are issued from the PC.  In that case, the SRD extends from the mainframe to the PC.  The PC contains a UID and part of the SRD.


Figure 4.16.    Cooperative processing domains (1).

Figure 4.17(b) shows the case of part of the UID being on a LAN server, with the SRD on both the LAN server and a mainframe.

Although each domain will have some similarities, they will differ at the detail level.  In particular, the detail of the CBO code in each will be very different.  For example, the SRD assumes the existence of an underlying resource manager (such as IBM's CICS with DB2, or one of the several DBMS’s which also have transactional capabilities), while the UID will be greatly concerned with the object-based user interface.

Figure 4.17    Cooperative processing domains (2).

There will be as many UIDs as there are users.  In addition, there may be more than one SRD (depending on the resource manager used).  While the SRD is normally the server, there will be occasions when something in an SRD needs to initiate a request of a UID.  Thus the connection between the two, while always ‘client/server’ in the sense of request/response pairs, is more of a peer-to-peer nature.  This is because both can be clients (that is, both can make a request of something else), and both can be servers (that is, both can receive requests from somewhere else).

An SRD may well access another SRD directly.  For example, a UID on a PC may send a request to an SRD on a LAN server, which, to handle the request, may need to read some data from the mainframe server.  Without an underlying distributed database, there would be two SRDs involved—one at the mainframe, one at the LAN server.

Both the UID and the SRD are implemented with CBOs.  Within each domain, there are a number of identifiable types of CBO.  The next two chapters discuss these different types.

 

4.4 Summary

In this chapter, we have identified and discussed the need for:

        Ease-of-programming to implement cooperative systems.

        Asynchronous requests between the two pieces of cooperative code.

        A requester/responder design approach for cooperative processing—that is, an atomic request/response interaction, where the responder (the ‘server’) knows nothing about the state of the requester (the ‘client’), and where the responder, between invocations, does not retain knowledge of previous invocations.

        Messaging (as opposed to RPC or conversation)—together with an ‘event-loop’ style of programming—as the single approach that meets the ease-of-programming and asynchronous request requirements.

        Synchronous requests between different pieces of event-loop code.

We have also introduced the following concept:

        Separation of user-oriented and shared resource responsibilities into the ‘user interface domain’ (UID) and the ‘shared resource domain’ (SRD).

So we have seen how messaging, combined with event-loop programming, can potentially give us the best of all worlds—both synchronous and asynchronous requests with a single simple API and programmer's model.

This also gives us a way to handle incoming unsolicited messages.  And, of course, the messaging infrastructure can cross the task/process boundary without the programmer being aware of it.

Furthermore, we discover that the structure of application code which best delivers ease of programming in this environment is the same structure that we found we needed to handle the object based user interface.  By applying the techniques of object orientation to event-loop code, we gain the benefits of both OO and cooperative processing.  The resulting structure—the thing that the application programmer writes—we call a ‘cooperative business object’ (or CBO).

Finally, we see an important separation of function in client/server systems into the UID and the SRD.  This separation is not only technically necessary (for asynchronous requests) but also hugely useful from a design point of view, as will become apparent in later chapters.

 


5              The user interface domain

 

In this chapter, we look in more detail at the area we called the ‘user interface domain’, or UID.  In client/server systems, the UID is the province of the ‘client’ CBOs, and is located (either completely or mainly) in the PC.  Since the UID is all about serving a single user, we focus in this chapter on the CBOs immediately behind the object-based GUI. 

Consideration of the responsibilities of the UID leads to our finding that two different types of CBOs (the ‘view’ and the ‘local model’) are needed.  This is a refinement of a concept which has been around for several years, the idea of a ‘view’ object and a ‘model’ object.  We find that, while a single model object may be adequate for stand-alone systems, it is not so for distributed client/server systems: hence the introduction of a ‘local model’.  In this book, we replace the general notion of a ‘model’ object by specific types of CBO in the SRD: the ‘focus’ and ‘entity’ objects.  These are discussed in depth in the next chapter.  Appendix 4 discusses the difference between the ‘local model’ and the ‘model’ in some detail.

In the body of this book, the term ‘model’ is used exclusively to mean the local model (in the UID).  For this reason, whenever you see the word ‘model’, then you should take it as meaning ‘local model’.

So, the UID is populated with view and model CBOs, and between them, they contribute significantly to our overall objectives, which were:

        Object-based GUIs,[15] plus:

        Cooperative Processing, plus:

        Software re-use and integration potential

all with:

        Ease of programming for the average application programmer


At this point, and before developing the main argument, I'd like to interpose a brief note on what I call the ‘mainframe culture’.

The ‘mainframe culture’

Among some IT professionals today, there exists the notion that the PC end of a client/server system is merely a trivial user interface, of no import, and best left to some GUI layout tool whose selection, use and implementation should be assigned to the most junior and least experienced members of the IT department.  I even heard someone say, within the last two years, that, ‘Advances in business modelling, data analysis and CASE tools have been such that we can now implement an entire system—right through to system test—without considering the user interface at all.’

Client/server developments where this impression has been dominant have always (in my experience) run into serious trouble.  For such attitudes not only indicate a breathtaking arrogance (the user doesn't matter); they are also a sure recipe for disaster.  At the beginning of a new technology curve, the system as a whole is the proper realm of the more senior and experienced IT professionals.  This means understanding equally what is needed for both the UID and the SRD. 

As it is, there seems to be a lamentable level of ignorance among many senior IT professionals about a key part of a client/server system—the PC.  Test this for yourselves.  Look around your own IT department.  List the more senior and knowledgeable professionals—the ones whose technical judgement is trusted by senior IT management.  Now list the people with acknowledged expertise in PCs.  If there is more than 10 per cent commonality, you're probably well ahead of the game.

This is, however, more than just an attitude.  It tends towards being a culture, and two conflicting corollaries can be observed. Sometimes, senior IT management can be overawed by someone whose experience is limited to the PC and LAN environment only, resulting in the person being over-valued in wider systems contexts. Alternatively, someone who understands PCs is often seen as thereby necessarily being inexperienced in other areas of IT expertise.  I call this the ‘mainframe culture’.  It is not mainframes that are dinosaurs, but the culture of the mainframe.

PCs and the notion of client/server systems have been with us for some five years at the very least (I first met this idea some ten years ago).  The reason we have not yet got to grips with the shape of client/server applications is arguably because of the mainframe culture.  It has been a significant failure of the IT industry.  If this book helps, in any way at all, to change that failure to a success, then it will have served its purpose.

The truth is that the UID is as important as the SRD.  After all, the whole reason for putting expensive PCs on users' desks instead of cheap character displays is to enable users to exploit the SRD.  If this cannot be done, then why waste money?

 


It may well be that in five years or so, we will have discovered how to generate object-based user interfaces automatically.[16]  We may also have discovered how to generate high performance databases and business logic in the SRD automatically as well.  Until then, the serious designer of client/server systems must pay equal attention to both ends of the system.  They are very different in their technical detail; but one without the other will severely decrease the value of the system to the business.

5.1 User logic

In the previous chapter, we developed our general model to the point shown as (c) in Figure 5.1.  We used the example of a PC connected to a shared resource system as the vehicle for discussion, and we now focus on the PC end of that system.

We begin by examining further the User Logic part of the model.  What do we mean by ‘user logic’?  Many people have the strong opinion that ‘business rules’ and ‘business logic’ (or ‘application logic’) have no place in a PC; for integrity reasons, it is said, such code must be restricted to the server. 

Well, consider the following examples: 

        Application-level logic may need to be run to handle some drag/drop situations.  For example, a user may not be allowed to drop an item on to an Order Form if the customer is over credit limit.

 

Figure 5.1.    Cooperative processing model (2).

        An ‘apply’ push button on a window may need to be greyed out (made temporarily unavailable to the user) while a response is obtained from a server.

        A Post Code (Zip Code) keyed in by a user may need to be edited to ensure that it's at least in the right format.

        A request needs to be sent from the UID to an SRD to get some data which needs to be displayed.

        The current value of a customer order may need to be shown as items are added to it.

All of these are things which will require application logic to be run; all of these are things which you really don't want to go back to the server to do.

For the first example, indeed, you could not go back to the server.  The application logic to handle drag/drop needs a very short response time to satisfy user feedback requirements.  The mouse pointer needs to change from ‘OK’ to ‘Not OK’ as the user drags the customer over the border of the order form.  This means that the application logic which decides whether a drop is OK or not must certainly run in the PC.

The last example clearly requires some ‘business’ or ‘application’ logic, just as it may require the application of ‘business rules’ to calculate the correct total.  And again, we don't want to go all the way back to the server just to accumulate a total!

The one thing which distinguishes all of the above examples is that they all serve the user in some way.  They are not essential to the core business process; but they (and others like them) are essential in enabling a person using a PC to drive those core processes.

Whether we call such code ‘business logic’ or not becomes then a matter of definition.  In the last chapter, we implied that business logic existed only on the server.  We now amend this, and make the following definition:

Business Logic is all the code which is peculiar to specific business data and/or processes.

Thus code which greys out an ‘Apply’ button on the screen is not business logic; but code which applies a business rule to decide whether or not a button should be greyed out is business logic.  An EXEC SQL statement which accesses the business' prime ‘customer’ data is not business logic; but code which checks whether the customer is over his credit limit is.[17]
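The distinction can be sketched in Python.  The `ApplyButton` class is a stand-in invented for the example, not a real GUI toolkit class, and the credit-limit rule is deliberately simplistic.

```python
class ApplyButton:
    """Stand-in for a GUI push button (hypothetical; not a real toolkit class)."""

    def __init__(self):
        self.enabled = True

    def grey_out(self):
        # Not business logic: it knows nothing about customers or orders.
        self.enabled = False

def over_credit_limit(balance, order_value, credit_limit):
    # Business logic: peculiar to the business's customer data and rules.
    return balance + order_value > credit_limit

def refresh(button, balance, order_value, credit_limit):
    # Business logic too: it applies the rule to decide the button's state.
    if over_credit_limit(balance, order_value, credit_limit):
        button.grey_out()

blocked = ApplyButton()
refresh(blocked, balance=900, order_value=200, credit_limit=1000)

allowed = ApplyButton()
refresh(allowed, balance=100, order_value=200, credit_limit=1000)
```

The same `grey_out` code could be reused in any application; only `over_credit_limit` and `refresh` are tied to this particular business.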

However, we also differentiate between UID business logic and SRD business logic.  Thus there are two kinds of business logic; that relating to the UID, which is there to assist a person (the user), and that relating to the SRD, which is there to assist the business as a whole.

Now, the application logic on the PC has to do two things:

        Handle the GUI (graphical user interface)

        Provide UID-related business logic, some of which will involve making requests from the UID to the SRD (typically across a task boundary)

Because these are quite different kinds of logic, it becomes very useful to split the user logic into two parts: the ‘view’ (or ‘user presentation logic’ ) and the ‘model’ (or ‘user business logic’ ).  This is shown in Figure 5.1(d), where the ‘focus’ and ‘entity’ boxes on this diagram relate to the SRD, which we shall address in the next chapter.  Thus:

        The GUI requests services from the view.

        The view responds to the GUI and requests services from the model.

        The model responds to the view, and makes requests of the server.

        The server responds to the model (and may make requests of other servers before doing so).
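The chain of requests and responses above can be sketched as follows.  Each layer is shown as a simple Python function taking a message; the message names and the sample customer record are invented for illustration, and a real system would route these calls through the messaging infrastructure.

```python
def server(message, data):
    """SRD side: owns the shared customer data (invented sample record)."""
    if message == "read customer":
        return {"id": data, "name": "Jones", "postcode": "ab1 2cd"}

def model(message, data):
    """UID business logic: responds to the view, makes requests of the server."""
    if message == "get customer":
        record = server("read customer", data)
        record["postcode"] = record["postcode"].upper()
        return record

def view(message, data):
    """UID presentation logic: responds to the GUI, makes requests of the model."""
    if message == "window opened":
        record = model("get customer", data)
        return f"{record['name']} ({record['postcode']})"

# The GUI layer would issue this when the user opens a customer window.
window_text = view("window opened", "C1")
```

Each layer talks only to its immediate neighbour: the view never calls the server, and the server knows nothing of windows.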

This ‘model-view’ structure allows us to separate UID code to do with presentation and interaction (that is, connecting to the tool-produced GUI code which drives what the user sees) from UID code to do with the business entity, and is illustrated in Figure 5.1(e). Further, a view will be implemented as a CBO, as will a model. 

But didn't we imply that, for each object on the object-based user interface, there would be a (single) CBO?  Well, yes, we did.  But before we expand on this, let's look a little more closely at the idea of models and views.

5.2 ‘Models’ and ‘views’

Consider the kind of object-based user interface we discussed in Part One of this book. Now an object on the screen (such as, say, a ‘customer’) can have one or more ‘views’.  Why is this?  Well, consider the case of a user who needs to understand both a customer's mailing address and the physical location of that address.  Such a user might well want to look at both a text presentation of the address, and at a map showing the location of the address.  An example of this is shown on the left of Figure 5.2, where there is an icon (the customer) and two windows.  One window (or view) shows the address of the customer as a text description, while the other shows exactly the same data, but presented as a map of how to get there.

5.2.1             Multiple views

This is paradoxical; for inasmuch as both views are showing the same thing—the address—they are the same.  But inasmuch as one is a graphical map-like thing and the other is a text-like thing with formatted fields of data, they are quite different.

We deal with this paradox by recognising both the differences and the sameness in different bits of code.  Thus the difference between the two windows is handled by structuring the code which handles each of them into two separate CBOs.  This is shown on the right of Figure 5.2, where one CBO deals with a map (which shows how to get to the customer's location), and another deals with a text window (which shows the customer's address in text).  We call each of these a ‘view’ CBO, as each handles one visible aspect of an object (what the user can see, or view, on the screen).

 

Figure 5.2.    Model-view object structure

But where is the code which deals with the customer per se?  Again, we find it useful to separate the ‘sameness’ aspect of the two views into a separate CBO—which has no knowledge of the details of the views, nor of the fact that there is more than one.  This we call the ‘model’ object, as it relates to the user's model of ‘customer’. (The terms ‘view’ and ‘model’ are often attributed to the Smalltalk ‘model/view/controller’ framework, whose context is a GUI.  In this book, the same terms are used with expanded meanings to fit the wider distributed system context.)

Note that the ‘customer model’ CBO handles a particular instance of customer—for example, it will handle customer Jones, not the whole database.  If it is required to handle other customers concurrently (for example, customer Smith), then there will be a second instance of the customer model CBO.  Such multiple instances will be handled by the ‘infrastructure’ or ‘middleware’ code which is necessary to deliver the ease-of-programming for cooperative processing and advanced GUIs.  As the discussion progresses, we shall add additional requirements for this infrastructure, and will summarise them in Chapter 8.

Thus we have expanded our general model in a way especially important for object-based user interfaces.  The ‘user logic’ part in fact consists of two different kinds of CBO—views and models.

5.2.2             A single CBO?

Now let's come back to the question of the object-based user interface, and the idea that each object will be implemented by one CBO.  Well, that single CBO is the model CBO.  A model may have one or more views associated with it.  Each view will be implemented as a separate CBO.

The implications of this are:

        A model will not know how many views it has; but it will know that it may have views.  Hence on occasion, it may need to send a message to whatever views it may have (for example, if its data has changed, then it may need to have its views reflect that change).[18] 

        A view may not know what model it is viewing.  This allows for general purpose views, usable by several models.[19]

        The linkage between views and models should be provided by the CBO Infrastructure.
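These three implications can be illustrated with a short Python sketch.  The `Infrastructure` class here is hypothetical and much simplified; it exists only to show that the model asks the infrastructure to tell ‘whatever views it may have’, without ever holding a list of them itself.

```python
class Infrastructure:
    """Hypothetical CBO infrastructure: it alone knows which views belong to a model."""

    def __init__(self):
        self.links = {}     # model name -> list of view functions

    def link(self, model_name, view):
        self.links.setdefault(model_name, []).append(view)

    def tell_views(self, model_name, message, data):
        for view in self.links.get(model_name, []):
            view(message, data)

class CustomerModel:
    def __init__(self, name, infra, address):
        self.name = name
        self.infra = infra
        self.address = address

    def change_address(self, address):
        self.address = address
        # The model does not know how many views exist, only that there may be some.
        self.infra.tell_views(self.name, "data changed", {"address": address})

shown = []

def text_view(message, data):
    # General purpose: it works for any model that supplies an address.
    shown.append(("text", data["address"]))

def map_view(message, data):
    shown.append(("map", data["address"]))

infra = Infrastructure()
customer = CustomerModel("customer Jones", infra, "1 High Street")
infra.link("customer Jones", text_view)
infra.link("customer Jones", map_view)

customer.change_address("2 Low Road")
```

Adding a third view, or removing both, requires no change to `CustomerModel`.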

5.2.3             Model and view differences

We can best describe the differences between views and models by looking at their responsibilities.    

The view CBO is responsible for:

        Understanding what data—and in what format—is displayed in a window

        Handling user interactions (such as understanding that a push-button has been pressed, that check boxes have been checked, that data has been entered, etc)

        Providing any business logic directly associated with a window on the screen.  For example, the view would do simple edit checking, display error messages, or provide application-related dynamics beyond the possible knowledge of a GUI layout tool.  Thus if the ‘apply’ button on a ‘new installation form’ must be disabled when the customer is found to be over their credit limit, then it would be the view's responsibility to disable the button.

The model CBO is responsible for:

        Handling business logic common to several views.

        Being the place to send messages to in the UID (other CBOs should not know about views).  For example, if an order form object is interested in a specific customer object's address, it might get that address by sending the customer object an appropriate message.

        Providing other CBOs in the same UID with data or status (an order form may need to know the zip code, for example.  The order form CBO would send a request to the customer model CBO asking for the zip code—not to one of the customer's views).

        Handling garbage collection.  For example, a user might have looked at a given customer some time ago, and may have closed its views.  The underlying infrastructure may send a ‘discard yourself’ message to the customer, since it recognises that it has not been referenced for some time.  That message would be sent to the customer model, which would then decide whether or not to discard itself.  The implications of this are further discussed in Section 8.3.11.

        Handling initialisation processing (for example, when a user drags a new order form from a ‘pad’ of order forms, it would be the order form model CBO that obtains the new order number).

        Handling requests which result in an interaction with the SRD.  The model CBO is the object that knows how to ‘talk’ to the SRD(s).

        Handling consistency within the UID (for example, a business rule might be that a change to a customer address initiated from a given UID should be reflected anywhere else in that UID where the address is in a view.  In this case, the relevant model CBOs would handle things, then report to any views that a change had been made).

        An object might be a ‘background’ object, with no views or icon.  Such an object would be implemented as a model CBO.

        Handling user authorisation, by communicating (probably) with some security and/or authorisation object.

        If certain common user interface actions are disallowed (for example, making a copy), then the model object is the appropriate place for the corresponding application code.

        Handling application-related things within the user interface domain, and which are beyond the competence of a single view.  For example, if the user drags something across a customer icon (an icon is not a window, and hence does not have a view CBO), the model would be asked if a drop is OK or not. 

Model-to-model interaction

Clearly in implementing business processes, it will be useful—if not essential—for one model object to be able to ‘talk’ to another.  For example, if the user wants to enter customer details on an order form, and if we have both an order form model object and a customer model object, then we would want the order form model object to talk to the customer model object so that, for example, the order form can display customer details in its view.

Thus model objects must be able to communicate with each other, as well as with their views.  Figure 5.3 illustrates this.  It also illustrates the model talking to the ‘server’, using the same API.  We discuss this later, in Section 5.4.
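As a rough sketch of model-to-model interaction, the fragment below has an order form model ask a customer model for the details it needs.  The message names (‘QueryName’, ‘QueryZipCode’) and the `receive` method are invented for the illustration, and a direct method call stands in for the routing which the infrastructure would really provide.

```python
class CustomerModel:
    def __init__(self, name, zip_code):
        self.name = name
        self.zip_code = zip_code

    def receive(self, message, data=None):
        # The model answers requests for data or status from other CBOs.
        if message == "QueryName":
            return self.name
        if message == "QueryZipCode":
            return self.zip_code
        return None  # unknown messages are simply ignored in this sketch


class OrderFormModel:
    def __init__(self):
        self.customer_details = {}

    def attach_customer(self, customer_model):
        # Model-to-model: the request goes to the customer model,
        # never to one of the customer's views.
        self.customer_details = {
            "name": customer_model.receive("QueryName"),
            "zip": customer_model.receive("QueryZipCode"),
        }
```

The order form's own view would then obtain these details from the order form model, in the usual way.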

Figure 5.3.    Model-view and model-model interaction.

5.2.4             Merging model and view

The idea of views and models discussed so far has a number of implications which, while not rules, are useful design guidelines:

        A single model can have several views

        Any given instance of a view CBO has only one model. 

        The model knows where to get its data from.  The view will get its data from its model.

        A view cannot exist without a model

The last implication is interesting.  What about the situation where the View encompasses all of the function required, and hence the model is extremely simple?  Does that mean that we are forced to build two CBOs?  The answer is no (certainly from the programmer's viewpoint).  Where a single combined CBO is all that is required, then what is built is a single model which has lots of view behaviour.  We call this a ‘view/model’.

By the way, note also that if you were to build a large view/model CBO, with several secondary windows, and which ignored all messages from other CBOs, then you would have built something that looks very much like a traditional PC application.

5.2.5             Views and the GUI

Let us go back to views for a moment.  For we have to ask, how does the view CBO relate to the system-level GUI API (such as Windows, or Presentation Manager)?  For ease-of-programming, we can say that the view CBO programmer should not have to understand such APIs; he or she should be completely shielded from them.

The answer is to have some GUI layout tool and the Infrastructure work together to produce the window layout in such a manner that:     

        When the user does something to a window, the view CBO receives messages such as ‘Apply Button Pushed’, or ‘Menu Item “Save” Selected’—where those messages are exactly the same in format and structure as those received from other CBOs.

        The view CBO programmer can treat the window as if it were another CBO, and send it messages (such as ‘write “Joe Smith” into the field “CustName”’).
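The two points above might look something like the following from the view programmer's side.  This is a sketch under stated assumptions: the `Window` class merely records what it is told (a real window manager would turn these messages into system-level GUI calls), and the message names ‘SetField’, ‘EnableControl’, ‘ShowCustomer’ and ‘ApplyButtonPushed’ are hypothetical.

```python
class Window:
    """Hypothetical window object as seen by the view CBO."""

    def __init__(self):
        self.fields = {}
        self.enabled = {}

    def receive(self, message, **data):
        if message == "SetField":
            self.fields[data["field"]] = data["value"]
        elif message == "EnableControl":
            self.enabled[data["control"]] = data["enabled"]


class NewInstallationView:
    def __init__(self, window):
        self.window = window
        self.applied = False

    def receive(self, message, **data):
        # Messages from the window and from other CBOs arrive in the same form.
        if message == "ShowCustomer":  # sent by the model
            self.window.receive("SetField", field="CustName", value=data["name"])
            # Business logic tied to this window: no 'apply' if over the limit.
            self.window.receive(
                "EnableControl",
                control="Apply",
                enabled=not data["over_credit_limit"],
            )
        elif message == "ApplyButtonPushed":  # sent on behalf of the window
            self.applied = True
```

Note that the view never refers to the position or appearance of anything on the window, only to the names of fields and controls.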

5.3 The view ‘layout’

This is an attractive solution, as it means that the view CBO does not have to understand the layout of the window, just what fields and controls or widgets are on it.  (‘Controls’ in the context of GUI windows refers to units such as scroll bars, push buttons, check boxes, entry fields, title bars, menu bars, etc.  A ‘widget’ is the equivalent in X-windows.)  Thus if the layout is altered, the view CBO need not change.

Figure 5.4 shows how this might be implemented, where the window is an Order Form, and the View CBO is called the ‘order view’ CBO.  At build time (on the left of the diagram), a WYSIWYG (what you see is what you get) window layout tool allows the designer to design the order form window.  All windows on GUIs are driven through the system-level GUI API, and the layout tool would talk directly to that API.  The output from the layout tool is a script file (or layout definition file)—the ‘OrderView’ Layout Script.  This script file is so called because it is a human-readable ‘script’ of the window layout.  Mind you, not many humans would understand it.  By ‘human readable’, we mean that a knowledgeable human could understand it using a normal text editor to look at it.

At run time, the ‘window manager’ part of the CBO infrastructure reads this script file, and talks via the GUI API to produce the window on the screen.  Any events signalled from the GUI API to the infrastructure are filtered, and a useful subset of those events are converted to CBO messages, and sent to the view CBO (‘OrderView’ in Figure 5.4).  In turn, the view CBO sees the window as another object, and talks to it by sending messages to it.  These messages are converted by the window manager into the required system-level GUI API calls.

Figure 5.4.   View CBO and window separation.

This approach has a number of advantages in addition to those noted above:     

No code generation

The window layout tool does not generate code.  This means that a programmer using (for example) some high-level interpretive language such as REXX does not have to understand a development environment which may be quite alien to him or her just to be able to use the layout tool.

No code integration problem

Some layout tools which generate code have to deal with the problem of incorporating application logic into the generated code.  Not generating code avoids this problem altogether.

Late binding of layout and view CBO

The layout script and the view CBO can be associated—or ‘bound’—at run-time, thus allowing great flexibility during development (and maybe for the user as well—see Part Five).  Run-time association means that we can, if we wish, move towards the idea of several layouts having the same view CBO.  Again, since the view CBO is an object, we could place significant behaviour for different categories of windows into superclasses. (See Appendix 1 for a description of superclasses.)  Thus one might consider a ‘Transaction View’ superclass, complete with an ‘apply’ button, and the logic to handle delays when a message is sent to the SRD via the model as a result of the user pressing the ‘apply’ button.  Such common logic might include (for example) greying out the Apply button until a response is received from the SRD.[20]

Layout change without code change

Separation of layout and the view CBO code means that the layout can be changed without touching the code in the view CBO.  Association of the two can be done through an external control file, or via some run-time utility CBO.

Dynamic layout creation

Although for simplicity Figure 5.4 shows the layout script being read in by the window manager, in fact it would be read in by a view superclass which would be provided by the CBO Infrastructure.  This enables a view to build a layout dynamically at run-time if that's what's required.

To illustrate this, suppose there's a superclass called SuperView (not shown in Figure 5.4), and what we are building is ‘OrderView’.  When the window manager detects that a window must be drawn (for example, the user double-clicks on the customer icon, meaning ‘please show me a view of this object’), it will send a ‘QueryLayout’ message to ‘OrderView’.

Normally, the ‘OrderView’ CBO will ignore this message, and it would be passed to SuperView which would read the layout script file into memory, and return the layout script to the window manager.  However, ‘OrderView’ could over-ride the ‘QueryLayout’ message, build its own layout, and return it to the window manager.

This design allows layouts to be generated either by an external layout tool, or dynamically by a view CBO itself.
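The ‘QueryLayout’ arrangement can be sketched using ordinary class inheritance.  Everything here beyond the names SuperView, OrderView and QueryLayout is an assumption: the `on_` naming convention for message handlers, the one-line-per-control script format, and the `demo` function which writes a throwaway script file so that the sketch can be run as it stands.

```python
import tempfile
from pathlib import Path


class SuperView:
    """Hypothetical view superclass supplied by the infrastructure."""

    def __init__(self, layout_file=None):
        self.layout_file = layout_file

    def receive(self, message):
        # Dispatch to a handler if this class, or a superclass, provides one.
        handler = getattr(self, "on_" + message, None)
        return handler() if handler else None

    def on_QueryLayout(self):
        # Default behaviour: read the layout script produced by the layout tool.
        return Path(self.layout_file).read_text(encoding="utf-8")


class OrderView(SuperView):
    pass  # ignores QueryLayout, so SuperView handles it


class DynamicOrderView(SuperView):
    def on_QueryLayout(self):
        # Over-ride: build the layout at run-time instead of reading a file.
        return "\n".join("ENTRYFIELD " + name for name in ("CustName", "OrderNo"))


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "orderview.layout"
        script.write_text("WINDOW OrderForm\nENTRYFIELD CustName\n", encoding="utf-8")
        from_file = OrderView(script).receive("QueryLayout")
    built = DynamicOrderView().receive("QueryLayout")
    return from_file, built
```

Either way, the window manager receives a layout script in reply to the same message, and need not know how it was produced.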

Direct access to the GUI API

Figure 5.4 shows a dotted line from the ‘OrderView’ CBO, across the system GUI API, direct to the window.  This illustrates that the ‘power’ programmer (for example, a C programmer skilled in the use of the System GUI API) is not locked out from doing advanced GUI work directly.

Thus although the average application programmer need know nothing about the system-level GUI API, the knowledgeable programmer can use it if required.

Dynamic addition of new controls

The window manager component of the CBO Infrastructure can be built such that new GUI controls (for example, a special kind of entry field which only allows entry of properly-formatted part numbers—or a pink polka-dotted blinking blob—or whatever) can be added without changes to the infrastructure.  This is hugely simplified if controls are treated as objects in their own right.
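One way of picturing this is a window manager which keeps a table of control types, each of which is a class.  The sketch below assumes a layout script of ‘keyword name’ lines and a part number format of two letters, a hyphen and four digits; both are invented purely for the illustration.

```python
import re


class EntryField:
    """Plain entry field: accepts any text."""

    def __init__(self, name):
        self.name = name
        self.value = ""

    def accepts(self, text):
        return True

    def receive(self, message, text=""):
        if message == "SetText" and self.accepts(text):
            self.value = text
            return True
        return False


class PartNumberField(EntryField):
    """Accepts only text such as 'AB-1234' (format assumed for illustration)."""

    PATTERN = re.compile(r"[A-Z]{2}-\d{4}")

    def accepts(self, text):
        return self.PATTERN.fullmatch(text) is not None


class WindowManager:
    def __init__(self):
        self.control_types = {"ENTRYFIELD": EntryField}

    def register_control(self, keyword, control_class):
        # A new kind of control is added to the table; no other code changes.
        self.control_types[keyword] = control_class

    def build(self, layout_script):
        controls = {}
        for line in layout_script.splitlines():
            if not line.strip():
                continue
            keyword, name = line.split()
            controls[name] = self.control_types[keyword](name)
        return controls
```

Because the new control inherits the message handling of an ordinary entry field, a view can talk to it without knowing that it is anything special.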

 

5.4 Connecting to the server

In addition to model-view and model-model interaction, we also need a way for models to communicate with the server.  For ease of programming, we'd like to use the same messaging model and API for all requests, whether local or remote (as indicated in Figure 5.3).

In order to do this, we need to have the CBO infrastructure handle the low-level communications APIs.  In this way, we achieve something rather attractive, in that the programmer of a model sees the server as just another CBO—with the infrastructure handling the requirement of getting over a task boundary.

Of course, the ‘server object’ here is not the server at all.  It is something which adapts an object message sent to it into some form of communications code.  This communications code routes the message to the real server.  We call this object an ‘adapter’.  The model CBO, however, sees it as the real server. 

Ideally we'd like that routing to be done by the infrastructure code.  Given that we will use messaging as the vehicle for models, views and servers to ‘talk’ to one another, let's have a closer look at the adapter.

 

The ‘adapter’ object

The key point about an adapter is that the application programmer never sees it.  The programmer of the model sends a message to what he or she sees as the server.  However, the CBO infrastructure routes this message to the adapter, as shown in Figure 5.5.

The main characteristics of an adapter are:

        It appears to a model object as just another CBO

        It provides for getting over the task or process boundary (without programmer knowledge)

        It routes a request from a model to the server (using some system-level communication technology such as APPC, RPC, the CICS/OS2 ECI, Named Pipes, TCP/IP etc.)

        It routes responses from the server back to the requesting model object

        It is not written by the application programmer

        The adapter may itself be a CBO (which neither the user nor the application programmer ever sees).  That is, it can be an object, and can benefit from the usual inheritance mechanisms of OO.  For example, there may be a general-purpose adapter which handles the multi-tasking or multi-processing aspects.  A subclass of this might filter messages, such that only valid messages are sent to specific servers.

Figure 5.5.    Model—adapter—server.  


The adapter is built as a ‘hybrid’ object—displaying object behaviour and appearance to Model objects, but driving procedural session-level communications code on its ‘other side’.  The essential idea here is that a message sent by a model to a ‘remote object’ such as, say, a ‘customer database’ server (which might actually be a CICS transaction), is mapped to a send/receive transaction by the adapter.  This process would normally have the general shape shown by Figure 5.5 (which illustrates the asynchronous case).
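The ‘hybrid’ nature of the adapter can be sketched as follows.  This is a simplified, synchronous illustration, not the design of any real infrastructure: a plain function (`fake_customer_server`) stands in for the real server and its communications link, JSON stands in for whatever wire format is used, and the router, the message name ‘QueryAddress’ and the return codes are all hypothetical.

```python
import json


def fake_customer_server(wire_request):
    """Stand-in for the real server (for example, a transaction program)."""
    request = json.loads(wire_request.decode("utf-8"))
    addresses = {"Jones": "1 High Street"}
    name = request["data"]["name"]
    if request["message"] == "QueryAddress" and name in addresses:
        reply = {"rc": 0, "address": addresses[name]}
    else:
        reply = {"rc": 8}
    return json.dumps(reply).encode("utf-8")


class Adapter:
    """Object behaviour on one side, procedural send/receive on the other."""

    def __init__(self, transmit):
        self.transmit = transmit  # callable: bytes in, bytes out

    def receive(self, message, data):
        wire_request = json.dumps({"message": message, "data": data}).encode("utf-8")
        wire_reply = self.transmit(wire_request)  # communications send/receive
        return json.loads(wire_reply.decode("utf-8"))


class Router:
    """Hypothetical infrastructure routing function."""

    def __init__(self):
        self.cbos = {}

    def register(self, name, cbo):
        self.cbos[name] = cbo

    def send(self, target, message, data):
        return self.cbos[target].receive(message, data)


class CustomerModel:
    def __init__(self, router, name):
        self.router = router
        self.name = name

    def query_address(self):
        # The model programmer addresses 'the server'; the adapter is invisible.
        reply = self.router.send("CustomerDatabase", "QueryAddress", {"name": self.name})
        return reply.get("address")
```

The model's code would be unchanged if ‘CustomerDatabase’ were registered as an ordinary local CBO instead of an adapter, which is the local/remote transparency we are after.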

‘Xmit’ (transmit) in Figure 5.5 indicates a communications send function (such as issuing a CPI-C (common programming interface for communications) ‘CMSEND’ verb)—as opposed to sending the sort of message implemented for models and views.  (There could be several transmits from the server; and of course, in the communications code task there may be a loop round the receive.  It is also sometimes useful to separate the communications send and receive into separate modules; this option is not shown in the figure.)

Finally, the adapter must maintain the semantics of the two ways of issuing messages from a CBO—send and post—as follows:

1.      Send     In this case, to allow for the situation shown in Figure 4.15, the infrastructure must arrange things such that although the message is synchronous with respect to the sending CBO, the sending CBO must not block.  This is regardless of whether the communications link is synchronous or asynchronous.  In addition, the data returned must be available at the next sequential instruction, even though it might have been returned across a network.  In this way, the programmer's model of what ‘send’ means, and what it does, is maintained regardless of whether the message is sent to a local or remote CBO.

2.      Post     The semantics of a ‘post’ require a return of control from the adapter as soon as it has initiated its communication to the server, even though the communications mechanism might be synchronous (such as some terminal emulations, or some RPC implementations).  The adapter must also catch the response, and route it back to the requesting CBO as another message.
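One possible reading of these two semantics, expressed with Python's asyncio library, is sketched below.  It is an interpretation rather than a description of how any CBO infrastructure was built.  Here ‘send’ is a coroutine: the reply is available at the next sequential statement of the sender, yet other work continues while the request is in flight.  ‘Post’ returns at once, and the response arrives later as a separate ‘Response’ message.  The `fake_server` function, the `Model` class and the message name are assumptions.

```python
import asyncio


async def fake_server(request):
    await asyncio.sleep(0.01)  # pretend network and server delay
    return {"customer": request["customer"], "zip": "12345"}


class Adapter:
    def __init__(self, transport):
        self.transport = transport  # async callable: request -> response

    async def send(self, request):
        # Synchronous for the sender: the data is there at the next statement.
        # Awaiting yields control, so the rest of the system is not blocked.
        return await self.transport(request)

    def post(self, request, reply_to):
        # Control returns as soon as the communication has been initiated.
        async def deliver():
            response = await self.transport(request)
            reply_to.receive("Response", response)  # routed back as a message

        return asyncio.get_running_loop().create_task(deliver())


class Model:
    def __init__(self):
        self.inbox = []

    def receive(self, message, data):
        self.inbox.append((message, data))


async def demo():
    adapter = Adapter(fake_server)
    model = Model()
    task = adapter.post({"customer": "Smith"}, reply_to=model)
    nothing_received_yet = model.inbox == []
    reply = await adapter.send({"customer": "Jones"})
    await task  # let the posted request complete before finishing
    return nothing_received_yet, reply, model.inbox
```

Running `asyncio.run(demo())` shows the posted request returning control before any response exists, while the sent request delivers its reply in line.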

Messages are routed from the model object to the adapter by a routing function in the infrastructure code.  It's worth noting that there are really two different routers at work here:

1.      The message router which picks up a message sent from a CBO and routes it directly to another CBO in the same system—for example from the model to the adapter (or to another model)—and from the adapter back to the model

2.      The piece of communications code which sends the message to another system and catches the response(s).

If the receiving end is a CBO, rather than a traditional transaction program (say), then of course the CBO infrastructure must provide a ‘listening’ function, and also an initiation function such that the receiving CBO can be identified and (if not already loaded) loaded into memory.  The incoming message can then be routed correctly to the receiving CBO.


There are also other considerations here:  Suppose messages for (say) 20 remote objects are all to be carried on the same APPC LU-LU link.  A way must be provided for routing messages to any of the 20 remote objects to the same adapter, which should handle multiple sessions concurrently (assuming an independent logical unit (LU)), each in a separate task.  Thus a form of ‘aliasing’ is required (preferably table-driven).
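A table-driven form of aliasing might be sketched as below.  The table contents, the adapter name ‘HostLinkAdapter’ and the `receive` signature are hypothetical, and the sketch shows only the name resolution; the handling of concurrent sessions in separate tasks is not attempted here.

```python
# Hypothetical alias table: several remote objects share one adapter.
ALIAS_TABLE = {
    "CustomerDatabase": "HostLinkAdapter",
    "StockDatabase": "HostLinkAdapter",
    "PriceList": "HostLinkAdapter",
}


class HostLinkAdapter:
    def __init__(self):
        self.sent = []

    def receive(self, target, message, data):
        # The original target name travels with the message, so that the far
        # end can identify the receiving object.
        self.sent.append((target, message, data))
        return len(self.sent)


class Router:
    def __init__(self, aliases, cbos):
        self.aliases = aliases
        self.cbos = cbos  # local name -> local CBO

    def send(self, target, message, data=None):
        # Names not in the table are taken to be local CBOs.
        local_name = self.aliases.get(target, target)
        return self.cbos[local_name].receive(target, message, data)
```

Adding a twenty-first remote object then means adding one line to the table, with no change to the adapter or to any model.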

 

5.5 Summary

In this chapter we have explored further the user interface domain—the area in which the user presentation logic, and that business logic which assists a single user, reside.

We determined that it is useful to create two separate types of CBO to handle the two areas of programming logic—a view CBO (user presentation logic) and a model CBO (user business logic).

Each model object may have several views, allowing the user to view an object in several different ways.  Where a single view only is required, and where the application code is simple, then the model and view may be combined into a single CBO (the ‘view/model’).

We discussed how a view CBO could communicate with a window on the GUI via a ‘layout’ (a definition of the window layout), and how a model could communicate with a server via an adapter object in such a way as to provide the programmer with local/remote transparency.

Throughout, a central theme has been making things easy for the application programmer, and we have defined the functions required of a software infrastructure (middleware) to achieve this.  Such a definition is not academic; it has been proven in practice.  The infrastructure defined, and the structure of CBOs, is designed specifically to make the production of cooperative application code much easier than it has been previously.

 


6              The shared resource domain

 

In this chapter, we examine the structure of the shared resource domain (SRD)—the Server end of a client/server system.  The context for discussion is that of core business systems (mission-critical systems), where the server contains transaction processing and database management software.  The transaction processor and database between them are assumed to provide an environment within which concurrent transactions can each be either entirely completed (and any changes made to shared resources committed), or, in the case of a problem, can be rolled back (and any changes backed out so that resources are restored to their pre-transaction state).[21]

First, we discuss the nature of the SRD.  We then define the design points for the SRD.  The most important of these is that, to protect resource integrity, a single request to the SRD should be entirely handled within one interaction, so that the response to the requester (typically a UID) reflects committed work, or some error (implying that no changes have been made).

An overview of the SRD structure is then presented, in terms of traditional transaction programs.  Just as the UID consists of a number of different elements, so the SRD consists of:

        A front-end which shields the programmer from underlying communications complexities.

        Focus modules, which focus on the commit scope, such that a transaction is handled properly.

        Entity modules which manage a specific logical entity.

        Resource modules which manage a specific resource (these are very often combined with entity modules).

Each of these components is then discussed. 

Finally, building the SRD with CBOs is discussed.  Having CBOs in the SRD raises the question of the concurrent object (an object through which multiple transactions run concurrently), and this—in the context of CBOs—is discussed.


We find that CBOs can indeed be used profitably at the server end of a client/server system (though for transaction processing systems, some technical questions are still outstanding).  We also find that there is a distinct difference between the user interface area on the one hand, and the area where shared resource integrity is essential on the other.  This leads to a specific set of design guidelines which ease considerably questions such as ‘where does the function go?’

6.1 The nature of the SRD

A high-level view of the cooperative processing model developed in previous chapters is shown in Figure 6.1, where each box is application-level code, the lines linking them are messages, and the ‘task boundary’ shows where messages must be asynchronous.

At a high level of abstraction, this model is one of PCs connected to systems that manage shared resources.  The view and model are to do with the user interface, and have nothing to do with accessing shared resources.  The ‘server’ is the code which accesses shared resources (such as a database).  Often the ‘server’ will be on a mainframe or a LAN server.  However, even in the case of a distributed DBMS with APIs available on the PC, this design still applies (but in this case the ‘server’ might be located on the PC).

With this separation, we can talk about the ‘user interface domain’ on the one hand, and the ‘server domain’ on the other.  However, using the term ‘server’ here now becomes difficult, as it can easily be confused with the same term used elsewhere (e.g. ‘client/server’, ‘file server’).  What we are really talking about is the ‘shared resource domain’, or SRD (introduced in Section 4.3.2).  In the SRD, shared resources are accessed by application-level code (the ‘server’ module in Figure 6.1), and managed by system-level resource management code (such as is found in IBM's CICS and DB2 products).

Now, an over-riding imperative in the design of any system which provides access to shared resources is to ensure resource integrity.  In general, this means using the available capabilities of the system-level resource manager, and conforming to any constraints it might impose.  (We do not want to write our own two-phase commit function!).  From the point of view of the cooperative processing designer, there are two important aspects of these capabilities which must be taken into account.  These are:

 

Figure 6.1.    General model of cooperative processing.

        The scope of the resource manager (single system, distributed, etc)—that is, the scope over which it can manage multiple concurrent and recoverable units of work.

        The commit scope as seen by the programmer (not always the same as the scope of the resource manager).

6.1.1             Resource manager scope

A good example of what we mean by ‘shared resource’ is a database accessed concurrently by more than one user, where a system resource manager provides concurrency control, backup and commit function, perhaps logging, etc.  IBM's CICS and DB2 products are both good examples of resource managers.

The essence of a resource manager is that it keeps resources secure.  This means not only that there is adequate backup and recovery, but also that resources are kept in a consistent state when errors occur during changes.  This means that database changes are either rolled back to the previous consistent state, or they are committed.  In transaction processing systems, the unit of consistency is typically the transaction.  The scope of the commit may be a single database, several databases (not necessarily in the same machine if there is a distributed DBMS), or both databases and some other resource (such as a communications link).  It is the resource manager provided by system software which defines the commit scope.

For simplicity, we shall in general assume that the shared resource is always a database.  This does not affect the essential argument.

Concurrency is typically controlled through serialisation of changes by the underlying resource manager.  Application-level code should not see this serialisation.  The first transaction to request the start of a commit scope (sometimes done simply by virtue of the transaction starting) will lock the resources required by that transaction, and commit (again, sometimes automatically) at the end of the transaction.  While the database is being changed, it is locked to other transactions (or at least the records or rows in question are locked against changes by other transactions).  These other transactions are (normally) queued, or serialised by the resource manager.

The scope of the resources controlled in this way by a resource manager we will call the resource manager domain.

Thus when considering DBMSs (database management systems), the resource manager domain is the area which the DBMS controls.  If the DBMS is distributed, then this area may extend over several physically separate systems.

However, there is an additional constraint on commit scope as it applies to the designer of application code.  Consider a single business process which requires two databases to be updated, one on machine A and one on machine B, where both A and B are managed by a single (distributed) DBMS.

This operation could potentially be written in two ways:

        A program could be written on Machine A which updates data on both machine A and B.

        A program could be written on machine A which updates data on A, then calls a program on B to update data on B.

Many distributed DBMSs limit the effective commit scope to a single program—or a transaction which runs (as far as the application code is concerned) on a single machine—even though that single program can, through the DBMS, update data on several physically separate databases.  In this case, to ensure data integrity, only the first option is available to the application designer.  The second option is only possible where the DBMS (or DBMS and transaction processor working together) allows code in several different machines to participate in a single controlled transaction.

Thus although a DBMS may give the programmer access to data on several machines (the resource manager domain), it may impose limits on where he can place his code to access that data.  The area within which he can place this code such that a recoverable unit of work can be initiated, run and completed, we call the ‘shared resource domain’ (or SRD). 

If a transaction (a ‘commit scope’) is further limited on some system to a single process or address space, and that machine can run multiple processes concurrently, then there would be multiple SRDs on that one machine.

However, there is an additional constraint on commit scope as it applies to the designer of cooperative ‘server’ application code.  Commit scope may be limited to a single transaction.  This means it is limited to the application-level code which runs in that transaction, even though the same database may be accessed concurrently by many other transactions.  Sometimes a distributed DBMS will allow application code in several different machines to participate in the commit scope; but the start and end of the commit scope must always be issued from the same program—which runs in a single given address space (partition, job, process, task, etc).  The important thing for our purposes is the effective commit scope as available to the application-level programmer from the point of view of his or her application code.  This scope defines the boundaries of what we call the ‘shared resource domain’—or ‘SRD’—as opposed to the resource manager domain.[22]

Here is an example to illustrate the concept of ‘shared resource domain’.  A certain computer system provides a single (non-distributed) database resource manager, accessible by all concurrently-running programs.  We would say that the resource manager domain is bounded by that system.  Further, each program runs in a different process (address space, partition, task, job, etc.).  Now assume a system-imposed rule: that, for any given unit of work (transaction) to be recoverable, all database accesses within that unit of work must be issued from a single process or address space (that is, from a single program).

In this example, the resource manager domain is the entire system.  The SRD, on the other hand, is a single process—or application-level code running within a single address space. 

This would mean that the designer would have to ensure that all server code handling a given recoverable transaction must run within a single address space.

Another way of putting this is that the resource manager domain is the controllable commit scope as seen by the designer of the resource manager, while the shared resource domain is the commit scope provided by the resource manager domain, but as seen by the application programmer of a single transaction.[23]

6.2 SRD design points

The requirements for on-line transaction processing (OLTP) are perhaps more stringent than for most other forms of cross-system connection.  We will therefore consider the design of code within the SRD with OLTP in mind.  Requirements on SRD design from less stringent environments we find are often a subset of the OLTP requirements.

In designing SRD application code, the following design points should be taken into consideration:

1.      ‘One-shot’ interactions     The first and most important rule for the SRD is this: An incoming message is treated as a single unit of work by the SRD.  This rule enables commit processing in the SRD to be properly completed.  By ‘commit processing’ is meant the ability of the SRD either to update resources and commit those changes, or roll back changes to their state prior to the message being received.

There should always be at least one response message sent back to the requester.  This will contain a return code indicating the success of the request, plus (maybe) some data.  When a large amount of data must be returned, there may be several response messages sent back to the requester.

Experience in the client/server area suggests that even when an OLTP can manage multiple interactions within a single commit scope, design and programming are made much easier by implementing this ‘one-shot’ approach.  In addition, the binding between the SRD and UID is made looser.

2.      Messaging     We adopt the messaging model for interaction with the SRD.

3.      No ‘dialog’ state information     Servers are atomic; they do not retain knowledge of any user conversation, dialogue sequence, or data, between invocations.  This implies that the SRD never holds dialogues with requesters.  However, it does not mean that the SRD is always ‘passive’.  It means that it is passive with respect to a requester (which might be a UID or another SRD).  An SRD can itself send messages to other SRDs—and to UIDs.  Some of these situations are discussed in Chapter 10.

These design points may seem somewhat restrictive.  However, they are not hard and fast rules—they are an extremely useful set of ‘rules of thumb’.  They can be broken—if the designer knows exactly what he or she is doing.
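The ‘one-shot’ design point can be illustrated with a small sketch using Python's standard sqlite3 module as a stand-in for the resource manager.  The tables, the shape of the request and the return codes (0 for success, 8 for failure) are assumptions made for the example.  Each incoming message is handled as a single unit of work, which is either committed or rolled back before the response is returned, and nothing is remembered between invocations.

```python
import sqlite3


def make_db():
    """Create a small in-memory database for the illustration."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE stock (part TEXT PRIMARY KEY, qty INTEGER)")
    conn.execute("CREATE TABLE orders (part TEXT, qty INTEGER)")
    conn.execute("INSERT INTO stock VALUES ('P100', 5)")
    conn.commit()
    return conn


def handle_request(conn, request):
    """Treat one incoming message as one unit of work."""
    try:
        with conn:  # commits on normal exit, rolls back if an exception escapes
            conn.execute(
                "UPDATE stock SET qty = qty - ? WHERE part = ?",
                (request["qty"], request["part"]),
            )
            row = conn.execute(
                "SELECT qty FROM stock WHERE part = ?", (request["part"],)
            ).fetchone()
            if row is None or row[0] < 0:
                raise ValueError("insufficient stock")
            conn.execute(
                "INSERT INTO orders (part, qty) VALUES (?, ?)",
                (request["part"], request["qty"]),
            )
        return {"rc": 0}  # the response reflects committed work
    except ValueError as exc:
        return {"rc": 8, "error": str(exc)}  # no changes have been made
```

With five items in stock, a first request for three succeeds; a second request for three is rejected, and the stock level remains at two because the partial update has been rolled back.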

Finally, it's worth noting that the SRD can be accessed from a different kind of UID: one which knows nothing about PCs, but implements dialogues for character-based terminals.  This is especially useful where both PCs and older non-programmable terminals (dependent on a mainframe computer) must be supported by the same SRD code.  In that case, a separate UID for the dependent terminals would be built.  Both PC UIDs and character-based terminal UIDs would request services of the SRD application code.  In this way, SRD code can be used for both client/server and centralised environments.

6.3 Overview of SRD structure

In Figure 5.5, the server (the SRD) was shown as a single module.  However, we need a little more than shown there.  In particular, we do not want the application programmer to have to write communications code.  Thus we provide a ‘front-end’, which handles the communications code and various other related matters.  This front-end will be an essential part of the client/server software infrastructure.

The server end of the client/server link, then, is as shown in Figure 6.2 (which also shows a model object in a UID for completeness).  In this figure, the server of Figure 5.5 has been split into a ‘focus’ part and an ‘entity’ part.  (Terminology in this area is developing.  The term ‘focus’ derives from the BSDM system development method.  Other terms sometimes used are ‘control’ and ‘business process’.  The ‘entity’ is also sometimes referred to as the ‘CRUD’, for ‘create, read, update, delete’, a term first used in E/R modelling, where processes are mapped to entities in a CRUD matrix.)

The focus provides the commit scope for an SRD transaction.  It is the ‘focus’ of the transaction, and will reflect a defined atomic process against some entity or group of entities.  The idea of the entity is to have a single place where handling of a given database (or perhaps a given logical view of several tables) is done.  Other code, if it wants to access some piece of data, would make a request of the relevant entity module.

 

Figure 6.2.    SRD structure.

Often the SRD requirements will be handled by a single (coalesced) focus and entity.  We might call this the server(!).  This is reflected in Figure 6.2, where both the focus and entity are seen issuing I/O against a shared resource (shown by EXEC SQL).

The general idea with this model is that the programmer should only have to build the view(s), the model, the focus and the entity.  In practice, we find that the SRD parts can generally be built independently from the UID elements.  The adapter, the communications code and the server front-end are part of the underlying infrastructure, and should not be seen by the application programmer.

Figure 6.2 shows a messaging connection between the focus and the entity.  In the general case, such a link is required if SRD components are to be the clients to other SRDs (which may themselves be on yet another system). 

An important implementation point here is the precise nature of this link.  It should be designed so as to allow both synchronous and asynchronous messages—and also to allow multiple responses (messages sent back) to a single message.  It should also be designed to use the same messaging API as at the PC end—at least at the programming concept level.  Finally, it must also be designed such that the commit scope is retained (that is, all messages between server components for a given request must be in a single SRD).

An example of how this could look to the application programmer is shown in Figure 6.3.   As can be seen, each of the three modules addresses application-related logic (rather than GUI or communications details).  This simplicity derives from two key aspects of the model:

        Use of code structured for event-loop processing.

        Availability of both synchronous and asynchronous messaging.

Clearly, to enable this level of simplicity, our infrastructure code must provide a wealth of function.  But it is not until we have defined the user's view—that is, what the average programmer would like to be able to do to handle both advanced user interfaces and cross-system cooperative processing—that we can fully define its characteristics.

Figure 6.3.    Programming simplicity.

Finally, since in essence our model defines a general easy-to-program messaging mechanism between independently-developed event-loop software modules, it can address many different systems scenarios, including:

        Stand-alone PC.

        Stand-alone mainframe (with non-programmable terminals).

        Interconnected systems where messages flow freely between systems.

 

6.4 SRD components

Experience suggests that the design of server code is made considerably easier when an object analysis and design approach is taken.  This is so for both CBO design and non-CBO design.  For this reason, the SRD components discussed in this section are referred to generically as ‘modules’.  Later, we will consider CBOs in the SRD (Section 6.5).

 

6.4.1             The ‘front-end’

The major responsibility of the SRD ‘front-end’ is to hide communications coding and protocol from the programmer.  It receives incoming requests (as messages), routes them to their destination (a server), and routes responses back to the requester.  As such, this is part of the SRD infrastructure rather than part of SRD application-level code.

The front-end will also handle such things as:

        Invoking one or more servers, in the correct sequence

        Code page translation (e.g. ASCII-to-EBCDIC)

        Access authorisation

        Version levels (where a system might have, at any one time, two or more versions of PC and/or server application code).

In invoking the appropriate server(s), the front-end may use a table, or may assemble a set of requests dynamically from the incoming message.  From a design point of view, it seems better that the incoming requests be for a single service.  The front-end may then decide to invoke several servers to process that request (within a single transaction).  This approach better encapsulates the SRD as seen by requesters.

Tibbetts and Bernstein (1992) describe the front-end as a ‘function dispatcher’, and describe its function in some detail. 
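The dispatching role described above might be sketched like this.  The sketch is illustrative only and is not taken from Tibbetts and Bernstein: the routing table `ROUTES`, the authorisation table `AUTHORISED`, the return codes 12 and 16, the two server functions and the choice of EBCDIC code page 037 are all assumptions.  The requester asks for a single service; the front-end checks authorisation, selects servers by service name and version level, translates the code page, and invokes the servers in sequence.

```python
def to_host_code_page(text_bytes):
    """Translate ASCII bytes from the PC into EBCDIC (code page 037 assumed)."""
    return text_bytes.decode("ascii").encode("cp037")


def read_customer(body):
    # Hypothetical server; 'body' arrives in the host code page.
    return {"customer": body.decode("cp037")}


def check_credit(body):
    # Hypothetical second server, added at version level 2 of the service.
    return {"credit_ok": True}


# (service name, version level) -> servers to invoke, in the correct sequence.
ROUTES = {
    ("read customer", 1): [read_customer],
    ("read customer", 2): [read_customer, check_credit],
}
AUTHORISED = {"clerk": {"read customer"}}


def front_end(message):
    """Table-driven dispatcher: one incoming service, one or more servers."""
    if message["service"] not in AUTHORISED.get(message["user"], set()):
        return {"rc": 12, "data": {}}          # access not authorised
    servers = ROUTES.get((message["service"], message["version"]))
    if servers is None:
        return {"rc": 16, "data": {}}          # unknown service or version level
    body = to_host_code_page(message["body"])
    data = {}
    for server in servers:
        data.update(server(body))
    return {"rc": 0, "data": data}


request = {"service": "read customer", "version": 2, "user": "clerk", "body": b"A1234"}
reply = front_end(request)
```

Neither server contains any communications, authorisation or translation code, which is the point of placing those concerns in the front-end.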

Finally, it is worth noting that the front-end code may be more or less complex depending on the system environment.  For example, if there is a transaction processing system in the server system, then it will be considerably simpler than if there is none.

 

6.4.2             The ‘focus’

The ‘focus’ module is responsible for ensuring the integrity of all shared resources involved in a single request from a client (that is, from a UID or perhaps from another SRD).  The focus knows where shared resources are located, and manages the commit scope (assuming, of course, an underlying commit manager such as is found in CICS and/or DB2). 

To clients, the focus is the server, and as such does not retain state information about clients outside of a single client request.  Therefore a client cannot have a dialogue with a server. 

The focus encapsulates a given atomic business process (such as ‘add a new order’).  This should not be confused with a business process as seen by a user.  The kind of ‘process’ implemented by a focus module is one where a given single entity is being changed, but as a result of that change, business rules require that one or more other entities also need to be changed.

Consider, for example, an incoming message requesting the creation of a customer order.  Clearly data must be written to the orders database to reflect the new customer order.  But it is normally the case that business rules require other data to change because of the addition of a new order.  And these other changes must be done within a single commit scope.  The ‘process’ encapsulated by the focus might then look like this:

        Update the customer balance outstanding.

        Create the new order (order header and order detail lines).

        For each item ordered, update that item's stock-on-hand.

Often a single transaction would do all of these.  But it's sometimes useful to separate out the individual data accesses into separate modules.  Then they can potentially be re-used by other focus modules.  Data is often organised in a way which reflects an Entity Analysis; for this reason, the separate modules are called ‘entity’ modules.

To handle our example, then, we might build an ‘order’ focus module which would use several Entity modules—the customer entity, the order entity, and the item entity.  The order focus module would provide the start and end of the commit scope.

The main characteristics of a focus module are:

        Encompass the commit scope for the entire server request.

        Understand which resources are to be accessed, and how to access them.

        Package up the response to be returned to the client, and return it (to the infrastructure, for the infrastructure to deliver).[24]

        May handle referential integrity (if not handled elsewhere)

        Apply any business rules that apply to the transaction as a whole.

When a focus does not have to apply business rules, the focus can be seen as a general-purpose ‘transaction shell’, possibly combined with the front-end.  In this case, the client must compose a ‘package’ of requests, which are unpacked by the focus and sent to the appropriate entity modules, all within a single transaction.
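The order example can be sketched as one focus module and three entity modules.  This is a minimal illustration using Python's `sqlite3` module as a stand-in for the resource manager; the table layouts, function names and return codes are assumptions made for the example.  The focus provides the start and end of the commit scope (the `with conn:` block commits if every entity call succeeds and rolls back if any of them raises), while each entity is the single place where its data is touched.

```python
import sqlite3


def make_db():
    """Hypothetical schema with one customer and one stocked item."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customer (id TEXT PRIMARY KEY, balance INTEGER);
        CREATE TABLE item (id TEXT PRIMARY KEY, on_hand INTEGER);
        CREATE TABLE order_header (id TEXT PRIMARY KEY, customer TEXT);
        CREATE TABLE order_detail (order_id TEXT, item TEXT, qty INTEGER);
        INSERT INTO customer VALUES ('C200', 0);
        INSERT INTO item VALUES ('X123', 5);
    """)
    return conn


# Entity modules: no commit logic, only access to their own data.
def customer_add_to_balance(conn, customer_id, amount):
    cur = conn.execute(
        "UPDATE customer SET balance = balance + ? WHERE id = ?", (amount, customer_id))
    if cur.rowcount != 1:
        raise LookupError(f"unknown customer {customer_id}")


def order_create(conn, order_id, customer_id, lines):
    conn.execute("INSERT INTO order_header VALUES (?, ?)", (order_id, customer_id))
    conn.executemany(
        "INSERT INTO order_detail VALUES (?, ?, ?)",
        [(order_id, item, qty) for item, qty, _price in lines])


def item_decrement_stock(conn, item_id, qty):
    cur = conn.execute(
        "UPDATE item SET on_hand = on_hand - ? WHERE id = ? AND on_hand >= ?",
        (qty, item_id, qty))
    if cur.rowcount != 1:
        raise LookupError(f"insufficient stock or unknown item {item_id}")


def order_focus(conn, request):
    """Focus module: owns the commit scope for 'add a new order'."""
    lines = request["lines"]                      # (item, quantity, unit price)
    total = sum(qty * price for _item, qty, price in lines)
    try:
        with conn:                                # commit on success, else roll back
            customer_add_to_balance(conn, request["customer"], total)
            order_create(conn, request["order"], request["customer"], lines)
            for item, qty, _price in lines:
                item_decrement_stock(conn, item, qty)
    except (LookupError, sqlite3.Error) as err:
        return {"rc": 8, "data": str(err)}
    return {"rc": 0, "data": {"order": request["order"], "total": total}}


conn = make_db()
good = order_focus(conn, {"order": "A100", "customer": "C200", "lines": [("X123", 2, 10)]})
bad = order_focus(conn, {"order": "A101", "customer": "C200", "lines": [("X123", 10, 10)]})
```

In the second request the customer balance and the order rows are written before the stock update fails, yet none of those changes survives, because all three entity calls ran inside the focus module's single commit scope.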

 

6.4.3             The ‘entity’

The entity module is responsible for accessing some specific instance of a shared resource, such as a given data table.  Typically it will act as a server to one or more focus CBOs.  One entity CBO might handle a single physical table, a logical view (i.e. a DB view), or perhaps several tables where each table is really part of a whole (for example, a ‘supplier order’ entity might manage both a supplier order header table and a supplier order detail table).

Entity modules handle specific data entities in databases (e.g. ‘customer’, ‘account’, ‘order header’).  For some incoming messages (e.g. ‘read customer record for customer number A1234’), the entity modules will provide all of the server function needed; in that case, the single entity module encompasses both focus and entity roles.

Entity modules can be re-used in various different business process areas.  This is really an extension into application-level code of the idea of having a single database, accessible to all applications which need that data.

Some of the characteristics of entity modules are:

        Referential integrity—if not handled by the DBMS—will be handled in the entity modules.

        Unless we can be sure that incoming messages are always received from a secure and trusted environment, then entity modules will also re-run key business rules, to ensure not only update integrity, but also business integrity.

        Entity modules may invoke other entity modules (e.g. an order entity may invoke order header and order details entities—assuming that the latter are required as independent entities for some reason).

        Entity modules may also be invoked as servers in their own right—for example, a customer entity module may be invoked from a UID to handle an update.  In this case, the entity must play the role of a combined focus and entity.  This dual role may be constrained by the underlying resource manager.  Some resource managers require the programmer to issue explicit start and end commit scope instructions, and others provide implicit commit scopes.  If explicit start and end commit statements are required, and if an entity is to play a dual role, then the underlying resource manager must allow nested commit instructions.  This is because the entity module should not have to determine how it is invoked, and so decide whether or not to issue commit instructions.[25]
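The nested commit requirement in the last point can be illustrated with a small sketch.  Here the nesting is simulated in application code on top of `sqlite3`; the class `CommitScope` and the two functions are hypothetical, and a real resource manager would be expected to provide the equivalent itself.  The entity issues its start and end of commit scope unconditionally, and only the outermost scope actually commits or rolls back, so the entity need not know whether it was invoked directly or by a focus.

```python
import sqlite3
from contextlib import contextmanager


class CommitScope:
    """Hypothetical nestable commit scope: only the outermost level commits."""

    def __init__(self, conn):
        self.conn = conn
        self.depth = 0

    @contextmanager
    def scope(self):
        self.depth += 1
        try:
            yield
        except Exception:
            if self.depth == 1:          # outermost scope: undo everything
                self.conn.rollback()
            raise
        else:
            if self.depth == 1:          # outermost scope: make changes permanent
                self.conn.commit()
        finally:
            self.depth -= 1


def customer_entity_update(cs, customer_id, name):
    """Entity module: always opens a scope, however it was invoked."""
    with cs.scope():
        cur = cs.conn.execute(
            "UPDATE customer SET name = ? WHERE id = ?", (name, customer_id))
        if cur.rowcount != 1:
            raise LookupError(customer_id)


def customer_focus(cs, updates):
    """Focus module: its scope encloses those of the entities it invokes."""
    with cs.scope():
        for customer_id, name in updates:
            customer_entity_update(cs, customer_id, name)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id TEXT PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO customer VALUES ('C1', 'old')")
conn.commit()
cs = CommitScope(conn)

customer_entity_update(cs, "C1", "new")      # dual role: entity acts as its own focus
try:
    customer_focus(cs, [("C1", "newer"), ("C9", "nobody")])
    failed = False
except LookupError:
    failed = True                            # second update fails, so both are undone
name_after = conn.execute("SELECT name FROM customer WHERE id = 'C1'").fetchone()[0]
```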


Occasionally, the data access function implicit in an entity is broken out into another kind of module, which I call the ‘resource’ module.

 

6.4.4             The ‘resource’

The resource module is responsible for providing transparency as to the underlying resource manager to the entities.  The resource module issues the SQL, or does the VSAM I/O, etc.  If resource manager transparency is not required, then the resource module function would be collapsed into the entity.

Resource CBOs can, in some circumstances, be an extremely useful agent for lessening the impact of change.  The entity CBO applies business rules; the resource CBO issues the I/O.  Encapsulating the I/O in the resource CBO can help where:

        It is planned to move from one form of DB access to another (e.g. chained files to relational).

        Several different kinds of data access must be catered for (each will have a different resource CBO; but the entity CBOs should be the same).

        Several different SQL dialects may have to be catered for.

To those who have a single dialect of SQL as a corporate standard, and who do not intend to change, an approach which encapsulates all the SQL into one kind of CBO, well away from the application logic, may seem odd.  Suffice it to say that such an arrangement is an option, not a rule.  Where a single database API is certain to persist for the foreseeable future, one might not bother with separate resource CBOs.
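As an illustration of the option, the following sketch shows one entity working unchanged over two different resource modules.  All class names are hypothetical, `sqlite3` stands in for a relational resource manager, and a plain dictionary stands in for keyed-file access.  The entity applies the business rule; each resource issues the I/O for its own kind of data access.

```python
import sqlite3


class SqlCustomerResource:
    """Resource module that issues SQL."""

    def __init__(self, conn):
        self.conn = conn

    def read(self, key):
        row = self.conn.execute(
            "SELECT credit_limit FROM customer WHERE id = ?", (key,)).fetchone()
        return None if row is None else {"credit_limit": row[0]}

    def write(self, key, record):
        self.conn.execute(
            "UPDATE customer SET credit_limit = ? WHERE id = ?",
            (record["credit_limit"], key))


class KeyedFileCustomerResource:
    """Resource module standing in for keyed-file access (a dict here)."""

    def __init__(self, records):
        self.records = records

    def read(self, key):
        record = self.records.get(key)
        return None if record is None else dict(record)

    def write(self, key, record):
        self.records[key] = dict(record)


class CustomerEntity:
    """Entity module: applies business rules and leaves the I/O to the resource."""

    def __init__(self, resource):
        self.resource = resource

    def change_credit_limit(self, key, limit):
        if limit < 0:
            raise ValueError("credit limit cannot be negative")
        record = self.resource.read(key)
        if record is None:
            raise LookupError(key)
        record["credit_limit"] = limit
        self.resource.write(key, record)
        return record


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id TEXT PRIMARY KEY, credit_limit INTEGER)")
conn.execute("INSERT INTO customer VALUES ('C200', 100)")
files = {"C200": {"credit_limit": 100}}

via_sql = CustomerEntity(SqlCustomerResource(conn)).change_credit_limit("C200", 500)
via_file = CustomerEntity(KeyedFileCustomerResource(files)).change_credit_limit("C200", 500)
```

Moving from one form of data access to the other means supplying a different resource object; the entity code is not touched.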

 

6.5 SRD CBOs

Previously, when discussing the UID, we showed how building cooperative business objects (CBOs) instead of applications provides a much closer fit to the problems of cooperative processing than more traditional approaches, and also opened up dramatic new potential for software re-use—for the user as well as for the developer.  In particular, we showed that:

        CBOs provide ease of programming to the average application programmer for both synchronous and asynchronous requests between the two separately-developed pieces of application code, with local/remote transparency.

        CBOs (and the required middleware or infrastructure) provide a superior design for the PC end of a cooperative system.

        The functional requirements for CBO-enabling middleware had many similarities with those provided by transaction processing systems (such as IBM's CICS):

         Task management

         Program invocation

         Storage management

         High-level APIs

         Communications management

         Concurrency management

The question now is whether CBOs show similar advantages for the server end of a client/server system—or, to be more specific, for the SRD.  Since a transaction-based SRD, with concurrent update requests and resource integrity is perhaps the more challenging of the possible SRD environments, we will address this question from the point of view of transaction processing servers.

The remainder of this chapter discusses CBOs in a transaction processing environment.  First, in the next section, we list the potential advantages of CBOs.  Then we illustrate these advantages with an example.  This example is used to generate the key questions about CBOs and transactions.  Subsequent subsections discuss answers to these questions.

 

6.5.1             CBO advantages

There are a number of advantages to using CBOs rather than traditional structures in the SRD:

        CBOs provide all the advantages of object orientation, including re-use, incremental development through inheritance, encapsulation of data (reducing code redundancy) and reduction of the scope of changes.

        CBOs map well to the emerging ideas of ‘networked objects’—the idea of networks of free-flowing messages, unconstrained by connection protocols, and providing ease-of-programming for the average programmer.  The existence of a standard messaging protocol and formats between CBOs (not the same as between systems!) will enhance open interchanges.

        CBOs can form the basis for a market in objects.  A given CBO (perhaps a ‘customer order’, for example) may even be purchased by several different companies from the same specialist software house.  Using inheritance, the CBO purchased might have been specialised without IS touching the bought-in CBO (no re-compile, no re-link).  The other CBOs could have been built, or bought in from other vendors.

        CBOs can substantially improve customisation of software.  For example, suppose the corporate CBO department (located overseas) develops an account object.  If some local legal constraint requires the behaviour of this CBO to be modified, then a subclass can be built.  This can avoid any need to modify the code delivered by the corporate CBO department.  It also makes maintenance of that code much easier (the modifications don't have to be re-applied).  (This approach requires class names to be assigned at run-time rather than at build time, which is another requirement on the CBO infrastructure.)
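The customisation point, and the run-time assignment of class names it depends on, can be sketched as follows.  The class table, the `create` function, the charge calculation and the local cap are all invented for the illustration; a real CBO infrastructure would hold such a table outside the application code.

```python
class Account:
    """Delivered by the (hypothetical) corporate CBO department; never modified."""

    def monthly_charge(self, balance):
        return balance * 5 // 100


class LocalAccount(Account):
    """Local subclass: an assumed legal constraint caps the charge at 100."""

    def monthly_charge(self, balance):
        return min(super().monthly_charge(balance), 100)


# The name-to-class table is consulted at run-time, so requesters that ask
# for an 'account' never name a specific class in their own code.
CLASS_TABLE = {"account": Account}


def create(class_name):
    return CLASS_TABLE[class_name]()


corporate_charge = create("account").monthly_charge(10000)

# Installing the local specialisation is a table change, not a code change.
CLASS_TABLE["account"] = LocalAccount
local_charge = create("account").monthly_charge(10000)
```

When the corporate department ships a new version of `Account`, the local subclass picks up the change through inheritance, and the local modification does not have to be re-applied.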


Figure 6.4.    Networked CBOs.

Suppose we could execute CBOs within one of today's transaction processors, such as IBM's CICS.  Figure 6.4 shows how this might look.  We see two companies, A and B.  Assume that B is a supplier to A.  In company A, a user places an order for one of company A's customers.  The order is built in the user's UID, involving several UID CBOs.  One of these sends a ‘create order’ message to an SRD, and here's what happens (note that, for clarity, only a few of the CBOs which would be involved in this process are shown, and the focus, entity and resource CBOs are not shown separately):

        The Customer Order X CBO sends a message to the Item #1 CBO to decrement stock-on-hand.  Assume that on-hand falls below re-order level.  Then:

        The Item #1 CBO sends a ‘Place Supplier Order (for me, on company B)’ message to a ‘Bring Forward File’ CBO.  For present purposes, we assume that it is the Item CBO that has the responsibility for knowing from whom to replenish itself.

        Sometime later (indicated in the figure by a bar across the message line), the Bring-Forward File CBO issues a ‘Create New Supplier Order (on Company B)’, which results in the creation of the Supplier Order Y CBO.

        The Supplier Order Y CBO sends itself to Company B, by sending a ‘Create New Customer Order’ message across an EDI link to Company B.  This results in a new Customer Order Y being created in Company B.

        The Customer Order Y is for Item #2 (this is Company B's item number for the product that Company A calls ‘Item #1’).  During processing, Customer Order Y sends a ‘Decrement Stock On-Hand’ message to the Item #2 CBO.

Meanwhile, a user in Company B is also placing an order for Item #2, using the same customer order class (a different instance—Customer Order Z), which in turn sends a ‘Decrement Stock On-Hand’ message to the Item #2 CBO.

Another user in Company B is enquiring on the stock status of Item #2.

Several questions arise from this example:

        What is the difference between a UID ‘model’ CBO (say ‘customer order’) and the SRD ‘customer order’ CBO?

        If the SRD is implemented with a transaction processing system, how are the CBOs mapped to transactions and tasks?

        Is there a single instance of the Item #2 CBO being accessed by three transactions concurrently?  If not, what's happening?  How is commit scope retained?

We now address these questions.

6.5.2             CBOs—UID vs. SRD

In the previous chapter, we saw ‘customer’ objects, ‘order’ objects, etc., in the UID.  Now we see them again in the SRD.  Are they identical, and if not, why not? 

 

A common confusion

SRD CBOs will often be implemented as focus and entity CBOs.  For example, the ‘customer order CBO’ in Figure 6.4 could be implemented as an order focus CBO, an order header entity CBO and an order detail entity CBO.  However, when discussing the SRD in general, those three CBOs will often be referred to as a single object, ‘the Order CBO’, or as just ‘the server object’.  To make the confusion worse, at this level of abstraction, a ‘server object’ is sometimes referred to as the ‘model’ object.

If this terminology is used, then, when discussing the client/server object structure, there seem to be two ‘model’ objects.  Clearly (the argument goes) there can only be one, and it must be in the SRD.  Therefore, it is concluded, the model cannot be in the UID.  This conclusion leaves only view objects in the UID, and can lead to implementations which are unnecessarily complicated.

To avoid this confusion, and possible undesirable consequences, in this book we have used the term ‘local model’ to refer to the UID model, and focus and entity (or just server object) to refer to the SRD ‘model’. 

 

Different responsibilities

The major differences between UID and SRD Objects are as follows:

 

A UID local model object:

        Provides the user with a representation of something he needs to do his work

        Does not have any knowledge of sharing data, but provides a local surrogate for the SRD object, filtering requests from views and routing some of them to the SRD.

        Has very fast response times (may often have to be less than 100 milliseconds for trivial interactions such as drag-and-drop)

        Will often reflect changes to other objects in the UID (depending on user interface requirements)

        Is single-user, with no need to take concurrency (from multiple users) into account

        Will be long-running and conversational (to avoid the overhead of program loading).

 

An SRD server object:

        Provides the organization with a representation of a valuable shared resource

        Is responsible for enforcing business rules, for providing access to shared resources, and for cooperating with resource managers to ensure resource integrity

        Is not constrained by the same order of user interface response-time requirements; this means it can (for example) issue SQL requests within its methods.  UID models are generally designed to avoid this.

        Understands that data is shared

        Can reflect changes to other objects in the SRD (depending on business requirements)

        May be accessed by multiple users, and must take this into account

        Will often be short-running, and if so, would be ‘unloaded’ between transactions (by the infrastructure, and transparently to the programmer).

One might say that these differences are somewhat pragmatic, and in the future, they may become less important.  Nevertheless, with today's resource manager, network and database response times, the pragmatics dominate, leading to a requirement for different objects depending on whether they live in the UID or the SRD.

On the other hand, even with infinite computing resources, the differences in responsibility between (say) a ‘customer’ object which is built to satisfy UID requirements, and one built to live in an SRD, seem sufficient to justify a separation.  We will discuss this further in Chapter 13.

 

6.5.3             Mapping CBOs to transactions

When we look at the messages flowing between objects for a given piece of work, we find that the flow is kicked off at a given object by some external event (such as an incoming request), passes through several other objects, and finally ends up back at the starting object.  I call this set of messages a ‘flurry’. (Flurry is defined in the Concise Oxford Dictionary as: ‘gust, squall; commotion, excitement; nervous hurry, agitation; sudden burst of activity.’)

If we then look at what is required for a flurry to run in an SRD, we find that many underlying infrastructure capabilities are required—such as multi-tasking (we might run each flurry in a separate task).  Interestingly, when we match these required capabilities to a transaction processing system such as CICS, then we find that many of them are already there.

There is a distinct similarity, then, between a flurry and a transaction.  From one point of view, indeed, a flurry is really the same as a transaction.  We give it a different name to avoid confusion with the usual structure of ‘transaction programs’.  Why should there be any confusion?  Well, thinking in terms of a ‘flurry’ helps with discussing the next question: is there a single instance of any given CBO instance (e.g. Item #2), or are there multiple copies of that single instance, one per SRD? (See Appendix 1 for an introduction to the concepts of classes and instances.)

6.5.4             Single vs. multiple instances

Suppose several users access the same customer data concurrently.  Is there a separate instance of the customer object in the SRD for each transaction?  Or is there a single instance which is shared between transactions?

In other words, just as with traditional transaction programs a separate ‘copy’ or instance of a transaction program is run in each transaction, should we have a separate ‘copy’ of the customer object for each transaction?  Or, just as there will only be one record on disk of the customer record in question, should there also be one single object through which all transactions accessing that data run?

This question is of key importance in implementing CBOs for shared resource domains, and its answer leads to two quite different design approaches (and to different infrastructure requirements) as follows:

        Create a separate instance within each transaction, and so have multiple instantiations of the same object.  This is feasible today with little extension of the transaction processing environment.

        Create a single instance which is shared by several transactions, and so have a single instantiation of the object.  This approach requires additional infrastructure capabilities to handle the concurrency.

To explain the difference, we will consider each approach in turn, and then discuss the outstanding technical challenges.

Multiple Instantiation

‘Traditional’ design

The top part of Figure 6.5 illustrates a traditional approach to implementing an order creation transaction.  Server 1 is invoked by an incoming ‘create order’ message, and issues DBMS calls (e.g. SQL) directly to the DBMS.  Note that the transaction will be designed such that all of its code runs within an SRD.

Server 2 at the bottom of the figure shows a focus/entity implementation of server 1.  The focus will invoke the appropriate entities.  The entity modules would then issue the appropriate DBMS calls.

Whatever the connection between the focus and entity modules, the designer will have to ensure that all the invoked code stays within the SRD—else the commit scope will be lost.

Suppose server 2 is invoked at the same time as server 1.  If the same item was ordered on both orders, then there would be two update accesses (one from each transaction) to the same specific item row or record in the item database.  There would be two (essentially) identical pieces of code accessing the item database, and the concurrent update problems are those which face any on-line transaction processing system.  (These are discussed in some detail in Section 10.3.)

CBO Design

Now consider how this same design might be implemented using CBOs.  The essential thing here is to run each transaction—or activity—within a single SRD.  The CBOs might look as shown in Figure 6.6.

 

Figure 6.5.    Server vs. focus/entity design.

Figure 6.6.    CBO ‘server’ design.

Here, the incoming message is for the order object (specifically, instance A100 of the class ‘order’).  This object sends messages to the customer object (instance C200 of class ‘customer’) and to the two item objects (instances X123 and Y456, both of the class ‘item’).  The order, customer and item objects all call the DBMS (an example of a focus CBO handling DB access).  All four objects are instantiated within the same SRD.

What would have happened if another message had been received concurrently which also wanted to update item X123?  In that case, another transaction would have been started by the transaction processor, and a second instance of object Item X123 would have been instantiated—within the second transaction.

This is shown in Figure 6.7, where we see two identical instances X123 of class Item within the same system (for simplicity, we assume a single transaction processing system—such as a single CICS region).

The problem of data integrity is effectively the same as for the traditional approach, and again details can be found in Section 10.3.

Implementing this approach requires additional infrastructure functions over and above those provided by today's transaction processors.  They include:

        Object instantiation

        Message routing

        Superclass handling

        Storage management (for instance data)

        Code loading (depending on the specific transaction processing system)

All CBOs within a transaction would lie within one and only one SRD. 
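The multiple instantiation approach can be pictured with a small sketch.  Everything here is a simplification made for illustration: `ToyDatabase` stands in for the DBMS (with a lock in place of real concurrency control and no commit processing), and the `Transaction` class stands in for the instantiation and storage management that the infrastructure would provide.  Each transaction instantiates its own copy of item X123, and the two copies meet only at the stored data.

```python
import threading


class ToyDatabase:
    """Stand-in for the DBMS: holds the one stored copy of each item row."""

    def __init__(self, rows):
        self._rows = dict(rows)
        self._lock = threading.Lock()

    def add(self, key, delta):
        with self._lock:                 # the 'DBMS' serialises concurrent updates
            self._rows[key] += delta
            return self._rows[key]

    def read(self, key):
        with self._lock:
            return self._rows[key]


class Item:
    """An item CBO; its instance data lives only within one transaction."""

    def __init__(self, db, key):
        self.db = db
        self.key = key

    def decrement_stock(self, qty):
        return self.db.add(self.key, -qty)


class Transaction:
    """Instantiates objects on first use, privately to this transaction."""

    def __init__(self, db):
        self.db = db
        self.instances = {}

    def instance(self, cls, key):
        ident = (cls.__name__, key)
        if ident not in self.instances:
            self.instances[ident] = cls(self.db, key)
        return self.instances[ident]


db = ToyDatabase({"X123": 10})
t1, t2 = Transaction(db), Transaction(db)
first = t1.instance(Item, "X123")        # instance X123 within transaction 1
second = t2.instance(Item, "X123")       # a second instantiation, in transaction 2
first.decrement_stock(2)
second.decrement_stock(3)
```

The objects `first` and `second` are distinct in storage although they represent the same item, which is why the integrity problem remains the familiar one of concurrent access to the database.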

Figure 6.7.    CBO concurrency—multiple instantiation.

Single Instantiation

The ‘single instantiation’ approach says that there will only ever be one instantiation of a given object instance (for example, item X123) at any one time.  If two transactions both needed to access the same instance, then the infrastructure would have to:

        Provide for multiple concurrent access to an object (without the CBO programmer having to handle the complexity).

        Provide some form of ‘activity token’ so that multiple activities accessing the same instance could be identified.

The single instantiation approach is shown in Figure 6.8, where we see three transactions.  The customer C200 CBO instance is accessed concurrently by two transactions, as is the item X123 CBO instance.

Since a single instance implies a single copy of data in storage, then we would have multiple activities all accessing the same (in-storage) data.  However, if this occurs, then one transaction may see inconsistent data.
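The inconsistency can be shown with a deliberately naive sketch.  This is an illustration of the problem and not a proposed solution: the class `SharedItem`, the string activity tokens and the simple undo table are all assumptions.  Activity T2 reads a stock figure that activity T1 has changed but not committed, and T1 then backs its change out.

```python
class SharedItem:
    """The one in-storage instance of an item, shared by all activities."""

    def __init__(self, on_hand):
        self.on_hand = on_hand
        self.undo = {}                   # activity token -> value to restore

    def decrement_stock(self, token, qty):
        self.undo.setdefault(token, self.on_hand)
        self.on_hand -= qty              # change is visible to everyone at once

    def read(self, token):
        return self.on_hand              # no isolation between activities

    def rollback(self, token):
        self.on_hand = self.undo.pop(token)


item = SharedItem(on_hand=10)
item.decrement_stock("T1", 4)            # activity T1 updates, not yet committed
seen_by_t2 = item.read("T2")             # activity T2 sees T1's uncommitted value
item.rollback("T1")                      # T1 fails and backs out
actual = item.read("T2")
```

Activity T2 has acted on a stock level of 6 that was never committed.  Handling this without burdening the CBO programmer is the additional infrastructure capability that the single instantiation approach requires.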

If we then add the notion that we might like a given instance to be located remotely (in the network) from some other CBO, but have both within the same SRD, then we can see significant problems in implementing the single instantiation approach.

Currently, work is being done on at least two approaches to these difficulties, but it would be premature to report more fully on them at the present.  One consideration for a partial solution is that if one transaction had to change an instance outside its SRD, then the commit scope would be lost.  However, as long as the infrastructure can at least report a ‘not fully committed’ error, then this may be acceptable in some circumstances.

 

Figure 6.8.   CBO concurrency—single instantiation.

The reasons why the single instantiation approach is likely to be the more fruitful in the long term include the potential to:

        Locate object instances anywhere in a network.

        Allow better for a future blurring of the current difference between DBMSs and CBOs—and hence provide a migration path (where appropriate) towards possible future ‘active’ OODBMSs.

        Avoid a possible need for pre-execution ‘binding’ of objects into a single transaction.

Full support for two-phase commit across any set of objects will require not only object concurrency, but support for a network-wide SRD by the resource managers.

In brief, the conclusion is that CBOs can run in today's transaction processing systems (given, of course, the CBO-enabling infrastructure) using the multiple instantiation approach.  More work is required before we can say that the full single instantiation approach is viable.

Today, we can move towards CBOs in systems like IBM's CICS by designing focus and entity modules (and maybe resource modules) to run in a single transaction. 

 

6.6 Summary

In this chapter we have further explored the shared resource domain—the application code responsible for ensuring the security and integrity of corporate shared resources.  Specifically, we defined the difference between the resource manager domain (the scope of resources controlled by a resource manager such as a transaction processing system) and the shared resource domain (the area within which a programmer can place his code such that a recoverable unit of work can be done).  Within the SRD, three different kinds of CBO can be identified:

        The focus—responsible for ensuring shared resource integrity.

        The entity—responsible for accessing a specific resource.

        The resource—responsible for providing transparency to the entity by understanding the precise system-dependent resource access mechanisms.

We then discussed whether separate SRD CBOs should be created for each transaction (the multiple instantiation approach), or whether a single instantiation approach was better.  The single instantiation approach appears to provide several benefits, but more work is needed before it is known if the advantages outweigh the additional infrastructure function required.

 


 

 

7              End-to-end summary

 

In this chapter, we summarise the discussion in the preceding two chapters, showing an ‘end-to-end’ CBO model of client/server. 

 

 

7.1 Domain interaction

There will be as many UIDs as there are users.  In addition, there may be more than one resource manager domain (depending on the resource manager used), and hence, for a given client/server interaction, more than one SRD.  While the SRD is normally the server, and the UID the client, there will be occasions when something in an SRD needs to initiate a request of a UID.  Thus the connection between the two, while always ‘client/server’ in the sense of request/response pairs, is more of a peer-to-peer nature.  This is because both can be clients, and both can be servers (in different situations).

Again, one SRD may well access another SRD directly.  For example, a UID on a PWS may send a request to an SRD on a LAN server, which, to handle the request, may need to read some data from a mainframe server.  Without an underlying distributed database, there would be two SRDs involved.

 

 

7.2 Components of the end-to-end model

Both the UID and the SRD are implemented with CBOs.  (Although the model is presented in terms of CBOs, it is also generally applicable to non-CBO application structures.  In particular, the separation of UID and SRD is a generic concept, and not limited to CBOs.)  Within each domain, there are a number of identifiable types of CBO.  The overall structure is shown in Figure 7.1.  In the figure, solid boxes are CBOs, dotted boxes are infrastructure (middleware) components.  There are two types of CBO in the UID (the view and the model), and three in the SRD (the focus, the entity and the resource).

 


 

Figure 7.1.    End-to-end structure.

7.3 Component Behaviour

The various CBO types interact in certain ways, and design guidelines restrict this interaction as follows:

1.      Component Interaction    The following defines inter-component request and response behaviour:

(a)    Requests from one CBO to another (also between view and window, CBO and connection) are made by sending messages.  Messages may be synchronous or asynchronous.

(b)   Message routing is done outside of the CBO application code by CBO infrastructure or ‘middleware’ code.

(c)    CBOs in the same machine may be in different address spaces or the same; in different threads of control or the same.

(d)   The programmer writes CBOs, which are event-loop code entities such that the loop handling is done by the infrastructure.  This means that concurrency handling for asynchronous messages can be done by the infrastructure, and the programmer does not see this complexity.  CBOs are also true objects (to be precise, they are classes) in the OO sense.  One CBO class can inherit behaviour from another CBO class.

(e)    Data in messages is in self-defining format (containing metadata—data about data—as well as data values).  This approach makes for extremely loose binding between components (effectively at the level of data labels), and makes CBOs insensitive to data types and data positions within a message.  It does, however, pay for loose binding with extra execution cycles.

2.      Component Restrictions    In the model presented, we can define some restrictions on the CBO types which can ‘talk’ to each other CBO type.  While these are not necessarily technical restrictions, nor hard and fast rules, they have been found to be useful design guidelines, and are:

(a)    A view can only talk to its model (by ‘model’ we mean ‘local model’).

(b)   A model can talk to its view(s), to another model, or to a focus.

(c)    A focus can talk to entities, and to models.

(d)   A focus can talk to another focus in a different SRD.

(e)    A model can talk directly to entities if the entity encompasses the required commit scope (that is, if the entity is also a focus).

(f)     An entity can talk to resources.

(g)    A resource can only talk to an entity.

Figure 7.2.    Generalized client/server CBO model.

With these behaviours and restrictions taken into account, a slightly more general picture can be generated from that shown in Figure 7.1, and is shown in Figure 7.2.  In this figure, the different CBO types are shown in a typical situation.  Note that:

        The focus is talking (perhaps concurrently) to three UIDs.

        One of the entities is being talked to from another SRD (perhaps a read; if a change is being requested, then this implies that this entity is also a focus.)

        One of the entities is being talked to by a UID (again, if a change is requested, then this entity is also a focus).
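
The component restrictions listed in item 2 above can be read as a small table of which CBO type may send a message to which.  The following is a minimal sketch in Python, for illustration only.  The names `ALLOWED` and `may_talk`, and the flags `different_srd` and `entity_is_focus`, are assumptions made for this example; they are not part of any CBO infrastructure, and, as noted above, the restrictions are design guidelines rather than hard and fast rules.

```python
# Illustrative only: the guideline table from item 2, expressed as data.
ALLOWED = {
    "view": {"model"},                     # (a) a view talks only to its model
    "model": {"view", "model", "focus"},   # (b)
    "focus": {"entity", "model"},          # (c)
    "entity": {"resource"},                # (f)
    "resource": {"entity"},                # (g)
}


def may_talk(sender, target, *, different_srd=False, entity_is_focus=False):
    """Return True if the guidelines permit `sender` to message `target`."""
    if sender == "focus" and target == "focus":
        return different_srd               # (d) focus to focus only across SRDs
    if sender == "model" and target == "entity":
        return entity_is_focus             # (e) only if the entity is also a focus
    return target in ALLOWED.get(sender, set())
```

A design review tool could use such a table to flag, say, a view that sends directly to a focus.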

7.4 Model subsets

It should not be thought that this model defines an over-rigid separation of CBO types.  There will be many instances where two or more different types can be merged, or collapsed, into a composite type.  Thus subsets of the general model can be defined.  For example:

        Resource and entity can be combined

        Focus and entity can be combined.  The result would be very similar to a traditional server-style transaction.

        A view can never exist without a model.  If there is only one view for a given model, then view and model can be combined.  The result is a model which also has view characteristics.

        A model can exist without a view.

        The model (that is, local model) and the entity can be combined.  This results in what is essentially a ‘remote model’.

Note also:

        If all possible subsetting is done, then the result is a single component in each of the UID and the SRD.  If two systems are involved, and the connection component is included with the application code, the result is today's typical cooperative processing structure—for example, a PC application talking to a CICS application (perhaps using APPC).

        If the SRD is exceedingly trivial, then both the SRD and the UID can be combined.  The result is a typical PC stand-alone application.  Note, however, that in the typical client/server system, the UID and SRD should never be combined; the boundaries between them should never be mingled.  Indeed, from a design point of view, even when CBOs are destined to run stand-alone on a PC, I have found it useful to build separate UID and SRD CBOs.  The SRD can then be designed as a trivial subset—a shared resource domain shared by one person(!).  But if ever a need arose to share for real, then only the SRD CBOs might have to be re-written.

 


8              The CBO infrastructure

At various times in the previous chapters, reference has been made to ‘the CBO infrastructure’.  In this chapter, we now discuss some of the more important aspects of such an infrastructure.

The code which implements a CBO Infrastructure is certainly ‘system-level’ code, and is something of a challenge to build.  Until very recently, such software was not available on the market.  At the time of writing, it is just starting to become available in industrial-strength products.  (One such, called ‘Newi’ (New World Infrastructure), is produced by Integrated Object Systems Limited, located in Newbury, UK.)

This chapter is in seven sections:

        Firstly, we define what we mean by ‘Infrastructure’, and review why it is necessary.

        Then we discuss the common elements of a client/server Infrastructure, suitable for both CBOs and more traditionally-structured applications.

        Following this, we look at the specific requirements of a CBO Infrastructure.

        So far, we have completely ducked the question of the data within a message.  In real systems, this turns out to be an extremely important consideration, and in the fourth section we show why a self-defining data mechanism, which carries metadata (data about data), is required.

        When there are several different types of computers in a distributed system, data conversion (e.g. ASCII to EBCDIC) becomes important.  We look at some of the lessons learned.

        Then we briefly discuss how the Infrastructure might handle connection of CBOs to a non-CBO environment, so enabling higher levels of integration.

        Finally, we look at some of the Operating System implications for the design of the Infrastructure.  For example, should each CBO run in its own address space?  Or not?

8.1 Why an Infrastructure?

In a real system, there are several interlocking infrastructures, including:     

        The hardware, operating system and networking infrastructure

        A systems management infrastructure which would handle such things as software change management, user access authorisation, directory services for cross-system names, etc.[26]

        An application enabling infrastructure, which supports a given ‘model’ of how application code should be structured, and provides significant ease-of-programming for that model. 

This chapter addresses the last of these, one which supports the CBO model. 

We often find in practice that each client/server application has had to design and implement its own software infrastructure.  This is so in spite of products aimed at this area; some make programming simple, but do not allow full exploitation (or sometimes even partial exploitation) of the object-based user interface; some provide useful APIs for connection, but do not provide for other complex areas (such as multi-tasking); others provide a closed environment, often for a single language only.

The reason for this is that there has been no general model of what—in software structure terms—a cooperative processing system is.  There has been no generally-accepted notion of what it is that we'd like application programmers to build—and hence no well-understood appreciation of the services and facilities that should be made available to those programmers.

We need such a model because building a cooperative application is different.  If we treat it like previous application structures, then we will find things difficult.  If we accept it's different, but don't have a model that explains the differences, then it will still be difficult.

The concept of CBOs developed in the previous chapters provides such a model.

An analogy is this: at the end of the 1960s and beginning of the 1970s, it became apparent that if you wanted to build an on-line application which exploited the new character-based terminal technology, then the thing you really wanted to build was ‘transactions’.  But this required much non-application code, which did things like task and storage management, screen handling, transaction initiation, etc.  This code was not only difficult to write, requiring high design and programming skills; it was also re-invented for each application.

Over time, it was realised that to make programming easy, all the non-application code should be provided in common for each application through a layer of middleware—of software infrastructure—which sat on top of the operating system.  You needed this layer because the operating systems did not provide it.  Thus transaction processing (TP) monitors such as CICS, GCOS, IMS, and CCP (on the System/3) were introduced to provide the required function.  Today, systems such as OS/400 (the operating system on IBM's AS/400) provide this transaction environment as an integral part of the base operating system.  (The IBM System/34 was an excellent example of one of the first systems to provide a degree of transaction processing built into the operating system.)

 

Figure 8.1.    AD deliverables

 

For interactive applications (such as decision support), much the same argument applies, and we saw the emergence of systems such as CMS (the VM interactive operating system), and TSO.  Today, systems such as OS/2 and UNIX provide this interactive environment as an integral part of the base operating system.

Figure 8.1 illustrates this.  A printout is shown at the bottom left.  If that's what you want to produce, then a batch application seems ideal.  In turn, the batch application requires an operating system upon which to run.  If the thing you want to produce is an on-line system, then the things you need to build and deliver are transaction programs (T1, T2 and T3—each writing to an SPA or scratch pad area).  Transaction programs require a transaction processor to run.  Likewise an interactive application (such as a spreadsheet) needs an interactive computing (IC) environment.  CBOs in turn require a CBO infrastructure.  Thus the general ‘shape’ of the software you want to deliver (batch suite, transaction, interactive application, CBO) determines the services and functions you need to provide either in, or on top of, the operating system at run-time—if you want to make programming easy!

One very good example of the kind of run-time support required is drag and drop (direct manipulation).  The real business benefit of drag/drop lies in its application integration potential.  It is the mechanism whereby the user can introduce to each other two objects—behind which are two pieces of software—that have been separately developed.  CBOs, and the CBO Infrastructure, allow this inter-application function to operate easily.

8.2 Common Components

Much of the discussion in this section applies to non-CBO application function.  However, the assumption is made that the application code, whether CBO or not, communicates by means of messaging.

8.2.1             The UID

We can summarise the user requirements (the expected characteristics of an effective UID on a PC) in terms of flexibility.  That is, no rigid screen dialogue constraint, the ability to conform to an unstructured customer dialogue, and so forth.  This translates into a number of capabilities which the GUI should enable, including the ability to:

        Handle several business processes concurrently

        Jump from object to object (and window to window) as required

        Display multiple instances of one thing (e.g. two customers) concurrently.

These required capabilities can be equated with specific desired PC characteristics; in particular, aside from how the GUI is designed, they imply that the necessary ‘background’ access to servers should be as non-intrusive as possible.  This in turn points to:

        Concurrent sessions

        Queuing of requests when sessions are all busy

        Separation of GUI code from the communications code, so that windows are not ‘frozen’ while relatively long-running communications are happening—in other words, asynchronous connections to the server

Thus one of the first requirements of the infrastructure is to enable application code to make simple requests of the server.  This can be done through an infrastructure-provided ‘request sender’, as shown in Figure 8.2.  At the server end, there is a matching ‘request catcher’.

 

Figure 8.2.   Request sender and catcher.

 

In considering these key usability aspects, together with some important ease-of-programming factors, the following requirements arise:

        Concurrent requests for access to the SRD should be honoured, even though the number of communication ‘sessions’ may be less than the number of outstanding requests.  This means that such outstanding requests must be queued.

        Access to servers is often, from the PC GUI response-time point of view, a long-running function.  For example, with a slow network, or an overloaded server—or just a large request—access to the server may take (say) 10 or more seconds.  An asynchronous connection between the UID (client) and the SRD (server) should therefore be provided.  Otherwise, the user interface may block, producing for the user an unwelcome ‘freeze’ of the window.

        A response to an asynchronous request needs to be routed back to the requester.  (The response could be returned to a piece of code other than the requester; but this would break the client/server model—a response would arrive at code which had not issued a request!  This will lead to greater complexity in design and programming.)

        Different communications required to access different servers should be hidden from the CBO application code.

        Drag/drop between objects on the screen must be effected by some function external to the CBO application code.  The CBO should see only incoming messages.

        CBO-to-CBO messages must be provided (for example, to allow CBOs to exchange data after a drag/drop).  Although this can be done with system function (e.g. DDE), for ease-of-programming reasons, such facilities should be ‘front-ended’.

For the UID, these considerations lead to an expansion of the ‘request sender’ function shown in Figure 8.2 to the set of functional components shown in Figure 8.3.  Note that this figure shows function as opposed to actual software detail design and packaging.  The components illustrated in Figure 8.3 are as follows:

Figure 8.3.    UID infrastructure components (PC).


Router

The router catches requests sent out from application code and decides where they should be sent.  Thus the API which the application programmer sees is that provided by the Router.  In some cases, requests will be sent to other application code on the same system; in other cases to a different system.

Adapter

The adapter is so called because it adapts message requests to some communications mechanism in order to send them to remote servers.  It is responsible for mapping the destination name specified by the application to a specific communications route, and sending it on its way through that route to the server.

In this way, aliases can be provided for remote objects—which might indeed not be objects at all, but, for example, transaction programs.  The sending CBO does not need to understand the transaction name; as far as it is concerned it is talking to another CBO.
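
The division of labour between router and adapter described above might be sketched as follows.  This is a minimal illustration in Python under stated assumptions: the class name `Router`, the alias table `ALIASES`, the route string `appc:HOST1` and the transaction program name `CUPD` are all invented for the example, and the `outbound` list merely stands in for the hand-off to the session manager.

```python
class Router:
    """Decides whether a request stays on this system or goes to an adapter."""

    def __init__(self, aliases):
        self.local = {}         # destination name -> handler on this system
        self.aliases = aliases  # destination name -> (route, remote name)
        self.outbound = []      # stands in for the hand-off to the session manager

    def register(self, name, handler):
        """Make a piece of local application code addressable by name."""
        self.local[name] = handler

    def send(self, destination, data):
        if destination in self.local:
            return self.local[destination](data)
        if destination not in self.aliases:
            raise LookupError(f"no route configured for {destination!r}")
        route, remote_name = self.aliases[destination]
        # The adapter step: the sender never sees the route or the remote name.
        self.outbound.append({"route": route, "target": remote_name, "data": data})
        return None


# Hypothetical alias table; 'CUPD' is an invented transaction program name.
ALIASES = {"Customer": ("appc:HOST1", "CUPD")}
```

In this sketch the sending code addresses ‘Customer’ in both the local and the remote case; only the alias table knows that the remote end is really a transaction program.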

Session manager

The session manager knows where the communications code is running (in another thread or process) and how to pass the request data over to it.  It also knows whether a communication session is busy, and hence whether to queue a request until the session is available.  (Some communications implementations—such as APPC—handle session queuing themselves.)  Should the particular communications module (or subsystem) provide an API which can be run against it from different threads or processes, then the session manager will handle a pool of threads or processes.

The session manager also handles things like communications error logging, any re-try strategy, etc.  In general, on an error, the session manager should return to the requesting application (via the router) an indication of failure only (rather than all of the communications-level sense codes, etc.).

Queue

A queue is where outstanding requests are held while they wait for a session.  This is only required for communications subsystems that are not themselves queued.  By ‘queued’ I mean that if there are n sessions available, and there are n+m concurrent requests made, then the communications subsystem is able to accept all the requests, returning to the requesters when a session is available.  How the subsystem deals with the outstanding requests is immaterial; it appears to queue them.

An example of a queued communications subsystem is LU6.2 (with its APPC API).  An example of a non-queued subsystem is a terminal emulator, where the communications API interacts pretty directly with a logical terminal screen (in-storage).  If all terminal sessions are busy, then the emulator does not ‘queue’ a subsequent request, but will just return ‘busy’, and leave it to the caller to work out what to do.


For a non-queued subsystem, when a communication session comes free, the session manager interrogates the queue to see if there are any outstanding requests.
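
The interplay of session manager and queue for a non-queued subsystem can be sketched as below.  This is an illustrative Python fragment only; the class `SessionManager` and its method names are assumptions made for the example, and real communications calls, error logging and re-try strategy are omitted.

```python
from collections import deque


class SessionManager:
    """Queues requests when every session of a non-queued subsystem is busy."""

    def __init__(self, session_count):
        self.free = list(range(session_count))  # idle session numbers
        self.waiting = deque()                  # outstanding requests, first in first out
        self.active = {}                        # session number -> request in progress

    def submit(self, request):
        """Start the request if a session is free, otherwise queue it."""
        if self.free:
            session = self.free.pop()
            self.active[session] = request
            return session
        self.waiting.append(request)
        return None

    def session_freed(self, session):
        """Called when a session comes free: interrogate the queue."""
        del self.active[session]
        if self.waiting:
            self.active[session] = self.waiting.popleft()
        else:
            self.free.append(session)
```

With n sessions and n+m requests, the first n calls to `submit` each return a session number, and the remaining m requests wait until `session_freed` is called.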

Communications module

One or more communications modules handle the actual communications protocols.  This really refers to the communications mechanisms offered by the operating environment, such as APPC, terminal emulation, sockets, RPC, message queuing, etc.

DM manager

In addition (at the left of Figure 8.3), the DM Manager (direct manipulation manager—or drag/drop manager) provides a layer of function on top of any system-level capability, such that high-level direct manipulation messages can be sent to the appropriate application code.  In the figure it is shown doing this through the router.

8.2.2             The SRD (server)

The major difference at the server end is that the SRD is multi-user.  That is, it must be able to handle multiple concurrent requests from one or more UIDs (and, indeed, from other SRDs).

Now, while PC UID environments are very generally similar in their facilities and capabilities, typical SRD (server) environments can range from that same PC being used as a LAN server all the way up to the largest of mainframes.  But more importantly for present purposes, we find that the system facilities are widely different.  However, we can identify two different system environments for servers: one where there is a transaction processor and one where there is not.

Since the essence of the server is to handle what are really transactions, then an environment without a transaction processor—or effective transaction management in the database manager—will have to have the required transaction processing elements built. 

So let's now consider the case of a system without a transaction processor.  Assume we have a typical server program, say a Customer Update module.  This might look like this:     

_______________________________________________________________________

    Start

      Get Request from Client

      Validate the Request

      If OK, then

        Perform the Update

        Return 'OK'

      Else

        Return 'Error'

    End

_______________________________________________________________________

 


Now come the difficult questions:     

        How is this application program loaded into memory?  What piece of software does the loading?

        Is it loaded once for each request, or does it hang around waiting for the next request?

        If it hangs around, what software wakes it up—and how?

        Who queues or schedules or otherwise handles the second of two concurrent requests?

        How does the second line (Get Request from Client) actually work—what underlying software is behind the ‘Get’?

These questions are referred to by Orfali and Harkey (1992, p.1077) as ‘…an area that can really get wild.’  However, if there's no transaction processor, then someone has to answer these questions, and build the code needed to implement the answers—either as part of the application, or as middleware.  In effect, they'd be building some aspects of a transaction processor.

As someone (else) once said, ‘If you're processing transactions, then you need a transaction processor.’
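
To make the point concrete, here is a minimal sketch in Python of the smallest piece of middleware that answers the questions above for the Customer Update module.  It is an illustration under stated assumptions, not a transaction processor: the names `CUSTOMERS`, `customer_update`, `serve` and `run_demo` are invented, the in-memory dictionary stands in for the shared resource, and there is no commit or recovery.  The module stays loaded, a queue holds the second of two concurrent requests, and the blocking `get` is what lies behind ‘Get Request from Client’.

```python
import queue
import threading

CUSTOMERS = {"A123": {"name": "Smith"}}   # stands in for the shared resource


def customer_update(request):
    """The application module: validate the request, update, reply."""
    key, fields = request.get("customer"), request.get("fields")
    if key not in CUSTOMERS or not isinstance(fields, dict):
        return "Error"
    CUSTOMERS[key].update(fields)
    return "OK"


def serve(requests, replies):
    """Middleware loop: wakes the module once for each queued request."""
    while True:
        item = requests.get()             # blocks until a request arrives
        if item is None:                  # sentinel: shut the server down
            break
        request_id, request = item
        replies[request_id] = customer_update(request)


def run_demo():
    """Queue two requests and a shutdown marker, then collect the replies."""
    requests, replies = queue.Queue(), {}
    worker = threading.Thread(target=serve, args=(requests, replies))
    worker.start()
    requests.put((1, {"customer": "A123", "fields": {"name": "Jones"}}))
    requests.put((2, {"customer": "Z999", "fields": {"name": "Nobody"}}))
    requests.put(None)
    worker.join()
    return replies
```

Even this toy has had to invent a loader, a scheduler and a request queue, which is the sense in which one ends up building aspects of a transaction processor.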

It might seem that distributed database is the answer.  Then you can just put the above code on the PC.  But this only moves the problem (although it does answer the multi-user aspects—assuming that the DBMS provides concurrency control).  Something still has to initiate the code, pass the request to it, etc.  Then again, with an object-based user interface, it's not impossible that two CBOs may concurrently make a request of the same server.  Certainly such an event needs to be catered for in a general-purpose solution (which is why even with a stand-alone PC application, it's still useful to separate UID and SRD).

Because of the wide disparity of likely server environments, there are few definable components that are common across all.  Even mainframe transaction processor environments differ widely.  For example, the infrastructure components required for CICS and for GCOS are quite different.  Perhaps the main common component is the communications front-end; after that, there are a number of considerations, the structural answers to which may vary widely.

The communications front-end

As shown in Figure 8.2 a ‘front-end’ is needed to separate the server application code from the code containing the communications API.  Note that the underlying communications subsystem may itself address a number of the problems to do with code initiation (for example, APPC can automatically initiate the transaction program required).


The major function of the communications front-end is handling incoming messages and returning responses.  It should also handle such things as:     

        Sign-on or other general access authorisation (as opposed to specific resource authorisation).

        Invocation of the appropriate SRD server or servers

        Version control—where the incoming message may be checked for its version, and the appropriate SRD application module invoked

        Conversion of data formats where they differ between machines.
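
Two of the items above, general access authorisation and version control, might be sketched as follows.  This is a hedged illustration in Python; the message fields (`user`, `service`, `version`, `data`), the tables `SIGNED_ON` and `HANDLERS`, and the two handler functions are all assumptions made for the example.

```python
def update_v1(data):
    """Hypothetical version 1 of an SRD application module."""
    return f"v1 updated {data['id']}"


def update_v2(data):
    """Hypothetical version 2 of the same module."""
    return f"v2 updated {data['id']}"


SIGNED_ON = {"user1"}                       # users who have passed sign-on
HANDLERS = {
    ("CustomerUpdate", 1): update_v1,
    ("CustomerUpdate", 2): update_v2,
}


def front_end(message):
    """Check general access, then invoke the module matching the message version."""
    if message["user"] not in SIGNED_ON:
        return {"status": "Error", "reason": "not signed on"}
    handler = HANDLERS.get((message["service"], message["version"]))
    if handler is None:
        return {"status": "Error", "reason": "unsupported version"}
    return {"status": "OK", "result": handler(message["data"])}
```

The application modules themselves contain no sign-on or version logic; that is the separation the front-end is there to provide.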

Other considerations

Besides the communications front-end, the infrastructure at the SRD must also consider:

        Commit scope management.  If the SRD consists of several independently-developed transaction programs, then some way has to be devised to invoke them all within a single commit scope.  This may require commit scopes to be embeddable.

        Storage management of message data passed between transaction programs

        Implementation of a router function may be required

        Megadata.  This is about handling the situation where a great deal of data needs to be returned—either in a single response, or as a series of messages sent back.  Additional considerations are discussed in Chapter 11.

Object design

At its simplest, an SRD may be little more than a set of transactions, running under a transaction processor.  If each transaction maps to a single transaction program, then the communications front-end may be little more than a subroutine called by each transaction program.

Whether this is the best design approach is arguable.  However, it has been found that even such simple design schemes as this can benefit greatly from taking an object approach to transaction program design.  That is, design objects, and implement them using normal transaction programs.

In this way, since objects are essentially event- or message-driven, the resulting transaction programs have a high probability of being reasonably good servers.  They will not make assumptions about clients which use them; they will be complete; they will exhibit the required ACID (atomicity, consistency, isolation, durability) properties.  This is so even though the various attributes of CBOs (inheritance, encapsulation, etc.) are not provided.

 


8.3 CBO aspects

A CBO Infrastructure does not only support CBOs.  It must also support the concept of peer-to-peer messaging between CBOs, as well as the more traditional client/server interaction.  Thus a CBO infrastructure is also a distributed object manager.

One of the first such infrastructure offerings (if not the first) has recently been jointly developed by the IBM and Softwright joint venture company Integrated Object Systems Ltd (in Newbury, Berkshire, UK).  Much invaluable early experience has been gained over the past few years through some leading-edge projects in applying this technology.  From these projects, we have excellent indications as to the benefits of CBO technology; we know that it works.

An exhaustive discussion of the internals of the CBO infrastructure is beyond the scope of this book.  However, in this section, we mention a number of facilities needed, and discuss further a few of the more important of them.

8.3.1             CBO support

In the course of previous chapters, we have identified a number of the services required to support CBOs.  These include:

        A high-level messaging API (together with appropriate message data architecture)

        Message handling and routing (both synchronous and asynchronous)

        Storage management for class and instance data

        High-level standard messaging ‘wrapper’ for low-level GUI functions

        Code initiation

        Task (thread) management

        Communications management

        Table-driven aliasing scheme for routing over networks

        Object persistence (over power-off, for example, and over code unloading)

        Object management (multiple instances of classes, instantiation, inheritance, etc.)

Other aspects not so far addressed are:     

        Separation of class and its associated code (sometimes called the class' ‘implementation’)

        Code loading and unloading

        High performance message routing

        Message data storage management

        Retaining messaging semantics regardless of source and target object locations (a key aspect of distributed object management)

        Differing communications structures depending on whether there are CBOs or more traditional servers at the remote end

        Peer-to-peer (unsolicited messages)

        Multi-tasking and concurrency support

        Support for self-defining data streams to provide for dynamic data binding at run-time

        Object instantiation

        Garbage collection

        Dynamic class name allocation (essential for ease of customisation)

        Infrastructure administration

On some systems, a transaction processing subsystem can provide some of these facilities. However, even when the facilities are in the right place, and usable in the way required for CBOs, they still need software ‘glue’ to provide the ease-of-programming required, and to provide the object and GUI management.  On the PC, typically most of these facilities are not present to the level required, and so have to be provided by the infrastructure.

IBM's ‘System Object Model’ (currently available on OS/2 Version 2 and on AIX for RS/6000, and often referred to as ‘SOM’) should in future provide some of the system-level facilities that will make building such an infrastructure easier.  We anticipate that all these facilities will eventually be provided by (or as an adjunct to) the operating system.

8.3.2             Synchronous or asynchronous?

Any discussion of synchronicity must state to what the synchronicity refers.  In this section, we discuss briefly the major aspects of this question.

To the programmer

To start with, consider the user of the infrastructure—that is, the programmer.  To him or her, the question of synchronous or asynchronous is always with reference to his application code.  Thus, for example, a program statement ‘Send’, which the programmer has been told is synchronous, should not return until the request has been responded to.  Also, the response data must be returned with the return of control (that is, must be available at the next sequential instruction).  This should apply regardless of the location of the target CBO, and of the nature of the communications mechanism.  The infrastructure should provide for this.

If the request takes a long time (e.g. more than a few milliseconds—or less!), then the infrastructure should prevent blocking while the request is being satisfied.  A brief discussion of this can be found in Appendix 2.

 


A program statement ‘Post’ (which the programmer has been told is asynchronous) should always return control immediately (without the response, of course), with some indication as to whether the message was sent on its way correctly.  The response should arrive at the CBO which issued the Post as an incoming message (which will invoke the CBO). 

A good way of letting the programmer control which message will carry the response data is to name that message in the Post statement.  The behaviour described here for Post must apply regardless of whether the target CBO is local (even if it's in the same thread/task as the sending CBO) or remote.

A brief discussion of send vs. post is provided in Appendix 3.
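The Post behaviour described above can be sketched in the same hedged way.  Everything named here is an assumption for illustration: the queue-based Infrastructure, the reply_message parameter, and the message names ‘QueryName’ and ‘CustomerNameReply’.  The sketch runs sender and target in one thread, which is the case the text singles out as still needing asynchronous behaviour.

```python
from collections import deque


class Infrastructure:
    """Hypothetical infrastructure that queues posted messages."""

    def __init__(self):
        self._objects = {}
        self._queue = deque()  # (sender, target, message, data, reply_message)

    def register(self, name, cbo):
        self._objects[name] = cbo

    def post(self, sender, target, message, data=None, reply_message=None):
        # Asynchronous: return at once, without the response, indicating
        # only whether the message was sent on its way correctly.
        if target not in self._objects:
            return False
        self._queue.append((sender, target, message, data, reply_message))
        return True

    def run(self):
        # Deliver queued messages until none remain.
        while self._queue:
            sender, target, message, data, reply_message = self._queue.popleft()
            response = self._objects[target].receive(message, data)
            if reply_message is not None:
                # The response reaches the posting CBO as an incoming
                # message, under the name chosen in the Post statement.
                self._queue.append((target, sender, reply_message, response, None))


class Customer:
    def __init__(self, name):
        self.name = name

    def receive(self, message, data):
        return self.name if message == "QueryName" else None


class Account:
    def __init__(self):
        self.customer_name = None

    def receive(self, message, data):
        if message == "CustomerNameReply":
            self.customer_name = data
        return None


infra = Infrastructure()
account = Account()
infra.register("Account/9876", account)
infra.register("Customer/A123", Customer("Smith"))

accepted = infra.post("Account/9876", "Customer/A123", "QueryName",
                      reply_message="CustomerNameReply")
name_before = account.customer_name  # Post has returned; no response yet
infra.run()
name_after = account.customer_name   # the response arrived as a message
```

Naming the reply message in the Post call lets the posting CBO tell apart the responses to several outstanding requests without any extra bookkeeping of its own.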

To the infrastructure designer

To the infrastructure designer, the question relates to the communications mechanism, and refers to the request sender.  Thus if the designer starts a separate task to do the communications business, control comes back to the request sender before the communications task has returned the response.  This is an asynchronous connection.

If, on the other hand, he or she does not start a separate task, but issues communications statements from within the same thread as the request sender, then this will be a synchronous connection.

In either case, the infrastructure designer must ensure:

        That the semantics of send/post are preserved for the CBO programmer, regardless of whether the target CBO is local or remote; and

        That CBOs are not blocked while a synchronous connection completes.

The latter requirement can be particularly difficult to design into the infrastructure; but it can be done.
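The first of these two requirements can be illustrated with a small sketch in which Send keeps its synchronous meaning for the programmer even though the remote case uses an asynchronous connection (a separate communications thread).  The function remote_round_trip, the class names and the target names are all hypothetical, and the network is simulated by a short sleep.  The sketch simply waits for the communications task to finish; it does not attempt the second, harder requirement of keeping other CBOs unblocked.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def remote_round_trip(target, message, data):
    """Hypothetical stand-in for the communications work to a remote CBO."""
    time.sleep(0.01)  # pretend network delay
    return f"{target} handled {message} remotely"


class LocalCustomer:
    def receive(self, message, data):
        return f"handled {message} locally"


class Infrastructure:
    def __init__(self):
        self._local = {}
        self._remote = set()
        self._comms = ThreadPoolExecutor(max_workers=1)  # communications task

    def add_local(self, name, cbo):
        self._local[name] = cbo

    def add_remote(self, name):
        self._remote.add(name)

    def send(self, target, message, data=None):
        if target in self._local:
            return self._local[target].receive(message, data)
        if target in self._remote:
            # Asynchronous connection: the communications work starts in a
            # separate thread and control returns here before it completes.
            future = self._comms.submit(remote_round_trip, target, message, data)
            # Wait for the response so that Send stays synchronous
            # from the CBO programmer's point of view.
            return future.result()
        raise KeyError(target)

    def close(self):
        self._comms.shutdown()


infra = Infrastructure()
infra.add_local("Customer/A123", LocalCustomer())
infra.add_remote("Customer/B456")

local_reply = infra.send("Customer/A123", "QueryName")
remote_reply = infra.send("Customer/B456", "QueryName")
infra.close()
```

The calling code is identical for both targets; only the infrastructure knows which one involved a communications task.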

8.3.3             Name space for objects

For performance reasons, an object ID is an internal token, often only of meaning to the system in which it is used.  But a given CBO cannot always be expected to know the object ID of the CBO to which it wishes to send a message.  So how does it send the message?  The answer is to have a scheme such that the infrastructure can construct the correct object ID from an external name, such as class ‘Customer’, instance ‘A123’ (the instance name in this case is the customer number).

Similarly, the infrastructure must be able to provide the programmer with class and instance names given an object ID.
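One possible shape for such a scheme is a two-way table kept by the infrastructure.  The sketch below is an assumption made for illustration: the class NameSpace, its method names, and the use of small integers as object IDs are all hypothetical.

```python
import itertools


class NameSpace:
    """Hypothetical two-way mapping between external names and object IDs."""

    def __init__(self):
        self._ids = {}      # (class name, instance name) -> object ID
        self._names = {}    # object ID -> (class name, instance name)
        self._next_id = itertools.count(1)

    def object_id(self, class_name, instance_name):
        # Construct (or look up) the internal token for an external name.
        key = (class_name, instance_name)
        if key not in self._ids:
            oid = next(self._next_id)
            self._ids[key] = oid
            self._names[oid] = key
        return self._ids[key]

    def names(self, oid):
        # Recover the class and instance names from an object ID.
        return self._names[oid]


ns = NameSpace()
customer_id = ns.object_id("Customer", "A123")
other_id = ns.object_id("Customer", "B456")
class_name, instance_name = ns.names(customer_id)
```

The object ID remains a token of meaning only to this one system; any CBO that knows the external name can still obtain it.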

 


8.3.4              Class name vs. code

By ‘code’ I mean the code which implements a given class.  It might be thought that code for a given class of CBO cannot be separated from the class.  However, consider the following:

Code for a class ‘Customer’ is developed.  It includes methods for display, update, etc.  Another class ‘Account’ is also developed.  The code for the two classes is developed independently; each code module is a separate executable.

Within the ‘Account’ class code, assume that a message is sent (by name) to an instance of the class ‘Customer’.  Both customer and account are used together by some set of users.  So far so good.

Now assume that code for a third class is required by another set of users.  This class is called ‘Customer_1’, and is intended to be a subclass of ‘Customer’, adding (say) a delete method.  If this second set of users also uses the ‘Account’ class, then it would appear that Account cannot interact with ‘Customer_1’, since it sends messages only to instances of class ‘Customer’.  Thus the code for ‘Account’ needs to be changed for the second set of users.  

This conclusion is unwelcome.  In particular, if you wish to build CBOs which can be customized by building a subclass, it means that you cannot do it—because the customizers would also have to change code in other CBOs which send messages to the class that is being subclassed.

A solution is to have the infrastructure separate class names from the code that implements classes.  Let's assume that class Customer is implemented by code called ‘Cust’, Account by code called ‘Acct’, and Customer_1 by code called ‘Cust_1’. The infrastructure might read a definition file relating classes and implementations at start-up.  Each set of users would be provided with a different definition file, as shown in Figure 8.4.

Figure 8.4.    Class vs. implementation.

 

In this figure, ‘Code’ means the name of the executable (on disk, in some library) that implements a given class.  ‘Super’ is the name of the class's superclass (from which it inherits behaviour).  As can be seen, the second set of users has a definition file in which the class Customer is implemented by the code ‘Cust_1’.  For these users, the Customer class has ‘ABC’ as its superclass, and the code for ABC is ‘Cust’ (which for the first set of users is the code for class Customer).

In this way, the code ‘Acct’ successfully sends to class Customer for both sets of users—without modifying any code.  However, the behaviour of class Customer is different for each set of users (the second set has a delete capability). 
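The mechanism can be sketched as follows.  The file layout (one line per class giving class, code and superclass, with ‘-’ meaning no superclass) and the function names are assumptions for illustration; the class and code names are those used in the text.  The superclass entries for the first set of users are not stated in the text and are assumed here to be empty.

```python
# Hypothetical definition files: class name, code name, superclass name.
USERS_1 = """
Customer  Cust    -
Account   Acct    -
"""

USERS_2 = """
Customer  Cust_1  ABC
ABC       Cust    -
Account   Acct    -
"""


def load_definitions(text):
    """Read a definition file into {class: (code, superclass or None)}."""
    table = {}
    for line in text.strip().splitlines():
        class_name, code, super_name = line.split()
        table[class_name] = (code, None if super_name == "-" else super_name)
    return table


def code_chain(table, class_name):
    """List the code modules behind a class, most specific first."""
    chain = []
    while class_name is not None:
        code, class_name = table[class_name]
        chain.append(code)
    return chain


first = load_definitions(USERS_1)
second = load_definitions(USERS_2)

# 'Acct' always sends to class Customer; only the definition file differs.
customer_code_1 = code_chain(first, "Customer")
customer_code_2 = code_chain(second, "Customer")
```

For the second set of users a message sent to class Customer would be offered first to ‘Cust_1’ (which adds the delete method) and then to ‘Cust’, while ‘Acct’ itself is the same executable in both cases.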

Again, where several views have the same behaviour (regardless of differences in their appearance on the screen), then a single ‘view’ CBO could be c