UNIT – 10
FILE MANAGEMENT
What is a File?
Abstractly, a file is a collection of bytes stored on a secondary storage device, which is generally a disk of some kind. The collection of bytes may be interpreted, for example, as characters, words, lines, paragraphs and pages from a textual document; fields and records belonging to a database; or pixels from a graphical image. The meaning attached to a particular file is determined entirely by the data structures and operations used by a program to process the file. It is conceivable (and it sometimes happens) that a graphics file will be read and displayed by a program designed to process textual data. The result is (probably) that no meaningful output occurs, and this is to be expected. A file is simply a machine-decipherable storage medium where programs and data are stored for machine usage.
Essentially, there are two kinds of files that programmers deal with: text files and binary files. These two classes of files are discussed in the following sections.
ASCII Text files
A text file is a stream of characters that a computer processes sequentially, and only in the forward direction. For this reason a text file is usually opened for only one kind of operation (reading, writing, or appending) at any given time.
Similarly, since text files only process characters, they can only read or write data one character at a time. (In the C programming language, functions are provided that deal with lines of text, but these still essentially process data one character at a time.) A text stream in C is a special kind of file. Depending on the requirements of the operating system, newline characters may be converted to or from carriage-return/linefeed combinations depending on whether data is being written to, or read from, the file. Other character conversions may also occur to satisfy the storage requirements of the operating system. These translations occur transparently, and they occur because the programmer has signalled the intention to process a text file.
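The newline translation described above can be observed with a short sketch. The helper names write_line and count_bytes are invented for this illustration, and the file name is whatever scratch name the caller chooses. The same two characters are written to a file using the mode the caller passes, and the bytes actually stored are then counted by reading the file back in binary mode.

```c
#include <stdio.h>

/* Write the two characters 'A' and '\n' to path using the given fopen mode.
   Returns 0 on success, -1 on failure. */
int write_line(const char *path, const char *mode)
{
    FILE *fp = fopen(path, mode);

    if (fp == NULL)
        return -1;
    if (fputs("A\n", fp) == EOF) {
        fclose(fp);
        return -1;
    }
    return fclose(fp) == 0 ? 0 : -1;
}

/* Count the bytes stored in the file by reading it in binary mode,
   so that no translation hides what is really on disk.
   Returns -1 if the file cannot be opened. */
long count_bytes(const char *path)
{
    FILE *fp = fopen(path, "rb");
    long n = 0;

    if (fp == NULL)
        return -1;
    while (getc(fp) != EOF)
        n++;
    fclose(fp);
    return n;
}
```

After write_line(name, "wb") the count is always 2. After write_line(name, "w") it is 2 on systems that store a newline as a single linefeed, and may be 3 on systems that store it as a carriage-return/linefeed pair.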
Binary files
A binary file is no different to a text file. It is a collection of bytes. In C Programming Language a byte and a character are equivalent. Hence a binary file is also referred to as a character stream, but there are two essential differences.
1. No special processing of the data occurs and each byte of data is transferred to or from the disk unprocessed.
2. The C programming language imposes no structure on the file, and it may be read from, or written to, in any manner chosen by the programmer.
Binary files can be either processed sequentially or, depending on the needs of the application, they can be processed using random access techniques. In the C programming language, processing a file using random access techniques involves moving the current file position to an appropriate place in the file before reading or writing data. This indicates a second characteristic of binary files – they are generally processed using read and write operations simultaneously.
For example, a database file will be created and processed as a binary file. A record update operation will involve locating the appropriate record, reading the record into memory, modifying it in some way, and finally writing the record back to disk at its appropriate location in the file. These kinds of operations are common to many binary files, but are rarely found in applications that process text files.
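The record update just described can be sketched as follows. The record layout, the function names, and the file name passed by the caller are all hypothetical and chosen only for this illustration; a real database file would have its own format. The sketch uses the standard functions fseek, fread and fwrite to locate a record, read it, modify it, and write it back at the same position.

```c
#include <stdio.h>

/* Hypothetical fixed-size record used only for this illustration. */
struct record {
    int id;
    int balance;
};

/* Create a file holding count records with ids 0..count-1 and balance 0.
   Returns 0 on success, -1 on failure. */
int create_records(const char *path, int count)
{
    FILE *fp = fopen(path, "wb");      /* binary mode: bytes are not translated */
    struct record r;
    int i;

    if (fp == NULL)
        return -1;
    for (i = 0; i < count; i++) {
        r.id = i;
        r.balance = 0;
        if (fwrite(&r, sizeof r, 1, fp) != 1) {
            fclose(fp);
            return -1;
        }
    }
    return fclose(fp) == 0 ? 0 : -1;
}

/* Locate record number index, read it into memory, change its balance,
   and write it back in place. Returns 0 on success, -1 on failure. */
int update_balance(const char *path, long index, int new_balance)
{
    FILE *fp = fopen(path, "r+b");     /* read and write an existing file */
    struct record r;
    long offset = index * (long)sizeof r;

    if (fp == NULL)
        return -1;
    if (fseek(fp, offset, SEEK_SET) != 0 || fread(&r, sizeof r, 1, fp) != 1) {
        fclose(fp);
        return -1;
    }
    r.balance = new_balance;
    /* The file position must be set again between a read and a write. */
    if (fseek(fp, offset, SEEK_SET) != 0 || fwrite(&r, sizeof r, 1, fp) != 1) {
        fclose(fp);
        return -1;
    }
    return fclose(fp) == 0 ? 0 : -1;
}

/* Read the balance of record number index into *balance.
   Returns 0 on success, -1 on failure. */
int read_balance(const char *path, long index, int *balance)
{
    FILE *fp = fopen(path, "rb");
    struct record r;
    int ok;

    if (fp == NULL)
        return -1;
    ok = fseek(fp, index * (long)sizeof r, SEEK_SET) == 0
         && fread(&r, sizeof r, 1, fp) == 1;
    fclose(fp);
    if (!ok)
        return -1;
    *balance = r.balance;
    return 0;
}
```

Because every record has the same size, the position of record number index is simply index multiplied by the record size, which is what makes random access practical here.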
Creating a file and outputting some data
In order to create files we have to learn about File I/O i.e. how to write data into a file and how to read data from a file. We will start this section with an example of writing data to a file. We begin as before with the include statement for stdio.h, then define some variables for use in the example including a rather strange looking new type.
/* Program to create a file and write some data to the file */
#include <stdio.h>
#include <string.h>
int main( )
{
    FILE *fp; /* file pointer */
    char stuff[25];
    int index;
    fp = fopen("TENLINES.TXT","w"); /* open for writing */
    if (fp == NULL)
        return 1; /* the file could not be created */
    strcpy(stuff,"This is an example line.");
    for (index = 1; index <= 10; index++)
        fprintf(fp,"%s Line number %d\n", stuff, index);
    fclose(fp); /* close the file before ending program */
    return 0;
}
The type FILE is used for a file variable and is defined in the stdio.h file. It is used to define a file pointer for use in file operations. Before we can write to a file, we must open it. What this really means is that we must tell the system that we want to write to a file and what the file name is. We do this with the fopen() function illustrated in the first line of the program. The file pointer, fp in our case, points to the file and two arguments are required in the parentheses, the file name first, followed by the file type.
The file name is any valid DOS file name, and can be expressed in upper or lower case letters, or even mixed if you so desire. It is enclosed in double quotes. For this example we have chosen the name TENLINES.TXT. This file should not exist on your disk at this time. If you have a file with this name, you should change its name or move it because when we execute this program, its contents will be erased. If you don’t have a file by this name, that is good because we will create one and put some data into it. You are permitted to include a directory with the file name. The directory must, of course, be a valid directory, otherwise an error will occur. Also, because of the way C handles literal strings, the directory separation character ‘\’ must be written twice. For example, if the file is to be stored in the PROJECTS subdirectory then the file name should be entered as "\\PROJECTS\\TENLINES.TXT". The second parameter is the file attribute and can be any of three letters, r, w, or a, and must be lower case.
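The doubling of the backslash can be checked with a small sketch. The path is the one used in the paragraph above; the function names are invented for this illustration. Each doubled backslash in the source text becomes a single backslash character in the string that the program actually stores.

```c
#include <string.h>

/* Each \\ written in the source is one backslash in the stored string. */
static const char path[] = "\\PROJECTS\\TENLINES.TXT";

/* Number of characters actually stored in the string. */
size_t path_length(void)
{
    return strlen(path);
}

/* Number of backslash characters actually stored in the string. */
int backslash_count(void)
{
    int n = 0;
    const char *p;

    for (p = path; *p != '\0'; p++)
        if (*p == '\\')
            n++;
    return n;
}
```

Here path_length() gives 22 and backslash_count() gives 2, even though four backslashes were typed in the source.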
Reading (r)
When an r is used, the file is opened for reading; a w indicates a file to be used for writing; and an a indicates that you want to append additional data to the data already in an existing file. Most C compilers have other file attributes available; check your Reference Manual for details. Using the r indicates that the file is assumed to be a text file. Opening a file for reading requires that the file already exist. If it does not exist, the file pointer will be set to NULL and can be checked by the program.
Here is a small program that reads a file and displays its contents on screen.
/* Program to display the contents of a file on screen */
#include <stdio.h>
int main()
{
    FILE *fp;
    int c;
    fp = fopen("prog.c","r");
    if (fp == NULL)
        return 1; /* the file does not exist */
    c = getc(fp);
    while (c != EOF)
    {
        putchar(c);
        c = getc(fp);
    }
    fclose(fp);
    return 0;
}
Writing (w)
When a file is opened for writing, it will be created if it does not already exist and it will be reset if it does, resulting in the deletion of any data already there. Using the w indicates that the file is assumed to be a text file.
Here is the program to create a file and write some data into the file.
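#include <stdio.h>
int main()
{
    FILE *fp;
    /* Create a file and add text; the name example.txt is only a placeholder */
    fp = fopen("example.txt","w");
    if (fp == NULL)
        return 1; /* the file could not be created */
    fprintf(fp,"%s","This is just an example :)"); /* writes data to the file */
    fclose(fp); /* done! */
    return 0;
}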
Appending (a):
When a file is opened for appending, it will be created if it does not already exist and it will be initially empty. If it does exist, the data input point will be positioned at the end of the present data so that any new data will be added to any data that already exists in the file. Using the a indicates that the file is assumed to be a text file.
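Here is a program that adds text to a file that already exists and already contains some text.
#include <stdio.h>
int main()
{
    FILE *fp;
    /* Open for appending; the name example.txt is only a placeholder */
    fp = fopen("example.txt","a");
    if (fp == NULL)
        return 1; /* the file could not be opened */
    fprintf(fp,"%s","This is just an example :)"); /* append some text */
    fclose(fp);
    return 0;
}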
Outputting to the file
The job of actually outputting to the file is nearly identical to the outputting we have already done to the standard output device. The only real differences are the new function names and the addition of the file pointer as one of the function arguments. In the example program, fprintf replaces our familiar printf function name, and the file pointer defined earlier is the first argument within the parentheses. The remainder of the statement looks like, and in fact is identical to, the printf statement.
Closing a file
To close a file you simply use the function fclose with the file pointer in the parentheses. Actually, in this simple program, it is not necessary to close the file because the system will close all open files before returning to DOS, but it is good programming practice for you to close all files in spite of the fact that they will be closed automatically, because that would act as a reminder to you of what files are open at the end of each program.
You can open a file for writing, close it, and reopen it for reading, then close it, and open it again for appending, etc. Each time you open it, you could use the same file pointer, or you could use a different one. The file pointer is simply a tool that you use to point to a file and you decide what file it will point to. Compile and run this program. When you run it, you will not get any output to the monitor because it doesn’t generate any. After running it, look at your directory for a file named TENLINES.TXT and type it; that is where your output will be. Compare the output with that specified in the program; they should agree! Do not erase the file named TENLINES.TXT yet; we will use it in some of the other examples in this section.
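Reusing one file pointer for several opens of the same file can be sketched like this. The function name and the two lines of text are invented for the illustration, and the file name is supplied by the caller so that TENLINES.TXT is left untouched. The file is opened for writing, closed, opened for appending, closed, and finally opened for reading to count the lines it contains.

```c
#include <stdio.h>

/* Write one line to path, append a second line, then read the file back
   and count its lines. The same pointer variable is reused for all three
   opens. Returns the number of lines, or -1 on failure. */
int write_append_count(const char *path)
{
    FILE *fp;
    int c;
    int lines = 0;

    fp = fopen(path, "w");             /* create or reset the file */
    if (fp == NULL)
        return -1;
    fprintf(fp, "first line\n");
    fclose(fp);

    fp = fopen(path, "a");             /* add to the end of existing data */
    if (fp == NULL)
        return -1;
    fprintf(fp, "second line\n");
    fclose(fp);

    fp = fopen(path, "r");             /* read back what was stored */
    if (fp == NULL)
        return -1;
    while ((c = getc(fp)) != EOF)
        if (c == '\n')
            lines++;
    fclose(fp);
    return lines;
}
```

The result is 2, which shows that the append did not erase the line written by the first open.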
Reading from a text file
Now for our first program that reads from a file. This program begins with the familiar include, some data definitions, and the file opening statement which should require no explanation except for the fact that an r is used here because we want to read it.
#include <stdio.h>
int main( )
{
    FILE *fp;
    int c; /* int, not char, so that EOF can be told apart from data */
    fp = fopen("TENLINES.TXT", "r");
    if (fp == NULL)
        printf("File doesn't exist\n");
    else {
        do {
            c = getc(fp); /* get one character from the file */
            putchar(c); /* display it on the monitor */
        } while (c != EOF); /* repeat until EOF (end of file) */
        fclose(fp); /* close only a file that was actually opened */
    }
    return 0;
}
In this program we check to see that the file exists, and if it does, we execute the main body of the program. If it doesn’t, we print a message and quit. If the file does not exist, the system will set the pointer equal to NULL, which we can test. The main body of the program is one do while loop in which a single character is read from the file and output to the monitor until an EOF (end of file) is detected from the input file. The file is then closed and the program is terminated. At this point, we have the potential for one of the most common and most perplexing problems of programming in C. The getc function returns an int, not a char, so that it can return every possible character value as well as the distinct value EOF, which is negative (usually minus one). A problem develops if the result is stored in an unsigned char, because an unsigned char can only hold the values zero to 255 and will therefore hold 255 in place of minus one. The comparison with EOF then never succeeds, so the program can never find the EOF and will never terminate the loop. This is a very frustrating problem to try to find. A plain char is not safe either: on systems where char is unsigned the same failure occurs, and where it is signed a data byte of 255 can be mistaken for EOF. This is easy to prevent: always use an int type variable to receive the value returned by getc. There is another problem with this program, but we will worry about it when we get to the next program and solve it with the one following that.
After you compile and run this program and are satisfied with the results, it would be a good exercise to change the name of TENLINES.TXT and run the program again to see that the NULL test actually works as stated. Be sure to change the name back because we are still not finished with TENLINES.TXT.
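The EOF handling discussed above can be sketched as follows. The function names are invented for this illustration and the file name is supplied by the caller. The first function uses the common idiom of reading into an int and testing for EOF in the loop condition; the second shows why an unsigned char can never compare equal to EOF.

```c
#include <stdio.h>

/* Count the characters in a text file.
   Returns -1 if the file cannot be opened. */
long count_chars(const char *path)
{
    FILE *fp = fopen(path, "r");
    int c;                  /* int, so EOF stays distinct from every character */
    long n = 0;

    if (fp == NULL)
        return -1;
    while ((c = getc(fp)) != EOF)   /* read, then test, in one step */
        n++;
    fclose(fp);
    return n;
}

/* Returns 1 if storing EOF in an unsigned char loses the EOF value. */
int eof_lost_in_unsigned_char(void)
{
    unsigned char uc = (unsigned char)EOF;  /* negative EOF wraps to a positive value */

    return uc != EOF;       /* uc is never negative, EOF always is */
}
```

Because the test happens before the character is used, this form of the loop never processes the EOF value as if it were data.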
UNIT – 11
C – PREPROCESSOR
Overview
The C preprocessor, often known as cpp, is a macro processor that is used automatically by the C compiler to transform your program before compilation. It is called a macro processor because it allows you to define macros, which are brief abbreviations for longer constructs.
The C preprocessor is intended to be used only with C, C++, and Objective-C source code. In the past, it has been abused as a general text processor. It will choke on input which does not obey C’s lexical rules. For example, apostrophes will be interpreted as the beginning of character constants, and cause errors. Also, you cannot rely on it preserving characteristics of the input which are not significant to C-family languages. If a Makefile is preprocessed, all the hard tabs will be removed, and the Makefile will not work.
Having said that, you can often get away with using cpp on things which are not C. Other Algol-ish programming languages are often safe (Pascal, Ada, etc.). So is assembly, with caution. The -traditional-cpp mode preserves more white space, and is otherwise more permissive. Many of the problems can be avoided by writing C or C++ style comments instead of native language comments, and keeping macros simple.
Include Syntax
Both user and system header files are included using the preprocessing directive `#include’. It has two variants:
#include <file>
This variant is used for system header files. It searches for a file named file in a standard list of system directories. You can prepend directories to this list with the -I option (see Invocation).
#include "file"
This variant is used for header files of your own program. It searches for a file named file first in the directory containing the current file, then in the quote directories and then the same directories used for <file>. You can prepend directories to the list of quote directories with the -iquote option.
The argument of `#include’, whether delimited with quote marks or angle brackets, behaves like a string constant in that comments are not recognized, and macro names are not expanded. Thus, #include <x/*y> specifies inclusion of a system header file named x/*y.
However, if backslashes occur within file, they are considered ordinary text characters, not escape characters. None of the character escape sequences appropriate to string constants in C are processed. Thus, #include "x\n\\y" specifies a filename containing three backslashes. (Some systems interpret `\’ as a pathname separator. All of these also interpret `/’ the same way. It is most portable to use only `/’.)
It is an error if there is anything (other than comments) on the line after the file name.
Object-like Macros
An object-like macro is a simple identifier which will be replaced by a code fragment. It is called object-like because it looks like a data object in code that uses it. They are most commonly used to give symbolic names to numeric constants.
You create macros with the `#define’ directive. `#define’ is followed by the name of the macro and then the token sequence it should be an abbreviation for, which is variously referred to as the macro’s body, expansion or replacement list. For example,
#define BUFFER_SIZE 1024
defines a macro named BUFFER_SIZE as an abbreviation for the token 1024. If somewhere after this `#define’ directive there comes a C statement of the form
foo = (char *) malloc (BUFFER_SIZE);
then the C preprocessor will recognize and expand the macro BUFFER_SIZE. The C compiler will see the same tokens as it would if you had written
foo = (char *) malloc (1024);
By convention, macro names are written in uppercase. Programs are easier to read when it is possible to tell at a glance which names are macros.
The macro’s body ends at the end of the `#define’ line. You may continue the definition onto multiple lines, if necessary, using backslash-newline. When the macro is expanded, however, it will all come out on one line. For example,
#define NUMBERS 1, \
                2, \
                3
int x[] = { NUMBERS };
==> int x[] = { 1, 2, 3 };
The most common visible consequence of this is surprising line numbers in error messages.
There is no restriction on what can go in a macro body provided it decomposes into valid preprocessing tokens. Parentheses need not balance, and the body need not resemble valid C code. (If it does not, you may get error messages from the C compiler when you use the macro.)
The C preprocessor scans your program sequentially. Macro definitions take effect at the place you write them. Therefore, the following input to the C preprocessor
foo = X;
#define X 4
bar = X;
produces
foo = X;
bar = 4;
When the preprocessor expands a macro name, the macro’s expansion replaces the macro invocation, then the expansion is examined for more macros to expand. For example,
#define TABLESIZE BUFSIZE
#define BUFSIZE 1024
TABLESIZE
==> BUFSIZE
==> 1024
TABLESIZE is expanded first to produce BUFSIZE, then that macro is expanded to produce the final result, 1024.
Notice that BUFSIZE was not defined when TABLESIZE was defined. The `#define’ for TABLESIZE uses exactly the expansion you specify—in this case, BUFSIZE—and does not check to see whether it too contains macro names. Only when you use TABLESIZE is the result of its expansion scanned for more macro names.
This makes a difference if you change the definition of BUFSIZE at some point in the source file. TABLESIZE, defined as shown, will always expand using the definition of BUFSIZE that is currently in effect:
#define BUFSIZE 1020
#define TABLESIZE BUFSIZE
#undef BUFSIZE
#define BUFSIZE 37
Now TABLESIZE expands (in two stages) to 37.
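The effect of redefining BUFSIZE can be checked with a small C sketch. The two function names are invented for this illustration; the macro definitions are the ones shown above. Each function captures the value that TABLESIZE expands to at the point in the file where the function is written.

```c
#define BUFSIZE 1020
#define TABLESIZE BUFSIZE

/* At this point TABLESIZE expands to BUFSIZE, which expands to 1020. */
int table_size_before(void)
{
    return TABLESIZE;
}

#undef BUFSIZE
#define BUFSIZE 37

/* TABLESIZE was not redefined, but it now expands through the
   new definition of BUFSIZE and so yields 37. */
int table_size_after(void)
{
    return TABLESIZE;
}
```

The first function returns 1020 and the second returns 37, even though the definition of TABLESIZE itself never changed.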
Conditional Syntax
A conditional in the C preprocessor begins with a conditional directive: `#if’, `#ifdef’ or `#ifndef’.
Ifdef
The simplest sort of conditional is
#ifdef MACRO
controlled text
#endif /* MACRO */
This block is called a conditional group. controlled text will be included in the output of the preprocessor if and only if MACRO is defined. We say that the conditional succeeds if MACRO is defined, fails if it is not.
The controlled text inside of a conditional can include preprocessing directives. They are executed only if the conditional succeeds. You can nest conditional groups inside other conditional groups, but they must be completely nested. In other words, `#endif’ always matches the nearest `#ifdef’ (or `#ifndef’, or `#if’). Also, you cannot start a conditional group in one file and end it in another.
Even if a conditional fails, the controlled text inside it is still run through initial transformations and tokenization. Therefore, it must all be lexically valid C. Normally the only way this matters is that all comments and string literals inside a failing conditional group must still be properly ended.
The comment following the `#endif’ is not required, but it is a good practice if there is a lot of controlled text, because it helps people match the `#endif’ to the corresponding `#ifdef’. Older programs sometimes put MACRO directly after the `#endif’ without enclosing it in a comment. This is invalid code according to the C standard. CPP accepts it with a warning. It never affects which `#ifndef’ the `#endif’ matches.
Sometimes you wish to use some code if a macro is not defined. You can do this by writing `#ifndef’ instead of `#ifdef’. One common use of `#ifndef’ is to include code only the first time a header file is included. See Once-Only Headers.
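The once-only use of `#ifndef’ can be sketched as follows. The header name point.h, the guard macro POINT_H, and the structure are all hypothetical. To keep the sketch in a single file, the contents of the header are written out twice, which is what the compiler would see if the header were included twice.

```c
/* First copy of the hypothetical header point.h. POINT_H is not yet
   defined, so the conditional succeeds and the text is kept. */
#ifndef POINT_H
#define POINT_H
struct point {
    int x;
    int y;
};
#endif /* POINT_H */

/* Second copy, as if point.h had been included again. POINT_H is now
   defined, so the conditional fails and the duplicate is skipped. */
#ifndef POINT_H
#define POINT_H
struct point {
    int x;
    int y;
};
#endif /* POINT_H */

/* Uses the structure, which was declared exactly once. */
int point_sum(struct point p)
{
    return p.x + p.y;
}
```

Without the `#ifndef’ lines the second copy would declare struct point again and the compiler would report an error.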
If
The `#if’ directive allows you to test the value of an arithmetic expression, rather than the mere existence of one macro. Its syntax is
#if expression
controlled text
#endif /* expression */
expression is a C expression of integer type, subject to stringent restrictions. It may contain
- Integer constants.
- Character constants, which are interpreted as they would be in normal code.
- Arithmetic operators for addition, subtraction, multiplication, division, bitwise operations, shifts, comparisons, and logical operations (&& and ||). The latter two obey the usual short-circuiting rules of standard C.
- Macros. All macros in the expression are expanded before actual computation of the expression’s value begins.
- Uses of the defined operator, which lets you check whether macros are defined in the middle of an `#if’.
- Identifiers that are not macros, which are all considered to be the number zero. This allows you to write #if MACRO instead of #ifdef MACRO, if you know that MACRO, when defined, will always have a nonzero value. Function-like macros used without their function call parentheses are also treated as zero.
Defined
The special operator defined is used in `#if’ and `#elif’ expressions to test whether a certain name is defined as a macro. defined name and defined (name) are both expressions whose value is 1 if name is defined as a macro at the current point in the program, and 0 otherwise. Thus, #if defined MACRO is precisely equivalent to #ifdef MACRO.
defined is useful when you wish to test more than one macro for existence at once. For example,
#if defined (__vax__) || defined (__ns16000__)
would succeed if either of the names __vax__ or __ns16000__ is defined as a macro.
Conditionals written like this:
#if defined BUFSIZE && BUFSIZE >= 1024
can generally be simplified to just #if BUFSIZE >= 1024, since if BUFSIZE is not defined, it will be interpreted as having the value zero.
If the defined operator appears as a result of a macro expansion, the C standard says the behavior is undefined. GNU cpp treats it as a genuine defined operator and evaluates it normally. It will warn wherever your code uses this feature if you use the command-line option -pedantic, since other compilers may handle it differently.
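The rules for `#if’ and defined described above can be tried out with a short sketch. The macro DEBUG_LEVEL and the three function names are invented for this illustration, and DEBUG_LEVEL is deliberately left undefined. Each function returns 1 only if the conditional inside it succeeds.

```c
#define BUFSIZE 2048
/* DEBUG_LEVEL is deliberately not defined anywhere in this file. */

/* An identifier that is not a macro counts as 0 in #if. */
int debug_enabled(void)
{
    int enabled = 0;
#if DEBUG_LEVEL
    enabled = 1;
#endif
    return enabled;
}

/* The defined operator combined with an ordinary comparison. */
int large_buffer(void)
{
    int large = 0;
#if defined BUFSIZE && BUFSIZE >= 1024
    large = 1;
#endif
    return large;
}

/* Testing more than one macro for existence at once. */
int either_defined(void)
{
    int found = 0;
#if defined (DEBUG_LEVEL) || defined (BUFSIZE)
    found = 1;
#endif
    return found;
}
```

With the definitions shown, debug_enabled() returns 0 while large_buffer() and either_defined() both return 1. Some compilers can be asked to warn when an undefined identifier is used in `#if’, which is one reason the longer form with defined is sometimes preferred.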
Else
The `#else’ directive can be added to a conditional to provide alternative text to be used if the condition fails. This is what it looks like:
#if expression
text-if-true
#else /* Not expression */
text-if-false
#endif /* Not expression */
If expression is nonzero, the text-if-true is included and the text-if-false is skipped. If expression is zero, the opposite happens.
You can use `#else’ with `#ifdef’ and `#ifndef’, too.
Elif
One common case of nested conditionals is used to check for more than two possible alternatives. For example, you might have
#if X == 1
…
#else /* X != 1 */
#if X == 2
…
#else /* X != 2 */
…
#endif /* X != 2 */
#endif /* X != 1 */
Another conditional directive, `#elif’, allows this to be abbreviated as follows:
#if X == 1
…
#elif X == 2
…
#else /* X != 2 and X != 1*/
…
#endif /* X != 2 and X != 1*/
`#elif’ stands for “else if”. Like `#else’, it goes in the middle of a conditional group and subdivides it; it does not require a matching `#endif’ of its own. Like `#if’, the `#elif’ directive includes an expression to be tested. The text following the `#elif’ is processed only if the original `#if’-condition failed and the `#elif’ condition succeeds.
More than one `#elif’ can go in the same conditional group. Then the text after each `#elif’ is processed only if the `#elif’ condition succeeds after the original `#if’ and all previous `#elif’ directives within it have failed.
`#else’ is allowed after any number of `#elif’ directives, but `#elif’ may not follow `#else’.
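A conditional group with `#elif’ and `#else’ can be sketched as a complete fragment. The value given to X and the names SELECTED and selected_value are invented for this illustration. Exactly one of the three `#define’ lines survives preprocessing.

```c
#define X 2   /* hypothetical setting; try 1 or 3 to see the other branches */

#if X == 1
#define SELECTED 10
#elif X == 2
#define SELECTED 20
#else /* X != 2 and X != 1 */
#define SELECTED 0
#endif /* X != 2 and X != 1 */

/* Returns whichever value the conditional group selected. */
int selected_value(void)
{
    return SELECTED;
}
```

With X defined as 2, the `#if’ condition fails, the `#elif’ condition succeeds, and selected_value() returns 20.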
SYSTEM DEVELOPMENT
Fig: Source code processing mechanism
Introduction
A programming language is a notational system for describing tasks/computations in a machine- and human-readable form.
Most computer languages are designed to facilitate certain operations and not others: numerical computation, or text manipulation, or I/O.
More broadly, a computer language typically embodies a particular programming paradigm.
Characteristics of a programming language:
Every language has syntax and semantics:
∙ Syntax: The syntax of a program is the form of its declarations, expressions, statements and program units.
∙ Semantics: The semantics of a program is concerned with its meaning.
Programming Languages
Languages are used to communicate between different entities. A computer language makes it possible to talk to a computer and ask it to perform specific work. Programs written in a computer language are executed by the CPU, which then instructs the other parts of the computer to perform work accordingly. Computers only understand programs in their own machine language. Machine language is the language of 0’s and 1’s. It is difficult to write programs in machine language.
Examples of computer languages:
(i) High-level language
(ii) Machine language
High-Level Languages (HLLs)
A high-level language is one which hides the details of how a computer operates in favor of a more abstract, human way of instructing it to perform tasks.
HLLs are programming languages that look like natural language text.
ADVANTAGES:
(1) They make programming easier and more abstract.
(2) HLL programs are machine independent: they can run on different hardware platforms, i.e. different computers with different instruction sets. A compiler is the program that makes this possible.
Machine language:
Sometimes referred to as machine code or object code. Machine language is a collection of binary digits or bits that the computer reads and interprets. Machine language is the only language a computer is capable of understanding.
Mapping Between HLL and Machine Language
Translating HLL programs to machine language programs is not a one-to-one mapping. A HLL instruction (usually called a statement) will be translated to one or more machine language instructions. The number of machine instructions produced depends on how well the compiler optimizes the machine language program it generates from the HLL program. A machine language program is produced by a compiler or an assembler. Machine language programs produced by compilers can be less efficient than carefully hand-written ones (i.e. they may contain unnecessary instructions that increase processing and slow down execution).
Assembly language
Assembly language is the most basic programming language available for any processor. With assembly language, a programmer works only with operations implemented directly on the physical CPU. Assembly language lacks high-level conveniences such as variables and functions, and it is not portable between various families of processors. Nevertheless, assembly language is the most powerful computer programming language available, and it gives programmers the insight required to write effective code in high-level languages.
Machine code for displaying $ sign on lower right corner of screen.
10111000, 00000000, 10111000, 10001110, 11011000, 11000110, 00000110, 10011110, 00001111, 00100100, 11001101, 00100000
The program above, written in assembly language, looks like this:
MOV AX, 47104
MOV DS, AX
MOV [3998], 36
INT 32
When an assembler reads this sample program, it converts each line of code into one CPU-level instruction.
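The numbers in this program follow from simple arithmetic, sketched here in C. The sketch assumes the IBM PC colour text mode of 80 columns by 25 rows, in which each character cell occupies two bytes and video memory starts at segment B800 hexadecimal; these assumptions, and the names used, are not stated in the text above. It also assumes the ASCII character set, in which the $ sign has code 36.

```c
#define SCREEN_COLS    80   /* assumed text mode width */
#define SCREEN_ROWS    25   /* assumed text mode height */
#define BYTES_PER_CELL 2    /* one byte for the character, one for its colour */

/* Byte offset of the cell at (row, col), both counted from zero. */
long cell_offset(int row, int col)
{
    return ((long)row * SCREEN_COLS + col) * BYTES_PER_CELL;
}

/* Physical address formed from a segment and an offset in real mode. */
long physical_address(long segment, long offset)
{
    return segment * 16 + offset;
}
```

The lower right corner is row 24, column 79, so cell_offset(24, 79) gives 3998, the address used in the third instruction. The value 47104 loaded into AX is B800 hexadecimal, and 32 is 20 hexadecimal, the interrupt number used by DOS to end a program.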
Advantages of learning Assembly language
1. Very useful for making efficient and fast running programs.
2. Very useful for making small programs for embedded system applications.
3. Easy to access hardware in assembly language.
4. Writing compact code.
The Compiler
A compiler is a program that translates high-level programs to machine code, either directly or via an assembler.
The Assembler
The program that translates from assembly language to machine language is called an assembler. It allows the programmer to specify the memory locations for his data and programs and symbolically refer to them in his assembly code. It will translate these symbolic addresses to actual (physical) addresses in the produced machine code.
The Linker
This is the program that is used to link together separately assembled/compiled programs into a single executable code. The linker program is used to create executable code which finally runs on the CPU.
Debugger and Monitor
These are the tools that allow assembly programmers to:
1. Display and alter the contents of memory and registers while running their code.
2. Disassemble their machine code (show the assembly language equivalent).
3. Run their programs, stop (or halt) them, run them step-by-step, or insert break points. Break points are positions in the program at which, if they are encountered during run time, the program is halted so the programmer can examine the memory and register contents and determine what went wrong.
Compilers and Interpreters
All high level language (HLL) programs need to be translated into machine code (aka 1′s and 0′s/binary). There are two forms of translators to make this happen: Compilers and Interpreters.
Compilers – These turn the whole HLL program into machine code in one go. This results in a file (object code) that can then be run directly, without the compiler, on any computer with a compatible processor and operating system, and it runs very quickly (as it does not need to be translated again). Unfortunately, if you have made a mistake (called a syntax error) in your program, then it won’t compile, so there is nothing to run.
Interpreters – These turn the HLL program into binary each time you try to run it. An interpreter goes through your program line by line, translating it every time it runs. This means that an interpreted program runs more slowly than a compiled one.
Fig: Translation of a program in HLL to machine language
1. Software Paradigms
Introduction
“Paradigm” (a Greek word meaning example) is commonly used to refer to a category of entities that share a common characteristic.
We can distinguish between three different kinds of Software Paradigms:
∙ Programming Paradigm is a model of how programmers communicate a calculation to computers.
∙ Software Design Paradigm is a model for implementing a group of applications sharing common properties.
∙ Software Development Paradigm, often referred to as Software Engineering, may be seen as a management model for implementing big software projects using engineering principles.
Fig: Software development paradigm
Programming Paradigm
A Programming Paradigm is a model for a class of Programming Languages that share a set of common characteristics.
A programming language is a system of signs used to communicate a task/algorithm to a computer, causing the task to be performed. The task to be performed is called a computation, which follows absolutely precise and unambiguous rules.
Components to any language
(i) The language paradigm is the set of general principles that are used by a programmer to communicate a task/algorithm to a computer.
(ii) The syntax of the language is a way of specifying what is legal in the phrase structure of the language; knowing the syntax is analogous to knowing how to spell and form sentences in a natural language like English. However, this doesn’t tell us anything about what the sentences mean.
(iii) The third component is semantics, or meaning, of a program in that language. Ultimately, without semantics, a programming language is just a collection of meaningless phrases; hence, the semantics is the crucial part of a language.
There have been a large number of programming languages. Back in the 60’s there were over 700 of them – most were academic, special purpose, or developed by an organization for their own needs.
Fortunately, there are just four major programming language paradigms:
∙ Imperative (Procedural) Paradigm (FORTRAN, C, Ada, etc.)
∙ Object-Oriented Paradigm (Smalltalk, Java, C++)
∙ Logic Paradigm (Prolog)
∙ Functional Paradigm (Lisp, ML, Haskell)
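As a rough illustration of how a paradigm shapes a program, the sketch below computes the same factorial in three styles in JavaScript. The names `factorialImperative`, `FactorialCalculator` and `factorialFunctional` are made up for this illustration, and the logic paradigm is left out because JavaScript has no direct counterpart to Prolog rules.

```javascript
// Imperative style: a variable is updated step by step.
function factorialImperative(n) {
  let result = 1;
  for (let i = 1; i <= n; i++) {
    result = result * i;
  }
  return result;
}

// Object-oriented style: data and behaviour live together in an object.
class FactorialCalculator {
  constructor(n) {
    this.n = n;
  }
  compute() {
    let result = 1;
    for (let i = 1; i <= this.n; i++) {
      result *= i;
    }
    return result;
  }
}

// Functional style: no variable is reassigned; the result is defined by recursion.
function factorialFunctional(n) {
  return n <= 1 ? 1 : n * factorialFunctional(n - 1);
}

console.log(factorialImperative(5));               // 120
console.log(new FactorialCalculator(5).compute()); // 120
console.log(factorialFunctional(5));               // 120
```

All three versions produce the same value; what differs is how the computation is expressed.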
Generally, the selected Programming Paradigm defines the main properties of software developed by means of a programming language supporting the paradigm:
• Scalability/modifiability
• Integrability/reusability
• Portability
• Performance
• Reliability
• Ease of creation
Software Design Paradigm
Software Design Paradigms embody the results of people’s ideas on how to construct programs and combine them into large software systems, together with formal mechanisms for how those ideas should be expressed.
Thus, we can say that a Software Design Paradigm is a model for a class of problems that share a set of common characteristics.
Software design paradigms can be sub-divided as:
∙ Design Patterns
∙ Components
∙ Software Architecture
∙ Frameworks
It should be especially noted that a particular Programming Paradigm essentially defines software design paradigms. For example, we can speak about Object-Oriented design patterns, procedural components (modules), functional software architecture, etc.
Design Patterns:
A design pattern is a proven solution for a general design problem. It consists of communicating ‘objects’ that are customized to solve the problem in a particular context.
Patterns have their origin in object-oriented programming where they began as collections of objects organized to solve a problem. There isn’t any fundamental relationship between patterns and objects; it just happens they began there. Patterns may have arisen because objects seem so elemental, but the problems we were trying to solve with them were so complex.
∙ Architectural Patterns: An architectural pattern expresses a fundamental structural organization or schema for software systems. It provides a set of predefined subsystems, specifies their responsibilities, and includes rules and guidelines for organizing the relationships between them.
∙ Idioms: An idiom is a low-level pattern specific to a programming language. An idiom describes how to implement particular aspects of components or the relationships between them using the features of the given language.
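One well-known design pattern, Observer, can serve as a small example of "communicating objects customized to solve a problem". The sketch below is only an illustration; the class name `Subject` and the sample subscribers are assumptions, not part of the original notes.

```javascript
// Subject: keeps a list of observers and notifies them of every change.
class Subject {
  constructor() {
    this.observers = [];
  }
  subscribe(observer) {
    this.observers.push(observer);
  }
  notify(value) {
    for (const observer of this.observers) {
      observer(value);
    }
  }
}

const temperature = new Subject();
const received = [];
temperature.subscribe(value => received.push("display: " + value));
temperature.subscribe(value => received.push("logger: " + value));
temperature.notify(21);
console.log(received); // [ 'display: 21', 'logger: 21' ]
```

The subject does not need to know what its observers do with the value, which is what makes the arrangement reusable in different contexts.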
Components:
A component is a physical and replaceable part of a system that conforms to and provides the realization of a set of interfaces. It typically represents the physical packaging of otherwise logical elements, such as classes, interfaces, and collaborations.
Software components are binary units of independent production, acquisition, and deployment that interact to form a functioning program.
A component must be compatible and interoperate with a whole range of other components.
Examples of components: “Window”, “Push Button”, “Text Editor”, etc.
Software Architecture:
Software architecture is the structure of the components of the solution. A particular software architecture decomposes a problem into smaller pieces and attempts to find a solution (component) for each piece. We can also say that an architecture defines a software system’s components, their integration and their interoperability:
∙ Integration means the pieces fit together well.
∙ Interoperation means that they work together effectively to produce an answer.
There are many software architectures. Choosing the right one can be a difficult problem in itself.
Frameworks:
A software framework is a reusable mini-architecture that provides the generic structure and behavior for a family of software abstractions, along with a context of metaphors which specifies their collaboration and use within a given domain.
Frameworks can be seen as an intermediate level between components and a software architecture.
Example: Suppose the architecture of a WBT system reuses such components as “Text Editing Input object” and “Push buttons”. A software framework may define an “HTML Editor” which can be further reused for building the architecture.
Software Engineering and Software Paradigms
The term “software engineering” was coined in about 1969 to mean “the establishment and use of sound engineering principles in order to economically obtain software that is reliable and works efficiently on real machines”.
This view opposed the uniqueness and “magic” of programming in an effort to move the development of software from “magic” (which only a select few can do) to “art” (which the talented can do) to “science” (which supposedly anyone can do!). There have been numerous definitions given for software engineering (including that above and below).
Software Engineering is not a discipline; it is an aspiration, as yet unachieved. Many approaches have been proposed including reusable components, formal methods, structured methods and architectural studies. These approaches chiefly emphasize the engineering product: the solution rather than the problem it solves.
Software Development current situation:
∙ People developing systems were consistently wrong in their estimates of time, effort, and costs
∙ Reliability and maintainability were difficult to achieve
∙ Delivered systems frequently did not work
e.g., a 1979 study of a small number of government projects showed that:
* 2% worked
* 3% could work after some corrections
* 45% delivered but never successfully used
* 20% used but extensively reworked or abandoned
* 30% paid and undelivered
∙ Fixing bugs in delivered software produced more bugs
∙ Increase in size of software systems
* NASA
* Star Wars Defense Initiative
* Social Security Administration
* financial transaction systems
∙ Changes in the ratio of hardware to software costs
* early 60’s – 80% hardware costs
* middle 60’s – 40-50% software costs
* today – less than 20% hardware costs
∙ Increasingly important role of maintenance
* Fixing errors, modification, adding options
* Cost is often twice that of developing the software
∙ Advances in hardware (lower costs)
∙ Advances in software techniques (e.g., user interaction)
∙ Increased demands for software
* Medicine, Manufacturing, Entertainment, Publishing
∙ Demand for larger and more complex software systems
* Airplanes (crashes), NASA (aborted space shuttle launches), “ghost” trains, runaway missiles
* ATM machines (have you had your card “swallowed”?), life-support systems, car systems, etc.
∙ US National security and day-to-day operations are highly dependent on computerized systems.
Manufacturing software can be characterized by a series of steps ranging from concept exploration to final retirement; this series of steps is generally referred to as a software lifecycle.
Steps or phases in a software lifecycle fall generally into these categories:
∙ Requirements (Relative Cost 2%)
∙ Specification (analysis) (Relative Cost 5%)
∙ Design (Relative Cost 6%)
∙ Implementation (Relative Cost 5%)
∙ Testing (Relative Cost 7%)
∙ Integration (Relative Cost 8%)
∙ Maintenance (Relative Cost 67%)
∙ Retirement
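The relative costs listed above can be checked with a little arithmetic. The sketch below simply sums the percentages from the list and compares maintenance with everything that comes before it; the variable names are chosen for this illustration only.

```javascript
// Relative costs (in percent) copied from the list above.
const relativeCost = {
  requirements: 2,
  specification: 5,
  design: 6,
  implementation: 5,
  testing: 7,
  integration: 8,
  maintenance: 67
};

const totalCost = Object.values(relativeCost).reduce((sum, cost) => sum + cost, 0);
const developmentCost = totalCost - relativeCost.maintenance;
const maintenanceRatio = relativeCost.maintenance / developmentCost;

console.log(totalCost);                   // 100
console.log(developmentCost);             // 33
console.log(maintenanceRatio.toFixed(2)); // "2.03"
```

With these figures, maintenance costs roughly twice as much as all development phases together.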
Software engineering employs a variety of methods, tools, and paradigms.
Paradigms refer to particular approaches or philosophies for designing, building and maintaining software. Each paradigm has its own advantages and disadvantages, which make it more appropriate than another in a given situation.
A method (also referred to as a technique) depends heavily on the selected paradigm and may be seen as a procedure for producing some result. Methods generally involve some formal notation and process(es).
Tools are automated systems implementing a particular method.
Thus, the following phases are heavily affected by the selected software paradigm:
∙ Design
∙ Implementation
∙ Integration
∙ Maintenance
The software development cycle involves the activities in the production of a software system. Generally the software development cycle can be divided into the following phases:
∙ Requirements analysis and specification
∙ Design
* Preliminary design
* Detailed design
∙ Implementation
* Component Implementation
* Component Integration
* System Documenting
∙ Testing
* Unit testing
* Integration testing
* System testing
∙ Installation and Acceptance Testing
∙ Maintenance
* Bug Reporting and Fixing
* Change requirements and software upgrading
Software life cycles that will be briefly reviewed include:
∙ Build and Fix model
∙ Waterfall and Modified Waterfall models
∙ Rapid Prototyping
∙ Boehm’s spiral model
Build and Fix model
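In the build-and-fix model, the product is constructed without a specification or a design and is then reworked as many times as necessary to satisfy the client. This works OK for small, simple systems, but is completely unsatisfactory for software systems of any size. It has been shown empirically that the cost of changing a software product is relatively small if the change is made at the requirements or design phases but grows large at later phases.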
The cost of this process model is actually far greater than the cost of a properly specified and designed project. Maintenance can also be problematic in a software system developed under this scenario.
Figure: Build and Fix model
Waterfall and Modified Waterfall models
Waterfall Model
The waterfall model offered a means of making the development process more structured. It expresses the interaction between successive phases.
Figure: Waterfall model
Each phase cascades into the next phase. In the original waterfall model, strict sequentiality was at least implied. This meant that one phase had to be completed before the next phase was begun.
The model did not provide for feedback between phases or for updating/re-definition of earlier phases. It implies that there are definite breaks between phases, i.e., that each phase has a strict, non-overlapping start and finish and is carried out sequentially.
The critical point is that no phase is complete until the documentation and/or other products associated with that phase are completed.
Modified Waterfall Model
The waterfall model needed to provide for overlap and feedback between phases: rather than being a simple linear model, it needed to be an iterative model. To facilitate the completion of the goals, milestones, and tasks, it is normal to freeze parts of the development after a certain point in the iteration. Verification and validation are added. Verification checks that the system is correct (building the system right). Validation checks that the system meets the users’ needs (building the right system).
Figure: Modified Waterfall model
The waterfall model and the modified waterfall model are inflexible in the partitioning of the project into distinct phases. However, they generally reflect engineering practice.
Considerable emphasis must be placed on discerning users’ needs and requirements prior to the system being built. The identification of users’ requirements as early as possible, and the agreement between user and developer with respect to those requirements, often is the deciding factor in the success or failure of a software project. These requirements are documented in the requirements specification, which is used to verify whether subsequent phases are complying with the requirements. Unfortunately specifying users’ requirements is very much an art, and as such is extremely difficult. Validation feedback can be used to prevent the appearance of a strong divergence between the system under development and the users’ expectations for the delivered system.
Unfortunately, the waterfall life cycle and the modified waterfall life cycle are inadequate for realistic validation activities. They are exclusively document-driven models. In practice, only 50% of the design effort occurs during the actual design phase, with 1/3 of the design effort occurring during the coding activity! On top of that, over 16% of the design effort occurs after the system is supposed to be completed! In general the behavior of many individuals in this type of process is opportunistic. The boundaries of phases are indiscriminately crossed, with deadlines being somewhat arbitrary.
Rapid Prototyping
Prototyping, also referred to as evolutionary development, aims to enhance the accuracy of the designer’s perception of the user’s requirements. Prototyping is based on the idea of developing an initial implementation for user feedback, and then refining this prototype through many versions until a satisfactory system emerges. The specification, development and validation activities are carried out concurrently with rapid feedback across the activities. Generally, prototyping is characterized by the use of very high-level languages, which probably will not be used in the final software implementation but which allow rapid development, and by the development of a system that offers less functionality and weaker quality attributes such as robustness, speed, etc.
Figure: Rapid Prototyping model
Prototyping allows the clarification of users’ requirements, particularly through the early development of the user interface. The user can then try out the system, albeit only a subsystem of what will be the final product. This allows the user to provide feedback before a large investment has been made in the development of the wrong system.
There are two types of prototypes:
∙ Exploratory programming: Objective is to work with the user to explore their requirements and deliver a final system. Starts with the parts of the system which are understood, and then evolves as the user proposes new features.
∙ Throw-away prototyping: Objective is to understand the users’ requirements and develop a better requirements definition for the system. Concentrates on poorly understood components.
Boehm’s Spiral Model
An improved software life cycle model is needed, one which can subsume all the generic models discussed so far. It must also satisfy the requirements of management.
Boehm proposed a spiral model where each round of the spiral:
∙ a) identifies the sub-problem which has the highest risk associated with it
∙ b) finds a solution for that problem.
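The two steps of a spiral round can be sketched as a loop that always picks the riskiest open sub-problem first. This is only a toy model: the sub-problem names and risk scores are invented, and "finding a solution" is reduced to recording the problem and removing it from the list.

```javascript
// Hypothetical sub-problems with made-up risk scores (higher = riskier).
const openProblems = [
  { name: "user interface", risk: 3 },
  { name: "database performance", risk: 8 },
  { name: "report layout", risk: 1 }
];

const resolvedOrder = [];
while (openProblems.length > 0) {
  // a) identify the sub-problem with the highest risk
  let riskiest = 0;
  for (let i = 1; i < openProblems.length; i++) {
    if (openProblems[i].risk > openProblems[riskiest].risk) {
      riskiest = i;
    }
  }
  // b) "solve" it: here we only record it and remove it from the list
  const [problem] = openProblems.splice(riskiest, 1);
  resolvedOrder.push(problem.name);
}

console.log(resolvedOrder);
// [ 'database performance', 'user interface', 'report layout' ]
```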
Imperative (Procedural) Programming Paradigm
Any imperative program consists of
∙ Declarative statements, which give a name to a value. A named value is called a variable. Thus, declarative statements create variables. In procedural languages it is common for the same variable to keep changing value as the program runs.
∙ Imperative statements, which assign new values to variables
∙ Program flow control statements, which define the order in which imperative statements are evaluated.
Example:
var factorial = 1;   /* Declarative statement */
var argument = 5;
var counter = 1;
while (counter <= argument)   /* Program flow statement */
{
    factorial = factorial * counter;   /* Imperative statement */
    counter++;
}
Variables and Types
Different variables in a program may have different types. For example, a language may treat the same two bytes either as a string of characters or as a number. Dividing the string ‘20’ by the number 2 may not be possible. A language like this has at least two types – one for strings and one for numbers.
Example:
var PersonName = new String();     /* variable of type “string” */
var PersonSalary = new Number();   /* variable of type “number” */
Types can be weak or strong. Strong type means that at any point in the program, when it is running, the type of a particular chunk of data (i.e. variable) is known. Weak type means that imperative operators may change a variable type.
Example:
var PersonName;        /* variable of a weak type */
PersonName = 0;        /* PersonName is an “integer” */
PersonName = 'Nick';   /* PersonName is a “string” */
Obviously, languages supporting weak variable types need sophisticated rules for type conversions.
Example:
var PersonName;                          /* variable of a weak type */
PersonName = 0;                          /* PersonName is an “integer” */
PersonName = PersonName + 'Nick' + 0;    /* PersonName is a string “0Nick0” */
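The conversion rules can be observed directly in JavaScript, which is used here only as one example of a weakly typed language; other languages apply different rules. The variable name `personName` mirrors the example above.

```javascript
let personName;                 // no type is fixed by the declaration
console.log(typeof personName); // "undefined"

personName = 0;
console.log(typeof personName); // "number"

personName = personName + "Nick" + 0; // the numbers are converted to strings
console.log(personName);        // "0Nick0"
console.log(typeof personName); // "string"

// The conversion rule depends on the operator:
console.log("20" / 2); // 10    ("/" converts the string to a number)
console.log("20" + 2); // "202" ("+" converts the number to a string)
```

Note that in this particular language dividing the string "20" by 2 is allowed, because the string is silently converted to a number first.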
To support weak typing, values are boxed together with information about their type – value and type are then passed around the program together.
Fig . show how to support weak typing
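A minimal sketch of boxing, assuming a hypothetical `box` helper and a hypothetical `add` operation that inspects the type tags at run time. Real language implementations do this internally and far more efficiently; the code only illustrates the idea of passing value and type around together.

```javascript
// A boxed value carries its type tag with it.
function box(value) {
  return { type: typeof value, value: value };
}

// "add" inspects the tags at run time to decide what the operation means.
function add(a, b) {
  if (a.type === "number" && b.type === "number") {
    return box(a.value + b.value);               // numeric addition
  }
  return box(String(a.value) + String(b.value)); // otherwise concatenate
}

const boxedSum = add(box(2), box(3));
const boxedText = add(box(0), box("Nick"));
console.log(boxedSum);  // { type: 'number', value: 5 }
console.log(boxedText); // { type: 'string', value: '0Nick' }
```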
Functions (Procedures)
Programmers have long dreamed of, and attempted, building systems from a library of reusable software components bound together with a little new code.
The Imperative (Procedural) Programming Paradigm is essentially based on the concept of so-called “Functions”, also known as “Modules”, “Procedures” or “Subroutines”.
A function is a section of code that is parceled off from the main program and hidden behind an interface:
function factorial(parameter)
{
    var i = 1;
    var result = 1;
    while (i <= parameter)
    {
        result = result * i;
        i++;
    }
    return(result);
}
∙ The code within the function performs a particular activity, here generating a factorial value
∙ The idea of parceling the code off into a subroutine is to provide a single point of entry. Anyone wanting a new factorial value has only to call the “factorial” function with the appropriate parameters.
Here’s what the conventional application based on the Imperative (Procedural) Programming Paradigm looks like:
∙ Main procedure determines the control flow for the application
∙ Functions are called to perform certain tasks or specific logic
∙ The main and sub procedures that comprise the implementation are structured as a hierarchy of tasks.
∙ The source for the implementation is compiled and linked with any additional executable modules to produce the application
Data Exchange between Functions (Procedures)
When the functionality of a software system is decomposed into a number of functional modules, data exchange/flow becomes a key issue. The Imperative (Procedural) Programming Paradigm extends the concept of variables so that they can be used as the data exchange mechanism.
Thus, each procedure may have a number of special variables called parameters. The parameters are just named place-holders which will be replaced with particular values (or references to existing values) of arguments when the procedure is called.
Example:
function main()
{
    var argument = 25;
    var result = factorial(argument);
    /* Note, the call replaces the “parameter” place holder with the current value of the variable “argument” */
}
function factorial(parameter)
{
    var i = 1;
    var result = 1;
    while (i <= parameter)
    {
        result = result * i;
        i++;
    }
    return(result);
}
Passing arguments to a function
There are two different techniques for such replacement, known as passing an argument by value and passing an argument by reference.
In case of passing a value, a current argument value is duplicated as a value for new parameter variable dynamically created for the procedure. In this case, variables used as arguments for calling sub-routines cannot be modified by imperative operators inside of the sub-routines.
In case of passing a reference, the sub-routine gets control (i.e. reference) to a current value of the argument variable. In this case, variables used as arguments for calling sub-routines can be modified by imperative operators inside of the sub-routines.
Thus, types of variables defined as parameters of a function should be equivalent to (or at least compatible with) types of variables (constants) used as arguments.
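The difference between the two techniques can be observed in JavaScript with the caveat that the language chooses the technique itself: numbers are copied, while for objects the function receives a reference to the caller's object (reassigning the parameter itself would still not affect the caller). The function names below are made up for this sketch.

```javascript
// A number is passed by value: the function works on its own copy.
function incrementCopy(parameter) {
  parameter = parameter + 1;
  return parameter;
}

// An object is passed as a reference: the function sees the caller's object.
function incrementInPlace(counterObject) {
  counterObject.value = counterObject.value + 1;
}

let argument = 25;
incrementCopy(argument);
console.log(argument);      // 25, the caller's variable is unchanged

const counter = { value: 25 };
incrementInPlace(counter);
console.log(counter.value); // 26, the caller's object was modified
```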
Polymorphic Languages
When strong static typing is enforced it can be difficult to write generic algorithms – functions that can act on a range of different types. Polymorphism allows “any” to be included in the type system. For example, the types of a list of items are unimportant if we only want to know the length of the list, so a function can have a type that indicates that it takes lists of “any” type and returns an integer. Moreover, polymorphism allows combining functions implemented by means of different programming languages supporting potentially different types of variables.
Pragmatically speaking, polymorphic languages allow new types to be defined together with hidden functions, which are automatically applied to values of such a “user-defined type” to convert them to values of a “standard” language type.
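Both ideas can be sketched in JavaScript, with the caveat that JavaScript is dynamically typed, so "any" is implicit rather than declared. `listLength` and the `Euro` class are hypothetical names; `valueOf` is the standard hook JavaScript calls when an object is used where a primitive value is expected.

```javascript
// Works for a list of "any" element type and always returns a number.
function listLength(list) {
  let count = 0;
  for (const item of list) {
    count++;
  }
  return count;
}

console.log(listLength([1, 2, 3]));  // 3
console.log(listLength(["a", "b"])); // 2

// A user-defined type with a hidden conversion to a standard number.
class Euro {
  constructor(cents) {
    this.cents = cents;
  }
  valueOf() {
    return this.cents / 100; // called automatically in arithmetic
  }
}

console.log(new Euro(250) + new Euro(150)); // 4
```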
Variable Scope
Normally, variables that are defined within a function are created each time the function is used and destroyed again when the function ends. The value that the function returns is not destroyed, but it is not possible to assign a value to the variable inside the function definition from outside.
Example:
function one()
{
    var dynamicLocalVariable = 25;
    two();
    /* at this point just one variable “dynamicLocalVariable” exists */
    alert(dynamicLocalVariable);
    /* this operator displays the current value “25” */
}
function two()
{
    var dynamicLocalVariable = 55;
    /* at this point two variables “dynamicLocalVariable” exist */
    alert(dynamicLocalVariable);
    /* this operator displays the current value “55” */
}
Such variables are called dynamic local variables. There may also be so-called static local variables. Static local variables defined within a function are created only once, when the function is used for the first time. The value of such a variable is not destroyed and can be reused when the function is called again.
Example:
function one()
{
    var x = two();
    alert(x);
    /* this operator displays the current value “10” */
    x = two();
    alert(x);
    /* this operator displays the current value “20” */
}
function two()
{
    var static staticLocalVariable = 0;   /* “static” is pseudocode here, not a JavaScript keyword for local variables */
    staticLocalVariable = staticLocalVariable + 10;
    return(staticLocalVariable);
}
Note that function “two” returns different values for one and the same set of arguments. Such functions are called reactive functions. Generally, testing and maintenance of projects having many reactive functions becomes a very difficult task. For practical reasons many software projects do use some static data.
Note, it is still not possible to assign a value to the local static variable inside a function from outside.
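The `static` marker in the example above is pseudocode; JavaScript has no static local variables. One common way to get the same behavior is a closure, sketched below. The helper name `makeTwo` is an assumption made for this illustration.

```javascript
// makeTwo runs once; the variable it declares survives between calls of "two".
function makeTwo() {
  let staticLocalVariable = 0; // created only once
  return function two() {
    staticLocalVariable = staticLocalVariable + 10;
    return staticLocalVariable;
  };
}

const two = makeTwo();
console.log(two()); // 10
console.log(two()); // 20
```

As in the pseudocode, the variable cannot be assigned from outside the function; only calls to `two` can change it.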
There may also be so-called static global variables. Static global variables, which may be defined within any function, are created only once, when the whole software system is initiated. The value of such a variable is never destroyed and can be reused by imperative operators inside any function.
Example:
function one()
{
    var global globalLocalVariable = 0;   /* “global” is pseudocode here, not a JavaScript keyword */
    two();
    alert(globalLocalVariable);
    /* this operator displays the current value “10” */
    two();
    alert(globalLocalVariable);
    /* this operator displays the current value “20” */
}
function two()
{
    globalLocalVariable = globalLocalVariable + 10;
}
Here, the function “two” also demonstrates a “reactive” behavior. Maintaining and testing of projects heavily based on global variables becomes even more difficult than in case of local static variables. Nevertheless, for practical reasons many software development paradigms do use such global static variables.
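To make the testing problem concrete, the sketch below contrasts a reactive function that relies on a global variable with an alternative that receives the state as a parameter. The names `globalCounter`, `addTenReactive` and `addTen` are invented for this illustration.

```javascript
let globalCounter = 0; // global state shared by every function

// Reactive: the result depends on how often the function was called before.
function addTenReactive() {
  globalCounter = globalCounter + 10;
  return globalCounter;
}

// Alternative: the state is passed in and the new state is returned.
function addTen(current) {
  return current + 10;
}

console.log(addTenReactive()); // 10
console.log(addTenReactive()); // 20, same call, different result
console.log(addTen(0));        // 10
console.log(addTen(0));        // 10, same call, same result
```

A test of `addTen` needs only its argument, whereas a test of `addTenReactive` must first know or reset the global state.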
Software Design Methodology (Procedural Paradigm)
Benefits of the Paradigm:
Re-usability: anyone that needs a particular functionality can use an appropriate module, without having to code the algorithm from scratch.
Specialization: one person can concentrate on writing a best possible module (function) for a particular task while others look after other areas.
Upgradability: if a programmer comes up with a better way to implement a module, then he/she simply replaces the code within the function. Provided the interface remains the same – in other words the module name and the order and type of each parameter are unchanged – then no changes should be necessary in the rest of the application.
However procedural modules have serious limitations:
∙ For a start, there is nothing to stop another programmer from meddling with the code within a module, perhaps to better adapt it to the needs of a particular application.
∙ There is also nothing to stop the code within the function making use of global variables, thus negating the benefits of a single interface providing a single point of entry.
Obviously, the paradigm is best suited for the waterfall model of software development.
Design
A particular software system is viewed in terms of its modules and data flowing between them starting with a high-level view.
In this case, the software design methodology can be categorized as top-down modular design (functional design viewpoint).
The basic design concepts include:
∙ Modularity
* Modules are used to describe a functional decomposition of the system
* A module is a unit containing executable statements, data structures, and other modules
* A module has a name, can be separately compiled, and can be used in a program or by other modules
* System design generally determines what goes into a module
∙ Cohesive Modules
* Single clearly defined function
* Description of when and how used
∙ Loosely Coupled Modules (modules implement functionality, but not parts of other modules)
∙ Black Boxes (information hiding)
* each module is a black box
* each module has a set of known inputs and a set of predictable outputs
* inner workings of module are unknown to user
* can be reusable
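The black-box idea can be sketched with a small JavaScript module whose data structure is hidden behind three operations. The name `createStack` and the choice of a stack are assumptions made for illustration only.

```javascript
// A module as a black box: only "push", "pop" and "size" are visible.
function createStack() {
  const items = []; // inner workings, hidden from the user of the module

  return {
    push: function (value) { items.push(value); },
    pop: function () { return items.pop(); },
    size: function () { return items.length; }
  };
}

const stack = createStack();
stack.push("a");
stack.push("b");
console.log(stack.pop());  // "b"
console.log(stack.size()); // 1
console.log(stack.items);  // undefined, the data structure cannot be reached
```

Because callers see only the known inputs and predictable outputs, the array inside could be replaced by another data structure without changing any calling code.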
Preliminary and Detailed Design specify the modules to carry out the functions in the Data Flow Diagrams (DFD).
Preliminary design deals mainly with Structure Charts:
Hierarchical tree structure
∙ Modules – rectangle boxes
∙ calling relationships are shown with arrows
∙ arrows are labeled with the data flowing between modules
Module Design
∙ Title
∙ Module ID – from structure charts
∙ Purpose
∙ Method – algorithm
∙ Usage – who calls it
∙ External references – other modules called
∙ Calling sequence – parameter descriptions
∙ Input assertion
∙ Output assertion
∙ Local variables
∙ Author(s)
∙ Remarks
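The items above could be recorded as a header comment on the module itself, with the input and output assertions also checked in code. The sketch below applies this to the factorial function; the module ID, the wording of the header and the error messages are invented for illustration.

```javascript
/*
 * Title:               Factorial
 * Module ID:           M-01 (hypothetical)
 * Purpose:             Compute n! for a small non-negative integer n
 * Method:              Repeated multiplication in a loop
 * Usage:               Called by main
 * External references: None
 * Calling sequence:    factorial(parameter), parameter is a number
 * Input assertion:     parameter is an integer and parameter >= 0
 * Output assertion:    result is an integer and result >= 1
 * Local variables:     i, result
 * Author(s):           (not specified)
 * Remarks:             Illustration only
 */
function factorial(parameter) {
  // Input assertion
  if (!Number.isInteger(parameter) || parameter < 0) {
    throw new RangeError("parameter must be a non-negative integer");
  }
  let result = 1;
  for (let i = 1; i <= parameter; i++) {
    result = result * i;
  }
  // Output assertion
  if (!Number.isInteger(result) || result < 1) {
    throw new Error("output assertion violated");
  }
  return result;
}

console.log(factorial(5)); // 120
```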
Preliminary Design Document
∙ Cover Page
∙ Table of Contents
∙ Design Description
∙ Software Structure Charts
∙ Data Dictionary
∙ Module Designs
∙ Module Headers
∙ Major Data Structures Design
∙ Design Reviews (examination of all or part of the software design to find design anomalies)
Overview of Detailed Design
∙ select an algorithm for each module
∙ refine the data structures
∙ produce detailed design document
Implementation
Coding (for each Module)
∙ Source Code
∙ Documentation
Integration
∙ Decide in what order the modules will be assembled
∙ Assemble and test integration of modules
∙ After final assembly perform system test
∙ Note, coding and testing are often done in parallel
Testing
Types of testing
∙ Unit testing
∙ Integration testing
∙ Acceptance testing
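The difference between unit and integration testing can be sketched with two tiny modules and a hand-written check helper. Everything here (`square`, `sumOfSquares`, `check`) is hypothetical; a real project would normally use a testing framework, and acceptance testing involves the user and is not shown.

```javascript
// Two small modules (hypothetical, for illustration).
function square(x) {
  return x * x;
}
function sumOfSquares(values) {
  return values.reduce((sum, value) => sum + square(value), 0);
}

// A tiny check helper that records results instead of stopping at the first failure.
const results = [];
function check(name, actual, expected) {
  results.push({ name: name, passed: actual === expected });
}

// Unit tests: one module is exercised on its own.
check("square of 3", square(3), 9);
check("square of -2", square(-2), 4);

// Integration test: the modules are exercised together.
check("sum of squares", sumOfSquares([1, 2, 3]), 14);

console.log(results.every(r => r.passed)); // true
```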
As mentioned above, the paradigm is best suited for the waterfall model of software development. Implementing change requirements and especially rapid prototyping are weak points of the programming paradigm.