A PROJECT REPORT ON LEXICAL ANALYZER SESSION: 2008 - 2009
Guided By:
Submitted By:
Ms. Deepti Arora
Ankita Verma (B.E. VIIth sem I.T.)
Submitted to: Department of Information Technology Laxmi Devi Institute of Engineering & Technology, Alwar (Raj.) University of Rajasthan
INDEX
1. Project description
   1.1 Objective of the project
2. Project contents
   2.1 Software development life cycle
   2.2 Study and formulation
   2.3 Project category
   2.4 Platform (technology/tools)
   2.5 Software and hardware used
   2.6 Feasibility
   2.7 System design
   2.8 Data flow diagram
   2.9 Entity relationship diagram
   2.10 Testing
        2.10.1 Testing methodology
        2.10.2 Testing strategy
   2.11 System security
   2.12 Implementation and maintenance
   2.13 Evaluation
   2.14 Code
3. Advantages and disadvantages
4. Conclusion
5. References
ACKNOWLEDGEMENT We thank Mr. Sudhir Pathak, Head of Department, Department of Computer Science and Information Technology, LIET Alwar (Raj.), for his guidance and co-operation. We also acknowledge the advice and help given to us by Ms. Deepti Arora. We would like to extend our gratitude to the entire faculty and staff of the Department of CS & IT, LIET Alwar (Raj.), who stood by us through all the ups and downs we faced during the development phase of this project.
CERTIFICATE OF APPROVAL SESSION: 2008-2009
This is to certify that Ms. Ankita Verma, Ms. Harshi Yadav, and Ms. Sameeksha Chauhan have successfully completed the project entitled
“LEXICAL ANALYZER” under the able guidance of Ms. Deepti Arora towards the partial fulfillment of the Bachelor’s degree course in Information Technology.
Head of Department
Guided By:
Mr. Sudhir Pathak
Ms. Deepti Arora
Preface The lexical analyzer is responsible for scanning the source input file and translating lexemes into small objects that the compiler can easily process. These small values are often called “tokens”. The lexical analyzer is also responsible for converting sequences of digits into their numeric form as well as processing other literal constants, for removing comments and whitespace from the source file, and for taking care of many other mechanical details. The lexical analyzer reads a string of characters and checks whether it forms a valid token in the grammar. Lexical analysis terminology:
• Token: a terminal symbol in a grammar; a class of sequences of characters with a collective meaning, such as constants, operators, punctuation, and keywords.
• Lexeme: the character sequence matched by an instance of the token.
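For example, given the source fragment count = count + 10; a typical C-like lexical analyzer would report the lexemes count, =, count, +, 10 and ; as the token stream identifier, assignment operator, identifier, arithmetic operator, number, punctuation; the two occurrences of count are separate lexemes belonging to the same token class.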
Project Description
Lexical analyzer converts stream of input characters into a stream of tokens. The different tokens that our lexical analyzer identifies are as follows: KEYWORDS: int, char, float, double, if, for, while, else, switch, struct, printf, scanf, case, break, return, typedef, void IDENTIFIERS: main, fopen, getch etc NUMBERS: positive and negative integers, positive and negative floating point numbers. OPERATORS: +, ++, -, --, ||, *, ?, /, >, >=, <, <=, =, ==, &, &&. BRACKETS: [ ], { }, ( ). STRINGS: Set of characters enclosed within the quotes COMMENT LINES: Ignores single line, multi line comments For tokenizing into identifiers and keywords we incorporate a symbol table which initially consists of predefined keywords. The tokens are read from an input file. If the encountered token is an identifier or a keyword the lexical analyzer will look up in the symbol table to check the existence of the respective token. If an entry does exist then we proceed to the next token. If not then that particular token along with the token value is written into the symbol table. The rest of the tokens are directly displayed by writing into an output file. The output file will consist of all the tokens present in our input file along with their respective token values.
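A minimal sketch in C of the keyword/identifier handling described above is given below; the names symtab, lookup and install, the table size and the token-value scheme are illustrative assumptions, not the project's actual code.

/* Sketch: classifying a word as a keyword or identifier via a symbol table.
   The table is pre-loaded with a few keywords; unknown words are installed. */
#include <stdio.h>
#include <string.h>

#define MAX_SYMBOLS 100

struct symbol { char name[32]; int token_value; };

static struct symbol symtab[MAX_SYMBOLS] = {
    {"int", 1}, {"char", 2}, {"float", 3}, {"if", 4}, {"while", 5}
};
static int sym_count = 5;              /* table initially holds predefined keywords */

int lookup(const char *name)           /* index of name in the table, or -1 */
{
    int i;
    for (i = 0; i < sym_count; i++)
        if (strcmp(symtab[i].name, name) == 0)
            return i;
    return -1;
}

int install(const char *name)          /* add a new identifier, return its index */
{
    strcpy(symtab[sym_count].name, name);
    symtab[sym_count].token_value = 100 + sym_count;   /* arbitrary numbering scheme */
    return sym_count++;
}

int main(void)
{
    const char *words[] = { "int", "main", "while", "count" };
    int i;
    for (i = 0; i < 4; i++) {
        int pos = lookup(words[i]);
        if (pos < 0)
            pos = install(words[i]);               /* first occurrence: install it */
        printf("%s -> token value %d\n", words[i], symtab[pos].token_value);
    }
    return 0;
}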
SYSTEM DESIGN: Process: The lexical analyzer is the first phase of a compiler. Its main task is to read the input characters and produce as output a sequence of tokens that the parser uses for syntax analysis. This interaction is summarized schematically in fig. a.
Upon receiving a “get next token” command from the parser, the lexical analyzer reads the input characters until it can identify the next token. Sometimes, lexical analyzers are divided into a cascade of two phases, the first called “scanning” and the second “lexical analysis”. The scanner is responsible for doing simple tasks, while the lexical analyzer proper does the more complex operations. The lexical analyzer which we have designed takes its input from an input file. It reads one character at a time from the input file, and continues to read until the end of the file is reached. It recognizes the valid identifiers and keywords and specifies the token values of the keywords. It also identifies the header files, #define statements, numbers, special characters, and various relational and logical operators, and it ignores the white spaces and comments. It prints the output in a separate file, specifying the line number.
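The character-by-character reading loop can be outlined roughly as follows; the token codes and the get_next_token interface are hypothetical names used only to illustrate the process, not the project's actual routines.

/* Sketch of a scanner loop: skip whitespace, then consume one lexeme. */
#include <stdio.h>
#include <ctype.h>

enum { TOK_EOF, TOK_NUMBER, TOK_IDENTIFIER, TOK_OTHER };

int get_next_token(FILE *in)
{
    int c = fgetc(in);

    while (c != EOF && isspace(c))        /* ignore white space */
        c = fgetc(in);

    if (c == EOF)
        return TOK_EOF;

    if (isdigit(c)) {                     /* a number: read while digits last */
        while (isdigit(c = fgetc(in)))
            ;
        ungetc(c, in);                    /* push back the look-ahead character */
        return TOK_NUMBER;
    }
    if (isalpha(c) || c == '_') {         /* an identifier or keyword */
        while (isalnum(c = fgetc(in)) || c == '_')
            ;
        ungetc(c, in);
        return TOK_IDENTIFIER;
    }
    return TOK_OTHER;                     /* operators, brackets, etc. (simplified) */
}

A real scanner would additionally collect the lexeme text, track the line number, and strip comments, as described above.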
BLOCK DIAGRAM:
OBJECTIVE OF THE PROJECT
AIM OF THE PROJECT
The aim of the project is to develop a lexical analyzer that can generate tokens for further processing by the compiler.
PURPOSE OF THE PROJECT
The lexical features of a language can be specified using a type-3 grammar. The job of the lexical analyzer is to read the source program one character at a time and produce as output a stream of tokens. The tokens produced by the lexical analyzer serve as input to the next phase, the parser. Thus, the lexical analyzer’s job is to translate the source program into a form more conducive to recognition by the parser.
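For example, the token class identifier can be described by the regular expression letter (letter | digit)*, which is exactly the kind of pattern a type-3 (regular) grammar can express.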
GOALS
To create tokens from the given input stream.
SCOPE OF PROJECT
The lexical analyzer converts the input program into a stream of the valid words of the language, known as tokens.
The parser looks into the sequence of these tokens and identifies the language constructs occurring in the input program. The parser and the lexical analyzer work hand in hand: whenever the parser needs further tokens to proceed, it requests the lexical analyzer, and the lexical analyzer in turn scans the remaining input stream and returns the next token occurring there. Apart from that, the lexical analyzer also participates in the creation and maintenance of the symbol table, because it is the first module to identify the occurrence of a symbol. If a symbol is being defined for the first time, it needs to be installed into the symbol table, and the lexical analyzer is usually the module that does so.
PROJECT CONTENTS
SOFTWARE DEVELOPMENT LIFE CYCLE
Systems Development Life Cycle (SDLC), or Software Development Life Cycle, in systems engineering and software engineering, refers to the process of developing systems, and to the models and methodologies that people use to develop these systems, generally computer or information systems. In software engineering, this SDLC concept is developed into all kinds of software development methodologies: the frameworks that are used to structure, plan, and control the process of developing an information system, i.e. the software development process.
Overview Systems Development Life Cycle (SDLC) is any logical process used by a systems analyst to develop an information system, including requirements, validation, training, and user ownership. An SDLC should result in a high quality system that meets or exceeds customer expectations, within time and cost estimates, works effectively and efficiently in the current and planned Information Technology infrastructure, and is cheap to maintain and cost-effective to enhance.[2] Computer systems have become more complex and usually (especially with the advent of Service-Oriented Architecture) link multiple traditional systems often supplied by different software vendors. To manage this, a number of system development life cycle (SDLC) models have been created: waterfall, fountain, spiral, build and fix, rapid prototyping, incremental, and synchronize and stabilize. Although in the academic sense, SDLC can be used to refer to various models, SDLC is typically used to refer to a waterfall methodology. In project management a project has both a life cycle and a "systems development life cycle" during which a number of typical activities occur.
The project life cycle (PLC) encompasses all the activities of the project, while the systems development life cycle (SDLC) is focused on accomplishing the product requirements.
Systems Development Phases The Systems Development Life Cycle (SDLC) adheres to important phases that are essential for developers, such as planning, analysis, design, and implementation, which are explained in the section below. There are several Systems Development Life Cycle models in existence. The oldest model, which was originally regarded as "the Systems Development Life Cycle", is the waterfall model: a sequence of stages in which the output of each stage becomes the input for the next. These stages generally follow the same basic steps, but many different waterfall methodologies give the steps different names, and the number of steps seems to vary between 4 and 7. There is no definitively correct Systems Development Life Cycle model, but the steps can be characterized and divided into several phases.
Phases
Initiation Phase The Initiation Phase begins when a business sponsor identifies a need or an opportunity. The purpose of the Initiation Phase is to: • Identify and validate an opportunity to improve business accomplishments of the organization or a deficiency related to a business need. • Identify significant assumptions and constraints on solutions to that need. • Recommend the exploration of alternative concepts and methods to satisfy the need including questioning the need for technology, i.e., will a change in the business process offer a solution? • Assure executive business and executive technical sponsorship.
System Concept Development Phase The System Concept Development Phase begins after a business need or opportunity is validated by the Agency/Organization Program Leadership and the Agency/Organization CIO. The purpose of the System Concept Development Phase is to:
• Determine the feasibility and appropriateness of the alternatives.
• Identify system interfaces.
• Identify basic functional and data requirements to satisfy the business need.
• Establish system boundaries; identify goals, objectives, critical success factors, and performance measures.
• Evaluate costs and benefits of alternative approaches to satisfy the basic functional requirements.
• Assess project risks.
• Identify and initiate risk mitigation actions.
• Develop high-level technical architecture, process models, data models, and a concept of operations.
Planning Phase During this phase, a plan is developed that documents the approach to be used and includes a discussion of methods, tools, tasks, resources, project schedules, and user input. Personnel assignments, costs, project schedule, and target dates are established. A Project Management Plan is created with components related to acquisition planning, configuration management planning, quality assurance planning, concept of operations, system security, verification and validation, and systems engineering management planning.
Requirements Analysis Phase This phase formally defines the detailed functional user requirements using high-level requirements identified in the Initiation, System Concept, and Planning phases. It also delineates the requirements in terms of data, system performance, security, and maintainability requirements for the system. The requirements are defined in this phase to a level of detail sufficient for systems design to proceed. They need to be measurable, testable, and relate to the business need or opportunity identified in the Initiation Phase. The requirements that will be used to determine acceptance of the system are captured in the Test and Evaluation Master Plan.
The purposes of this phase are to:
• Further define and refine the functional and data requirements and document them in the Requirements Document.
• Complete business process reengineering of the functions to be supported (i.e., verify what information drives the business process, what information is generated, who generates it, where the information goes, and who processes it).
• Develop detailed data and process models (system inputs, outputs, and processes).
• Develop the test and evaluation requirements that will be used to determine acceptable system performance.
Design Phase During this phase, the system is designed to satisfy the functional requirements identified in the previous phase. Since problems in the design phase could be very expensive to solve in a later stage of the software development, a variety of elements are considered in the design to mitigate risk. These include: • Identifying potential risks and defining mitigating design features.
• Performing a security risk assessment. • Developing a conversion plan to migrate current data to the new system. • Determining the operating environment. • Defining major subsystems and their inputs and outputs. • Allocating processes to resources.
• Preparing detailed logic specifications for each software module.
Development Phase Effective completion of the previous stages is a key factor in the success of the Development phase. The Development phase consists of: • Translating the detailed requirements and design into system components. • Testing individual elements (units) for usability. • Preparing for integration and testing of the IT system.
Integration and Test Phase Subsystem integration, system, security, and user acceptance testing is conducted during the integration and test phase. The user, with those responsible for quality assurance, validates that the functional requirements, as defined in the functional requirements document, are satisfied by the developed or modified system. OIT Security staff assess the system security and issue a security certification and accreditation prior to installation/implementation. Multiple levels of testing are performed, including: • Testing at the development facility by the contractor, possibly supported by end users • Testing as a deployed system with end users working together with contract personnel • Operational testing by the end user alone, performing all functions.
Implementation Phase
This phase is initiated after the system has been tested and accepted by the user. In this phase, the system is installed to support the intended business functions. System performance is compared to performance objectives established during the planning phase. Implementation includes user notification, user training, installation of hardware, installation of software onto production computers, and integration of the system into daily work processes. This phase continues until the system is operating in production in accordance with the defined user requirements.
Operations and Maintenance Phase The system operation is ongoing. The system is monitored for continued performance in accordance with user requirements and needed system modifications are incorporated. Operations continue as long as the system can be effectively adapted to respond to the organization’s needs. When modifications or changes are identified, the system may reenter the planning phase. The purpose of this phase is to: • Operate, maintain, and enhance the system. • Certify that the system can process sensitive information. • Conduct periodic assessments of the system to ensure the functional requirements continue to be satisfied. • Determine when the system needs to be modernized, replaced, or retired.
Disposition Phase
Disposition activities ensure the orderly termination of the system and preserve the vital information about the system so that some or all of the information may be reactivated in the future if necessary. Particular emphasis is given to proper preservation of the data processed by the system, so that the data can be effectively migrated to another system or archived for potential future access in accordance with applicable records management regulations and policies. Each system should have an interface control document defining inputs and outputs and data exchange. Signatures should be required to verify that all dependent users and impacted systems are aware of disposition.
Summary The purpose of a Systems Development Life Cycle methodology is to provide IT Project Managers with the tools to help ensure successful
implementation of systems that satisfy Agency strategic and business objectives. The documentation provides a mechanism to ensure that executive leadership, functional managers, and users sign off on the requirements and implementation of the system. The process provides Agency managers and the Project Manager with the visibility of design, development, and implementation status needed to ensure delivery on time and within budget.
SDLC OBJECTIVES The objectives of the SDLC approach are to: • Deliver quality systems which meet or exceed customer expectations when promised and within cost estimates • Develop quality systems using an identifiable, measurable, and repeatable process. • Establish an organizational and project management structure with appropriate levels of authority to ensure that each system development project is effectively managed throughout its life cycle. • Identify and assign the roles and responsibilities of all affected parties including functional and technical managers throughout the system development life cycle. • Ensure that system development requirements are well defined and subsequently satisfied. • Provide visibility to the State of Maryland functional and technical managers for major system development resource requirements and expenditures. • Establish appropriate levels of management authority to provide timely direction, coordination, control, review, and approval of the system development project. • Ensure project management accountability. • Ensure that projects are developed within the current and planned information technology infrastructure. • Identify project risks early and manage them before they become problems.
SYSTEM STUDY & PROBLEM FORMULATION A Software Requirements Specification (SRS) is a complete description of the behavior of the software of the system to be developed. It includes a set of use cases that describe all the interactions the users will have with the software. Use cases are also known as functional requirements. In addition to use cases, the SRS also contains nonfunctional (or supplementary) requirements. Non-functional requirements are requirements which impose constraints on the design or implementation (such as performance engineering requirements, quality standards, or design constraints).
Purpose The purpose of this software requirements specification (SRS) is to establish the ten major requirements necessary to develop the software system.
PROJECT CATEGORY The category of this project is compiler design.
COMPILER To define what a compiler is one must first define what a translator is. A translator is a program that takes another program written in one language, also known as the source language, and outputs a program written in another language, known as the target language.
Now that the translator is defined, a compiler can be defined as a translator whose source language is a high-level language such as Java or Pascal and whose target language is a low-level language such as machine or assembly language.
There are five parts of compilation (or phases of the compiler):
1. Lexical Analysis
2. Syntax Analysis
3. Semantic Analysis
4. Code Optimization
5. Code Generation
Lexical Analysis is the act of taking an input source program and outputting a stream of tokens. This is done with the Scanner. The Scanner can also place identifiers into something called the symbol table or place strings into the string table. The Scanner can report trivial errors such as invalid characters in the input file.
Syntax Analysis is the act of taking the token stream from the scanner and comparing them against the rules and patterns of the specified language. Syntax Analysis is done with the Parser. The Parser produces a tree, which can come in many formats, but is referred to as the parse tree. It reports errors when the tokens do not follow the syntax of the specified language. Errors that the Parser can report are syntactical errors such as missing parenthesis, semicolons, and keywords.
Semantic Analysis is the act of determining whether or not the parse tree is relevant and meaningful. The output is intermediate code, also known as an intermediate representation (or IR). Most of the time, this IR is closely related to assembly language, but it is machine independent. Intermediate code allows different code generators for different machines and promotes abstraction and portability from specific machine types and languages. (I dare say the most famous example is Java's bytecode and the JVM.) Semantic Analysis finds more meaningful errors such as undeclared variables, type compatibility, and scope resolution.
Code Optimization makes the IR more efficient. Code optimization is usually done in a sequence of steps. Some optimizations include code hoisting (moving constant values to better places within the code), redundant code discovery, and removal of useless code.
Code Generation is the final step in the compilation process. The input to the Code Generator is the IR and the output is machine language code.
PLATFORM (TECHNOLOGY/TOOLS) In computing, C is a general-purpose computer programming language originally developed in 1972 by Dennis Ritchie at the Bell Telephone Laboratories to implement the Unix operating system. Although C was designed for writing architecturally independent system software, it is also widely used for developing application software. Worldwide, C is the first or second most popular language in terms of number of developer positions or publicly available code. It is widely used on many different software platforms, and there are few computer architectures for which a C compiler does not exist. C has greatly influenced many other popular programming languages, most notably C++, which originally began as an extension to C, and Java and C# which borrow C lexical conventions and operators.
Characteristics Like most imperative languages in the ALGOL tradition, C has facilities for structured programming and allows lexical variable scope and recursion, while a static type system prevents many unintended operations. In C, all executable code is contained within functions. Function parameters are always passed by value. Pass-by-reference is achieved in C by explicitly passing pointer values. Heterogeneous aggregate data types (struct) allow related data elements to be combined and manipulated as a unit. C program source text is free-format, using the semicolon as a statement terminator (not a delimiter). C also exhibits the following more specific characteristics:
• non-nestable function definitions
• variables may be hidden in nested blocks
• partially weak typing; for instance, characters can be used as integers
• low-level access to computer memory by converting machine addresses to typed pointers
• function and data pointers supporting ad hoc run-time polymorphism
• array indexing as a secondary notion, defined in terms of pointer arithmetic
• a preprocessor for macro definition, source code file inclusion, and conditional compilation
• complex functionality such as I/O, string manipulation, and mathematical functions consistently delegated to library routines
• a relatively small set of reserved keywords (originally 32, now 37 in C99)
• a lexical structure that resembles B more than ALGOL, for example:
• { ... } rather than ALGOL's begin ... end
• the equal-sign is for assignment (copying), much like Fortran
• two consecutive equal-signs are used to test for equality (compare to .EQ. in Fortran or the equal-sign in BASIC)
• && and || in place of ALGOL's and & or (these are semantically distinct from the bit-wise operators & and | because they will never evaluate the right operand if the result can be determined from the left alone (short-circuit evaluation))
• a large number of compound operators, such as +=, ++, ...
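As a small illustration of the pass-by-value versus pointer point above (standard C, not project code):

/* Function parameters are copies; passing a pointer simulates pass-by-reference. */
#include <stdio.h>

void change_copy(int x)    { x = 42; }     /* modifies only the local copy */
void change_caller(int *p) { *p = 42; }    /* modifies the caller's variable */

int main(void)
{
    int n = 0;
    change_copy(n);
    printf("%d\n", n);     /* still prints 0 */
    change_caller(&n);
    printf("%d\n", n);     /* prints 42 */
    return 0;
}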
Features The relatively low-level nature of the language affords the programmer close control over what the computer does, while allowing special tailoring and aggressive optimization for a particular platform. This allows the code to run efficiently on very limited hardware, such as embedded systems. C does not have some features that are available in some other programming languages:
• No assignment of arrays or strings (copying can be done via standard functions; assignment of objects having struct or union type is supported)
• No automatic garbage collection
• No requirement for bounds checking of arrays
• No operations on whole arrays
• No syntax for ranges, such as the A..B notation used in several languages
• No separate Boolean type: zero/nonzero is used instead[6]
• No formal closures or functions as parameters (only function and variable pointers)
• No generators or coroutines; intra-thread control flow consists of nested function calls, except for the use of the longjmp or setcontext library functions
• No exception handling; standard library functions signify error conditions with the global errno variable and/or special return values
• Only rudimentary support for modular programming
• No compile-time polymorphism in the form of function or operator overloading
• Only rudimentary support for generic programming
• Very limited support for object-oriented programming with regard to polymorphism and inheritance
• Limited support for encapsulation
• No native support for multithreading and networking
• No standard libraries for computer graphics and several other application programming needs
A number of these features are available as extensions in some compilers, or can be supplied by third-party libraries, or can be simulated by adopting certain coding disciplines.
Operators C supports a rich set of operators, which are symbols used within an expression to specify the manipulations to be performed while evaluating that expression. C has operators for:
• arithmetic (+, -, *, /, %)
• equality testing (==, !=)
• order relations (<, <=, >, >=)
• boolean logic (!, &&, ||)
• bitwise logic (~, &, |, ^)
• bitwise shifts (<<, >>)
• assignment (=, +=, -=, *=, /=, %=, &=, |=, ^=, <<=, >>=)
• increment and decrement (++, --)
• reference and dereference (&, *, [ ])
• conditional evaluation (? :)
• member selection (., ->)
• type conversion (( ))
• object size (sizeof)
• function argument collection (( ))
• sequencing (,)
• subexpression grouping (( ))
C has a formal grammar, specified by the C standard.
Data structures C has a static weak typing type system that shares some similarities with that of other ALGOL descendants such as Pascal. There are built-in types for integers of various sizes, both signed and unsigned, floating-point numbers, characters, and enumerated types (enum). C99 added a boolean datatype. There are also derived types including arrays, pointers, records (struct), and untagged unions (union). C is often used in low-level systems programming where escapes from the type system may be necessary. The compiler attempts to ensure type correctness of most expressions, but the programmer can override the checks in various ways, either by using a type cast to explicitly convert a
value from one type to another, or by using pointers or unions to reinterpret the underlying bits of a value in some other way.
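The two escape hatches mentioned above can be illustrated briefly; this is a generic sketch, and reading a union member other than the one last written is implementation-defined behaviour:

/* Overriding type checks: an explicit cast, and a union that reinterprets bits. */
#include <stdio.h>

union bits { float f; unsigned long u; };

int main(void)
{
    double d = 3.75;
    int truncated = (int)d;        /* explicit cast converts the value, losing .75 */
    union bits b;

    b.u = 0;                       /* clear all bytes of the union first */
    b.f = 1.0f;                    /* store a float ... */
    printf("%d\n", truncated);     /* prints 3 */
    printf("%lx\n", b.u);          /* ... and view its underlying bit pattern */
    return 0;
}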
Arrays Array types in C are traditionally of a fixed, static size specified at compile time. (The more recent C99 standard also allows a form of variable-length arrays.) However, it is also possible to allocate a block of memory (of arbitrary size) at run-time, using the standard library's malloc function, and treat it as an array. C's unification of arrays and pointers (see below) means that true arrays and these dynamically-allocated, simulated arrays are virtually interchangeable. Since arrays are always accessed (in effect) via pointers, array accesses are typically not checked against the underlying array size, although the compiler may provide bounds checking as an option. Array bounds violations are therefore possible and rather common in carelessly written code, and can lead to various repercussions, including illegal memory accesses, corruption of data, buffer overruns, and run-time exceptions. C does not have a special provision for declaring multidimensional arrays, but rather relies on recursion within the type system to declare arrays of arrays, which effectively accomplishes the same thing. The index values of the resulting "multidimensional array" can be thought of as increasing in row-major order. Although C supports static arrays, it is not required that array indices be validated (bounds checking). For example, one can try to write to the sixth element of an array with five elements, generally yielding undesirable results. This type of bug, called a buffer overflow or buffer overrun, is notorious for causing a number of security problems. On the other hand, since bounds checking elimination technology was largely nonexistent when C was defined, bounds checking came with a severe performance penalty, particularly in numerical computation. A few years earlier, some Fortran compilers had a switch to toggle bounds checking on or off; however, this would have been much less useful for C, where array arguments are passed as simple pointers.
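A minimal sketch of the dynamically allocated, simulated array mentioned above (standard library calls only):

/* A malloc'd block used exactly like an array; no bounds checking is performed. */
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int i, n = 5;
    int *a = (int *)malloc(n * sizeof(int));    /* size chosen at run time */
    if (a == NULL)
        return 1;                               /* allocation failed */
    for (i = 0; i < n; i++)
        a[i] = i * i;                           /* indexed like a true array */
    for (i = 0; i < n; i++)
        printf("%d ", a[i]);
    printf("\n");
    free(a);                                    /* the programmer must release it */
    return 0;
}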
Deficiencies Although the C language is extremely concise, C is subtle, and expert competency in C is not common; it can take more than ten years to achieve.[11] C programs are also notorious for security vulnerabilities due to the unconstrained direct access to memory of many of the standard C library function calls. In spite of its popularity and elegance, real-world C programs commonly suffer from instability and memory leaks, to the extent that any appreciable C programming project will have to adopt specialized practices and tools to mitigate spiraling damage. Indeed, an entire industry has been born merely out of the need to stabilize large source-code bases. Although C was developed for Unix, Microsoft adopted C as the core language of its operating systems. Although all standard C library calls are supported by Windows, there is only ad-hoc support for Unix functionality side by side with an inordinate number of inconsistent Windows-specific API calls. There is currently no document in existence that can explain programming practices that work well across both Windows and Unix. C also does not fix the size or endianness of its types: for example, each compiler is free to choose the size of an int type as anything of at least 16 bits, according to what is efficient on the current platform. Many programmers work based on size and endianness assumptions, leading to code that is not portable. Similarly, the C standard defines only a very limited gamut of functionality, excluding anything related to network communications, user interaction, or process/thread creation. The POSIX standard, which incorporates the C standard, includes such a wide array of functionality that no operating system appears to support it exactly, and only UNIX systems have even attempted to support substantial parts of it. Therefore the kinds of programs that can be portably written are extremely restricted, unless specialized programming practices are adopted.
SOFTWARE AND HARDWARE TOOLS Windows XP Windows XP is a line of operating systems produced by Microsoft for use on personal computers, including home and business desktops, notebook computers, and media centers. The name "XP" is short for "experience". Windows XP is the successor to both Windows 2000 Professional and Windows Me, and is the first consumer-oriented operating system produced by Microsoft to be built on the Windows NT kernel and architecture. Windows XP was first released on 25 October 2001, and over 400 million copies were in use in January 2006, according to an estimate
in that month by an IDC analyst. It is succeeded by Windows Vista, which was released to volume license customers on 8 November 2006 and worldwide to the general public on 30 January 2007. Direct OEM and retail sales of Windows XP ceased on 30 June 2008, although it is still possible to obtain Windows XP from System Builders (smaller OEMs who sell assembled computers) until 31 July 2009 or by purchasing Windows Vista Ultimate or Business and then downgrading to Windows XP.
Windows XP introduced several new features to the Windows line, including: • Faster start-up and hibernation sequences • The ability to discard a newer device driver in favor of the previous one (known as driver rollback), should a driver upgrade not produce desirable results • A new, arguably more user-friendly interface, including the framework for developing themes for the desktop environment • Fast user switching, which allows a user to save the current state and open applications of their desktop and allow another user to log on without losing that information • The ClearType font rendering mechanism, which is designed to improve text readability on Liquid Crystal Display (LCD) and similar monitors • Remote Desktop functionality, which allows users to connect to a computer running Windows XP Pro from across a network or the Internet and access their applications, files, printers, and devices • Support for most DSL modems and wireless network connections, as well as networking over FireWire, and Bluetooth.
Turbo C++ Turbo C++ is a C++ compiler and integrated development environment (IDE) from Borland. The original Turbo C++ product line was put on hold after 1994, and was revived in 2006 as an introductory-level IDE, essentially a stripped-down version of their flagship C++ Builder. Turbo C++ 2006 was released on September 5, 2006 and is available in 'Explorer' and 'Professional' editions. The Explorer edition is free to download and distribute while the Professional edition is a commercial product. The professional edition is no longer available for purchase from Borland.
Turbo C++ 3.0 was released in 1991 (shipping on November 20), and came in amidst expectations of the coming release of Turbo C++ for Microsoft Windows. Initially released as an MS-DOS compiler, 3.0 supported C++ templates, Borland's inline assembler, and generation of MS-DOS mode executables for both 8086 real mode and 286 protected mode (as well as the Intel 80186). Version 3.0 implemented AT&T C++ 2.1, the most recent specification at the time. The separate Turbo Assembler product was no longer included, but the inline assembler could stand in as a reduced-functionality version. Starting with version 3.0, Borland segmented their C++ compiler into two distinct product lines: "Turbo C++" and "Borland C++". Turbo C++ was marketed toward the hobbyist and entry-level compiler market, while Borland C++ targeted the professional application development market. Borland C++ included additional tools, compiler code-optimization, and documentation to address the needs of commercial developers. Turbo C++ 3.0 could be upgraded with separate add-ons, such as Turbo Assembler and Turbo Vision 1.0.
HARDWARE REQUIREMENT
Processor        : Pentium (IV)
RAM              : 256 MB
Hard Disk        : 40 GB
FDD              : 4 GB
Monitor          : LG

SOFTWARE REQUIREMENT
Platform Used    : Turbo C++ 3.0
Operating System : WINDOWS XP & other versions
Languages        : C
FEASIBILITY STUDY Feasibility study: The feasibility study is a general examination of the potential of an idea to be converted into a business. This study focuses largely on the ability of the entrepreneur to convert the idea into a business enterprise. The feasibility study differs from the viability study as the viability study is an in-depth investigation of the profitability of the idea to be converted into a business enterprise.
Types of Feasibility Studies The following sections describe various types of feasibility studies.
• Technology and System Feasibility This involves questions such as whether the technology needed for the system exists, how difficult it will be to build, and whether the firm has enough experience using that technology. The assessment is based on an outline design of system requirements in terms of Input, Processes, Output, Fields, Programs, and Procedures. This can be quantified in terms of volumes of data, trends, frequency of updating, etc in order to estimate if the new system will perform adequately or not.
• Resource Feasibility This involves questions such as how much time is available to build the new system, when it can be built, whether it interferes with normal business operations, type and amount of resources required,
dependencies, etc. Contingency and mitigation plans should also be stated here so that if the project does over run the company is ready for this eventuality.
• Schedule Feasibility A project will fail if it takes too long to be completed before it is useful. Typically this means estimating how long the system will take to develop, and if it can be completed in a given time period using some methods like payback period.
• Economic Feasibility Economic analysis is the most frequently used method for evaluating the effectiveness of a candidate system. More commonly known as cost/benefit analysis, the procedure is to determine the benefits and savings that are expected from a candidate system and compare them with costs. If benefits outweigh costs, then the decision is made to design and implement the system.
• Operational feasibility This asks whether the current work practices and procedures support a new system. It also covers social factors, i.e. how the organizational changes will affect the working lives of those affected by the system.
• Technical feasibility This centers around the existing computer system and the extent to which it can support the proposed addition.
SYSTEM DESIGN A lexical analyzer generator creates a lexical analyzer using a set of specifications, usually in the format:
p1    {action 1}
p2    {action 2}
............
pn    {action n}
Where pi is a regular expression and each action actioni is a program fragment that is to be executed whenever a lexeme matched by pi is found in the input. If more than one pattern matches, the longest lexeme matched is chosen. If there are two or more patterns that match the longest lexeme, the first listed matching pattern is chosen. This is usually implemented using a finite automaton. There is an input buffer with two pointers into it, a lexeme-beginning pointer and a forward pointer. The lexical analyzer generator constructs a transition table for a finite automaton from the regular expression patterns in the lexical analyzer generator specification. The lexical analyzer itself consists of a finite automaton simulator that uses this transition table to look for the regular expression patterns in the input buffer. This can be implemented using an NFA or a DFA. The transition table for an NFA is considerably smaller than that for a DFA, but the DFA recognises patterns faster than the NFA.
Using NFA
The transition table for the NFA N is constructed for the composite pattern p1|p2|. . .|pn. The NFA recognizes the longest prefix of the input that is matched by a pattern. In the final NFA, there is an accepting state for each pattern pi. The set of states the NFA can be in after seeing each input character is constructed. The NFA is simulated until it reaches termination or until it reaches a set of states from which there is no transition defined for the current input symbol. The specification for the lexical analyzer generator is designed so that a valid source program cannot entirely fill the input buffer without having the NFA reach termination. To find a correct match, two things are done. Firstly, whenever an accepting state is added to the current set of states, the current input position and the pattern pi corresponding to this accepting state are recorded. If the current set of states already contains an accepting state, then only the pattern that appears first in the specification is recorded. Secondly, the transitions are recorded until termination is reached. Upon termination, the forward pointer is retracted to the position at which the last match occurred. The pattern making this match identifies the token found, and the lexeme matched is the string between the lexeme-beginning and forward pointers. If no pattern matches, the lexical analyser should transfer control to some default recovery routine.
Using DFA Here a DFA is used for pattern matching. This method is a modified version of the method using an NFA. The NFA is converted to a DFA using a subset construction algorithm. Here there may be several accepting states in a given subset of nondeterministic states; the accepting state corresponding to the pattern listed first in the lexical analyzer generator specification has priority. Here also, state transitions are made until a state is reached which has no next state for the current input symbol. The last input position at which the DFA entered an accepting state gives the lexeme.
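A much-simplified, table-driven sketch of the DFA simulation described above, recognizing only unsigned-integer lexemes; the states, character classes and names are hypothetical and far smaller than a generated analyzer's tables:

/* Sketch: table-driven DFA that accepts unsigned integer lexemes and
   remembers the last accepting position to implement longest match. */
#include <stdio.h>
#include <ctype.h>

#define REJECT -1

/* character classes: 0 = digit, 1 = anything else */
static int char_class(int c) { return isdigit(c) ? 0 : 1; }

/* transition[state][class]; state 1 is the (single) accepting state */
static const int transition[2][2] = {
    { 1, REJECT },    /* state 0 (start):     digit -> 1, other -> reject */
    { 1, REJECT }     /* state 1 (in number): digit -> 1, other -> reject */
};

/* Return length of the longest prefix of s that is matched, or 0 if none. */
int longest_match(const char *s)
{
    int state = 0, last_accept = 0, i;
    for (i = 0; s[i] != '\0'; i++) {
        state = transition[state][char_class((unsigned char)s[i])];
        if (state == REJECT)
            break;                   /* no transition: stop and retract */
        if (state == 1)
            last_accept = i + 1;     /* remember the last accepting position */
    }
    return last_accept;
}

int main(void)
{
    const char *input = "123abc";
    int len = longest_match(input);
    printf("matched %d characters: %.*s\n", len, len, input);
    return 0;
}

A generated analyzer would additionally map each accepting state back to the pattern (and hence token) it belongs to, choosing the first-listed pattern on ties, exactly as described above.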
DATA-FLOW DIAGRAM A data flow diagram (DFD) is a graphical representation of the "flow" of data through an information system. It differs from the flowchart as it shows the data flow instead of the control flow of the program. A data flow diagram can also be used for the visualization of data processing (structured design).
Context Level Diagram (Level 0)
A context level Data flow diagram created using Select SSADM. This level shows the overall context of the system and its operating environment, and shows the whole system as just one process. It does not usually show data stores, unless they are "owned" by external systems, e.g. accessed by but not maintained by this system; however, these are often shown as external entities.
Level 1
A Level 1 Data flow diagram for the same system. This level shows all processes at the first level of numbering, data stores, external entities and the data flows between them. The purpose of this level is to show the major high level processes of the system and their interrelation. A process model will have one, and only one, level 1 diagram. A level 1 diagram must be balanced with its parent context level diagram, i.e. there must be the same external entities and the same data flows, these can be broken down to more detail in the level 1, e.g. the "enquiry" data flow could be split into "enquiry request" and "enquiry results" and still be valid.
Level 2
A Level 2 Data flow diagram showing the "Process Enquiry" process for the same system. This level is a decomposition of a process shown in a level 1 diagram, as such there should be level 2 diagrams for each and every process shown in a level 1 diagram. In this example processes 1.1, 1.2 & 1.3 are all children of process 1, together they wholly and completely describe process 1, and combined must perform the full capacity of this parent process. As before, a level 2 diagram must be balanced with its parent level 1 diagram.
ENTITY-RELATIONSHIP DIAGRAM An entity-relationship model (ERM) in software engineering is an abstract and conceptual representation of data. Entity-relationship modeling is a relational schema database modeling method, used to produce a type of conceptual schema or semantic data model of a system, often a relational database, and its requirements in a top-down fashion. The first stage of information system design uses these models during the requirements analysis to describe information needs or the type of information that is to be stored in a database. The data modeling technique can be used to describe any ontology (i.e. an overview and classifications of used terms and their relationships) for a certain universe of discourse (i.e. area of interest). In the case of the design of an information system that is based on a database, the conceptual data model is, at a later stage (usually called logical design), mapped to a logical data model, such as the relational model; this in turn is mapped to a physical model during physical design. Note that sometimes, both of these phases are referred to as "physical design".
FLOW CHART A flowchart is a common type of chart that represents an algorithm or process, showing the steps as boxes of various kinds, and their order by connecting these with arrows. Flowcharts are used in analyzing, designing, documenting or managing a process or program in various fields. Flowcharts are used in designing and documenting complex processes. Like other types of diagram, they help visualize what is going on and thereby help the viewer to understand a process, and perhaps also find flaws, bottlenecks, and other less-obvious features within it. There are many different types of flowcharts, and each type has its own repertoire of boxes and notational conventions. The two most common types of boxes in a flowchart are: • A processing step, usually called an activity, and denoted as a rectangular box • A decision, usually denoted as a diamond.
Flow chart building blocks • Symbols A typical flowchart from older Computer Science textbooks may have the following kinds of symbols: • Start and end symbols Represented as lozenges, ovals or rounded rectangles, usually containing the word "Start" or "End", or another phrase signaling the
start or end of a process, such as "submit enquiry" or "receive product".
• Arrows Showing what's called "flow of control" in computer science. An arrow coming from one symbol and ending at another symbol represents that control passes to the symbol the arrow points to. • Processing steps Represented as rectangles. Examples: "Add 1 to X"; "replace identified part"; "save changes" or similar. • Input/Output Represented as a parallelogram. Examples: Get X from the user; display X. • Conditional or decision Represented as a diamond (rhombus). These typically contain a Yes/No question or True/False test. This symbol is unique in that it has two arrows coming out of it, usually from the bottom point and right point, one corresponding to Yes or True, and one corresponding to No or False. The arrows should always be labeled. More than two arrows can be used, but this is normally a clear indicator that a complex decision is being taken, in which case it may need to be broken-down further, or replaced with the "pre-defined process" symbol. A number of other symbols that have less universal currency, such as:
• A Document represented as a rectangle with a wavy base; • A Manual input represented by parallelogram, with the top irregularly sloping up from left to right. An example would be to signify data-entry from a form; • A Manual operation represented by a trapezoid with the longest parallel side at the top, to represent an operation or adjustment to process that can only be made manually. • A Data File represented by a cylinder Flowcharts may contain other symbols, such as connectors, usually represented as circles, to represent converging paths in the flow chart. Circles will have more than one arrow coming into them but only one going out. Some flow charts may just have an arrow point to another arrow instead. These are useful to represent an iterative process (what in Computer Science is called a loop). A loop may, for example, consist of a connector where control first enters, processing steps, a conditional with one arrow exiting the loop, and one going back to the connector. Offpage connectors are often used to signify a connection to a (part of another) process held on another sheet or screen. It is important to remember to keep these connections logical in order. All processes should flow from top to bottom and left to right.
TESTING METHODOLOGY Software Testing is an empirical investigation conducted to provide stakeholders with information about the quality of the product or service under test, with respect to the context in which it is intended to operate. This includes, but is not limited to, the process of executing a program or application with the intent of finding software bugs.
Static vs. dynamic testing There are many approaches to software testing. Reviews, walkthroughs or inspections are considered as static testing, whereas actually executing programmed code with a given set of test cases is referred to as dynamic testing. The former can be, and unfortunately in practice often is, omitted, whereas the latter takes place when programs begin to be used for the first time - which is normally considered the beginning of the testing stage. This may actually begin before the program is 100% complete in order to test particular sections of code (modules or discrete functions). For example, Spreadsheet programs are, by their very nature, tested to a large extent "on the fly" during the build process as the result of some calculation or text manipulation is shown interactively immediately after each formula is entered.
Software verification and validation Software testing is used in association with verification and validation: • Verification: Have we built the software right (i.e., does it match the specification?)? It is process based. • Validation: Have we built the right software (i.e., is this what the customer wants?)? It is product based.
Testing methods Software testing methods are traditionally divided into black box testing and white box testing. These two approaches are used to describe the point of view that a test engineer takes when designing test cases.
Black box testing Black box testing treats the software as a black box without any knowledge of internal implementation. Black box testing methods include equivalence partitioning, boundary value analysis, all-pairs testing, fuzz testing, model-based testing, traceability matrix, exploratory testing and specification-based testing. Specification-based testing Specification-based testing aims to test the functionality according to the requirements. Thus, the tester inputs data and only sees the output from the test object. This level of testing usually requires thorough test cases to be provided to the tester who then can simply verify that for a given input, the output value (or behavior), is the same as the expected value specified in the test case. Specification-based testing is necessary but insufficient to guard against certain risks. Advantages and disadvantages The black box tester has no "bonds" with the code, and a tester's perception is very simple: a code MUST have bugs. Using the principle, "Ask and you shall receive," black box testers find bugs where programmers don't. BUT, on the other hand, black box testing is like a walk in a dark labyrinth without a flashlight, because the tester doesn't know how the back end was actually constructed.
That's why there are situations when 1. A black box tester writes many test cases to check something that can be tested by only one test case and/or 2. Some parts of the back end are not tested at all Therefore, black box testing has the advantage of an unaffiliated opinion on the one hand and the disadvantage of blind exploring on the other.
White box testing White box testing, by contrast to black box testing, is when the tester has access to the internal data structures and algorithms (and the code that implements these). Types of white box testing The following types of white box testing exist: • Code coverage - creating tests to satisfy some criteria of code coverage. For example, the test designer can create tests to cause all statements in the program to be executed at least once. • Mutation testing methods. • Fault injection methods. • Static testing - white box testing includes all static testing. Code completeness evaluation White box testing methods can also be used to evaluate the completeness of a test suite that was created with black box testing methods. This allows the software team to examine parts of a system that are rarely tested and ensures that the most important function points have been tested.
Two common forms of code coverage are: • function coverage, which reports on functions executed, and • statement coverage, which reports on the number of lines executed to complete the test. Both return a coverage metric, measured as a percentage.
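For example, if a test run executes 45 of the 60 statements in a module (figures chosen purely for illustration), statement coverage is 45/60 = 75%.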
TESTING STRATEGY A software testing strategy is a well-planned series of steps that result in the successful construction of the software. It should be able to test the errors in the software specification, design & coding phases of software development. A software testing strategy always starts with coding & moves in the upward direction. Thus a testing strategy can be divided into four phases:
• Unit Testing        : Used for the coding phase
• Integration Testing : Used for the design phase
• System Testing      : For system engineering
• Acceptance Testing  : For user acceptance
Unit Testing
In computer programming, unit testing is a method of testing that verifies that the individual units of source code are working properly. A unit is the smallest testable part of an application. In procedural programming a unit may be an individual program, function, procedure, etc., while in object-oriented programming, the smallest unit is a method, which may belong to a base/super class, abstract class or derived/child class. Ideally, each test case is independent of the others; test doubles such as stubs, mock or fake objects, as well as test harnesses, can be used to assist in testing a module in isolation. Unit testing is typically done by software developers to ensure that the code other developers have written meets software requirements and behaves as the developer intended.
Benefits The goal of unit testing is to isolate each part of the program and show that the individual parts are correct. A unit test provides a strict, written contract that the piece of code must satisfy. As a result, it affords several benefits. Unit tests find problems early in the development cycle.
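A minimal illustration of a unit test in C using the standard assert macro; the function under test, is_keyword, is a stand-in for one small unit of the project, not its actual code:

/* One unit (is_keyword) tested in isolation with simple assertions. */
#include <assert.h>
#include <string.h>
#include <stdio.h>

static int is_keyword(const char *s)
{
    static const char *keywords[] = { "int", "char", "if", "while", "return" };
    int i;
    for (i = 0; i < 5; i++)
        if (strcmp(s, keywords[i]) == 0)
            return 1;
    return 0;
}

int main(void)
{
    assert(is_keyword("int") == 1);     /* a known keyword is recognized */
    assert(is_keyword("main") == 0);    /* an identifier is not a keyword */
    assert(is_keyword("") == 0);        /* the empty string is handled */
    printf("all unit tests passed\n");
    return 0;
}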
Integration Testing
'Integration testing' (sometimes called Integration and Testing, abbreviated I&T) is the phase of software testing in which individual software modules are combined and tested as a group. It follows unit testing and precedes system testing. Integration testing takes as its input modules that have been unit tested, groups them in larger aggregates, applies tests defined in an integration test plan to those aggregates, and delivers as its output the integrated system ready for system testing. Purpose The purpose of integration testing is to verify functional, performance and reliability requirements placed on major design items. These "design items", i.e. assemblages (or groups of units), are exercised through their interfaces using Black box testing, success and error cases being simulated via appropriate parameter and data inputs. Simulated usage of shared data areas and inter-process communication is tested and individual subsystems are exercised through their input interface. Test cases are constructed to test that all components within assemblages
interact correctly, for example across procedure calls or process activations, and this is done after testing individual modules, i.e. unit testing. The overall idea is a "building block" approach, in which verified assemblages are added to a verified base which is then used to support the integration testing of further assemblages. Some different types of integration testing are big bang, top-down, and bottom-up.
System Testing System testing of software or hardware is testing conducted on a complete, integrated system to evaluate the system's compliance with its specified requirements. System testing falls within the scope of black box testing, and as such, should require no knowledge of the inner design of the code or logic. As a rule, system testing takes, as its input, all of the "integrated" software components that have successfully passed integration testing and also the software system itself integrated with any applicable hardware system(s). The purpose of integration testing is to detect any inconsistencies between the software units that are integrated together (called assemblages) or between any of the assemblages and the hardware. System testing is a more limiting type of testing; it seeks to detect defects both within the "inter-assemblages" and also within the system as a whole.
Acceptance Testing In engineering and its various sub disciplines, acceptance testing is black-box testing performed on a system (e.g. software, lots of manufactured mechanical parts, or batches of chemical products) prior to its delivery. In some engineering sub disciplines, it is known as Functional testing, black-box testing, release acceptance, QA testing, application testing, confidence testing, final testing, validation testing, usability testing, or factory acceptance testing. In most environments, acceptance testing by the system provider is distinguished from acceptance testing by the customer (the user or client) prior to accepting transfer of ownership. In such environments, acceptance testing performed by the customer is known as beta testing, user acceptance testing (UAT), end user testing, site (acceptance) testing, or field (acceptance) testing.
System security One might think that there is little reason to be concerned about security in an intranet. After all, by definition an intranet is internal to one's organization; outsiders cannot access it. There are strong arguments for the position that an intranet should be completely open to its users, with little or no security.
Information security Information security means protecting information and information systems from unauthorized access, use, disclosure, disruption, modification, or destruction. The terms information security, computer security and information assurance are frequently, and incorrectly, used interchangeably. These fields are often interrelated and share the common goals of protecting the confidentiality, integrity and availability of information; however, there are some subtle differences between them. These differences lie primarily in the approach to the subject, the methodologies used, and the areas of concentration. Information security is concerned with the confidentiality, integrity and availability of data regardless of the form the data may take: electronic, print, or other forms. Computer security can focus on ensuring the availability and correct operation of a computer system without concern for the information stored or processed by the computer.
Security classification for information
An important aspect of information security and risk management is recognizing the value of information and defining appropriate procedures and protection requirements for the information. Not all information is equal and so not all information requires the same degree of protection. This requires information to be assigned a security classification. The first step in information classification is to identify a member of senior management as the owner of the particular information to be classified. Next, develop a classification policy. The policy should describe the different classification labels, define the criteria for information to be assigned a particular label, and list the required security controls for each classification.
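Purely as an assumed sketch of such a policy (the labels and the controls below are invented for illustration, not drawn from any standard), the mapping from classification label to required controls can be written down explicitly:

#include <stdio.h>

/* Illustrative classification labels and the controls a policy might require. */
enum label { PUBLIC, INTERNAL, CONFIDENTIAL };

static const char *required_controls(enum label l)
{
    switch(l)
    {
        case PUBLIC:       return "no special controls";
        case INTERNAL:     return "access limited to staff";
        case CONFIDENTIAL: return "encryption and audit logging";
    }
    return "unknown";
}

int main(void)
{
    printf("CONFIDENTIAL -> %s\n", required_controls(CONFIDENTIAL));
    return 0;
}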
Identification Identification is an assertion of who someone is or what something is. If a person makes the statement "Hello, my name is John Doe." they are making a claim of who they are. However, their claim may or may not be true. Before John Doe can be granted access to protected information it will be necessary to verify that the person claiming to be John Doe really is John Doe.
Authentication Authentication is the act of verifying a claim of identity. When John Doe goes into a bank to make a withdrawal, he tells the bank teller he is John Doe (a claim of identity). The bank teller asks to see a photo ID, so he hands the teller his driver's license. The bank teller checks the license to make sure it has John Doe printed on it and compares the photograph on the license against the person claiming to be John Doe. If the photo and name match the person, then the teller has authenticated that John Doe is who he claimed to be.
Authorization Authorization to access information and other computing services begins with administrative policies and procedures. The policies prescribe what information and computing services can be accessed, by whom, and under what conditions. The access control mechanisms are then configured to enforce these policies. Different computing systems are equipped with different kinds of access control mechanisms - some may even offer a choice of different access control mechanisms. The access control mechanism a system offers will be based upon one of three approaches to access control or it may be derived from a combination of the three approaches.
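A minimal sketch of such an access control mechanism is shown below; the user names, resources and table layout are invented for illustration only.

#include <stdio.h>
#include <string.h>

/* Illustrative access-control list: which authenticated user may use which resource. */
struct acl_entry { const char *user; const char *resource; };

static const struct acl_entry acl[] = {
    { "john_doe", "account_statement" },
    { "teller",   "customer_records"  },
};

static int is_authorized(const char *user, const char *resource)
{
    size_t k;
    for(k = 0; k < sizeof(acl) / sizeof(acl[0]); k++)
        if(strcmp(acl[k].user, user) == 0 && strcmp(acl[k].resource, resource) == 0)
            return 1;
    return 0;
}

int main(void)
{
    printf("john_doe -> account_statement: %s\n",
           is_authorized("john_doe", "account_statement") ? "allowed" : "denied");
    printf("john_doe -> customer_records:  %s\n",
           is_authorized("john_doe", "customer_records") ? "allowed" : "denied");
    return 0;
}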
Implementation & maintenance Implementation The final phase of the development process is the implementation of the new system. This phase is the culmination of the previous phases and is performed only after each of the prior phases has been successfully completed to the satisfaction of both the user and quality assurance. The tasks that comprise the implementation phase include the installation of hardware, proper scheduling of the resources needed to put the system into production, and a complete set of instructions that support both the users and the IS environment.
Coding This means that the procedural specification has been finished and the construction of the program begins. Once the design phase was over, coding commenced. Coding is a natural consequence of design; the coding step translates a detailed design representation of the software into a programming-language realization. The main emphasis while coding was on style, so that the end result was optimized code. The following points were kept in consideration while coding.
Coding style The structured programming method was used in all the modules of the project. It incorporated the following features. The code has been written so that the definition and implementation of each function is contained in one file. A group of related functions was clubbed together in one file so that it can be included when needed, saving us the labor of writing it again and again.
Naming convention As the project size grows, so does the complexity of recognizing the purpose of each variable. Thus the variables were given meaningful names, which help in understanding the context and purpose of the variable. The functions were also given meaningful names that can be easily understood by the user.
Indentation Judicious use of indentation can make the task of reading and understanding a program much simpler. Indentation is an essential part of a good program. If code is indented without thought, it will seriously affect the readability of the program. The higher level statements, like the definitions of variables, constants and functions, are indented, with each nested block indented one level further, stating their purpose in the code. A blank line is also left between function definitions to make the code look neat. A comment at the top of each source file stating the purpose of the file is also provided.
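The following short fragment is only an assumed illustration of these conventions (the identifiers are examples, not taken from the project code): each name describes its purpose, each nested block is indented one level further, and a blank line separates definitions.

/* Illustrative fragment showing the naming and indentation conventions
   described above; the identifiers are examples, not project code. */
#include <stdio.h>

#define MAX_TOKEN_LENGTH 30

/* Prints one token together with its class, e.g. "x1 : Identifier". */
static void print_token(const char *token_text, const char *token_class)
{
    if(token_text != NULL)
    {
        printf("%s : %s\n", token_text, token_class);
    }
}

int main(void)
{
    print_token("x1", "Identifier");
    print_token("while", "Keyword");
    return 0;
}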
Maintenance Maintenance testing is testing that is performed either to identify equipment problems, to diagnose equipment problems, or to confirm that repair measures have been effective. It can be performed at the system level (e.g., the HVAC system), the equipment level (e.g., the blower in an HVAC line), or the component level (e.g., a control chip in the control box for the blower in the HVAC line).
Preventive maintenance The care and servicing by personnel for the purpose of maintaining equipment and facilities in satisfactory operating condition by providing for systematic inspection, detection, and correction of incipient failures either before they occur or before they develop into major defects. Maintenance, including tests, measurements, adjustments, and parts replacement, performed specifically to prevent faults from occurring. To make it simple: Preventive maintenance is conducted to keep equipment working and/or extend the life of the equipment. Corrective maintenance, sometimes called "repair", is conducted to get equipment working again.
The primary goal of maintenance is to avoid or mitigate the consequences of failure of equipment. This may be by preventing the failure before it actually occurs which PM and condition based maintenance help to achieve. It is designed to preserve and restore equipment reliability by replacing worn components before they actually fail. Preventive maintenance activities include partial or complete overhauls at specified periods, oil changes, lubrication and so on. In addition, workers can record equipment deterioration so they know to replace or repair worn parts before they cause system failure. The ideal preventive maintenance program would prevent all equipment failure before it occurs.
Corrective maintenance The idle time for production machines in a factory is mainly due to the following reasons:
• Lack of materials
• Machine fitting, cleaning, tool replacement etc.
• Breakdowns
Considering only breakdown idle time, it can be split into the following components:
• Operator's inspection time - the time required by the machine operator to check the machine in order to detect the reason for the breakdown, before calling the maintenance department.
• Operator's repairing time - the time required by the machine operator to fix the machine himself, in case he is able to do so.
• Maintenance dead time - the time lost by the machine operator waiting for the machine to be repaired by maintenance personnel, from the time they start the repair until the moment they finish their task.
In the corrective environment the system has been conceived to reduce the breakdown detection and diagnosis times and to supply the adequate information required to perform the repair operations. Different sensors are connected to every machine in the workshop, to detect any change in the various parameters when they run out of normal performance or a shutdown is produced.
EVALUATION The lexical analyzer converts a stream of input characters into a stream of tokens. The different tokens that our lexical analyzer identifies are as follows:
• KEYWORDS: int, char, float, double, if, for, while, else, switch, struct, printf, scanf, case, break, return, typedef, void
• IDENTIFIERS: main, fopen, getch etc.
• NUMBERS: positive and negative integers, positive and negative floating point numbers.
• OPERATORS: +, ++, -, --, ||, *, ?, /, >, >=, <, <=, =, ==, &, &&.
• BRACKETS: [ ], { }, ( ).
• STRINGS: set of characters enclosed within quotes.
• COMMENT LINES: single line and multi line comments are ignored.
For tokenizing into identifiers and keywords we incorporate a symbol table which initially consists of the predefined keywords. The tokens are read from an input file. If the encountered token is an identifier or a keyword, the lexical analyzer looks it up in the symbol table to check whether the token already exists. If an entry does exist, we proceed to the next token; if not, that token along with its token value is written into the symbol table. The rest of the tokens are directly displayed by writing them into an output file. The output file consists of all the tokens present in the input file along with their respective token values.
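The symbol-table handling described above can be sketched as follows; the table layout and the lookup_or_insert() name are assumptions made for illustration and are separate from the program listed in the next section.

#include <stdio.h>
#include <string.h>

#define TABLE_SIZE 50

/* Symbol table seeded with predefined keywords; identifiers found in the
   input are appended after them. */
static char symtab[TABLE_SIZE][20] = { "int", "char", "float", "if", "for", "while" };
static int  symcount = 6;

/* Returns the token value (index) of the lexeme, inserting it if absent. */
static int lookup_or_insert(const char *lexeme)
{
    int k;
    for(k = 0; k < symcount; k++)
        if(strcmp(symtab[k], lexeme) == 0)
            return k;                      /* already present: keyword or known identifier */
    strcpy(symtab[symcount], lexeme);      /* new identifier: add an entry */
    return symcount++;
}

int main(void)
{
    printf("while -> %d\n", lookup_or_insert("while"));   /* existing keyword */
    printf("x1    -> %d\n", lookup_or_insert("x1"));      /* new identifier inserted */
    printf("x1    -> %d\n", lookup_or_insert("x1"));      /* found on the second lookup */
    return 0;
}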
CODE /* Program to make lexical analyzer that generates the tokens......
Created by: Ankita Verma, Harshi Yadav, Sameeksha Chauhan*/
#include<stdio.h>
#include<conio.h>
#include<string.h>
#include<ctype.h>
#define MAX 30
void main() { char str[MAX]; int state=0; int i=0, j, startid=0, endid, startcon, endcon;
clrscr();
for(j=0; j<MAX; j++) str[j]=NULL;
//Initialise NULL
printf("*** Program on Lexical Analysis ***");
printf("\n\nEnter the string: "); gets(str);
//Accept input string
str[strlen(str)]=' '; printf("\n\nAnalysis:"); while(str[i]!=NULL) { while(str[i]==' ')
//To eliminate spaces
i++; switch(state) { case 0: if(str[i]=='i') state=1;
//if
else if(str[i]=='w') state=3;
//while
else if(str[i]=='d') state=8;
//do
else if(str[i]=='e') state=10; //else else if(str[i]=='f') state=14; //for else if(isalpha(str[i]) || str[i]=='_') { state=17; startid=i; } //identifiers
else if(str[i]=='<') state=19; //relational '<' or '<='
else if(str[i]=='>') state=21; //relational '>' or '>='
else if(str[i]=='=') state=23; //relational '==' or assignment '='
else if(isdigit(str[i])) { state=25; startcon=i; } //constant
else if(str[i]=='(') state=26; //special characters '('
else if(str[i]==')') state=27; //special characters ')'
else if(str[i]==';') state=28; //special characters ';'
else if(str[i]=='+') state=29;
//operator '+'
else if(str[i]=='-') state=30; //operator '-'
break;
//States for 'if' case 1: if(str[i]=='f') state=2; else { state=17; startid=i-1; i--; } break; case 2: if(str[i]=='(' || str[i]==NULL) { printf("\n\nif : Keyword"); state=0; i--; } else { state=17; startid=i-2; i--; } break;
//States for 'while' case 3: if(str[i]=='h') state=4; else { state=17; startid=i-1; i--; }
break; case 4: if(str[i]=='i') state=5; else { state=17; startid=i-2; i--; } break; case 5: if(str[i]=='l') state=6; else { state=17; startid=i-3; i--; } break; case 6: if(str[i]=='e') state=7; else { state=17; startid=i-4; i--; } break; case 7: if(str[i]=='(' || str[i]==NULL) { printf("\n\nwhile : Keyword"); state=0; i--; } else { state=17; startid=i-5; i--; } break;
//States for 'do' case 8: if(str[i]=='o') state=9; else { state=17; startid=i-1; i--; } break;
case 9: if(str[i]=='{' || str[i]==' ' || str[i]==NULL || str[i]=='(') { printf("\n\ndo : Keyword"); state=0; i--; } break;
//States for 'else' case 10: if(str[i]=='l') state=11; else { state=17; startid=i-1; i--; } break; case 11: if(str[i]=='s') state=12; else { state=17; startid=i-2; i--; } break; case 12: if(str[i]=='e') state=13; else { state=17; startid=i-3; i--; } break; case 13: if(str[i]=='{' || str[i]==NULL) { printf("\n\nelse : Keyword"); state=0; i--; } else { state=17; startid=i-4; i--; } break;
//States for 'for' case 14: if(str[i]=='o') state=15; else { state=17; startid=i-1; i--; } break; case 15: if(str[i]=='r') state=16; else { state=17; startid=i-2; i--; } break; case 16: if(str[i]=='(' || str[i]==NULL) { printf("\n\nfor : Keyword"); state=0; i--; } else { state=17; startid=i-3; i--; } break;
//States for identifiers case 17:
if(isalnum(str[i]) || str[i]=='_') { state=18; i++; } else if(str[i]==NULL||str[i]=='<'||str[i]=='>'||str[i]=='('||str[i]==')'|| str[i]==';'||str[i]=='='||str[i]=='+'||str[i]=='-') state=18; i--; break;
case 18:
if(str[i]==NULL || str[i]=='<' || str[i]=='>' || str[i]=='(' || str[i]==')' || str[i]==';' || str[i]=='=' || str[i]=='+' || str[i]=='-') { endid=i-1; printf("\n\n"); for(j=startid; j<=endid; j++) printf("%c", str[j]); printf(" : Identifier"); state=0; i--; } break;
//States for relational operator '<' & '<=' case 19: if(str[i]=='=') state=20; else if(isalnum(str[i]) || str[i]=='_') { printf("\n\n< : Relational operator"); i--; state=0; } break; case 20: if(isalnum(str[i]) || str[i]=='_') { printf("\n\n<= : Relational operator"); i--; state=0; } break;
//States for relational operator '>' & '>=' case 21: if(str[i]=='=') state=22; else if(isalnum(str[i]) || str[i]=='_') { printf("\n\n> : Relational operator");
i--; state=0; } break; case 22: if(isalnum(str[i]) || str[i]=='_') { printf("\n\n>= : Relational operator"); i--; state=0; } break;
//States for relational operator '==' & assignment operator '=' case 23: if(str[i]=='=') state=24; else { printf("\n\n= : Assignment operator"); i--; state=0; } break; case 24: if(isalnum(str[i])) { printf("\n\n== : Relational operator"); state=0; i--; } break;
//States for constants case 25: if(isalpha(str[i])) { printf("\n\n*** ERROR ***"); puts(str); for(j=0; j<i; j++) printf(" "); printf("^"); printf("\n\nToken cannot be generated"); state=99; i--; } else if(str[i]==' ' || str[i]=='<' || str[i]=='>' || str[i]==NULL || str[i]==';' || str[i]=='=') { endcon=i-1; printf("\n\n"); for(j=startcon; j<=endcon; j++) printf("%c", str[j]); printf(" : Constant"); state=0; i--; } break;
//State for special character '(' case 26: printf("\n\n( : Special character"); startid=i; state=0; i--; break;
//State for special character ')' case 27: printf("\n\n) : Special character"); state=0; i--; break;
//State for special character ';' case 28: printf("\n\n; : Special character");
state=0; i--; break;
//State for operator '+' case 29: printf("\n\n+ : Operator"); state=0; i--; break;
//State for operator '-' case 30: printf("\n\n- : Operator"); state=0; i--; break;
//Error State case 99: goto END; } i++; } printf("\n\nEnd of program"); END:
getch(); }
/*
Output
Correct input -------------
*** Program on Lexical Analysis ***
Enter the string: for(x1=0; x1<=10; x1++);
Analysis:
for : Keyword
( : Special character
x1
: Identifier
=
: Assignment operator
0
: Constant
;
: Special character
x1
: Identifier
<=
: Relational operator
10
: Constant
;
: Special character
x1
: Identifier
+
: Operator
+
: Operator
)
: Special character
;
: Special character
End of program
Wrong input -----------
*** Program on Lexical Analysis ***
Enter the string: for(x1=0; x1<=19x; x++);
Analysis:
for : Keyword
( : Special character
x1
: Identifier
=
: Assignment operator
0
: Constant
;
: Special character
x1
: Identifier
<=
: Relational operator
Token cannot be generated */
ADVANTAGES AND DISADVANTAGES OF LEXICAL ANALYZER
ADVANTAGES
• Easier and faster development.
• More efficient and compact.
• Very efficient and compact.
DISADVANTAGES
• Done by hand.
• Development is complicated.
CONCLUSION
Lexical analysis is a stage in the compilation of any program. In this phase we generate tokens from the input stream of data. For performing this task we need a lexical analyzer.
So we have designed a lexical analyzer that generates tokens from the given input.
In the end, we would really like to thank our H.O.D. Mr. Sudhir Pathak from the bottom of our hearts for giving us such a fruitful opportunity to enhance our technical skills.
REFERENCE
• www.google.co.in
• www.wikipedia.com
• Let Us C : Yashwant Kanetkar
• Software Engineering : Roger Pressman
• System Software Engineering : D. M. Dhamdhere