Lexical Analyzer Project Report


“Lexical Analyzer”

A Major Project submitted in partial fulfillment for the award of the degree of Bachelor of Technology in the Department of Computer Science & Engineering (with specialization in Software Engineering)

Submitted to:
Mr. Sohit Agarwal
Asst. Professor

Submitted by:
Shubham Sharma
Mukesh Kumar
Shiv Dass
Vijay Kumar
Sourav Jasrotia

Department of Computer Science & Engineering
Suresh Gyan Vihar University

CERTIFICATE

This is to certify that the Project Report entitled “Lexical Analyzer”, which is submitted by Mukesh Kumar, Shubham Sharma, Shiv Dass, Vijay Kumar and Sourav Jasrotia in partial fulfillment of the requirement for the award of the B.Tech. degree in the Department of Computer Science and Engineering, is a record of the candidates' own work carried out by them under my supervision. The matter embodied in this thesis is original and has not been submitted for the award of any other degree.

Signature:
Name of Supervisor: Mr. Sohit Agarwal
Designation:

DECLARATION

I/We hereby declare that this submission is my/our own work and that, to the best of my/our knowledge and belief, it contains no material previously published or written by another person nor material which to a substantial extent has been accepted for the award of any other degree or diploma of the university or other institute of higher learning, except where due acknowledgment has been made in the text.

Signature:
Name: Mukesh Kumar
Enrolment No.: CP10101408698
Date:

ACKNOWLEDGEMENT

It is a great pleasure to acknowledge the support of the many people who have contributed to the successful completion of this project. This project would never have seen the light of day without the help and guidance that I have received. I am profoundly grateful for the support, cooperation and valuable guidance extended by my project mentor, Mr. Sohit Agarwal, Jaipur, in the development of this project and project report. I would also like to express sincere gratitude to Mr. Sohit Agarwal, Assistant Professor at Suresh Gyan Vihar University, for his valuable advice and kind encouragement as internal guide throughout the whole process. He has always been a source of inspiration; without his encouragement and help this project would not have materialized. Last but not least, we would like to express our sincere thanks to our family members and our friends for their constant encouragement.

Signature:
Name: Mukesh Kumar
Enrolment No.: CP10101408698
Date:

INTRODUCTION

AIM OF THE PROJECT
The aim of the project is to develop a lexical analyzer that can generate tokens for further processing by the compiler.

PURPOSE OF THE PROJECT
The lexical features of a language can be specified using a type-3 grammar. The job of the lexical analyzer is to read the source program one character at a time and produce as output a stream of tokens. The tokens produced by the lexical analyzer serve as input to the next phase, the parser. Thus, the lexical analyzer's job is to translate the source program into a form more conducive to recognition by the parser.

GOALS
To create tokens from the given input stream.

SCOPE OF PROJECT
The lexical analyzer converts the input program into a stream of valid words of the language, known as tokens.

The parser looks at the sequence of these tokens and identifies the language constructs occurring in the input program. The parser and the lexical analyzer work hand in hand: whenever the parser needs further tokens to proceed, it requests them from the lexical analyzer, which in turn scans the remaining input stream and returns the next token occurring there. Apart from that, the lexical analyzer also participates in the creation and maintenance of the symbol table, because the lexical analyzer is the first module to identify the occurrence of a symbol. If a symbol is being defined for the first time, it needs to be installed into the symbol table, and the lexical analyzer is the module most commonly used for doing so.
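The lookup-and-install behaviour described above can be pictured with a short, hedged C sketch. The table layout, field names and the lookup()/install() helpers below are illustrative assumptions, not the project's actual code.

#include <stdio.h>
#include <string.h>

#define MAX_SYMBOLS 100

/* Hypothetical symbol-table entry; field names are illustrative only. */
struct symbol {
    char name[32];
    char type[16];      /* e.g. "identifier" */
};

static struct symbol table[MAX_SYMBOLS];
static int symbol_count = 0;

/* Return the index of name in the table, or -1 if it is not present. */
static int lookup(const char *name)
{
    int i;
    for (i = 0; i < symbol_count; i++)
        if (strcmp(table[i].name, name) == 0)
            return i;
    return -1;
}

/* Install a symbol the first time the lexical analyzer sees it. */
static int install(const char *name, const char *type)
{
    int i = lookup(name);
    if (i >= 0)
        return i;                       /* already defined earlier */
    if (symbol_count >= MAX_SYMBOLS)
        return -1;                      /* table full              */
    strcpy(table[symbol_count].name, name);
    strcpy(table[symbol_count].type, type);
    return symbol_count++;              /* index of the new entry  */
}

int main(void)
{
    install("a", "identifier");
    install("b", "identifier");
    install("a", "identifier");         /* second occurrence: no new row */
    printf("symbols installed: %d\n", symbol_count);   /* prints 2       */
    return 0;
}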

SYSTEM DESIGN

Process:
The lexical analyzer is the first phase of a compiler. Its main task is to read the input characters and produce as output a sequence of tokens that the parser uses for syntax analysis. This interaction is summarized schematically in the block diagram below.

Upon receiving a “get next token” command from the parser, the lexical analyzer reads the input characters until it can identify the next token. Sometimes lexical analyzers are divided into a cascade of two phases, the first called “scanning” and the second “lexical analysis”. The scanner is responsible for doing simple tasks, while the lexical analyzer proper does the more complex operations. The lexical analyzer we have designed takes its input from an input file. It reads one character at a time from the input file, and continues to read until the end of the file is reached. It recognizes valid identifiers and keywords and specifies the token values of the keywords. It also identifies header files, #define statements, numbers, special characters, and various relational and logical operators, and it ignores white space and comments. It prints the output in a separate file, specifying the line number.
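A reduced sketch of such a scanner is shown below. It assumes hypothetical file names input.txt and output.txt and a small illustrative keyword table; the project's full analyzer also handles header files, #define statements, comments and operators, which are omitted here for brevity.

#include <stdio.h>
#include <ctype.h>
#include <string.h>

/* A few C keywords for illustration; the real analyzer's table may differ. */
static const char *keywords[] = { "int", "char", "if", "else", "while", "return" };

static int is_keyword(const char *word)
{
    size_t i;
    for (i = 0; i < sizeof(keywords) / sizeof(keywords[0]); i++)
        if (strcmp(word, keywords[i]) == 0)
            return 1;
    return 0;
}

int main(void)
{
    FILE *in = fopen("input.txt", "r");    /* hypothetical input file  */
    FILE *out = fopen("output.txt", "w");  /* hypothetical output file */
    int c, line = 1;
    char buf[128];

    if (in == NULL || out == NULL) {
        printf("could not open input/output file\n");
        return 1;
    }
    while ((c = fgetc(in)) != EOF) {
        if (c == '\n') { line++; continue; }         /* track line numbers   */
        if (isspace(c)) continue;                    /* ignore white space   */
        if (isalpha(c) || c == '_') {                /* identifier / keyword */
            int len = 0;
            do {
                if (len < (int)sizeof(buf) - 1) buf[len++] = (char)c;
                c = fgetc(in);
            } while (c != EOF && (isalnum(c) || c == '_'));
            buf[len] = '\0';
            if (c != EOF) ungetc(c, in);
            fprintf(out, "line %d: %s -> %s\n", line, buf,
                    is_keyword(buf) ? "keyword" : "identifier");
        } else if (isdigit(c)) {                     /* number constant      */
            int len = 0;
            do {
                if (len < (int)sizeof(buf) - 1) buf[len++] = (char)c;
                c = fgetc(in);
            } while (c != EOF && isdigit(c));
            buf[len] = '\0';
            if (c != EOF) ungetc(c, in);
            fprintf(out, "line %d: %s -> number\n", line, buf);
        } else {                                     /* everything else      */
            fprintf(out, "line %d: %c -> special/operator\n", line, c);
        }
    }
    fclose(in);
    fclose(out);
    return 0;
}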

BLOCK DIAGRAM:

Different tokens or lexemes are:
- Keywords
- Identifiers
- Operators
- Constants

Take the example below:

c = a + b;

After lexical analysis, a symbol table is generated as given below.

Token    Type
-----    ----------
c        identifier
=        operator
a        identifier
+        operator
b        identifier

SOFTWARE DEVELOPMENT LIFE CYCLE
The Systems Development Life Cycle (SDLC), or Software Development Life Cycle, in systems engineering and software engineering, relates to the process of developing systems, and to the models and methodologies that people use to develop these systems, generally computer or information systems. In software engineering this SDLC concept is developed into all kinds of software development methodologies: the framework that is used to structure, plan, and control the process of developing an information system, i.e., the software development process.

Overview
The Systems Development Life Cycle (SDLC) is any logical process used by a systems analyst to develop an information system, including requirements, validation, training, and user ownership. An SDLC should result in a high-quality system that meets or exceeds customer expectations within time and cost estimates, works effectively and efficiently in the current and planned Information Technology infrastructure, and is inexpensive to maintain and cost-effective to enhance.[2] In project management a project has both a life cycle and a "systems development life cycle", during which a number of typical activities occur. The project life cycle (PLC) encompasses all the activities of the project, while the systems development life cycle (SDLC) is focused on accomplishing the product requirements.

Systems Development Phases
The Systems Development Life Cycle (SDLC) adheres to important phases that are essential for developers, such as planning, analysis, design, and implementation, which are explained in the section below. There are several Systems Development Life Cycle models in existence. The oldest model, originally regarded as "the Systems Development Life Cycle", is the waterfall model: a sequence of stages in which the output of each stage becomes the input for the next. These stages generally follow the same basic steps, but many different waterfall methodologies give the steps different names, and the number of steps seems to vary between 4 and 7. There is no definitively correct Systems Development Life Cycle model, but the steps can be characterized and divided as follows.

Phases

Initiation Phase
The Initiation Phase begins when a business sponsor identifies a need or an opportunity. The purpose of the Initiation Phase is to:
- Identify and validate an opportunity to improve business accomplishments of the organization or a deficiency related to a business need.
- Identify significant assumptions and constraints on solutions to that need.
- Recommend the exploration of alternative concepts and methods to satisfy the need, including questioning the need for technology, i.e., will a change in the business process offer a solution?
- Assure executive business and executive technical sponsorship.

System Concept Development Phase
The System Concept Development Phase begins after a business need or opportunity is validated by the Agency/Organization Program Leadership and the Agency/Organization CIO. The purpose of the System Concept Development Phase is to:
- Determine the feasibility and appropriateness of the alternatives.
- Identify system interfaces.
- Identify basic functional and data requirements to satisfy the business need.
- Establish system boundaries; identify goals, objectives, critical success factors, and performance measures.
- Evaluate costs and benefits of alternative approaches to satisfy the basic functional requirements.
- Assess project risks.
- Identify and initiate risk mitigation actions.
- Develop high-level technical architecture, process models, data models, and a concept of operations.

Planning Phase
During this phase, a plan is developed that documents the approach to be used and includes a discussion of methods, tools, tasks, resources, project schedules, and user input. Personnel assignments, costs, project schedule, and target dates are established. A Project Management Plan is created with components related to acquisition planning, configuration management planning, quality assurance planning, concept of operations, system security, verification and validation, and systems engineering management planning.

Requirements Analysis Phase
This phase formally defines the detailed functional user requirements using the high-level requirements identified in the Initiation, System Concept, and Planning phases. The requirements are defined in this phase to a level of detail sufficient for systems design to proceed. They need to be measurable and testable, and relate to the business need or opportunity identified in the Initiation Phase. The requirements that will be used to determine acceptance of the system are captured in the Test and Evaluation Master Plan. The purposes of this phase are to:
- Further define and refine the functional and data requirements and document them in the Requirements Document.
- Complete business process reengineering of the functions to be supported (i.e., verify what information drives the business process, what information is generated, who generates it, where the information goes, and who processes it).
- Develop detailed data and process models (system inputs, outputs, and processes).
- Develop the test and evaluation requirements that will be used to determine acceptable system performance.

Design Phase
During this phase, the system is designed to satisfy the functional requirements identified in the previous phase. Since problems in the design phase could be very expensive to solve in later stages of the software development, a variety of elements are considered in the design to mitigate risk. These include:
- Identifying potential risks and defining mitigating design features.
- Performing a security risk assessment.
- Developing a conversion plan to migrate current data to the new system.
- Determining the operating environment.
- Defining major subsystems and their inputs and outputs.
- Allocating processes to resources.
- Preparing detailed logic specifications for each software module.

Development Phase
Effective completion of the previous stages is a key factor in the success of the Development Phase. The Development Phase consists of:
- Translating the detailed requirements and design into system components.
- Testing individual elements (units) for usability.
- Preparing for integration and testing of the IT system.

Integration and Test Phase
Subsystem integration, system, security, and user acceptance testing are conducted during the Integration and Test Phase. The user, with those responsible for quality assurance, validates that the functional requirements, as defined in the functional requirements document, are satisfied by the developed or modified system. OIT Security staff assess the system security and issue a security certification and accreditation prior to installation/implementation. Multiple levels of testing are performed, including:
- Testing at the development facility by the contractor, possibly supported by end users.
- Testing as a deployed system with end users working together with contract personnel.
- Operational testing by the end user alone, performing all functions.

Implementation Phase
This phase is initiated after the system has been tested and accepted by the user. In this phase, the system is installed to support the intended business functions. System performance is compared to performance objectives established during the planning phase. Implementation includes user notification, user training, installation of hardware, installation of software onto production computers, and integration of the system into daily work processes.

This phase continues until the system is operating in production in accordance with the defined user requirements.

Operations and Maintenance Phase
The system operation is ongoing. The system is monitored for continued performance in accordance with user requirements, and needed system modifications are incorporated. Operations continue as long as the system can be effectively adapted to respond to the organization's needs. When modifications or changes are identified, the system may reenter the planning phase. The purpose of this phase is to:
- Operate, maintain, and enhance the system.
- Certify that the system can process sensitive information.
- Conduct periodic assessments of the system to ensure the functional requirements continue to be satisfied.
- Determine when the system needs to be modernized, replaced, or retired.

Features
The relatively low-level nature of the language affords the programmer close control over what the computer does, while allowing special tailoring and aggressive optimization for a particular platform. This allows the code to run efficiently on very limited hardware, such as embedded systems. C does not have some features that are available in some other programming languages:
- No assignment of arrays or strings (copying can be done via standard functions; assignment of objects having struct or union type is supported)
- No automatic garbage collection
- No requirement for bounds checking of arrays
- No operations on whole arrays
- No syntax for ranges, such as the A..B notation used in several languages
- No separate Boolean type: zero/nonzero is used instead[6]
- No formal closures or functions as parameters (only function and variable pointers)
- No generators or coroutines; intra-thread control flow consists of nested function calls, except for the use of the longjmp or setcontext library functions
- No exception handling; standard library functions signify error conditions with the global errno variable and/or special return values (illustrated below)
- Only rudimentary support for modular programming
- No compile-time polymorphism in the form of function or operator overloading
- Only rudimentary support for generic programming
- Very limited support for object-oriented programming with regard to polymorphism and inheritance
- Limited support for encapsulation
- No native support for multithreading and networking
- No standard libraries for computer graphics and several other application programming needs

A number of these features are available as extensions in some compilers, or can be supplied by third-party libraries, or can be simulated by adopting certain coding disciplines.
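For instance, the errno convention mentioned in the list above can be seen in a short standard-library example. This is a generic C idiom, not code from this project:

#include <stdio.h>
#include <string.h>
#include <errno.h>

int main(void)
{
    /* fopen signals failure with a NULL return; errno carries the reason. */
    FILE *fp = fopen("no_such_file.txt", "r");
    if (fp == NULL) {
        printf("fopen failed: %s (errno = %d)\n", strerror(errno), errno);
        return 1;
    }
    fclose(fp);
    return 0;
}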

Operators

C supports a rich set of operators, which are symbols used within an expression to specify the manipulations to be performed while evaluating that expression. C has operators for:
- arithmetic (+, -, *, /, %)
- equality testing (==, !=)
- order relations (<, <=, >, >=)
- boolean logic (!, &&, ||)
- bitwise logic (~, &, |, ^)
- bitwise shifts (<<, >>)
- assignment (=, +=, -=, *=, /=, %=, &=, |=, ^=, <<=, >>=)
- increment and decrement (++, --)
- reference and dereference (&, *, [ ])
- conditional evaluation (? :)
- member selection (., ->)
- type conversion ((typename))
- object size (sizeof)
- function argument collection (( ))
- sequencing (,)
- subexpression grouping (( ))

C has a formal grammar, specified by the C standard.
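A few of these operators in action, in a small illustrative snippet not tied to the project's code:

#include <stdio.h>

int main(void)
{
    int a = 6, b = 3;

    printf("%d\n", a % b);            /* arithmetic: remainder, prints 0            */
    printf("%d\n", a > b ? a : b);    /* conditional evaluation, prints 6           */
    printf("%d\n", a & b);            /* bitwise AND of 110 and 011 -> 010, prints 2 */
    printf("%d\n", a << 1);           /* bitwise shift left, prints 12              */
    printf("%u\n", (unsigned)sizeof(int));  /* object size (commonly 2 or 4 bytes)  */
    a += b;                           /* compound assignment                        */
    printf("%d\n", a);                /* prints 9                                   */
    return 0;
}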

Data structures

C has a static, weak typing discipline that shares some similarities with that of other ALGOL descendants such as Pascal. There are built-in types for integers of various sizes, both signed and unsigned, floating-point numbers, characters, and enumerated types (enum). C99 added a Boolean data type. There are also derived types, including arrays, pointers, records (struct), and untagged unions (union).
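These built-in and derived types can be summarized in a brief illustrative snippet (generic C, not taken from the project):

#include <stdio.h>

/* Enumerated type */
enum color { RED, GREEN, BLUE };

/* Record type (struct) */
struct point { int x; int y; };

/* Union: members share the same storage */
union value {
    int   i;
    float f;
};

int main(void)
{
    int numbers[3] = { 10, 20, 30 };      /* array                        */
    int *p = numbers;                     /* pointer to its first element */
    struct point origin = { 0, 0 };
    enum color c = GREEN;
    union value v;

    v.i = 42;                             /* only one member is active    */
    printf("%d %d %d %d %d\n", numbers[2], *p, origin.x, (int)c, v.i);
    return 0;
}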

Deficiencies
Although the C language is extremely concise, C is subtle, and expert competency in C is not common; it can take more than ten years to achieve.[11] C programs are also notorious for security vulnerabilities, due to the unconstrained direct access to memory of many of the standard C library function calls. C does not limit the size or endianness of its types; for example, each compiler is free to choose the size of an int type as anything of at least 16 bits, according to what is efficient on the current platform. Many programmers work on the basis of size and endianness assumptions, leading to code that is not portable. Therefore the kinds of programs that can be portably written are extremely restricted, unless specialized programming practices are adopted.
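The size point can be checked directly. The following hedged snippet simply prints what the current compiler chose; <limits.h> is standard, and the exact numbers printed depend on the platform:

#include <stdio.h>
#include <limits.h>

int main(void)
{
    /* The C standard only guarantees that int is at least 16 bits wide;
       the actual width is the compiler's choice for the target platform. */
    printf("sizeof(int) = %u bytes\n", (unsigned)sizeof(int));
    printf("INT_MIN = %d, INT_MAX = %d\n", INT_MIN, INT_MAX);
    return 0;
}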

SOFTWARE AND HARDWARE TOOLS

Windows XP
Windows XP is a line of operating systems produced by Microsoft for use on personal computers, including home and business desktops, notebook computers, and media centers. The name "XP" is short for "experience". Windows XP is the successor to both Windows 2000 Professional and Windows Me, and is the first consumer-oriented operating system produced by Microsoft to be built on the Windows NT kernel and architecture.

Windows XP introduced several new features to the Windows line, including:
- Faster start-up and hibernation sequences
- The ability to discard a newer device driver in favor of the previous one (known as driver rollback), should a driver upgrade not produce desirable results
- A new, arguably more user-friendly interface, including the framework for developing themes for the desktop environment
- Fast user switching, which allows a user to save the current state and open applications of their desktop and allow another user to log on without losing that information
- The ClearType font rendering mechanism, which is designed to improve text readability on Liquid Crystal Display (LCD) and similar monitors
- Remote Desktop functionality, which allows users to connect to a computer running Windows XP Pro from across a network or the Internet and access their applications, files, printers, and devices
- Support for most DSL modems and wireless network connections, as well as networking over FireWire and Bluetooth.

Turbo C++
Turbo C++ is a C++ compiler and integrated development environment (IDE) from Borland. The original Turbo C++ product line was put on hold after 1994, and was revived in 2006 as an introductory-level IDE, essentially a stripped-down version of their flagship C++ Builder. Turbo C++ 2006 was released on September 5, 2006 and is available in 'Explorer' and 'Professional' editions. The Explorer edition is free to download and distribute, while the Professional edition is a commercial product. The Professional edition is no longer available for purchase from Borland.

HARDWARE REQUIREMENT

Processor        : Pentium (IV) or above
RAM              : 256 MB
Hard Disk        : 40 GB or above
FDD              : 4 GB or above

SOFTWARE REQUIREMENT

Platform Used    : Turbo C++ 3.0
Operating System : Windows XP and other versions
Languages        : C

FEASIBILITY STUDY

The feasibility study is a general examination of the potential of an idea to be converted into a business. This study focuses largely on the ability of the entrepreneur to convert the idea into a business enterprise. The feasibility study differs from the viability study, as the viability study is an in-depth investigation of the profitability of the idea to be converted into a business enterprise.

Types of Feasibility Studies
The following sections describe various types of feasibility studies.

Technology and System Feasibility
This involves questions such as whether the technology needed for the system exists, how difficult it will be to build, and whether the firm has enough experience using that technology. The assessment is based on an outline design of system requirements in terms of input, processes, output, fields, programs, and procedures. This can be quantified in terms of volumes of data, trends, frequency of updating, etc., in order to estimate whether the new system will perform adequately or not.

Resource Feasibility
This involves questions such as how much time is available to build the new system, when it can be built, whether it interferes with normal business operations, the type and amount of resources required, dependencies, etc. Contingency and mitigation plans should also be stated here, so that if the project does overrun the company is ready for this eventuality.

Schedule Feasibility
A project will fail if it takes too long to be completed before it is useful. Typically this means estimating how long the system will take to develop and whether it can be completed in a given time period, using methods such as the payback period.

Technical Feasibility
This centers around the existing computer system and the extent to which it can support the proposed addition.

SYSTEM DESIGN

A lexical analyzer generator creates a lexical analyzer using a set of specifications, usually in the format

p1    {action 1}
p2    {action 2}
...
pn    {action n}

where each pi is a regular expression and each action i is a program fragment that is to be executed whenever a lexeme matched by pi is found in the input. If more than one pattern matches, the longest lexeme matched is chosen. If two or more patterns match the longest lexeme, the first listed matching pattern is chosen.

This is usually implemented using a finite automaton. There is an input buffer with two pointers into it: a lexeme-beginning pointer and a forward pointer. The lexical analyzer generator constructs a transition table for a finite automaton from the regular expression patterns in the specification. The lexical analyzer itself consists of a finite automaton simulator that uses this transition table to look for the regular expression patterns in the input buffer. This can be implemented using an NFA or a DFA. The transition table for an NFA is considerably smaller than that for a DFA, but the DFA recognizes patterns faster than the NFA.

Using NFA
The transition table for the NFA N is constructed for the composite pattern p1|p2|...|pn. The NFA recognizes the longest prefix of the input that is matched by a pattern. In the final NFA, there is an accepting state for each pattern pi. The sequence of sets of states the NFA can be in after seeing each input character is constructed. The NFA is simulated until it reaches termination or it reaches a set of states from which there is no transition defined for the current input symbol. The specification for the lexical analyzer generator is written so that a valid source program cannot entirely fill the input buffer without having the NFA reach termination. The pattern making this match identifies the token found, and the lexeme matched is the string between the lexeme-beginning and forward pointers. If no pattern matches, the lexical analyzer should transfer control to some default recovery routine.

Using DFA
Here a DFA is used for pattern matching. This method is a modified version of the method using an NFA. The NFA is converted to a DFA using the subset construction algorithm. There may be several accepting states in a given subset of nondeterministic states; the accepting state corresponding to the pattern listed first in the lexical analyzer generator specification has priority. Here also, state transitions are made until a state is reached that has no next state for the current input symbol. The last input position at which the DFA entered an accepting state gives the lexeme.
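To make the table-driven simulation above concrete, here is a minimal, self-contained C sketch. It is an illustration only, not the project's code: it hard-codes a two-pattern DFA (pattern 1 = identifiers, pattern 2 = integer numbers), keeps a lexeme-beginning and a forward pointer into the buffer, and remembers the last accepting state so that the longest match wins.

#include <stdio.h>
#include <ctype.h>
#include <string.h>

enum { S_START, S_IDENT, S_NUMBER, S_DEAD, NUM_STATES };
enum { C_LETTER, C_DIGIT, C_OTHER, NUM_CLASSES };

/* transition[state][input class] -> next state */
static const int transition[NUM_STATES][NUM_CLASSES] = {
    /* S_START  */ { S_IDENT,  S_NUMBER, S_DEAD },
    /* S_IDENT  */ { S_IDENT,  S_IDENT,  S_DEAD },
    /* S_NUMBER */ { S_DEAD,   S_NUMBER, S_DEAD },
    /* S_DEAD   */ { S_DEAD,   S_DEAD,   S_DEAD }
};

static int char_class(int c)
{
    if (isalpha(c) || c == '_') return C_LETTER;
    if (isdigit(c))             return C_DIGIT;
    return C_OTHER;
}

static int accepting(int state)   /* which pattern, if any, accepts here */
{
    if (state == S_IDENT)  return 1;   /* pattern 1: identifier */
    if (state == S_NUMBER) return 2;   /* pattern 2: number     */
    return 0;
}

/* Scan one token starting at buf[*forward]; returns the pattern number
   (0 if none) and advances *forward past the longest match. */
static int next_token(const char *buf, int *forward, char *lexeme)
{
    int begin = *forward;                 /* lexeme-beginning pointer */
    int pos = begin, state = S_START;
    int last_accept_pattern = 0, last_accept_pos = begin;

    while (buf[pos] != '\0' && state != S_DEAD) {
        state = transition[state][char_class(buf[pos])];
        pos++;
        if (accepting(state)) {           /* remember the longest match */
            last_accept_pattern = accepting(state);
            last_accept_pos = pos;
        }
    }
    if (last_accept_pattern == 0) {       /* no pattern matched          */
        *forward = begin + 1;             /* default recovery: skip char */
        lexeme[0] = '\0';
        return 0;
    }
    memcpy(lexeme, buf + begin, last_accept_pos - begin);
    lexeme[last_accept_pos - begin] = '\0';
    *forward = last_accept_pos;           /* move the forward pointer    */
    return last_accept_pattern;
}

int main(void)
{
    const char *input = "count1 42x";
    char lexeme[64];
    int forward = 0, pattern;

    while (input[forward] != '\0') {
        if (input[forward] == ' ') { forward++; continue; }
        pattern = next_token(input, &forward, lexeme);
        if (pattern)
            printf("pattern %d matched lexeme \"%s\"\n", pattern, lexeme);
    }
    return 0;
}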

TESTING STRATEGY
A software testing strategy is a well-planned series of steps that results in the successful construction of the software. It should be able to catch errors introduced in the specification, design and coding phases of software development. A software testing strategy always starts with coding and moves in an upward direction. Thus a testing strategy can be divided into four phases:

Unit Testing        : used for coding
Integration Testing : used for the design phase
System Testing      : for system engineering
Acceptance Testing  : for user acceptance

Unit Testing

In computer programming, unit testing is a method of testing that verifies the individual units of source code are working properly. A unit is the smallest testable part of an application. In procedural programming a unit may be an individual program, function, procedure, etc., while in object-oriented programming, the smallest unit is a method, which may belong to a base/super class, abstract class or derived/child class.

Benefits
The goal of unit testing is to isolate each part of the program and show that the individual parts are correct. A unit test provides a strict, written contract that the piece of code must satisfy. As a result, it affords several benefits. Unit tests find problems early in the development cycle.
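As unit testing is applied at the level of individual functions in procedural C code, a minimal sketch of a unit test for this project might look like the following. The is_keyword() helper is hypothetical (the project's actual function names are not reproduced in this report), and assert() from the standard library stands in for a test framework.

#include <assert.h>
#include <string.h>
#include <stdio.h>

/* Hypothetical unit under test: keyword recognition. */
static int is_keyword(const char *word)
{
    static const char *keywords[] = { "int", "if", "else", "while", "return" };
    size_t i;
    for (i = 0; i < sizeof(keywords) / sizeof(keywords[0]); i++)
        if (strcmp(word, keywords[i]) == 0)
            return 1;
    return 0;
}

int main(void)
{
    /* Each assertion checks one expected behaviour of the unit. */
    assert(is_keyword("while") == 1);
    assert(is_keyword("count") == 0);
    assert(is_keyword("") == 0);
    printf("all unit tests passed\n");
    return 0;
}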

Integration Testing
Integration testing (sometimes called Integration and Testing, abbreviated I&T) is the phase of software testing in which individual software modules are combined and tested as a group. It follows unit testing and precedes system testing. Integration testing takes as its input modules that have been unit tested, groups them in larger aggregates, applies tests defined in an integration test plan to those aggregates, and delivers as its output the integrated system ready for system testing.

Purpose
The purpose of integration testing is to verify the functional, performance and reliability requirements placed on major design items. These "design items", i.e. assemblages (or groups of units), are exercised through their interfaces using black box testing, with success and error cases being simulated via appropriate parameter and data inputs. Simulated usage of shared data areas and inter-process communication is tested, and individual subsystems are exercised through their input interfaces. Test cases are constructed to test that all components within assemblages interact correctly, for example across procedure calls or process activations, and this is done after the individual units have been tested.

System Testing
System testing of software or hardware is testing conducted on a complete, integrated system to evaluate the system's compliance with its specified requirements. System testing falls within the scope of black box testing, and as such should require no knowledge of the inner design of the code or logic. Whereas integration testing detects inconsistencies between the software units that are integrated together (called assemblages), or between any of the assemblages and the hardware, system testing is a more limited type of testing: it seeks to detect defects both within the "inter-assemblages" and within the system as a whole.

Implementation & Maintenance

Implementation
The final phase of the development process is the implementation of the new system. This phase is the culmination of the previous phases and will be performed only after each of the prior phases has been successfully completed to the satisfaction of both the user and quality assurance. The tasks that comprise the implementation phase include the installation of hardware, proper scheduling of the resources needed to put the system into production, and a complete set of instructions that supports both the users and the IS environment.

Coding
This means program construction: once the procedural specification is finished, the coding for the program begins. Once the design phase was over, coding commenced. Coding is the natural consequence of design; the coding step translates a detailed design representation of the software into a programming-language realization.

The main emphasis while coding was on style, so that the end result was optimized code. The following points were kept in consideration while coding.

Coding style
The structured programming method was used in all the modules of the project. It incorporated the following features: the code has been written so that the definition and implementation of each function is contained in one file, and a group of related functions was clubbed together in one file so that it can be included when needed, saving us the labor of writing it again and again.

Maintenance
Maintenance testing is testing performed to identify equipment problems, diagnose equipment problems, or confirm that repair measures have been effective. It can be performed at the system level (e.g., the HVAC system), the equipment level (e.g., the blower in an HVAC line), or the component level (e.g., a control chip in the control box for the blower in the HVAC line).

Preventive maintenance
The care and servicing by personnel for the purpose of maintaining equipment and facilities in satisfactory operating condition by providing for systematic inspection, detection, and correction of incipient failures either before they occur or before they develop into major defects. It includes tests, measurements, adjustments, and parts replacement performed specifically to prevent faults from occurring.

Corrective maintenance
The idle time for production machines in a factory is mainly due to the following reasons:
- Lack of materials
- Machine fitting, cleaning, tool replacement, etc.
- Breakdowns

Taking into consideration only breakdown idle time, it can be split into the following components:
- Operator's inspection time: the time required by the machine operator to check the machine in order to detect the reason for the breakdown, before calling the maintenance department.
- Operator's repairing time: the time required by the machine operator to fix the machine himself, in case he is able to do so.
- Maintenance dead time: the time lost by the machine operator waiting for the machine to be repaired by maintenance personnel, from the time they start until the moment they finish their task.

OUTPUT

Bibliography

- C++ book by Steve Oualline
- C++ Cookbook by D. Ryan Stephens
- www.learncpp.com
- www.wikipedia.com
- www.progamiz.com
- www.tutorialpoint.com
- C++ book by Balaji Guruswami