0% found this document useful (0 votes)
775 views23 pages

Applications of Context Free Grammars

Uploaded by

Udit Rana
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF or read online on Scribd
0% found this document useful (0 votes)
775 views23 pages

Applications of Context Free Grammars

Uploaded by

Udit Rana
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF or read online on Scribd
You are on page 1/ 23
Applications of Context Free Grammars Ca uae) YR Coie ec] anh LTT) ed 2 Applications of Context Free Grammars o) Parsers nl Ue Core ede SF SES Rea treeorTer ey nT AU Re eS ara) ACC CCR CR Lr elt) Pee) Context Free Grammars eo eee On uae eC aD ee anon Context free grammar G can be defined by four tuples as: ra) ao G describes the grammar See ee ete nc Sheen eee ree ee ere assis! > In CRG, the start symbol is used to derive the string. You can derive the string by repeatedly replacing a non- een g eee eter re eee Pee eee ee ce eee een a) ) } Pee clad ae eR een ee teed head of the production. Poa mnoes A string of zero or more terminals and SM ee snes ee production, represents one way to form Poe ata arog head. Example ae RCs Rec ee) ers . . eT DCs et cary coe ete nee Sis aroun Soe > Now check that abbebba string can be derived from the given CFG coe eer een ee eet ee eee ee eee ng SS a ee te toe ee Derivations Using a Grammar Inferred 200) 00) Leftmost and Rightmost Deri aR enrages RTD Reporter enter one eaten eices cinta ose y PAREN See CO rece Reece a SEY the current string is replaced with a production rule at each Ae RUS Ee Rhea ee CoM ns meet terminal symbol in the current string is replaced with a Peete eee encom Pa UD aes eee ORR ee eee on Ret eee Ace ce mrs Restos Mae Ns non-terminal symbols are replaced is different. Pa Ro ee ee ese ein occas erat ea med Ocue et tee e Geter tet) Retr tee st Sase et een nots ssn RYailcoash @ deyurely > Ina Context-Free Grammar (CFG), a sentential form isa string of terminal and non-terminal symbols that can be generated from the start symbol of the ‘grammar using a sequence of production rule applications. pe eee eon een ee Cee sv my ed eee Seer ee enue ence nT symbols represent the basic elements of the language being generated. > Sentential forms are generated by repeatedly applying production rules to non- terminal symbols in the current form, until no non-terminal symbols remain. > Sentential forms are important in the theory of formal languages and automata, SOc ace en ne ned eee Ce > The analysis of sentential forms is also important in CFGs to determine properties such as ambiguity or whether the CFG generates a regular language. > Sentential forms can be used to represent intermediate steps in the process of generating a language from a CFG, and to analyze the structure of the language eo Parse Trees > In the context of Context-Free Grammars (CFG), a parse tree is a graphical Seen ace ee ea tee ome cus > A parse tree isa tree where the internal nodes represent non-terminal symbols and the leaf nodes represent terminal symbols. > Each non-terminal node in the parse tree corresponds to a production rule used in the derivation of the string, and the children of a non-terminal node represent the symbols generated by applying the corresponding production rule, > Parse trees are used to visualize the syntactic structure of strings generated by the grammar, where the root of the tree represents the entire string and the leaf Fe ee eRe cn > Parse trees are important in the analysis of CFCs, as they provide a way to Se ener ete na ec > Parse trees are also used in parsing algorithms, such as top-down and bottom-up parsing, to generate the parse tree from an input string, Qc meen sc > Context-free grammars (CFGs) were originally developed by Noam Sea et ec eon omen) practical applications in computer science. > One practical application of CFGs is in the description of programming languages and the mechanical generation of parsers ee aaa Reece een en a aR oe ey computer science were put into practice. > Another application of CFGs is in the development of XML, which ‘uses a Document Type Definition (DTD) that is essentially a CFG to Cre Seen Casa get eR oon acy cy formatting, allowing for more specific and structured interpretations Cero > For example, using the "PHONE" tag to surround a sequence of See a caSen ance Be Vactos a) Le See Renae Oncor it into the corresponding Intermediate Representation(IR). And as there are so} programming languages that cannot be represented by regular expressions alone’ recited DCB St oso ene ene co noe > we must be able to match some left parenthesis against a right parenthesis that appears immediately to its 1 eee eRe eee > Ifwe eventually eliminate all the parentheses then the string was balanced and if we cannot match parentheses Pog rece > Examples of strings of balanced parentheses are ()), 00) (00) and e,, while ( and (() are not. A grammar Gbal = ({B},{()},P.B) generates all and only the strings of balanced parentheses, where P consists of the productions. ere |b BB |(B)|« | Ee re nee enc crts Poe eee nics? Then according to this grammar,(* a)" isa Cd — ee ew Le Pais of Lerimnted = eo around a balanced string, then the result is ae Emel Sees Lacan eee , et ee eC es ee et eC Tee see ee tc ee eRe s input aoa een ey es rs tet aa Seen ee sre cco ee et een es ereeen ts SO en ee ee eet corresponds to a C program whose structure is like the Pee nee if (Condition if (Condition) Statements iF (Condizéon) The YACC ParserGenerator ee eae EL Ineo > YACC takes a Context-Free Grammar (CFG) as input to generate a parser function. > The parser function can create parse trees from source programs that follow the Ps a Cr cn Cone > The action is a fragment of C code executed whenever a node of the parse tree corresponding to the production and its children is created. Pa ecco egos ar een ye EO Ue ey Rae eee eC Ceres An example of a grammar in the YACC notation PSS too es eee eee es ee eno Perea Sa ot The list of bodies for a given head ends with a semicolon. Terminals are enclosed in single quotes, and it is possible to include several characters within a single pair of quotes. YACC also allows the definition of symbolic terminals. The Peete eet Rectan ecstasy eee tree cen ne een te the lexical analyzer. ite eed ei See eRe res Ree ee eee ees eee en tS en tc) Markup Languages Nee en ee gee eR Set that convey semantic information. Dane ner a cr aS CS eT coe eres ee Rte Rc re eran ‘The structure of HTML can be simplified, and a CFG could be used to describe legal HTML documents and guide their processing for display on a monitor or printer. Let us see how with an Example :~ People who die to sin in the fast ane 0) The test Continued ... eae ea eS ose Ree ae) Eee od sen ec ee ere eae SR Ce ee Reon een ec Nac acest a nD tags. An example of a Text element in Fig (a) is "Moldy bread.” 2. Charis any string consisting of a single character that is legal in HTML text. Note that blanks are included as characters. Se Roe Ree a Oa Cerne em hacer RR cnc eo de 4. Blement is either a Text string, or a pair of matching tags and the document Pert mC Oia Race See erS et Seer sO oa omer Pees Ss ese ee aa Continued With the conditions we create a CFG that describes as much of the Pte eg SSeS eet oc eect ee RO eee Tener eC ee ena ene Pee eee et ne LL eens Char Text fee ) explains that "Text" in HTML can be either an empty string or any legal character followed by more text, withthe exception of < and >, which eee ree tree ene ie cened Element Doe Line (3) defines a document in HTML as a sequence of zero or more cee Oe en eee ee ee cy ee eee en ert possibility of additional productions for other tag types in HTML. Line (6) defines a list tem as the
  • tag followed by any document. POR ee eee es ‘And with these productions, we have defined our CFG for the given Pete tan sne XML and Document Type Definitions, The role of context-free grammars (CFGs) in markup languages HTML uses CFGs to describe formatting, while XML uses them to cer ears In XML, tags surround phrases that represent elements, ane a document type definition (DTD) is used to clarify the different kinds of tagsand structures that can appear between them IDOCTYPE name-of-D The DTD language uses a CFG notation to describe variables and list of element tens) Element descriptions in a DTD are regular expressions, which can include other element names, the special term PCDATA, union, concatenation, and closure operators of the elenent) Peer eaten te CNet ee een ec a set cs The above mentioned points show the importance of CFGs in describing, en na Document type definition (DTD) Oe ce CC ea neue ctr The form of 2 DTD De ee SE ey ee ees ee ea ete 1. Other element names, representing the fact that elements of one type can appear within elements of another type, eee me oreo eee ere es 2. The special term #PCDATA, standing for any text that does not involve XML. tags. The allowed operators are: Deore eee enters Peek ent eet es Seen ae etd ey ee eet > Parentheses may group operators her arguments otherwise te usul precedence ofegular- expression operators ; applies. Example eee ee ae ee ee cre eeeT DTD that they can use to publish. on the Web, descriptions of the various PCs that they currently sell Each description of a PC will have a model number, and details about the features of the model, e.g, the amount of RAM, number and size of disks, and maereoree ELEMENT PCS (PCe ELEMENT PC (MODEL, PRICE, PROCE OEE ene ene at ELEMENT MODEL. (PCDATA) ELEMENT PRICE. (#PCDATA) ELEMENT PROCESSOR (HAN. ELEMENT WAN. (47CDATA): erent ee ete an ace ELEMENT MODEL (WPODATA) constituents are other elements for model, price, processor type, and RAM. ELEMENT SPEED (#PODATA)> Brie ELEMENT RIK (¥PCDATA The first element is PCS, which is lke the start symbol of a CFG. PC* is dlofined as zero or more PC entries ach of these constituents must appear once in that order. The lst ELEMENT DISK (HARDDISK | 0D | O¥D) eee See ret ete ene renee Peet ELEMENT HARDDISK (WANE, MODEL, SIZE Rom coer Nn ren ELEMENT SIZE @PCOITH ELEMENT (SPE: De eee ee ee ee eee pone The DISK entry isthe most complex, A disk can be a hard disk, CD, or DVD. The element DISK is the “OR of three other elements, Hard disks have a structure in which the manufacturer, model, and size eee en ae eee er? 5 An example of an XML document that conforms {6 DED PeSpecs Within the illustrated entry, we can easily see that the model number is} Pee ee OL Vee cena 256Mb of RAM, a 30.5Gb Maxtor Diamond hard disk, and a 32x CD-ROM cores What is important is not that we can read these facts, but that a program could read the document, and guided by the grammar in the DTD. Sora ee ee eee ened Tc ee eee eee et ee ta eee Pose Oe ee HARDDISK Dase where the "body" has a closure operator within it. The solution is to replace DISK+ by a gem ee sec ome ee eee ere oe eed of the variable Disk. The equivalent productions are thus: Lear Rar ears nd Dore eee a ae A general technique for converting a CFG wit Derr imo ce lnic lad If the body is the concatenation of elements, then the production is already in the legal form for CFG's, so we do nol De ee eee oe ge Ry acetone 1. The production is ofthe form A — El, E2, where El andl E2 are expressions permitted in the DTD language. This isthe concatenation case. Introduce two new variables, B and C, that appear nowhere else in the grammar. Replace A+ El, E2 by the eres ret oa coR 2. The production is ofthe form A —+ E1 | F2. For this union operator, replace this production by the pair of productions: reas ess) 3, The production is ofthe form A — (B1)*- Introduce a new variable B that appears nowhere else, and replace this production by: rear ra peal Pe eee eee ee ee ee eee rer AB peat 5. The production is of the form A — (E1)?, Replace this production by: ras een A general technique for converting a CFG with Derm elni cag Lot us consider how to convert the DTD rule to legal CFG productions. See gee Rec ee el Sea ee PROCESSOR, RAM and the second of which is DISK+. If we create variables for these two subexpressions, say A and B, respectively, then we can use the productions: ee) eed cannes 2. Only the last of these is not in legal form. We introduce another variable C and the productions: creak rns 3. In this special case, because the expression that A derives is just a concatenation of variables, and Disk is a single variable, we actually have no need for the variables A or C. We could use the following productions instead: Pe — Model Price Processor Ram B Peo pens
  • You might also like