12013289 (P.T.A.B. Jun. 28, 2016)

Ex Parte Kirpal

Patent Trial and Appeal Board Jun 28, 2016

UNITED STATES PATENT AND TRADEMARK OFFICE

UNITED STATES DEPARTMENT OF COMMERCE
United States Patent and Trademark Office
Address: COMMISSIONER FOR PATENTS, P.O. Box 1450, Alexandria, Virginia 22313-1450
www.uspto.gov

APPLICATION NO.: 12/013,289
FILING DATE: 01/11/2008
FIRST NAMED INVENTOR: Alok S. Kirpal
ATTORNEY DOCKET NO.: YAH1P131/Y04033US00
CONFIRMATION NO.: 8313

76239 7590 06/30/2016
Weaver Austin Villeneuve & Sampson - YAHI
Attn: Yahoo!
P.O. BOX 70250
OAKLAND, CA 94612-0250

EXAMINER: MINCEY, JERMAINE A
ART UNIT: 2159
NOTIFICATION DATE: 06/30/2016
DELIVERY MODE: ELECTRONIC

Please find below and/or attached an Office communication concerning this application or proceeding.

The time period for reply, if any, is set in the attached communication.

Notice of the Office communication was sent electronically on above-indicated "Notification Date" to the following e-mail address(es): USPTO@wavsip.com

PTOL-90A (Rev. 04/07)

UNITED STATES PATENT AND TRADEMARK OFFICE

BEFORE THE PATENT TRIAL AND APPEAL BOARD

Ex parte ALOK S. KIRPAL

Appeal 2013-007782
Application 12/013,289
Technology Center 2100

Before CAROLYN D. THOMAS, DEBRA K. STEPHENS, and NABEEL U. KHAN, Administrative Patent Judges.

STEPHENS, Administrative Patent Judge.

DECISION ON APPEAL

STATEMENT OF THE CASE

Appellant appeals under 35 U.S.C. § 134 from a Final Rejection of claims 1, 3-5, 7-10, 12-14, and 16-21.¹ We have jurisdiction under 35 U.S.C. § 6(b).

We AFFIRM.

¹ Claims 2, 6, 11, and 15 have been cancelled.

STATEMENT OF THE INVENTION

According to Appellant, the claims are directed to extracting entities from a web page using a high precision low recall technique and a sequential model (Abstract). Claim 1, reproduced below, is representative of the claimed subject matter:

1.
A method comprising:

training a high precision low recall (HPLR), template-based technique on a first web page, producing one or more annotated entities extracted from the first web page so that the HPLR, template-based technique is trained to extract annotated entities from a first cluster of web pages that are structurally similar and for which a template has been created prior to application of the HPLR technique on such first cluster, including the first web page;

extracting one or more annotated entities from the first cluster of web pages using the HPLR, template-based technique;

training a sequential model by using the one or more annotated entities extracted by the HPLR technique as an observation sequence input to extract additional annotated entities from the first cluster and one or more other clusters of web pages for which a template has not been created, wherein the first cluster's web pages are structurally different than the one or more other cluster's web pages; and

applying the trained sequential model on a second web page from a second cluster of the one or more other clusters, producing one or more annotated entities extracted from the second web page.

REFERENCES

The prior art relied upon by the Examiner in rejecting the claims on appeal is:

Shanahan    US 2005/0228783 A1    Oct. 13, 2005
Wen         US 2008/0046441 A1    Feb. 21, 2008

V.G. Vinod Vydiswaran & Sunita Sarawagi, Advances in Data Management, "Learning to extract information from large websites using sequential models" 3-14 (2005).

Philippe Le Hegaret, The W3C Document Object Model (DOM), 2002, available at http://www.w3.org/2002/07/26-dom-article (last visited Mar. 16, 2010).

REJECTIONS

Claims 1, 4, 5, 9, 10, 13, 14, and 18-21 stand rejected under 35 U.S.C. § 103(a) as being unpatentable over Shanahan and Vinod (Final Act. 3-11).

Claims 3, 7, 12, and 16 stand rejected under 35 U.S.C. § 103(a) as being unpatentable over Shanahan, Vinod, and Wen (Final Act. 12-13).

Claims 8 and 17 stand rejected under 35 U.S.C. § 103(a) as being unpatentable over Shanahan, Vinod, Wen, and Hegaret (Final Act. 14-15).

We have only considered those arguments that Appellant actually raised in the Briefs. Arguments Appellant could have made but chose not to make in the Briefs have not been considered and are deemed to be waived. See 37 C.F.R. § 41.37(c)(1)(iv) (2012).

ISSUES

35 U.S.C. § 103(a): Claims 1, 4, 5, 9, 10, 13, 14, and 18-21

Appellant asserts the invention is not obvious over Shanahan and Vinod (App. Br. 4-8). The issue presented by the arguments is:

Issue 1: Has the Examiner erred in finding the combination of Shanahan and Vinod teaches or suggests

training a high precision low recall (HPLR), template-based technique and

training a sequential model by using the one or more annotated entities extracted by the HPLR technique as an observation sequence input to extract additional annotated entities from the first cluster and one or more other clusters of web pages for which a template has not been created, wherein the first cluster's web pages are structurally different than the one or more other cluster's web pages,

as recited in claim 1, and similarly recited in claims 10 and 19?

Issue 2: Has the Examiner improperly combined the teachings and suggestions of Shanahan and Vinod?

ANALYSIS

We disagree with Appellant's conclusions and adopt as our own: (1) the findings and reasons set forth by the Examiner in the action from which this appeal is taken; and (2) the reasons set forth by the Examiner in the Answer in response to the Appeal Brief. With respect to the claims argued by Appellant, we highlight and address specific findings and arguments for emphasis as follows.

Appellant first argues the combination of Shanahan and Vinod does not teach or suggest "training a sequential model by using the one or more annotated entities extracted by the HPLR technique as an observation sequence input" (App. Br. 4). Specifically, Appellant argues that Shanahan's Support Vector Machine (SVM) "does not indicate that these features are extracted 'using [an] HPLR, template-based technique' as recited in claim 1" (id. at 5). Appellant further argues training of Shanahan's SVM is to "satisfy an information need," which requires "a highly heterogeneous set of documents"; and argues the information retrieved by the searches in Shanahan "is extracted from documents without the use of a template to determine whether the document 'satisfies the information need' already expressed" in contrast to Appellant's recited invention (id. at 5-6).

We are not persuaded by Appellant's arguments. Initially, we note "HPLR," "HPLR technique," and "HPLR, template-based technique" are not defined explicitly in Appellant's Specification. Appellant points to paragraph three of the Specification as describing a "template-based technique" which states "[a] class of techniques utilized to extract entities and attributes from web pages is known as template-based techniques" (Reply Br. 9 (citing Spec. ¶ 3)).

The Examiner finds, and we agree, Shanahan discloses acquiring an SVM through a process of learning where "[t]he learnt model tends to exhibit[] high precision, but very low recall" (Shanahan ¶ 16; Ans. 21). We further agree with the Examiner's finding that the SVM disclosed in Shanahan is a high precision low recall technique that models a hyperplane of classified objects, and that Shanahan teaches the documented information form complex constructs (Ans. 21-22; Shanahan ¶ 35). We agree with the Examiner that the support vector machines serve as a computational model or template (Ans. 21).
We further agree with the Examiner that the structural features described in Shanahan are annotated, or marked, entities (Ans. 22); and neither Appellant's Specification nor the claim itself expressly defines what an annotated entity is. Rather, Appellant's Specification merely describes examples of annotated entities (see Cl. 1; see also Spec. ¶¶ 3-4, 14-15). As set forth by the Examiner, Shanahan further discloses many alternative techniques are known in the art for extracting these structural features, storing the features, and sorting or ranking features (Ans. 22). Therefore, we are not persuaded Shanahan fails to teach an HPLR, template-based technique.

Appellant additionally argues Vinod clearly does not teach or suggest "training a sequential model by using ... one or more annotated entities extracted by the HPLR technique as an observation sequence input" (App. Br. 7). Appellant concedes that "Vinod describes training a sequential model ... a Conditional Random Field (CRF);" but Appellant argues Vinod does not teach or suggest the training is based on an HPLR technique (id. at 6-7).

We are not persuaded by Appellant's arguments. Appellant concedes that "Vinod describes training a sequential model ... a Conditional Random Field (CRF)" (App. Br. 6); and Appellant's own Specification lists CRF as an exemplary technique to perform sequential modeling (see Spec. ¶ 12). The Examiner finds, and we agree, that Vinod discloses the training of a sequential model to extract, in a second phase, additional annotated entities fetched from relevant web pages (Ans. 23).

Appellant also argues Shanahan and Vinod, in combination, cannot teach or disclose the features of claim 1 (App. Br. 7). Specifically, Appellant argues (1) Shanahan does not teach a template-based technique and (2) the training technique disclosed in Vinod is based on selected hyperlinks, not structurally similar websites (id.).
Therefore, Appellant argues, "even if the teachings of Shanahan could properly be characterized as describing a template-based technique ... one of ordinary skill in the art would not be motivated to use such a technique in the context of Vinod in the manner suggested by the Examiner" (id. at 8).

We are not persuaded by Appellant's arguments. Regarding the template-based technique disclosed in Shanahan, we agree with the Examiner's findings as discussed above. The Examiner finds, and we agree, that Vinod teaches an HPLR technique in the feature extraction during the page fetch, having a structure format and parsing text content (Ans. 23). Therefore, we agree with the Examiner's finding that the combination of Shanahan and Vinod would have been obvious to one of ordinary skill in the art (Final Act. 5; Ans. 23).

We are not persuaded by Appellant that an ordinarily skilled artisan would not have been motivated to combine the teachings and suggestions of Shanahan (App. Br. 8). The Examiner has articulated reasoning with some rational underpinning - to train a model to automatically find desired pages - citing Vydiswaran's Introduction. Appellant has not proffered sufficient evidence or argument to persuade us of Examiner error.

Accordingly, we are not persuaded the Examiner erred in finding the combination of Shanahan and Vinod teaches or suggests the limitations as recited in claim 1 and claims 4, 5, 9, 10, 13, 14, and 18-21, not separately argued. Therefore, we sustain the rejection of claims 1, 4, 5, 9, 10, 13, 14, and 18-21 under 35 U.S.C. § 103(a) for obviousness over Shanahan and Vinod.

35 U.S.C. § 103(a): Claims 3, 7, 8, 12, 16, and 17

Appellant asserts the invention as recited in claims 3, 7, 12, and 16 is not unpatentable over the combination of Shanahan, Vinod, and Wen; and the invention as recited in claims 8 and 17 is not unpatentable over the combination of Shanahan, Vinod, Wen, and Hegaret, respectively, relying on the arguments set forth for claim 1 (App. Br. 8). For the reasons set forth above, we are not persuaded of error in the Examiner's findings and conclusion. Accordingly, we sustain the rejection of claims 3, 7, 12, and 16, and of claims 8 and 17, under 35 U.S.C. § 103(a) for obviousness over Shanahan, Vinod, and Wen; and Shanahan, Vinod, Wen, and Hegaret, respectively.

DECISION

The Examiner's rejection of claims 1, 4, 5, 9, 10, 13, 14, and 18-21 under 35 U.S.C. § 103(a) as being unpatentable over Shanahan and Vinod is AFFIRMED.

The Examiner's rejection of claims 3, 7, 12, and 16 under 35 U.S.C. § 103(a) as being unpatentable over Shanahan, Vinod, and Wen is AFFIRMED.

The Examiner's rejection of claims 8 and 17 under 35 U.S.C. § 103(a) as being unpatentable over Shanahan, Vinod, Wen, and Hegaret is AFFIRMED.

No time period for taking any subsequent action in connection with this appeal may be extended under 37 C.F.R. § 1.136(a). See 37 C.F.R. § 1.136(a)(1)(iv).

AFFIRMED