14009907 (P.T.A.B. Mar. 27, 2017)

Ex Parte Lee et al

Patent Trial and Appeal Board, Mar 27, 2017


UNITED STATES DEPARTMENT OF COMMERCE
United States Patent and Trademark Office
Address: COMMISSIONER FOR PATENTS, P.O. Box 1450, Alexandria, Virginia 22313-1450
www.uspto.gov

APPLICATION NO.: 14/009,907
FILING DATE: 10/04/2013
FIRST NAMED INVENTOR: Chae Hyun Lee
ATTORNEY DOCKET NO.: 4900-0106
CONFIRMATION NO.: 4992

22429 7590 03/29/2017
HAUPTMAN HAM, LLP
2318 Mill Road Suite 1400
ALEXANDRIA, VA 22314

EXAMINER: TRAN, MAI T
ART UNIT / PAPER NUMBER: 2124
NOTIFICATION DATE: 03/29/2017
DELIVERY MODE: ELECTRONIC

Please find below and/or attached an Office communication concerning this application or proceeding.

The time period for reply, if any, is set in the attached communication.

Notice of the Office communication was sent electronically on above-indicated "Notification Date" to the following e-mail address(es):
docketing@ipfirm.com
pair_lhhb@firsttofile.com
EAnastasio@IPFirm.com

PTOL-90A (Rev. 04/07)

UNITED STATES PATENT AND TRADEMARK OFFICE

BEFORE THE PATENT TRIAL AND APPEAL BOARD

Ex parte CHAE HYUN LEE, MIN SOENG KIM, and JUN SUP LEE

Appeal 2017-004186
Application 14/009,907
Technology Center 2100

Before THU A. DANG, NORMAN H. BEAMER, and MATTHEW J. McNEILL, Administrative Patent Judges.

DANG, Administrative Patent Judge.

DECISION ON APPEAL

I. STATEMENT OF THE CASE

Appellants appeal under 35 U.S.C. § 134(a) from the Examiner’s Final Rejection of claims 1–10, which constitute all the claims pending in this application. We have jurisdiction under 35 U.S.C. § 6(b).

We affirm.

A. INVENTION

According to Appellants, the claimed invention provides “a system and a method, which calculate and provide a silhouette coefficient for objectively analyzing a result of clustering massive data by using Hadoop” (Spec. 3, ll. 11–16).

B. REPRESENTATIVE CLAIM

Claim 1 is exemplary:

1.
A system for analyzing a result of clustering massive data, the system comprising:

a task management apparatus configured to divide a clustered target file into blocks of a pre-designated size, and generate an input split corresponding to a task pair for a reduce task for reducing input data by combining the divided blocks;

at least one distance calculation apparatus configured to receive allocation of the input split, and calculate a distance sum for each record between blocks included in the input split;

at least one index coefficient calculation apparatus configured to calculate a clustering significance verification index coefficient for each record by using the distance sum for each record received from the at least one distance calculation apparatus; and

an analysis apparatus configured to calculate a final significance verification index coefficient of a corresponding cluster, by averaging the clustering significance verification index coefficient for each record.

C. REJECTIONS

Claims 1–6 stand rejected under 35 U.S.C. § 103(a) as unpatentable over the teachings of Lee et al. (US 2012/0182891 A1; pub. July 19, 2012) (“Lee”), and Handley (US 2008/0256230 A1; pub. Oct. 16, 2008).

Claims 7–10 stand rejected under 35 U.S.C. § 103(a) as unpatentable over the teachings of Dean and Ghemawat (MapReduce: Simplified Data Processing on Large Clusters, OSDI ’04 6th Symposium on Operating Systems Design and Implementation 137–149 USENIX Association (2004)) (“Dean”), and Handley.

II. ISSUES

The principal issues before us are whether the Examiner erred in finding that the combination of Lee and Handley teaches or suggests a calculation apparatus configured to “calculate a distance sum for each record between blocks included in the input split,” and a calculation apparatus configured to “calculate a clustering significance verification index coefficient for each record by using the distance sum for each record received” (claim 1).
III. FINDINGS OF FACT

The following Findings of Fact (FF) are shown by a preponderance of the evidence.

Lee

1. Lee discloses enabling cluster nodes to process in parallel a large quantity of packets collected in a network in an open source Hadoop distribution system (Abst.), wherein the large size data processed by Hadoop is not stored in one computer but split into several blocks and distributed into and stored in several computers (¶¶ 8–10).

Handley

2. Handley is directed to “clustering algorithms that use a distance or similarity measure to determine the proper clustering of objects” (¶ 4), wherein a Silhouette Coefficient for a device is determined based on the average distance of a device to other devices in its cluster and the minimum distance of the device to a device not in its cluster (¶ 43). The Silhouette Coefficient is determined by summing the distance between each pair of devices and dividing it by the number of devices in the cluster (¶ 46). The Silhouette Coefficient for a clustering is the average of the Silhouette Coefficients for the devices in the clustering (¶ 47).

IV. ANALYSIS

Appellants contend “Handley does not disclose the feature ‘calculate a distance sum for each record between blocks included in the input split’” (App. Br. 8). Although Appellants concede Handley discloses that “the Silhouette Coefficient for a device may be determined based on the average distance of a device to other devices in its cluster,” Appellants contend “distances between all devices included in the cluster is calculated” in Handley, thus, “Handley has problem of increased amount of calculation” (id.). According to Appellants, “the present invention does not calculate a distance sum for all clustered files” but instead “calculates a distance sum for each record between blocks included in the input split, thereby the amount of calculation being able to be reduced” (id. at 8–9).
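For reference, the quantities discussed in FF 2 can be written out as below. This is the commonly used textbook formulation of the silhouette coefficient, offered as an illustrative assumption rather than as a quotation from Handley or from Appellants' Specification; the symbols d, C_i, a(i), b(i), s(i), and S are introduced here and do not appear in the record. Note that the textbook b(i) uses the smallest average distance to another cluster, whereas FF 2 paraphrases Handley as using the minimum distance to a device outside the cluster.

```latex
% Record i belongs to cluster C_i; d(i,j) is the distance between records i and j.
\[
  a(i) = \frac{1}{|C_i| - 1} \sum_{j \in C_i,\; j \neq i} d(i,j),
  \qquad
  b(i) = \min_{C \neq C_i} \frac{1}{|C|} \sum_{j \in C} d(i,j)
\]
\[
  s(i) = \frac{b(i) - a(i)}{\max\{a(i),\, b(i)\}},
  \qquad
  S = \frac{1}{N} \sum_{i=1}^{N} s(i)
\]
```

Under this formulation each average is a distance sum divided by a count, and the coefficient for the clustering as a whole, S, is the average of the per-record coefficients.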
Appellants disagree with the Examiner’s inherency finding as to the average distance (id. at 11). In particular, Appellants contend claim 1 recites calculating “a clustering, significance verification index for each record by using the distance sum for each record,” whereas “Handley merely states, in paragraph [0043], that the Silhouette Coefficient for a device may be determined based on the average distance of a device to the other devices in its cluster . . .” (id., emphasis omitted).

We have considered all of Appellants’ arguments and evidence presented. However, we disagree with Appellants’ contentions regarding the Examiner’s rejections of the claims. We agree with the Examiner’s findings, and find no error with the Examiner’s conclusion that the claims would have been obvious over the combined teachings.

Appellants’ contentions are mainly directed to what Handley “does not disclose,” what Handley “has [a] problem” with, or what Handley “merely states” (id. at 8–11). However, the test for obviousness is what the combination of references cited by the Examiner teaches or suggests to one of ordinary skill in the art. See In re Merck & Co., 800 F.2d 1091, 1097 (Fed. Cir. 1986). Based on the record before us, we are unpersuaded the Examiner erred in finding the combination of Lee and Handley teaches or at least suggests the contested limitation.

Here, we agree with the Examiner’s reliance on Lee for teaching the step of dividing “a clustered target file into blocks of a pre-designated size,” and generating an “input split corresponding to a task pair for a reduce task for reducing input data by combining the divided blocks” (Final Act. 4, citing Lee ¶¶ 10, 16, 30, 37). In particular, similar to Appellants’ invention, Lee discloses splitting large size data in a Hadoop system into multiple blocks (FF 1).
Similar to Appellants’ invention, where splitting large size data processed by Hadoop into multiple blocks reduces the amount of calculation (App. Br. 8–9), Lee also discloses such input split of large size data processed by Hadoop (FF 1).

Furthermore, we agree with the Examiner’s reliance on Handley to teach and suggest the step of calculating “a distance sum” and calculating “a clustering significance verification index coefficient for each record by using the distance sum for each record received” (Final Act. 4, citing Handley ¶¶ 43–47). Similar to Appellants’ invention, Handley discloses determining a “clustering significance verification index coefficient” (Silhouette Coefficient) for objectively analyzing a result of a clustering (FF 2). In particular, “Handley is directed to clustering algorithms that use a distance or similarity measure to determine the proper clustering of objects, wherein a Silhouette Coefficient for a device is determined based on an average distance” (FF 2).

We agree with the Examiner that “[a]n average distance is the sum of distances divided by the number of distances measure,” wherein “[i]n order to calculate an average distance, the distance sum, inherently was calculated” (Ans. 13–14, emphasis omitted). In fact, Handley explicitly discloses that the Silhouette Coefficient is determined by summing the distances and dividing the sum by the number of distances measured (FF 2).

Accordingly, we find no error with the Examiner’s finding the combination of Lee and Handley teaches or at least suggests calculating “a distance sum for each record between blocks included in the input split” and calculating “a clustering significance verification index coefficient for each record by using the distance sum for each record received” as recited in claim 1.
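To make the relationship between block-wise distance sums and the resulting coefficient concrete, the following is a minimal Python sketch. It is an editorial illustration only and is not the implementation of Lee, Handley, or Appellants. It assumes the textbook silhouette formulation, Euclidean distance, and "blocks" modeled as contiguous index ranges, with each unordered pair of blocks standing in for one input split; the function name silhouette_by_blocks and all parameter names are hypothetical.

```python
from itertools import combinations_with_replacement
from math import dist


def silhouette_by_blocks(records, labels, block_size):
    """Average silhouette coefficient, built from per-record distance sums
    that are accumulated one block pair at a time.

    Assumes at least two clusters and that no record has a == b == 0.
    """
    n = len(records)
    clusters = sorted(set(labels))
    counts = {c: labels.count(c) for c in clusters}

    # Hypothetical "blocks": contiguous index ranges of a fixed size.
    blocks = [range(s, min(s + block_size, n)) for s in range(0, n, block_size)]

    # sums[i][c] = running sum of distances from record i to records in cluster c.
    sums = [{c: 0.0 for c in clusters} for _ in range(n)]

    # Each unordered pair of blocks (including a block paired with itself)
    # plays the role of one "input split" in this sketch.
    for a, b in combinations_with_replacement(blocks, 2):
        for i in a:
            for j in b:
                if a is b and j <= i:
                    continue  # within a single block, visit each record pair once
                d = dist(records[i], records[j])
                sums[i][labels[j]] += d
                sums[j][labels[i]] += d

    coefficients = []
    for i in range(n):
        own = labels[i]
        if counts[own] == 1:
            coefficients.append(0.0)  # common convention for singleton clusters
            continue
        # Average distance = distance sum divided by the number of distances.
        a_i = sums[i][own] / (counts[own] - 1)
        b_i = min(sums[i][c] / counts[c] for c in clusters if c != own)
        coefficients.append((b_i - a_i) / max(a_i, b_i))

    # Coefficient for the clustering = average of the per-record coefficients.
    return sum(coefficients) / n


if __name__ == "__main__":
    points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.0)]
    cluster_labels = [0, 0, 1, 1]
    print(silhouette_by_blocks(points, cluster_labels, block_size=2))
```

In this sketch the choice of block size changes only how the pairwise work is partitioned; every record pair is still visited exactly once, so the accumulated sums and the final coefficient are the same, up to floating-point rounding, for any block size.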
The Supreme Court has stated, “[t]he combination of familiar elements according to known methods is likely to be obvious when it does no more than yield predictable results.” KSR Int'l Co. v. Teleflex, Inc., 550 U.S. 398, 416 (2007). That is, when considering obviousness of a combination of known elements, the operative question is “whether the improvement is more than the predictable use of prior art elements according to their established functions.” Id. at 417. The skilled artisan is “a person of ordinary creativity, not an automaton.” Id. at 421.

Here, we agree with the Examiner’s reliance on Lee for teaching and suggesting splitting a target file into a plurality of blocks and the Examiner’s reliance on Handley for teaching and suggesting calculating a distance sum and then determining a Silhouette Coefficient using the distance sum (Final Act. 4). We conclude that it would have been well within the skill of one skilled in the art to combine Handley’s teaching and suggestion of calculating a distance sum and a Silhouette Coefficient in clustering into Lee’s Hadoop system that splits a clustered file into multiple blocks with multiple records therebetween, thereby calculating a Silhouette Coefficient for each record between the blocks using the distance sum for each record received. See KSR, 550 U.S. at 417.

Appellants have presented no evidence that combining Handley’s calculations with Lee’s InputSplit would have been “uniquely challenging or difficult for one of ordinary skill in the art.” Leapfrog Enters., Inc. v. Fisher-Price, Inc., 485 F.3d 1157, 1162 (Fed. Cir. 2007) (citing KSR, 550 U.S. at 420). Instead, we find Appellants’ invention is simply a modification of familiar prior art teachings (as taught or suggested by the cited references) that would have realized a predictable result. KSR, 550 U.S. at 416.
Based on this record, we find no error in the Examiner’s rejection of independent claim 1.

With respect to claim 2, Appellants repeat that “[a]s set forth above with respect to claim 1, applied references fail to disclose the recited ‘at least one distance calculation apparatus’” (App. Br. 12). Appellants do not provide substantive arguments for claims 3–6 (id. at 13–14). Accordingly, we also affirm the rejection of claims 2–6 over Lee and Handley.

As for claim 7, Appellants similarly disagree with the Examiner’s finding that the distance sum is inherent in Handley’s average distance. Id. at 15. However, similar to the Examiner’s finding with respect to claim 1, we find no error with the Examiner’s reliance on Handley to teach and suggest the step of outputting “a distance sum” and calculating “a clustering significance verification index coefficient for each record by using the distance sum for each record” (claim 7).

Appellants repeat the above arguments with respect to claims 8–10 (App. Br. 16–18). On this record, we also affirm the rejection of claims 7–10 over Dean and Handley.

V. CONCLUSION AND DECISION

We affirm the Examiner’s rejections of claims 1–10 under 35 U.S.C. § 103(a).

No time period for taking any subsequent action in connection with this appeal may be extended under 37 C.F.R. § 1.136(a)(1)(iv).

AFFIRMED