UNITED STATES PATENT AND TRADEMARK OFFICE

Application No.: 13/270,365
Filing Date: 10/11/2011
First Named Inventor: Goksel Dedeoglu
Attorney Docket No.: TI-70112
Confirmation No.: 1038
Correspondent: Texas Instruments Incorporated, P.O. Box 655474, M/S 3999, Dallas, TX 75265
Examiner: Kate H. Luo
Art Unit: 2488
Notification Date: 07/30/2019 (electronic delivery to uspto@ti.com)

BEFORE THE PATENT TRIAL AND APPEAL BOARD

Ex parte GOKSEL DEDEOGLU and VINAY SHARMA

Appeal 2018-008716
Application 13/270,365 [1]
Technology Center 2400

Before ALLEN R. MACDONALD, STACY B. MARGOLIES, and IFTIKHAR AHMED, Administrative Patent Judges.

AHMED, Administrative Patent Judge.

DECISION ON APPEAL

Appellants appeal under 35 U.S.C. § 134(a) from a final rejection of claims 2, 5–9, 11, and 12, which are all of the claims pending in the application. Claims 1, 3, 4, and 10 have been cancelled. App. Br. 2; Dec. 1, 2017 Advisory Action (entering amendment cancelling claims 1, 3, 4, and 10). We have jurisdiction under 35 U.S.C. § 6(b).

We REVERSE.

SUMMARY OF THE INVENTION

The invention generally relates to a "depth-fill algorithm for low-complexity stereo vision." Spec. ¶ 2.
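For technical orientation only, the per-pixel "farthest depth" tracking recited in the claims at issue can be sketched as follows. This is an illustrative reading of the claim language, not the applicants' disclosed implementation; the function name, the list representation of the depth model, and the convention that a larger value is "deeper" are all assumptions made for the sketch.

```python
# Illustrative sketch only (not from the record): per-pixel tracking of the
# farthest depth value across stereo frames, in the style the claims recite.
# Assumes a larger depth value means "deeper" (farther from the camera).

def update_depth_model(depth_model, current_depths):
    """Replace each modeled depth with the current frame's estimate
    whenever the current estimate is deeper than the modeled value."""
    return [
        current if current > modeled else modeled
        for modeled, current in zip(depth_model, current_depths)
    ]

# Hypothetical data: a depth model built from previously processed stereo
# frames, and fresh per-pixel estimates from the current stereo frame.
model = [1.0, 5.0, 2.0]
frame = [3.0, 4.0, 2.5]
print(update_depth_model(model, frame))  # [3.0, 5.0, 2.5]
```

Under this reading, pixels 0 and 2 are updated because the current frame reports a deeper value, while pixel 1 retains its previously tracked farthest depth.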
[1] According to Appellants, the real party in interest is Texas Instruments Incorporated. App. Br. 2.

Claims 5 and 9 are illustrative of the subject matter on appeal and are reproduced below with certain limitations at issue emphasized:

    5. An apparatus for low-complexity stereo vision, comprising:
    a memory;
    a processor coupled to the memory and configured to obtain a depth image based on left and right images of a stereo camera by tracking a farthest depth value on a pixel basis for each of a plurality of pixels over a plurality of stereo frames.

    9. A non-transitory computer readable medium comprising instructions that, when executed by a processor, cause the processor to:
    estimate a first depth value for a pixel in a depth model based on two or more previously-processed stereo frames, each of the previously-processed stereo frames including respective left and right images obtained from a stereo camera;
    estimate a second depth value for the pixel based on left and right images of a currently-processed stereo frame obtained from the stereo camera;
    determine whether the second depth value is deeper than the first depth value in the depth model; and
    update the first depth value in the depth model with the second depth value in response to determining that the second depth value for the pixel is deeper than the first depth value for the pixel in the depth model.

REJECTION

Claims 2, 5–9, 11, and 12 stand rejected under 35 U.S.C. § 103(a) as obvious over the combination of Izzat et al. (US 2009/0167843 A1; publ. July 2, 2009) and Hughes et al. (US 2009/0247249 A1; publ. Oct. 1, 2009). Final Act. 3.

ISSUES

1.
Did the Examiner err in concluding that Izzat teaches or suggests "estimat[ing] a first depth value for a pixel in a depth model based on two or more previously-processed stereo frames, each of the previously-processed stereo frames including respective left and right images obtained from a stereo camera," as recited in claim 9, and similarly recited in claims 2 and 6?

2. Did the Examiner err in rejecting claim 5 under 35 U.S.C. § 103(a) as being obvious over the combination of Izzat and Hughes?

ANALYSIS

Claims 2, 6–9, 11, and 12

Independent claim 9 recites a non-transitory computer readable medium comprising instructions that, when executed by a processor, cause the processor to "estimate a first depth value for a pixel in a depth model based on two or more previously-processed stereo frames, each of the previously-processed stereo frames including respective left and right images obtained from a stereo camera." App. Br. 19 (emphasis added). Independent claim 2 (a method claim) and dependent claim 6 (an apparatus claim) recite similar limitations. Id. at 18–19.

The Examiner relies on Izzat for disclosure of this limitation, finding that Izzat discloses that depth estimation 302 (shown in Figure 3) matches the pixels in the left image to those in the right image and "then a triangulation procedure 303 is used to convert the disparity values to the depth values." Final Act. 7 (citing Izzat, Fig. 3, ¶ 26). The Examiner characterizes this disclosure in Izzat as "a two-pass technique for the recovery of three dimension 3D information." Ans. 12. According to the Examiner, in Izzat, "[a] first pass in a static scan phase recovers a three dimension of a static scene using a low speed, high accuracy technique." Ans. 12–13.
The Examiner adds that "[s]ince static scene scanning would need to be repeated multiple times to recover any new items introduced in the static scene, the multiple times scanning results in multiple stereo frames." Id. at 13. Thus, according to the Examiner, "the first depth values from static scene scanning are generated under multiple stereo frames." Id.

Appellants contend that the depth values in Izzat are "calculated based on the disparity between left and right images of a single stereo image pair" and that "[n]o mention is made in Izzat et al. of estimating a first depth value based on two different stereo frames." App. Br. 6. Appellants also argue that the depth value in paragraph 26 of Izzat is generated as part of a "dynamic scan phase," and not as part of a "static scan phase," as the Examiner asserts. Reply Br. 2–3. According to Appellants, Izzat "does not mention that 'depth values' are computed during the 'static scan phase,'" or mention "using 'stereo frames' for the 'static scan phase.'" Id. at 3. Appellants add that Izzat "does not discuss using 'stereo frames' during the 'static scan phase.'" Id. Instead, Appellants argue, Izzat "appears to use different non-stereo views" during the static scan phase. Id. (citing Figs. 1, 2). Therefore, Appellants conclude, "the Examiner's argument that the 'static scene scanning' is repeated in Izzat et al. does not change the fact that the depth values [are] generated in the 'dynamic scan phase' of Izzat." Id. at 2–3. Appellants also argue that "even if the 'static scene scanning' in Izzat et al. were to generate depth values . . . , such depth values would not be generated 'based on two or more previously-processed stereo frames,' as positively recited in pending claim 9." Id. at 3.

Ultimately, we are persuaded of Examiner error.
We agree with Appellants that the Examiner has not shown that Izzat teaches estimating a first depth value based on two or more previously-processed frames, each of the frames including respective left and right images. As illustrated in Figure 3 and described in paragraph 26, Izzat discloses that "[a] stereo image pair is subjected to multiple steps of processing," including disparity estimation by matching "the pixels in the left image to those in the right images," and a triangulation procedure that converts the disparity values to depth values, resulting in "depth map 302 . . . obtained from the stereo image pair." Izzat ¶ 26. The Examiner has not shown that Izzat discloses that the stereo image pair processing is part of the static scan phase or that the stereo image pair processing "is repeated multiple times" to generate depth values based on multiple previously-processed stereo frames. See Ans. 13.

Rather, we agree with Appellants (see Reply Br. 2) that Izzat discloses that depth map 302 described in paragraph 26 (on which the Examiner relies for the claimed "first depth value") is generated during the dynamic scan phase, not during the static scan phase (see Izzat ¶¶ 16, 18, 26). Specifically, Izzat discloses a "two-pass approach [that] recovers the geometry of the static background and dynamic foreground separately using different methods." Id. ¶ 18 (emphasis added). Izzat discloses that, in the static scan phase, "a high accuracy 3D acquisition approach is used" and that, in the dynamic scan phase, "the dynamic acquisition of 3D information needs to be performed with a fast, possibly less accurate method of scanning." Id. ¶¶ 21, 22. The Examiner does not persuasively explain how disparate disclosures in Izzat regarding different scan phases teach the claimed estimating a first depth value.
Izzat does not disclose that, during the dynamic scan phase, depth map 302 (which is based on a left image and a right image) is created based on previously-processed frames. Id. ¶ 26. Nor does Izzat disclose that the static scan phase, which may be repeated to recover new items introduced in the scene (id. ¶ 16, claims 6, 15), creates a depth value that is based on previously-processed stereo frames including left and right images.

Thus, we agree with Appellants that the Examiner has not shown that Izzat teaches or suggests "estimat[ing] a first depth value for a pixel in a depth model based on two or more previously-processed stereo frames, each of the previously-processed stereo frames including respective left and right images obtained from a stereo camera," as recited in claims 2, 6, and 9. The Examiner also does not rely on Hughes as teaching this claim limitation in support of the obviousness rejection based on Izzat and Hughes. Accordingly, we do not sustain the Examiner's obviousness rejection of claims 2, 6–9, 11, and 12.

Claim 5

Independent claim 5 recites "a processor coupled to the memory and configured to obtain a depth image based on left and right images of a stereo camera by tracking a farthest depth value on a pixel basis for each of a plurality of pixels over a plurality of stereo frames." App. Br. 18 (emphases added).

The Examiner again relies on the static and dynamic scan phases disclosed in Izzat as teaching "obtain[ing] a depth image based on left and right images of a stereo camera . . . over a plurality of stereo frames." Final Act. 6 (citing Izzat ¶¶ 16, 26, Fig. 3).
The Examiner also finds that Hughes discloses "tracking a farthest depth value on a pixel basis over a plurality of stereo frames" because it discloses a Z-buffer to store the depth information for pixels and discloses "Z-buffer tests [whereby] the new pixel replaces the previous pixel value if its depth is greater than or equal to the previous value stored in the Z buffer." Id. (citing Hughes ¶ 40, Fig. 1).

Appellants argue that there is no mention "of tracking a farthest depth value over a plurality of different stereo frames" in either Izzat or Hughes. App. Br. 11–12. According to Appellants, the depth values in Izzat "are calculated based on the disparity between left and right images of a single stereo image pair," not over a plurality of stereo frames. Id. at 11. Hughes, Appellants contend, discloses that "a single image or frame may be composed of a large number of polygons that are separately and sequentially rendered" and "Z-buffering is performed for each polygon that is rendered in a frame." Id. at 12 (citing Hughes ¶ 40). Appellants argue that "Z-buffering is performed [in Hughes] by comparing depth values that are generated for different primitives or polygons of the same graphics frame." Id. Therefore, Appellants argue, Hughes fails to teach or suggest tracking a depth value over a plurality of different stereo frames. Id.

We are persuaded that the Examiner has erred. As discussed above, the Examiner has not shown that Izzat discloses that the stereo image pair processing (that calculates depth values) is part of the static scan phase or that the stereo image pair processing "is repeated multiple times" to generate depth values based on multiple stereo frames. See Final Act. 6. Rather, we agree with Appellants (see App. Br. 11) that Izzat discloses that depth map 302 is generated during the dynamic scan phase, not during the static scan phase, and is based on left and right images of a single stereo image pair (see Izzat ¶¶ 16, 18, 26). Claim 5, however, requires "obtain[ing] a depth image based on left and right images . . . by tracking a farthest depth value . . . over a plurality of stereo frames." Emphasis added.

To the extent the Examiner relies on Hughes for teaching this feature, we also agree with Appellants (see App. Br. 12) that Hughes discloses that depth tracking is performed by comparing depth values that are generated for different primitives or polygons, and there is no express disclosure that this comparison involves comparing pixel values from a plurality of stereo frames (see Hughes ¶ 40). Hughes discloses a "2D to 3D texture mapping process to add visual detail to 3D geometry." Id. Hughes further discloses that an image is "constructed from basic building blocks known as graphics primitives or polygons," and the texture mapping process performs depth tracking over multiple polygons of the same image in rendering that image to a display. Id. The Examiner does not persuasively explain how this disclosure in Hughes regarding depth tracking over a plurality of polygons teaches the claimed depth tracking over "a plurality of stereo frames."

Thus, we agree with Appellants that the Examiner has not shown that the combination of Izzat and Hughes teaches or suggests "obtain[ing] a depth image based on left and right images of a stereo camera by tracking a farthest depth value on a pixel basis for each of a plurality of pixels over a plurality of stereo frames," as recited in claim 5. Accordingly, we do not sustain the Examiner's obviousness rejection of claim 5.

DECISION

For the reasons above, we reverse the Examiner's decision rejecting claims 2, 5–9, 11, and 12.

REVERSED