Ex Parte Kiuchi et al
Board of Patent Appeals and Interferences
Mar 29, 2012
10730767 (B.P.A.I. Mar. 29, 2012)

UNITED STATES PATENT AND TRADEMARK OFFICE

UNITED STATES DEPARTMENT OF COMMERCE
United States Patent and Trademark Office
Address: COMMISSIONER FOR PATENTS
P.O. Box 1450
Alexandria, Virginia 22313-1450
www.uspto.gov

APPLICATION NO.: 10/730,767
FILING DATE: 12/08/2003
FIRST NAMED INVENTOR: Shingo Kiuchi
ATTORNEY DOCKET NO.: 9333-361
CONFIRMATION NO.: 3437

74989 7590 03/30/2012
ALPINE/BHGL
P.O. Box 10395
Chicago, IL 60610

EXAMINER: WOZNIAK, JAMES S
ART UNIT: 2626
PAPER NUMBER:
MAIL DATE: 03/30/2012
DELIVERY MODE: PAPER

Please find below and/or attached an Office communication concerning this application or proceeding.

The time period for reply, if any, is set in the attached communication.

PTOL-90A (Rev. 04/07)

UNITED STATES PATENT AND TRADEMARK OFFICE
____________

BEFORE THE BOARD OF PATENT APPEALS AND INTERFERENCES
____________

Ex parte SHINGO KIUCHI and NOZOMU SAITO
____________

Appeal 2009-013722
Application 10/730,767
Technology Center 2600
____________

Before ELENI MANTIS MERCADER, CARL W. WHITEHEAD, JR. and BRADLEY W. BAUMEISTER, Administrative Patent Judges.

WHITEHEAD, JR., Administrative Patent Judge.

DECISION ON APPEAL

STATEMENT OF THE CASE

Appellants appeal under 35 U.S.C. § 134(a) from a final rejection of claims 1, 3-8, 12-15, 17-18, and 20. Appeal Brief 2 and 6. We have jurisdiction under 35 U.S.C. § 6(b) (2002).

We reverse.

Introduction

The invention relates to “a device and method for improving speech recognition performance in a noisy environment.” Appeal Brief 3.

Exemplary Claim

Exemplary independent claim 1 under appeal reads as follows:

1.
A method for use with a speech recognition device for improving speech recognition performance, said method comprising:

identifying a start position of a speech region of speech data for which speech recognition is to be performed;

generating, from said speech data for which speech recognition is to be performed, a plurality of pieces of speech data including said speech region and a varying period of a preceding non-speech region, where start positions of non-speech regions differ for the plurality of pieces of speech data;

performing speech recognition using each of said pieces of speech data to obtain a plurality of recognized results; and

identifying a most numerous recognized result from among the plurality of obtained recognized results;

wherein, by sequentially shifting the start position of said non-speech region from the start position of the speech region back to a position preceding by a predetermined time, a plurality of pieces of speech data whose start positions of non-speech regions differ are generated from said speech data for which speech recognition is to be performed.

Rejection on Appeal

Claims 1, 3-8, 12-15, 17-18, and 20 stand rejected under 35 U.S.C. § 103(a) as being unpatentable over Fujii et al. (U.S. Patent Number 4,885,791 (filed October 20, 1986) (issued December 5, 1989)), Keiller (U.S. Patent Number 6,975,993 B1 (filed May 2, 2000) (issued December 13, 2005)), and Bi (U.S. Patent Number 6,324,509 B1 (filed February 8, 1999) (issued November 27, 2001)). Answer 3-6.

ANALYSIS

The Examiner contends Appellants’ invention is not concerned with determining an actual starting point of speech because the instant invention involves “determining a plurality of starting points in a supposed non-speech region in order to cope with a high noise level in a recognition environment.” Answer 7 (citing Specification 3, paragraph [0010]).
The Examiner surmises that the instant invention and Fujii are similar in that regard. Id. The Examiner concludes the Appellants’ claims are silent on determining an actual starting point of speech because claim 1 merely sets forth identifying a start position, thus only requiring a general start position. Id. at 7-8.

Appellants argue the following:

Characterizations aside, the fact is that Appellants’ claims recite “identifying a start position of a speech region of speech data for which speech recognition is to be performed,” period. Nothing could be plainer or clearer: a start position of the speech region is identified -- not some vague “general” start position as asserted by the Examiner. Thus, contrary to the Examiner’s assertion, Appellants’ invention does involve determining a starting point of a speech region of speech data, and the claims expressly recite this.

Reply Brief 2 (underline omitted).

We find Appellants’ arguments persuasive. The claims indicate a starting position is identified, thus leaving the interpretation of the claims unambiguous.

The Examiner finds:

Fujii, Keiller, and Bi are analogous art because they are from a similar field of endeavor in speech recognition systems. Thus, it would have been obvious to a person of ordinary skill in the art, at the time of invention, to modify the teachings of Fujii in view of Keiller with the concept of backwards searching (shifting) taught by Bi in order to provide a well-known means of achieving the multiple speech data periods in Fujii that can be easily implemented in a real-time processor (Bi, Col. 5, Lines 24-30).

Answer 5.

Further, the Examiner concludes that since the claims do not require determining an “actual starting point of speech” and “the combination of Fujii and Bi teaches ‘identifying a starting position,’” The Examiner’s reasoning is not convincing. Answer 8.
Appellants argue the passage (column 5, lines 13-30) of Bi relied upon by the Examiner is not applicable because the passage “merely notes that the signal data are [sic] stored in a buffer because the processor performs real-time processing but must be able to look back a certain number of speech frames.” Appeal Brief 9.

Appellants further argue:

In particular, Bi does not describe sequentially shifting backwards to obtain a plurality of starting points. Bi uses a first signal-to-noise ratio (SNR) threshold value to identify a “first starting point” and a “first ending point” of calculation that are not endpoints of the speech data but are instead only interim calculation points (“PRE_START” and “PRE END”). Then, Bi uses a second, smaller SNR threshold value to determine the “actual” starting and ending points of the speech data (e.g., Abstract, col. 4, lines 37-57; col. 6, lines 15-39; col. 7, lines 24-30).

Bi notes that in conventional voice recognition devices, the endpoint detector relies upon a single SNR threshold to determine the endpoints of a piece of speech. However, setting the SNR threshold too low may make the device too sensitive to background noise, whereas setting the SNR threshold too high may miss part of the beginning or ending of speech (col. 2, lines 21-37). Bi, in contrast, “uses multiple, adaptive SNR thresholds to accurately detect the endpoints of speech in the presence of background noise” (col. 2, lines 42-44).

Thus, Bi explicitly states that it only determines one actual starting point and one actual ending point. Bi does not “consider multiple starting points” or disclose “different possible starting points” as the Office Action and the Advisory Action assert. Further, the beginning point of the Bi’s process is not the start of the speech period (as in Applicants’ invention), but rather is simply a point where the SNR reaches a first arbitrary threshold value.
Appeal Brief 9 (underlines omitted).

We find Appellants’ arguments persuasive. As we stated above, the claims indicate a starting position is identified, thus leaving the interpretation of the claims unambiguous. The Examiner cannot ignore the limitation of identifying a start position of a speech region of speech data. Further, Bi does not address the deficiencies of Fujii as noted by the Examiner and as argued by the Appellants. Therefore, we do not sustain the obviousness rejection of claims 1, 3-8, 12-15, 17-18, and 20.

DECISION

The rejection of claims 1, 3-8, 12-15, 17-18, and 20 under 35 U.S.C. § 103 is reversed.

REVERSED

llw