Ex Parte Daly
Patent Trial and Appeal Board
Apr. 17, 2017
12491876 (P.T.A.B. Apr. 17, 2017)

UNITED STATES PATENT AND TRADEMARK OFFICE
UNITED STATES DEPARTMENT OF COMMERCE
United States Patent and Trademark Office
Address: COMMISSIONER FOR PATENTS, P.O. Box 1450, Alexandria, Virginia 22313-1450
www.uspto.gov

APPLICATION NO.: 12/491,876
FILING DATE: 06/25/2009
FIRST NAMED INVENTOR: Curtis N. Daly
ATTORNEY DOCKET NO.: P2009-03-17 (290110.432)
CONFIRMATION NO.: 6199

70336 7590 04/17/2017
Seed IP Law Group LLP/EchoStar (290110)
701 FIFTH AVENUE SUITE 5400
SEATTLE, WA 98104

EXAMINER: CORBO, NICHOLAS T
ART UNIT: 2427
PAPER NUMBER:
MAIL DATE: 04/17/2017
DELIVERY MODE: PAPER

Please find below and/or attached an Office communication concerning this application or proceeding. The time period for reply, if any, is set in the attached communication.

PTOL-90A (Rev. 04/07)

UNITED STATES PATENT AND TRADEMARK OFFICE
____________________
BEFORE THE PATENT TRIAL AND APPEAL BOARD
____________________
Ex parte CURTIS N. DALY
____________________
Appeal 2016-003473
Application 12/491,876[1]
Technology Center 2400
____________________
Before CARLA M. KRIVAK, HUNG H. BUI, and JEFFREY A. STEPHENS, Administrative Patent Judges.

BUI, Administrative Patent Judge.

DECISION ON APPEAL

Appellant seeks our review under 35 U.S.C. § 134(a) from the Examiner’s Final Rejection of claims 1–5, 7, 8, and 10–22, which are all the claims pending in the application. We have jurisdiction under 35 U.S.C. § 6(b). We AFFIRM.[2]

[1] According to Appellant, the real party in interest is EchoStar Technologies L.L.C. App. Br. 1.

[2] Our Decision refers to Appellant’s Appeal Brief filed July 15, 2015 (“App. Br.”); Reply Brief filed February 16, 2016 (“Reply Br.”); Examiner’s Answer mailed December 15, 2015 (“Ans.”); Final Office Action mailed January 9, 2015 (“Final Act.”); and original Specification filed June 25, 2009 (“Spec.”).

STATEMENT OF THE CASE

Appellant’s Invention

Appellant’s invention relates to an apparatus, systems, and methods for controlling a receiving device, such as a set-top box (“STB”), via voice commands. Spec. ¶ 1. The apparatus, systems, and methods implement a voice enabled media presentation system (“VEMPS”) to obtain audio data representing a spoken command to control the television STB from a user, via an audio input device of a remote-control device. Spec. ¶¶ 2, 32; Abstract. As an example, the user’s spoken command may request a channel change to a newly identified channel. Spec. ¶ 32. The VEMPS is configured to determine the spoken command by performing speech recognition. Spec. ¶ 2; Abstract. The VEMPS may obtain an indication the user is speaking a spoken command by employing automated voice activity detection, or by employing a signal received from the remote-control device or a first portion of the spoken command’s audio data. Spec. ¶ 83. Upon determining the user is speaking a command, the VEMPS may mute the audio output provided by the STB to a television set. Spec. ¶ 83.

Representative Claim

Claims 1, 5, 8, and 15 are independent. Representative claim 1 is reproduced below with disputed limitations in italics:

1.
A media presentation system, comprising:
    a remote-control device including multiple keys and an audio input device; and
    a set-top box wirelessly communicatively coupled to the remote-control device,
    wherein the media presentation system is configured to:
        obtain audio data via the audio input device, the audio data received from a user and representing a spoken command to control the set-top box, wherein the spoken command includes a program title to identify programming;
        determine the spoken command by performing speech recognition upon the obtained audio data;
        control the set-top box in response to the determination of the spoken command;
        control the set-top box in response to a user selection of one of the multiple keys of the remote-control device;
        receive an indication that the user is speaking the spoken command, wherein the received indication is an initial portion of the audio data; and
        reduce audio output volume provided by the set-top box in response to the received indication.
App. Br. 19 (Claims App’x).

Examiner’s Rejections & References

(1) Claims 1–5, 7, 8, 10, 11, and 15–21 stand rejected under 35 U.S.C. § 103(a) as being unpatentable over Handelman (US 2002/0052746 A1; published May 2, 2002), Houser et al. (US 5,774,859; issued June 30, 1998; “Houser”), and Broadus et al. (US 2003/0005462 A1; published Jan. 2, 2003; “Broadus”). Final Act. 3–8.

(2) Claim 12 stands rejected under 35 U.S.C. § 103(a) as being unpatentable over Handelman, Houser, Broadus, and Bhogal et al. (US 2002/0094512 A1; published July 18, 2002; “Bhogal”). Final Act. 9–10.

(3) Claim 13 stands rejected under 35 U.S.C. § 103(a) as being unpatentable over Handelman, Houser, Broadus, Bhogal, and Relyea et al. (US 2008/0120665 A1; published May 22, 2008; “Relyea”). Final Act. 10–11.

(4) Claims 14 and 22 stand rejected under 35 U.S.C. 
§ 103(a) as being unpatentable over Handelman, Houser, Broadus, and Bloebaum et al. (US 2007/0230678 A1; published Oct. 4, 2007; “Bloebaum”). Final Act. 11–12.

Issue on Appeal

Based on Appellant’s arguments, the dispositive issue on appeal is whether the cited prior art teaches or suggests the following limitations:

(1)
    obtain audio data via the audio input device, the audio data received from a user and representing a spoken command to control the set-top box, wherein the spoken command includes a program title to identify programming;
    . . .
    receive an indication that the user is speaking the spoken command, wherein the received indication is an initial portion of the audio data; and
    reduce audio output volume provided by the set-top box in response to the received indication,
as recited in Appellant’s independent claim 1, and similarly recited in independent claims 5, 8, and 15. App. Br. 8–11; Reply Br. 1–3;

(2)
    disambiguating a plurality of set-top box commands that correspond to the spoken command, by:
        determining, based on the spoken command, the plurality of set-top box commands;
        presenting the plurality of set-top box commands to the user;
        receiving from the user an indication of one of the plurality of set-top box commands; and
        controlling the set-top box using the one set-top box command,
as recited in Appellant’s claim 12. App. Br. 12–13; Reply Br. 4–5;

(3) “wherein receiving the indication of the one set-top box command includes receiving an additional spoken command from the user,” as recited in Appellant’s claim 13. App. Br. 13–16; Reply Br. 5–9; and

(4) “determining audio data that represents a voice prompt directing the user to provide a spoken command,” as recited in Appellant’s claim 14, and similarly recited in claim 22. App. Br. 16–18; Reply Br. 9–10.

ANALYSIS

§ 103(a) Rejection of Claims 1–5, 7, 8, 10, 11, and 15–21 based on Handelman, Houser, and Broadus

With respect to independent claim 1, the Examiner finds Handelman’s voice activated communication and entertainment system, shown in Figure 1, teaches a media presentation system including a remote-control device (voice activated remote control 24) equipped with an audio input device to obtain audio data from a user that represents a spoken command to control a set-top box (CATV converter 17), as claimed. Final Act. 3–4 (citing Handelman ¶¶ 166, 169–170, 174, 182, 192, Fig. 1). The Examiner further finds Handelman’s media presentation system determines the spoken command by performing speech recognition upon the obtained audio data, and controls the set-top box in response to the determination of the spoken command, as claimed. Final Act. 4 (citing Handelman ¶¶ 169–174, 184, 192, 202).

Handelman’s Figure 1 is reproduced below with additional markings for illustration.

Handelman’s Figure 1 shows a voice activated communication and entertainment system including remote control device 24, set-top box (i.e., CATV converter) 17, and television 12.

To support the conclusion of obviousness, the Examiner relies on (1) Houser for teaching the claimed “spoken command includes a program title to identify programming” and (2) Broadus for teaching (i) “receiv[ing] an indication that the user is speaking the spoken command, wherein the received indication is an initial portion of the audio data” and (ii) “reduc[ing] audio output volume provided by the set-top box in response to the received indication,” as claimed. Final Act. 5–6 (citing Houser col. 4, ll. 44–53; col. 30, ll. 19–25; col. 32, ll. 12–36; Broadus ¶¶ 37, 147–148). In particular, the Examiner finds Broadus mutes or attenuates audio output volume provided by a set-top box, upon detecting a user’s voice. Ans. 
2–3 (citing Broadus Fig. 9); Final Act. 5–6.

Broadus’ Figure 9 is reproduced below with additional markings for illustration.

Broadus’ Figure 9 illustrates a hybrid communicator/remote control and set-top box in the context of teleconferencing using an interactive television system. Broadus ¶ 20.

Appellant disputes the Examiner’s factual findings regarding Broadus, Handelman, and Houser. Particularly, Appellant argues Broadus does not (i) monitor “a spoken command from the user to identify programming” and (ii) “receive an indication that the user is speaking the spoken command, wherein the received indication is an initial portion of the audio data”; rather, Broadus’ “system mutes the TV sound when the voice from the near-end user is detected within a teleconference with the far-end user.” App. Br. 9–10 (citing Broadus ¶ 132). Appellant further argues Handelman, Houser, and Broadus do not teach or suggest reducing audio output volume in response to a spoken command indication, as claimed. Reply Br. 3; App. Br. 9–10. Rather, Broadus “only reduces the audio output in response to general speech, not in response to the received indication that the user is speaking the spoken command,” and “Handelman, Houser and Broadus mention nothing about reducing interference with spoken commands to the set-top box identifying programming.” Reply Br. 3; App. Br. 10. Appellant additionally disputes the Examiner’s rationale for combining the references. Reply Br. 2.

We do not find Appellant’s arguments persuasive. Instead, we find the Examiner has provided a comprehensive response to Appellant’s arguments supported by a preponderance of evidence. Ans. 2–4. As such, we adopt the Examiner’s findings and explanations provided therein. Id. For additional emphasis, we note Appellant’s arguments are predicated on a narrow reading of obviousness under KSR International Co. v. Teleflex Inc., 550 U.S. 
398, 418 (2007) and improper separate attacks on Broadus and Handelman when the rejection is based on the combination. In re Keller, 642 F.2d 413, 426 (CCPA 1981) (“[O]ne cannot show non-obviousness by attacking references individually where, as here, the rejections are based on combinations of references.”). In addition, Appellant’s arguments do not address the Examiner’s findings regarding what Broadus and Handelman would have suggested to one of ordinary skill in the art. The proper test for obviousness is not whether the prior art references disclose all elements of the claimed invention, but rather what the combined teachings would have suggested to a person of ordinary skill in the art. See In re Kotzab, 217 F.3d 1365, 1370 (Fed. Cir. 2000). In such an analysis, precise teachings directed to the specific subject matter of the challenged claim need not be identified because the inferences and creative steps that a person of ordinary skill in the art would employ can be taken into account. See KSR, 550 U.S. at 418.

For example, Handelman discloses speech-based control of set-top box (“STB”) (i.e., CATV converter) 17, shown in Figure 1, whereby the user’s speech is a spoken command to control STB 17. Ans. 3 (citing Handelman ¶¶ 169–174, 184, 192, 202). Particularly, Handelman’s “voice activated mode of operation” enables the user to “enter[] voice commands and instructions and . . . mak[e] voice selections” to “browse through the program guide, operate features in the program guide, refer to data presented on on-screen menus, retrieve selected program guide data, record programs, make selections and configure the program guide.” See Handelman ¶ 192.

As an extension of Handelman, Broadus teaches a similar speech-based control of set-top box (STB) 902, shown in Figure 9, to enable a user to reduce audio output volume provided by STB 902 that controls television 104. Ans. 2–3 (citing Broadus ¶¶ 147–148, Fig. 
9); see also Broadus Fig. 13 (described in paragraphs 147–148). Specifically, Broadus’ set-top box control system “determine[s] whether [user’s] voice . . . or speech is present” and, if a voice is detected, mutes or attenuates the television’s audio output to reduce interference between the user’s voice and the television’s audio. See Broadus ¶¶ 147–148, 150 (emphasis added). Thus, Broadus’ speech-based control system is “capable of receiving audio input comprising a user’s voice/speech, wherein upon detection of the speech of the user causes the STB/TV to reduce the audio output volume in response to the received speech.” Ans. 2–3.

The Examiner also reasonably finds and articulates a reason having a rational underpinning for using Broadus’ audio output reduction responsive to speech indications, in Handelman’s speech-based set-top box control, to reduce interference between the television audio and the user’s spoken commands to the set-top box to identify programming. Final Act. 3, 6 (citing Broadus ¶ 148); Ans. 3–4 (citing Handelman ¶¶ 192, 202).

In summary, we find sufficient evidence in the teachings of Handelman and Broadus to support the Examiner’s findings that the combination of Handelman and Broadus teaches and suggests (i) receiving an indication that the user is speaking a spoken command to control a set-top box, wherein the received indication is an initial portion of the audio data of the spoken command, and (ii) reducing audio output volume provided by the set-top box in response to the received indication, as recited in claim 1.

For the reasons set forth above, we sustain the Examiner’s obviousness rejection of independent claim 1 and its dependent claims 2–4, which Appellant does not argue separately. App. Br. 10.

With respect to independent claims 5, 8, and 15, Appellant reiterates the same arguments presented against claim 1. App. Br. 11.
For the same reasons discussed, we also sustain the Examiner’s obviousness rejections of claims 5, 8, and 15, and their respective dependent claims 7, 10, 11, and 16–21 for which no substantive arguments are provided. App. Br. 11–12.

§ 103(a) Rejection of Claim 12 based on Handelman, Houser, Broadus, and Bhogal

Claim 12 depends from independent claim 5, and further recites:

    disambiguating a plurality of set-top box commands that correspond to the spoken command, by:
        determining, based on the spoken command, the plurality of set-top box commands;
        presenting the plurality of set-top box commands to the user;
        receiving from the user an indication of one of the plurality of set-top box commands; and
        controlling the set-top box using the one set-top box command.
App. Br. 21 (Claims App’x).

Appellant argues the Examiner erred because none of the references teaches “disambiguating a plurality of set-top box commands or any of the other numerous additional limitations of claim 12 reciting specific steps involving set-top box commands.” App. Br. 13. According to Appellant, Bhogal “merely teaches ‘disambiguating words that correspond to a spoken word’” and “mentions nothing about a set-top box, much less specific set-top box commands”; and Handelman, Houser, and Broadus do not disambiguate any set-top box commands. App. Br. 13; Reply Br. 4.

Appellant’s arguments are not persuasive because Appellant addresses each of Bhogal, Handelman, Houser, and Broadus individually, where the Examiner relies on the combination of these references. See In re Keller, 642 F.2d at 426. Moreover, Appellant’s arguments do not substantively address, and therefore reveal no error in, the Examiner’s specific findings regarding Bhogal, Handelman, Houser, and Broadus. See App. Br. 13; Reply Br. 4–5; see also Ans. 4–5.

As recognized by the Examiner, Bhogal teaches disambiguating words that correspond to a spoken word. Final Act. 9 (citing Bhogal ¶ 22, Fig. 5); Ans. 4–5. The Examiner finds a skilled artisan would use Bhogal’s disambiguation technique to clarify ambiguous set-top box commands spoken by Handelman’s user. Final Act. 9–10. Particularly, the Examiner finds the disputed limitation of claim 12 is taught by incorporating Bhogal’s disambiguation technique (which requests user feedback with respect to other words approximating an imprecisely spoken word) into Handelman’s voice-activated remote control and set-top box, resulting in a method of controlling a set-top box via user’s disambiguated commands. Ans. 4–5. The Examiner additionally provides a detailed explanation as to how the teachings of Bhogal’s disambiguation technique and Handelman’s set-top box speech recognition system may be combined to teach the disputed limitation. Ans. 4–5; Final Act. 9–10 (citing Bhogal ¶¶ 4–5); see also Final Act. 4 (citing Handelman ¶ 202 (disclosing set-top box voice commands are processed by a speech recognition unit)). Appellant’s arguments that Bhogal and Handelman individually do not teach the limitations that the Examiner relied upon the combination of Bhogal and Handelman as teaching (in combination with the other cited references) do not persuade us of Examiner error.

For these reasons, we sustain the Examiner’s obviousness rejection of claim 12.

§ 103(a) Rejection of Claim 13 based on Handelman, Houser, Broadus, Bhogal, and Relyea

Claim 13 depends from claim 12, and further recites “wherein receiving the indication of the one set-top box command includes receiving an additional spoken command from the user.” App. Br. 22 (Claims App’x).
The Examiner further finds Relyea’s speech recognition application prompts the user to repeat an ambiguously spoken command directed to a set-top box, thereby teaching receiving an indication that includes receiving an additional spoken command from the user, as claimed. Final Act. 10 (citing Relyea ¶ 117).

First, Appellant contends Relyea does not teach or suggest receiving an additional spoken command from the user as claimed, but only teaches that “the user may override incorrectly recognized speech” by “saying the same command over again, not an additional spoken command.” App. Br. 14 (citing Relyea ¶ 117).

As recognized by the Examiner, however, Relyea discloses confirming a set-top box (access device) command “by repeating the same command vocally,” which is commensurate with the claimed “receiving an additional spoken command from the user.” Ans. 6 (citing Relyea ¶ 117 (explaining that a user may “be provided with tools for confirming correctly recognized speech or overriding incorrectly recognized speech,” such as “to allow the user to provide the correct input command . . . by repeating the command vocally”)). Appellant’s argument that Relyea merely describes “saying the same command over again, not an additional spoken command” in contrast to Appellant’s method, which requires “‘receiving an additional spoken command from the user’ not repeating the same command” (App. Br. 14 (citing Spec. ¶ 56, Fig. 3D)), is not commensurate with the scope of claim 13. Ans. 6. Claim 13 recites “the one set-top box command” and “receiving an additional spoken command,” but does not require the “additional spoken command” to be different from the “one set-top box command.”

Second, Appellant argues Relyea’s paragraph 117 contradicts claims 13 and 12 (from which claim 13 depends) because the claimed term “‘additional’ . . . cannot simply be a repeat of the same ambiguous command originally spoken [in claim 12].” App. Br. 
14–15 (emphasis added); Reply Br. 6–7. Appellant’s argument is unpersuasive because the argument is premised on Relyea requiring “a repeat of the same ambiguous command originally spoken.” However, that is not what Relyea discloses. Rather, Relyea’s paragraph 117 teaches a corrected command is obtained by “repeating the command vocally” until the command is accurately recognized by the speech recognition system. See Relyea ¶ 117. Additionally, as discussed supra, claim 13 does not require the “additional spoken command” to be different from the “one set-top box command.” As such, we agree with the Examiner that Relyea’s command, corrected through repetition, is commensurate with the claimed “additional spoken command” received from the user. Ans. 6.

Third, Appellant argues the Examiner’s rationale for combining Relyea with Handelman, Houser, Broadus, and Bhogal “does not make sense” and relies on “a teaching of Relyea that does not exist.” App. Br. 15. Appellant further argues incorporating Relyea’s command repetition feature in Bhogal’s system would render Bhogal’s disambiguation unsatisfactory for its intended purpose and change Bhogal’s principle of operation. App. Br. 15–16.

We do not agree. The Examiner reasonably finds and articulates a reason having a rational underpinning for combining Bhogal’s disambiguation method via a user’s word selection, with the additional method of Relyea “to further strengthen the voice recognition training of the system.” Ans. 6–7. As the Examiner finds, “[h]aving both methods [of Bhogal and Relyea] implemented on the system does not render [the system] unsatisfactory for its intended purpose” because the Bhogal-Relyea combination is able to disambiguate a spoken command through either Bhogal’s or Relyea’s techniques. Ans. 7.
In the Reply, Appellant argues that combining Bhogal and Relyea in this manner would provide “two separate disambiguation methods operating independently,” which is distinct from claim 13. Reply Br. 8–9. Appellant’s argument, again, is not commensurate with the scope of claim 13. The claim does not exclude disambiguating the one set-top box command and the additional spoken command by separate and independent disambiguation methods.

For these reasons, we sustain the Examiner’s obviousness rejection of claim 13.

§ 103(a) Rejection of Claims 14 and 22 based on Handelman, Houser, Broadus, and Bloebaum

Claim 14 depends from independent claim 5, and further recites, inter alia, “determining audio data that represents a voice prompt directing the user to provide a spoken command.” App. Br. 22 (Claims App’x).

The Examiner finds Handelman provides informational voice messages to the user, thereby teaching audio data representing a voice prompt, as claimed. Final Act. 11 (citing Handelman ¶¶ 183–184); Ans. 7. The Examiner further finds Bloebaum’s beep prompting the user to speak teaches audio data directing the user to provide a spoken command, as claimed. Final Act. 11 (citing Bloebaum ¶ 306); Ans. 7. The Examiner reasons that the skilled artisan would have combined Handelman’s voice messaging with Bloebaum’s audio cue prompt to “provid[e] the user[s] with an audible confirmation that they can begin to speak, rather than just assume the remote control is listening.” Final Act. 12; see also Ans. 7–8.

Appellant argues neither Handelman nor Bloebaum determines audio data that represents a voice prompt, as claimed. App. Br. 17; Reply Br. 9–10. Rather, Handelman “provid[es] messages, not prompts (prompts ask for a response from the user) whereas [Handelman’s] messages . . . ‘may include error messages and control messages’”; and Bloebaum’s audio cue is a beep, not a voice prompt as claimed. App. Br. 17; Reply Br. 10.

We do not find Appellant’s arguments persuasive and commensurate with the scope of Appellant’s claim 14. The Examiner’s rejection is based on combining Bloebaum’s “audible signal (e.g., a beep)” that “prompt[s] [a user] to begin speaking” with Handelman’s “voice messages [provided] to the user” by a speaker. Ans. 7. Appellant’s arguments are premised on the lack of a single reference teaching a voice that asks for a response from the user, but the arguments do not persuasively address the combined teachings of the references. Moreover, claim 14 does not exclude messages and audio cues from the claimed “voice prompt.” Ans. 7. “Messages” are commensurate with the broad description of “voice prompts” in Appellant’s Specification. In particular, Appellant’s Specification describes “voice prompts that are either pre-recorded or automatically generated by a speech synthesis system,” and “pre-recorded messages” that are “output by a speaker that is part of the remote-control device . . . or some other speaker.” See Spec. ¶¶ 46, 57 (emphases added); Handelman ¶ 183; Bloebaum ¶ 306.

As Appellant has not rebutted the Examiner’s findings and conclusion, we sustain the Examiner’s obviousness rejection of claim 14, and claim 22 for which Appellant provides the same arguments. App. Br. 17–18.

CONCLUSION

On the record before us, we conclude Appellant has not demonstrated the Examiner erred in rejecting claims 1–5, 7, 8, and 10–22 under 35 U.S.C. § 103(a).

DECISION

As such, we AFFIRM the Examiner’s final rejection of claims 1–5, 7, 8, and 10–22.

No time period for taking any subsequent action in connection with this appeal may be extended under 37 C.F.R. § 1.136(a)(1)(iv).

AFFIRMED