IPR2019-00210 (P.T.A.B. May. 11, 2020)

Nuance Communications, Inc.

Patent Trial and Appeal Board, May 11, 2020

Trials@uspto.gov    Paper 42
Tel: 571-272-7822    Entered: May 11, 2020

UNITED STATES PATENT AND TRADEMARK OFFICE

BEFORE THE PATENT TRIAL AND APPEAL BOARD

MMODAL LLC,
Petitioner,

v.

NUANCE COMMUNICATIONS, INC.,
Patent Owner.

IPR2019-00210
Patent 9,818,398 B2

Before KEN B. BARRETT, NEIL T. POWELL, and CHRISTA P. ZADO, Administrative Patent Judges.

POWELL, Administrative Patent Judge.

JUDGMENT
Final Written Decision
Determining Some Challenged Claims Unpatentable
35 U.S.C. § 318(a)

I. INTRODUCTION

MModal LLC (“Petitioner”) filed a Petition (Paper 2, “Pet.”) requesting an inter partes review of claims 1–4, 9, 10, 12, 13, and 15–17 of U.S. Patent No. 9,818,398 B2 (Ex. 1001, “the ’398 patent”). Nuance Communications, Inc. (“Patent Owner”) filed a Preliminary Response. Paper 6 (“Prelim. Resp.”). In view of those submissions, we instituted an inter partes review of claims 1–4, 9, 10, 12, 13, and 15–17 of the ’398 patent. Paper 8. Subsequent filings include a Patent Owner Response (Paper 16, “PO Resp.”), a Petitioner Reply (Paper 29, “Pet. Reply”), and a Patent Owner Sur-reply (Paper 35, “Sur-reply”). An oral hearing was held on February 27, 2019, and a copy of the transcript was entered into the record. Paper 41.

We have jurisdiction over this proceeding under 35 U.S.C. § 6(b). After considering the evidence and arguments of the parties, we determine that Petitioner has proven by a preponderance of the evidence that claims 1, 3, 4, 9, 10, 12, 13, 15, and 17 of the ’398 patent are unpatentable. See 35 U.S.C. § 316(e) (2018). We also determine that Petitioner has not proven by a preponderance of the evidence that claims 2 and 16 are unpatentable. We issue this Final Written Decision pursuant to 35 U.S.C. § 318(a).

II. BACKGROUND

A. Related Matters

According to the parties, the ’398 patent is asserted in MModal Services Ltd. v. Nuance Communications, Inc., Case No. 1:18-cv-00901-SCJ (N.D. Ga. 2018). Pet. 2; Paper 5, 2.
B. The Asserted Grounds of Unpatentability

Petitioner contends that claims 1–4, 9, 10, 12, 13, and 15–17 of the ’398 patent are unpatentable based on the following grounds:

Claims Challenged            35 U.S.C. §   Reference(s)/Basis
1–4, 9, 10, 12, 13, 15–17    102(b)¹       Davenport²
2, 16                        103(a)        Davenport, Lai³
1–4, 9, 10, 12, 13, 15–17    103(a)        Baker⁴, Baker-613⁵
2, 4, 13, 16, and 17         103(a)        Baker, Baker-613, Jamieson⁶

Petitioner relies on Declarations of Homayoon Beigi, Ph.D. (Exs. 1003, 1047). Patent Owner relies on the Declaration of John L. Hansen, Ph.D. (Ex. 2001).

¹ The Leahy-Smith America Invents Act (“AIA”), Pub. L. No. 112-29, 125 Stat. 284, 287 (2011), amended 35 U.S.C. § 103. Because the application from which the ’946 patent issued was filed before March 16, 2013, the effective date of the relevant amendment, the pre-AIA version of §§ 102 and 103 applies.
² U.S. Patent Application No. 2002/0184022 A1, pub. Dec. 5, 2002 (Ex. 1005).
³ U.S. Patent No. 6,006,183, iss. Dec. 21, 1999 (Ex. 1022).
⁴ U.S. Patent Application No. 2004/0186714 A1, pub. Sept. 23, 2004 (Ex. 1006).
⁵ U.S. Patent No. 6,122,613, iss. Sept. 19, 2000 (Ex. 1007).
⁶ U.S. Patent No. 7,383,172 B1, iss. June 3, 2008 (Ex. 1008).

C. The ’398 Patent

The ’398 patent explains that automatic speech recognition (ASR) systems can process speech to produce a recognition result. Ex. 1001, 1:17–20. The ’398 patent discloses a method involving processing speech recognition results that include two or more results chosen by the ASR system as likely accurate results. Id. at 1:34–37. The method involves “evaluating the two or more results using at least one criterion that differs from criteria used by the ASR system in determining the two or more result.” Id. at 1:37–40. If “the at least one criterion is met by the two or more results,” the method involves “triggering an alert concerning one of the two or more results.” Id. at 1:37–42.
This process may alert a person reviewing a transcript that a potential error exists in the recognition results, which may prompt the reviewer to more carefully review the recognition result, ensuring correction of incorrect recognition results. Id. at 6:48–64. The ’398 patent discloses a system for performing this process in connection with Figure 1, which is reproduced below.

Figure 1 shows speech processing system 100, which includes ASR system 102, data storage 104, and evaluation engine 106. Id. at 9:17–25. ASR system 102 receives speech data 104a from data storage 104 and produces one or more recognition results. Id. at 9:17–21. ASR system 102 “can determine . . . a string of one or more words that might correspond to the speech input.” Id. at 10:35–39. Additionally, ASR system 102 generates, “for the result, a confidence of the ASR system 102 (which may be represented as a probability on a scale of 0 to 1, as a percentage from 0 to 100 percent, or in any other way) that the result is a correct representation of the speech input.” Id. at 10:39–44. ASR system 102 may produce multiple recognition results, in which case, it may “order and filter the results in some way so as to output N results as the results of the speech recognition process.” Id. at 10:47–51. To order the results, ASR system 102 may rank the results according to the confidence level that each result correctly reflects the speech input. Id. at 10:51–55. ASR system 102 may format the results as an “N best” list, which includes a “most likely result” or “top result,” as well as “‘N-1’ alternative recognition results that the ASR system has identified as the results that are next most likely, after the top result, to be a correct representation of the speech input.” Id. at 10:55–65. Evaluation engine 106 may process the recognition results for indications of potential error. Id. at 9:21–25.
To do so, evaluation engine 106 “may use a set of confusable words and/or phrases 106A, a set of unlikely words 106B, a language model 106C, a semantic interpretation engine 106D, and/or prosody and/or hesitation vocalization information 106E.” Id. at 9:30–34. Evaluation engine 106 may store any results it produces, along with the recognition results from ASR engine 102, as information 104B. Id. at 9:36–41. Information 104B may be used, for example, to present a reviewer with recognition results that identify potential significant errors identified by evaluation engine 106. Id. at 9:41–46. The ’398 patent discusses more detail regarding a process that evaluation engine 106 may use in connection with Figure 2, which is reproduced below. IPR2019-00210 Patent 9,818,398 B2 7 Figure 2 is a flowchart showing process 200 of evaluating recognition results. Id. at 2:10–12. Evaluation engine 106 receives the N best results generated by ASR system at block 202. Id. at 13:38–40. In block 204, evaluation engine 106 compares the top recognition result to the alternate recognition results. Id. at 13:41–47. This may be done “in any suitable manner.” Id. at 13:47–48. “In some embodiments, one or more thresholds may be evaluated as part of the review and comparison of block 204.” Id. at 13:61–62. For example, evaluation engine 106 may determine whether the recognition results’ confidence values are above or below a threshold value to “determine whether the confidence is sufficiently high or low (as compared to the IPR2019-00210 Patent 9,818,398 B2 8 threshold).” Id. at 13:63–67. The ’398 patent explains that “any suitable fixed or variable thresholds set to any suitable value(s) may be used,” and that “[e]mbodiments are not, for example, limited to using thresholds of 50% or any other number that would be used to determine whether a likelihood of something occurring indicates that thing is more likely than not to occur.” Id. at 14:7–13. 
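As a rough illustration of the “N best” list and the threshold comparison described for block 204, the following Python sketch ranks hypothetical recognition results by confidence, separates the top result from the alternatives, and checks each confidence against a threshold. The names RecognitionResult, order_n_best, and below_threshold, the sample phrases, and the 0.5 threshold are assumptions made for illustration only; they do not appear in the ’398 patent, which states that any suitable threshold value may be used.

```python
from dataclasses import dataclass


@dataclass
class RecognitionResult:
    """One entry of an N-best list: recognized words plus the ASR confidence."""
    text: str
    confidence: float  # probability on a 0-to-1 scale (hypothetical values below)


def order_n_best(results, n):
    """Rank results by descending confidence and keep the top n."""
    return sorted(results, key=lambda r: r.confidence, reverse=True)[:n]


def below_threshold(results, threshold):
    """Return the results whose confidence falls below the threshold."""
    return [r for r in results if r.confidence < threshold]


# Hypothetical ASR output; the phrases and scores are invented for this sketch.
n_best = order_n_best(
    [
        RecognitionResult("evidence of malignancy", 0.40),
        RecognitionResult("no evidence of malignancy", 0.55),
        RecognitionResult("new evidence of malignancy", 0.05),
    ],
    n=2,
)

top_result, alternatives = n_best[0], n_best[1:]

# The 0.5 threshold is an arbitrary choice for the example.
low_confidence = below_threshold(n_best, threshold=0.5)
```

In this sketch the comparison to the threshold is only one possible input to the review at block 204; the Specification also describes performing that review without evaluating thresholds at all.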
At block 206, evaluation engine 106 ascertains if the comparison revealed indications of possibly semantically significant errors in the recognition results. Id. at 14:18–21. In the event evaluation engine 106 identifies potential error in the recognition results, at block 208, it triggers an alert. Id. at 14:21–23. This may happen “in any suitable way, including by presenting a visual and/or audible message via a user interface through which the recognition results are to be presented for review.” Id. at 14:30–33. Additionally, at block 210, evaluation engine 106 “may optionally store information regarding the potential error that may be semantically meaningful.” Id. at 14:34–39. Aside from identifying errors in the recognition results produced by ASR system 102, the ’398 patent discloses determining whether the speech input may contain errors. Id. at 36:60–63. According to the ’398 patent, “speech input may include a potential error when a speaker is uncertain of the speech input that he[] or she should be providing.” Id. at 36:30–33. The ’398 patent further explains that a speaker may speak differently when uncertain about the accuracy of what he or she is saying. Id. at 36:33–43. Therefore, the ’398 patent explains, “an evaluation of prosody information for speech input, which may include durational information and pitch information, may provide an indication of whether a speaker was uncertain of the speech input that was provided.” Id. at 36:44–49. IPR2019-00210 Patent 9,818,398 B2 9 The ’398 patent discusses a process of evaluating prosody information in connection with Figure 9, which is reproduced below. Figure 9 is a flowchart showing a process of ascertaining whether a speaker was uncertain by evaluating prosody information. Id. at 2:44–47. At block 902, evaluation engine 106 obtains prosody information from ASR engine 102. Id. at 37:11–13. 
Evaluation engine 106 then evaluates the prosody information at block 904 to determine the probability that the speaker was uncertain of the provided speech input. Id. at 37:27–30. This may involve, for example, determining whether the speaker took a long time to pronounce phonemes, as well as whether the speaker paused for excessively long times. Id. at 37:63–38:43. At block 906, evaluation engine 106 “compares the likelihood of uncertainty calculated in block 904 to a threshold likelihood to determine whether the calculated likelihood exceeds the threshold.” Id. at 42:28–31. If so, it may be deemed that the speaker was uncertain, and evaluation engine 106 may trigger an alert in step 908. Id. at 42:34–40.

D. Illustrative Claim

Claims 1, 12, and 15 are independent. Each of claims 2–4, 9, 10, 13, 16 and 17 depends from one of independent claims 1, 12, and 15. Claim 1 is illustrative of the challenged claims and recites:

1. A method comprising:
evaluating two or more results of a recognition, by an automatic speech recognition (ASR) system on a speech input, using at least one criterion that differs from criteria used by the ASR system in determining the two or more results, wherein the two or more results were identified by the ASR system as likely to be accurate recognition results for the speech input and comprise a first recognition result identified by the ASR system as most likely to be a correct recognition result for the speech input and at least one alternative recognition result identified by the ASR system as a potential recognition result for the speech input; and
in response to determining that the at least one criterion is met by the two or more results, triggering presentation, via a user interface, of an alert concerning one of the two or more results.

Ex. 1001, 51:8–24.

III. ANALYSIS

A. Claim Construction

In an inter partes review filed before November 13, 2018, claim terms in an unexpired patent are given their broadest reasonable construction in light of the specification of the patent. 37 C.F.R. § 42.100(b) (2018).⁷ Consistent with that standard, we assign claim terms their ordinary and customary meaning, as would be understood by one of ordinary skill in the art at the time of the invention, in the context of the entire patent disclosure. See In re Translogic Tech., Inc., 504 F.3d 1249, 1257 (Fed. Cir. 2007). Only those terms that are in controversy need be construed, and only to the extent necessary to resolve the controversy. See Nidec Motor Corp. v. Zhongshan Broad Ocean Motor Co., 868 F.3d 1013, 1017 (Fed. Cir. 2017) (citing Vivid Techs., Inc. v. Am. Sci. & Eng’g, Inc., 200 F.3d 795, 803 (Fed. Cir. 1999)).

Petitioner asserts that all of the terms in the challenged claims “should be given their ordinary and customary meaning.” Pet. 11. Petitioner contends that “[n]o specific construction of any claim term is required because the prior art relied on in this Petition meets each of the claim terms under any reasonable construction.” Id. at 11–12.

⁷ The Office recently changed the claim construction standard used in inter partes review proceedings. As stated in the Federal Register notice, however, the new rule applies only to petitions filed on or after November 13, 2018, and, therefore, does not impact this matter. See Changes to the Claim Construction Standard for Interpreting Claims in Trial Proceedings Before the Patent Trial and Appeal Board, 83 Fed. Reg. 51,340, 51,340 (Oct. 11, 2018) (stating “[t]his rule is effective on November 13, 2018 and applies to all IPR, PGR and CBM petitions filed on or after the effective date”).

Patent Owner argues that, contrary to assertions in the Petition, “evaluating two or more results of a recognition” does not encompass evaluating just a top recognition result.
PO Resp. 8–9. Patent Owner argues that we must reject Petitioner’s assertion that this limitation “could be met by evaluating only the top ASR result.” Id. at 9. As discussed below, for purposes of deciding whether Petitioner has shown unpatentability of the challenged claims, we need not address whether evaluating a top recognition result constitutes “evaluating two or more results of a recognition.” The parties’ disputes regarding patentability do, however, warrant addressing other claim-construction issues. We turn now to detailed discussions of those issues. 1. “criterion” Petitioner asserts in its Reply that “the broadest reasonable construction of criterion is its plain meaning, ‘a standard, rule, or test by which something may be judged.’” Pet. Reply 7. In support of this construction, Petitioner cites the Petition, dictionary definitions of “criterion,” and Dr. Beigi’s testimony. Id. (citing Pet. 17–18, 33–39; Ex. 1042, 344; Ex. 1043, 204; Ex. 1047 ¶¶ 25–28). Dr. Beigi testifies that “[a person of ordinary skill in the art] reading the claims in the context of the ’398 patent would have understood that criterion/criteria retains” the plain meaning noted by Petitioner. Ex. 1047 ¶ 26. Dr. Beigi cites passages of the Specification as using the term consistent with this plain meaning. Id. ¶ 27. Patent Owner does not dispute Petitioner’s proposed construction of “criterion.” See generally, PO Resp. The evidence presented by Petitioner persuades us that a person of ordinary skill in the art would understand the broadest reasonable IPR2019-00210 Patent 9,818,398 B2 13 interpretation of “criterion” in view of the claims and Specification of the ’398 patent is “a standard, rule, or test by which something may be judged.” 2. 
“evaluating two or more results of a recognition” Independent claim 1, for example, recites “evaluating two or more results of a recognition, by an automatic speech recognition (ASR) system on a speech input, using at least one criterion that differs from criteria used by the ASR system in determining the two or more results.” Ex. 1001, 51:9–13. The Petition asserts that we should give the claim language “evaluating two or more results of a recognition” its “ordinary and customary meaning.” Pet. 11. The Petition contends this claim language encompasses assessing an n-best list with its scores. Id. at 14–18. For example, Petitioner contends that the claim language encompasses determining “whether the top results’ scores are ‘very close’ so that an accurate distinction between them ‘might not be possible.’” Id. at 17. In its Response, Patent Owner disagrees, arguing that “[t]he independent claims’ plain language requires ‘evaluating two or more results of a recognition’ identified ‘as likely to be accurate recognition results’— not evaluating the confidence scores used to identify those results as likely to be accurate.” PO Resp. 13–14. Patent Owner argues that the Petition treats confidence scores as different from recognition results. Id. at 13. Patent Owner argues that this comports with “the challenged claims, the ’398 specification, and both experts’ testimony about how a [person of ordinary skill in the art] would have understood ‘results of a recognition’ as claimed.” Id. In its Reply, Petitioner responds to that argument in maintaining that the claim language encompasses evaluating recognized words with IPR2019-00210 Patent 9,818,398 B2 14 associated scores. Pet. Reply 1–6. Disagreeing with Patent Owner’s contention that the claimed “results of a recognition” are limited to recognized words (id. 
at 2), Petitioner adds that “even if one adopts [Patent Owner’s] narrow view of ‘recognition results,’ evaluating a recognition result is not limited to evaluating the recognized words themselves” (id. at 3). Petitioner cites the Specification as disclosing repeatedly “using any suitable criteria” to evaluate results. Id. (citing Ex. 1001, code (57), 4:66– 5:1, 9:6–11, 11:53–56, 12:3–8, 13:25–31). Petitioner also argues that the Specification explicitly discloses using scores to evaluate recognition results and does not require any other evaluation, “especially in view of its repeated exhortations that its examples are not limiting in any way.” Id. Petitioner adds that aside from the recognized words, the Specification also discloses other information, such as sets of confusable and unlikely words, for evaluating the recognition results. Id. at 4. In its Sur-reply, Patent Owner argues we should not consider the arguments in Petitioner’s Reply, and that even if we do, those arguments lack merit. Sur-reply 2–8. Patent Owner argues that we should disregard the arguments from the Reply because they constitute procedurally improper new arguments. Id. at 2–3, 5–6. Additionally, Patent Owner maintains that the claimed “results of a recognition” do not encompass scores, as asserted by Petitioner. Id. at 2–5. Patent Owner cites claim language and the Specification as demonstrating that “recognition results” do not encompass scores. Id. Patent Owner also disagrees substantively with Petitioner’s assertion that the claim language encompasses evaluating recognized words with associated scores. Id. at 7–8. Patent Owner argues that the portions of the Specification cited by Petitioner either (1) do not disclose evaluating IPR2019-00210 Patent 9,818,398 B2 15 recognition results with associated scores, or (2) describe unclaimed subject matter. Id. To resolve the parties’ dispute, we need not address whether the claimed “results of a recognition” includes scores. 
Rather, we can resolve the parties’ dispute by addressing whether the claim language “evaluating two or more results of a recognition” encompasses evaluating recognized words with associated scores. First, we consider Patent Owner’s argument that Petitioner’s Reply improperly newly argued that the claim language encompasses evaluating recognition results with associated scores. Patent Owner contends, in its Sur-reply, that the Petition did not advance this argument, thereby depriving Patent Owner of an adequate opportunity to present arguments and evidence to the contrary. Sur-reply 5–8. As noted above, the Petition asserts that we should give the claim terms their “ordinary and customary meaning.” Pet. 11. Additionally, when addressing the “evaluating” language of the challenged claims, the Petition explains that evaluating an n-best list by determining “whether the top results’ scores are ‘very close’ so that an accurate distinction between them ‘might not be possible’” constitutes evaluating “two or more ASR results using at least one criterion that is different from the criteria used by its ASR.” Id. at 17. We find the Petition clearly conveys Petitioner’s position that the plain and ordinary meaning of the claim language encompasses evaluating recognized words with associated scores. Patent Owner’s response that “results of a recognition” do not include scores simply fails to acknowledge the Petition’s clear assertion that the claim language, as a whole, encompasses evaluating recognized words with IPR2019-00210 Patent 9,818,398 B2 16 associated scores. To the extent Patent Owner is correct in asserting that the Petition treats scores as distinct from recognition results, this does not diminish the Petition’s clear assertion that the claim language encompasses evaluating recognized words with associated scores. 
Instead, it illustrates that the Petition contends the claim language encompasses evaluating recognized words with associated scores, even if the scores themselves do not constitute part of the “results of a recognition.” Indeed, without acknowledging expressly the Petition’s position, Patent Owner’s Response recognizes the issue of whether the claim language encompasses evaluating recognition results with scores, arguing that the Specification does not teach doing so. PO Resp. 18–19 n.3. Additionally, the Sur-reply afforded Patent Owner the opportunity to respond to this argument, and Patent Owner did so. Sur-reply 5–8. Thus, we disagree with Patent Owner’s assertion that it did not get an opportunity to address whether the claim language encompasses such evaluation. Turning to the merits of the Petition’s position on the scope of the claim language, we start with the claim language. The claim language explicitly states that evaluating can involve information other than the results of the recognition. In particular, the claim language recites “evaluating . . . using at least one criterion that differs from criteria used by the ASR system in determining the two or more results.” Although this does not expressly refer to scores associated with recognition results, it conveys that the evaluating can be performed with information other than the results of the recognition. Because the claim language conveys that the evaluation may use information other than the results of a recognition, and because the claim language does not exclude using the scores associated with the results IPR2019-00210 Patent 9,818,398 B2 17 of a recognition8, we find that the plain meaning of the claim language encompasses evaluating results of a recognition with associated scores. Next, we consider whether the plain meaning of the claim language is consistent with the Specification. On this point, the parties and their declarants view certain disclosures of the Specification differently. 
Noting the Specification’s repeated disclosures of evaluating recognition results “using any suitable criteria,” Petitioner and Dr. Beigi read the Specification as expressly teaching evaluating recognition results by evaluating their scores. Pet. Reply 3; Ex. 1047 ¶¶ 14–16. In support of this, Petitioner and Dr. Beigi cite the following passages from the Specification:

In some embodiments, one or more thresholds may be evaluated as part of the review and comparison of block 204. For example, confidences in recognition results or likelihoods determined by an ASR system using acoustic and/or language models may be compared to one or more thresholds to determine whether the confidence, likelihood, etc. is above or below the threshold. Actions may then be taken based on the comparison to the threshold, such as in the case that the evaluation engine determines whether the confidence, likelihood, etc. is sufficiently high or low (as compared to the threshold) for the action to be taken. In some embodiments in which an evaluation engine uses one or more thresholds as part of a review and comparison of recognition results, any suitable fixed or variable thresholds set to any suitable value(s) may be used, as embodiments are not limited to using any particular values for thresholds. Embodiments are not, for example, limited to using thresholds of 50% or any other number that would be used to determine whether a likelihood of something occurring indicates that thing is more likely than not to occur. It should also be appreciated that embodiments are not limited to using thresholds, and that some embodiments may perform the review and comparison of block 204 without evaluating thresholds.

Ex. 1001, 13:61–14:17.

⁸ Although we recognize that the claim language “at least one criterion that differs from criteria used by the ASR system” arguably requires the use of information other than the scores, it does not exclude also using the scores.
Accordingly, in some embodiments a recognition result of an ASR system may be evaluated to determine whether the recognition result includes a word or phrase that is unlikely to occur in the domain of the speech input on which the recognition result is based. When the recognition result includes an unlikely word or phrase, any suitable action may be taken. For example, when an indication of a potential error is detected, an alert may be triggered to notify a reviewer that an error may be present in the recognition result of the ASR system. When the reviewer is notified about a potential error, the reviewer may be more likely to closely review and, if desired, correct the error and not inadvertently overlook the error. Determining whether a recognition result includes an unlikely word or phrase may be carried out in any suitable manner. Id. at 7:63–8:10. Patent Owner and Dr. Hansen do not read the Specification as discussing evaluating scores to evaluate recognition results. Instead, Patent Owner argues: The ’398 patent describes an embodiment that uses confidence scores output from the ASR system, but not to evaluate two or more recognition results. ’398 patent, 15:13–61. That embodiment uses confidence scores only to select a subset of alternative results to evaluate, which is a preliminary step before performing the evaluation using the different criterion. PO Resp. 18–19 n.3. In support of this assertion, Patent Owner cites Dr. Hansen’s testimony that I note the ’398 patent describes an embodiment that makes use of confidence scores output from the ASR system, but a POSA would have understood the ’398 patent to teach that this use is not the claimed evaluation of the two or more recognition results. IPR2019-00210 Patent 9,818,398 B2 19 ’398 patent, 13:61-14:17, 15:13-67. 
The ’398 patent teaches that this embodiment uses confidence scores only to select a subset of alternative results to evaluate, which is a preliminary step before performing the evaluation using the criterion that differs from criteria used by the ASR system in determining the results. Id. The ’398 patent does not describe this usage of confidence scores as an evaluation of recognition results, and explicitly says the confidence scores are used to determine “which alternative recognition results to evaluate.” Id., 15:64-65. Moreover, a POSA would have understood that an evaluation of confidence scores is manifestly not an evaluation of recognition results, which are the “string[s] of one or more words that might correspond to the speech input” (id., 10:38-39), as discussed above in § V.A and further below. Ex. 2001 ¶ 58. The Specification clearly defines the scope of its evaluation of recognition results broadly. It explains that “[t]he evaluation of the recognition results may be carried out using any suitable criteria.” Ex. 1001, 4:66–5:1 (emphasis added). It elaborates that an evaluation engine (which may be part of a speech processing system) may carry out any suitable evaluation process to determine whether recognition results include potential significant errors. Fig. 2 illustrates one non-limiting process that may be used in some embodiments by an evaluation engine to make such a determination. Id. at 13:25–31. As part of the process shown in Figure 2, the Specification explains that “[i]n block 204, the evaluation engine reviews the N best results and compares the top result to the alternative recognition results to determine whether there are one or more discrepancies between the top result and any of the alternative recognition result that is semantically meaningful in the domain.” Id. at 13:41–47. The Specification also explains that this may IPR2019-00210 Patent 9,818,398 B2 20 involve comparing “confidences in recognition results . . . 
to one or more thresholds.” Id. at 13:61–66. This portion of the Specification conveys that comparing recognition results’ scores to thresholds constitutes an example of a suitable process of evaluating recognition results using suitable criteria. We do not agree with Patent Owner and Dr. Hansen’s position that the Specification’s discussion of block 204 using scores and thresholds does not disclose evaluating recognition results. PO Resp. 18–19 n.3.; Ex. 2001 ¶ 58. “[T]he evaluation engine” performs block 204 to identify potential error in a top recognition result. Ex. 1001, 13:41–47 (emphasis added). The disclosure of “the evaluation engine” using scores and thresholds when reviewing recognition results for potential error conveys that the system evaluates the recognition results using the scores and thresholds. We find that a person of ordinary skill in the art would understand the Specification as disclosing evaluation of recognition results with their associated scores, such as by comparing the scores to one or more thresholds. See, e.g., id. at 4:66, 7:63–8:10, 13:25–31, 13:60–14:17. Patent Owner argues that the discussion of the execution of block 204 is an “introductory discussion . . . of the embodiment described more fully at 15:13-61, which the POR (at 18-19 n.3) already explained uses confidence scores only in a preliminary step before performing the evaluation of the recognition results that the ‘398 patent claims.” Sur-reply 7. Patent Owner provides no explanation or evidence of why a person of ordinary skill in the art would understand the discussion of block 204 as only disclosing “the embodiment more fully described” in column 15. See id. Like Patent Owner, Dr. Hansen appears to assume that the discussion in column 15 limits the discussion of the evaluation engine performing block 204. See Ex. IPR2019-00210 Patent 9,818,398 B2 21 2001 ¶ 58. Also like Patent Owner, Dr. Hansen provides no explanation or evidence to support this assumption. See id. 
We do not agree with the unsupported assumption of Patent Owner and Dr. Hansen. Moreover, even if we agreed with Patent Owner’s argument that the Specification does not expressly disclose using scores and thresholds to evaluate recognition results, we find that Patent Owner does not provide persuasive reasoning or evidence to demonstrate that the Specification disavows or excludes such evaluation of recognition results. For the foregoing reasons, we find that the plain meaning of the claim language encompasses evaluating results of a recognition with scores, and that the Specification is consistent with this meaning. Accordingly, we conclude that the broadest reasonable interpretation of “evaluating two or more results of a recognition, by an automatic speech recognition (ASR) system on a speech input, using at least one criterion that differs from criteria used by the ASR system in determining the two or more results” encompasses evaluating the recognition results with their associated scores. B. Alleged Anticipation by Davenport 1. Overview of Davenport Davenport discloses “[a] system that identifies recognized words from a voice recognition system that have the lowest possibility of being correct, and flagging those words on a user interface, to help with proofreading.” Ex. 1005, code (57). Davenport notes that speech engines may generate an Alts list, which usually includes more than one recognition candidate for each recognized phrase or word. Id. ¶ 11. A corresponding confidence value is associated with each recognition candidate. Id. ¶ 12. The confidence value reflects the recognizer’s degree of confidence that the IPR2019-00210 Patent 9,818,398 B2 22 phrase or word accurately corresponds with the speaker’s utterance. Id. According to Davenport, “virtually all dictation engines are believed to produce a list of the different candidates and somehow score the likelihood that the current word is the correct candidate.” Id. ¶ 13. 
Davenport’s “system uses these variables to identify situations where it is likely that recognition errors have occurred.” Id. ¶ 14. Davenport discusses a process for doing so in connection with Figure 2, which is reproduced below. Figure 2 illustrates a flowchart of a process to “identify and produce an indication showing likely misrecognition candidates.” Id. ¶ 6. First, at 205, the system identifies a circumstance where the confidence level of a best recognition falls below a predetermined threshold. Id. ¶ 14. The system may, for example, use 50 percent or 70 percent as the predetermined threshold. Id. The system uses these values “to form a first list, called list A.” Id. The system also identifies, at 210, “two alternatives which have very close scores, e.g., close enough that accurate detection of one or the other might not be possible.” Id. ¶ 15. Davenport explains “this may use a system of percentile rankings. The scores lying in the top 5 percentile closest scores are taken as unusually close confidence ratings.” Id. A second list, identified as List B, is formed using the values obtained at 210. Id. The words of list A and list B are identified at 215, and “[t]he user interface is modified to show at least some of the list A and list B words.” Id. ¶ 26. The system may highlight the words on the list within the document, such that “the users may be advised of likely misrecognitions, thereby making it easier to proofread such a document.” Id. ¶ 27.

2. Legal Standard

“A claim is anticipated only if each and every element as set forth in the claim is found, either expressly or inherently described, in a single prior art reference.” Verdegaal Bros., Inc. v. Union Oil Co., 814 F.2d 628, 631 (Fed. Cir. 1987). Whether a reference anticipates is assessed from the perspective of an ordinarily skilled artisan. See Dayco Prods., Inc. v. Total Containment, Inc., 329 F.3d 1358, 1368 (Fed. Cir.
2003) (“[T]he dispositive question regarding anticipation [i]s whether one skilled in the art would reasonably understand or infer from the [prior art reference’s] teaching that every claim element was disclosed in that single reference.” (alterations in original)). In that regard, in an anticipation analysis, “it is proper to take into account not only specific teachings of the reference but also the inferences which one skilled in the art would reasonably be expected to draw therefrom.” In re Preda, 401 F.2d 825, 826–27 (CCPA 1968).

3. Discussion

Petitioner explains in detail how it contends, and we find based on Petitioner’s showing, Davenport discloses each of the limitations of claims 1–4, 9, 10, 12, 13, and 15–17. Pet. 14–25. Patent Owner argues that Davenport does not disclose certain limitations of independent claims 1, 12, and 15. Prelim. Resp. 22–34. We turn now to detailed discussions of the parties’ disputes regarding different challenged claims.

a. Independent Claims 1, 12, and 15

With respect to independent claims 1, 12, and 15, the parties’ arguments raise a number of disputes regarding whether Davenport discloses “evaluating two or more results of a recognition, by an automatic speech recognition (ASR) system on a speech input, using at least one criterion that differs from criteria used by the ASR system in determining the two or more results,” as recited in each of the independent claims. Addressing “results of a recognition, by an automatic speech recognition (ASR) system on a speech input,” Petitioner asserts that Davenport’s discussion of a “speech recognition engine,” “speech engine,” “dictation engine,” or “voice recognition system/engine” discloses an ASR system. Pet. 15–16. Petitioner contends that Davenport’s ASR system uses a “‘language model’ and/or ‘acoustic model’” to generate an n-best list with results and confidence scores. Id. at 16–17.
Additionally, Petitioner asserts that Davenport discloses evaluating the recognition results of the n-best list using at least one criterion that differs from the criteria employed by the ASR. Id. at 17. Petitioner explains that the ASR selects the n-best results using its own language and acoustic models, whereas “Davenport’s post-processing uses a different criterion to further evaluate the results, namely, it evaluates whether the top results’ scores are ‘very close’ so that an accurate distinction between them ‘might not be possible.’” Id. Petitioner elaborates that “the post-processing criteria is a straightforward numerical comparison,” “not an acoustic model or language model used by Davenport’s ASR.” Id. According to Petitioner, “Davenport explains that the criterion for ‘unusually close confidence ratings’ can be, for example, whether the scores are within the top 5 percentile of closest scores.” Id. Results meeting this criterion go on “List B,” formed of “all words or utterances whose top two or three recognition candidates vary with a margin that is very narrow,” Petitioner explains. Id. at 17–18. Petitioner also notes that Davenport discusses an example of three recognition results with “‘very close’ confidence scores: eight (score of 85); ate (score of 83); and bait (score of 80).” Id. at 17. Petitioner also argues that Davenport’s discussion of comparing recognition results’ confidence scores to a threshold discloses the claimed evaluation using a criterion different from the criteria used to identify the recognition results. Id. at 18. Petitioner explains that Davenport discloses this threshold “may be a number, such as a 70% confidence level, or may be the results with the lowest 5 percentile confidence levels.” Id. We turn now to detailed discussions of the disputes raised by the parties’ arguments.
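For illustration only, the two post-processing tests that Petitioner describes can be sketched as simple numerical checks applied to an n-best list after the recognizer has produced it. The sketch below is not drawn from Davenport or from the record: the names `Candidate`, `below_threshold`, and `scores_very_close` are hypothetical, and the fixed five-point margin is an assumed simplification standing in for the percentile ranking Davenport mentions. Only the 70 percent threshold and the eight/ate/bait scores come from the passages cited above.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    text: str
    score: float  # confidence score reported by the recognizer (0-100)


def below_threshold(n_best, threshold=70.0):
    """List A style test: is the best candidate's score under the threshold?"""
    best = max(n_best, key=lambda c: c.score)
    return best.score < threshold


def scores_very_close(n_best, margin=5.0, top_k=3):
    """List B style test (assumed fixed margin): are the top scores within `margin`?"""
    top = sorted(n_best, key=lambda c: c.score, reverse=True)[:top_k]
    if len(top) < 2:
        return False  # nothing to compare against
    return top[0].score - top[-1].score <= margin


# Example scores quoted in the Petition's discussion of Davenport.
n_best = [Candidate("eight", 85), Candidate("ate", 83), Candidate("bait", 80)]
flag_list_a = below_threshold(n_best)    # best score 85 is not below 70
flag_list_b = scores_very_close(n_best)  # 85 - 80 = 5, within the assumed margin
```

In this sketch the scores are inputs, while the threshold and the margin are separate tests applied to those inputs, which is the distinction the parties dispute.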
(1) Whether Davenport’s Generation of “List A” Evaluates Two or More Results of a Recognition

Patent Owner argues that Davenport’s process of generating list A evaluates only the score of a top recognition result, thereby failing to disclose the independent claims’ “evaluating two or more results of a recognition” (emphasis added). PO Resp. 11–12. We deem this argument, which applies only to Davenport’s generation of list A, moot because we determine for the reasons discussed throughout this Decision that Petitioner has shown that Davenport’s generation of list B meets the “evaluating” limitations of the independent claims.

(2) Whether Davenport Evaluates Results of a Recognition

Patent Owner argues that Davenport’s process of determining whether to include recognition results on list A or list B does not evaluate recognition results. PO Resp. 12–16. Patent Owner bases this argument on its position that the claimed “evaluating two or more results of a recognition” requires evaluating the words themselves, rather than evaluating the words with the associated confidence scores. Id. As explained in detail above in Section III.A.2, we disagree with Patent Owner’s position regarding the scope of the claims. Consequently, we find unpersuasive Patent Owner’s argument that Davenport evaluates scores, but not recognition results.

(3) Whether Davenport Uses a Criterion Different From Criteria Used by the ASR System

(a) Patent Owner’s Argument that Davenport’s Scores Are Used by the ASR

Patent Owner argues that Davenport uses confidence scores in the ASR to determine recognition results, as well as in determining whether to put recognition results on list A or list B. PO Resp. 18–19.
Patent Owner argues Davenport, therefore, does not disclose “evaluating two or more results of a recognition, by an automatic speech recognition (ASR) system on a speech input, using at least one criterion that differs from criteria used by the ASR system in determining the two or more results,” as recited by the claims. Id. Patent Owner’s argument is not commensurate in scope with the independent claims. Although this claim language requires using at least one criterion that was not used in determining the two or more results, it does not exclude also using some criteria that were used in determining the two or more results. Therefore, even if Davenport uses the confidence scores to determine the recognition results and then to evaluate them, that by itself does not distinguish the claims from Davenport. Of course, to the extent Davenport does use the confidence scores both to determine recognition results and subsequently evaluate them, Petitioner needs to demonstrate Davenport uses at least one other criterion that meets the claims. The parties’ disputes regarding whether Davenport discloses some other criterion are discussed in the following sections.

(b) Whether Confidence Scores Being “Very Close” or Below a Threshold Is a Different Criterion

As noted above, Petitioner argues that Davenport’s evaluation of whether the ASR’s recognition results have very close scores and the evaluation of whether the scores fall below a threshold each involve using a criterion that differs from the criteria used by Davenport’s ASR in determining the recognition results. Pet. 17–18. The evidence cited by the Petition includes Dr. Beigi’s testimony that Davenport’s “post-processor . . . uses a different criteria: (a) its own minimum confidence threshold, and (b) a threshold for how close the top scores are.” Ex. 1003 ¶ 125.
Patent Owner argues that “[t]he ’398 specification nowhere describes applying a threshold to a criterion used by the ASR system as somehow being a different criterion, because it manifestly is not.” PO Resp. 20. Asserting that Davenport’s ASR uses confidence scores, Patent Owner criticizes that the Petition does not analyze why applying a threshold to a criterion used by the ASR constitutes applying a different criterion. Id. at 20–22. Patent Owner argues that “[a]s Dr. Hansen explains, the criterion Davenport evaluates is the confidence score, which is a criterion used by the ASR system in generating the recognition results.” Id. at 20–21. Petitioner counters that “[a] ‘criterion,’ however, is ‘a standard, rule, or test by which something may be judged,’ and not the something that is being judged, i.e., the recognized words and their scores.” Pet. Reply 8–9. In his Supplemental Declaration, Dr. Beigi elaborates that “a criterion might be whether the top two or three scores are within the top 5th percentile of closest scores,” i.e., “whether the difference between the two or three top scores is small enough that 95% of the top scores are further apart.” Similarly, Dr. Beigi testifies that “the test of whether the word’s score is above 75” is a criterion. Patent Owner responds that each recognition result’s confidence score is a “criterion” according to Petitioner’s construction. Sur-reply 9. Patent Owner contends that “Davenport undisputedly only evaluates the scores that the ASR system itself used as measures in determining the recognition results by judging their likely accuracy.” Id. at 10. Petitioner persuades us that, contrary to Patent Owner’s arguments, Davenport’s evaluation to compile lists A and B does not only use confidence scores as criteria, but also uses at least one criterion different from the confidence scores.
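As a rough illustration of the percentile test Dr. Beigi describes, the sketch below ranks words by the gap between their top two confidence scores and flags those whose gap falls among the smallest five percent. It is an assumption-laden sketch, not Davenport’s implementation: the function name `closest_gap_words`, the input format (one precomputed gap per word), the sample gap values, and the choice to always flag at least one word are all hypothetical.

```python
def closest_gap_words(gaps, percentile=5.0):
    """Return indices of words whose top-two score gap is among the smallest
    `percentile` percent of all gaps (assumed: at least one word is flagged)."""
    if not gaps:
        return []
    # Rank word indices from the narrowest gap to the widest.
    ranked = sorted(range(len(gaps)), key=lambda i: gaps[i])
    keep = max(1, int(len(gaps) * percentile / 100.0))
    return sorted(ranked[:keep])


# Hypothetical gaps (top score minus runner-up score) for 20 recognized words.
gaps = [30, 2, 25, 40, 18, 22, 35, 50, 12, 28,
        33, 45, 19, 27, 38, 41, 16, 24, 31, 36]
flagged = closest_gap_words(gaps)  # the single narrowest gap out of 20 words
```

The cutoff here depends on the distribution of gaps across the document rather than on any one word’s score, which is one way to read the statement that 95% of the top scores are further apart.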
For example, Petitioner persuades us that a person of ordinary skill in the art would have understood that Davenport’s process of generating list B uses a test with a threshold, such as the exemplary 5% value, to determine whether the recognition results’ scores are close enough to trigger inclusion of the recognition results on the list. Pet. 17–18; Pet. Reply 6–12; Ex. 1003 ¶¶ 125–128; Ex. 1047 ¶¶ 25–28. This test and threshold are not the recognition results’ scores; the scores are the scores, and the test and threshold are separate and applied to the scores. Additionally, under the broadest reasonable interpretation of “criterion” the test and threshold constitute “at least one criterion,” as recited in the independent claims. Thus, regardless of whether the scores constitute at least one criterion used by Davenport to evaluate the recognition results, the test and the threshold applied to the scores constitute at least one different criterion than the scores themselves. Thus, a person of ordinary skill in the art would have understood Davenport as disclosing that the test and threshold used in generating list B were not used to determine the recognition results (discussed below in Section III.B.3.a.(3)(c)), and we therefore find Davenport discloses “evaluating two or more results of a recognition, by an automatic speech recognition (ASR) system on a speech input, using at least one criterion that differs from criteria used by the ASR system in determining the two or more results,” as recited in the independent claims.

(c) Whether Davenport’s ASR System Uses the Alleged Criteria in Determining Recognition Results

Citing Dr. Beigi’s testimony, the Petition asserts that Davenport’s post-processing uses at least one criterion different from the criteria used by its ASR to generate recognition results. Pet. 14–18. In discussing the state of the art, Dr.
Beigi’s first Declaration explained in detail how ASRs generate recognition results using language models, acoustic models, and statistics. Ex. 1003 ¶¶ 52–59. Dr. Beigi also explained in detail how known post-processing allowed error correction using various techniques. Id. ¶¶ 65–69. With this background, Dr. Beigi further testified that Davenport’s ASR uses “language and/or acoustic models,” whereas its post-processor “uses a different criteria: (a) its own minimum confidence threshold, and (b) a threshold for how close the top scores are.” Id. ¶ 125. Based on this testimony, the Petition argued that Davenport discloses using different criteria in post-processing than in generating recognition results with the ASR. Pet. 14–18. Discussing the state of the art in post-processing, the Petition argued that “[u]sing the same criteria as the ASR would not have made sense.” Id. at 7. Dr. Beigi testifies that Davenport discloses a post-processor that uses at least one criterion different from the criteria used by Davenport’s ASR to generate its n-best list. Ex. 1003 ¶¶ 119–128. Dr. Beigi explains that Davenport’s ASR generates an n-best list with “its own ‘language model’ and/or ‘acoustic model.’” Id. ¶ 123. Dr. Beigi further explains that Davenport’s “post-processor, however, uses a different criteria: (a) its own minimum confidence threshold, and (b) a threshold for how close the top scores are. These criteria are not language models or acoustic models in the ASR. They are much simpler and straightforward numerical criteria.” Id. ¶ 125. Patent Owner deposed Dr. Beigi about his testimony that Davenport’s post-processor uses criteria different from the criteria used by the ASR to generate the n-best list. Ex. 2003, 92:19–102:15. For example, Patent Owner asked Dr. Beigi if “Davenport’s ASR use the scores that it produces?” Id. at 96:13–14. Dr.
Beigi answered:

It uses the scores that it produces, but with its own logic. So the criteria are different. The scores are pieces of information, so the ASR can use the scores in any way it wants, but once it puts out the N-best list, it’s already put out whatever it did. So it’s applied all its criteria, basically. Now the N-best list is at Davenport’s disposal. Now Davenport continues and applies further criteria, which those further criteria weren’t applied. Had they been applied, he wouldn’t need apply them again. Right? It’s just very straightforward that this is a different criterion that’s being applied. Otherwise, there is no need for that criteria to be applied anymore. Davenport is coming up with the threshold or a percentile that’s making assumptions, and those assumptions are the criteria that it’s using.

Id. at 96:15–97:11.

Patent Owner also asked Dr. Beigi “[i]s it impossible that the ASR uses the same threshold that Davenport’s post-processor uses?” Id. at 98:6–8. Dr. Beigi answered that “it’s very low probability. But it’s not impossible.” Id. at 98:10–12. Citing Dr. Beigi’s deposition testimony in combination with other evidence, Patent Owner’s Response argues that the Petition did not show that the thresholds Davenport’s system uses to generate lists A and B are not also used by Davenport’s ASR when generating recognition results. PO Resp. 22–23. Patent Owner asserts that the Petition does not demonstrate express anticipation because “Davenport’s ASR system is a black box.” Id. Patent Owner argues that the Petition does not demonstrate inherent anticipation because Dr. Beigi admitted a possibility of Davenport’s ASR system using the same thresholds that Davenport uses to generate lists A and B. Id. at 23.
Patent Owner concludes that “the Petition wholly fails to establish that the alleged criteria Davenport uses to evaluate recognition results are different from the criteria used by the ASR system in determining the results.” Id. In its Reply, Petitioner maintains that it demonstrates Davenport’s system uses different criteria in post-processing for generating lists A and B than used by the ASR to generate the n-best list of recognition results. Pet. Reply 8–12. Arguing that it need not completely rule out the possibility that the ASR could use the same criteria as the post-processing, Petitioner asserts that Patent Owner’s “only supporting evidence that Davenport’s ASRs might have used the same criteria is an expert declaration that is due no weight because it simply repeats Nuance’s legally flawed and conclusory arguments.” Id. at 10. Petitioner elaborates that Dr. Beigi “explained why a [person of ordinary skill in the art] would have understood from Davenport that the ASRs did not apply [the same criteria as the post-processing]. If the ASR had already applied these criteria, then Davenport ‘wouldn’t need [to] apply them again’ in its post-processor.” Id. at 11 (citing Ex. 2003, 97) (alteration in original). Consistent with this, Dr. Beigi testifies that “a [person of ordinary skill in the art] would have understood that, as a practical matter, Davenport’s ASR does not use the same criteria as its post-processor.” Ex. 1047 ¶ 29. Dr. Beigi also testifies that the ’398 patent and various other disclosures show post-processor criteria like those used by Davenport and reflect a recognition in the art that it would not have made sense for post-processing to reuse criteria already used by the ASR. Id. ¶ 31. In its Sur-reply, Patent Owner maintains that the Petition did not show Davenport discloses using a criterion in the post-processing that was not used to generate recognition results. Sur-reply 10–12.
Patent Owner criticizes the Reply as “attempt[ing] to shift the burden to [Patent Owner]” (id. at 11) and relying on procedurally improper arguments and evidence that should have been advanced in the Petition to establish a prima facie case of unpatentability (id. at 10–12). For example, Patent Owner argues that “Dr. Beigi’s deposition testimony that Davenport ‘wouldn’t need’ to use the same thresholds (Reply, 11) was not cited in the Petition and cannot be used belatedly to make the Petition’s prima facie case.” Id. at 11–12. We find Patent Owner’s arguments that the Petition fails to sufficiently set forth evidence and arguments establishing a prima facie showing of unpatentability to be unavailing. Sur-reply 11–12. “[I]n an [inter partes review], the petitioner ha[s] the burden from the onset to show with particularity why the patent it challenges is unpatentable.” Harmonic Inc. v. Avid Tech., Inc., 815 F.3d 1356, 1363 (Fed. Cir. 2016). This burden never shifts to the patent owner. Dynamic Drinkware, LLC v. Nat’l Graphics, Inc., 800 F.3d 1375, 1378 (Fed. Cir. 2015). Here, we find Petitioner has shown from the onset with particularity why Davenport discloses the claim limitation at issue. As we discussed above, the Petition sets forth expressly the position that the post-processor in Davenport uses criteria different from that used by the ASR, stating that a post-processor “[u]sing the same criteria as the ASR would not have made sense.” Pet. 7. Also, citing Dr. Beigi’s testimony, the Petition asserts elsewhere that Davenport’s post-processing uses at least one criterion different from the criteria used by its ASR to generate recognition results. Pet. 14–18; Ex. 1003 ¶¶ 119–128. To the extent Patent Owner takes issue with Dr. Beigi’s testimony during his cross-examination, we do not rely on this testimony in determining that the Petition sets forth with particularity why Davenport discloses the claim limitation at issue.
However, we do consider this testimony in assessing whether Petitioner has shown unpatentability by a preponderance of the evidence. “There is no requirement, either in the Board’s regulations, in the APA, or as a matter of due process, for the institution decision to anticipate and set forth every legal or factual issue that might arise in the course of trial,” and “[t]he purpose of the trial in an inter partes review proceeding is to give the parties an opportunity to build a record by introducing evidence—not simply to weigh the evidence of which the Board is already aware.” Genzyme Therapeutic Prods. Ltd. P’ship v. Biomarin Pharm. Inc., 825 F.3d 1360, 1366–67 (Fed. Cir. 2016). Dr. Beigi’s deposition testimony was given in response to cross-examination by Patent Owner’s counsel, and we find it is proper for the Board to consider this testimony. Moreover, Patent Owner had the opportunity to address this testimony in the Sur-reply, which was filed approximately five months after Dr. Beigi’s deposition. Patent Owner also argues that Davenport’s ASR could possibly use the same thresholds that its post-processing does to generate lists A and B. PO Resp. 23; Sur-reply 11–12. Assuming arguendo that Patent Owner is correct, at best this would show that, theoretically, it would have been possible to construct Davenport’s ASR to use these thresholds before reusing them in post-processing to generate lists A and B. This, however, neither shows that a skilled artisan would have understood Davenport to disclose using the same thresholds in post-processing and the ASR, nor outweighs Petitioner’s evidence, discussed above, that a person of ordinary skill in the art would have understood Davenport as disclosing that the post-processing uses thresholds not used by the ASR when generating recognition results.
(4) Summary for Independent Claims 1, 12, and 15

In sum, Petitioner demonstrates, by a preponderance of the evidence, that Davenport anticipates independent claims 1, 12, and 15.

b. Claims 2 and 16

Depending from claims 1 and 15, respectively, claims 2 and 16 recite “wherein the evaluating the two or more results using the at least one criterion comprises evaluating the two or more results for medically-meaningful discrepancies between the two or more results.” Ex. 1001, 51:25–28; 53:35–38. Petitioner notes that the claims and the Specification “do not recite any particular methodology for finding medically meaningful discrepancies.” Pet. 21. With this background, Petitioner asserts that a person of ordinary skill in the art would understand that “Davenport, like Lai, identifies meaningful discrepancies in any domain—including the medical field—by evaluating the ASR’s scores using its own criteria.” Id. at 22. Petitioner elaborates that Davenport teaches identifying potential error in circumstances where the potential recognition results include the terms “eight,” “ate,” and “bait” with relatively close scores. Id. Petitioner asserts that “[t]hese are meaningful discrepancies in any domain, including the medical field, as a comparison between ‘eight broken bones’ and ‘ate broken bones’ or ‘eight stitches’ and ‘ate stitches’ readily reveals.” Id. at 23. Citing Lai as evidence, Petitioner also indicates that Lai and Davenport each operate “by evaluating the ASR’s scores to detect errors such as ‘there are signs of cancer’ that should state ‘there are no signs of cancer.’” Id. at 22. Petitioner argues that there was an assertion during the prosecution history of the ’398 patent that such an error constitutes a medically-meaningful discrepancy. Id. In its Response, Patent Owner argues Petitioner does not show Davenport would find medically-meaningful discrepancies. PO Resp. 24–30.
Patent Owner argues that Petitioner does not show Lai or Davenport actually would identify potential confusion between “no signs of cancer” and “signs of cancer.” Id. at 24. Patent Owner argues that Petitioner also fails to show Davenport actually would identify potential errors in the other hypotheticals raised, such as “eight broken bones” versus “ate broken bones.” Id. at 25–30. Patent Owner also argues that even if Davenport’s system finds discrepancies, and some of those discrepancies happen to be medically-meaningful, Petitioner does not show that Davenport evaluates “for medically-meaningful discrepancies.” Id. at 30–31. Patent Owner contends that the Specification explains that identified discrepancies may or may not be medically-meaningful. Id. at 30. In its Reply, Petitioner indicates that Patent Owner overestimates the required showing and underestimates Davenport’s disclosure. Pet. Reply 12–17. Petitioner contends that “the issue is whether a [person of ordinary skill in the art] would have understood that Davenport taught the necessary likelihood determinations.” Id. at 14. Petitioner argues that Patent Owner admitted a person of ordinary skill in the art would have recognized that “Davenport’s ASR’s internal models would be trained for the relevant domain (e.g., medical) before use so that they could determine the likelihood of recognized words in that domain.” Id. at 13. Petitioner maintains that Lai and Davenport show that “Davenport would identify medically-meaningful discrepancies.” Id. at 13–15. Petitioner asserts that Patent Owner relies on an overly narrow “moving-target construction” of the claims. Id. at 15–17. Petitioner argues the claims “require finding ‘potential errors in the recognition results that may be semantically significant’ in the medical field.” Id. at 15–16.
In its Sur-reply, Patent Owner maintains that Petitioner has not shown Davenport’s system would detect discrepancies that happen to be medically-meaningful. Sur-reply 12–15. Patent Owner explains that “Davenport only generates alerts under particular conditions, and the Petition provides no evidence that any medically-meaningful discrepancies would meet Davenport’s alert conditions.” Id. at 14. Patent Owner also maintains that even if Davenport’s process identifies discrepancies that happen to be medically-meaningful, Petitioner has not shown that Davenport discloses “evaluating for medically meaningful discrepancies.” Id. at 15–16. Patent Owner explains that “[t]he relevant issue is whether the system evaluates for discrepancies classified as medically-meaningful specifically, as opposed to merely identifying discrepancies generally.” Id. at 16. Petitioner’s explanation of how Davenport allegedly anticipates dependent claims 2 and 16 does not account adequately for all of the words in the claim language “evaluating the two or more results for medically-meaningful discrepancies.” See In re Wilson, 424 F.2d 1382, 1385 (CCPA 1970) (“All words in a claim must be considered in judging the patentability of that claim against the prior art.”). As noted above, Petitioner argues that Davenport “identifies meaningful discrepancies in any domain—including the medical field—by evaluating the ASR’s scores using its own criteria.” Pet. 22. Thus, Petitioner assumes, without persuasive support, that the claims only require finding meaningful discrepancies, regardless of domain. This assumption reads at least the word “medically” out of the recited claim language, which recites “evaluating . . . for medically-meaningful discrepancies.” Ex. 1001, 51:27–28, 53:37–38 (emphasis added).
We disagree with Petitioner’s argument that identifying discrepancies in Davenport, such as between ate, eight, and bait, discloses the recitation “evaluating . . . two or more results for medically-meaningful discrepancies between the . . . results.” Pet. 21–23. First, we disagree with Petitioner’s assertion that Davenport “identifies meaningful discrepancies in any domain.” Id. at 22. Davenport does not describe identifying discrepancies that are meaningful for a domain. The Specification of the ’398 patent explains that within a particular context, i.e., domain, some misrecognition errors may be significant and have serious consequences, whereas other misrecognition errors are not significant and do not have serious consequences. Ex. 1001, 3:23–28. Moreover, misrecognition errors that are significant in one domain may not be significant in a different domain, according to the Specification. Id. at 4:27–42. Accordingly, the Specification discloses, in some aspects of the invention, providing indications of potential significant errors that “may include discrepancies between recognition results that are meaningful for a domain, such as medically-meaningful discrepancies.” Id. at 4:63–66. In contrast, Davenport does not disclose evaluating results that are meaningful for a particular domain. For example, Petitioner relies on Davenport’s disclosure of list B recognition results “eight,” “ate,” and “bait,” but Petitioner does not cite to, nor do we find, disclosure in Davenport of any evaluation of whether the discrepancies between these three results are meaningful within a particular domain, much less for a medical domain. Indeed, the disclosure of Davenport does not appear to be concerned with domains at all. Petitioner nonetheless posits that the difference between the words “ate” and “eight” could be medically meaningful, Pet.
22–23, but we find this unpersuasive because Davenport does not evaluate these results for medically-meaningful discrepancies. We also find unpersuasive Petitioner’s argument that the Specification does not constrain how the evaluation of recognition results occurs. E.g., Pet. 21–22. Although the Specification discloses that any suitable criteria may be used to evaluate for medically-meaningful discrepancies, Ex. 1001, code (57), this does not mean we should read out of the claim language that the evaluation must be for medically-meaningful discrepancies. As argued by Patent Owner, the Specification clearly conveys that not all discrepancies are meaningful in the medical domain. See PO Resp. 30; Sur-reply 15–16. Moreover, the Specification conveys that the set of discrepancies that would be meaningful differs in different domains. This is conveyed, for example, in the following passage from the Specification:

Further, a set of words/phrases may include words/phrases that, when confused by an ASR system, would change a meaning of a recognition result in a way that would be semantically significant (e.g., significant in a domain). For example, while the words “to” and “too” may be acoustically similar and confusable by an ASR system and an incorrect substitution of one for the other would be erroneous, the error may not have serious consequences in some domains if not corrected, so that these words would not be considered significant in these domains. Conversely, as discussed above, in the medical context if the word “malignant” is erroneously substituted for the word “non-malignant” or vice versa, a patient may be improperly treated, which may have serious consequences for the patient and the medical institution.

Ex. 1001, 19:34–48 (emphasis added).
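By way of contrast with a domain-neutral closeness test, an evaluation directed to a particular domain could be sketched as a check against a set of word pairs deemed significant in that domain. The sketch below is purely hypothetical and is not taken from the ’398 patent, Davenport, or the record: the names `MEDICAL_SIGNIFICANT_PAIRS` and `has_domain_meaningful_discrepancy` are invented for illustration, and the single listed pair is simply the “malignant” and “non-malignant” example from the passage quoted above.

```python
# Assumed, illustrative set of word pairs treated as significant in a medical domain.
MEDICAL_SIGNIFICANT_PAIRS = {frozenset({"malignant", "non-malignant"})}


def has_domain_meaningful_discrepancy(results, significant_pairs):
    """Return True if the alternative results contain both members of any pair
    that the given domain treats as significant when confused."""
    words = set(results)
    return any(pair <= words for pair in significant_pairs)


# Alternatives that differ in a way the assumed medical set treats as significant.
medical_hit = has_domain_meaningful_discrepancy(
    ["malignant", "non-malignant"], MEDICAL_SIGNIFICANT_PAIRS)

# Confusable alternatives that the assumed medical set does not list.
to_too_hit = has_domain_meaningful_discrepancy(
    ["to", "too"], MEDICAL_SIGNIFICANT_PAIRS)
```

Under these assumptions the outcome turns on the contents of the domain-specific set, not on how close the recognizer’s scores are.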
In view of this, we do not agree with Petitioner’s assumption that the claims only require indiscriminately finding meaningful discrepancies, some of which would happen to be meaningful in the medical domain.

Additionally, we find unavailing Petitioner’s allegation that Patent Owner presents a “moving-target construction.” We require Petitioner to explain “[h]ow the challenged claim is to be construed” and “[h]ow the construed claim is unpatentable.” 37 C.F.R. § 42.104(b)(3), (4). And Petitioner bears the ultimate burden of persuasion. 35 U.S.C. § 316(e).

We also find unavailing Petitioner’s observations regarding Patent Owner’s infringement contentions in the district court. Petitioner argues that Patent Owner identifies gender and laterality as medically-meaningful discrepancies. Pet. 22. This assertion does not help Petitioner because Petitioner does not present persuasive evidence that Davenport evaluates for gender or laterality discrepancies.

Additionally, we find unavailing Petitioner’s assertion that “Davenport’s ASR’s internal models would be trained for the relevant domain (e.g., medical) before use so that they could determine the likelihood of recognized words in that domain.” Pet. Reply 13. Even if accurate, this assertion addresses models that Petitioner does not identify as used in the “evaluating” recited in claims 1, 2, 15, and 16. Petitioner asserts that Davenport’s ASR generates the “results of a recognition” recited in claims 1 and 15. Pet. 14–17. The Petition alleges that the “evaluating” recited in claims 1, 2, 15, and 16 occurs in “post-processing,” not in the ASR. Id. at 17–18, 21–23. Thus, even if the ASR has provisions for domain-specific identification of meaningful discrepancies, this does not demonstrate use of those provisions in the post-processing that the Petition alleges to do the “evaluating” of claims 1, 2, 15, and 16.
For the foregoing reasons, Petitioner does not demonstrate by a preponderance of the evidence that Davenport anticipates claims 2 and 16.

c. Claims 3, 4, 9, 10, 13, 17

The Petition explains in detail why Petitioner contends, and we find based on Petitioner’s showing, that Davenport discloses each of the limitations recited in these claims. Pet. 21–25. Aside from the arguments, addressed above, regarding the patentability of independent claim 1, Patent Owner does not present arguments specific to dependent claims 3, 9, or 10. Our review of the full record in this proceeding persuades us that Davenport anticipates claims 3, 9, and 10.

Patent Owner does argue claims 4, 13, and 17 separately from the independent claims. PO Resp. 31. Patent Owner argues that “the Petition (at 23) alleges that Davenport anticipates claims 4, 13, and 17 merely ‘for the same reasons as claims 2 and 16,’ which . . . are fatally flawed.” Id. We disagree with Patent Owner’s characterization of the Petition. Although it states that claims 4, 13, and 17 “are anticipated for the same reasons as claims 2 and 16,” it provides additional explanation. Pet. 23. It explains Petitioner’s contention that the disclosures of Davenport discussed in connection with claims 2 and 16 meet the specific limitations of claims 4, 13, and 17 because “Davenport evaluates the two or more results for errors that ‘may cause’ a change in meaning, such as its ate-eight-bait example.” Id. In other words, rather than repeating all of the Davenport disclosures cited as teaching the limitations of claims 2 and 16, the Petition refers back to them and explains that those disclosures meet the limitations of claims 4, 13, and 17. Our review of the evidence persuades us that although they do not meet the limitations of claims 2 and 16, the disclosures of Davenport cited by Petitioner do meet the broader limitations of claims 4, 13, and 17.
In sum, Petitioner has shown, by a preponderance of the evidence, that Davenport anticipates claims 3, 4, 9, 10, 13, and 17.

C. Alleged Obviousness over Davenport and Lai

1. Overview of Lai

Lai’s disclosure relates to speech recognition systems’ user interfaces. Ex. 1022, 1:7–8. Lai teaches that it can be difficult to detect errors when dictating text with speech recognition technology. Id. at 2:25–32. Lai explains, “[f]or example, the sentence ‘there are no signs of cancer’ can become ‘there are signs of cancer’ through a deletion error. This type of error can be easy to miss when quickly proof reading a document.” Id. at 2:29–32.

2. Discussion

Petitioner asserts that it would have been obvious in view of Lai to use Davenport in the medical realm, and that doing so would meet the limitations of claims 2 and 16. Pet. 25–26. Petitioner elaborates that

[a person of ordinary skill in the art] would have been motivated to use Davenport in the medical domain and would have reasonably expected to successfully identify medically meaningful discrepancies such as . . . ‘eight/ate broken bones’ and ‘eight/ate stitches’ . . . . Lai teaches that general-purpose approaches such as Davenport’s were used in the medical domain because such errors are easy for a human to miss but important to correct. “For example, the sentence ‘there are no signs of cancer’ can become ‘there are signs of cancer’ through a deletion error. This type of error can be easy to miss when quickly proofreading a document.” Lai 2:25-33; Beigi Decl. ¶¶ 138-141. Thus, a [person of ordinary skill in the art] would have been motivated to use Davenport in the medical field because medically meaningful errors in a medical transcription can have ‘dire consequences’ for the patient. E.g., Chang (Ex. 1012) at 724-25; Beigi Decl. ¶ 140. Indeed, [Patent Owner] acknowledged this fact when it obtained the ’398 patent—and gave the same cancer vs.
no-cancer example as Lai. See ’398 File History at 60.

Pet. 25–26. Petitioner also asserts that Davenport and Lai are analogous art. Id. at 26.

This challenge of claims 2 and 16 extensively overlaps with Petitioner’s challenge of claims 2 and 16 as allegedly anticipated by Davenport. The significant difference between the two challenges is that the obviousness challenge relies on Lai as having allegedly provided motivation to use Davenport in the medical domain. Compare Pet. 21–23 with Pet. 25–26. Assuming it would have been obvious to use Davenport in the medical domain, for substantially the reasons explained above in Section III.B.3.b, we find that Petitioner has not addressed adequately how the prior art meets the claims’ limitation of “evaluating the two or more results for medically-meaningful discrepancies between the two or more results.” Consequently, Petitioner has not demonstrated by a preponderance of the evidence that claims 2 and 16 would have been obvious over Davenport and Lai.

D. Alleged Obviousness over Baker and Baker-613

1. Overview of Baker

Baker discloses improving on a base speech recognition system without changing the base speech recognition system. Ex. 1006 ¶ 56. Baker explains that its system could function as a product added onto a commercially available recognition system. Id. Baker discusses an example in connection with Figure 1, which is reproduced below.

Figure 1 is a flowchart of a process according to one embodiment of Baker. Id. ¶ 20. The process begins at block 100, which involves obtaining an output hypothesis produced by a base speech recognition process. Id. ¶ 57. The output hypothesis may include a series of speech elements “such as the base speech recognition system would send to any application program as the recognition system’s choice corresponding to a given interval of speech.” Id.
At block 110, the system may obtain a set of alternative hypotheses. Id. ¶ 58. This may involve obtaining from the base speech recognition system choices considered to be almost as likely as the chosen sequence. Id. Additionally, the system may receive “the evaluation score for the top choice and any alternate choices.” Id. At block 120, the external system may use “a second set of different scoring models that is separate from and external to the base speech recognition process” to score the top choice and alternative hypotheses. Id. ¶ 59. For example, the external system may use “its own acoustic models and language model, external to the base speech recognition system, to rescore each hypothesis on the list of alternate choices that has generated.” Id. The hypothesis with the best score is then chosen in block 130. Id.

Baker provides further details regarding one embodiment of the process of rescoring hypotheses in connection with Figure 2, which is reproduced below.

Figure 2 is a flowchart illustrating one embodiment of a process involving rescoring hypotheses with a second set of scoring models. Id. ¶¶ 61–63. The process illustrated in Figure 2 “is premised, at least in part, on obtaining alternate hypotheses and scores therefor from the base speech recognition process, where the scores are based on the scoring models used in the base speech recognition process.” Id. ¶ 61. Block 200 involves selection of a diminished set of alternative hypotheses that have good scores, as ascertained by the scoring models used by the base speech-recognition process. Id. Two of the chosen hypotheses are compared at block 210 to identify which speech element or elements differ between the hypotheses. Id. ¶ 62. “[U]sing the second set of scoring models,” one or more of the differing speech elements are rescored at block 220. Id. ¶ 63.
At block 230, the hypothesis having the best score is chosen from the rescored hypotheses. Id. ¶ 64.

Baker discusses an alternate embodiment in connection with Figure 3, which is reproduced below.

Figure 3 is a flowchart illustrating a different process of identifying a hypothesis with a best score. Id. ¶¶ 65–69, Fig. 3. At block 300, “a confusable one or more speech elements in the output hypothesis are detected.” Id. ¶ 65. This may involve referencing “a database of confusable elements or elements that are often deleted in speech.” Id. Block 310 involves obtaining for at least one of the identified confusable speech elements an alternative speech element. Id. ¶ 66. This may involve obtaining the alternative speech element from the database of confusable or often deleted speech elements. Id. Utilizing the alternative speech element, a new hypothesis is generated at block 320. Id. ¶ 67. Block 330 involves scoring the new hypothesis, which may be done “using the second set of scoring models.” Id. ¶ 68. The hypothesis having the best score is chosen at block 340. Id. ¶ 69. Baker also discloses alternate processes of identifying best-scoring hypotheses in connection with Figures 4 and 5. Id. ¶¶ 72–75.

Referring back to Figure 1, once “the rescoring system has accepted or corrected the top choice speech element sequence the new, possibly corrected sequence will be presented to the user or sent to an application program in block 140, as if it had come directly from the base recognition system.” Id. ¶ 76. Block 150 involves collecting “error correction data or other feedback information” from the user. Id. ¶ 77. At block 160, the information could be applied to improve the base speech recognition procedure and/or the second set of scoring models. Id. ¶ 79.

2. Overview of Baker-613

Baker-613 discloses using two different speech recognizers to process speech input. Ex. 1007, code (57).
Baker-613 illustrates a system with two speech recognizers in Figure 3, which is reproduced below.9 Id. at 4:29–30.

Figure 3 shows a system that includes real-time recognizer 303, offline recognizer 309, combiner 311, and offline transcription station 313. Id. at 5:64–66, 6:13–18, 6:40–44. Real-time recognizer 303 receives a speech detection sample from microphone 301. Id. at 5:64–66. Real-time recognizer 303 does speech recognition on the sample in real time and delivers recognized text to monitor 305. Id. at 6:3–6.

9 Specifically, the version of Figure 3 reproduced below is the one appearing in the Certificate of Correction for Baker-613. Ex. 1007, 25–26.

The speech sample from real-time recognizer 303 may also be transmitted to offline recognizer 309, as well as combiner 311. Id. at 6:13–18. Offline recognizer 309 may then perform independent recognition analysis, “for example, using a LVSCR recognizer such as the HTK system.” Id. at 6:29–32. Offline recognizer 309 may transmit its recognition results to combiner 311. Id. at 6:29–35. Having received both sets of recognition results, combiner 311 “processes the results by generating a combined set of recognition results and by checking for instances of uncertainty by one or both of the recognizers or discrepancies between the results produced by the two recognizers.” Id. at 6:35–40. Combiner 311 transmits “the speech sample and the combined set of recognition results, including information that identifies instances of recognition uncertainty or disagreement, to the offline transcription station 313.” Id. at 6:40–44. Using the speech sample and the input from combiner 311, a human operator at offline transcription station 313 “generate[s] an essentially error-free transcription of the speech sample.” Id. at 6:44–48.

3. Discussion

Petitioner identifies where it contends, and we find Petitioner has shown, how Baker and Baker-613 teach each limitation of claims 1–4, 9, 10, 12, 13, and 15–17. Pet. 28–55. Specifically, Petitioner argues that Baker teaches most of the limitations of claims 1–4, 9, 10, 12, 13, and 15–17. Id. For example, Petitioner asserts that Baker teaches the “evaluating” step of independent claim 1. Id. at 29–33. Petitioner contends that Baker determines two or more speech recognition results with first scoring models, followed by using a different set of scoring models to rescore each of the speech-recognition results. Id. at 31. Petitioner argues that Baker “selects and presents to the user the best of the n-best hypotheses.” Id. at 35.

Petitioner notes that Baker does not teach triggering an alert. Id. at 33. Petitioner argues, however, that “Baker and Baker-613 together” teach rescoring the ASR’s results with at least one criterion to trigger an alert. Id. Petitioner explains that Baker-613 teaches triggering an alert when it detects uncertainty in speech recognition results. Id. at 36–37, 41. Petitioner cites Baker-613 as teaching identification of uncertainty when an ASR produces recognition results with close scores or when two ASRs reach different results. Id. at 36–37. Petitioner argues that a person of ordinary skill in the art would have applied Baker-613’s teachings to Baker to identify instances of uncertainty to which the user should be alerted. Id. at 37. Petitioner argues that the criteria for identifying uncertainty “are applicable whether Baker individually scores each of the n-best list or applies its discriminative models.” Id. Petitioner also argues that “a [person of ordinary skill in the art] would have understood that post-processing systems such as the Baker combination typically identify recognition uncertainty . . . if all of the candidates’ scores fail to meet a minimum threshold.” Id. at 38.
Petitioner asserts, and we are persuaded, that a person of ordinary skill in the art “would have been motivated to combine Baker with Baker-613 to identify and alert the user to ensure that the user does not miss an important potential error rather than simply providing the best-known result with no indication of the system’s uncertainty.” Id. at 41. Petitioner alleges various considerations that would have contributed to this motivation. Id. at 41–45.

We turn now to detailed discussions of the disputes raised by the parties’ arguments.

a. Whether It Would Have Been Obvious to Combine Baker and Baker-613

Patent Owner argues that Petitioner does not advance a supportable reason that a person of ordinary skill in the art would have applied the alerts of Baker-613 in Baker’s system. PO Resp. 42–50. Patent Owner argues that Baker’s rescoring process and base speech recognition process relate to one another differently than Baker-613’s two different speech recognizers. Id. at 44. Patent Owner argues that “Baker-613 uses two parallel recognizers that perform the same task—a ground-up recognition on speech input, seeking to identify the most likely recognition candidates from an unlimited universe of possibilities.” Id. On the other hand, Patent Owner contends that “Baker uses serial recognition stages that do not perform the same task.” Id. Patent Owner asserts that the criteria used by Baker-613 to identify uncertainty do not apply to Baker because Baker’s serial recognition processes only improve the recognition results from the base speech recognition. Id. at 45–46. With Baker’s system only improving the results and only outputting the improved results, close results and changes during the improvement process do not indicate any uncertainty, Patent Owner argues. Id. at 45–48.
Additionally, Patent Owner contends that Petitioner fails to objectively support its assertion that low scores or certain circumstances during discriminative evaluation would indicate uncertainty. Id. at 48–50.

In its Reply, Petitioner maintains that the criteria used to signal uncertainty in Baker-613 would also signal uncertainty in Baker. Pet. Reply 20–23. Petitioner disagrees with Patent Owner’s suggestion that Baker’s process for generating more accurate results eliminates the uncertainty present in Baker-613. Petitioner explains that just as Baker’s post-processing produces more accurate recognition results than the base speech recognition, Baker-613’s offline recognizer generates more accurate results than its real-time recognizer, yet flags uncertainty in its results. Id. at 20. Petitioner argues that Patent Owner did not support its arguments that Baker’s system would not present the same uncertainty concerns as Baker-613’s. Id. at 19. Petitioner also maintains that the evidence supports its assertions regarding uncertainty in instances of low scores and certain circumstances related to discriminative modeling. Id. at 21, 23.

In its Sur-reply, Patent Owner maintains that Baker’s system differs from Baker-613’s fundamentally. Sur-reply 18–22. Patent Owner explains that Baker-613’s two ASRs operate in parallel, whereas Baker’s processes occur serially. Id. at 18–19. Patent Owner argues that Baker’s post-processing improves the results of the base speech recognition, eliminating uncertainty. Id. at 19–21.

Petitioner persuades us that a person of ordinary skill in the art would have had reason to combine Baker-613’s teachings regarding flagging uncertainty with the system taught in Baker. As Petitioner asserts, each reference teaches a system with two different speech-recognition evaluations, one of which tends to produce more accurate results. Pet. Reply 20.
Baker-613 teaches that such systems may detect and flag uncertainty when either of the speech-recognition processes (including the ostensibly less accurate process) identifies closely scored results, or when the two speech-recognition processes identify different results. Ex. 1007, 11:12–30; Fig. 13. These similarities strongly support Petitioner’s position that a person of ordinary skill in the art would have recognized that the criteria used by Baker-613 would lend themselves to identifying and flagging uncertainty in the results produced by Baker.

Patent Owner’s arguments about differences between the references do not undermine Petitioner’s position. Patent Owner’s observation that one reference processes results in parallel and the other processes serially does not diminish Petitioner’s observation that each reference has two separate processes, one of which ostensibly produces more accurate results. Sur-reply 18–19; Pet. Reply 20. Petitioner persuades us that a person of ordinary skill in the art would glean from Baker-613 that certain aspects of both the more-accurate and the less-accurate process indicate uncertainty. Patent Owner does not provide persuasive evidence that this understanding would apply only to parallel processing, as in the Baker-613 reference.

We also find unavailing Patent Owner’s observation that Baker never outputs the results of the base recognition process to a user. PO Resp. 46–47. We agree with Patent Owner that this aspect of Baker differs from Baker-613, which outputs the results of both real-time recognizer 303 and offline recognizer 309, as well as combining the results for presentation to a transcriptionist. But Baker-613 flags uncertainty in the combined results prepared for presentation to a transcriptionist. Ex. 1007, 11:12–30.
This teaching of flagging uncertainty in the results prepared for presentation to a transcriptionist would apply to the results that Baker outputs from its post-processing, regardless of whether Baker outputs the results of its base speech-recognition.

In sum, Petitioner persuades us that it would have been obvious to combine the teachings of Baker-613 with those of Baker in the manner proposed by Petitioner.

b. Whether the Combination Meets the Limitations of the Claims

(1) Independent Claims 1, 12, and 15

(a) Whether the Recognition Results of Baker’s Base Speech Recognition Constitute “Results of a Recognition, by an Automatic Speech Recognition (ASR) System on Speech Input”

The Petition asserts that Baker’s base speech recognition process outputs “two or more results of a recognition, by an automatic speech recognition (ASR) system on a speech input,” as recited in claims 1, 12, and 15. Pet. 30–33. Patent Owner disagrees. PO Resp. 52–56. Patent Owner argues that the Specification clearly conveys that the ASR system “process[es] input speech to yield a recognition result,” including a “top recognition result” “present[ed] . . . via a display” or in the “text of a document,” and can be composed of “one or more systems that apply any suitable ASR technique(s) to speech input to perform a speech recognition on the speech input.” Id. at 54 (alterations in original). In view of this, Patent Owner argues that the results of Baker’s base speech recognition process cannot correspond to the claimed “results of a recognition, by an automatic speech recognition (ASR) system” because “Baker’s system does not output any hypothesis of the base recognition as the recognized text, and instead outputs to the user only the results of the rescoring as the text recognized by the ASR system.” Id. at 55.
In its Reply, Petitioner maintains that the results of Baker’s base speech recognition process constitute “results of a recognition,” as recited in the independent claims. Pet. Reply 23–25. Petitioner notes that Patent Owner “admits ‘Baker uses a commercially available ASR system as its base recognizer.’” Id. at 24. Petitioner argues that the Specification discusses optionally presenting an ASR’s output to the user, but does not require doing so. Id. Indeed, Petitioner argues, the ’398 patent conveys that an ASR system need not display text to a user. Id. Petitioner explains that “[l]ike Baker, the patent displays text to the user after post-processing.” Id.

Patent Owner responds that the ’398 patent teaches displaying the top recognition result from the ASR system. Sur-reply 24. In contrast, Patent Owner argues, Baker’s base process and rescoring cooperate to generate the recognized text, making them internal components of a single ASR system. Id.

We find Petitioner’s arguments and evidence persuasive. Dr. Beigi testifies that “Baker’s ‘base recognition process’ (sometimes called ‘first speech recognition’) corresponds to the claimed ASR. The ’398 patent uses the term ASR as it is commonly understood: it is common automatic speech recognition systems that ‘process input speech to yield a recognition result corresponding to the speech.’” Ex. 1003 ¶ 167 (quoting Ex. 1001, 1:16–19). Dr. Beigi notes that the ’398 patent explains that “[s]ome ASR systems may produce as output from a speech recognition process results formatted as an ‘N best’ list of recognition (N being two or more) that includes a top recognition result and N–1 alternative recognition results.” Ex. 1001, 8:27–33. Dr. Beigi testifies that Baker’s base speech recognition process functions in this manner. Ex. 1003 ¶ 167. We find this and
Petitioner’s other evidence firmly supports the assertion that the results of Baker’s base speech recognition process constitute “results of a recognition, by an automatic speech recognition (ASR) system.” Pet. 29–33; Pet. Reply 23–25.

We find Patent Owner’s arguments and evidence unpersuasive. Patent Owner does not reconcile persuasively its admission that “Baker uses a commercially available ASR system as its base recognizer” with its other arguments. PO Resp. 54. Patent Owner contends that Baker’s base speech recognizer would be an ASR system according to the ’398 patent if and only if its top recognition result were displayed to a user. Id. But Patent Owner does not cite persuasive evidence that the ’398 patent requires that the top recognition result of an ASR system be displayed without further processing. We agree with Petitioner that the portions of the Specification cited by Patent Owner do not compel a conclusion that the recognition results of an ASR system must be displayed without further processing. Pet. Reply 24.

For the foregoing reasons, Petitioner persuades us that the results of Baker’s base speech recognition process constitute “results of a recognition, by an automatic speech recognition (ASR) system on a speech input,” as recited in independent claims 1, 12, and 15.

(b) “Evaluating Two or More Results of a Recognition, . . . , Using At Least One Criterion That Differs from Criteria Used by the ASR System in Determining the Two or More Results . . . .”

Each of independent claims 1, 12, and 15 recites

evaluating two or more results of a recognition, by an automatic speech recognition (ASR) system on a speech input, using at least one criterion that differs from criteria used by the ASR system in determining the two or more results, wherein the two or more results were identified by the ASR system as likely to be accurate recognition results for the speech input and comprise a first recognition result identified by the ASR system as most likely to be a correct recognition result for the speech input and at least one alternative recognition result identified by the ASR system as a potential recognition result; and in response to determining that the at least one criterion is met by the two or more results, triggering presentation, via a user interface, of an alert concerning one of the two or more results.

Ex. 1001, 51:9–24, 52:38–53, 53:18–34.

Petitioner contends that, in combination, Baker and Baker-613 meet these limitations of the independent claims. Pet. 28–45. Petitioner concedes “[w]hile Baker uses different models to re-score the ASR’s results, it does not teach doing so using at least one criterion that will also be used to trigger an alert . . . . However, Baker and Baker-613 together do.” Id. at 33. Petitioner cites Baker-613 as teaching identification of uncertainty when an ASR produces recognition results with close scores or when two ASRs reach different results. Id. at 36–37. Petitioner argues that a person of ordinary skill in the art would have applied Baker-613’s teachings to Baker to identify instances of uncertainty to which the user should be alerted. Id. at 37. Petitioner argues that the criteria for identifying uncertainty “are applicable whether Baker individually scores each of the n-best list or applies its discriminative models.” Id.
Petitioner also argues that “a [person of ordinary skill in the art] would have understood that post-processing systems such as the Baker combination typically identify recognition uncertainty . . . if all of the candidates’ scores fail to meet a minimum threshold.” Id. at 38.

In view of Baker-613, Petitioner argues that it would have been obvious to combine Baker and Baker-613 by applying Baker’s rescoring processes (that are explicitly different than those used by its ASR) and determining if (i) two or more top scores from the ASR or post-processor are very close, (ii) the ASR and the post-processor rescoring models did not all agree on one top result, and (iii) all of the scores and/or rescores fall below a minimum acceptable value. Id. at 39.

Patent Owner argues that, regardless of which of the identified uncertainty criteria is used, the combination of Baker and Baker-613 does not meet the limitations of independent claims 1, 12, and 15. PO Resp. 56–61. Patent Owner argues that the proposed Baker combination evaluates the scores of recognition results, not the recognition results. Id. Patent Owner also argues that Petitioner does not demonstrate that the Baker combination meets the limitation of “evaluating two or more results of a recognition, by an automatic speech recognition (ASR) system on a speech input, using at least one criterion that differs from criteria used by the ASR system in determining the two or more results.” Id. at 57–61. Patent Owner first argues that Petitioner does not demonstrate the scores used to identify uncertainty are not used by the ASR to identify recognition results. Id. at 57–58.
Patent Owner also argues that

[t]he claims require more than evaluating any two speech recognition hypotheses; they require evaluating results of a recognition by the ASR system, including a recognition result identified by the ASR system as most likely to be correct and an alternative recognition result identified by the ASR system. The Petition fails to establish that any of the five alert conditions meet these claim requirements.

Id. at 59.

Weighing the parties’ arguments and evidence, we find that the proposed combination of Baker and Baker-613 would meet the limitations of independent claims 1, 12, and 15. We first address the dispute raised by Patent Owner’s argument that the Baker combination evaluates the scores associated with recognition results, rather than evaluating the recognition results themselves. Id. at 56–57. As explained above in Section III.A.2, we conclude that the claims encompass evaluating recognition results by evaluating their associated scores. Accordingly, Patent Owner’s argument that the Baker combination evaluates the scores of recognition results, instead of evaluating the recognition results themselves, does not undermine Petitioner’s persuasive evidence that the Baker combination teaches “evaluating two or more results of a recognition.”

Next, we turn to the dispute raised by Patent Owner’s argument that Petitioner does not demonstrate the scores used in the evaluation were not also used in identifying the recognition results. Id. at 59. This argument is not commensurate in scope with the independent claims, which recite “evaluating . . . using at least one criterion that differs from criteria used by the ASR system in determining the two or more results.” E.g., Ex. 1001, 51:9–13.
Although this claim language requires using at least one criterion that was not used in determining the two or more results, it does not exclude also using some criteria that were used in determining the two or more results. Petitioner persuades us that the proposed combination of Baker and Baker-613 uses at least one criterion different from those used in determining the recognition results. For example, the Petition conveys that the combination would identify uncertainty by using not only the values of the scores, but also some separate criterion, such as a threshold value, for identifying whether the scores are “close together,” which indicates uncertainty. See, e.g., Pet. 39.

Next, we consider the dispute raised by Patent Owner’s argument that none of the five alert conditions meets the requirements of the claims. Patent Owner argues that the first alert condition of close scores from the base recognition process does not meet the claim limitations because “the base recognition’s hypotheses are not recognition results of the Baker Combination’s ASR system. Thus, they also are not identified by the ASR system as most likely to be correct and an alternative recognition result as claimed.” PO Resp. 59. This argument builds on Patent Owner’s argument that Baker’s base speech recognition process does not constitute an ASR system that generates recognition results. For the reasons explained above in Section III.D.3.b.(1).(a), we find this argument unpersuasive.

In sum, weighing the parties’ arguments, we find that the proposed combination of Baker and Baker-613 meets each of the limitations of independent claims 1, 12, and 15. In particular, we find persuasive at least Petitioner’s assertion that modifying Baker to identify uncertainty if its base speech recognition scores are close results in a system that meets each of the limitations of the challenged independent claims.
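For illustration only, the three uncertainty criteria that Petitioner proposes for the Baker combination (Pet. 39) can be sketched in code. This is a minimal sketch under assumptions that appear in neither Baker nor Baker-613: the names `Hypothesis` and `uncertainty_flags`, the higher-is-better score scale, and the values `close_margin` and `min_score` are all hypothetical, and the reading of each criterion is one plausible interpretation rather than the references’ implementation.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    text: str
    base_score: float  # hypothetical score from the base recognizer (higher is better)
    rescored: float    # hypothetical score from the external rescoring models


def uncertainty_flags(hyps, close_margin=0.05, min_score=0.5):
    """Return the set of (hypothetical) alert conditions met by an n-best list."""
    flags = set()
    if not hyps:
        return flags
    by_base = sorted(hyps, key=lambda h: h.base_score, reverse=True)
    by_rescore = sorted(hyps, key=lambda h: h.rescored, reverse=True)

    # (i) The two top scores from either stage are very close. The margin is a
    # separate threshold, not one of the scores themselves.
    if len(hyps) >= 2:
        base_gap = by_base[0].base_score - by_base[1].base_score
        rescore_gap = by_rescore[0].rescored - by_rescore[1].rescored
        if base_gap < close_margin or rescore_gap < close_margin:
            flags.add("close_scores")

    # (ii) The base recognizer and the rescoring models disagree on the top result.
    if by_base[0].text != by_rescore[0].text:
        flags.add("disagreement")

    # (iii) Every score and rescore falls below a minimum acceptable value.
    if all(h.base_score < min_score and h.rescored < min_score for h in hyps):
        flags.add("all_low")

    return flags


n_best = [
    Hypothesis("eight stitches", base_score=0.62, rescored=0.70),
    Hypothesis("ate stitches", base_score=0.60, rescored=0.40),
]
print(uncertainty_flags(n_best))  # the base scores differ by less than close_margin
```

In this sketch the alert turns on the separate `close_margin` threshold rather than on the scores alone, which mirrors the distinction drawn above between the score values and a separate criterion for deciding whether they are “close together.”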
(c) Summary for Independent Claims 1, 12, 15

In sum, Petitioner persuades us that the Baker combination teaches all of the limitations of independent claims 1, 12, and 15, and provides persuasive rationale to combine the references (Section III.D.3.a). For the reasons stated above, Petitioner demonstrates, by a preponderance of the evidence, that claims 1, 12, and 15 would have been obvious over Baker and Baker-613.

(2) Claims 2 and 16

As discussed in Section III.B.3.b above, the Petition asserts that the ’398 patent does not specify any particular approach for identifying medically-meaningful discrepancies. Pet. 48. Similar to the Petition’s arguments regarding the alleged anticipation of claims 2 and 16 by Davenport (discussed above in Section III.B.3.b), the Petition argues that “the Baker combination identifies meaningful discrepancies in any domain, including the medical field.” Id. at 49. The Petition elaborates that a person of ordinary skill in the art would understand that certain medically-meaningful discrepancies “could be detected by identifying when the ASR or post-processor results indicate uncertainty.” Id. The Petition argues that it would have been obvious to use the Baker combination in the medical field. Id. at 50. Given this, Petitioner contends that the Baker combination would have “evaluated its ASR’s two or more results for, and identified, medically meaningful discrepancies.” Id.

The Petition adds that “even if the claims are narrower—requiring a model that specifically evaluates the two or more results using criterion rooted in the medical domain, i.e., evaluating medical terminology in particular—the claims are still obvious.” Id. The Petition elaborates that Baker’s rescoring models have statistical language models that determine a result’s probability by referencing a language example from a relevant domain like the medical field. Id. at 51.
According to the Petition, “[t]hus, the Baker combination would be used in the medical domain . . . , and it already evaluates two or more results for medically meaningful discrepancies using criteria rooted in the medical domain.” Id.

In its Response, Patent Owner contends Petitioner does not demonstrate that the Baker combination would have been used in the medical domain. PO Resp. 61–63. Noting that Petitioner cites certain potential medical-field errors as the motivation for using the Baker combination in the medical field, Patent Owner argues Petitioner does not demonstrate that the Baker combination would identify those errors. Id.

Patent Owner also argues that Petitioner does not show that the Baker combination finds medically-meaningful discrepancies between recognition results. Id. at 63–65. Patent Owner argues that Petitioner relies on hypothetical examples of alternative recognition results with medically-meaningful discrepancies. Id. at 63–64. Patent Owner argues that Petitioner does not demonstrate that the hypotheticals have medically-meaningful discrepancies. Id. at 64. Patent Owner also asserts that Petitioner does not demonstrate that the Baker combination would identify the hypothetical results as having medically-meaningful discrepancies. Id. at 64–65.

Patent Owner also argues that even if the Baker combination finds discrepancies, and some of those discrepancies happen to be medically-meaningful, Petitioner does not show that the Baker combination evaluates “for medically-meaningful discrepancies.” Id. at 65. Patent Owner contends that the Specification explains that identified discrepancies may or may not be medically-meaningful. Id.
In its Reply, Petitioner contends that it “established that the Baker combination taught evaluating for meaningful discrepancies in any domain, including medical, especially because its language models take into account the speech’s context, determining whether they are likely to occur in the domain.” Pet. Reply 26–27. Petitioner argues that Patent Owner concedes “Baker’s language models would have been trained on a relevant corpus before use in a specialized domain (e.g., medical) so that they can determine the likelihood of the recognized words in the specialized context.” Id. at 27. Petitioner adds that “the expectation of success need only be reasonable, not absolute.” Id. at 28.

In its Sur-reply, Patent Owner maintains that Petitioner does not demonstrate the Baker combination would identify any medically-meaningful discrepancies. Sur-reply 26. According to Patent Owner, having models “trained on a relevant corpus” does not show that the Baker combination would detect medically-meaningful discrepancies. Id.

Like the challenge of claims 2 and 16 as allegedly anticipated by Davenport, the challenge of claims 2 and 16 as allegedly obvious over the Baker combination does not account adequately for all of the words in the claim language “evaluating the two or more results for medically-meaningful discrepancies.” For the same substantive reasons as discussed above in Section III.B.3.b, Petitioner’s assertion that the Baker combination meets claims 2 and 16 because “the Baker combination identifies meaningful discrepancies in any domain, including the medical field” reads at least one of the words “for” and “medically” out of the claims. Therefore, this assertion, as well as the associated supporting arguments and evidence, fails to carry Petitioner’s burden of showing unpatentability of claims 2 and 16.

We also find unavailing Petitioner’s argument regarding the statistical language models used by Baker’s rescoring models.
Claims 2 and 16 require “evaluating the two or more results for medically-meaningful discrepancies between the two or more results.” Ex. 1001, 51:27–28, 53:37–38 (emphasis added). According to Petitioner and Dr. Beigi, with the statistical language models, “a result’s probability is determined by comparing it to a language sample from the domain, such [as] the medical field or a particular specialty.” Pet. 51; Ex. 1003 ¶ 205. Neither Petitioner nor Dr. Beigi explains persuasively how determining a result’s probability, as Baker allegedly does with its statistical language models, constitutes evaluating for medically-meaningful discrepancies between two or more results. To the extent Baker’s statistical language models help adapt Baker to the medical domain in some general sense, such as by improving estimation of a recognized word’s probability in the medical domain, that does not explain how the Baker combination meets the specific requirements of claims 2 and 16.

For the foregoing reasons, Petitioner fails to demonstrate obviousness of claims 2 and 16 over Baker and Baker-613 by a preponderance of the evidence.

(3) Claims 3, 4, 9, 10, 13, 17

The Petition explains in detail why Petitioner contends, and we find, the Baker combination teaches all of the limitations of these claims. Pet. 45–55. Patent Owner does not, for dependent claims 3, 9, or 10, provide argument other than the arguments for independent claim 1. Our review of the full record in this proceeding persuades us that the Baker combination teaches all of the limitations of claims 3, 9, and 10.

Patent Owner does, for claims 4, 13, and 17, provide arguments in addition to those for the independent claims. PO Resp. 65. Patent Owner argues that “the Petition (at 55) alleges that claims 4, 13, and 17 ‘are obvious for the same reasons as claims 2 and 16,’ which . . . are fatally flawed.” Id.
We disagree with Patent Owner’s characterization of the Petition. Although the Petition states that claims 4, 13, and 17 “are obvious for the same reasons as claims 2 and 16,” it provides additional explanation. Pet. 54–55. It explains Petitioner’s contention that the disclosures of the Baker combination discussed in connection with claims 2 and 16 meet the specific limitations of claims 4, 13, and 17 because “the Baker combination determines whether the two or more results indicate a potential error by detecting uncertainty in the scores or rescore” and “the potential errors identified by the Baker combination ‘may cause’ the meaning to differ from the speech input, as recited in these claims.” Id. at 55. In other words, rather than repeating all of the Baker combination disclosures cited as teaching the limitations of claims 2 and 16, the Petition refers back to them and explains that those disclosures meet the limitations of claims 4, 13, and 17. Our review of the evidence persuades us that although Petitioner does not show that the Baker combination meets the limitations of claims 2 and 16, Petitioner does demonstrate that the Baker combination meets the broader limitations of claims 4, 13, and 17.

In sum, Petitioner persuades us that the Baker combination teaches all of the limitations of claims 3, 4, 9, 10, 13, and 17, and provides persuasive rationale to combine the references (Section III.D.3.a). For the reasons stated above, Petitioner demonstrates, by a preponderance of the evidence, that claims 3, 4, 9, 10, 13, and 17 would have been obvious over Baker and Baker-613.

E. Alleged Obviousness over Baker, Baker-613, and Jamieson

1. Overview of Jamieson

Jamieson discloses a system and method for processing speech. Ex. 1008, 1:20–21. Jamieson explains that “[u]sers dictating text with speech recognition find that errors are hard to detect.” Id. at 3:1–2.
Jamieson elaborates that doctors may miss subtle mistakes that change the meaning of a report, possibly affecting clinical care. Id. at 3:19–25. Jamieson teaches that “[s]emantic understanding of hypotheses returned by a speech engine could improve the quality of recognition and in cases of misrecognition speed the identification of errors and potential substitutions.” Id. at code (57). Jamieson discloses a method that “achieves semantic understanding by coupling a speech recognition engine to a semantic recognizer, which draws from a database of domain sentences derived from a document corpus, and a knowledge base created for these domain sentences.” Id.

2. Discussion

In support of its assertion that claims 2 and 16 would have been obvious over Baker, Baker-613, and Jamieson, Petitioner argues that “it would have been obvious to modify the Baker combination to add Jamieson’s semantic analysis—that is, to directly evaluate the candidate results’ meaning—to help identify medically significant discrepancies.” Pet. 51. Petitioner explains that Jamieson teaches its semantic analysis helps avoid medical error. Id. at 52. Petitioner asserts that such error “could have ‘dire consequences.’” Id. Petitioner asserts that a person of ordinary skill in the art would have been motivated to add Jamieson’s semantic analysis capability to the combination of Baker and Baker-613 “to help identify uncertainty and thus improve the quality of the transcriptions and avoid any dire consequences.” Id. at 53. Petitioner also asserts that Jamieson is analogous art. Id. at 54.

In support of the contention that claims 4, 13, and 17 would have been obvious over Baker, Baker-613, and Jamieson, Petitioner asserts that a person of ordinary skill in the art “would have added Jamieson’s semantic analysis to Baker’s uncertainty determination as explained with respect to claims 2 and 16.
As a result, the new combination would further evaluate whether the meanings of the top results are different, thus indicating a potential difference from the meaning of the speech input.” Id. at 55.

We turn now to detailed discussions of the disputes raised by the parties’ disagreements regarding Petitioner’s challenge of claims 2, 4, 13, 16, and 17 as allegedly obvious over Baker, Baker-613, and Jamieson.

a. Claims 2 and 16

Patent Owner argues that Jamieson does not use its semantic analysis to identify discrepancies between recognition results. PO Resp. 67–70. Instead, Patent Owner asserts, Jamieson applies its semantic analysis to only one displayed result. Id. at 67. Patent Owner argues that the semantic analysis serves only to illustrate for the user the displayed result’s meaning. Id. at 68. According to Patent Owner, “[o]nce the displayed hypothesis’s semantic meaning is indicated, it is up to Jamieson’s user to ‘spot’ whether that meaning differs from what the user spoke. . . . Nowhere does Jamieson teach comparing the semantic meaning of a top recognition hypothesis to that of any alternative hypothesis.” Id. at 70. Consequently, Patent Owner argues, Jamieson’s teachings do not support Petitioner’s argument that adding Jamieson’s semantic analysis to the Baker combination would “identify . . . when the ASR’s top choice and highest-ranked alternatives have opposite meanings.” Id. (quoting Pet. 54).

Petitioner counters that Patent Owner does not address the combination of the prior art proposed in the Petition, but attacks the references individually. Pet. Reply 28. Petitioner asserts that “the combination uses Jamieson to evaluate for medically-meaningful differences between the top-ranked results.” Id. at 28–29.
Patent Owner responds that Petitioner’s “alleged combination merely ‘add[s] Jamieson’s semantic analysis to the Baker combination,’ and alleges this would result in ‘identify[ing] when the top choice and highest-ranked alternatives have different meanings.’” Sur-reply 27 (citing Pet. Reply 28; Pet. 51, 53–54) (alteration in original). Simply combining the references in this manner, Patent Owner argues, would not produce the result alleged by Petitioner. Id. Asserting that only the ’398 patent teaches a semantic comparison of results, Patent Owner argues that Petitioner’s obviousness contentions rest “entirely on impermissible hindsight.” Id.

Patent Owner persuades us that Petitioner does not demonstrate obviousness of claims 2 and 16 over Baker, Baker-613, and Jamieson. Petitioner persuades us that the ’398 patent and Jamieson recognize and address a problem common to both. The ’398 patent expresses concern that “if the word ‘malignant’ is erroneously substituted for the word ‘non-malignant’ or vice versa, a patient may be improperly treated.” Ex. 1001, 19:44–47. Likewise, Jamieson recognizes that “the sentence ‘there is no evidence of atelectasis’ can become ‘there is evidence of atelactasis’ through a deletion error. This type of error changes the meaning of the report and could affect clinical care.” Ex. 1008, 3:19–24.

Patent Owner persuades us, however, that the challenged claims of the ’398 patent and Jamieson address this problem in different ways. Claims 2 and 16 of the ’398 patent address the problem by “evaluating the two or more results for medically-meaningful discrepancies between the two or more results.” Ex. 1001, 51:27–28, 53:37–38. Jamieson, on the other hand, addresses the misrecognition problem by using semantic analysis to evaluate the top recognition result and display certain information regarding the top recognition result. Ex. 1008, 3:45–56, 4:20–24, 4:46–58, 6:60–7:9, 7:33–54, 7:65–8:4, 8:55–58.
For example, “if no valid sentence hypothesis is determined an indicator, such as a red color, is associated with the words from the top ranked hypothesis, and is displayed in the user interface.” Id. at 4:49–54. In such an instance, “[t]he user can ask for corrections.” Id. at 8:4. Similarly, Jamieson’s system displays a red dot beside recognition results whose semantic meaning is abnormal and a green dot beside recognition results whose semantic meaning is normal. Id. at 6:60–7:3. Jamieson identifies this as an advantage, stating that “[t]he semantic type of a sentence can be quickly displayed improving the confidence that the dictated sentence was transcribed correctly and making it easier for the user to spot errors in a given knowledge domain.” Id. at 8:55–58. Because Jamieson does not teach using its semantic analysis to evaluate for medically-meaningful discrepancies between two or more recognition results, we agree with Patent Owner that simply adding Jamieson’s teachings regarding semantic analysis to the Baker combination would not meet the limitations of claims 2 and 16.

As noted above, in response to Patent Owner’s argument that Jamieson does not teach using semantic analysis in the manner required by claims 2 and 16, Petitioner criticizes Patent Owner as attacking the references individually. Pet. Reply 28. Explaining that “Baker already teaches evaluating the ASR’s results in other ways,” Petitioner argues that “the combination uses Jamieson to evaluate medically-meaningful differences between the top-ranked results.” Id. at 28–29. Notwithstanding Petitioner’s argument that, generally, it is unavailing to attack individually the references used in a combination, In re Merck & Co., Inc., 800 F.2d 1091, 1097 (Fed. Cir.
1986) (citing In re Keller, 642 F.2d 413, 425 (CCPA 1981)), and that bodily incorporation is not the test for obviousness, Keller, 642 F.2d at 425, Petitioner has not explained how the proposed combination teaches the disputed claim limitation.

Petitioner does not explain persuasively how the combination of the references teaches what Jamieson does not: evaluating for medically-meaningful discrepancies between two or more results. Although we agree that Baker teaches different ways of evaluating recognition results, as explained above in Section III.D.3.b.(2), Petitioner does not persuade us that any of those ways involves evaluating for medically-meaningful discrepancies between two or more results. Nor does Petitioner provide persuasive evidence that the combination of Jamieson’s semantic analysis of the top result with Baker’s “other ways” of analyzing demonstrates obviousness of evaluating for medically-meaningful discrepancies between two or more results.

The Petition indicates that a person of ordinary skill in the art would have found it obvious in view of Jamieson’s teachings to “[a]dd[] semantic analysis, . . . in addition to relying on low scores, close scores from alternate interpretations, and/or different models disagreeing on the results.” Pet. 53–54. This argument that it would have been obvious to use each of the evaluating approaches taught by the Baker combination and Jamieson’s semantic analysis in one system fails to persuade us, as none of the evaluating approaches meets the claims. And Petitioner does not allege clearly, much less demonstrate persuasively, that it would have been obvious to modify any of the different evaluating approaches taught by the references to meet the claims.

Petitioner emphasizes that “Jamieson touts that it can quickly arrive at the semantic meaning of a sentence, and quickly identify out-of-domain recognitions.” Pet. Reply 29.
But Petitioner does not provide persuasive reasoning or evidence that this alleged ability would have made it obvious to modify Jamieson’s teaching from analyzing the top recognition result to analyzing for discrepancies between two or more results. Nor does Petitioner allege clearly or persuasively demonstrate that Jamieson’s teaching of semantically analyzing the top recognition result would have made it obvious to modify one of the Baker combination’s evaluating approaches to meet the claims.

For the foregoing reasons, Petitioner does not show, by a preponderance of the evidence, that claims 2 and 16 would have been obvious over Baker, Baker-613, and Jamieson.

b. Claims 4, 13, and 17

As discussed in Section III.D.3.b.(3) above, Petitioner has shown obviousness of claims 4, 13, and 17 over Baker and Baker-613. Petitioner relies on Jamieson, for example, as showing the state of the art and as support for the proposition that “a [person of ordinary skill in the art] would have understood that post-processing systems such as the Baker combination typically identify recognition uncertainty . . . if all of the candidates’ scores fail to meet a minimum threshold.” E.g., Pet. 38, 43–44. Our review of the full record persuades us that Petitioner has shown obviousness of claims 4, 13, and 17 over Baker, Baker-613, and Jamieson.

IV. CONCLUSION

In summary,

Claims                    | 35 U.S.C. § | Reference(s)/Basis         | Claims Shown Unpatentable       | Claims Not Shown Unpatentable
--------------------------|-------------|----------------------------|---------------------------------|------------------------------
1–4, 9, 10, 12, 13, 15–17 | 102         | Davenport                  | 1, 3, 4, 9, 10, 12, 13, 15, 17  | 2, 16
2, 16                     | 103(a)      | Davenport, Lai             |                                 | 2, 16
1–4, 9, 10, 12, 13, 15–17 | 103(a)      | Baker, Baker-613           | 1, 3, 4, 9, 10, 12, 13, 15, 17  | 2, 16
2, 4, 13, 16, 17          | 103(a)      | Baker, Baker-613, Jamieson | 4, 13, 17                       | 2, 16
Overall Outcome           |             |                            | 1, 3, 4, 9, 10, 12, 13, 15, 17  | 2, 16

V. ORDERS

In consideration of the foregoing, it is hereby

ORDERED that Petitioner has shown by a preponderance of the evidence that claims 1, 3, 4, 9, 10, 12, 13, 15, and 17 are unpatentable;

FURTHER ORDERED that Petitioner has not shown by a preponderance of the evidence that claims 2 and 16 are unpatentable; and

FURTHER ORDERED that because this is a Final Written Decision, any party to the proceeding seeking judicial review of the decision must comply with the notice and service requirements of 37 C.F.R. § 90.2.

PETITIONER:
Jon Strang
Inge Osman
David K. Callahan
Kevin C. Wheeler
jonathan.strang@lw.com
inge.osman@lw.com
david.callahan@lw.com
kevin.wheeler@lw.com

PATENT OWNER:
Richard Giunta
Andrew Tibbetts
Elisabeth Hunt
Daniel Wehner
rgiunta-ptab@wolfgreenfield.com
atibbetts-ptab@wolfgreenfield.com
ehunt-ptab@wolfgreenfield.com
dwehner-ptab@wolfgreenfield.com