Opinion
20-CV-3677 (LGS) (KHP)
05-18-2021
MOJO NICHOLS, et al., Plaintiffs, v. NOOM INC., et al., Defendants.
ORDER ON PLAINTIFFS' MOTION TO COMPEL SAMPLING PROTOCOL
KATHARINE H. PARKER, United States Magistrate Judge
In this putative class action, Plaintiffs contend that Defendant Noom, Inc. ("Noom") tricked consumers into signing up for its weight loss program via an autorenewal feature and made it difficult for consumers to cancel their Noom subscription. Discovery has been extensive and involves collection and review of data from multiple digital platforms. Due to the volume of data in some of Noom's data repositories (each of which contains millions of communications), the parties have met and conferred regarding collection of a sample of information from these repositories. The repositories are called UserVoice, Zendesk, and GroupsMagic.
Although Noom has reservations about the relevance and usefulness of Plaintiffs' sampling methodology for UserVoice and Zendesk, Noom has agreed to produce samples from those repositories using Plaintiffs' preferred methodology.
This decision addresses the parties' dispute about the methodology for sampling data from GroupsMagic.
GroupsMagic Sampling Methodology
GroupsMagic is Noom's coaching chat repository. Consumers are assigned a coach who assists the consumer in using the Noom application and reaching weight loss goals. The coach also assists consumers who want to end their Noom subscription. The consumer and the coach communicate through the coach chat function, and the conversations are stored in GroupsMagic. When a user begins a coach chat, he or she is assigned a unique access code. GroupsMagic contains 9.8 million access codes. Each access code links to a chat log of every communication a Noom user has with his or her coach. Each of the 9.8 million chat logs therefore may contain hundreds of communications spanning many months.
In response to Plaintiffs' document requests, Noom agreed to produce a representative sample of coach chats about cancellation and autorenewal, as well as coach training materials and scripts pertaining to Noom's autorenewal policy, refunds, cancellations, and any bot coaching.
Plaintiffs seek production of the entire chat log history for 2,500 unique Noom users, selected at random from the GroupsMagic repository. Plaintiffs intend to conduct aggregate analyses of these communications to ascertain the number and frequency of cancellation and enrollment complaints that Defendants received in connection with their trial period and automatic enrollment practices, as well as when consumers notified Defendants that they were being deceived by the practices challenged in this litigation. Plaintiffs would then extrapolate their findings onto the larger repository.
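By way of illustration only, the random pull Plaintiffs describe amounts to a simple random sample of access codes drawn without replacement from the repository. The sketch below uses hypothetical names and data shapes (a sequence of access code identifiers) and is not drawn from the record.

```python
# A minimal sketch of the random pull Plaintiffs propose: a simple random sample
# of 2,500 access codes drawn without replacement from the repository.
# The variable names and data shape (a sequence of access codes) are hypothetical.
import random

def draw_sample(access_codes, k=2500, seed=42):
    """Draw a simple random sample of k access codes without replacement."""
    rng = random.Random(seed)  # fixed seed so the pull can be reproduced and audited
    return rng.sample(access_codes, k)

# Example: sample 2,500 codes from a stand-in population of 9.8 million IDs.
population = range(9_800_000)
sampled_codes = draw_sample(population)
```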
Noom proposes limiting the dataset from which the random sample is generated to the coach chats of users who are likely to be members of the putative class, that is, users of the Healthy Weight program during the period January 1, 2017 through the date of collection. It further proposes applying search terms designed to capture communications related to autorenewal, refunds, cancellations, and bot coaching. After those search terms are applied, Noom would extract a list of access codes from the search-term-positive results, to which Plaintiffs could apply the same randomization process the parties are using for Zendesk to identify 2,500 chat logs for production. Finally, after the randomized pull from the subset of chat logs found to contain relevant communications, Noom proposes further limiting the production to communications within a 48-hour window around each search-term-positive communication.
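For context, and using only hypothetical names, placeholder terms, and assumed data shapes, Noom's proposed culling could be sketched roughly as follows; the simple substring matching below stands in for whatever search syntax the parties would actually negotiate.

```python
# A rough sketch of the culling Noom proposes: filter to search-term-positive chat
# logs, randomly select 2,500 of them, and produce only messages within 48 hours
# of a search-term hit. All names, terms, and data shapes here are hypothetical.
import random
from datetime import timedelta

SEARCH_TERMS = ("autorenew", "refund", "cancel", "bot coach")  # placeholder terms
WINDOW = timedelta(hours=48)

def hit_times(messages):
    """Return timestamps of messages containing any placeholder search term."""
    return [ts for ts, text in messages
            if any(term in text.lower() for term in SEARCH_TERMS)]

def cull(chat_logs, k=2500, seed=1):
    """chat_logs: dict mapping access code -> list of (timestamp, text) messages."""
    hits = {code: hit_times(msgs) for code, msgs in chat_logs.items()}
    positive_codes = [code for code, times in hits.items() if times]
    sampled = random.Random(seed).sample(positive_codes, min(k, len(positive_codes)))
    # Keep only messages falling within 48 hours of a search-term-positive message.
    return {code: [(ts, text) for ts, text in chat_logs[code]
                   if any(abs(ts - hit) <= WINDOW for hit in hits[code])]
            for code in sampled}
```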
Noom's rationale for so limiting the production is that the random sample of 2,500 chat logs Plaintiffs request may contain hundreds (if not more) of irrelevant discussions about users' weight-loss journeys, food anxieties, life stressors, alcohol use, and other highly personal information. It also argues that Plaintiffs' request is not proportional to the needs of the case insofar as Noom would have to redact protected health information and other personally identifying information from thousands of chats. Finally, it argues that the purpose for which it agreed to produce the chats was simply to provide a sample of the types of communications users had with coaches to terminate the program or to complain about its autorenewal features, not to provide data for statistical analysis.
To minimize the chance that reviewers improperly tag chats and to ensure that the search terms capture the majority of relevant chats, Noom has agreed to do two things. First, Noom has agreed to have two reviewers independently review a sample of the non-search-term-positive results. The reviewers will compare their results, and if there is any disagreement, a third, independent reviewer will facilitate a discussion to resolve the discrepancy in the coding. Once that process is complete, the reviewers will report the results, including a confidence interval, to demonstrate that the proposed search terms miss only a reasonably small proportion of coach communications related to autorenewal, refunds, cancellations, and bot coaching. According to Noom's expert, this mechanism - known as "content analysis" - is a reasonable and appropriate safeguard to ensure that the population of communications given to Plaintiffs covers the vast majority of relevant communications in the sample. (Hanssens Declaration ¶¶ 4-6.) Second, Noom has agreed to run the same type of "content analysis" review on the search-term hits. Two reviewers will independently conduct a simultaneous review of the document set, and to the extent there is any disagreement in the coding, a third, independent reviewer will facilitate a reconciliation to resolve the disagreement between the two reviewers. This practice is frequently used to resolve concerns over review bias in analyses of the content of communications. (Id. ¶¶ 4-5, 7.)
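Purely by way of illustration, and using hypothetical coding data rather than anything in the Hanssens Declaration, the two safeguards reduce to (1) flagging coding disagreements between the two independent reviewers for reconciliation by a third and (2) reporting a confidence interval on the proportion of relevant chats the search terms missed in the reviewed null set:

```python
# A minimal sketch, on hypothetical inputs, of the two "content analysis" checks:
# (1) surface coding disagreements between two independent reviewers, and
# (2) report a confidence interval for the proportion of relevant chats that the
# search terms missed in a random sample of the non-search-term-positive results.
import math

def disagreements(coder_a, coder_b):
    """coder_a, coder_b: dicts mapping chat ID -> True/False (relevant or not).
    Returns the chat IDs a third reviewer would help reconcile."""
    return [chat for chat in coder_a if coder_a[chat] != coder_b.get(chat)]

def miss_rate_interval(missed, reviewed, z=1.96):
    """Normal-approximation 95% confidence interval for the miss rate observed
    in the reviewed sample of search-term-negative chats."""
    p = missed / reviewed
    margin = z * math.sqrt(p * (1 - p) / reviewed)
    return p, max(0.0, p - margin), min(1.0, p + margin)

# Illustrative figures only: 4 relevant chats found among 400 reviewed null-set
# chats yields an estimated 1% miss rate, with a 95% interval of roughly 0% to 2%.
rate, low, high = miss_rate_interval(4, 400)
```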
Turning first to the dataset from which the sample should be pulled, the Court agrees with Noom that the sample should be generated only from users of the Healthy Weight program. These are the users whose chats could contain relevant information, as they are the users contemplated for the putative class. Any proper statistical analysis starts with a relevant pool; here, the putative class members are the relevant pool/population to study. Cf. Hazelwood Sch. Dist. v. United States, 433 U.S. 299, 308 (1977) (discussing appropriate population for statistical analysis). Noom must provide Plaintiffs with the total number of unique access codes (that is, unique user-coach chats) in the population. There is no need for a start date limitation if Noom can segregate chats of those users who were in the Healthy Weight program.
The next question is whether the subset of putative class member chats should be further culled by applying search terms to locate those chat logs with words that might indicate a chat including communications about autorenewal, refunds, cancellations, and bot coaching. Search terms are often used to cull emails and in fact are being used by the parties in this case to cull emails and other communications. Plaintiffs' objection to their use here appears to be based on the intended purpose of their review of the chat logs. Unlike with the emails, Plaintiffs want to use statistics to analyze the percentage of users complaining about autorenewal and difficulty canceling their Noom subscriptions. Thus, while the particulars of the communications may be of interest, there is also an interest in ensuring a statistically sound sample for purposes of that analysis. Noom, in contrast, does not view the chat logs as data amenable to statistical analysis and therefore believes the argument about a statistically sound sample is a red herring. Rather, Noom reasons, it should only have to produce relevant and responsive chats. It also argues that it never agreed to produce the chats for statistical review.
The Court rejects Noom's argument in part. For a period of time, the only way to cancel a subscription was through a coach. Therefore, the coach chats will contain relevant information about canceling a subscription. To the extent Plaintiffs wish to count the number of chats where users complained about autorenewal or difficulty canceling, they may do so. Noom can later make legal arguments about the validity and admissibility of such an analysis. There is no merit to Noom's argument that Plaintiffs should be limited in their use of the information produced to support their claims. Accordingly, Noom should not apply search terms before generating the randomized sampling of 2,500 user-coach chats.
The Court is, however, concerned about the potentially large amount of highly personal, irrelevant information that would be produced through Plaintiffs' proposal. A random sample from the total pool of Healthy Weight users can be generated, providing the statistical soundness that Plaintiffs desire. After that sample of 2,500 chats is generated, however, a process can be applied to identify those among the sample that contain the types of communications that Plaintiffs are seeking.
Plaintiffs suggest that they will (1) conduct a linear review of the full chat logs for all 2,500 users to identify those containing communications they would deem "complaints" concerning the features of the Noom program at issue, (2) use that total as the numerator to ascertain the percentage of the 2,500 users who had such complaints, and (3) extrapolate that percentage onto the larger pool of Healthy Weight users from which the sample was generated.
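As a purely illustrative sketch of that arithmetic (the counts and pool size below are hypothetical, not figures from the record), the three steps work as follows:

```python
# A hypothetical illustration of Plaintiffs' three-step approach: count the
# complaining users identified in the linear review of the 2,500-user sample,
# convert the count to a percentage, and project that percentage onto the full
# pool of Healthy Weight users from which the sample was drawn.
def extrapolate(complaining_in_sample, sample_size, population_size):
    share = complaining_in_sample / sample_size      # step (2): the percentage
    return share, round(share * population_size)     # step (3): the projection

# Illustrative figures only: if the linear review (step (1)) found 250 complaining
# users among the 2,500 sampled, the estimated rate would be 10%, and a pool of
# 8,000,000 Healthy Weight users would imply roughly 800,000 complaining users.
share, projected = extrapolate(250, 2_500, 8_000_000)
```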
Noom, alternatively, suggests that search terms be applied to identify relevant chats. Insofar as the parties have already developed search terms to identify these types of communications, Noom's proposal makes sense from the standpoint of efficiency and from the standpoint of preventing disclosure of communications that are irrelevant and highly personal to users. Noom can utilize its content analysis process to evaluate the "null set" among the 2,500 chats pulled to ensure the search terms were appropriately designed to identify the vast majority of relevant chats. It can similarly use its content analysis process to evaluate the chats returned as hits to ensure that they are properly tagged as relevant and responsive. Through this process, Noom can identify with a high degree of accuracy the number of users among the random sample of 2,500 who have relevant complaints (the "complaining subset"). Plaintiffs do not actually need to see the chats for the remainder of the 2,500 users (the "non-complaining subset") for purposes of their statistical exercise or for any other purpose. Indeed, they are not entitled to them, as those chats do not contain relevant communications within the meaning of Rule 26.
Next, Noom proposes that it be permitted to withhold portions of the chat logs for the complaining subset users to eliminate communications that do not bear on autorenewal, cancellation, and complaints about bot coaching. Rather than do this through a linear review of the chat logs, Noom proposes that it produce communications within a 48-hour window around a search-term-positive "hit" within a log. Noom does not explain why 48 hours is the appropriate time period to capture relevant communications. It is possible that a user might have a conversation about canceling over a period of weeks. It is also possible that the entirety of the chat may include information about the user experience that led to the complaint about the autorenewal or other features of the program at issue in this case. Thus, the Court is not convinced that the complaining subset of user-coach communications should be truncated by the 48-hour limitation, or any other limitation for that matter.
Noom argues that it will need to review the chat logs and redact personally identifying information and medical-related information, and that such a review is burdensome and not proportional to the needs of the case. However, as explained above, the Court has already limited the portion of the 2,500-chat sample that actually needs to be produced to those chat logs comprising the complaining subset, which will inevitably reduce the review burden. Additionally, because there is a Protective Order in place, Noom can elect to forgo redaction and rely on the Protective Order. To ensure greater protection, Noom also may designate the chats "attorneys' eyes only." Both parties are reminded that no personally identifying information of users may be publicly filed with the Court - such information must be redacted per Court rules. Therefore, Noom can reduce its burden by relying on these privacy safeguards for this limited set of information.
CONCLUSION
For the reasons set forth above, Plaintiffs' motion (ECF No. 241) is GRANTED in part and DENIED in part.
SO ORDERED.
DATED: May 18, 2021
New York, New York
/s/_________
KATHARINE H. PARKER
United States Magistrate Judge