Regression Analysis in Litigation

By Joshua Fruchter, Esq
Mention “regression analysis” at a cocktail party and heads will likely start to spin. Lawyers unfamiliar with the discipline may conjure up images of economists in ivory towers scribbling indecipherable equations on blackboards. All kidding aside, litigators need to understand the fundamentals of regression analysis because it is frequently used in cases where direct evidence of harm is absent and a party must establish statistically a causal relationship between a defendant’s alleged misconduct and a plaintiff’s damages. Where to begin? Fortunately, the U.S. District Court for the Southern District of New York recently provided a primer on regression analysis as background to its evaluation of expert testimony under Daubert. Reed Construction Data, Inc. v. McGraw-Hill Cos., Inc., 2014 WL 4746130 (S.D.N.Y. Sept. 24, 2014).
 
The fact pattern in Reed involved two companies, Reed Construction Data, Inc. (Reed) and McGraw-Hill Companies, Inc. (MGH), that operate databases which provide information on pending construction projects to subcontractors (e.g., plumbers, electricians) interested in submitting bids for work. To gain an edge against Reed, MGH began accessing Reed’s service surreptitiously through fictitious accounts for the purpose of extracting data that could be used to draw unfavorable comparisons between the two services in MGH’s marketing materials and in negotiations with new customers. For example, when prospects wanted to compare the two services, MGH invited them to search for specific projects in both services, knowing in advance that the suggested projects did not exist in Reed’s database. The conclusion MGH obviously wanted prospects to draw was that Reed’s service was not as comprehensive as that of MGH.
 
After Reed discovered the scheme, the company sued MGH, asserting claims under the Lanham Act and the Sherman Antitrust Act as well as various state-law torts. Unfortunately for Reed, the company was able to find only one customer to testify that its purchasing decision was influenced by MGH’s flawed comparisons. Thus, rather than contend that MGH’s alleged misconduct caused it to lose customers, Reed claimed that the misconduct gave MGH greater pricing power for some of its data services than it would otherwise have had. To establish a causal link between MGH’s misconduct and the alleged pricing impact, Reed retained an expert, Dr. Frederick Warren-Boulton, to undertake a regression analysis that purported to show that (i) MGH’s flawed product comparisons caused prospects to pay more for MGH’s, and less for Reed’s, national data services during the period when the scheme was operating and (ii) this price differential disappeared after the alleged misconduct ceased. MGH moved to exclude Warren-Boulton’s testimony under Daubert. Before analyzing the admissibility of Warren-Boulton’s methodology and conclusions, the court provided an overview of regression analysis.
 
As the court explained, “[t]he fundamental goal of regression analysis is to convert an observation of correlation (e.g., apartments in [New York City] cost more than those in [Boston]) into a statement of causation (apartments in [New York City] cost more than those in [Boston] because they are in [New York City], [and] not because they are larger or more luxuriously appointed).” To isolate the effect of “apartment location” (called the “independent variable”) on “apartment price” (called the “dependent variable”), a statistician needs to account for all other factors (referred to as “control variables”) that might materially influence apartment price and hold them constant. In our example, that would mean comparing the prices of apartments in New York City to the same type of apartments in Boston: ones that are the same size, have the same number of bathrooms and other amenities, etc. The analysis takes the form of an equation that yields the line that best fits a plot of the data points. However, since the data points never conform perfectly to the line, further mathematical tests are performed to determine whether the “spaces” between the data points and the line (known as residuals) suggest that an important variable with statistically meaningful influence was omitted from the model.
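The apartment example can be made concrete with a short Python sketch. Everything in it is an assumption made for illustration: the data are simulated, the variable names (in_nyc, sqft, baths) are hypothetical, and the "true" location effect of 500 is simply chosen so that the result can be checked. It is not drawn from the Reed opinion or from any expert's model. The sketch compares a naive difference in average prices with the location coefficient from a regression that holds size and bathrooms constant.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 400  # number of simulated apartments (hypothetical)

# Independent variable of interest: 1 = New York City, 0 = Boston.
in_nyc = rng.integers(0, 2, size=n)

# Control variables. Assumption: NYC apartments in this sample
# happen to be larger on average, which confounds a raw comparison.
sqft = rng.normal(700, 100, size=n) + 200 * in_nyc
baths = rng.integers(1, 4, size=n)

# Dependent variable: price, built from a known (assumed) formula plus noise.
TRUE_CITY_EFFECT = 500.0
price = (1000.0 + TRUE_CITY_EFFECT * in_nyc + 2.0 * sqft
         + 150.0 * baths + rng.normal(0, 50, size=n))

# Correlation only: raw gap in average price between the two cities.
naive_gap = price[in_nyc == 1].mean() - price[in_nyc == 0].mean()

# Regression: fit price = b0 + b1*in_nyc + b2*sqft + b3*baths by least squares.
X = np.column_stack([np.ones(n), in_nyc, sqft, baths])
coef, *_ = np.linalg.lstsq(X, price, rcond=None)
city_effect = coef[1]

# Residuals: the "spaces" between each data point and the fitted line.
residuals = price - X @ coef

print(f"naive gap:             {naive_gap:8.1f}")
print(f"location effect (b1):  {city_effect:8.1f}")
```

In this simulation the naive gap is much larger than the location coefficient because it also absorbs the price of the extra square footage. Leaving sqft out of the model would attribute that size effect to location, which is the kind of omitted-variable problem the further tests on the residuals are meant to flag.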
 
With the above principles in mind, the court examined Warren-Boulton’s opinion that the price differential between MGH’s and Reed’s national data services observed while the scheme operated, and the gradual disappearance of that differential after the scheme ceased, proved that the scheme caused the pricing power MGH enjoyed during the relevant timeframe. It concluded that Warren-Boulton’s analysis suffered from a number of flaws. The court found, among other issues, that Warren-Boulton’s model did not properly take into account important variables that could have influenced some of the pricing effects observed – i.e., increased competition and changes in the volume of national construction projects (as a result of the recession). For example, the narrowing of the price differential in the national data market after the scheme ceased could have been caused by Reed becoming a more effective competitor in that niche (of which there was evidence in the record). Further, changes in construction volume at the national level during the recession, beginning in 2008, might have also affected pricing during the relevant timeframe. In particular, the latter issue gave rise to what the court referred to as a “multicollinearity” problem – i.e., an inability to isolate the effect of MGH’s alleged misconduct from the effect of construction volume.
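A rough sense of why multicollinearity defeats attribution can be had from the following Python sketch. It is purely illustrative: the quarterly data are simulated, the names scheme_active and volume are hypothetical, and the variance inflation factor (VIF) used here is one common diagnostic, not necessarily the test applied by the court or by either side's experts. The VIF measures how much of one explanatory variable can be predicted from the others; the more predictable it is, the less the regression can separate its effect.

```python
import numpy as np

def vif(target, others):
    """Variance inflation factor: 1 / (1 - R^2), where R^2 comes from
    regressing one explanatory variable (target) on the other(s)."""
    X = np.column_stack([np.ones(len(target)), others])
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ coef
    r2 = 1.0 - resid.var() / target.var()
    return 1.0 / (1.0 - r2)

rng = np.random.default_rng(1)
n = 40  # hypothetical number of quarters

# 1 while the alleged scheme is operating, 0 afterwards (assumed timing).
scheme_active = (np.arange(n) < 20).astype(float)

# Case A (assumed): construction volume fluctuates independently of the scheme.
volume_unrelated = rng.normal(100, 10, size=n)

# Case B (assumed): volume drops at about the time the scheme ends,
# so the two variables move almost in lockstep.
volume_overlapping = (100 - 30 * (1 - scheme_active)
                      + rng.normal(0, 2, size=n))

vif_a = vif(scheme_active, volume_unrelated)
vif_b = vif(scheme_active, volume_overlapping)

print(f"VIF, unrelated volume:   {vif_a:6.2f}")
print(f"VIF, overlapping volume: {vif_b:6.2f}")
```

In Case A the VIF stays close to 1, meaning the model can tell the two influences apart. In Case B it is far above 10, a commonly cited rule-of-thumb threshold, because the end of the scheme and the fall in volume are nearly the same event in the data, so a price change cannot be reliably assigned to one rather than the other.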
 
Based on the foregoing issues, as well as other methodological flaws, the court granted MGH’s motion to exclude Warren-Boulton’s testimony.
 
As noted, regression analysis is used widely in litigation to establish a causal link between misconduct and damages, particularly when direct evidence of harm is absent. Common contexts include securities fraud (in which the goal is to prove that misrepresentations caused investor losses) and employment discrimination (in which the goal is to show a causal relationship between certain alleged discriminatory conduct and hiring, firing, or promotion decisions).

Have you had occasion to work with and/or cross-examine a regression analysis expert? If so, tell us about the experience. In particular, how did you approach getting up to speed on all of the complicated mathematical and statistical concepts?

Joshua Fruchter, Esq

An NYU School of Law graduate, Joshua has been practicing as a litigator for over twenty-five years. Joshua has published regularly on legal marketing topics in numerous law-related periodicals, and presented on legal marketing technologies to various bar and legal marketing associations. Mr. Fruchter is a recognized voice in litigation commentary who has discussed issues ranging from Daubert analyses and inventor testimony in patent litigation to predictive coding in document reviews.
