UNITED STATES DISTRICT COURT
FOR THE DISTRICT OF COLUMBIA

ELOUISE PEPION COBELL, et al.,
Plaintiffs,
V.
DIRK KEMPTHORNE, Secretary of the Interior, et al.,
Defendants.

Civil Action 96-1285
Washington, D.C.
Tuesday, June 17, 2008
MORNING SESSION

TRANSCRIPT OF EVIDENTIARY HEARING
DAY 6
BEFORE THE HONORABLE JAMES ROBERTSON
UNITED STATES DISTRICT JUDGE

APPEARANCES:

For the Plaintiffs:
DENNIS GINGOLD, ESQUIRE
LAW OFFICES OF DENNIS GINGOLD
607 14th Street, NW
Ninth Floor
Washington, DC 20005
(202) 824-1448

ELLIOTT H. LEVITAS, ESQUIRE
WILLIAM E. DORRIS, ESQUIRE
KILPATRICK STOCKTON, L.L.P.
1100 Peachtree Street
Suite 2800
Atlanta, Georgia 30309-4530
(404) 815-6450

KEITH HARPER, ESQUIRE
JUSTIN GUILDER, ESQUIRE
KILPATRICK STOCKTON, L.L.P.
607 14th Street, N.W.
Suite 900
Washington, D.C. 20005
(202) 585-0053

DAVID C. SMITH, ESQUIRE
KILPATRICK STOCKTON, L.L.P.
1001 West Fourth Street
Winston-Salem, North Carolina 27101
(336) 607-7392

For the Defendants:
ROBERT E. KIRSCHMAN, JR., ESQUIRE
JOHN WARSHAWSKY, ESQUIRE
MICHAEL QUINN, ESQUIRE
J. CHRISTOPHER KOHN, ESQUIRE
GLEN GILLETT, ESQUIRE
U.S. Department of Justice
1100 L Street, N.W.
Washington, D.C. 20005
(202) 307-0010

JOHN STEMPLEWICZ, ESQUIRE
Senior Trial Attorney
U.S. Department of Justice
Commercial Litigation Branch
Civil Division
Ben Franklin Station
P.O. Box 975
Washington, D.C. 20044
(202) 307-1104

Court Reporter:
REBECCA STONESTREET
Official Court Reporter
Room 6511, U.S. Courthouse
333 Constitution Avenue, N.W.
Washington, D.C. 20001
(202) 354-3249
kingreporter2@verizon.net

Proceedings reported by machine shorthand, transcript produced by computer-aided transcription.

C O N T E N T S

WITNESS DIRECT CROSS REDIRECT RECROSS
DR. EDWARD ANGEL
By Mr. Siemietkowski -- -- 918 --
By Mr. Smith -- 900 -- --
DR. FREDERICK (FRITZ) SCHEUREN
By Mr. Warshawsky 926 -- -- --
By Mr.
Dorris -- 979 -- --

E X H I B I T S

NUMBER ADMITTED
Defendant Exhibit:
460 - 464 978
500 978
505 - 506 925

P R O C E E D I N G S

COURTROOM DEPUTY: This is Civil Action 96-1285, Elouise Cobell, et al. versus Dirk Kempthorne, et al.
THE COURT: Dr. Angel was on the stand, and he may resume his position there. And Mr. Smith, I think you're still cross examining.
MR. SMITH: Yes, Your Honor.
THE COURT: You may continue.
MR. SMITH: Good morning, Your Honor.
(DR. EDWARD ANGEL, DEFENDANT witness, having been previously duly sworn, testified as follows:)
CONTINUED CROSS-EXAMINATION
BY MR. SMITH:
Q. Good morning, Dr. Angel.
A. Good morning, Mr. Smith.
Q. Dr. Angel, were you involved at all in the preparation of what has been referred to in this trial as AR-171?
A. I supplied Ms. Herman with some documentation, we've chatted about it.
Q. Okay. If we could look at Exhibit 68, which is the May 30th version of AR-171, and as you can tell, we've highlighted the period prior to 1972. Do you see that?
A. I do.
Q. And that's the period that you were involved in with your investigation and the documents you supplied. Is that correct?
A. That's correct.
Q. Now, as you can see on the bottom left-hand corner, it says, "as of May 30th, 2008." Do you see that?
A. I do.
Q. Okay. And we notice that the numbers pre-1971 -- or excuse me, 1972, when we get to the June 4 version, it changed significantly. If we could look at Exhibit 69. In fact, you have hard copies there if you want to compare them.
A. Thank you.
Q. We're now looking at the June 4th version, and as you can see, the numbers have changed fairly significantly.
A. Yes, sir.
Q. I'm not going to ask you specifically about those numbers, but my question is: Is there any information that you provided Ms.
Herman or Dr. Scheuren or anybody else at Interior from May 30th to June 4th which to your knowledge would have caused a change in those numbers?
A. No.
Q. So any information that you provided them would have been prior to May 30th?
A. Correct.
Q. Okay. Thank you.
Dr. Angel, yesterday you mentioned -- you were criticizing the model prepared by the plaintiffs, or critiquing it. And you discussed Osage, and I want to make sure I understood your comments. You were concerned about the use of the number 2,229 in the calculation of the Osage revenue. Do you recall that?
A. That's correct.
Q. And was your concern about the use of that number prior to 1906?
A. That's correct.
Q. And my question is, did you ever discuss that at all with Ms. Herman or anybody at FTI, the concern about using that number prior to 1906?
A. I did.
Q. Okay. And do you recall what Ms. Herman's comment was to you?
MR. SIEMIETKOWSKI: Objection, Your Honor. Calls for hearsay.
THE COURT: Overruled.
A. I do not.
BY MR. SMITH:
Q. If we could look at DX 372-165. And Dr. Angel, I'll represent to you that this is the document -- one of the documents that was provided to us by FTI. And do you see at the top it says "Osage Annuity Payment Per Headright"?
A. Yes.
Q. And it starts in 1880 and then goes down through the 1800's. And if we could roll down to the bottom, and at the bottom of the page it says, "Calculated Annuity Per Share Payment, 2,229 Shares"?
A. I am aware that Ms. Herman did use that figure, and I believe NORC used that figure as well.
Q. So you're aware that both parties used the identical figure?
A. I am.
Q. And my question is, if that figure was not used, have you determined what figure should be used for that pre-1906 period?
A.
As we've looked at the documentation and taken a look at total payments - mostly these are annual reports of the Commissioner of Indian Affairs - what we've seen is that the payees range between around 1,500, to, by 1906, around 2,100.
Q. 1,500 to 2,100, that range?
A. Roughly.
Q. Okay. Thank you. One other comment or question about Osage before I move on.
You had used the phrase "direct pay" at one point in your testimony in connection with Osage. Is my understanding that you use that term because money at times goes directly from the Tribal Trust to headright owners? Is that your understanding of direct pay?
A. I don't know if that was Tribal Trust or not. These were payments prior to 1908 that were made as a result of treaty obligations of the United States, but my understanding was, from reading the documentation, that, yes, money was paid directly to individual Osage Indians.
Q. Okay. So when you were using the word "direct pay," you were using that in the context of prior to 1906, when the Osage statute was enacted?
A. That's correct. In this instance I'm using it prior to 1906.
Q. Okay. Thank you.
If we could look at DX-483, and I think there's a hard copy there in front of you which you testified was the chart you prepared. And let me ask you this: When you were preparing this chart, if you had two or more data points for the same year and they were in conflict, how would you decide which one to use?
A. I believe we addressed that yesterday, and I tried to use, to the best of my ability, both. The one instance we had talked about yesterday was, gee, in the late 1960's, I believe, I had used the GAO report for the total, but I had the other component parts listed as well.
Q. Okay.
A. So where I could use -- where I did use more than one document per year, where I could, I did.
Q. Okay. Good.
Let me give you an example. If we could look at on DX-483, page three, and I've highlighted on there the years 1952, '53, and '54. Do you see that?
A. I do.
Q. And under "IIM System Funds Invested in Government Securities," we have the numbers 35 million 425; 33 million, 183; and 31 million, 831. Do you see that?
A. I do.
Q. And if we could look at DX-64 at 64-5, and this appears to be a document from your records. The "MA" designation is yours on the Bates stamp. Is that correct?
A. That's correct.
Q. And if we could -- it appears to be trust funds and certain other accounts of the federal government, holdings of federal securities by government agencies in accounts June 30, 1952 to '60.
If you could see the first three columns are the identical time period, 1952, '53, and '54. Do you see that?
A. I do.
Q. And if we could scroll down to the second half of the page and focus in, it says "Individual Indian Money Deposit Fund in Trust Funds." And it looks like for those periods, the first number, 3-5-4-2-5 is the same --
A. Uh-huh.
Q. -- less a few dollars, and then the number you use is 3-3-1-8-3-2-5-5, which is roughly $900,000 less than what is on the chart. Do you see that?
A. I do.
Q. And then the following year, 3-1-8-3-1 is the exact same number that you use.
Can you tell me why in this particular case you used a number that was roughly $900,000 less than what was on the exhibit?
A. No. It was not intentional.
Q. Okay. Was there another document that you looked at that had a lower number on it?
A. Could you tell me again which DX this is, please?
Q. Sure. This is DX-64.
A. I did not. I used DX-61 -- 59, 60, and 61 to derive these figures, so no, I can't tell you.
Q. Okay.
Would this be one difference that perhaps you were not aware of today and you hadn't discussed with Dr. Scheuren or Ms. Herman?
A. This would be.
Q. Okay. Thank you.
If we could go back to DX-483 and look at page four, which is -- and focus in on 1972. And just looking at this period on this page, you have numbers for funds in banks, numbers for government securities, but you have nothing for funds held in Treasury. Is that correct?
A. That's correct.
Q. And does that mean that there was no money in Treasury, or you just couldn't find numbers to fill in the blank?
A. May I see the document?
Q. Well, let's give an example, DX-77. This appears to be the document that you used for that particular point in 1972, and correct me if I'm wrong, but it seems to be a document that discusses the status of investments as of June 30, 1972.
A. That's correct.
Q. And that particular document would not discuss what funds were actually held in Treasury and not invested. Is that fair?
A. That's fair.
Q. So this is one place where you were not able to determine what funds were held in Treasury and were not able to fill in that particular blank. Is that correct?
A. That's correct.
Q. Okay. When we're looking at funds that are invested, were you able to determine how they value these bonds? Are they valuing them at face value, at purchase price? Were you able to determine that?
A. I took the information directly from the chart.
Q. Do you know how they valued them?
A. No, I don't.
Q. And obviously we're looking at, for example, Treasury securities at one point time, on one day during a whole year. Is that fair?
A. These are supposed to be year-end account balances, correct.
Q. And during that intervening year, presumably securities were being purchased and redeemed?
A. Correct.
Q. And were you able to ever tell, during the intervening period, the volume of purchases and redemptions of securities? Did you find any documents on that?
A. I did not analyze those documents, but I did see there's very heavy documentation in Central Office Albuquerque records that are currently at the American Indian Records Repository. I believe I've reviewed them at the Office of the Trust Records in Albuquerque when they were there.
But there's fairly heavy documentation on that. I didn't feel qualified, obviously, to analyze it. That's not a specialty of mine.
Q. Are those documents that you copied and brought with you, or did you leave them at the repository?
A. There's many, many boxes. No, those aren't ones. I certainly may have copied a sample document or two, but I made no effort to collect a whole run.
Q. So those are documents that to your knowledge nobody has reviewed today?
A. I don't know whether anyone has or not.
Q. Okay. Good.
By the way, the AR-171 has an interest column. Were you involved at all in the determination of what the interest factor was that was used?
A. We had documentation on the interest factors, which, you know, I sent to Ms. Herman, but I don't know how much of it she used in AR-171.
Q. Okay. Is it fair to say that for most of the life of the trust, individual Indian funds earned interest?
A. For the most part. Sometimes they did not, but for the most part that was the goal.
Q. Right.
A. There were years that we know that cash sat in Treasury.
Q. Okay. But basically, since the inception of the trust, for the most part Individual Indian Trust funds were invested in --
A. Yes.
MR. SIEMIETKOWSKI: Objection, Your Honor. Asked and answered.
THE COURT: I'll allow it.
A. Yes.
BY MR.
SMITH:
Q. One final line of questions. You were asked yesterday regarding internal controls, and I believe you testified in your opinion as a historian, the processing seemed reasonable?
A. Yes.
Q. And you've actually written a paper on the audit procedure at Interior. Is that fair?
A. I wrote a --
THE COURT: A what?
MR. SMITH: On the history of the audits at --
THE COURT: Oh, the audits.
A. That's correct. In 2000.
BY MR. SMITH:
Q. Just briefly, my recollection is you looked at the history of the audits from roughly 1910 through 2000. Is that correct?
A. 1940, I believe, was when the report started.
Q. Okay. My recollection is that back in 1910 there was a commissioner by the name of Valentine. Is that correct?
A. That's correct.
Q. And Commissioner Valentine at that time determined that there were problems with the accounting system. Is that correct?
A. That's correct.
Q. And that he asked Congress to fund money to do a study of the accounting system?
A. That's correct.
Q. And that led to what we've referred to as the 1914 report done by the New York Bureau of Municipal Research. Is that correct?
A. That's correct.
Q. And they found significant problems with the accounting system. Is that correct?
A. That's correct.
Q. And so they created a new accounting system under Cato Sells' administration. Is that correct?
A. Yes. In 1917.
Q. And is it fair to say, then, by the 1920's, they found problems with that system as well? Is that correct?
A. It's under review continually. I mean, it's being audited by -- the system itself is under frequent review, through its history, by GAO, internal reviews for Interior Department. It's a system of audit, reform, audit, reform.
Q. Right.
So you go through these cycles; they put in a new system, they find problems with it, they put in a new system, they find problems with it?
A. Or refine -- you know, address the problems that have been found, correct.
Q. Are you aware of a single independent evaluation prior to 1953 that said BIA's accounting system with respect to the IIM Trust was working effectively?
A. I'm aware of a number of reviews that found some problems with it, but nothing between 1917 and '53 that saw irresolvable problems. I know that, for example, in the '30s and '40s, the Bureau of Indian Affairs worked with the General Accounting Office in revising its regulations; I know that in the '40s the General Accounting Office said that there were problems, and then I know that by I believe 1953, the General Accounting Office approved the BIA accounting system.
Q. Okay. And my question was specifically an independent audit, an audit by a third party that up to 1953 said that the systems were working effectively?
MR. SIEMIETKOWSKI: Objection, Your Honor. Asked and answered.
MR. SMITH: Your Honor, I don't think he answered that.
THE COURT: Overruled.
A. Again, I wasn't seeing them with major problems, I was seeing audits that were -- you know, that were addressing individual issues and specific issues, and attempting to reform the system.
BY MR. SMITH:
Q. Perhaps I'm not being clear enough. When I'm referring to an independent audit, someone other than Interior or someone not associated with the government. Are you aware of an independent audit prior to 1953 that said the systems were working effectively?
A. I'm sorry, by independent audit you mean one that wasn't a federal audit?
Q. Yeah.
A. No, I'm not aware of an independent audit prior to the Andersen audit, I don't believe.
Q. Okay. Other than the one in 1914?
A. Yes.
Q. Now, you've talked a little bit about the GAO studies that were done in the 1920's to the 1950's. And if we could look at Exhibit 95, and you've seen this document before, it's the Comptroller General's report from 1929.
A. That's correct.
Q. And if we could look at page six of that document, please, and look at the highlighted language.
A. (Witness complies.)
Q. Have you had a chance to look at that?
A. Yes.
Q. So at this point in time in 1929, the comptroller said that "no detailed check could be made of all revenues accruing to the individual Indians to determine that each received all to which he was entitled." Is that correct?
A. Yes, that's what the document says.
Q. Right. And that's during the period that you contend that the GAO was doing their audits?
A. Yes. May we scroll down a bit?
Q. Sure.
A. If you look at the next paragraph, certainly the next paragraph beginning, "The Indian fiscal agents render to the General Accounting Office a monthly accounting for all funds," that looks to me to be a fairly detailed, a fairly detailed analysis.
"Schedules of monies collected and all disbursements are supported by vouchers or other documents showing the expenditure to have been properly authorized. These accounts are audited by the General Accounting Office, and the balances reported verified."
Q. Right. Okay. You are aware, are you not, of the letter from Gene Dodaro, the principal assistant to the comptroller, regarding the GAO studies?
A. Yes, I've read it.
Q. If we could show Exhibit 96, and look at page two.
MR. SIEMIETKOWSKI: Objection, Your Honor. Relevance.
THE COURT: Well, I was waiting for that objection.
Mr.
Smith, you know, the reason we're here today is because of all these old reports that say that the Indian accounting system was no good. I think that's been established from the get-go. What we're trying to do in this proceeding is fix a number. I'm not sure what this quotation of a paragraph from this report and that report and another report is doing for us in that regard.
MR. SMITH: Your Honor, if this is of no benefit to the Court, I'll move ahead.
THE COURT: I won't say no benefit. I would never say that to you, Mr. Smith.
MR. SMITH: Thank you, Your Honor.
THE COURT: I would say limited benefit.
MR. SMITH: I understand.
BY MR. SMITH:
Q. Is it fair to say that in the 1940's, BIA was complaining to Congress that they didn't have enough money to do audits? Do you recall that?
A. There have been reports to that effect. I have seen Congressional hearings to that effect.
Q. And they complained that their audits were seriously in arrears. Is that correct?
A. Please say that again.
Q. They complained that their audits were seriously in arrears because of lack of money?
A. They wanted -- I remember specifically requesting more auditors.
Q. Because they were in arrears, they were behind?
A. Correct.
Q. And it was in the 1950's that they started -- let's say they increased the internal audits, the audits by BIA?
A. Yeah. BIA audit division began making annual trips to every agency beginning in 1956.
Q. And you've noted in your own writing a lot of the problems that were encountered during those audits. Is that fair?
A. That's fair.
Q. Unauthorized payments, balance problems. Correct?
A. That's correct.
Q. And is it fair to say that the audit division of BIA expressed its dismay over the fact that they were finding the same problems over and over again?
A. That's correct.
Q. And then by the 1960's, any audits the GAO had done stopped. Is that correct? The GAO discontinued any audits in 1960. Is that fair?
A. I believe that they made occasional -- I know they slowed down very much in the 1960's, but I don't believe that they stopped until some point after that. But I can't remember. They've slowed down very, very much.
Q. Okay. One last question. Is it fair to say, in all the documents that you've reviewed, that money in fact was being caught up in the system and was not being disbursed to beneficiaries?
MR. SIEMIETKOWSKI: Objection, Your Honor. To the form of the question.
THE COURT: That's a pretty broad question, Mr. Smith, but I'll allow it. If the witness wants to answer it, he can answer it.
"Is fair to say, in all the documents that you've reviewed, that money in fact was being caught up in the system and was not being disbursed to beneficiaries?"
THE WITNESS: Well, yes. Because money is entering the system that's not meant for beneficiaries. That's the point of the buckets that we've talked about.
THE COURT: I don't think that's what he meant.
BY MR. SMITH:
Q. Let me ask the question differently. That money intended for disbursement for beneficiaries wasn't going to them. Isn't that reflected in the audits you've reviewed?
A. There have been audits that have talked about that.
Q. Let's look at one. If we could look at DX-10, please. And again, this appears to be a trust fund task force study compiled May 20, 1975. Do you see that?
A. I do.
Q. If you look at the bottom of the page, does it have your Bates stamp on it?
A. I know it's one of our documents.
Q. Okay. If we could look at I believe it's 10-9, and if you could look at the highlighted language.
A. (Witness complies.)
Q.
Do you recall reading that language when you were looking at the audits?
A. I do.
Q. In fact, it indicates that as of 1975, they were still retaining checks in the millions going back to the 1880's?
A. Could we scroll up, please?
Q. Sure.
MR. SMITH: If you could scroll up.
A. All right. Yes. Thank you.
BY MR. SMITH:
Q. And my question is simple. This would be one example, would it not, where trust funds were intended for beneficiaries, but for whatever reason they were not being distributed. Is that fair?
A. That would be, yes.
MR. SMITH: Your Honor, I have no further questions.
THE COURT: All right. Thank you. Redirect, Mr. Siemietkowski?
MR. SIEMIETKOWSKI: Yes, Your Honor. Good morning, Your Honor.
THE COURT: Good morning.
REDIRECT EXAMINATION
BY MR. SIEMIETKOWSKI:
Q. Good morning, Dr. Angel.
A. Good morning, Mr. Siemietkowski.
Q. Dr. Angel, how long have you been reviewing Indian documents?
A. About 25 years, total.
Q. And how many Indian documents are in Morgan Angel's collections?
A. Literally thousands. There's 10,000 in our Cobell collection.
Q. I'm sorry, how many?
A. There's about 10,000, roughly, in our Cobell collection.
Q. I'm going to show you DX-72, Dr. Angel, and specifically page four from DX-72. Once enlarged, I'll ask if you recognize that from your cross-examination of yesterday.
A. I do.
Q. Now, do you recall Mr. Smith asking you why you did not include the $121 million per year on your total IIM chart for 1968?
A. I do.
Q. And do you recall saying something along the lines that you missed that one?
A. I do.
Q. Now, looking at the language as highlighted there, is there anything about that language that would explain your not including that for 1968 or any year on your chart?
A.
I reviewed that last night at my office, and in reviewing it, it appeared to me that it was an average over years; in other words, cash receipts running at a rate of 121 million per year.
Q. Having had a chance to reflect on that in your office last night, would you use that figure now if you were updating your chart today?
A. No.
Q. Do you recall Mr. Smith yesterday, Dr. Angel, showing you a 1915 CIA report and asking you where direct pay was addressed in that report?
A. I do.
Q. Now, do you recall whether direct pay was addressed in that 1915 CIA report?
A. It wasn't.
Q. Were there any earlier years, Dr. Angel, in which direct pay was addressed in CIA reports?
A. There were. Last night I went back to my office and I found a 1912 annual report of the Commissioner of Indian Affairs and a 1913 annual report of the Commissioner of Indian Affairs, which broke out direct pay from -- which broke out the issue of direct pay on allotted lands.
MR. SIEMIETKOWSKI: May I approach the witness, Your Honor?
THE COURT: Yes.
MR. SIEMIETKOWSKI: Your Honor, I'm handing the witness what will be marked as Defense Exhibits 505 and 506, and I would ask permission to provide hard copies to the Court as well as to opposing counsel. I'll hand the court clerk copies, as yet unmarked, of 505 and 506, and do the same for opposing counsel.
BY MR. SIEMIETKOWSKI:
Q. Dr. Angel, 505 is going to be marked the 1912 report. Do you have that in front of you?
A. I do.
Q. Could you please identify that report for the Court?
A. That's the annual report of the Commissioner of Indian Affairs for the fiscal year ended June 30, 1912.
Q. In this report, which is DX-505, have you been able to identify any particular page that indicates direct pay?
A.
Pages 247 to 251, table 60 -- excuse me, table 50, "Allotted Lands Under Lease During Fiscal Year Ended June 30, 1912," you'll see that the reservation is listed, or I'm sorry, the agency is listed, and then they discussed how leased. And you'll see that much of it is through departmental control, but you'll also see "By Indian Direct With Departmental Permission," and you'll see also "By Indian Direct."
There's a recapitulation on page 251.
Q. On page 251 of the 1912 report, DX-505, where do you see an indication specifically there of direct pay?
A. "By Indians Direct Without Permission."
Q. And Dr. Angel, just for the record, since we don't have this on the screen, could you please direct the Court to the left or the right side of the chart?
A. I beg your pardon. At page 251 you'll see total lease through departmental control by Indians direct with permission to lease, and by Indians direct without permission from the department.
Q. And what do those two lines mean, Dr. Angel?
A. That means that leases were made directly without money coming into the department.
Q. Turning your attention next to DX-506, which is the 1913 report, do you recognize that document?
A. I do. It's the annual report of the Commissioner of Indian Affairs for the fiscal year ended June 30, 1913.
Q. Using DX-506, Dr. Angel, are you able to point the Court to any place in this particular document that indicates direct pay?
A. Pages 216 to -- pages 216 to 219, table 49, "Allotted Lands Under Lease."
Q. Table 49 is on page 219?
A. I beg your pardon. Table 49 is from pages 216 to 219.
Q. And which particular language in those pages, Dr. Angel, indicates a direct pay?
A. "By Indians direct with permission to lease without departmental control; by Indians direct without permission without departmental control."
Q. Now, to your knowledge, Dr.
Angel, when did this breakdown of direct pay in the CIA reports end?
A. I found it just for these two years.
Q. These two years. And do you recall which years, again, that Mr. Smith showed you the CIA report?
A. I believe almost 1915.
Q. Now, Dr. Angel, do you recall yesterday in cross-examination Mr. Smith showing you investment statistics prior to 1928?
A. I do.
Q. Do you recall him asking you why you did not include the statistics on your chart?
A. I do.
Q. Do you recall whether those statistics he showed you prior to 1928 were aggregate for the IIM system or for a particular agency?
A. Could you please repeat that?
Q. Yes. The statistics that Mr. Smith showed you yesterday in cross-examination regarding investments --
A. Uh-huh.
Q. -- do you recall whether those were aggregate statistics or just for a particular agency?
A. I don't recall. I know that we've not found aggregate statistics on an annual basis that we would have been able to put into the chart prior to 1928.
Q. Thank you.
MR. SIEMIETKOWSKI: And if I could please have DX-483 on the screen.
BY MR. SIEMIETKOWSKI:
Q. And once on the screen, I'll ask Matthew to show DX-583, which is your total IIM chart, Dr. Angel, and to take a look specifically at 1953.
MR. SIEMIETKOWSKI: If Matthew could zoom in a bit on the far right column for 1953, if that's possible.
THE COURT: Good solution, Matthew.
MR. SIEMIETKOWSKI: Well done, Matthew.
BY MR. SIEMIETKOWSKI:
Q. During your cross-examination this morning by Mr. Smith, Dr. Angel, do you recall Mr. Smith showing you your figures for 1952, 1953, and 1954?
A. I do.
Q. Do you recall him showing you a different figure for 1953 taken from DX-64?
A. I do.
Q.
Would you please indicate to the Court whether DX-64 was the basis for your figure for 1953?
A. No, the basis for my figure was DX-61.
Q. Is it true, as Mr. Smith asked you, that at times you chose the better data when several data points existed?
A. No, that's not true. That's not true. I tried to make this as representative as possible, which is another reason why I included that 1922 receipt and disbursement figure that we've talked about. I've tried to be as complete as possible with this chart.
Q. Well, that's what I want to ask you last about, Dr. Angel, the 1922 disbursement figure. Do you recall talking about that yesterday with Mr. Smith?
A. I do.
Q. Do you recall him asking you about the 1922 CIA report?
A. I do.
Q. Do you recall telling Mr. Smith that you viewed that data point for 1922 as an outlier?
A. I do.
Q. What did you mean by that?
A. Statistical anomaly, something that didn't strike me as making sense in view of what I had seen before that period and what I saw after that period.
Q. Now, on your direct yesterday you had discussed at times examples of qualified data that you had found. Do you remember that?
A. I do.
Q. Is this 1922 outlier an example of qualified data?
A. Yes.
Q. And did you convey this outlier qualification for 1922 to NORC?
A. I did.
Q. Thank you, Dr. Angel.
MR. SIEMIETKOWSKI: No further questions, Your Honor.
THE COURT: All right. Dr. Angel, I think that completes your testimony. You're excused, sir. You may step down.
THE WITNESS: Thank you.
MR. SIEMIETKOWSKI: Your Honor, I would like to move the admission of DX-505 and 506. Those are the two CIA reports that we just discussed.
THE COURT: 505 and 506 are received.
(DEFENDANT EXHIBIT Numbers 505, 506 were moved into evidence.)
THE COURT: Mr. Warshawsky?
MR. WARSHAWSKY: Good morning, Your Honor.
Our next
926
1 witness will be Dr. Fritz Scheuren, and he's being taken out of
2 the witness room right now.
3 (Oath administered by Courtroom Deputy.)
4 MR. WARSHAWSKY: May I approach the witness?
5 THE COURT: Yes.
6 (DR. FREDERICK SCHEUREN, DEFENDANT witness, having been duly
7 sworn, testified as follows:)
8 DIRECT EXAMINATION
9 BY MR. WARSHAWSKY:
10 Q. Good morning, Dr. Scheuren.
11 A. Hi.
12 Q. Would you please state your name for the record?
13 A. Frederick, usually go by the name of Fritz, Scheuren.
14 Q. And Dr. Scheuren, where do you reside?
15 A. I live in Alexandria, Virginia.
16 Q. What do you do for a living?
17 A. I'm a statistician.
18 Q. And with whom are you employed?
19 A. I work for National Opinion Research at the University of
20 Chicago.
21 MR. WARSHAWSKY: Your Honor, just to give a brief
22 synopsis of Dr. Scheuren, as the Court requested. As the Court
23 is aware from Dr. Scheuren's testimony from last October, he is
24 one of the leading statisticians in the world -- maybe the
25 world, certainly in the United States. I don't know.
927
1 But in any event, he is a senior fellow and vice
2 president with the National Opinion Research Center affiliated
3 with the University of Chicago, and he's been there since 2001.
4 He's the author of numerous books, articles, papers about the
5 subject of statistics; former president of the American
6 Statistical Association; well decorated, highly honored, all of
7 that, in the field of statistics for his work in pro bono human
8 rights efforts.
9 Dr. Scheuren is going to be offered as an expert today
10 to provide opinions in the area of statistics, and specifically
11 in two areas; first with regard to the results of a multiple
12 imputation effort undertaken by NORC; and the second thing will
13 be to provide some analysis of Dr.
Cornell's single variate
14 model which the plaintiffs provided in their case-in-chief.
15 THE COURT: All right. He may certainly give that
16 testimony. He was qualified previously.
17 MR. WARSHAWSKY: And just for the record, would you put
18 up Defendant's Exhibit 458?
19 BY MR. WARSHAWSKY:
20 Q. Dr. Scheuren, do you recognize DX-458?
21 A. I do.
22 Q. Would you please tell Judge Robertson what it is?
23 A. That's my résumé, updated slightly since last fall.
24 Q. And can you summarize very briefly how it's been updated
25 since you testified in October?
928
1 A. Well, since last fall I actually have produced two more
2 books than I had here. I'm very busy in other things, but I
3 haven't produced any more papers at this point.
4 Q. Dr. Scheuren, would you provide a general overview of what
5 NORC did in connection with the analysis prepared for this
6 hearing?
7 A. Certainly. We took to heart the judge's request for
8 providing information about inputs and outputs, how much -- cash
9 in, cash out, I guess is the phrase that I believe the judge has
10 used. And we attempted to look at the results that we had
11 obtained from Dr. Angel and from Michelle Herman's team,
12 Michelle particularly, in order to try to make an assessment of
13 what we would be able to conclude about that matter using our
14 statistical tools.
15 Q. Now, you referred to information that you received from FTI
16 and Dr. Angel's group. Did they provide you complete data?
17 A. Well, no, they didn't. They couldn't. We carefully
18 examined what they gave us to make sure that it was the final
19 numbers insofar as they could give them to us. We were very --
20 wanted to take -- did not want to take the -- have them do any
21 imputation at all because we were attempting to do that.
22 Q. Now, so I take it from your answer there was missing data
23 in --
24 A. Yes, quite a bit.
25 Q.
Just so we don't have problems with the court reporter,
929
1 please let me finish my question and I'll let you finish your
2 answer. Okay, sir?
3 A. (Indicating.)
4 Q. Excellent. Thank you.
5 Is missing data a problem statisticians deal with
6 periodically?
7 A. Yes, very common problem.
8 Q. And is it unusual to encounter missing data when you're
9 dealing with an issue dating back, say, several decades?
10 A. Well, no, I wouldn't have thought it was unusual, and it
11 isn't in this case unusual.
12 Q. Well, how do statisticians deal with the issue of missing
13 data?
14 A. Well, you develop an understanding of the data set and the
15 environment the data was in, as this court is doing, and then
16 you employ your past experience with similar situations, you let
17 the data speak to you and suggest to you how it might be filled
18 in.
19 Q. Are there any techniques, terms --
20 A. Well, yes --
21 Q. I'm sorry. Are there any statistical techniques utilized in
22 your field that --
23 A. There are quite a few techniques utilized. The one that we
24 thought would be appropriate here was a technique called
25 multiple imputation.
930
1 Q. What are some of the other techniques that you could utilize
2 to supply missing data?
3 A. Well, you can use some forms of substitution, which turn out
4 to be what was done in the earlier data that Michelle talked
5 about a couple of days ago. Those kinds of techniques don't
6 allow you to see the uncertainty in the data. The multiple
7 imputation technique was specifically developed so you could see
8 the uncertainty.
9 Q. And when you're referring to substitution, that was with
10 respect to the Chavarria Dunne --
11 A. That's correct. Yes, we looked at that data, again, once it
12 was pointed out to us. We had not known about that until it was
13 brought up in cross-examination.
We looked at that, and it does
14 affect the uncertainty in the data, and I'll come back to how I
15 think -- to what degree I think it affects the uncertainty at
16 the end.
17 Q. Very good. Substitution, is there anything improper about
18 using substitution as a means to supply missing data input?
19 A. It's very common. When I was at the IRS, I used to head the
20 statistics operation at the IRS for many years, and we used
21 substitution there when we didn't get late returns from
22 taxpayers.
23 THE COURT: Do you let taxpayers fill in the blanks the
24 same way?
25 THE WITNESS: No, sir.
931
1 THE COURT: Sauce for the goose, huh?
2 THE WITNESS: No, let me explain. My particular
3 operation at the IRS was to produce statistics which could be
4 used for the national accounts in the U.S., and for tax policy
5 simulation. And they had to be in some sense representative.
6 If we were missing some major corporations, for example, because
7 they were late filers, then we had to find a way to introduce
8 that information in order to produce the records for that year.
9 Not that they didn't file eventually. They did. But we usually
10 used data from a previous year.
11 BY MR. WARSHAWSKY:
12 Q. Dr. Scheuren, I would like to have you look now at --
13 MR. WARSHAWSKY: Would you put up Defendant's
14 Exhibit 459?
15 BY MR. WARSHAWSKY:
16 Q. And I'm actually holding a copy of a book called "Rubin
17 Multiple Imputation For Nonresponse in Surveys." The cover page
18 is displayed as DX-459. Are you familiar with this book, sir?
19 A. Oh, yes, I am.
20 Q. And is this book considered a reliable authority in the
21 field of statistics?
22 A. I think it is, yes.
23 Q. Who is Professor Rubin?
24 A. Well, he is a professor at Harvard. He just was the chair
25 at the statistics department at Harvard, now passed on. Chairs
932
1 rotate at places like that.
He's really well-known. All your
2 nice things about me earlier really do apply to him. He's an
3 outstanding individual.
4 Q. Well, why don't we refer to page two of DX-459. There's
5 Dr. Rubin. And then move on to page four.
6 MR. WARSHAWSKY: And why don't you blow up the preface
7 section that we've got highlighted here?
8 BY MR. WARSHAWSKY:
9 Q. In the first paragraph, first two sentences, Professor Rubin
10 wrote, quote, "Multiple imputation is a statistical technique
11 designed to take advantage of the flexibility in modern
12 computing to handle missing data. With it, each missing value
13 is replaced by two or more imputed values in order to represent
14 the uncertainty about which value to impute," end quote.
15 Dr. Scheuren, do you understand what Professor Rubin is
16 talking about there?
17 A. Yes, I do.
18 Q. And do you agree with that statement?
19 A. Absolutely. That's why it was developed. There were a lot
20 of good techniques to fill in missing data or to impute data,
21 but there were very few techniques at that time, and still very
22 few, that really allow you to see the uncertainty once you have
23 to deal with missing data.
24 Q. And do you have an understanding as to why Professor Rubin
25 referred to the flexibility of modern computing in that
933
1 statement?
2 A. Amazingly -- that was a long time ago. Amazingly, he
3 anticipated the day we're in now, where computing is virtually
4 free and we can do many, many imputations. He was looking at
5 only a handful in those days because nobody thought it could be
6 done.
7 Q. Now, in the second sentence, Professor Rubin talks about
8 doing this imputation process, quote, "in order to represent the
9 uncertainty about which value to impute," end quote.
10 Why is Professor Rubin -- based on your understanding,
11 why is he referring to uncertainty with respect to multiple
12 imputation?
13 A.
Well, that was the issue that I brought to him - and you
14 will get us to that next paragraph - because there are a lot of
15 good techniques used at the Census Bureau, hot deck being one of
16 them, and many others --
17 Q. I'm sorry, you referred to hot deck?
18 A. Hot deck is a technique, I think it really started -- I
19 think it started in the 1940 census, but became widely used in
20 the 1950 census to replace or fill in data that -- and complete
21 data that individuals had not completed on their census
22 questionnaires.
23 Q. So getting back to the earlier question, explain why
24 uncertainty is relevant when you're talking about multiple
25 imputation.
934
1 A. Because it affects the variance of your estimates. You have
2 to widen the confidence intervals that you're using to represent
3 the phenomenon. It's not free. Missing data is not free.
4 Q. What does that mean, I'm sorry, "it's not free"?
5 A. Let's suppose that you have a sample -- I'll use a polling
6 example, since I just finished a book on exit polling. If you
7 have a sample of 1,500, and there's no missing data, and you're
8 trying to make an estimate of what the proportion -- of a
9 50 percent proportion, then the margin of error, as you see in
10 the newspapers all the time, all of you, is plus or minus three
11 percent.
12 But if in fact you didn't have 1,500, you had a much
13 smaller number, perhaps because the data was missing, then that
14 margin of error would be much wider.
15 Q. Now, you sat in the gallery during Dr. Cornell's testimony?
16 A. I did.
17 Q. And was Dr. Cornell's analysis one where you could
18 utilize -- one that you could predict or provide estimates of
19 uncertainty for?
20 A. He didn't do that. He -- I think he had a good approach in
21 a lot of ways. There were some issues with the data he used,
22 but we'll come back to that later.
But I thought he
23 fundamentally did not address this issue of uncertainty.
24 Q. If we can get to the second paragraph now on page four, it
25 states that "the real impetus for multiple imputation, however,
935
1 came from work encouraged and supported by Fritz Scheuren," and
2 continues on. That's you. Right?
3 A. Yes, it is.
4 Q. And how did you provide the impetus for multiple imputation?
5 A. Well, I had a problem that I was looking at the Census
6 Bureau data, where -- we working, so long ago now, working on
7 the war on poverty, and I was just after that looking at how to
8 deal with missing income data. And I saw the techniques that
9 the Census Bureau had advocated back in the '50s but were no
10 longer using, and which were in any case only of limited value,
11 and I felt we needed to look at that because it was very
12 important to understand the uncertainty in the data.
13 Q. And getting back to the first paragraph in that second
14 sentence, Professor Rubin spoke about replacing a missing value
15 with two or more imputed values.
16 A. Correct.
17 Q. How does that work, generally?
18 A. Well, what you want to do is you want to -- one of the
19 things that we do in statistics, we operate nearly all the time
20 as if we had complete data sets. In introductory statistics
21 courses, in many advanced courses we don't assume that we have
22 any missing data, we assume that we have all the data we need.
23 And what Don said was, let's complete the data to
24 complete the data matrix, as he called it, and then use the
25 standard techniques that we all learned long ago.
And in order
936
1 to -- but in order to deal with that, the missing data part, we
2 have to correct the variance estimate, the uncertainty measures,
3 and then we can in fact use all our tools that we've learned way
4 back, both in intermediate courses, advanced courses, and so
5 forth. It was a great idea.
6 Q. What did you mean about completing the data matrix?
7 A. Well, I think in a few minutes we're going to look at the
8 data matrix from this data set that we're examining for this
9 trial, and we'll talk about how multiple imputation completed
10 that.
11 Q. But just on a very general level, what does it mean to
12 complete the data matrix?
13 A. Let's say I have five variables and I have them for
14 100 observations. So 100 observations; each observation has
15 five things that I want to get from that person or that
16 organization or whatever it is. And so I just look at it as an
17 array, 100 rows long, five columns wide, okay. And some of
18 those values are missing. They're not there. They're blank.
19 Okay? And they shouldn't be blank.
20 And so what Don said was, well, we'll come in and find
21 a way to fill in those missing -- fill in those holes, and then
22 we'll have a complete data set and then we can go on and do what
23 we intended to do with a complete data set, whatever that was.
24 Q. How do you fill in those holes?
25 A. Well, we use an imputation technique. There are lots of
937
1 them, but we use an imputation technique that in this context
2 comes out of a Bayesian approach to statistics, B-A-Y-E-S-I-A-N.
3 Q. And what is a Bayesian approach?
4 A.
A long time ago, a minister in England, an Anglican minister
5 in England wrote about an idea of using information that he had
6 as a prior, and bringing that prior information along with data,
7 and constructing a way of doing an analysis with the prior and
8 the data itself into what has been called now a posterior
9 distribution.
10 So we constructed posterior distributions using
11 information that we either knew about the data but wasn't in the
12 data itself, plus the data, the incomplete data we had, to
13 develop posteriors.
14 Q. So you're taking information that you know and using it to
15 predict the unknown, or the missing information?
16 A. And that allows --
17 Q. Is that correct?
18 A. That's correct. And that allows us to go outside of the
19 data set, the incomplete data set, to complete it, and to
20 complete it not just in a way that, say, subject matter experts
21 would complete it, but to complete it in a way that would allow
22 us to measure its uncertainty as well.
23 Q. Dr. Scheuren, have you written any articles about multiple
24 imputation?
25 A. I have done a lot of work on missing data problems, and I
938
1 recently wrote an article in November 2005 on the experience of
2 working with Don Rubin in those days.
3 MR. WARSHAWSKY: And Your Honor, I'm going to refer to
4 this very briefly. I have not marked it as an exhibit, but if I
5 may approach the witness?
6 THE COURT: Yeah, it's a little late for me to announce
7 the rule. I thought I had done it before. You never need
8 permission to approach a witness in my courtroom, as long as
9 you're doing it for a benign purpose.
10 THE WITNESS: Thank you, Your Honor.
11 MR. DORRIS: Your Honor, since I don't know how benign
12 it is, I would ask that it at least be marked so we can refer to
13 it in the record as we move forward.
14 BY MR. WARSHAWSKY:
15 Q. Dr.
Scheuren, I've placed before you what's been marked
16 Defendant's Exhibit 507. Would you describe what this is,
17 please?
18 A. This is a recollection of the work I did with Don and others
19 in those days long ago which led to the book that you've
20 referenced in this setting.
21 Q. You're referring to the Rubin book on multiple imputation?
22 A. That's right. And my recollection of how it began and how
23 it continues. Because it does continue. The missing data
24 problem is probably never going to end. When statistics ends,
25 it will end. Until then, it won't.
939
1 Q. Dr. Scheuren, what types of missing data problems can be
2 addressed through multiple imputation?
3 A. Well, arguably the proponents say there's nothing you can't
4 do with this. My use of it is primarily in a survey setting
5 formerly, the original setting, and now in this setting.
6 Q. Dr. Scheuren, I would like to have you look now what's been
7 marked Defendant's Exhibit 460.
8 A. Yes.
9 Q. And could you generally describe what this document is,
10 please?
11 A. This is an outline of the things that I think maybe I could
12 describe to the Court about the approach we took.
13 Q. And so this is a summary of NORC's approach?
14 A. Yes. Yes. And how we examined the data provided by FTI and
15 Morgan Angel, and how we applied multiple imputation and how we
16 dealt with the problems of the reported data, too. Because the
17 reported data has problems, as we've been hearing in this court
18 I guess since the inception of this case. And how we calculated
19 the difference between the collection and disbursements, which
20 is the key central statistic in this case, and what uncertainty
21 we calculated for that difference.
22 Q. Okay. Let's look at the first bullet point, please;
23 "examine existing data, identify missing data, identify outliers
24 and treating them as missing."
25 Again, which data did you examine?
What are we
940
1 referring to in this bullet point?
2 A. I guess we'll have a picture of this pretty soon. But we
3 had the data that Ed Angel provided and the data that Michelle
4 Herman provided, and we looked at it from our perspective as
5 statisticians to see if there were instances where some points
6 just seemed not to fit within the time -- over time or in
7 relationships to each other.
8 At the very end of my -- when I came back in here, I
9 heard redirect on one of those points being mentioned, one of
10 those years being mentioned, and Ed Angel acknowledging that I
11 had asked him about this. Because I had. One always goes to
12 the subject matter person.
13 MR. WARSHAWSKY: Why don't we pull up Defendant's 461?
14 This is actually a three-page document. And just flip through
15 all three pages real briefly. Okay. Why don't you go back to
16 the first page?
17 BY MR. WARSHAWSKY:
18 Q. Okay, Dr. Scheuren. We see lots of blanks and lots of
19 numbers. What's depicted on this exhibit?
20 A. Well, the rows are years.
21 THE WITNESS: Can you bring this up so I can read this?
22 A. The rows are years. The first row is fiscal year, the
23 second -- the first column is fiscal year, the second column is
24 a collection figure, which in this part that we've brought up is
25 empty. There was no data we had in 1887 for that.
941
1 The second column is disbursements. Both of those are
2 reported in millions. And the third column is the balances, and
3 the fourth column is the Osage per share. And it's important
4 that I emphasize the per share aspects of this. It is not the
5 headright count, because the count of people who have headrights
6 was not used by us. We brought the per share numbers up not
7 because we were going to impute them, but because we needed them
8 to help guide the work we're going to do later on. Okay?
9 Because, of course, as you can see by looking at this screen,
10 there's nothing there. How can we possibly impute that?
11 Q. Dr. Scheuren, what is a multivariate model?
12 A. A multivariate model is one where we look at all the
13 variables - in this case there are five of them - we look at all
14 the variables together and we use the relationships between them
15 to enhance our understanding of any one of them.
16 Q. Now -- go ahead.
17 A. In Professor Cornell's model he had a univariate model. He
18 just looked at one variable.
19 Q. Now back on Exhibit 460, which we don't need to pull up, but
20 your first bullet point talked about identifying missing data.
21 Is that what the yellow boxes are here?
22 A. The yellow was a color code. I'm actually close to color
23 blind, but others are not, so it was a code to indicate that
24 data was not available at all.
25 The other color, which I guess is purple --
942
1 Q. Good guess.
2 A. -- is to identify the outliers that we found in the data we
3 were given, and that we decided to treat as missing.
4 Q. Now, just to bring us back, then, we were talking about
5 multivariate models. What are the variables in your analysis?
6 A. The ones we have in this display, a year, which goes from
7 1887 to 2007, and the collections, disbursements, balances, and
8 the Osage per share figures.
9 MR. WARSHAWSKY: Would you move it back so we can see
10 the whole page?
11 BY MR. WARSHAWSKY:
12 Q. Now, in some cases you've got cells or boxes with numbers in
13 them?
14 A. Correct.
15 Q. And what do those represent?
16 A. That's the data that we obtained from, in this case, in this
17 display, from Morgan Angel.
18 Q. And continuing through page two, page three, same story
19 where you've got cells with numbers?
20 A. Correct. At a point in 1972 we shifted from Ed Angel's data
21 to the data we got from Michelle Herman's team.
22 Q. Let's go back to the first page.
There were some cells on
23 the first page that have numbers in them and are colored. You
24 correctly identified it as being the purple ones.
25 A. That's correct.
943
1 Q. What did you mean by outlier?
2 A. Well, we have a lot of methods in statistics for identifying
3 something that doesn't look like it belongs in the same data
4 set, and we used those methods here to identify something that
5 looked like it could have been --
6 THE WITNESS: And I should make a distinction, Your
7 Honor, between a representative and a nonrepresentative outlier.
8 A representative outlier is something that is real, okay, but is
9 unusual. A nonrepresentative outlier is something that's a
10 mistake, a punch error, key punch error, very common in the old
11 days.
12 A. So you look for the outlier and then you say, well, was it a
13 representative outlier or a nonrepresentative outlier? And this
14 is a long time ago, so we're not really able to determine
15 whether it's a representative or nonrepresentative outlier. And
16 what we did, because it's a more conservative thing, more
17 favorable to the plaintiff, we treated it as missing.
18 Q. Let me ask you to look at the entries for 1922. Now, you
19 weren't in the courtroom yesterday afternoon when Mr. Smith
20 asked about the 1922 entries, were you, sir?
21 A. I left after the break or something. I was not here for the
22 whole day, no.
23 Q. Well, during the cross-examination by Mr. Smith, Dr. Angel
24 was asked about 1922 and a $5.5 million figure.
25 A. Yes.
944
1 Q. Assuming that's the same $5.5 million figure represented in
2 your boxes here, did you have a conversation with Dr. Angel
3 about these --
4 A. I did.
5 Q. -- numbers and whether they were outliers?
6 A. I did.
We didn't use the word "outlier," because he's a
7 complete human being and I'm just a statistician, but yes, we
8 talked about that these were unusual.
9 Q. And what was it about those numbers that you felt --
10 A. Well, they don't fit in the time series, and this is the
11 only instance where they're exactly equal between collections
12 and disbursements.
13 Q. What did you mean when you said it doesn't fit in a time
14 series?
15 A. Look at the years afterwards and look at what's going on
16 with the balances at the end. These just don't look like they
17 belong here. They look like the data is incomplete. There's
18 something missing. I don't know what it was; Ed didn't know
19 either, to my recollection.
20 But they don't look like they belong, and so we
21 eliminated them. We treated them as missing.
22 Q. Okay.
23 MR. WARSHAWSKY: Would you show us the second bullet
24 point now on DX-460?
25 BY MR. WARSHAWSKY:
945
1 Q. Sir, the second bullet point reads, "Use Multiple Imputation
2 to Generate Estimates of Missing Data and to Assess Missing Data
3 Uncertainties."
4 This refers to the process NORC undertook?
5 A. That's correct. To fill in the data matrix - I'll use the
6 language again - the data matrix that you've just looked at.
7 Q. Okay. And what did you mean, or what does it mean here
8 about assessing missing data uncertainties?
9 A. When we don't know the answer, we have to pay a price for
10 that. Okay? And I'm talking about uncertainties in that sense.
11 We produce a distribution of answers based on the information we
12 have, and then we look at how widely apart the distribution is.
13 And that is a way to look at the uncertainty.
14 And in the context of a government versus a plaintiff,
15 the uncertainty has to be scored to the plaintiffs.
If we
16 cannot come up with a good point estimate, then that needs to be
17 brought here and examined, and the judge needs to make a
18 determination based on that.
19 Q. And that's your opinion as a statistician?
20 A. That is my opinion as a statistician.
21 Q. Now, how do you go about generating estimates, or how did
22 you go about generating estimates in this case using multiple
23 imputation?
24 A. Well, we'll use the word "Bayesian" for Reverend Bayes. We
25 had a prior distribution and we made some assumptions about the
946
1 nature of the underlying processes. We assumed a multivariate
2 normal, assuming multivariate normal is not a particularly
3 serious weakness, because most things can be transformed into
4 multivariate normal if you're looking at the details of this.
5 So we assumed -- I'm sorry, am I going into too much
6 detail?
7 Q. Let me just interrupt you right there. What do you mean by
8 a multivariate normal?
9 A. A multivariate normal is something that we take account of
10 all the variables here.
11 Q. You're talking about the five variables --
12 A. That's correct. And we treat them -- in using the ideas
13 of -- they come from normal distributions. Everyone who has
14 taken a basic stat course knows about normal distributions, but
15 the basic stat course they took is univariate normal, one
16 variable. Here we are looking at a vector of five variables.
17 Q. How do you determine whether to utilize a particular
18 variable in your analysis?
19 A. Well, we're obviously interested in the difference between
20 collections and disbursements. We have to get to that point, so
21 we have to include those two. The balances that are recorded in
22 the system are enormously important, too, even though they don't
23 seem to have been obtained -- or I'm not sure why, but they
24 don't always agree. They don't foot, in other words.
25 In an accounting system, the beginning balance plus
947
1 collections minus disbursements should equal the ending balance.
2 And that's not true until 1996, when the data were audited.
3 Q. Now, in this case you utilized these five variables. How
4 did you make a decision that these five variables would be the
5 ones that you would use for your analysis?
6 A. Well, there is an exploratory data analysis step that
7 precedes what we're talking about here, which is classically
8 called a confirmatory step where we actually calculate
9 confidence intervals and do statistical inference. And the
10 confirmatory step is usually, as I say, preceded by an
11 exploratory step where we look at all the data, we talk to the
12 experts. And there are a lot of experts here.
13 Q. Describe the exploratory step that you undertook in this
14 case.
15 A. We looked at relationships between these variables.
16 Typically you take two variables and you would calculate a
17 scatterplot of those two variables. That's one word,
18 scatterplot, S-C-A-T-T-E-R-P-L-O-T.
19 Q. And what is a scatterplot?
20 A. A scatterplot, in deference to the judge not wanting a board
21 like last time --
22 THE WITNESS: Sorry, judge.
23 A. A scatterplot is -- you put collections on the vertical and
24 disbursements on the horizontal for a year, one of these years,
25 for each of the years you have the both of them, and you simply
948
1 plot the points, and you see what relationship there is between
2 those two points, between those two ideas, collections and
3 disbursements.
4 BY MR. WARSHAWSKY:
5 Q. How do you determine if there's a relationship?
6 A. Well, typically, when you live in my world, you use
7 regression, you use correlation, you use other techniques like
8 that. Those are ones that are familiar to I think nearly
9 everyone here.
10 Q.
Did you consider any other variables besides the ones listed
11 on Defendant's Exhibit 461?
12 A. I think we looked at more widely than this, but one of the
13 problems we had was that we wanted to use data -- we wanted to
14 use variables that would support the imputation that we had for
15 the whole period, and the Osage variable was really the main
16 variable we had.
17 If you look -- excuse me.
18 Q. Let me ask you about that. Because there's been a lot of
19 testimony during the trial about Osage, and you made a point
20 early on in your testimony about the fact that you used per
21 share.
22 A. Correct.
23 Q. Explain why you used per share in your multiple imputation
24 analysis.
25 A. Because it had been represented to me that the headright
949
1 number, which includes not only the per share but how many
2 people were getting it, was not always available. And so we
3 didn't want to use that.
4 I don't want to -- I'm not trying to impute this
5 number, but I don't want this number to have missing data
6 problems in it. Then it adds to the uncertainty in a way that
7 is not fair.
8 Q. Okay. So you weren't imputing the Osage per share number?
9 A. No, we did not.
10 Q. Obviously not imputing the fiscal year?
11 A. No.
12 Q. Did you impute the other numbers?
13 A. We did.
14 Q. Okay.
15 A. All of them that were missing.
16 Q. And what do you do after you determine that variables appear
17 to be related? You were talking about the scatterplot and all
18 that. What's the next step?
19 A. The next step is to employ techniques like -- any of them,
20 to employ techniques to make estimates of what those
21 relationships are. Okay? We did that.
22 And then once we made those estimates and
23 relationships, we then used that model, okay, to do the
24 imputation.
25 Q. And was this something you did on paper? Did you have a
950
1 computer?
2 A.
Oh, no. Very intensive, very computer intensive.
3 Q. Took advantage of modern computing, did you?
4 A. Yes, I did.
5 Q. What did you utilize in your computing?
6 A. We used a procedure, PROC MI, which is a SAS procedure. SAS
7 is a very well-known system for doing statistical analysis.
8 Q. And MI, is that the name of the application?
9 A. That's correct.
10 Q. By the way, do you know if that application was provided to
11 the plaintiffs in this case?
12 A. We gave them all the data, yeah, all the imputations.
13 Q. Did you provide them a copy of the application --
14 A. And the software, yes.
15 Q. And when was that done, do you remember?
16 A. When did we provide that information?
17 Q. Yeah.
18 A. I think we provided it last Friday.
19 THE COURT: Last Friday?
20 THE WITNESS: Or last Thursday. Actually, we agreed to
21 provide it last Thursday and we provided it last Friday about
22 noontime.
23 BY MR. WARSHAWSKY:
24 Q. And can you generally describe what the SAS application does
25 with the data?
951
1 A. What it does is it brings in the information that you've
2 seen on the screen here within a model, the multiple imputation
3 model that's been constructed by a prior distribution, sometimes
4 called the non-informative prior, and the multivariate normal
5 idea.
6 Q. Now, I have to tell you, you're talking over my head.
7 A. I'm sorry.
8 Q. What's a non-informative prior?
9 A. We have to have some kind of a prior distribution, okay, to
10 say statements about the parameters, okay, and in order to begin
11 the process of doing the imputation. The prior in the end
12 doesn't usually become important at all, because we're going to
13 do this thing over and over again, we're going to use the
14 posterior. The posterior distribution is the thing that you get
15 after you take the prior and the data itself, which we're
16 assuming is multivariate normal, and calculate the posterior.
17 The posterior eventually becomes free, virtually free
18 of the prior when you do it many, many times.
19 Q. Becomes free of what?
20 A. It becomes free of the prior. It doesn't matter very much.
21 The process -- the whole machinery falls away at some point.
22 Multivariate normal is a very friendly world to do this in. I'm
23 sure if you use other methods, maybe it wouldn't fall away, but
24 in the world we're in, we use multivariate normal.
25 Q. Well, you refer to multiple imputation. What's involved in
952
1 running an imputation or making an imputation?
2 A. What we do is we draw -- first of all we make a first, I'll
3 use the word "guess," but it's an estimate, a first estimate of
4 what we think the mean and the variance are for this data,
5 relying only on the data that's available. And we use that
6 guess plus this prior, which allows us to construct the
7 posterior, we use that guess and the prior to develop a method
8 of sampling from the prior.
9 Actually, the posterior is very hard to describe
10 analytically, typically. But you can sample from it, and if you
11 can sample from it, then you sample from it multiple times and
12 then use the samples that you construct to look at the shape of
13 the resulting data. And that's what we did.
14 Q. Let me ask you this: How many imputations did you run in
15 this case?
16 A. 10,000.
17 Q. And is there a rule of thumb for how many imputations to
18 run?
19 A. In the old days, when computing was expensive, three to five
20 was what they were hoping to get people to do. Of course they
21 were trying to sell an idea in an age when we didn't have the
22 computing, didn't have PC's, didn't have anything like what we
23 have today.
24 10,000 is a large number, probably larger than normal.
25 Q. Is there any value in running a larger number than a smaller
953
1 number of imputations?
2 A.
The larger the number is, the simpler the analysis becomes
3 later.
4 MR. WARSHAWSKY: And let's pull up Defendant's
5 Exhibit 462, please. And why don't you flip through the three
6 pages?
7 BY MR. WARSHAWSKY:
8 Q. Dr. Scheuren, what is Defendant's Exhibit 462?
9 A. This is the completed data matrix that I talked about awhile
10 ago where we've put in averages for the imputed values. And
11 we've also done something else here which we haven't described
12 yet, which is we dealt with the problems in the reported data.
13 Q. Okay.
14 MR. WARSHAWSKY: Let's very quickly go back to 460, and
15 let's pull up the third bullet point.
16 BY MR. WARSHAWSKY:
17 Q. The third bullet point on Defendant's Exhibit 460 reads,
18 quote, "Identify Reported and Imputed Data to Be Modeled Because
19 of Other Identified Uncertainties." What is that referring to,
20 Dr. Scheuren?
21 A. Well, I've been working on this case for a long time, and
22 I've been listening to the plaintiffs and I've been listening to
23 the data, and there are uncertainties in this data that deserve
24 attention.
25 And when we did the imputation, we're doing it - and
954
1 I'll use the word "conditional" - conditioning on the data we
2 have that's reported, which is to say we're treating it as
3 fixed. The data is fixed.
4 Well, the data has its own problems, and so it deserves
5 to be -- those problems deserve to be addressed, too. And that
6 uncertainty needs to be incorporated into this calculation as
7 well, so we went ahead and did that.
8 Q. Now, what types of problems are you referring to when you
9 say that the data have their own problems?
10 A. Well, you've asked me to -- and we've talked about these
11 outliers already. Those would be the most obvious instances of
12 that. But there are other instances that are not so obvious and
13 may not be so easily grasped.
14 One of the things that -- one more moment.
One of the
15 things that is essential is to notice that the thing does not
16 foot. Except for the years when it was audited, 1996 beyond, it
17 didn't foot. That suggests there's some issues.
18 MR. WARSHAWSKY: Let's pull Defendant's 462 back up,
19 then.
20 THE WITNESS: You're going to have to bring it up for
21 me.
22 MR. WARSHAWSKY: He means blow it up a little bit.
23 I know I'm not supposed to ask, but if I may approach
24 the witness.
25 THE WITNESS: I would love to have a paper copy. I'm
955
1 before cheap computing.
2 MR. WARSHAWSKY: Now, I'm going to depend on the
3 TelePrompTer here, whatever we call this thing, the screen.
4 BY MR. WARSHAWSKY:
5 Q. Okay. Dr. Scheuren, referring to Exhibit 462, we'll go back
6 to --
7 MR. WARSHAWSKY: Let's go to the first page, please.
8 BY MR. WARSHAWSKY:
9 Q. You've got numbers now in the yellow boxes --
10 A. Correct.
11 Q. -- for example, the 1887 entry.
12 What is represented -- what's the significance of the
13 numbers that are --
14 A. Those are the averages of the 10,000 imputations after we
15 have made a further adjustment for the fact that the reported
16 data had its own problems. You remember what I said? We
17 started out doing the imputations using the reported data as
18 fixed. The word "fixed," if you've done regression analysis,
19 you're fixing -- in order to estimate Y, the dependent variable,
20 you fix X, which is the independent variable. But X could have
21 error too.
22 So we started out with that, and then we went on -- and
23 we haven't described this still yet. We then went on and did an
24 adjustment for the reported errors as well.
25 And this results -- after we've done all that, this
956
1 results in these values, the yellow values.
2 Q. And just to be clear, then, multiple imputation, is that a
3 form of regression analysis?
4 A.
Oh, in a broad sense, I guess you could say that. I
5 wouldn't say it that way. But we did a time series analysis
6 after we did the imputation to get rid of our concerns about the
7 remaining problems in the reported data.
8 MR. WARSHAWSKY: Show the full page, would you, please?
9 BY MR. WARSHAWSKY:
10 Q. So now you've got a number of different types of boxes. In
11 addition to the yellow ones, I see the purple box for 1908 under
12 "balance." Do you see that, sir?
13 A. Correct. Yes.
14 Q. That was one previously identified in --
15 A. As an outlier, yes.
16 Q. In 461 as an outlier?
17 A. Yes. And the original value was 4, I believe. That's now
18 12.
19 Q. And you've got 461 in front of you. You might want to
20 confirm that.
21 A. Correct. And that is confirmed, yes.
22 Q. How did it go from 4 to 12.7?
23 A. Well, we went through this process that I described of doing
24 these imputations, of treating this value as missing, and
25 because of the relationships that exist in the period -- we're
957
1 at a transition period here when we have some missing data --
2 some data and then a lot of missing data. In this era here, we
3 changed over from one method to another.
4 I have to correct myself. May I? The second time
5 series analysis was only done on the collection and disbursement
6 data, it was not done on the balances.
7 Q. So, for example --
8 A. Those are the values from the imputation.
9 Q. Let's look at the entries for 1909 to 1911, collections and
10 disbursements. Do you see?
11 A. Yes.
12 Q. And those have a different type of box around them.
13 Correct?
14 A. That's correct. They were model adjusted, yes, by the
15 model -- the time series model.
16 Q. Describe the time series model that you've been referring
17 to.
18 A. We're looking at collections -- this time period, time T, is
19 usually related to time T minus one, T minus two, and so forth.
20 And disbursements similarly to itself in previous time periods.
21 And collections and disbursements are also related to each other
22 typically, especially in modern times when collections -- when
23 disbursements have to be given out once a certain collection
24 threshold has been met. But there was a relationship all the
25 way back, too, although it's statistical and not procedural.
958
1 In any event, we had -- we did an ARIMA model --
2 Q. Which is?
3 A. A model of -- a time series model which looks at how current
4 time is related to past periods.
5 Q. How many periods, prior periods, did you look at?
6 A. We ended up using seven. And if you look at the top of the
7 chart, the chart on the first page here, we actually have Osage
8 data that goes back earlier than this, okay, in this period, and
9 we were able to use that to help us --
10 Q. Approximately seven years back?
11 A. Yeah, I don't remember how many years back that the Osage
12 data goes, but we used seven years back.
13 Q. And then what did you do with the seven years of data?
14 A. We used it to start the process. So the first data point
15 was fit with -- the first collection and disbursement data point
16 was fit with models that went back seven prior years and so
17 forth, and in the more recent time, when we get down into the
18 data itself after the starting point, we're using all the data
19 that we have, imputed data and reported data.
20 Q. Are you basically rolling down --
21 A. That's correct. That's the way to look at it. It's kind of
22 like a moving average, which I think is pretty well understood.
23 Q. So you utilized the seven-year -- this time series analysis,
24 you certainly did it, according to Defendant's Exhibit 462, for
25 example, 1909, 1910, 1911, 1923 to 1926 -- we're on the first
959
1 page now.
2 A. Yes.
3 Q.
Did you also do that for other periods on this page?
4 A. We did it for the imputed value, too. We just haven't
5 re-marked those because -- we had to look at the whole series,
6 because remember we had imputed the data conditional on the
7 reported data? So the imputed data has the error or the
8 uncertainty in the reported data carried forward into the
9 imputed data. We had to deal with that, too. So we had to deal
10 with both.
11 Q. Now, you were in the courtroom a few days ago when
12 Ms. Herman was testifying on cross about the GLDL data.
13 A. That's right. I am familiar with that, yeah.
14 Q. Is the time series analysis that you're talking about
15 relevant in any fashion to the data that you received from
16 Ms. Herman?
17 A. We assumed that the data that we got from Ms. Herman was --
18 did not have any missing data in it, but we did have this basic
19 idea that there were issues with the data already. We didn't
20 know what they were, we were rather cautious in our use of data,
21 but when we found out about this issue, which we hadn't known
22 about, we looked at it to see how much of it there was, how much
23 missing there was in that period. It's relatively small, under
24 five percent.
25 And then we went back to my usual example I gave
960
1 earlier about how we would handle this at the IRS, because this
2 is the kind of approach we took at the IRS. But we have to
3 score that decision for the uncertainty. There has to be some
4 uncertainty added as a result of that. I think the estimates
5 are probably sound the way they are. I'm very familiar with the
6 way careful subject matter experts make these estimates. I have
7 not gone back and rechecked those myself, I'm not an expert on
8 subject matter, either, but I'm very sure that they're very
9 good. The point estimates, in other words, are very good. The
10 uncertainty, we have to factor that in.
11 Q.
Let's look through real quickly -- we've looked at page one
12 of Defendant's 462.
13 MR. WARSHAWSKY: Why don't you go to page two?
14 BY MR. WARSHAWSKY:
15 Q. Now, in this instance you have -- slightly over half of it
16 is boxed as being model adjusted?
17 A. These were originally reported data, right.
18 Q. And under "collections and disbursements" on this second
19 page, the yellow boxes, those are the imputed values?
20 A. Those are the imputed, but they've also been adjusted.
21 Q. Okay. That was my question.
22 MR. WARSHAWSKY: And then go to the third page of
23 Defendant's 462.
24 BY MR. WARSHAWSKY:
25 Q. The same basic thing through 1995. Is that correct?
961
1 A. That is correct.
2 Q. So you did model adjust 1968 to 1971?
3 A. Uh-huh.
4 Q. Yes?
5 A. Yes, we did. The very area that you have been talking
6 about, yeah.
7 Q. And then 1972 to 1995, why did you model adjust that data?
8 That was already provided data. Right?
9 A. That's correct. But it had the general problems that this
10 data set has, in that it's a system that was not designed for
11 this analysis. And one has to be careful with such data. And
12 we felt that it was irregular enough -- we had done this time
13 series analysis and we looked for irregularity, deviations from
14 the time series pattern, and we scored the deviations as
15 uncertainties, and then we used that to add uncertainty to the
16 process.
17 Q. How much did the reported and imputed data change after you
18 modeled it?
19 A. I don't know exactly what that number is. We've changed
20 everything. It changed it quite a bit. The uncertainty went up
21 maybe about a third. About a third. I'm sorry, I don't have
22 that number.
23 Q. Did the point -- were the point estimates changed much?
24 A. Not too much, no.
25 Q. Dr.
Scheuren, you're aware that there was a Defendant's
962
1 Exhibit 371 version produced on May 30th, 2008?
2 A. Yes, I am.
3 Q. And then there was another version produced June 4th, 2008?
4 A. Yes, I know about both of those. Yes.
5 Q. And during this trial there's been some examination
6 regarding differences between the May 30th and the June 4th
7 numbers in your estimates.
8 A. That's correct.
9 Q. Do you have any insights as to why the numbers changed from
10 May 30th to June 4th?
11 A. Actually, I'll give you three --
12 MR. DORRIS: Your Honor, I'm going to object until he
13 ties a foundation that this man was involved in those
14 calculations and personally knows why those numbers changed.
15 MR. WARSHAWSKY: I can do that.
16 BY MR. WARSHAWSKY:
17 Q. Dr. Scheuren, were you involved in the process that led to
18 the generation --
19 A. Yes, I was.
20 Q. Let me get the question out. Process that led to the
21 generation of estimates in the May 30, 2008 document?
22 A. Yes, I was.
23 Q. And how were you involved?
24 A. Well, we were looking at -- we're still trying to prove in
25 our imputation model, and we thought we had something pretty
963
1 close at the end of May. It turned out that we couldn't prove
2 and didn't prove it later. But we had not done at that point --
3 introduced the uncertainties due to reporting issues.
4 Q. And when you say "we," you're talking about NORC?
5 A. The team of people I work with, yes.
6 Q. And were you a participant in that team?
7 A. I certainly was.
8 Q. Were you a participant on the team that generated the
9 June 4th estimates?
10 A. Absolutely.
11 Q. And how were you a participant there?
12 A. Well, I'm the one who said, we have to do more. Okay? And
13 we did do more, and I was involved in the analysis of that
14 process. We looked at and added the outlier analysis.
The
15 outlier analysis was not in the May 30th estimate.
16 Q. What else was different about the May 30th estimates and the
17 June 4th estimates, if you recall?
18 A. Well, we improved the model. We improved the model. We had
19 taken an assumption from the plaintiffs that the world started
20 in 1887, and we put a zero in for 1887. And that was not
21 consistent with the Osage data, but we didn't notice that right
22 away. And that led to very unstable estimates, and then we
23 fixed that.
24 Q. Had you done your 10,000 imputations as of May 30th?
25 A. I don't think so. I think we did 1,000, but I don't
964
1 remember exactly how many we did. We hadn't done 10,000.
2 Q. Given that, do you have any insights as to why the numbers
3 in the May 30th version of DX-371 changed to the numbers that
4 are reflected in the estimates of June 4th?
5 A. Well, I just indicated to you that the uncertainty went up,
6 okay, as a result of the difference in time periods. I don't
7 have the May 30th numbers here, so if you want me to comment on
8 them specifically, I will. But I've already made the point
9 about the uncertainty. We did not complete the work until
10 June 4th.
11 MR. WARSHAWSKY: Let's go back to exhibit --
12 Defendant's Exhibit 460. We'll look at the fourth bullet point,
13 please.
14 BY MR. WARSHAWSKY:
15 Q. Dr. Scheuren, the fourth bullet point reads, quote,
16 "Calculate Difference Between Total Collections and Total
17 Disbursements (Calculated Balance)," end quote.
18 Dr. Scheuren, why did you calculate the difference
19 there?
20 A. This is the point of the trial, it seemed to me, is
21 whether -- what is the amount of money that's in the system and
22 how much of it is accounted for.
23 Q. How much?
24 A. How much of it has been accounted for by the existing
25 records in the system.
965
1 Q.
And is that what you meant by calculated balance?
2 A. What we did was of course we took these 10,000 data sets
3 that we had created, and for each data set we calculated the
4 collection and subtracted the disbursement, and with that
5 difference we plotted the data. And I assume we're going to see
6 a picture of that, a histogram, in a moment.
7 MR. WARSHAWSKY: Yeah, why don't we look at Defendant's
8 Exhibit 463, please.
9 BY MR. WARSHAWSKY:
10 Q. Dr. Scheuren, what is Defendant's 463?
11 A. This is a histogram which plots one point for each of the
12 10,000 imputations. Notice that it's based on the 10,000
13 imputations and uncertainty adjustments. This "uncertainty
14 adjustments" is a phrase that I use to deal with the fact that
15 we had to address the reporting issues as well with the model.
16 Q. Now, is this a document that NORC prepared?
17 A. Yes.
18 Q. How did NORC go about preparing this histogram?
19 A. We simply took the collection value that we had obtained
20 from the process for each of the 10,000, and we subtracted it
21 from the disbursement value from the same run, and we took the
22 difference. And then we plotted all the differences; you know,
23 we binned them together. These are sometimes called bins, these
24 slats. We bin them together from the smallest to the largest,
25 and that's what we have here.
966
1 Q. Why does -- on the Y axis it reads "frequency"?
2 A. Correct. Counts, yeah.
3 Q. And what is that referring to?
4 A. We have 10,000 observations here, one for each of these
5 calculations. And those are the counts -- those are what those
6 counts are.
7 Q. So for example -- first of all, there's a red dashed line
8 down the middle, and next to it it says "mean, $583.6 million"?
9 A. That's the overall average, arithmetic average, of all the
10 10,000 observations.
11 Q.
And what is the bin - for example, the blue bin to the
12 immediate left of that red line - what does that mean?
13 A. The bins are all the same width, and so you can make the
14 visual connection between the count and the number in the bin.
15 If the bins were different lengths, this wouldn't be a good
16 display device.
17 Q. So I'm just eyeballing this, but would it be fair to read
18 that one we were just referring to as meaning that the
19 calculated balance in that bin was, I don't know, something less
20 than $600 million observed between six and seven hundred times?
21 A. That's a cumulative value. Okay? We're adding up all the
22 values here, and it turns out that the mean shows up in that
23 particular bin. It's not a count of what's in that bin, or not
24 just in that bin.
25 Q. Now, what is the blue dashed line on the right-hand side?
967
1 A. This is a 95 percent upper confidence bound.
2 THE WITNESS: Your Honor will remember from last fall,
3 when we were talking about the litigation support data, we
4 provided 95 and 99 percent upper bounds to you because those are
5 valuable in understanding the extent to which there's an
6 uncertainty in this data. And 95 percent being the standard.
7 BY MR. WARSHAWSKY:
8 Q. Well, what inference do you want the judge to draw from
9 what's on the left side of the blue dash versus what's on the
10 right side of the blue dash?
11 A. That's a little too general for me. I would probably have
12 to work on that a little bit. Let me ask a related -- answer a
13 related question, which is: Where is the average, where is
14 the -- what does the system say is the balance, which is a
15 cumulative balance? What does the system say the cumulative
16 balance is as of this point in time?
17 Q. To help you with that, why don't you pull up Defendant's
18 Exhibit 464.
19 MR. WARSHAWSKY: And Your Honor, if I may, I would like
20 to just give Dr. Scheuren a copy of DX-463.
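[Editor's illustration, not part of the record. The testimony above describes taking one collections total and one disbursements total from each of the 10,000 imputed data sets, differencing them, binning the differences into equal-width bins, and marking the mean and a 95 percent upper bound. The Python sketch below shows one way such a summary could be computed. Every number and name in it is hypothetical: the simulated totals merely stand in for the imputed data sets, and using the empirical 95th percentile as the upper bound is an assumption, since the testimony does not say how the bound on Defendant's Exhibit 463 was computed.]

```python
import random
import statistics


def calculated_balances(collections, disbursements):
    """One calculated balance per run: collections minus disbursements."""
    return [c - d for c, d in zip(collections, disbursements)]


def equal_width_histogram(values, n_bins):
    """Count values in n_bins bins of identical width spanning min..max."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # The largest value is placed in the last bin.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    return counts


def upper_bound(values, level=0.95):
    """Empirical bound: roughly `level` of the values fall at or below it."""
    ordered = sorted(values)
    idx = min(int(level * len(ordered)), len(ordered) - 1)
    return ordered[idx]


# Hypothetical stand-ins for the imputed totals, in millions of dollars.
rng = random.Random(0)
n_runs = 10_000
collections = [rng.gauss(14_000, 300) for _ in range(n_runs)]
disbursements = [rng.gauss(13_400, 300) for _ in range(n_runs)]

balances = calculated_balances(collections, disbursements)
counts = equal_width_histogram(balances, n_bins=40)  # bar heights
mean_balance = statistics.fmean(balances)            # the mean line
bound_95 = upper_bound(balances, 0.95)               # the upper-bound line
```

[In this sketch each entry of `counts` is the height of one bar, which matches the statement that the bins share a single width so that bar height can be read directly as a count.]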
21 BY MR. WARSHAWSKY:
22 Q. Dr. Scheuren, what is Defendant's Exhibit 464?
23 A. We're analyzing data that