More Thoughts on ESAS Appendix Flaws

A few weeks ago, when I first discussed the ESAS Appendices on this blog, I had to be somewhat vague about my concerns, since the information wasn’t yet public, and I didn’t want to get Chris in trouble with his sources. However, since nobody has commented further on the appendices since they went public, I figured I ought to do a follow-up post. While I haven’t had a chance to dive much deeper into that 630-page document, I wanted to elaborate on some of the flaws I’ve seen so far, as many of them demonstrate fundamentally rookie engineering mistakes.

Gaming the Assumptions to Give You the “Right” Results
Probably the most egregious of the flaws I saw was the exception given in the “ground rules and assumptions” to the Stick concepts. As I’ve discussed before on this blog, tweaking assumptions to make sure that “the right answer” looks favorable is standard fare for trade studies. This isn’t always a case of deliberate larceny. Many times honest engineers have an opinion about the best route, and when the numbers don’t come out quite the way they expected, they go back to see if they “made a mistake” in their assumptions. That said, typically when an honest person is unintentionally massaging the assumptions (or when a dishonest but competent person is intentionally gaming them), they aren’t as blatant as this (found on page 28, my emphasis added):

Max dynamic pressure = 800 psf (undispersed), except for certain In-line Crew (ILC) configuration-Solid Rocket Motor (SRM)-In-line cases where the limit was raised to 1,000 psf due to very high accelerations early in the ascent profile.

Max dynamic pressure = 1000 psf (dispersed), except for certain ILC-SRM-In-line cases where the limit was raised to 1,200 psf due to very high accelerations early in the ascent profile.

Dynamic pressure (or “q”) is the component of total pressure due to fluid kinetic energy; i.e., in this case it is the pressure felt when trying to shove a high-speed rocket through the air. Dynamic pressure is proportional to the air density and the square of the vehicle’s velocity. The maximum dynamic pressure (“max-q”) is an important factor in designing manned launch vehicles, and there are legitimate safety reasons for wanting to keep max-q within reasonable levels. Higher max-q can rapidly drive up the required thrust of an LAS, it drives up the structural stress (especially bending loads), and in the case of a controls problem it could lead to tumbling and rapid failure of the stack.
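To make that proportionality concrete, here is a minimal Python sketch of the standard relation q = ½ρv². The density and velocity in the example are illustrative placeholders, not values taken from the ESAS appendices or from any particular vehicle, and the function names are made up for this illustration.

```python
# Minimal sketch: dynamic pressure q = 1/2 * rho * v^2.
# The flight condition below is an ASSUMED placeholder, not ESAS data.

PA_PER_PSF = 47.880259  # pascals in one pound-force per square foot


def dynamic_pressure_pa(density_kg_m3: float, velocity_m_s: float) -> float:
    """Return dynamic pressure in pascals."""
    return 0.5 * density_kg_m3 * velocity_m_s ** 2


def pa_to_psf(pressure_pa: float) -> float:
    """Convert pascals to pounds-force per square foot."""
    return pressure_pa / PA_PER_PSF


if __name__ == "__main__":
    rho = 0.30   # kg/m^3, hypothetical air density at altitude
    v = 500.0    # m/s, hypothetical vehicle speed
    q_pa = dynamic_pressure_pa(rho, v)
    print(f"q = {q_pa:.0f} Pa = {pa_to_psf(q_pa):.0f} psf")
    # Doubling the velocity at the same density quadruples q.
    print(dynamic_pressure_pa(rho, 2 * v) / q_pa)
```

Because q depends on the square of velocity, a vehicle that accelerates hard while still low in the atmosphere runs into high dynamic pressure quickly, which is the situation the quoted exception describes.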

So, what is it about an SRM-based vehicle that makes it safer to fly at higher max-q? Why do they get an exception, when no other approaches do? To me, the fact that an exception was given only to “ILC-SRM In-line” (aka variants on der Griffenshaft) is what really marks this as an amateur job of assumptions gaming. A clever assumptions gamer would’ve just upped the max-q value for everyone. After all, even though no other alternatives come anywhere close to needing an exception, at least by giving some technobabble excuse like “on further analysis we found that the original GR&A numbers were too conservative” it looks like you’re being honest.

The other thing I find amusing about this exception, is that after carving out a special exemption just for the boss’s preferred option, someone felt they needed to at least give some sort of rationalization. Of course, you’d think that when making an exception to a safety-based requirement, you’d come up with, you know, a safety-based justification. But, the justification given was that this safety requirement should be modified–because the preferred vehicles weren’t capable of meeting the requirement! Note to future assumptions gamers: if you’re going to blatantly carve out an exception like this, don’t openly admit that your ideas would’ve failed without the exception!

It’s worth noting that of all the options compared on pages 3-11, all of the vehicles requiring this exception are Stick-like vehicles, and that every one of the 5-segment SRB designs requires the exemption. In fact, some of the 5-segment designs closest to the one NASA’s trying to build now almost fail even this exception.

Now, there’s another relatively innocuous-sounding assumption on page 29 that, when combined with this exception, really biases things against EELVs and in favor of the stick. The assumption about LES mass:

LES mass = 9,300 lb for vehicles sized under Block 2 analysis. 9,172 lb for vehicles analyzed under Block 1.

Now, ignore the Block 1 vs. Block 2 bit, as that is fairly irrelevant. While it sure makes calculations easier, why should every LES weigh the same? In reality, vehicles with higher max-qs, with first stages or boosters that can’t be turned off easily, and with stages whose failures tend to be more catastrophic (like solids) would tend to require beefier LES systems. The thrust requirement of the LES is driven by a transonic abort (typically around max-q). In this situation, you have the maximum aerodynamic force trying to shove your capsule back into the first stage, especially since a low-pressure region tends to form between the capsule and the stage during separation. The higher the max-q, the more force you need to provide just to stand still relative to the first stage. If the first stage also can’t be shut down, you now need to provide more acceleration…
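As a rough illustration of why max-q matters for escape system sizing, the sketch below uses the textbook drag relation F = q · Cd · A. It assumes the same drag coefficient and reference area at both conditions, which is a simplification, and the function names are hypothetical. Under that assumption the aerodynamic force scales linearly with q, so the ratio between two q limits does not depend on the capsule’s actual size.

```python
# Sketch: aerodynamic force on the capsule scales linearly with q
# (F = q * Cd * A), ASSUMING Cd and reference area are unchanged.
# Cd and area default to 1.0 because they cancel in the ratio.


def aero_force(q_psf: float, drag_coeff: float, ref_area_ft2: float) -> float:
    """Aerodynamic force in lbf for a given dynamic pressure in psf."""
    return q_psf * drag_coeff * ref_area_ft2


def force_ratio(q_new_psf: float, q_old_psf: float,
                drag_coeff: float = 1.0, ref_area_ft2: float = 1.0) -> float:
    """Ratio of aerodynamic force at two dynamic pressures."""
    return (aero_force(q_new_psf, drag_coeff, ref_area_ft2)
            / aero_force(q_old_psf, drag_coeff, ref_area_ft2))


if __name__ == "__main__":
    # q limits quoted from the ESAS ground rules (psf).
    undispersed = force_ratio(1000.0, 800.0)
    dispersed = force_ratio(1200.0, 1000.0)
    print(f"undispersed limit: {undispersed:.2f}x the aerodynamic force")
    print(f"dispersed limit:   {dispersed:.2f}x the aerodynamic force")
```

This only covers the aerodynamic term. It says nothing about the extra acceleration needed to outrun a booster that cannot be shut down, which would add to the requirement.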

It’s illustrative to note that current LES weights for Ares-I are somewhere between 1.5 and 2 times the number used in the ESAS study.

There are also a few other examples of GR&A’s that may favor Shuttle Derived Vehicles at the expense of existing boosters. On page 30, there’s a long list of propellant-related margins and reserves. In each case, SDVs get to use a number from the “MPS propellant inventory from STS-117 TDDP”, while the non-SDV systems are given an equation to use (which produces numbers significantly higher than those measured in their flights). Not having access to that Trajectory Design Data Package, I have no way of telling how the two compare, but it’s a potential source of bias.

There may be others that I’ve also missed, but that’s enough on that topic.

Why It’s Important to Validate Models
One of the most common rookie engineering mistakes when using computational models is to trust the pretty pictures without doing proper validation. The scary thing about tools like CFD (which a cynical friend calls “Colorful Fluid Drawings”) is that the computer will often give you a plausible-looking but completely bogus answer if you screw up your assumptions. One of the keys to properly using FEA or CFD is that you need to validate your code and your processes. If you can’t get your analysis to match up with a known, physically measured result, either your code is broken or you’re making bad assumptions. I ought to repeat that: If Your Model and Reality Don’t Agree, Your Model Is Wrong. Now, in school, when the deadlines are near, professors will often give partial credit even if you screwed up and your model doesn’t match reality…but in the real world, a little bit more is expected.

So, coming back to ESAS, a group at Marshall had developed a cool software tool for predicting the performance of new rocket stages, called INTROS (INTegrated ROcket Sizing program), that they decided to use for doing an apples-to-apples comparison of all the vehicles. Now, being from Marshall, a lot of assumptions were built into the tool based on how Marshall would design a stage, and based on shuttle experience. The problem is that when you ran an existing launch vehicle designed by another group through the wringer, the result that came out was often…slightly reality impaired.

Take the Dual Engine Centaur (DEC) upper stage as an example. On pages 35-37, the appendix goes into details about the “existing” Atlas V and Delta IV stages (as replicated by INTROS), as well as Atlas Vs and Delta IVs with a new-and-improved MSFC-designed, man-rated upper stage. Here are the relevant stats according to them:

Usable Propellant: 43,840 lbm
Dry Mass: 7,159 lbm
Burnout Mass: 9,222 lbm

Now, if you add the difference between the burnout mass and the dry mass (2,063 lbm) to the “Usable Propellant” number, you get about 45,903 lbm, which is right in the ballpark of the numbers typically given for a Centaur stage. The problem is with the Dry Mass figure. 7,159 lbm is substantially higher than any number I’ve ever seen given for a Dual Engine Centaur. Most numbers I’ve seen (including the ones on the Lockheed Martin Atlas V page) give a dry weight for the DEC in the 4,700-5,400 lbm range. The DEC that flew on Titan IV, which had a slightly higher propellant load (and a more complicated shape), was on the order of 6,600 lb. When your model mispredicts the weight of existing stages by almost 50%, it suggests that the model wasn’t really ready for primetime. More importantly, it suggests that the model probably shouldn’t have been used to evaluate existing stages, since it was so far out of whack with even publicly available numbers for those stages. I know that some publicly available stage numbers might be inaccurate (notice the range given for DEC weights), but the mere fact that the model was that far off should’ve led to more investigation.
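For anyone who wants to check the arithmetic, here is a small Python sketch using only the figures quoted above. The variable and function names are just labels chosen for this illustration.

```python
# Sketch: checking the Centaur numbers quoted above (all masses in lbm).

usable_propellant = 43_840   # INTROS figure from the appendix
introS_dry_mass = 7_159      # INTROS figure from the appendix
burnout_mass = 9_222         # INTROS figure from the appendix

# Published dry weights quoted in the post.
published_dec_low = 4_700
published_dec_high = 5_400
titan_iv_centaur = 6_600


def overprediction(model: float, reference: float) -> float:
    """Fractional amount by which the model exceeds the reference."""
    return (model - reference) / reference


residuals = burnout_mass - introS_dry_mass       # 2,063 lbm
total_load = usable_propellant + residuals       # 45,903 lbm

if __name__ == "__main__":
    print(f"residuals and reserves: {residuals} lbm")
    print(f"total propellant load:  {total_load} lbm")
    for label, ref in [("published low", published_dec_low),
                       ("published high", published_dec_high),
                       ("Titan IV Centaur", titan_iv_centaur)]:
        pct = 100 * overprediction(introS_dry_mass, ref)
        print(f"INTROS dry mass vs {label}: +{pct:.0f}%")
```

Against the low end of the published range the INTROS figure is a bit over 50% too high, and against the high end it is still roughly a third too high, so the “almost 50%” characterization sits within that spread.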

This, by the way, perfectly corroborates the testimony of several people on the LM (now ULA) side of the story, as well as some people working on the NASA side of the EELV program–that the ESAS guys really badly botched the dry weight numbers for the existing EELV upper stages. The explanation I heard at the time was that the NASA guys were given insufficient time during the prep-work for ESAS to get the model calibrated, and once the study started they weren’t allowed to communicate with the EELV companies. The sad thing is that when the ULA guys pointed out that the numbers were wrong, the MSFC manager really believed his own models more than the “as-weighed” numbers from ULA.

When you combine unrealistically high ullage/margin/residual numbers with unrealistically high stage dry masses, it’s unsurprising that the estimated performance to high altitude, high inclination orbits (ISS orbits) was substantially lower than what ULA (and the Air Force) estimate the performance at.

But just in case this travesty weren’t humorous enough, take a look at the bottom of page 35, where they give the numbers for the “new upper stage” that they are always claiming the EELVs need in order to be man-rated. Their 4x RL-10 stage is estimated to weigh 15,115 lbm! And that in spite of carrying only about 30% more propellant. I’m not even sure how they can make the stage performance that lousy. We’re talking nearly three times the dry mass of the DEC stage, for only 30% more propellant and 2 extra engines (that weigh about 400-500 lb each). Even assuming that everything was only designed to a factor of safety (FOS) of 1.25 instead of the magical 1.4 (which, by the way, the Shuttle ET doesn’t meet), that wouldn’t account for much of the weight.
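A quick sketch of that comparison, again using only figures quoted in this post (names are illustrative). It shows how the “times the DEC dry mass” ratio depends on which DEC figure you take as the baseline, and how little of the growth the two extra engines can explain.

```python
# Sketch: how the 15,115 lbm "new upper stage" compares with the DEC
# figures quoted in this post (all masses in lbm).

new_stage_dry = 15_115

dec_baselines = {
    "INTROS DEC": 7_159,
    "published DEC, high": 5_400,
    "published DEC, low": 4_700,
}

# Two extra RL-10s at roughly 400-500 lb each, per the post.
extra_engines_low = 2 * 400
extra_engines_high = 2 * 500

ratios = {name: new_stage_dry / mass for name, mass in dec_baselines.items()}

# Dry mass growth over the INTROS DEC that the engines do not explain.
growth_over_introS = new_stage_dry - dec_baselines["INTROS DEC"]
unexplained_min = growth_over_introS - extra_engines_high

if __name__ == "__main__":
    for name, ratio in ratios.items():
        print(f"new stage / {name}: {ratio:.1f}x")
    print(f"growth over INTROS DEC: {growth_over_introS} lbm")
    print(f"left over after two engines: at least {unexplained_min} lbm")
```

The “nearly three times” figure corresponds to the published DEC weights; even against INTROS’s own inflated DEC number the new stage is a bit more than twice as heavy.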

I know that Mike Griffin was claiming that part of the reason for doing Ares-I was to teach NASA how to design launch vehicles again, and I guess we have documented proof of the need.

Seriously though, this is a common rookie mistake. You don’t go basing decisions worth tens of billions of dollars on an unvalidated design tool. Now, this isn’t saying that INTROS is a useless tool, just that it obviously doesn’t capture all the state of the art in stage design, and until it does, its results ought to be taken with an appropriately sized (apparently multi-ton) grain of salt.

Anyhow, that’s all for now, but I figured that my claims from my last post deserved some actual documentation and discussion.

Jonathan Goff

President/CEO at Altius Space Machines
Jonathan Goff is a space technologist, inventor, and serial space entrepreneur who created the Selenian Boondocks blog. Jon was a co-founder of Masten Space Systems, and is the founder and CEO of Altius Space Machines, a space robotics startup in Broomfield, CO. His family includes his wife, Tiffany, and five boys: Jarom (deceased), Jonathan, James, Peter, and Andrew. Jon has a BS in Manufacturing Engineering (1999) and an MS in Mechanical Engineering (2007) from Brigham Young University, and served an LDS proselytizing mission in Olongapo, Philippines from 2000-2002.

22 Responses to More Thoughts on ESAS Appendix Flaws

  1. Dave says:

    I’m no rocket scientist. Heck, I’m not even an engineer. I do, however, investigate people’s actions, behavior, and statements on a daily basis in an attempt to find motive and bias. I would ask the following question: how many errors were made that fall in favor of the existing NASA plan, and how many fall in favor of an alternative?

  2. gravityloss says:

    Sad indeed. Could there be some payload adapters or fairings (4 m vs 5 m difference, bottom part etc) included that are normally left off?

  3. Jonathan Goff says:

    Dave,
    I don’t think I’ve seen any errors or assumptions that specifically fell in the favor of alternatives. But that doesn’t mean there aren’t any, just that I didn’t notice. The problem with parsing assumptions like this is that it often isn’t 100% obvious what the impact of a given assumption is until you’ve looked through a lot of the analysis. To be fair, there’s a good chance that a lot of these biased assumptions were of the “unintentionally tweaked, because the results didn’t match their intuition” sort, as opposed to intentionally screwing the alternatives sort.

    ~Jon

  4. Jonathan Goff says:

    Gravityloss,
    It’s possible. But fairings are usually jettisoned early enough in the flight (typically right after staging) that they are more factored into the first stage mass. Not sure though. It would be interesting to see how/why they came up with the mass numbers they did. Unfortunately many of those details didn’t even make it into the Appendices.

    ~Jon

  5. anon says:

    make sure the centaur weight includes the rl-10s

  6. Anon,
    Well, the as-measured weight for one of the recent Atlas V launches was ~4,800 lbm, so I’d be surprised if that second RL-10 and its support hardware weighs another 2,700 lbm (380 lbm without thrust structures and stuff, and I’ve heard ~500 lbm all figured).

    ~Jon

  7. anonymous says:

    Note that even this level of calculation was not required. Inserted mass for existing launch vehicles is published by the vendors to various orbits in their payload user’s guides. For an example, see:

    http://www.lockheedmartin.com/data/assets/ssc/Rev10a_0107.pdf

    Which summarizes performance of various configurations on page 2-31.

  8. David says:

    Crazy!

    So do you know if others have noticed the wildly incorrect dry mass numbers being used for the alternative upper stages?

    Just my opinion, but this doesn’t look like an “accidental error” to me.

  9. Jonathan Goff says:

    David,
    To be fair, they still didn’t show enough of their work to be really sure if this was incompetence, hastiness, or dishonesty. But at least now that it’s out in the public that their numbers diverge so far from reality, the burden of proof can be shifted back to them–where it has always belonged.

    ~Jon

  10. Jonathan Goff says:

    Anonymous,
    Thanks for the link! If you’re saying that ESAS didn’t need to do the calculations because that data was already public…I can actually see a legitimate reason for wanting to be able to compare data *as produced by your own modeling system* for different options. My main fault is that their numbers didn’t line up with publicly available information, and they didn’t bother to do much validation or to go back and reexamine their conclusions when it was pointed out that they had goofed.

    ~Jon

  11. MG says:

    Regrettably, our government and its agencies often do whatever they want, and explain it by pointing to rational-sounding decisions or methods that are flawed.

    Take the example of “NASA needs launcher design expertise in house” as a justification for INTROS. It sounds good, but the launcher design expertise development, if it were really needed “in house”, needed to be complete before Ares startup, or needed to be completely disconnected from Ares, so that its inexperience would not impair NASA’s execution of VSE.

  12. Karl Hallowell says:

    The thing that continues to bug me is the safety numbers. For example, they claim that the Shuttle-derived SRM for the Ares I has a roughly 1 in 3700 chance of causing an LOM. The estimate may be right (though the historical record of somewhere around 1 in 250 indicates not), but I see no expertise on the planet capable of backing these numbers up. Glancing at other solids in that section (the Atlas V’s solid boosters were claimed to be somewhere under 1 in 400 chance of LOM if I recall correctly), I see a bunch of fantasy numbers.

    The icing on this cake is that even if NASA is right about its safety numbers and Ares I starts as the safest of the options, it won’t stay that way. Sooner or later someone will authorize a compromise of safety and reliability in order to save money. It’s not just what NASA has done historically (just look at the current Ares I for some examples of safety/reliability compromises), but it makes a lot of economic sense too. Saving 10 million over 100 launches while doubling the LOM to 1 in 200 from 1 in 400, economically about breaks even given the current NASA process for recovering from manned vehicle accidents and the assumption that there’s an 80% chance the accident is not fatal. In other words, even if the safety and reliability numbers were correct, the target platform didn’t make sense because that isn’t the vehicle that NASA would build.

    Finally, it has always been my view that the single most important number to look at for determining the safety of a vehicle is the lifetime number of launches. For the Ares I, that would be up to 100 launches while it’d be more for EELV competitors that would also handle private and DoD flights. In my view, that means in practice the EELVs will probably be more reliable and safer than the Ares I no matter how well the Ares I is designed.

  13. Pingback: ESAS Study Proves Biased « The Four Part Land

  14. Axel says:

    Let me play the devil’s advocate… err, I mean, let me give Griffin and his team the benefit of the doubt: would those other approaches have benefited from allowing higher max-q? Or could it be that these max-q numbers are meant to be assumptions instead of rules or limits?

    Page 26 explains: “Based upon […] Ground Rules and Assumptions (GR&A) established, a preliminary concept is sized using the Mass Estimating Relationships (MERs) in the INTegrated ROcket Sizing Program (INTROS). An initial trajectory is flown of this vehicle in the Program to Optimize Simulated Trajectories (POST) […] and then the initial vehicle weights and trajectory outputs are sent for more detailed structural sizing with Launch Vehicle Analysis (LVA). Loads, forces, material properties, and design techniques are all considered within the LVA analysis[…]”

    If this is so, it seems only fair to assume higher max-q for those concepts that have only SRM as first stage. And as a consequence the mass penalties you were talking about should then have been included in the LVA analysis automatically.

  15. Karl,
    I agree with most of your points. And in fact there are several other reasons why an ELV that only flies 2-3 times per year is never likely going to demonstrate a 99.25% reliability. But for this post I wanted to stick with things that couldn’t be just spun as opinion. I *think*, and have good reasons why I think the Ares-I reliability numbers are BS. But it’s hard to blow-off as opinion when ESAS makes factually incorrect claims about EELV dry masses.

    ~Jon

  16. Axel,
    Let me play the devil’s advocate… err, I mean, let me give Griffin and his team the benefit of the doubt: would those other approaches have benefited from allowing higher max-q?

    I don’t think so. Higher max-q doesn’t actually give you any benefits as far as I understand. It drives up structural mass, decreases safety, makes the GN&C harder, and in general doesn’t help. But why do you ask? How would the question of “whether or not other stages would benefit from the exception” make the way they did that exception any less dishonest looking?

    Or could it be that these max-q numbers are meant to be assumptions instead of rules or limits?

    Seriously, do you think for half a second that if the EELVs failed one of the GR&As that they wouldn’t have been tossed out in a heartbeat? Do you think that they would’ve been given a special exception, or that they would have been treated as mere guidelines?

    If this is so, it seems only fair to assume higher max-q for those concepts that have only SRM as first stage.

    Um…are you saying that you think they meant that GR&A to mean they would start with those max-q numbers as a guess for their analyses? That’s not how I read it at all. Remember, these are *Ground Rules* and assumptions. It’s pretty clear that this was meant as a number that vehicles were not supposed to be allowed to exceed (much like the 4G ascent limit), not that the analysis should assume those max-q numbers.

    And as a consequence the mass penalties you were talking about should then have been included in the LVA analysis automatically.

    If you look at the data shown for the various launch vehicles, it pretty clearly shows that they all used the same LES/LAS mass, regardless of what the max-q for that booster is.

    Am I understanding your questions/statements right? You phrased things in a way that didn’t make a lot of sense to me.

    ~Jon

  17. Dart says:

    On the subject of max-q, I really think the GR&A are meant to be the bases for analyses like drag for ascent performance, not assessments of the structure or controllability of the stage. That is the way it has been worked for most of my many years of working with NASA studies like this one.

    Many of the other things you point out are pretty discouraging, particularly managers who use only what they want to hear. But I think the analytical team did their best with what they had available.

  18. phillip says:

    If the design team did their best…I would hate to think what their worst performance would be like!!! I would not want them to design my next car.

    OK–let’s assume they made mistakes–after the study why did not NASA publish them? We all know why? They knew what a bad and hacket job they did on the EELV’s.

    Let’s move to COTS!! Guess which COTS vehicles were not selected? Any company that decided to use an EELV.

  19. Axel says:

    Jon,

    You wrote: “Um…are you saying that you think they meant that GR&A to mean they would start with those max-q numbers as a guess for their analyses? That’s not how I read it at all. Remember, these are *Ground Rules* and assumptions.”

    You read them as “Ground Rules”, but to me it would make some sense to interpret them as “Assumptions” for analysis instead, like Dart says.

    I see no reason to rule out the possibility that Griffin and his team did this analysis with the best intentions. I agree with those who say that the resulting Ares program is not very much of a vision. But since none of NASA’s visionary programs after Apollo went far, a traditional and down-to-earth approach might have been a sensible management decision. Other political aspects surely did influence the decisions. We engineers don’t like it, but it’s the way a bureaucracy like NASA works (and has to work to keep itself alive).

    Having said this, let me add that firing engineers who dare to dissent with those management decisions is no way of leadership I sympathize with.

  20. Jonathan Goff says:

    Axel,
    I guess my issue is that if that max-q thing was just a “here’s a starting number for analysis”…it seems kind of silly. You can easily get max-q numbers from your first simulation run. Assume no PLF mass for the first run, get the max-q number. Calculate estimated PLF mass to handle that max-q number. Iterate. You don’t need to make some starting assumption. And if you aren’t using the actual max-q numbers from the simulation to drive your structural loading for your final model, you’re also doing it wrong…

    I guess I really don’t see that as being a legitimate interpretation of what they were saying. I really think it’s pretty clear that they were saying that the Stick was given a higher max-q limit than the other vehicles. Any other answer is kind of silly.

    Having said this, let me add that firing engineers who dare to dissent with those management decisions is no way of leadership I sympathize with.

    That we can agree on.

    ~Jon

  21. kit says:

    Regarding the Max-Q objection…

    I can think of a plausible reason a higher max-q may be acceptable for an SRM based rocket, which is simply that they have a more rigid construction than the large fuel tanks that make up the early stages on smaller rockets.

    Much of the max-q is applied to the nose end of the rocket, whereas there is a large amount of thrust coming from the tail end. The result is that the tank walls have to bear very large loads during the ascent in order to stop the rocket from buckling in the middle.

    In a flying-fuel tank design, the tanks are pressurised to provide rigidity, but there are limits on how much you can do this before you have to make the walls thicker to bear the weight. In an SRM-based booster, the walls have to be much thicker in order to contain the combustion pressure. On the other hand, the SRM motor is longer and thinner, and so may resist buckling less well. The bottom line is that you have to do the sums to work out when each design would be likely to fail.

    Why not write to them and see if you get a response? Or perhaps I’ve been in academia for too long.

  22. Eric Collins says:

    In the Ares I design, they are still flying a fuel tank inline with the SRM. So, unless they decided to beef up the second stage tank as well, I don’t think that particular argument would apply.

    In reality, I think the higher max-Q numbers are due to the fact that the SRM cannot be throttled back while the vehicle is moving through the regime with the greatest aerodynamic loading. Currently the SSMEs provide this capability for the Space Shuttle. Hence the “go with throttle up” command heard on every shuttle launch.
